PinSAGE: Graph Convolutional Neural Networks for Web-Scale Recommender Systems, KDD 2018
Goal :
Generate high-quality embeddings/representations of pins (items) that can be used for graph-based recommendation
Challenge :
- Standard GCNs are infeasible when the graph has billions of nodes and the structure is constantly evolving ($\because$ GCN training requires operating on the full graph Laplacian.)
- Inefficiency when generating embeddings for billions of nodes
- A limitation of GraphSAGE is that the whole graph must be stored in GPU memory.
Solution :
- improve scalability via a random-walk-based localized convolution approach
Method :
1. on-the-fly convolutions
: localized convolutions
- Learn how to aggregate information from neighborhood
- Process of the localized convolution operation: transform -> aggregate/pool -> concat (aggregated neighborhood & central node) -> transform (the concatenated vector). The output is a representation of the central node that incorporates both its own features and its local graph neighborhood.
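The transform -> aggregate -> concat -> transform pipeline can be sketched in NumPy as follows. This is a minimal sketch, not the paper's implementation: the weight matrices `W_t`, `W_c` and importance weights `alpha` are illustrative stand-ins.

```python
import numpy as np

def localized_convolution(h_u, h_neighbors, alpha, W_t, W_c):
    """One localized convolution for central node u (illustrative sketch).
    h_u: (d,) central node embedding; h_neighbors: (T, d) neighbor embeddings;
    alpha: (T,) importance weights; W_t, W_c: hypothetical dense-layer weights."""
    # 1) transform each neighbor embedding through a dense layer (ReLU)
    transformed = np.maximum(0, h_neighbors @ W_t)                # (T, d')
    # 2) importance-weighted mean pooling over the neighborhood
    pooled = (alpha[:, None] * transformed).sum(0) / alpha.sum()  # (d',)
    # 3) concatenate the central node with the pooled neighborhood
    concat = np.concatenate([h_u, pooled])                        # (d + d',)
    # 4) transform the concatenated vector and L2-normalize the output
    z_u = np.maximum(0, concat @ W_c)
    return z_u / np.linalg.norm(z_u)
```

Stacking this operation (feeding layer-$k$ outputs in as layer-$(k+1)$ inputs) yields the multi-layer convolutions used in the paper.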
2. Importance pooling
- Define importance-based neighborhood
- The neighborhood of node $u$ is defined as the top $T$ nodes with the highest normalized visit counts: 1) run random walks starting from node $u$; 2) compute the $L_1$-normalized visit count of each node visited by the random walks.
- Advantages: 1) a fixed number of neighbors controls the memory footprint during training; 2) the importance of neighbors can be considered during localized convolutions (importance-based weighted mean in the aggregation/pooling step).
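The importance-based neighborhood can be sketched as below, assuming a simple adjacency-list graph; the walk count and walk length are illustrative parameters, not the paper's settings:

```python
import random
from collections import Counter

def importance_neighborhood(adj, u, T, num_walks=100, walk_len=3, seed=0):
    """Simulate short random walks from u; return the top-T visited nodes
    with L1-normalized visit counts as importance weights (illustrative)."""
    rng = random.Random(seed)
    visits = Counter()
    for _ in range(num_walks):
        node = u
        for _ in range(walk_len):
            nbrs = adj.get(node)
            if not nbrs:
                break
            node = rng.choice(nbrs)
            if node != u:          # do not count the start node itself
                visits[node] += 1
    total = sum(visits.values()) or 1
    # top-T nodes by visit count, each with its L1-normalized weight
    return [(n, c / total) for n, c in visits.most_common(T)]
```

The returned weights are exactly what the importance-weighted mean in the aggregation step would consume.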
3. Producer-consumer minibatch construction
- The GPU performs the model computation (localized convolution step).
- The CPU performs 1) node feature extraction, 2) re-indexing to create a sub-graph for the mini-batch, 3) negative sampling.
- Multi-GPU training follows a producer-consumer pattern: the GPUs run the current iteration while the CPUs perform the computation for the next iteration.
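The producer-consumer pattern can be sketched with a bounded queue; `prepare` and `train_step` are hypothetical stand-ins for the CPU-side minibatch construction and the GPU-side convolution step:

```python
import queue
import threading

def run_producer_consumer(minibatches, prepare, train_step):
    """A CPU producer thread prepares the next minibatch (feature extraction,
    re-indexing, negative sampling) while the consumer runs the current
    iteration; illustrative sketch of the pattern, not the paper's code."""
    q = queue.Queue(maxsize=1)   # keep at most one prepared batch ahead
    def producer():
        for mb in minibatches:
            q.put(prepare(mb))   # blocks while the consumer is behind
        q.put(None)              # sentinel: no more batches
    threading.Thread(target=producer, daemon=True).start()
    results = []
    while (batch := q.get()) is not None:
        results.append(train_step(batch))
    return results
```

The bounded queue is the key design choice: it overlaps CPU preparation with GPU computation without letting the producer run arbitrarily far ahead of the consumer.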
4. Max-margin-based Loss Function
- Maximize the inner product of the positive item with the query item
- Train so that the dot product of the query with a related item $i$ is larger than with an unrelated item $n_k$ ( $z_q\cdot z_{n_k} < z_q\cdot z_{i}$ )
$J_G(z_q,z_i)=\mathbb{E}_{n_k\sim P_n(q)}\max\{0,\ z_q\cdot z_{n_k}-z_q\cdot z_i+\Delta\}$ where $n_k$ is a negative sample and $\Delta$ is the margin.
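A direct NumPy transcription of the loss for one query with $K$ sampled negatives; the margin value used below is illustrative:

```python
import numpy as np

def max_margin_loss(z_q, z_pos, z_negs, delta=0.4):
    """Hinge loss: the positive pair's dot product should exceed each
    negative pair's by at least the margin delta (delta is illustrative)."""
    pos = z_q @ z_pos                           # scalar: z_q . z_i
    negs = z_negs @ z_q                         # (K,):  z_q . z_{n_k}
    # expectation over P_n(q) approximated by the mean over K samples
    return np.maximum(0.0, negs - pos + delta).mean()
```

Note the loss is zero exactly when every negative is beaten by at least $\Delta$, so already well-separated negatives contribute no gradient.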
5. Curriculum training
: harder-and-harder samples
At epoch $n$ of the training, we add $n-1$ hard negative items to the set of negative items for each item.
- hard negative item :
Hard negative items are more similar to the query than random negative items, so they force the model to learn at a finer granularity.
- Hard negative sampling: 1) rank items by their Personalized PageRank score with respect to the query; 2) randomly sample from the items ranked 2000-5000.
- Using hard negatives slows convergence, so the curriculum training scheme is applied.
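The per-epoch schedule described above can be sketched as follows; the argument names are illustrative:

```python
def negatives_for_epoch(epoch, random_negs, hard_negs):
    """Curriculum schedule: epoch 1 uses only random negatives; at epoch n,
    add n-1 hard negatives per item (a sketch of the rule, not the paper's
    sampling code)."""
    n_hard = max(0, epoch - 1)
    return list(random_negs) + list(hard_negs[:n_hard])
```

This gradually shifts the negative set from easy to hard, which is what keeps convergence stable despite the harder examples.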
6. MapReduce inference
: Efficiently generate embeddings with a trained PinSage model by distributing the fully-trained GCN via MapReduce to produce embeddings for billions of nodes.
Evaluation :
Tasks: 1) Related Pin recommendation: recommend the K-nearest neighbors most similar to the query pin. 2) Homefeed recommendation: recommend the pins most similar to the items the user has most recently pinned.
- Offline Evaluation
Evaluate the performance on the related Pin recommendation task
- hit-rate : the fraction of queries $q$ for which the positive item $i$ was ranked among the top $K$ of the test sample.
- Evaluation metric : scaled MRR (Mean Reciprocal Rank), based on the rank $R_{i,q}$ of item $i$ among the items recommended for query item $q$:
$MRR= {1\over n}\sum_{(q,i)\in L} {1\over \lceil R_{i,q}/100\rceil}$
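The scaled MRR can be computed directly from the ranks; below, `ranks` is assumed to hold $R_{i,q}$ for each pair in $L$:

```python
import math

def scaled_mrr(ranks, scale=100):
    """Scaled MRR: mean of 1 / ceil(rank/scale) over (query, positive)
    pairs, so rank differences within the same bucket of `scale` (e.g.
    ranks 1 vs 99) are not penalized."""
    return sum(1.0 / math.ceil(r / scale) for r in ranks) / len(ranks)
```

The division by 100 keeps the metric informative at billion-item scale, where plain reciprocal ranks would all be vanishingly small.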
- User Studies
Users are shown two candidate pins (obtained from different recommendation algorithms) and asked to judge which looks more related to the query pin.
- Production A/B Test
- In homefeed recommendation, recommend pins that users are likely to re-pin.
- metric : repin rate, which
measures the percentage of homefeed recommendations that have been saved by the users.
Source : Graph Convolutional Neural Networks for Web-Scale Recommender Systems, KDD 18 https://arxiv.org/pdf/1806.01973.pdf