
[Paper Review] PinSAGE: Graph Convolutional Neural Networks for Web-Scale Recommender Systems, KDD 2018

Goal :

Generate high-quality embeddings/representations of pins (items) that can be used for graph-based recommendation

Challenge :

  • GCN is infeasible when the graph has billions of nodes and the structure is constantly evolving ($\because$ GCN requires operating on the full graph Laplacian during training)
  • Inefficiency when generating embeddings for billions of nodes
  • A limitation of GraphSAGE is that the whole graph must be stored in GPU memory.

Solution :

  • Improve scalability via a random-walk-based localized convolution approach

Method :

1. on-the-fly convolutions

: localized convolutions

  • Learn how to aggregate information from neighborhood
  • Process of the localized convolution operation : transform -> aggregate/pool -> concat (aggregated neighborhood & central node) -> transform (the concatenated vector). The output is a representation of the central node that incorporates both information about itself and its local graph neighborhood (a minimal sketch follows).
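
A minimal NumPy sketch of one such convolution, following the transform -> pool -> concat -> transform structure above; the dimensions and random weights ($Q, q, W, w$) are toy stand-ins for the learned dense layers, not the paper's implementation:

```python
import numpy as np

def relu(x):
    return np.maximum(0, x)

def localized_convolution(h_u, h_neighbors, alpha, Q, q, W, w):
    """One localized convolution: transform neighbors, importance-pool,
    concatenate with the central node, transform, and normalize."""
    # 1) Transform each neighbor embedding through a dense layer.
    transformed = relu(h_neighbors @ Q + q)          # (T, d)
    # 2) Importance pooling: weighted mean using random-walk visit counts.
    n_u = (alpha[:, None] * transformed).sum(axis=0) / alpha.sum()
    # 3) Concatenate the pooled neighborhood with the central node ...
    concat = np.concatenate([h_u, n_u])
    # 4) ... transform, and normalize to unit length.
    h_new = relu(concat @ W + w)
    return h_new / np.linalg.norm(h_new)

# Toy usage: 5 neighbors with 16-dim features.
rng = np.random.default_rng(0)
d, T = 16, 5
h_u = rng.normal(size=d)
h_nbrs = rng.normal(size=(T, d))
alpha = rng.random(T)                                # visit-count weights
Q, q = rng.normal(size=(d, d)), np.zeros(d)
W, w = rng.normal(size=(2 * d, d)), np.zeros(d)
print(localized_convolution(h_u, h_nbrs, alpha, Q, q, W, w).shape)  # (16,)
```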

2. Importance pooling

  • Define an importance-based neighborhood
    The neighborhood of node $u$ is defined as the top $T$ nodes with the highest normalized visit counts: 1) run random walks starting from node $u$, 2) compute the $L_1$-normalized visit count of the nodes visited by the random walks.
  • Advantages : 1) a fixed number of neighbors -> controls the memory footprint during training, 2) the importance of neighbors can be considered during localized convolutions (an importance-based weighted mean in the aggregation/pooling step); see the sketch below.
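
A hedged sketch of this neighborhood definition on a toy adjacency-list graph; the walk length, walk count, and $T$ here are illustrative, not the paper's production settings:

```python
import numpy as np
from collections import Counter

def importance_neighborhood(graph, u, T=3, num_walks=200, walk_len=2, seed=0):
    """Top-T neighbors of u by L1-normalized random-walk visit counts."""
    rng = np.random.default_rng(seed)
    visits = Counter()
    for _ in range(num_walks):
        node = u
        for _ in range(walk_len):
            nbrs = graph[node]
            if not nbrs:
                break
            node = nbrs[rng.integers(len(nbrs))]
            if node != u:
                visits[node] += 1
    total = sum(visits.values()) or 1
    # The L1-normalized visit counts double as the pooling weights alpha.
    return [(n, c / total) for n, c in visits.most_common(T)]

# Toy graph as adjacency lists.
graph = {0: [1, 2, 3], 1: [0, 2], 2: [0, 1, 4], 3: [0], 4: [2]}
print(importance_neighborhood(graph, u=0))
```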

3. Producer-consumer minibatch construction

  • The GPU performs the model computation (the localized convolution step).
  • The CPU performs 1) node feature extraction, 2) re-indexing to create a sub-graph for the mini-batch, and 3) negative sampling.
  • Multi-GPU training: with a producer-consumer pattern, the GPUs run the current iteration while the CPUs run the computation for the next iteration (sketched below).
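
A minimal sketch of the producer-consumer pattern with a bounded queue; `make_minibatch` (CPU work) and `train_step` (GPU work) are hypothetical stand-ins for the real pipeline stages:

```python
import queue
import threading

def make_minibatch(i):
    # CPU side (producer): feature extraction, sub-graph re-indexing,
    # and negative sampling would happen here.
    return {"batch_id": i}

def train_step(batch):
    # GPU side (consumer): runs the localized convolutions for the
    # current iteration while the CPU prepares the next one.
    print("training on", batch["batch_id"])

batches = queue.Queue(maxsize=2)     # bounded buffer between CPU and GPU

def producer(num_batches):
    for i in range(num_batches):
        batches.put(make_minibatch(i))
    batches.put(None)                # sentinel: no more batches

threading.Thread(target=producer, args=(5,), daemon=True).start()
while (batch := batches.get()) is not None:
    train_step(batch)
```

The bounded queue is what makes this producer-consumer: the CPU keeps at most a couple of prepared batches in flight, so neither side idles for long and memory stays bounded.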

4. Max-margin-based Loss Function

  • Maximize the inner product of the positive item with the query item
  • Train so that the dot product of the query with a related item $i$ is larger than its dot product with an unrelated item $n_k$ ( $z_q\cdot z_{n_k} < z_q\cdot z_{i}$ )

$J_G(z_q,z_i)=\mathbb{E}_{n_k\sim P_n(q)}\max\{0,\,z_q\cdot z_{n_k}-z_q\cdot z_i+\Delta\}$ where $n_k$ is a negative sample and $\Delta$ is the margin.
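
A minimal NumPy sketch of this loss for one (query, positive) pair against a batch of sampled negatives, with the expectation replaced by an empirical mean; `max_margin_loss`, the margin value, and the dimensions are illustrative:

```python
import numpy as np

def max_margin_loss(z_q, z_i, z_negs, margin=0.1):
    """Hinge loss: push q·i above q·n_k by at least `margin`,
    averaged over the sampled negatives (empirical expectation)."""
    pos = z_q @ z_i                       # scalar: query-positive score
    negs = z_negs @ z_q                   # (K,): query-negative scores
    return np.maximum(0.0, negs - pos + margin).mean()

rng = np.random.default_rng(0)
z_q, z_i = rng.normal(size=16), rng.normal(size=16)
z_negs = rng.normal(size=(10, 16))        # 10 negative samples
print(max_margin_loss(z_q, z_i, z_negs))
```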

5. Curriculum training

: harder-and-harder samples

At epoch $n$ of the training, we add $n-1$ hard negative items to the set of negative items for each item.

  • Hard negative items : hard negative items are more similar to the query than random negative items, so they force the model to learn at a finer granularity.
    • How hard negative items are sampled : 1) rank items by their Personalized PageRank scores with respect to the query, 2) randomly sample from the items ranked 2000-5000.
  • Using hard negative items slows convergence, so the curriculum training scheme above is applied (see the sketch below).

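A tiny sketch of that schedule in plain Python; `build_negatives` and the item ids are placeholders (the hard negatives are assumed to be pre-ranked by Personalized PageRank):

```python
def build_negatives(epoch, random_negs, hard_negs):
    """Curriculum schedule: at epoch n, mix n-1 hard negatives into
    the random negative set for each item (epoch is 1-indexed)."""
    n_hard = min(epoch - 1, len(hard_negs))
    return random_negs + hard_negs[:n_hard]

# Placeholder ids; hard negatives would come from PPR ranks 2000-5000.
random_negs = [101, 102, 103]
hard_negs = [2001, 2002, 2003, 2004]
for epoch in range(1, 5):
    print(epoch, build_negatives(epoch, random_negs, hard_negs))
```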

6. MapReduce inference

: efficiently generate embeddings with the trained PinSage model by distributing the fully-trained GCN via MapReduce to generate embeddings for billions of nodes (a single-process sketch follows).
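
A hedged, single-process sketch of how one aggregation layer decomposes into map and reduce stages; this simulates the dataflow in plain Python/NumPy (it is not a real Hadoop/Spark job, and all names and weights are illustrative):

```python
import numpy as np
from collections import defaultdict

def relu(x):
    return np.maximum(0, x)

def mapreduce_layer(features, edges, alpha, Q, q):
    """One aggregation layer as map + reduce stages.
    Map: project every node's feature exactly once (no recomputation).
    Reduce: for each target node, pool its projected neighbors."""
    # --- map stage: emit (node_id, projected_feature) pairs ---
    projected = {u: relu(h @ Q + q) for u, h in features.items()}
    # --- shuffle: group projected messages by target node ---
    inbox = defaultdict(list)
    for u, v in edges:                 # message u -> v
        inbox[v].append((projected[u], alpha[(u, v)]))
    # --- reduce stage: importance-weighted mean per target node ---
    pooled = {}
    for v, msgs in inbox.items():
        vecs = np.stack([m for m, _ in msgs])
        w = np.array([a for _, a in msgs])
        pooled[v] = (w[:, None] * vecs).sum(0) / w.sum()
    return pooled

# Toy usage: 3 nodes, 4-dim features, hypothetical weights.
rng = np.random.default_rng(0)
features = {i: rng.normal(size=4) for i in range(3)}
edges = [(0, 2), (1, 2), (2, 0)]
alpha = {e: 1.0 for e in edges}
Q, q = rng.normal(size=(4, 4)), np.zeros(4)
print(mapreduce_layer(features, edges, alpha, Q, q)[2])
```

The key efficiency point survives even in this toy version: each node is projected exactly once in the map stage, instead of being recomputed for every neighbor that consumes it.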

Evaluation :

Two task types: 1) Related Pin recommendation : recommend the K-nearest neighbors of a query pin, 2) Homefeed recommendation : recommend the pins most similar to the items a user has most recently pinned.

  1. Offline Evaluation : evaluate the performance on the Related Pin recommendation task
    • hit-rate : the fraction of queries $q$ where the positive item $i$ was ranked among the top $K$ of the test sample.
    • Evaluation metric : scaled MRR (Mean Reciprocal Rank) - based on the rank of item $i$ among the items recommended for query item $q$

$MRR= {1\over n}\sum_{(q,i)\in L} {1\over \lceil R_{i,q}/100\rceil}$
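
A small NumPy check of the scaled metric under toy ranks (`scaled_mrr` is a hypothetical helper); the division by 100 buckets the ranks so that differences among very large ranks still register instead of collapsing to near-zero reciprocals:

```python
import numpy as np

def scaled_mrr(ranks, scale=100):
    """Scaled MRR: mean over (q, i) pairs of 1 / ceil(R_{i,q} / scale)."""
    ranks = np.asarray(ranks, dtype=float)
    return np.mean(1.0 / np.ceil(ranks / scale))

# Toy ranks of the positive item in the candidate list:
print(scaled_mrr([3, 40, 250, 1200]))   # ranks 3 and 40 both score 1.0
```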

  2. User Studies : show users two candidate pins (each produced by a different recommendation algorithm) and ask which one looks more related to the query pin

  3. Production A/B Test

    • In homefeed recommendation, recommend the pins that a user is most likely to re-pin
    • metric : repin rate

      measures the percentage of homefeed recommendations that have been saved by the users.

Source : Graph Convolutional Neural Networks for Web-Scale Recommender Systems, KDD 18 https://arxiv.org/pdf/1806.01973.pdf
