
[Paper Review] PinSAGE: Graph Convolutional Neural Networks for Web-Scale Recommender Systems, KDD 2018

Goal :

Generate high-quality embeddings/representations of pins (items) that can be used for graph-based recommendation

Challenge :

  • GCN is infeasible when the graph has billions of nodes and the structure is constantly evolving ($\because$ GCN requires operating on the full graph Laplacian during training)
  • Inefficiency when generating embeddings for billions of nodes
  • A limitation of GraphSAGE is that the whole graph must be stored in GPU memory.

Solution :

  • Improve scalability via a random-walk-based localized convolution approach

Method :

1. on-the-fly convolutions

: localized convolutions

  • Learn how to aggregate information from neighborhood
  • Process of the localized convolution operation : transform -> aggregate/pool -> concat (aggregated neighborhood & central node) -> transform (the concatenated vector). The output is a representation of the central node that incorporates both information about itself and its local graph neighborhood (a minimal sketch follows).
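
A minimal NumPy sketch of one such convolution, following the transform -> pool -> concat -> transform structure above; the dimensions and random weights ($Q, q, W, w$) are toy stand-ins for the learned dense layers, not the paper's implementation:

```python
import numpy as np

def relu(x):
    return np.maximum(0, x)

def localized_convolution(h_u, h_neighbors, alpha, Q, q, W, w):
    """One localized convolution: transform neighbors, importance-pool,
    concatenate with the central node, transform, and normalize."""
    # 1) Transform each neighbor embedding through a dense layer.
    transformed = relu(h_neighbors @ Q + q)          # (T, d)
    # 2) Importance pooling: weighted mean using random-walk visit counts.
    n_u = (alpha[:, None] * transformed).sum(axis=0) / alpha.sum()
    # 3) Concatenate the pooled neighborhood with the central node ...
    concat = np.concatenate([h_u, n_u])
    # 4) ... transform, and normalize to unit length.
    h_new = relu(concat @ W + w)
    return h_new / np.linalg.norm(h_new)

# Toy usage: 5 neighbors with 16-dim features.
rng = np.random.default_rng(0)
d, T = 16, 5
h_u = rng.normal(size=d)
h_nbrs = rng.normal(size=(T, d))
alpha = rng.random(T)                                # visit-count weights
Q, q = rng.normal(size=(d, d)), np.zeros(d)
W, w = rng.normal(size=(2 * d, d)), np.zeros(d)
print(localized_convolution(h_u, h_nbrs, alpha, Q, q, W, w).shape)  # (16,)
```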

2. Importance pooling

  • Define an importance-based neighborhood
    The neighborhood of node $u$ is defined as the top $T$ nodes with the highest normalized visit counts: 1) run random walks starting from node $u$, 2) compute the $L_1$-normalized visit count of the nodes visited by the random walks.
  • Advantages : 1) a fixed number of neighbors -> controls the memory footprint during training, 2) the importance of neighbors can be considered during localized convolutions (an importance-based weighted mean in the aggregation/pooling step); see the sketch below.
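
A hedged sketch of this neighborhood definition on a toy adjacency-list graph; the walk length, walk count, and $T$ here are illustrative, not the paper's production settings:

```python
import numpy as np
from collections import Counter

def importance_neighborhood(graph, u, T=3, num_walks=200, walk_len=2, seed=0):
    """Top-T neighbors of u by L1-normalized random-walk visit counts."""
    rng = np.random.default_rng(seed)
    visits = Counter()
    for _ in range(num_walks):
        node = u
        for _ in range(walk_len):
            nbrs = graph[node]
            if not nbrs:
                break
            node = nbrs[rng.integers(len(nbrs))]
            if node != u:
                visits[node] += 1
    total = sum(visits.values()) or 1
    # The L1-normalized visit counts double as the pooling weights alpha.
    return [(n, c / total) for n, c in visits.most_common(T)]

# Toy graph as adjacency lists.
graph = {0: [1, 2, 3], 1: [0, 2], 2: [0, 1, 4], 3: [0], 4: [2]}
print(importance_neighborhood(graph, u=0))
```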

3. Producer-consumer minibatch construction

  • The GPU performs the model computation (the localized convolution step).
  • The CPU performs 1) node feature extraction, 2) re-indexing to create a sub-graph for the mini-batch, and 3) negative sampling.
  • Multi-GPU training: with a producer-consumer pattern, the GPUs run the current iteration while the CPUs run the computation for the next iteration (sketched below).
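
A minimal sketch of the producer-consumer pattern with a bounded queue; `make_minibatch` (CPU work) and `train_step` (GPU work) are hypothetical stand-ins for the real pipeline stages:

```python
import queue
import threading

def make_minibatch(i):
    # CPU side (producer): feature extraction, sub-graph re-indexing,
    # and negative sampling would happen here.
    return {"batch_id": i}

def train_step(batch):
    # GPU side (consumer): runs the localized convolutions for the
    # current iteration while the CPU prepares the next one.
    print("training on", batch["batch_id"])

batches = queue.Queue(maxsize=2)     # bounded buffer between CPU and GPU

def producer(num_batches):
    for i in range(num_batches):
        batches.put(make_minibatch(i))
    batches.put(None)                # sentinel: no more batches

threading.Thread(target=producer, args=(5,), daemon=True).start()
while (batch := batches.get()) is not None:
    train_step(batch)
```

The bounded queue is what makes this producer-consumer: the CPU keeps at most a couple of prepared batches in flight, so neither side idles for long and memory stays bounded.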

4. Max-margin-based Loss Function

  • Maximize the inner product of the positive item with the query item
  • Train so that the dot product of the query with a related item $i$ is larger than its dot product with an unrelated item $n_k$ ( $z_q\cdot z_{n_k} < z_q\cdot z_{i}$ )

$J_G(z_q,z_i)=\mathbb{E}_{n_k\sim P_n(q)}\max\{0,\,z_q\cdot z_{n_k}-z_q\cdot z_i+\Delta\}$ where $n_k$ is a negative sample and $\Delta$ is the margin.
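
A minimal NumPy sketch of this loss for one (query, positive) pair against a batch of sampled negatives, with the expectation replaced by an empirical mean; `max_margin_loss`, the margin value, and the dimensions are illustrative:

```python
import numpy as np

def max_margin_loss(z_q, z_i, z_negs, margin=0.1):
    """Hinge loss: push q·i above q·n_k by at least `margin`,
    averaged over the sampled negatives (empirical expectation)."""
    pos = z_q @ z_i                       # scalar: query-positive score
    negs = z_negs @ z_q                   # (K,): query-negative scores
    return np.maximum(0.0, negs - pos + margin).mean()

rng = np.random.default_rng(0)
z_q, z_i = rng.normal(size=16), rng.normal(size=16)
z_negs = rng.normal(size=(10, 16))        # 10 negative samples
print(max_margin_loss(z_q, z_i, z_negs))
```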

5. Curriculum training

: harder-and-harder samples

At epoch $n$ of the training, we add $n-1$ hard negative items to the set of negative items for each item.

  • Hard negative items : hard negative items are more similar to the query than random negative items, so they force the model to learn at a finer granularity.
    • How hard negative items are sampled : 1) rank items by their Personalized PageRank scores with respect to the query, 2) randomly sample from the items ranked 2000-5000.
  • Using hard negative items slows convergence, so the curriculum training scheme above is applied (see the sketch below).

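A tiny sketch of that schedule in plain Python; `build_negatives` and the item ids are placeholders (the hard negatives are assumed to be pre-ranked by Personalized PageRank):

```python
def build_negatives(epoch, random_negs, hard_negs):
    """Curriculum schedule: at epoch n, mix n-1 hard negatives into
    the random negative set for each item (epoch is 1-indexed)."""
    n_hard = min(epoch - 1, len(hard_negs))
    return random_negs + hard_negs[:n_hard]

# Placeholder ids; hard negatives would come from PPR ranks 2000-5000.
random_negs = [101, 102, 103]
hard_negs = [2001, 2002, 2003, 2004]
for epoch in range(1, 5):
    print(epoch, build_negatives(epoch, random_negs, hard_negs))
```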

6. MapReduce inference

: efficiently generate embeddings with the trained PinSage model by distributing the fully-trained GCN via MapReduce to generate embeddings for billions of nodes (a single-process sketch follows).
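
A hedged, single-process sketch of how one aggregation layer decomposes into map and reduce stages; this simulates the dataflow in plain Python/NumPy (it is not a real Hadoop/Spark job, and all names and weights are illustrative):

```python
import numpy as np
from collections import defaultdict

def relu(x):
    return np.maximum(0, x)

def mapreduce_layer(features, edges, alpha, Q, q):
    """One aggregation layer as map + reduce stages.
    Map: project every node's feature exactly once (no recomputation).
    Reduce: for each target node, pool its projected neighbors."""
    # --- map stage: emit (node_id, projected_feature) pairs ---
    projected = {u: relu(h @ Q + q) for u, h in features.items()}
    # --- shuffle: group projected messages by target node ---
    inbox = defaultdict(list)
    for u, v in edges:                 # message u -> v
        inbox[v].append((projected[u], alpha[(u, v)]))
    # --- reduce stage: importance-weighted mean per target node ---
    pooled = {}
    for v, msgs in inbox.items():
        vecs = np.stack([m for m, _ in msgs])
        w = np.array([a for _, a in msgs])
        pooled[v] = (w[:, None] * vecs).sum(0) / w.sum()
    return pooled

# Toy usage: 3 nodes, 4-dim features, hypothetical weights.
rng = np.random.default_rng(0)
features = {i: rng.normal(size=4) for i in range(3)}
edges = [(0, 2), (1, 2), (2, 0)]
alpha = {e: 1.0 for e in edges}
Q, q = rng.normal(size=(4, 4)), np.zeros(4)
print(mapreduce_layer(features, edges, alpha, Q, q)[2])
```

The key efficiency point survives even in this toy version: each node is projected exactly once in the map stage, instead of being recomputed for every neighbor that consumes it.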

Evaluation :

Two task types: 1) Related Pin recommendation : recommend the K-nearest neighbors of a query pin, 2) Homefeed recommendation : recommend the pins most similar to the items a user has most recently pinned.

  1. Offline Evaluation : evaluate the performance on the Related Pin recommendation task
    • hit-rate : the fraction of queries $q$ where the positive item $i$ was ranked among the top $K$ of the test sample.
    • Evaluation metric : scaled MRR (Mean Reciprocal Rank) - based on the rank of item $i$ among the items recommended for query item $q$

$MRR= {1\over n}\sum_{(q,i)\in L} {1\over \lceil R_{i,q}/100\rceil}$
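
A small NumPy check of the scaled metric under toy ranks (`scaled_mrr` is a hypothetical helper); the division by 100 buckets the ranks so that differences among very large ranks still register instead of collapsing to near-zero reciprocals:

```python
import numpy as np

def scaled_mrr(ranks, scale=100):
    """Scaled MRR: mean over (q, i) pairs of 1 / ceil(R_{i,q} / scale)."""
    ranks = np.asarray(ranks, dtype=float)
    return np.mean(1.0 / np.ceil(ranks / scale))

# Toy ranks of the positive item in the candidate list:
print(scaled_mrr([3, 40, 250, 1200]))   # ranks 3 and 40 both score 1.0
```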

  2. User Studies : show users two candidate pins (each produced by a different recommendation algorithm) and ask which one looks more related to the query pin

  3. Production A/B Test

    • In homefeed recommendation, recommend the pins that a user is most likely to re-pin
    • metric : repin rate

      measures the percentage of homefeed recommendations that have been saved by the users.

Source : Graph Convolutional Neural Networks for Web-Scale Recommender Systems, KDD 18 https://arxiv.org/pdf/1806.01973.pdf
