
[Paper Review] GPT-1: Improving Language Understanding by Generative Pre-Training, Technical report, OpenAI, 2018

Goal

Learn a universal representation that transfers with little adaptation to a wide range of tasks

Challenge

  • Leveraging more than word-level information from unlabeled data is challenging: 1) it is unclear which optimization objectives are most effective for learning transferable representations, and 2) there is no consensus on the most effective way to transfer these learned representations to the target task

  • Limitations of earlier pretrained LMs (feature-based approaches): 1) restricted to short-range context (e.g., ELMo is LSTM-based), and 2) an additional task-specific architecture is needed for each downstream task

Solution:

  • two-stage semi-supervised approach, a combination of unsupervised pre-training and supervised fine-tuning (see the objective sketch after this list): 1) generative pre-training of a LM on a diverse corpus of unlabeled text (unsupervised), 2) then discriminative fine-tuning on each specific task (supervised)
  • task-aware input transformations during fine-tuning for effective transfer with minimal changes to the model architecture
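
The paper formalizes the two stages with three objectives: the LM likelihood L1 for pre-training, the supervised likelihood L2 for fine-tuning, and a combined objective L3 that keeps the LM loss as an auxiliary task (k is the context window size, λ the auxiliary-objective weight):

```latex
% Stage 1: unsupervised pre-training, maximize the LM likelihood over unlabeled corpus U
L_1(\mathcal{U}) = \sum_i \log P(u_i \mid u_{i-k}, \ldots, u_{i-1}; \Theta)

% Stage 2: supervised fine-tuning on labeled examples (x, y), x = x^1 \ldots x^m
L_2(\mathcal{C}) = \sum_{(x,\,y)} \log P(y \mid x^1, \ldots, x^m)

% Fine-tuning with the LM loss retained as an auxiliary objective
L_3(\mathcal{C}) = L_2(\mathcal{C}) + \lambda \cdot L_1(\mathcal{C})
```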

Method:

  • use a Transformer decoder as the LM to better capture long-range dependencies
  • task-specific input adaptations (traversal style) that convert structured inputs into ordered token sequences for robust transfer performance, as sketched below
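
A minimal sketch of the traversal-style transformations the paper describes; the token strings (`<s>`, `<e>`, `$`) and function names here are illustrative assumptions, not the paper's code:

```python
# Traversal-style input transformations: structured inputs are flattened
# into a single ordered token sequence so one pretrained architecture can
# handle every task. Token strings below are illustrative placeholders.

START, EXTRACT, DELIM = "<s>", "<e>", "$"

def classification(text: list[str]) -> list[str]:
    # Single-sequence tasks: wrap the text with start/extract tokens.
    return [START, *text, EXTRACT]

def entailment(premise: list[str], hypothesis: list[str]) -> list[str]:
    # Concatenate premise and hypothesis with a delimiter token.
    return [START, *premise, DELIM, *hypothesis, EXTRACT]

def similarity(a: list[str], b: list[str]) -> list[list[str]]:
    # No inherent ordering between the two sentences: build both orderings;
    # the paper adds the two final representations before the output layer.
    return [
        [START, *a, DELIM, *b, EXTRACT],
        [START, *b, DELIM, *a, EXTRACT],
    ]

def multiple_choice(context: list[str],
                    answers: list[list[str]]) -> list[list[str]]:
    # QA / commonsense reasoning: one sequence per candidate answer; each
    # is scored independently and the scores are softmax-normalized.
    return [[START, *context, DELIM, *ans, EXTRACT] for ans in answers]

# Example: an entailment input ready for the Transformer.
print(entailment(["a", "man", "sleeps"], ["a", "person", "rests"]))
```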

Evaluation:

  • Evaluated on NLI, QA & commonsense reasoning, sentence similarity, and classification tasks
  • Effect of the number of layers transferred: transfer performance improves as more pretrained layers are transferred to the target task (a sketch of this ablation follows)
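
A minimal sketch of what transferring k layers could look like, under the assumption (mine, not spelled out in this note) that non-transferred blocks are freshly initialized before fine-tuning; all names are hypothetical:

```python
from typing import Callable, List

def transfer_layers(pretrained_blocks: List[object], k: int,
                    fresh_block: Callable[[], object]) -> List[object]:
    """Build a task model that reuses the first k pretrained blocks.

    Blocks beyond k are created from scratch; the whole stack is then
    fine-tuned on the target task. Both arguments are hypothetical
    stand-ins for real model components.
    """
    return [pretrained_blocks[i] if i < k else fresh_block()
            for i in range(len(pretrained_blocks))]

# Example: transfer 6 of 12 blocks (strings stand in for modules).
stack = transfer_layers([f"pretrained_{i}" for i in range(12)], k=6,
                        fresh_block=lambda: "random_init")
```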

Source: Alec Radford, Karthik Narasimhan, Tim Salimans, and Ilya Sutskever. 2018. Improving Language Understanding by Generative Pre-Training. Technical report, OpenAI. https://www.cs.ubc.ca/~amuham01/LING530/papers/radford2018improving.pdf
