GPT-1

E469810

GPT-1 is the first-generation Generative Pre-trained Transformer, an autoregressive language model developed by OpenAI that introduced the pretrain-then-finetune paradigm to large-scale NLP.

Statements (48)

Predicate Object
instanceOf Generative Pre-trained Transformer
autoregressive language model
large language model
architectureDepth 12 layers
basedOn Transformer architecture
coAuthor Ilya Sutskever
Karthik Narasimhan
Tim Salimans
designGoal improve sample efficiency for NLP tasks
leverage unsupervised data for representation learning
developer OpenAI
field artificial intelligence
machine learning
natural language processing
fineTuningTasks natural language inference
question answering
reading comprehension
semantic similarity
text classification
improvedOver task-specific models trained from scratch
inferenceMode left-to-right generation
influenced GPT-2
GPT-3
subsequent large language models
inputType text
introducedConcept large-scale unsupervised pretraining for NLP
task-specific supervised fine-tuning after pretraining
language English
modelType unidirectional transformer
notableContribution demonstrated transfer learning in NLP
showed that a single pretrained model can be adapted to many tasks
numberOfParameters 117M
organization OpenAI
outputType text
parameterCountCategory hundreds of millions of parameters
pretrainingDataType BooksCorpus (unpublished books)
pretrainingTask language modeling
primaryAuthor Alec Radford
publicationTitle Improving Language Understanding by Generative Pre-Training
publicationVenue OpenAI technical report
publicationYear 2018
tokenizerType Byte Pair Encoding
trainingMethod supervised fine-tuning
unsupervised pretraining
trainingObjective next-token prediction
trainingParadigm pretrain-then-finetune
uses multi-head self-attention
positional embeddings
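
The trainingObjective, trainingMethod, and trainingParadigm statements summarize the two-stage recipe from Improving Language Understanding by Generative Pre-Training. As a sketch in the paper's own notation: unsupervised pretraining maximizes a left-to-right language modeling likelihood over the unlabeled corpus U with context window k; supervised fine-tuning maximizes a classification likelihood over a labeled dataset C; and the two are combined with an auxiliary language modeling weight lambda (reported as 0.5 in the paper's experiments).

    L_1(\mathcal{U}) = \sum_i \log P(u_i \mid u_{i-k}, \ldots, u_{i-1}; \Theta)
    L_2(\mathcal{C}) = \sum_{(x,y)} \log P(y \mid x^1, \ldots, x^m)
    L_3(\mathcal{C}) = L_2(\mathcal{C}) + \lambda \, L_1(\mathcal{C})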
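
The modelType, architectureDepth, and uses statements describe a 12-layer decoder-only transformer in which a causal mask restricts each position to attend only to earlier positions, giving left-to-right generation. The PyTorch sketch below is illustrative rather than a released implementation; the layer count (12), hidden size (768), head count (12), context length (512), GELU activation, and learned positional embeddings match the published GPT-1 configuration, while the class and variable names are this sketch's own.

    import torch
    import torch.nn as nn

    class GPT1Sketch(nn.Module):
        """Illustrative unidirectional transformer in the GPT-1 configuration."""

        def __init__(self, vocab_size=40000, n_ctx=512, d_model=768,
                     n_head=12, n_layer=12):
            super().__init__()
            self.tok_emb = nn.Embedding(vocab_size, d_model)   # BPE token embeddings
            self.pos_emb = nn.Embedding(n_ctx, d_model)        # learned positional embeddings
            block = nn.TransformerEncoderLayer(
                d_model, n_head, dim_feedforward=4 * d_model,
                activation="gelu", batch_first=True)
            self.blocks = nn.TransformerEncoder(block, num_layers=n_layer)
            self.lm_head = nn.Linear(d_model, vocab_size, bias=False)

        def forward(self, tokens):  # tokens: (batch, seq) of BPE token ids
            seq_len = tokens.size(1)
            pos = torch.arange(seq_len, device=tokens.device)
            x = self.tok_emb(tokens) + self.pos_emb(pos)
            # Causal mask: -inf strictly above the diagonal, so position i
            # can only attend to positions <= i (left-to-right generation).
            mask = torch.triu(
                torch.full((seq_len, seq_len), float("-inf"), device=tokens.device),
                diagonal=1)
            x = self.blocks(x, mask=mask)
            return self.lm_head(x)  # next-token logits over the vocabulary

The next-token prediction objective then reduces to cross-entropy between these logits and the same sequence shifted one position left, for example:

    model = GPT1Sketch()
    tokens = torch.randint(0, 40000, (2, 64))    # dummy batch of BPE ids
    logits = model(tokens[:, :-1])               # predict token i+1 from tokens <= i
    loss = nn.functional.cross_entropy(
        logits.reshape(-1, logits.size(-1)), tokens[:, 1:].reshape(-1))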

Referenced by (1)

Full triples — surface form annotated when it differs from this entity's canonical label.

GPT series hasMember GPT-1