GPT-2
E18339
GPT-2 is a large transformer-based language model known for generating coherent, human-like text and sparking widespread discussion about the implications of advanced AI text generation.
All labels observed (3)
| Label | Occurrences |
|---|---|
| GPT-2 (canonical) | 6 |
| Language Models are Unsupervised Multitask Learners | 2 |
| Generative Pre-trained Transformer | 1 |
How this entity was disambiguated
This entity first appeared as the object of triple T146312 — resolving that mention is where its identity was fixed. The disambiguator weighed these candidate entities and picked the highlighted one (or “None”, minting a new entity). This is how homonymy is resolved: the same surface form can point to different entities.
Target entity: GPT-2
Context triple: [OpenAI, developed, GPT-2]
- A. DALL·E: DALL·E is an AI model developed by OpenAI that generates images from natural language descriptions, enabling text-to-image synthesis.
- B. OpenAI: OpenAI is an artificial intelligence research organization best known for developing advanced AI models such as ChatGPT and GPT series.
- C. Claude: Claude is a given name most famously associated with Claude Shannon, the American mathematician and electrical engineer known as the father of information theory.
- D. Google Brain: Google Brain is a deep learning research team at Google that pioneered many advances in neural networks and artificial intelligence.
- E. Element AI: Element AI was a Montreal-based artificial intelligence company and research lab known for developing enterprise AI solutions and advancing deep learning research.
- F. None of above. (chosen)
- G. Unsure - the case is ambiguous/there is not enough information to decide.
Target entity: GPT-2
Target entity description: GPT-2 is a large transformer-based language model known for generating coherent, human-like text and sparking widespread discussion about the implications of advanced AI text generation.
- A. DALL·E: DALL·E is an AI model developed by OpenAI that generates images from natural language descriptions, enabling text-to-image synthesis.
- B. OpenAI: OpenAI is an artificial intelligence research organization best known for developing advanced AI models such as ChatGPT and GPT series.
- C. Claude: Claude is a given name most famously associated with Claude Shannon, the American mathematician and electrical engineer known as the father of information theory.
- D. Google Brain: Google Brain is a deep learning research team at Google that pioneered many advances in neural networks and artificial intelligence.
- E. Element AI: Element AI was a Montreal-based artificial intelligence company and research lab known for developing enterprise AI solutions and advancing deep learning research.
- F. None of above. (chosen)
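The selection step can be pictured with a minimal Python sketch. It is an illustration under stated assumptions, not the pipeline's actual code: the `Entity` class, the `resolve_mention` function, the choice strings `"None"` and `"Unsure"`, and the candidate identifiers are all hypothetical. Only the label GPT-2, the identifier E18339, and the candidate names come from this page.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Entity:
    entity_id: str
    label: str
    description: str


def resolve_mention(surface: str, description: str,
                    candidates: Dict[str, Entity],
                    choice: str, new_id: str) -> Optional[Entity]:
    """Map a disambiguator choice to an entity (hypothetical helper).

    "None"   -> no candidate matches, so a new entity is minted.
    "Unsure" -> the case is ambiguous, so nothing is resolved.
    A letter -> the corresponding existing candidate is reused.
    """
    if choice == "None":
        return Entity(new_id, surface, description)
    if choice == "Unsure":
        return None
    return candidates[choice]


# Candidate identifiers below are placeholders, not real entity IDs.
candidates = {
    "A": Entity("E-hypothetical-1", "DALL·E", "text-to-image model"),
    "B": Entity("E-hypothetical-2", "OpenAI", "AI research organization"),
}

minted = resolve_mention(
    surface="GPT-2",
    description="large transformer-based language model",
    candidates=candidates,
    choice="None",
    new_id="E18339",
)
print(minted.entity_id, minted.label)  # prints "E18339 GPT-2"
```

In this sketch, choosing "None of above" is what creates the GPT-2 entity, which matches the outcome recorded for this mention.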
Statements (53)
| Predicate | Object |
|---|---|
| instanceOf | OpenAI model; autoregressive language model; large language model; neural network; transformer-based model |
| announcementDate | 2019-02-14 |
| architectureType | decoder-only transformer |
| basedOn | Transformer architecture |
| contextWindowSize | 1024 tokens |
| developer | OpenAI |
| evaluationSetting | zero-shot evaluation on downstream tasks |
| framework | TensorFlow |
| fullModelReleaseDate | 2019-11 |
| implementationLanguage | Python |
| influenced | subsequent large language models |
| initialReleasePolicy | partial model release; staged release |
| inputType | text |
| language | English |
| largestVersionParameters | 1.5B |
| license | OpenAI model license |
| notableFor | generating coherent long-form text; public debate about AI misuse risks; zero-shot performance on many NLP tasks |
| notableImpact | popularized large-scale transformer language models |
| openSourceImplementation | Hugging Face Transformers |
| outputType | text |
| paperAuthors | Alec Radford; Dario Amodei; David Luan; Ilya Sutskever; Jeff Wu; Rewon Child |
| paperTitle | GPT-2 (self-link; surface form: Language Models are Unsupervised Multitask Learners) |
| parameterCount | 1.5B; 117M; 345M; 774M |
| predecessor | GPT-3 (surface form: GPT) |
| releaseDate | 2019-02 |
| safetyConcern | potential for generating misinformation; potential for impersonation; potential for spam generation |
| successor | GPT-3 |
| task | language modeling; text completion; text generation; zero-shot learning |
| tokenizerType | Byte Pair Encoding |
| trainingDataSource | WebText dataset |
| trainingDataType | web pages |
| trainingObjective | next token prediction |
| trainingParadigm | unsupervised learning |
How these facts were elicited
The pipeline generated the facts above by prompting gpt-5.1 with this entity's name + description and the instruction below.
You are a knowledge base construction expert. Given a subject entity and a description of it, return factual statements that you know for the subject as a JSON list of dictionaries(triples), where keys must be "subject", "predicate" and "object". The number of facts may be very high, between 25 to 50 or more, for very popular subjects. For less popular subjects, the number of facts can be very low, like 5 or 10.
# Requirements
- If you don't know the subject at all, return an empty list.
- If the subject is not a named entity, return an empty list.
- Include at least one triple where predicate is "instanceOf".
- Do not get too wordy.
- Separate several objects into multiple triples with one object.

Subject: GPT-2
Description of subject: GPT-2 is a large transformer-based language model known for generating coherent, human-like text and sparking widespread discussion about the implications of advanced AI text generation.
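The output format requested by the instruction can be checked mechanically. The following Python sketch shows one way a response might be parsed and validated against the stated requirements. The function name `parse_triples` and the sample response are assumptions made for illustration; the page does not describe how the pipeline actually validates responses.

```python
import json

REQUIRED_KEYS = {"subject", "predicate", "object"}


def parse_triples(response_text):
    """Parse a JSON response into (subject, predicate, object) tuples.

    Hypothetical validator: returns [] for an empty list (unknown subject
    or not a named entity) and raises ValueError when a requirement from
    the instruction is violated.
    """
    data = json.loads(response_text)
    if not isinstance(data, list):
        raise ValueError("expected a JSON list of dictionaries")
    if not data:
        return []
    triples = []
    for item in data:
        # Each dictionary must have exactly the three required keys.
        if not isinstance(item, dict) or set(item) != REQUIRED_KEYS:
            raise ValueError(f"malformed triple: {item!r}")
        # One object per triple: a list of objects is rejected.
        if not isinstance(item["object"], str):
            raise ValueError(f"object must be a single string: {item!r}")
        triples.append((item["subject"], item["predicate"], item["object"]))
    if not any(predicate == "instanceOf" for _, predicate, _ in triples):
        raise ValueError("no triple with predicate 'instanceOf'")
    return triples


# Made-up sample response reusing three facts from the statements table.
sample = json.dumps([
    {"subject": "GPT-2", "predicate": "instanceOf", "object": "large language model"},
    {"subject": "GPT-2", "predicate": "developer", "object": "OpenAI"},
    {"subject": "GPT-2", "predicate": "parameterCount", "object": "1.5B"},
])
triples = parse_triples(sample)
print(len(triples))  # prints 3
```

Multi-valued predicates such as parameterCount would arrive as several triples that share a predicate, one per object, which is why the statements table groups several objects under a single predicate.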
Referenced by (9)
Full triples — surface form annotated when it differs from this entity's canonical label.