Triple

T848984
Position   Surface form        Disambiguated ID  Type / Status
Subject    GPT-2               E18339            entity
Predicate  trainingDataSource  P21073            FINISHED
Object     WebText dataset     E99319            NE FINISHED

Object description: The WebText dataset is a large-scale corpus of web pages curated by OpenAI to train language models like GPT-2 on diverse, high-quality internet text.

Provenance (7 batches)

Stage     Batch ID                                Job type           Status
creating  batch_69a4938b04208190b82e1df6b572c548  elicitation        completed
NER       batch_69a4ac1fac3481909cba7070ce31a9b3  ner                completed
NED1      batch_69a792a0666c8190bfc9166d45b4e867  ned_source_triple  completed
NED2      batch_69a7941add588190913198a7f7b20943  ned_description    completed
NEDg      batch_69a793563cc881909381f898f240c0bd  nedg               completed
PD        batch_69a4aa807adc8190ad808a573cf8e923  pd                 completed
PDg       batch_69a4abb157d08190a7d7281eb3f1b788  pdg                completed
Created at: March 1, 2026, 7:38 p.m.