Triple

T5460228
Position | Surface form | Disambiguated ID | Type / Status
Subject | Tajik language | E122576 | entity
Predicate | regulatesBy | P4784 | FINISHED
Object | Committee of Language and Terminology under the Government of Tajikistan | E520876 | NE FINISHED
Object description: The Committee of Language and Terminology under the Government of Tajikistan is a state body responsible for overseeing language policy, standardization, and official terminology for the Tajik language.

How this triple was built (4 steps)

Every LLM step that produced this triple, in pipeline order: named-entity classification, the disambiguation choices (the exact options shown, with the pick marked as chosen), and the generated description. The batch and timestamp of each step are listed in the Provenance table below.

NER | Named-entity recognition | gpt-5-mini
Instruction
Given a phrase, classify it is english named entity (e.g., persons, organizations, works of art) in Latin script, or not (e.g., literals, dates, URLs, verbose phrases). For disambiguation, the statement where the phrase occurs as object is also given. Please return a JSON object with `phrase` (string, the phrase being analyzed) and `is_ne` (boolean, indicating whether the phrase is a Named Entity).
Input
Phrase: Committee of Language and Terminology under the Government of Tajikistan | Statement: [Tajik language, regulatesBy, Committee of Language and Terminology under the Government of Tajikistan]
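The NER instruction asks for a JSON object with `phrase` (string) and `is_ne` (boolean). A minimal sketch of checking such a reply follows. The function name `parse_ner_reply` and the sample reply are hypothetical, since the record does not show the raw model output or how the pipeline validates it.

```python
import json
from typing import Tuple


def parse_ner_reply(raw: str) -> Tuple[str, bool]:
    """Parse a reply expected to hold `phrase` (str) and `is_ne` (bool)."""
    data = json.loads(raw)
    phrase = data["phrase"]
    is_ne = data["is_ne"]
    # Reject replies whose fields have the wrong JSON types.
    if not isinstance(phrase, str) or not isinstance(is_ne, bool):
        raise ValueError("unexpected field types in NER reply")
    return phrase, is_ne


# Hypothetical reply for illustration only, not the recorded model output.
sample_reply = (
    '{"phrase": "Committee of Language and Terminology under the '
    'Government of Tajikistan", "is_ne": true}'
)
phrase, is_ne = parse_ner_reply(sample_reply)
```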
NED1 | Entity disambiguation (via context triple) | gpt-5-mini-2025-08-07
Target entity: Committee of Language and Terminology under the Government of Tajikistan
Context triple: [Tajik language, regulatesBy, Committee of Language and Terminology under the Government of Tajikistan]
  • A. Tajik language
    The Tajik language is a variety of Persian spoken primarily in Tajikistan and written in the Cyrillic script.
  • B. Government of Tajikistan
    The Government of Tajikistan is the central executive authority of the Republic of Tajikistan, responsible for implementing laws, managing state affairs, and directing national policy.
  • C. Institute of Language and Literature of the Academy of Sciences of Turkmenistan
    The Institute of Language and Literature of the Academy of Sciences of Turkmenistan is a national scholarly institution responsible for researching, standardizing, and promoting the Turkmen language and its literary heritage.
  • D. Supreme Assembly of Tajikistan
    The Supreme Assembly of Tajikistan is the country’s national parliament and highest representative legislative body, responsible for making and passing laws.
  • E. Assembly of Representatives of Tajikistan
    The Assembly of Representatives of Tajikistan is the lower house of the country’s bicameral national parliament, responsible for drafting and passing legislation.
  • F. None of above. (chosen)
  • G. Unsure - the case is ambiguous/there is not enough information to decide.
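The option list above follows a simple pattern: candidates are lettered from A, then "None of above." is appended, and in NED1 an "Unsure" option follows it (the NED2 list in this record ends at F). A rough sketch of building such a list and resolving a picked letter is shown here. The helper names `build_options` and `resolve_choice` are hypothetical and are not taken from the pipeline.

```python
from string import ascii_uppercase
from typing import Dict, List, Optional

NONE_LABEL = "None of above."
UNSURE_LABEL = (
    "Unsure - the case is ambiguous/there is not enough information to decide."
)


def build_options(candidates: List[str], include_unsure: bool) -> Dict[str, str]:
    """Letter the candidates A, B, ... and append the fallback options."""
    labels = list(candidates) + [NONE_LABEL]
    if include_unsure:
        labels.append(UNSURE_LABEL)
    return dict(zip(ascii_uppercase, labels))


def resolve_choice(
    options: Dict[str, str], letter: str, n_candidates: int
) -> Optional[str]:
    """Return the picked candidate, or None when a fallback option was picked."""
    if letter not in options:
        raise KeyError(f"unknown option {letter!r}")
    position = ascii_uppercase.index(letter)
    return options[letter] if position < n_candidates else None


candidates = [
    "Tajik language",
    "Government of Tajikistan",
    "Institute of Language and Literature of the Academy of Sciences of Turkmenistan",
    "Supreme Assembly of Tajikistan",
    "Assembly of Representatives of Tajikistan",
]
options = build_options(candidates, include_unsure=True)
picked = resolve_choice(options, "F", len(candidates))
```

With the five candidates from this record, option F maps to "None of above.", so `picked` is `None`: no listed candidate was accepted as a match.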
NEDg | Description generation | gpt-5.1
Instruction
Generate a one-sentence description of the target entity. 
You are given a context triple in the form (subject, predicate, object), where the object is the target entity. 
# Instructions
Use the triple to infer relevant information about the entity. Describe the entity based on what is most defining, well-known. 
Avoid repeating the information from the triple, unless really essential.
# Response Format
Return only the sentence: "Description: [one-sentence description of the target entity]"
Input
Entity: Committee of Language and Terminology under the Government of Tajikistan
Triple: [Tajik language, regulatesBy, Committee of Language and Terminology under the Government of Tajikistan]
Generated description
The Committee of Language and Terminology under the Government of Tajikistan is a state body responsible for overseeing language policy, standardization, and official terminology for the Tajik language.
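The response format asks for a sentence prefixed with "Description:", while the generated description is shown without that prefix. One plausible way to get from one to the other is sketched here, assuming the prefix is simply stripped. The function name `extract_description` is hypothetical, and the record does not confirm how the pipeline does this.

```python
PREFIX = "Description:"


def extract_description(reply: str) -> str:
    """Strip an optional leading 'Description:' label and outer whitespace."""
    text = reply.strip()
    if text.startswith(PREFIX):
        text = text[len(PREFIX):].strip()
    return text


# Hypothetical raw reply built from the description shown in this record.
raw_reply = (
    "Description: The Committee of Language and Terminology under the "
    "Government of Tajikistan is a state body responsible for overseeing "
    "language policy, standardization, and official terminology for the "
    "Tajik language."
)
description = extract_description(raw_reply)
```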
NED2 | Entity disambiguation (via description) | gpt-5-mini-2025-08-07
Target entity: Committee of Language and Terminology under the Government of Tajikistan
Target entity description: The Committee of Language and Terminology under the Government of Tajikistan is a state body responsible for overseeing language policy, standardization, and official terminology for the Tajik language.
  • A. Tajik language
    The Tajik language is a variety of Persian spoken primarily in Tajikistan and written in the Cyrillic script.
  • B. Government of Tajikistan
    The Government of Tajikistan is the central executive authority of the Republic of Tajikistan, responsible for implementing laws, managing state affairs, and directing national policy.
  • C. Institute of Language and Literature of the Academy of Sciences of Turkmenistan
    The Institute of Language and Literature of the Academy of Sciences of Turkmenistan is a national scholarly institution responsible for researching, standardizing, and promoting the Turkmen language and its literary heritage.
  • D. Supreme Assembly of Tajikistan
    The Supreme Assembly of Tajikistan is the country’s national parliament and highest representative legislative body, responsible for making and passing laws.
  • E. Assembly of Representatives of Tajikistan
    The Assembly of Representatives of Tajikistan is the lower house of the country’s bicameral national parliament, responsible for drafting and passing legislation.
  • F. None of above. (chosen)

Provenance (5 batches)

The batch behind each pipeline step, in order, with when it ran. Timestamps are batch-level: stages were processed in waves, so the object chain (NER → NED1 → NEDg → NED2) reads in chronological order, but predicate and elicitation batches can fall in a different wave.

Step | Stage | Batch ID | Status | When
creating | Elicitation | batch_69bd46424248819085282ddf50a565f3 | completed | March 20, 2026, 1:06 p.m.
NER | Named-entity recognition | batch_69bd9200a3988190a06f253f99e68224 | completed | March 20, 2026, 6:29 p.m.
NED1 | Entity disambiguation (via context triple) | batch_69bf414c39a4819098f2862f3c4594c0 | completed | March 22, 2026, 1:09 a.m.
NEDg | Description generation | batch_69bf4207f4e4819096709c49fe001005 | completed | March 22, 2026, 1:12 a.m.
NED2 | Entity disambiguation (via description) | batch_69bf42d7b4b88190954084985134a8c2 | completed | March 22, 2026, 1:16 a.m.
Created at: March 20, 2026, 2:08 p.m.
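The claim that the object chain (NER → NED1 → NEDg → NED2) reads in order can be checked from the batch timestamps. The sketch below assumes every timestamp has the "Month D, YYYY, H:MM a.m./p.m." shape seen in this record and that month names are parsed in an English locale. The function name `parse_batch_time` is hypothetical.

```python
from datetime import datetime


def parse_batch_time(text: str) -> datetime:
    """Parse a timestamp such as 'March 20, 2026, 6:29 p.m.'."""
    # strptime's %p expects AM/PM, so normalize the dotted lowercase form first.
    normalized = text.replace("a.m.", "AM").replace("p.m.", "PM")
    return datetime.strptime(normalized, "%B %d, %Y, %I:%M %p")


# Object-chain rows copied from the Provenance table.
object_chain = [
    ("NER", "March 20, 2026, 6:29 p.m."),
    ("NED1", "March 22, 2026, 1:09 a.m."),
    ("NEDg", "March 22, 2026, 1:12 a.m."),
    ("NED2", "March 22, 2026, 1:16 a.m."),
]

times = [parse_batch_time(when) for _, when in object_chain]
# True when each batch ran no earlier than the one before it.
in_order = all(earlier <= later for earlier, later in zip(times, times[1:]))
```

For the four rows above, `in_order` evaluates to `True`. The elicitation batch is left out because its wave is not tied to the object chain's order.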