Triple

T11169211
Position Surface form Disambiguated ID Type / Status
Subject Bora language E264231 entity
Predicate languageFamily P1047 FINISHED
Object Boran languages
The Boran languages are a small group of closely related indigenous languages spoken primarily in the northwestern Amazon region of South America.
E908811 NE FINISHED

How this triple was built (4 steps)

Every LLM step that produced this triple, in pipeline order — named-entity classification, the disambiguation choices (the exact options shown, with the pick highlighted), and the generated description. The batch + timestamp of each is in the Provenance table below.

NER Named-entity recognition gpt-5-mini
Instruction
Given a phrase, classify it is english named entity (e.g., persons, organizations, works of art) in Latin script, or not (e.g., literals, dates, URLs, verbose phrases). For disambiguation, the statement where the phrase occurs as object is also given. Please return a JSON object with `phrase` (string, the phrase being analyzed) and `is_ne` (boolean, indicating whether the phrase is a Named Entity).
Input
Phrase: Boran languages | Statement: [Bora language, languageFamily, Boran languages]
NED1 Entity disambiguation (via context triple) gpt-5-mini-2025-08-07
Target entity: Boran languages
Context triple: [Bora language, languageFamily, Boran languages]
  • A. Avar-Andic languages
    The Avar-Andic languages are a branch of the Northeast Caucasian language family spoken primarily in Dagestan and neighboring regions of the eastern Caucasus.
  • B. Katuic languages
    Katuic languages are a branch of the Austroasiatic language family spoken primarily in Laos, Vietnam, and neighboring regions by various indigenous ethnic groups.
  • C. Sara–Bagirmi languages
    The Sara–Bagirmi languages are a group of closely related Central Sudanic languages spoken primarily in Chad and neighboring regions of Central Africa.
  • D. Sami languages
    Sami languages are a group of Uralic Indigenous languages spoken by the Sámi people across northern parts of Norway, Sweden, Finland, and Russia.
  • E. Teda–Daza languages
    The Teda–Daza languages are a closely related pair of Saharan languages spoken primarily by the Toubou people across Chad, Niger, and Libya.
  • F. None of above. chosen
  • G. Unsure - the case is ambiguous/there is not enough information to decide.
NEDg Description generation gpt-5.1
Instruction
Generate a one-sentence description of the target entity. 
You are given a context triple in the form (subject, predicate, object), where the object is the target entity. 
# Instructions
Use the triple to infer relevant information about the entity. Describe the entity based on what is most defining, well-known. 
Avoid repeating the information from the triple, unless really essential.
# Response Format
Return only the sentence: "Description: [one-sentence description of the target entity]"
Input
Entity: Boran languages
Triple: [Bora language, languageFamily, Boran languages]
Generated description
The Boran languages are a small group of closely related indigenous languages spoken primarily in the northwestern Amazon region of South America.
NED2 Entity disambiguation (via description) gpt-5-mini-2025-08-07
Target entity: Boran languages
Target entity description: The Boran languages are a small group of closely related indigenous languages spoken primarily in the northwestern Amazon region of South America.
  • A. Avar-Andic languages
    The Avar-Andic languages are a branch of the Northeast Caucasian language family spoken primarily in Dagestan and neighboring regions of the eastern Caucasus.
  • B. Katuic languages
    Katuic languages are a branch of the Austroasiatic language family spoken primarily in Laos, Vietnam, and neighboring regions by various indigenous ethnic groups.
  • C. Sara–Bagirmi languages
    The Sara–Bagirmi languages are a group of closely related Central Sudanic languages spoken primarily in Chad and neighboring regions of Central Africa.
  • D. Sami languages
    Sami languages are a group of Uralic Indigenous languages spoken by the Sámi people across northern parts of Norway, Sweden, Finland, and Russia.
  • E. Teda–Daza languages
    The Teda–Daza languages are a closely related pair of Saharan languages spoken primarily by the Toubou people across Chad, Niger, and Libya.
  • F. None of above. chosen

Provenance (5 batches)

The batch behind each pipeline step, in order, with when it ran. Timestamps are batch-level — stages were processed in waves, so the object chain (NER → NED1 → NEDg → NED2) reads in order, but predicate / elicitation batches can sit in a different wave.

Step Stage Batch ID Status When
creating Elicitation batch_69d6aa9dafac8190bd90d2c74f661aa7 completed April 8, 2026, 7:21 p.m.
NER Named-entity recognition batch_69d7e8952e248190b0751669e8c960b7 completed April 9, 2026, 5:57 p.m.
NED1 Entity disambiguation (via context triple) batch_69e463a2e2fc819086126b681d86e94b completed April 19, 2026, 5:09 a.m.
NEDg Description generation batch_69e46c37efec81908aa709587c37569d completed April 19, 2026, 5:46 a.m.
NED2 Entity disambiguation (via description) batch_69e47292cdd08190b05c4c8b09f4f918 completed April 19, 2026, 6:13 a.m.
Created at: April 8, 2026, 9:29 p.m.