Triple

T9212751
Position | Surface form | Disambiguated ID | Type / Status
Subject | Aru Islands | E221165 | entity
Predicate | hasLanguage | P15 | FINISHED
Object | Aru languages | E159656 | NE FINISHED

How this triple was built (2 steps)

Every LLM step that produced this triple, in pipeline order — named-entity classification, the disambiguation choices (the exact options shown, with the pick highlighted), and the generated description. The batch + timestamp of each is in the Provenance table below.

NER Named-entity recognition gpt-5-mini
Instruction
Given a phrase, classify it is english named entity (e.g., persons, organizations, works of art) in Latin script, or not (e.g., literals, dates, URLs, verbose phrases). For disambiguation, the statement where the phrase occurs as object is also given. Please return a JSON object with `phrase` (string, the phrase being analyzed) and `is_ne` (boolean, indicating whether the phrase is a Named Entity).
Input
Phrase: Aru languages | Statement: [Aru Islands, hasLanguage, Aru languages]
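
The instruction above asks for a JSON object with two fields, `phrase` and `is_ne`. As a minimal sketch of how such a reply could be checked, assuming the model returns plain JSON text: the helper name `parse_ner_response` and the sample reply are hypothetical and are not taken from the batch output. The sample sets `is_ne` to true only because the Object row of this triple is typed NE; the raw reply itself is not shown on this page.

```python
import json


def parse_ner_response(raw: str) -> tuple[str, bool]:
    """Parse a reply and check the two fields the instruction asks for."""
    data = json.loads(raw)
    phrase = data["phrase"]
    is_ne = data["is_ne"]
    if not isinstance(phrase, str):
        raise TypeError("phrase must be a string")
    if not isinstance(is_ne, bool):
        raise TypeError("is_ne must be a boolean")
    return phrase, is_ne


# Hypothetical reply for illustration; not copied from the batch output.
example_reply = '{"phrase": "Aru languages", "is_ne": true}'
result = parse_ner_response(example_reply)
print(result)
```

A reply that is missing a field raises `KeyError`, and one with a wrongly typed field raises `TypeError`, so malformed output fails loudly instead of passing through silently.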
NED1 Entity disambiguation (via context triple) gpt-5-mini-2025-08-07
Target entity: Aru languages
Context triple: [Aru Islands, hasLanguage, Aru languages]
  • A. Aru languages (chosen)
    Aru languages are a group of closely related Austronesian languages spoken primarily on the Aru Islands in eastern Indonesia.
  • B. Pearic languages
    Pearic languages are a small, endangered branch of the Austroasiatic language family spoken by indigenous Pearic communities in Cambodia and nearby regions.
  • C. Daju languages
    The Daju languages are a small group of closely related Eastern Sudanic languages spoken primarily in parts of Sudan and Chad.
  • D. Katuic languages
    Katuic languages are a branch of the Austroasiatic language family spoken primarily in Laos, Vietnam, and neighboring regions by various indigenous ethnic groups.
  • E. Batanic languages
    Batanic languages are a small subgroup of Austronesian languages spoken primarily in the Batanes Islands of the northern Philippines and parts of Taiwan, known for their unique phonological and lexical features.
  • F. None of above.
  • G. Unsure - the case is ambiguous/there is not enough information to decide.

Provenance (3 batches)

The batch behind each pipeline step, in order, with when it ran. Timestamps are batch-level — stages were processed in waves, so the object chain (NER → NED1 → NEDg → NED2) reads in order, but predicate / elicitation batches can sit in a different wave.

Step | Stage | Batch ID | Status | When
creating | Elicitation | batch_69ca83e9d0e081908bdb71097201a06c | completed | March 30, 2026, 2:08 p.m.
NER | Named-entity recognition | batch_69ccda05406081909893bec3a092d3ce | completed | April 1, 2026, 8:40 a.m.
NED1 | Entity disambiguation (via context triple) | batch_69d0660839f88190afdfb8bc2d710fc3 | completed | April 4, 2026, 1:14 a.m.
Created at: March 30, 2026, 7:27 p.m.
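
The provenance note says the object chain reads in order by batch timestamp. A rough sketch of how that could be checked for this triple, assuming the timestamps are parsed from the display format used in the table: the names `parse_when` and `object_chain` are hypothetical, and the two timestamps are copied from the NER and NED1 rows above.

```python
from datetime import datetime


def parse_when(text: str) -> datetime:
    """Parse a display timestamp such as 'April 1, 2026, 8:40 a.m.'."""
    # strptime's %p expects AM/PM, so normalise the dotted forms first.
    normalized = text.replace("a.m.", "AM").replace("p.m.", "PM")
    return datetime.strptime(normalized, "%B %d, %Y, %I:%M %p")


# Object-chain steps for this triple, in pipeline order.
object_chain = [
    ("NER", "April 1, 2026, 8:40 a.m."),
    ("NED1", "April 4, 2026, 1:14 a.m."),
]

times = [parse_when(when) for _, when in object_chain]
# True when each step's batch ran no earlier than the previous step's batch.
in_order = all(earlier <= later for earlier, later in zip(times, times[1:]))
print(in_order)
```

Because the timestamps are batch-level, this only compares when the batches ran, not when the individual triple was processed within a batch.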