Triple
T9212751
| Position | Surface form | Disambiguated ID | Type / Status |
|---|---|---|---|
| Subject | Aru Islands | E221165 | entity |
| Predicate | hasLanguage | P15 | FINISHED |
| Object | Aru languages | E159656 | NE FINISHED |
How this triple was built (2 steps)
Every LLM step that produced this triple, in pipeline order — named-entity classification, the disambiguation choices (the exact options shown, with the pick highlighted), and the generated description. The batch + timestamp of each is in the Provenance table below.
NER
Named-entity recognition
gpt-5-mini
Instruction
Given a phrase, classify it is english named entity (e.g., persons, organizations, works of art) in Latin script, or not (e.g., literals, dates, URLs, verbose phrases). For disambiguation, the statement where the phrase occurs as object is also given. Please return a JSON object with `phrase` (string, the phrase being analyzed) and `is_ne` (boolean, indicating whether the phrase is a Named Entity).
Input
Phrase: Aru languages | Statement: [Aru Islands, hasLanguage, Aru languages]
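The instruction above asks for a JSON object with a string `phrase` and a boolean `is_ne`. A minimal sketch of how such a reply could be validated, assuming it arrives as a raw JSON string; the function name `parse_ner_response` and the sample reply are hypothetical and are not taken from the pipeline itself:

```python
import json


def parse_ner_response(raw: str, expected_phrase: str) -> bool:
    """Check the reply's shape and return its is_ne flag (hypothetical helper)."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("reply must be a JSON object")
    phrase = data.get("phrase")
    is_ne = data.get("is_ne")
    if not isinstance(phrase, str):
        raise ValueError("'phrase' must be a string")
    if not isinstance(is_ne, bool):
        raise ValueError("'is_ne' must be a boolean")
    if phrase != expected_phrase:
        raise ValueError("reply refers to a different phrase")
    return is_ne


# Hypothetical reply, written for illustration only.
sample_reply = '{"phrase": "Aru languages", "is_ne": true}'
is_named_entity = parse_ner_response(sample_reply, "Aru languages")
```

Rejecting replies whose `phrase` does not match the input guards against a model answering about a different string than the one it was given.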
NED1
Entity disambiguation (via context triple)
gpt-5-mini-2025-08-07
Target entity: Aru languages
Context triple: [Aru Islands, hasLanguage, Aru languages]
- A. Aru languages (chosen): Aru languages are a group of closely related Austronesian languages spoken primarily on the Aru Islands in eastern Indonesia.
- B. Pearic languages: Pearic languages are a small, endangered branch of the Austroasiatic language family spoken by indigenous Pearic communities in Cambodia and nearby regions.
- C. Daju languages: The Daju languages are a small group of closely related Eastern Sudanic languages spoken primarily in parts of Sudan and Chad.
- D. Katuic languages: Katuic languages are a branch of the Austroasiatic language family spoken primarily in Laos, Vietnam, and neighboring regions by various indigenous ethnic groups.
- E. Batanic languages: Batanic languages are a small subgroup of Austronesian languages spoken primarily in the Batanes Islands of the northern Philippines and parts of Taiwan, known for their unique phonological and lexical features.
- F. None of above.
- G. Unsure - the case is ambiguous/there is not enough information to decide.
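The option list above pairs five candidate entities (A to E) with two abstention choices (F and G). A minimal sketch of how a returned letter could be mapped back to a candidate, assuming the model answers with a single letter; the names `CANDIDATES` and `resolve_choice` are hypothetical, and the candidate labels are copied from the list above:

```python
from typing import Optional

# Candidate labels as shown in the option list above.
CANDIDATES = {
    "A": "Aru languages",
    "B": "Pearic languages",
    "C": "Daju languages",
    "D": "Katuic languages",
    "E": "Batanic languages",
}
# F = "None of above", G = "Unsure"; neither links the phrase to an entity.
ABSTAIN = {"F", "G"}


def resolve_choice(letter: str) -> Optional[str]:
    """Map an answer letter to a candidate label, or None for F/G (hypothetical helper)."""
    key = letter.strip().rstrip(".").upper()
    if key in CANDIDATES:
        return CANDIDATES[key]
    if key in ABSTAIN:
        return None
    raise ValueError(f"unknown option: {letter!r}")


picked = resolve_choice("A")
```

Treating F and G as `None` keeps "no match" and "not enough information" out of the entity link, although a real pipeline might want to record which of the two was selected.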
Provenance (3 batches)
The batch behind each pipeline step, in order, with when it ran. Timestamps are batch-level — stages were processed in waves, so the object chain (NER → NED1 → NEDg → NED2) reads in order, but predicate / elicitation batches can sit in a different wave.
| Step | Stage | Batch ID | Status | When |
|---|---|---|---|---|
| creating | Elicitation | batch_69ca83e9d0e081908bdb71097201a06c | completed | March 30, 2026, 2:08 p.m. |
| NER | Named-entity recognition | batch_69ccda05406081909893bec3a092d3ce | completed | April 1, 2026, 8:40 a.m. |
| NED1 | Entity disambiguation (via context triple) | batch_69d0660839f88190afdfb8bc2d710fc3 | completed | April 4, 2026, 1:14 a.m. |
Created at: March 30, 2026, 7:27 p.m.
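The provenance note says the batch timestamps read in order for this triple. A minimal sketch of checking that, assuming the timestamps keep the `Month D, YYYY, H:MM a.m./p.m.` form shown in the table and an English locale for month names; `parse_when` and `is_chronological` are hypothetical helpers:

```python
from datetime import datetime


def parse_when(text: str) -> datetime:
    """Parse a timestamp such as 'March 30, 2026, 2:08 p.m.' (hypothetical helper)."""
    # strptime's %p expects AM/PM, so normalize the dotted form first.
    normalized = text.replace("a.m.", "AM").replace("p.m.", "PM")
    return datetime.strptime(normalized, "%B %d, %Y, %I:%M %p")


# Step names and timestamps copied from the Provenance table.
BATCHES = [
    ("creating", "March 30, 2026, 2:08 p.m."),
    ("NER", "April 1, 2026, 8:40 a.m."),
    ("NED1", "April 4, 2026, 1:14 a.m."),
]


def is_chronological(batches) -> bool:
    """Return True when each batch ran no earlier than the one before it."""
    times = [parse_when(when) for _, when in batches]
    return all(earlier <= later for earlier, later in zip(times, times[1:]))


in_order = is_chronological(BATCHES)
```

The sketch handles only the exact format that appears on this page; timestamps written without minutes or as words such as "noon" would need extra handling.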