Triple

T19773902
Position Surface form Disambiguated ID Type / Status
Subject CSD System E474955 entity
Predicate includesComponent P1393 FINISHED
Object Cambridge Structural Database NE NERFINISHED

How this triple was built (2 steps)

Every LLM step that produced this triple, in pipeline order: named-entity classification, the disambiguation choices (the exact options shown, with the pick marked), and the generated description. The batch and timestamp of each step are listed in the Provenance table below.

NER Named-entity recognition gpt-5-mini
Instruction
Given a phrase, classify it is english named entity (e.g., persons, organizations, works of art) in Latin script, or not (e.g., literals, dates, URLs, verbose phrases). For disambiguation, the statement where the phrase occurs as object is also given. Please return a JSON object with `phrase` (string, the phrase being analyzed) and `is_ne` (boolean, indicating whether the phrase is a Named Entity).
Input
Phrase: Cambridge Structural Database | Statement: [CSD System, includesComponent, Cambridge Structural Database]
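The instruction above asks for a JSON object with a string `phrase` and a boolean `is_ne`. A minimal sketch of how such a response could be checked is shown here. The function name `parse_ner_response` and the example response string are hypothetical; the model's actual output is not recorded on this page.

```python
import json


def parse_ner_response(raw: str, expected_phrase: str) -> bool:
    """Check a response against the shape the instruction asks for; return is_ne."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("response must be a JSON object")
    phrase = data.get("phrase")
    is_ne = data.get("is_ne")
    if not isinstance(phrase, str):
        raise ValueError("`phrase` must be a string")
    # bool is checked explicitly so that 0/1 or "true" are rejected.
    if not isinstance(is_ne, bool):
        raise ValueError("`is_ne` must be a boolean")
    if phrase != expected_phrase:
        raise ValueError("`phrase` does not echo the input phrase")
    return is_ne


# Illustrative response only (an assumption, not the recorded model output).
example = '{"phrase": "Cambridge Structural Database", "is_ne": true}'
result = parse_ner_response(example, "Cambridge Structural Database")
print(result)  # True
```

Rejecting non-boolean values for `is_ne` is a design choice of this sketch; a more lenient pipeline might coerce them instead.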
NED1 Entity disambiguation (via context triple) gpt-5-mini-2025-08-07
Target entity: Cambridge Structural Database
Context triple: [CSD System, includesComponent, Cambridge Structural Database]
  • A. Cambridge Structural Database (chosen)
    The Cambridge Structural Database is a comprehensive repository of small-molecule organic and metal-organic crystal structures used worldwide for research in chemistry, materials science, and related fields.
  • B. Cambridge Structural Database CrossMiner
    Cambridge Structural Database CrossMiner is a cheminformatics tool that enables interactive 3D substructure and pharmacophore searching across the Cambridge Structural Database and related structural datasets.
  • C. Cambridge Crystallographic Data Centre
    The Cambridge Crystallographic Data Centre is a leading research organization that curates and distributes the world’s primary database of small-molecule organic and metal-organic crystal structures for the scientific community.
  • D. Crystallographic Information Framework
    The Crystallographic Information Framework is a standardized data format and ontology used worldwide for representing, exchanging, and archiving crystallographic and related structural science data.
  • E. IUCrData
    IUCrData is an open-access crystallography data journal published by the International Union of Crystallography, focusing on brief reports of crystal structure determinations and related datasets.
  • F. None of above.
  • G. Unsure - the case is ambiguous/there is not enough information to decide.
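The option list above follows a lettered multiple-choice layout: candidate entities first, then two fallback answers. The following sketch shows one way such a list could be labelled and a picked letter mapped back to a candidate. The helper names `label_options` and `resolve_choice` are hypothetical and do not describe the pipeline's actual code.

```python
from string import ascii_uppercase

# Fallback answers, copied from the option list on this page.
FALLBACKS = [
    "None of above.",
    "Unsure - the case is ambiguous/there is not enough information to decide.",
]


def label_options(candidates):
    """Assign letters A, B, C, ... to the candidates followed by the fallbacks."""
    options = list(candidates) + FALLBACKS
    return dict(zip(ascii_uppercase, options))


def resolve_choice(labelled, letter, n_candidates):
    """Return the chosen candidate name, or None when a fallback was picked."""
    letter = letter.strip().upper()
    if letter not in labelled:
        raise ValueError(f"unknown option: {letter!r}")
    index = ascii_uppercase.index(letter)
    return labelled[letter] if index < n_candidates else None


candidates = [
    "Cambridge Structural Database",
    "Cambridge Structural Database CrossMiner",
    "Cambridge Crystallographic Data Centre",
    "Crystallographic Information Framework",
    "IUCrData",
]
labelled = label_options(candidates)
print(resolve_choice(labelled, "A", len(candidates)))  # Cambridge Structural Database
print(resolve_choice(labelled, "F", len(candidates)))  # None
```

Returning `None` for options F and G keeps "no match" and "unsure" from being linked to any entity; a real pipeline may want to distinguish the two.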

Provenance (2 batches)

The batch behind each pipeline step, in order, with when it ran. Timestamps are batch-level: stages were processed in waves, so the object chain (NER → NED1 → NEDg → NED2) reads in order, but predicate and elicitation batches can sit in a different wave.

Step | Stage | Batch ID | Status | When
creating | Elicitation | batch_69d8e51a43a08190956bc6df13c91a77 | completed | April 10, 2026, 11:55 a.m.
NER | Named-entity recognition | batch_69e6535e450c8190a2628245ae0d0bd3 | completed | April 20, 2026, 4:25 p.m.
Created at: April 10, 2026, 1:48 p.m.