Triple
T19773902
| Position | Surface form | Disambiguated ID | Type / Status |
|---|---|---|---|
| Subject | CSD System | E474955 | entity |
| Predicate | includesComponent | P1393 | FINISHED |
| Object | Cambridge Structural Database | — | NE NERFINISHED |
How this triple was built (2 steps)
Every LLM step that produced this triple, in pipeline order: named-entity classification, the disambiguation choices (the exact options shown, with the pick highlighted), and the generated description. The batch and timestamp of each step are in the Provenance table below.
NER
Named-entity recognition
gpt-5-mini
Instruction
Given a phrase, classify it is english named entity (e.g., persons, organizations, works of art) in Latin script, or not (e.g., literals, dates, URLs, verbose phrases). For disambiguation, the statement where the phrase occurs as object is also given. Please return a JSON object with `phrase` (string, the phrase being analyzed) and `is_ne` (boolean, indicating whether the phrase is a Named Entity).
Input
Phrase: Cambridge Structural Database | Statement: [CSD System, includesComponent, Cambridge Structural Database]
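The instruction above asks for a JSON object with `phrase` (string) and `is_ne` (boolean). A minimal sketch of how such a reply might be validated, assuming the model returns a bare JSON string; the helper name `parse_ner_response` and the example reply are hypothetical and are not taken from the pipeline:

```python
import json
from typing import Tuple


def parse_ner_response(raw: str) -> Tuple[str, bool]:
    """Parse a reply of the shape the instruction requests (hypothetical helper)."""
    data = json.loads(raw)
    phrase = data["phrase"]
    is_ne = data["is_ne"]
    # Enforce the two field types named in the instruction.
    if not isinstance(phrase, str):
        raise ValueError("phrase must be a string")
    if not isinstance(is_ne, bool):
        raise ValueError("is_ne must be a boolean")
    return phrase, is_ne


# Illustrative reply only; the actual model output is not recorded on this page.
example_reply = '{"phrase": "Cambridge Structural Database", "is_ne": true}'
print(parse_ner_response(example_reply))
```

Checking the type of `is_ne` explicitly guards against replies such as `"true"` (a string), which would otherwise pass a simple truthiness test.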
NED1
Entity disambiguation (via context triple)
gpt-5-mini-2025-08-07
Target entity: Cambridge Structural Database
Context triple: [CSD System, includesComponent, Cambridge Structural Database]
- A. Cambridge Structural Database (chosen): The Cambridge Structural Database is a comprehensive repository of small-molecule organic and metal-organic crystal structures used worldwide for research in chemistry, materials science, and related fields.
- B. Cambridge Structural Database CrossMiner: Cambridge Structural Database CrossMiner is a cheminformatics tool that enables interactive 3D substructure and pharmacophore searching across the Cambridge Structural Database and related structural datasets.
- C. Cambridge Crystallographic Data Centre: The Cambridge Crystallographic Data Centre is a leading research organization that curates and distributes the world’s primary database of small-molecule organic and metal-organic crystal structures for the scientific community.
- D. Crystallographic Information Framework: The Crystallographic Information Framework is a standardized data format and ontology used worldwide for representing, exchanging, and archiving crystallographic and related structural science data.
- E. IUCrData: IUCrData is an open-access crystallography data journal published by the International Union of Crystallography, focusing on brief reports of crystal structure determinations and related datasets.
- F. None of above.
- G. Unsure - the case is ambiguous/there is not enough information to decide.
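The option list above can be read as a mapping from a letter to a candidate entity, with F and G meaning that no candidate is linked. A minimal sketch of resolving a pick, assuming the model answers with a single option letter (the page does not state the reply format); `CANDIDATES` and `resolve_choice` are hypothetical names:

```python
from typing import Optional

# Candidate labels copied from options A to E shown for this triple.
CANDIDATES = {
    "A": "Cambridge Structural Database",
    "B": "Cambridge Structural Database CrossMiner",
    "C": "Cambridge Crystallographic Data Centre",
    "D": "Crystallographic Information Framework",
    "E": "IUCrData",
}
# F = "None of above", G = "Unsure": neither links the object to an entity.
NO_LINK = {"F", "G"}


def resolve_choice(letter: str) -> Optional[str]:
    """Return the chosen candidate label, or None when no entity is linked."""
    key = letter.strip().upper()
    if key in NO_LINK:
        return None
    if key not in CANDIDATES:
        raise ValueError(f"unknown option: {letter!r}")
    return CANDIDATES[key]


print(resolve_choice("A"))  # the option marked as chosen on this page
```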
Provenance (2 batches)
The batch behind each pipeline step, in order, with when it ran. Timestamps are batch-level: stages were processed in waves, so the object chain (NER → NED1 → NEDg → NED2) reads in order, but predicate and elicitation batches can sit in a different wave.
| Step | Stage | Batch ID | Status | When |
|---|---|---|---|---|
| creating | Elicitation | batch_69d8e51a43a08190956bc6df13c91a77 | completed | April 10, 2026, 11:55 a.m. |
| NER | Named-entity recognition | batch_69e6535e450c8190a2628245ae0d0bd3 | completed | April 20, 2026, 4:25 p.m. |
Created at: April 10, 2026, 1:48 p.m.