Triple
T14393838
| Position | Surface form | Disambiguated ID | Type / Status |
|---|---|---|---|
| Subject | Jon Kleinberg | E356907 | entity |
| Predicate | notableWork | P4 | FINISHED |
| Object | HITS algorithm | E336027 | NE FINISHED |
How this triple was built (2 steps)
Every LLM step that produced this triple, in pipeline order — named-entity classification, the disambiguation choices (the exact options shown, with the pick highlighted), and the generated description. The batch + timestamp of each is in the Provenance table below.
NER
Named-entity recognition
gpt-5-mini
Instruction
Given a phrase, classify it is english named entity (e.g., persons, organizations, works of art) in Latin script, or not (e.g., literals, dates, URLs, verbose phrases). For disambiguation, the statement where the phrase occurs as object is also given. Please return a JSON object with `phrase` (string, the phrase being analyzed) and `is_ne` (boolean, indicating whether the phrase is a Named Entity).
Input
Phrase: HITS algorithm | Statement: [Jon Kleinberg, notableWork, HITS algorithm]
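The instruction above asks for a JSON object with a string `phrase` and a boolean `is_ne`. A minimal sketch of how such a response might be validated follows. The function name `parse_ner_response` and the example response string are hypothetical; the model's actual output is not shown on this page.

```python
import json


def parse_ner_response(raw: str, expected_phrase: str) -> bool:
    """Validate a NER response and return its `is_ne` flag.

    Hypothetical helper: checks only the two fields named in the instruction.
    """
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("response must be a JSON object")

    phrase = data.get("phrase")
    is_ne = data.get("is_ne")

    if not isinstance(phrase, str):
        raise ValueError("`phrase` must be a string")
    # bool is checked explicitly so that values like "true" or 1 are rejected
    if not isinstance(is_ne, bool):
        raise ValueError("`is_ne` must be a boolean")
    if phrase != expected_phrase:
        raise ValueError("`phrase` does not match the phrase that was sent")

    return is_ne


# Assumed example response, not taken from the pipeline logs
example = '{"phrase": "HITS algorithm", "is_ne": true}'
print(parse_ner_response(example, "HITS algorithm"))
```

Comparing the echoed `phrase` with the one that was sent is a design choice made here to catch responses that were matched to the wrong input. Whether the pipeline performs this check is not stated.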
NED1
Entity disambiguation (via context triple)
gpt-5-mini-2025-08-07
Target entity: HITS algorithm
Context triple: [Jon Kleinberg, notableWork, HITS algorithm]
- A. HITS algorithm (chosen): The HITS algorithm is a link analysis method that ranks web pages by separately evaluating their authority and hub scores based on the structure of hyperlinks.
- B. PageRank algorithm: The PageRank algorithm is a link analysis method used by search engines, notably Google, to rank web pages in search results based on their importance within the web’s link structure.
- C. The Anatomy of a Large-Scale Hypertextual Web Search Engine: "The Anatomy of a Large-Scale Hypertextual Web Search Engine" is a seminal research paper by Sergey Brin and Larry Page that introduced the design and PageRank algorithm behind the early Google search engine.
- D. Authoritative sources in a hyperlinked environment: "Authoritative sources in a hyperlinked environment" is the seminal research paper by Jon Kleinberg that introduced the HITS algorithm for identifying authoritative and hub pages on the web graph.
- E. Tarjan's strongly connected components algorithm: Tarjan's strongly connected components algorithm is a classic linear-time graph algorithm that efficiently identifies all strongly connected components in a directed graph using depth-first search and low-link values.
- F. None of above.
- G. Unsure - the case is ambiguous/there is not enough information to decide.
Provenance (3 batches)
The batch behind each pipeline step, in order, with when it ran. Timestamps are batch-level — stages were processed in waves, so the object chain (NER → NED1 → NEDg → NED2) reads in order, but predicate / elicitation batches can sit in a different wave.
| Step | Stage | Batch ID | Status | When |
|---|---|---|---|---|
| creating | Elicitation | batch_69d827927c988190ad98bb0360981783 | completed | April 9, 2026, 10:26 p.m. |
| NER | Named-entity recognition | batch_69de902d114881908a8f3c01b3c6d309 | completed | April 14, 2026, 7:06 p.m. |
| NED1 | Entity disambiguation (via context triple) | batch_69fd551b006c8190b84449f2e2b59b62 | completed | May 8, 2026, 3:14 a.m. |
Created at: April 10, 2026, 1:16 a.m.