Dataset profile
Counts, distributions, and top-K rankings for the snapshot currently mounted in this browser.
Snapshot
Identifier: gpt51_270226
Computed 2026-05-07T23:18:03.865413+00:00 · DB file 4.7 GB
Tables
| Table | Count |
|---|---|
| Triples | 4,480,648 |
| Entities (instance nodes) | 438,870 |
| Predicates | 56,818 |
| Concepts | 15,984 |
| Instance triples (specialised instanceOf) | 280,052 |
| Pipeline batches | 20,889 |
Object type distribution
Where each triple's object resolves to: NE = disambiguated entity, CONCEPT = type/class, LITERAL = string, UNRECOGNIZED = parsing failure.
| Object type | Triples |
|---|---|
| CONCEPT | 280,052 |
| LITERAL | 2,262,163 |
| NE | 1,938,432 |
| UNRECOGNIZED | 1 |
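The share of each object type follows directly from these counts. A minimal sketch in Python, with the counts copied from the distribution above; the variable names are illustrative and not part of the browser:

```python
# Counts copied from the object type distribution above.
object_type_counts = {
    "CONCEPT": 280_052,
    "LITERAL": 2_262_163,
    "NE": 1_938_432,
    "UNRECOGNIZED": 1,
}

total = sum(object_type_counts.values())

# Share of all triples per object type, as a percentage.
shares = {kind: 100 * count / total for kind, count in object_type_counts.items()}

# Print largest share first.
for kind, pct in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{kind:<13}{object_type_counts[kind]:>10,}  {pct:6.2f}%")
```

Literals account for roughly half of all triple objects, disambiguated entities for about 43%, and concepts for about 6%.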
Pipeline status
Entity status
| Status | Entities |
|---|---|
| EXPLORED | 100,179 |
| UNEXPLORED | 338,691 |
Triple object disambiguation
| Status | Triples |
|---|---|
| FINISHED | 4,372,772 |
| GENERATED | 6,194 |
| NERFINISHED | 101,681 |
| ONNER | 1 |
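The breakdowns on this page can be cross-checked against the totals under Tables. The sketch below assumes that each breakdown partitions its parent table (the page does not say so explicitly, although the published counts agree with it); the dictionary layout and the `check` helper are hypothetical:

```python
# Totals from the "Tables" section.
totals = {"triples": 4_480_648, "entities": 438_870}

# Each breakdown maps to the total it is assumed to sum to.
breakdowns = {
    "object type": ("triples", [280_052, 2_262_163, 1_938_432, 1]),
    "entity status": ("entities", [100_179, 338_691]),
    "triple object disambiguation": ("triples", [4_372_772, 6_194, 101_681, 1]),
}

def check(breakdowns, totals):
    """Return, per breakdown, the difference between its sum and the expected total."""
    return {
        name: sum(counts) - totals[key]
        for name, (key, counts) in breakdowns.items()
    }

diffs = check(breakdowns, totals)
for name, diff in diffs.items():
    status = "ok" if diff == 0 else f"off by {diff}"
    print(f"{name}: {status}")
```

A non-zero difference would point to rows counted twice or missing from a breakdown.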
Top predicates by triple count
| Predicate | ID | Triples |
|---|---|---|
| instanceOf | P0 | 285,355 |
| locatedIn | P40 | 204,760 |
| notableFor | P22 | 118,644 |
| hasPart | P35 | 112,473 |
| fieldOfWork | P3 | 112,214 |
| usedFor | P98 | 75,593 |
| countryOfOrigin | P26 | 71,654 |
| relatedTo | P37 | 70,088 |
| notableWork | P4 | 60,591 |
| focusesOn | P31 | 51,538 |
| category | P87 | 47,397 |
| hasFeature | P182 | 47,040 |
| influenced | P9 | 44,743 |
| languageOfWorkOrName | P15 | 44,000 |
| hasNotableFacility | P105 | 43,660 |
| memberOf | P10 | 36,327 |
| areaServed | P82 | 34,999 |
| genre | P14 | 34,922 |
| purpose | P79 | 34,259 |
| sector | P71 | 30,502 |
Top concepts by instance count
| Concept | ID | Instances |
|---|---|---|
| human | C0 | 21,377 |
| tourist attraction | C85 | 7,380 |
| municipality | C39 | 6,807 |
| film | C684 | 4,370 |
| city | C38 | 3,553 |
| politician | C67 | 3,236 |
| building | C240 | 3,177 |
| company | C423 | 2,521 |
| non-fiction work | C46 | 2,483 |
| award | C97 | 2,085 |
| cultural institution | C84 | 2,079 |
| neighborhood | C27 | 2,009 |
| transport hub | C82 | 1,931 |
| academic division | C57 | 1,730 |
| nonprofit organization | C6 | 1,654 |
| song | C396 | 1,553 |
| surname | C132 | 1,521 |
| military officer | C449 | 1,501 |
| sports venue | C344 | 1,399 |
| military organization | C298 | 1,349 |
Top entities by total degree
Number of triples that reference each entity: in-degree (appearances as object) plus out-degree (appearances as subject). High-degree entities act as disambiguation magnets, because every surface-form variant resolves to them.
| Entity | ID | In | Out | Total |
|---|---|---|---|---|
| United States of America | E14 | 52,278 | 83 | 52,361 |
| United Kingdom | E732 | 12,368 | 144 | 12,512 |
| France | E861 | 5,977 | 102 | 6,079 |
| English | E211 | 5,933 | 62 | 5,995 |
| New York City | E40 | 5,539 | 74 | 5,613 |
| England | E1791 | 5,117 | 58 | 5,175 |
| Europe | E833 | 4,434 | 97 | 4,531 |
| Germany | E1728 | 4,160 | 95 | 4,255 |
| Canada | E14901 | 4,193 | 54 | 4,247 |
| World War II | E32 | 4,063 | 92 | 4,155 |
| London, England | E1817 | 3,900 | 94 | 3,994 |
| California, United States | E26 | 3,775 | 101 | 3,876 |
| Eastern Time Zone | E288 | 3,783 | 80 | 3,863 |
| North America | E335 | 3,776 | 83 | 3,859 |
| Roman Catholicism | E384 | 3,697 | 67 | 3,764 |
| Washington, D.C. | E23 | 3,325 | 78 | 3,403 |
| Japan | E174 | 3,343 | 56 | 3,399 |
| Australia | E876 | 3,039 | 70 | 3,109 |
| Christianity | E348 | 2,815 | 90 | 2,905 |
| New York | E550 | 2,686 | 53 | 2,739 |
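The In, Out, and Total columns can be reproduced from a flat list of triples. The following is a sketch under stated assumptions, not the profiler's actual implementation: triples are taken to be `(subject, predicate, object)` ID tuples, and `total_degree`, `top_k`, and the toy data are hypothetical.

```python
from collections import Counter

def total_degree(triples, entity_ids):
    """Count in-, out-, and total degree per entity.

    triples: iterable of (subject, predicate, object) ID tuples.
    entity_ids: set of IDs that count as entities; other objects
    (literals, concepts) do not receive an in-degree.
    """
    in_deg, out_deg = Counter(), Counter()
    for subj, _pred, obj in triples:
        if subj in entity_ids:
            out_deg[subj] += 1
        if obj in entity_ids:
            in_deg[obj] += 1
    return {
        e: (in_deg[e], out_deg[e], in_deg[e] + out_deg[e])
        for e in in_deg.keys() | out_deg.keys()
    }

def top_k(degrees, k):
    """Rank entities by total degree, highest first."""
    return sorted(degrees.items(), key=lambda kv: kv[1][2], reverse=True)[:k]

# Toy data, not drawn from the snapshot.
triples = [
    ("E1", "P40", "E14"),
    ("E2", "P40", "E14"),
    ("E14", "P35", "E1"),
    ("E2", "P3", "physics"),  # literal object: adds only to E2's out-degree
]
degrees = total_degree(triples, {"E1", "E2", "E14"})
print(top_k(degrees, 1))
```

In the toy data, `E14` ranks first because two triples point at it and it is the subject of one more.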