HFiles
E705294
HFiles are the immutable, sorted, file-based storage format used by Apache HBase to persist table data efficiently on disk.
All labels observed (1)
| Label | Occurrences |
|---|---|
| HFiles (canonical) | 1 |
How this entity was disambiguated
This entity first appeared as the object of triple T7985713 — resolving that mention is where its identity was fixed. The disambiguator weighed these candidate entities and picked the highlighted one (or “None”, minting a new entity). This is how homonymy is resolved: the same surface form can point to different entities.
NED1
Entity disambiguation (via context triple)
gpt-5-mini-2025-08-07
Target entity: HFiles
Context triple: [Apache HBase, storesDataIn, HFiles]
- A. HFS: HFS is a Unix file system used by HP-UX for managing and organizing data on disk storage.
- B. HFS: HFS (Hierarchical File System) is a classic file system developed by Apple for early Macintosh computers, organizing data in a tree-structured hierarchy with support for resource forks and metadata.
- C. HDF: HDF is the acronym for the Hungarian Defence Forces, the unified military organization responsible for Hungary’s national defense and participation in international security operations.
- D. AA Files: AA Files is a scholarly architectural journal produced by the Architectural Association that features critical essays, research, and visual material on architecture and related disciplines.
- E. Filerimos: Filerimos is a historic hill on the island of Rhodes in Greece, known for its medieval monastery, ancient acropolis ruins, and panoramic views over the surrounding area.
- F. None of the above. (chosen)
- G. Unsure: the case is ambiguous or there is not enough information to decide.
NED2
Entity disambiguation (via description)
gpt-5-mini-2025-08-07
Target entity: HFiles
Target entity description: HFiles are the immutable, sorted, file-based storage format used by Apache HBase to persist table data efficiently on disk.
- A. HFS: HFS is a Unix file system used by HP-UX for managing and organizing data on disk storage.
- B. HFS: HFS (Hierarchical File System) is a classic file system developed by Apple for early Macintosh computers, organizing data in a tree-structured hierarchy with support for resource forks and metadata.
- C. HDF: HDF is the acronym for the Hungarian Defence Forces, the unified military organization responsible for Hungary’s national defense and participation in international security operations.
- D. AA Files: AA Files is a scholarly architectural journal produced by the Architectural Association that features critical essays, research, and visual material on architecture and related disciplines.
- E. Filerimos: Filerimos is a historic hill on the island of Rhodes in Greece, known for its medieval monastery, ancient acropolis ruins, and panoramic views over the surrounding area.
- F. None of the above. (chosen)
Statements (50)
| Predicate | Object |
|---|---|
| instanceOf | HBase storage format; file format |
| compatibleWith | HDFS |
| contains | Bloom filter metadata; HBase cells; block index; data blocks; file info metadata; key-value pairs; meta blocks |
| dataModel | key-value store |
| designedFor | efficient random reads; efficient sequential scans |
| implementationLanguage | Java |
| introducedIn | Apache HBase 0.x era |
| lifecycle | compacted by HBase region compactions; created by HBase flush from memstore; deleted when obsolete after compaction |
| mutability | immutable |
| optimizationGoal | efficient compaction; high throughput; low latency reads |
| ordering | sorted by key |
| organizedBy | column family; column qualifier; row key; timestamp |
| partOf | HBase internal storage architecture |
| primaryUse | persist HBase table data on disk |
| readBy | HBase RegionServer |
| replaced | older MapFile-based storage in early HBase versions |
| softwareDomain | NoSQL databases; big data; distributed storage |
| storageMedium | disk |
| storedOn | Hadoop Distributed File System |
| supports | Bloom filters; block compression; block indexes; block-based storage; time-to-live semantics via HBase; versioned cells |
| supportsCompressionCodec | GZIP; LZ4; LZO; Snappy; ZSTD |
| usedBy | Apache HBase |
| versionedBy | HFile format version number |
| writtenBy | HBase RegionServer |
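
The ordering, organizedBy, and contains statements can be illustrated with a small sketch. The Python below is a toy model, not HBase's Java implementation: the `Cell` type, the `BLOCK_SIZE` of two cells, and the helper names are all hypothetical. It assumes cells sort by row key, column family, and column qualifier in ascending order and by timestamp in descending order (newest version first), and it shows how a block index holding the first key of each data block lets a reader narrow a lookup to at most two blocks by binary search.

```python
from bisect import bisect_right
from typing import NamedTuple


class Cell(NamedTuple):
    """Hypothetical stand-in for a versioned HBase cell."""
    row: str
    family: str
    qualifier: str
    timestamp: int
    value: str


def sort_key(cell):
    # Assumed ordering: row, family, qualifier ascending; timestamp descending.
    return (cell.row, cell.family, cell.qualifier, -cell.timestamp)


BLOCK_SIZE = 2  # hypothetical: cells per data block (real blocks are sized in bytes)


def build_blocks(cells):
    """Sort cells once, split them into data blocks, and index each block's first key."""
    ordered = sorted(cells, key=sort_key)
    blocks = [ordered[i:i + BLOCK_SIZE] for i in range(0, len(ordered), BLOCK_SIZE)]
    index = [sort_key(block[0]) for block in blocks]
    return blocks, index


def get_latest(blocks, index, row, family, qualifier):
    """Return the newest cell for a column, or None if the column is absent."""
    # The probe sorts just before every version of the requested column.
    probe = (row, family, qualifier, float("-inf"))
    start = max(bisect_right(index, probe) - 1, 0)
    # The first version is either in the located block or heads the next one.
    for block in blocks[start:start + 2]:
        for cell in block:
            if (cell.row, cell.family, cell.qualifier) == (row, family, qualifier):
                return cell
    return None
```

Because the sketch never modifies `blocks` after `build_blocks` returns, it mirrors the immutable, sorted-by-key properties listed in the table; updates in this model would go into a new file rather than into an existing one.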
How these facts were elicited
The pipeline generated the facts above by prompting gpt-5.1 with this entity's name + description and the instruction below.
Instruction
You are a knowledge base construction expert. Given a subject entity and a description of it, return factual statements that you know for the subject as a JSON list of dictionaries(triples), where keys must be "subject", "predicate" and "object". The number of facts may be very high, between 25 to 50 or more, for very popular subjects. For less popular subjects, the number of facts can be very low, like 5 or 10.
# Requirements
- If you don't know the subject at all, return an empty list.
- If the subject is not a named entity, return an empty list.
- Include at least one triple where predicate is "instanceOf".
- Do not get too wordy.
- Separate several objects into multiple triples with one object.
Input
Subject: HFiles
Description of subject: HFiles are the immutable, sorted, file-based storage format used by Apache HBase to persist table data efficiently on disk.
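
For illustration, the output shape the instruction asks for can be sketched as follows. The three triples are taken from the Statements table on this page; the `validate_triples` helper is hypothetical and not part of the pipeline. It checks only the requirements that can be tested mechanically: a JSON list, the three required keys, and at least one `instanceOf` triple when the list is non-empty.

```python
import json

REQUIRED_KEYS = {"subject", "predicate", "object"}


def validate_triples(raw):
    """Parse a JSON response and check it against the testable requirements."""
    triples = json.loads(raw)
    if not isinstance(triples, list):
        raise ValueError("expected a JSON list of triples")
    if not triples:
        # An empty list is allowed: unknown subject or not a named entity.
        return triples
    for triple in triples:
        if not isinstance(triple, dict) or set(triple) != REQUIRED_KEYS:
            raise ValueError(f"malformed triple: {triple!r}")
    if not any(t["predicate"] == "instanceOf" for t in triples):
        raise ValueError("at least one instanceOf triple is required")
    return triples


# One object per triple, as the instruction requires.
example = json.dumps([
    {"subject": "HFiles", "predicate": "instanceOf", "object": "file format"},
    {"subject": "HFiles", "predicate": "mutability", "object": "immutable"},
    {"subject": "HFiles", "predicate": "ordering", "object": "sorted by key"},
])
```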
Referenced by (1)
Full triples — surface form annotated when it differs from this entity's canonical label.