Apache Lucene
E358078
Apache Lucene is a high-performance, full-featured text search engine library written in Java and widely used as the core indexing and search technology in many applications and search platforms.
All labels observed (3)
| Label | Occurrences |
|---|---|
| Apache Lucene (canonical) | 4 |
| Apache Lucene project | 1 |
| Lucene index | 1 |
How this entity was disambiguated
This entity first appeared as the object of triple T3418864; resolving that mention fixed its identity. The disambiguator weighed the candidate entities below and picked the one marked "chosen" (or “None”, minting a new entity). This is how homonymy is resolved: the same surface form can point to different entities.
Target entity: Apache Lucene
Context triple: [Apache Software Foundation, overseesProject, Apache Lucene]
- A. Apache Mahout: Apache Mahout is an open-source machine learning library designed to build scalable algorithms for clustering, classification, and recommendation on large datasets, often leveraging big data platforms.
- B. Apache HBase: Apache HBase is a distributed, scalable, NoSQL database designed for real-time read/write access to large datasets, typically running on top of the Hadoop ecosystem.
- C. xVelocity in-memory analytics engine: xVelocity in-memory analytics engine is a columnar, in-memory data processing engine developed by Microsoft to enable fast, compressed, and scalable analytical querying for business intelligence tools.
- D. Hadoop: Hadoop is an open-source framework that enables distributed storage and parallel processing of large data sets across clusters of commodity hardware.
- E. Apache Hive: Apache Hive is a data warehouse and SQL-like query system built on top of Hadoop for managing and analyzing large datasets stored in distributed storage.
- F. None of above. (chosen)
- G. Unsure - the case is ambiguous/there is not enough information to decide.
Target entity: Apache Lucene
Target entity description: Apache Lucene is a high-performance, full-featured text search engine library written in Java and widely used as the core indexing and search technology in many applications and search platforms.
- A. Apache Mahout: Apache Mahout is an open-source machine learning library designed to build scalable algorithms for clustering, classification, and recommendation on large datasets, often leveraging big data platforms.
- B. Apache HBase: Apache HBase is a distributed, scalable, NoSQL database designed for real-time read/write access to large datasets, typically running on top of the Hadoop ecosystem.
- C. xVelocity in-memory analytics engine: xVelocity in-memory analytics engine is a columnar, in-memory data processing engine developed by Microsoft to enable fast, compressed, and scalable analytical querying for business intelligence tools.
- D. Hadoop: Hadoop is an open-source framework that enables distributed storage and parallel processing of large data sets across clusters of commodity hardware.
- E. Apache Hive: Apache Hive is a data warehouse and SQL-like query system built on top of Hadoop for managing and analyzing large datasets stored in distributed storage.
- F. None of above. (chosen)
Statements (52)
| Predicate | Object |
|---|---|
| instanceOf | information retrieval library, open-source software, search engine library |
| coreOf | Apache Solr, Elasticsearch, OpenSearch |
| developer | Apache Software Foundation |
| feature | Boolean queries, custom scoring, faceted search support, filtering, full-text search, fuzzy queries, highlighting, index compression, indexing, near real-time search, phrase queries, pluggable analyzers, range queries, ranking, scoring, segment-based index structure, sorting, stemming, tokenization, wildcard queries |
| genre | full-text search, text indexing |
| implements | inverted index |
| influenced | Apache Solr, Elasticsearch, OpenSearch |
| initialReleaseYear | early 2000s |
| license | Apache License 2.0 |
| operatingSystem | cross-platform |
| organization | Apache Software Foundation |
| originalAuthor | Doug Cutting |
| partOf | Apache Lucene (self-link; surface form: Apache Lucene project) |
| programmingLanguage | Java |
| repository | https://github.com/apache/lucene |
| supports | BM25 ranking algorithm, vector search (kNN) in recent versions |
| supportsLanguage | English, multiple natural languages |
| useCase | application search, content management systems, document management systems, enterprise search, log search, website search |
| writtenIn | Java |
How these facts were elicited
The pipeline generated the facts above by prompting gpt-5.1 with this entity's name + description and the instruction below.
You are a knowledge base construction expert. Given a subject entity and a description of it, return factual statements that you know for the subject as a JSON list of dictionaries(triples), where keys must be "subject", "predicate" and "object". The number of facts may be very high, between 25 to 50 or more, for very popular subjects. For less popular subjects, the number of facts can be very low, like 5 or 10.
# Requirements
- If you don't know the subject at all, return an empty list.
- If the subject is not a named entity, return an empty list.
- Include at least one triple where predicate is "instanceOf".
- Do not get too wordy.
- Separate several objects into multiple triples with one object.
Subject: Apache Lucene
Description of subject: Apache Lucene is a high-performance, full-featured text search engine library written in Java and widely used as the core indexing and search technology in many applications and search platforms.
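The output format requested by the instruction above can be illustrated with a minimal sketch. It assumes the model response is a plain JSON string. The helper names `parse_triples` and `group_by_predicate` are hypothetical and are not taken from the pipeline; the sample triples reuse values from the Statements table. The sketch checks the stated requirements (exactly the three keys, at least one "instanceOf" triple, empty list allowed) and then groups the one-object-per-triple output back into the predicate/objects layout used on this page.

```python
import json

# Keys the instruction requires on every triple.
REQUIRED_KEYS = {"subject", "predicate", "object"}


def parse_triples(raw: str) -> list:
    """Parse a response string and check it against the stated requirements."""
    triples = json.loads(raw)
    if not isinstance(triples, list):
        raise ValueError("expected a JSON list of triples")
    if not triples:
        # An empty list is a valid answer for unknown or non-named subjects.
        return []
    for triple in triples:
        if not isinstance(triple, dict) or set(triple) != REQUIRED_KEYS:
            raise ValueError(f"malformed triple: {triple!r}")
    if not any(t["predicate"] == "instanceOf" for t in triples):
        raise ValueError("at least one instanceOf triple is required")
    return triples


def group_by_predicate(triples: list) -> dict:
    """Collect the objects of single-object triples under their predicate."""
    grouped = {}
    for triple in triples:
        grouped.setdefault(triple["predicate"], []).append(triple["object"])
    return grouped


# Hypothetical response fragment, using values from the Statements table.
sample = json.dumps([
    {"subject": "Apache Lucene", "predicate": "instanceOf", "object": "search engine library"},
    {"subject": "Apache Lucene", "predicate": "instanceOf", "object": "open-source software"},
    {"subject": "Apache Lucene", "predicate": "license", "object": "Apache License 2.0"},
])

grouped = group_by_predicate(parse_triples(sample))
for predicate, objects in grouped.items():
    print(f"| {predicate} | {', '.join(objects)} |")
```

Grouping after validation keeps the two concerns separate: the response stays one object per triple, as the instruction asks, and the merged rows are only a display choice.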
Referenced by (6)
Full triples — surface form annotated when it differs from this entity's canonical label.