DataSet API

E711832

Apache Flink API, batch processing API

DataSet API is Apache Flink’s now-legacy batch processing API for defining and executing scalable, distributed data transformations.


All labels observed (1)

Label	Occurrences
DataSet API (canonical)	1

How this entity was disambiguated

This entity first appeared as the object of triple T8093951 — resolving that mention is where its identity was fixed. The disambiguator weighed these candidate entities and picked the highlighted one (or “None”, minting a new entity). This is how homonymy is resolved: the same surface form can point to different entities.

NED1 Entity disambiguation (via context triple) gpt-5-mini-2025-08-07

Target entity: DataSet API
Context triple: [Apache Flink, hasAPI, DataSet API]

A. DataView
DataView is ML.NET’s core, schema-aware tabular data abstraction used to efficiently represent and process datasets for machine learning pipelines.
B. Data
Data is an android Starfleet officer in Star Trek: The Next Generation, known for his quest to understand humanity and develop emotions.
C. OS OpenData
OS OpenData is a collection of free, publicly available digital mapping and geographic datasets released by Ordnance Survey for use in analysis, applications, and research.
D. Ecdat
Ecdat is a term most likely associated with the name of Ecdat Park, suggesting it is a proper noun used in geographic or place naming.
E. Open Data Index
Open Data Index is a global initiative that evaluates and ranks the openness and accessibility of government data across countries.
F. None of the above. (chosen)
G. Unsure - the case is ambiguous/there is not enough information to decide.

NED2 Entity disambiguation (via description) gpt-5-mini-2025-08-07

Target entity: DataSet API
Target entity description: DataSet API is Apache Flink’s now-legacy batch processing API for defining and executing scalable, distributed data transformations.

A. DataView
DataView is ML.NET’s core, schema-aware tabular data abstraction used to efficiently represent and process datasets for machine learning pipelines.
B. Data
Data is an android Starfleet officer in Star Trek: The Next Generation, known for his quest to understand humanity and develop emotions.
C. OS OpenData
OS OpenData is a collection of free, publicly available digital mapping and geographic datasets released by Ordnance Survey for use in analysis, applications, and research.
D. Ecdat
Ecdat is a term most likely associated with the name of Ecdat Park, suggesting it is a proper noun used in geographic or place naming.
E. Open Data Index
Open Data Index is a global initiative that evaluates and ranks the openness and accessibility of government data across countries.
F. None of the above. (chosen)

Statements (44)

Predicate	Object
instanceOf	Apache Flink API; batch processing API
category	big data framework API; data-parallel programming API
designedFor	fault-tolerant processing; scalable processing
developer	Apache Flink community
documentationURL	https://nightlies.apache.org/flink/flink-docs-stable/dev/batch
ecosystem	Apache Flink stack
executionEngine	Flink batch runtime
executionModel	distributed
feature	custom partitioning; grouping and aggregation; iterations; joins; operators like map, flatMap, filter, reduce; support for user-defined functions; type-safe transformations
inputFormat	HDFS; collections; files
integratesWith	Flink runtime
license	Apache License 2.0
notDesignedFor	unbounded streaming data
outputFormat	HDFS; files
partOf	Apache Flink
programmingLanguage	Java; Scala
relation	predecessor of unified Flink APIs for batch and streaming
replacedBy	Flink DataStream API; Flink Table API
scope	bounded data
status	legacy
supports	batch processing; data transformations; distributed data processing
supportsOptimization	automatic execution plan optimization; data pipelining; operator chaining
targetUser	Java developers; Scala developers; big data engineers; data engineers
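
The feature and scope statements above (operators like map, flatMap, filter, reduce; grouping and aggregation; bounded data) can be illustrated with a minimal sketch. It is not Flink code: it assumes plain java.util.stream as a single-machine stand-in for the same operator semantics, and the class name BoundedWordCountSketch and the method wordCount are hypothetical names chosen for this illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical illustration only: uses java.util.stream, not the Flink DataSet API.
public class BoundedWordCountSketch {

    // Counts word occurrences in a bounded (finite) list of lines.
    static Map<String, Integer> wordCount(List<String> lines) {
        return lines.stream()
                // flatMap: each line yields zero or more lower-cased tokens
                .flatMap(line -> Arrays.stream(line.toLowerCase().split("\\s+")))
                // filter: drop empty tokens produced by leading whitespace
                .filter(word -> !word.isEmpty())
                // grouping and aggregation: group by word, map each word to 1,
                // then reduce each group by summing
                .collect(Collectors.groupingBy(
                        word -> word,
                        TreeMap::new,
                        Collectors.reducing(0, word -> 1, Integer::sum)));
    }

    public static void main(String[] args) {
        List<String> lines = List.of("to be or not to be", "that is the question");
        System.out.println(wordCount(lines));
    }
}
```

The input is a fixed in-memory list, which matches the "bounded data" scope and the "collections" input format listed above. A distributed engine would partition the grouped data across workers, which this sketch does not attempt.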

How these facts were elicited

Referenced by (1)

Full triples — surface form annotated when it differs from this entity's canonical label.

Apache Flink → hasAPI → DataSet API
