FlumeJava

E702188

FlumeJava is a Java library developed at Google for building, optimizing, and running large-scale data-parallel pipelines; it later inspired systems such as Apache Beam.


Statements (47)

Predicate Object
instanceOf Java library
data-parallel programming framework
software library
abstractionLevel high-level API over MapReduce
basedOn MapReduce programming model
category big data framework
parallel data processing library
designGoal enable automatic optimization of data pipelines
separate logical pipeline description from physical execution
simplify writing data-parallel code in Java
developer Google
domain big data processing
parallel computing
executionEnvironment cluster computing environments
distributed systems
executionModel dataflow-style pipelines
influenced Apache Beam
inspirationFor Google Cloud Dataflow model
inspired Apache Beam
introducedBy Google researchers
mainPurpose building data-parallel pipelines
optimizing data-parallel pipelines
running large-scale data-parallel pipelines
optimizationStrategy combining multiple operations into fewer passes over data
fusion of pipeline stages
reordering of operations
programmingLanguage Java
provides PCollection abstraction
parallel operations on collections
relatedTo Apache Beam
Google MapReduce
dataflow programming
supports automatic optimization of execution plans
combine operations
groupBy operations
join operations
lazy evaluation of pipelines
map operations
parallel collections
pipeline abstraction
reduce operations
targetUseCase large-scale batch data processing
targetUser Java developers
usedFor ETL pipelines
analytics pipelines
data aggregation
log processing
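
Illustrative sketch of the "lazy evaluation of pipelines" and "fusion of pipeline stages" statements above. This is a toy, single-machine example and not the FlumeJava API: the names FusionSketch, Deferred, of, map, stages, and run are all hypothetical and chosen only for illustration. It assumes that fusion can be modeled as function composition, so that two chained map operations are executed in one pass over the input with no intermediate collection.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

/** Toy illustration only; hypothetical names, not the FlumeJava API. */
final class FusionSketch {

    /**
     * A deferred collection: it records its input and one fused function,
     * and computes nothing until run() is called (lazy evaluation).
     */
    static final class Deferred<I, O> {
        private final List<I> input;
        private final Function<I, O> fused;
        private final int stages;

        private Deferred(List<I> input, Function<I, O> fused, int stages) {
            this.input = input;
            this.fused = fused;
            this.stages = stages;
        }

        /** Wraps an in-memory list; no work is done here. */
        static <T> Deferred<T, T> of(List<T> input) {
            return new Deferred<T, T>(input, x -> x, 0);
        }

        /**
         * Records a map stage. Instead of materializing an intermediate
         * collection, the new function is composed with the pending one.
         */
        <R> Deferred<I, R> map(Function<? super O, ? extends R> f) {
            Function<I, R> composed = x -> f.apply(fused.apply(x));
            return new Deferred<I, R>(input, composed, stages + 1);
        }

        /** Number of logical map stages recorded so far. */
        int stages() {
            return stages;
        }

        /** Executes all recorded stages in a single pass over the input. */
        List<O> run() {
            List<O> out = new ArrayList<>(input.size());
            for (I x : input) {
                out.add(fused.apply(x));
            }
            return out;
        }
    }

    public static void main(String[] args) {
        Deferred<Integer, Integer> plan = Deferred.of(Arrays.asList(1, 2, 3))
                .map(x -> x + 1)   // logical stage 1
                .map(x -> x * 2);  // logical stage 2, fused with stage 1

        System.out.println(plan.stages()); // 2
        System.out.println(plan.run());    // [4, 6, 8]
    }
}
```

The logical description (two map stages) stays separate from the physical execution (one loop), which mirrors the design goal listed above of separating logical pipeline description from physical execution. A real system would apply this idea across distributed stages rather than to an in-memory list.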

Referenced by (2)

Full triples — surface form annotated when it differs from this entity's canonical label.

Google MapReduce influenced FlumeJava
MapReduce influenced FlumeJava