Hadoop

E35621

Hadoop is an open-source framework that enables distributed storage and parallel processing of large data sets across clusters of commodity hardware.


Observed surface forms (4)

Surface form Occurrences
Apache Hadoop 1
Hadoop Common 1
Hadoop Distributed File System 1

Statements (60)

Predicate Object
instanceOf big data framework
distributed computing framework
open-source software framework
developer Apache Software Foundation
domain big data
ecosystemIncludes Apache Flume
Apache HBase
Apache Hive
Apache Mahout
Apache Oozie
Apache Pig
Apache Sqoop
Apache ZooKeeper
hasComponent HDFS
Hadoop (self-link, surface differs)
surface form: Hadoop Common

Hadoop (self-link, surface differs)
surface form: Hadoop Distributed File System

MapReduce
YARN
Yet Another Resource Negotiator
influenced Apache Flink
Apache Spark
Apache Storm
initiallyInspiredBy Google File System
Google MapReduce
license Apache License 2.0
operatingSystem Cross-platform
partOf Apache Hadoop ecosystem
processingLayer MapReduce
programmingLanguage Java
resourceManagementLayer YARN
runsOn clusters of commodity hardware
storageLayer HDFS
supportsArchitecture master-slave architecture
supportsDataReplicationFactor configurable replication factor
supportsFeature batch processing
data replication
distributed storage
fault tolerance
horizontal scalability
parallel processing
supportsHighAvailability NameNode high availability
supportsModel MapReduce programming model
supportsOperatingSystem Linux
Windows
macOS
supportsProgrammingLanguage C++
Java
Python
R
Scala
supportsSecurity Kerberos-based authentication
useCase ETL workloads
data warehousing
large-scale data processing
log processing
machine learning at scale
writtenIn C
Java
Python
Unix shell
surface form: Shell
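
The MapReduce programming model listed under supportsModel can be illustrated with a minimal sketch. This is plain Python running in a single process, not Hadoop's Java API, and it does no distribution, replication, or fault handling. The function names (map_phase, shuffle_phase, reduce_phase, word_count) are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over an iterable of lines."""
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical sample input; a real job would read its input from HDFS.
    sample = [
        "Hadoop stores data in HDFS",
        "YARN schedules work",
        "HDFS replicates data",
    ]
    print(word_count(sample))
```

In a real cluster the map and reduce steps would run as separate tasks on different nodes, with the framework performing the shuffle between them; the sketch only shows the data flow.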

Referenced by (5)

Full triples — surface form annotated when it differs from this entity's canonical label.

Hadoop hasComponent Hadoop (self-link, surface differs)
this entity surface form: Hadoop Common
Hadoop hasComponent Hadoop (self-link, surface differs)
this entity surface form: Hadoop Distributed File System
this entity surface form: Apache Hadoop
ORC usedIn Hadoop
this entity surface form: Hadoop ecosystem