Dask

E426661

Dask is an open-source parallel computing library for Python that enables scalable, distributed data processing and analytics using familiar interfaces like NumPy, pandas, and scikit-learn.
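
As a minimal sketch of the NumPy-like interface, assuming Dask is installed with its array support (for example via `pip install "dask[array]"`), the snippet below builds a chunked array and reduces it. The array shape and chunk sizes are arbitrary values chosen for illustration, not recommendations.

```python
import dask.array as da

# A 1000x1000 array of ones, split into 250x250 chunks (illustrative sizes).
x = da.ones((1000, 1000), chunks=(250, 250))

# Building the reduction is lazy: no chunk is computed yet.
total = x.sum()

# compute() runs the task graph and returns a concrete NumPy scalar.
result = total.compute()
```

The same expression written with `numpy.ones(...).sum()` would run eagerly in one piece; here the work is split per chunk and only executed when `compute()` is called.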

All labels observed (1)

Label Occurrences
Dask canonical 3

Statements (59)

Predicate Object
instanceOf Python library
data processing framework
open-source software
parallel computing library
canRunOn cloud infrastructure
cluster
multi-core machine
single machine
compatibleWith NumPy
pandas
scikit-learn
developedIn Python ecosystem
hasComponent Dask Array
Dask Bag
Dask DataFrame
Dask Delayed
Dask Distributed Scheduler
Dask Futures
Dask Local Scheduler
hasScheduler distributed scheduler
multi-process scheduler
multi-threaded scheduler
single-threaded scheduler
isFreeSoftware true
license BSD 3-Clause License
primaryUse large-scale data processing
parallelizing Python code
scaling single-machine workflows to clusters
programmingLanguage Python
providesInterfaceSimilarTo NumPy
pandas
scikit-learn
repository https://github.com/dask/dask
supports arrays
dataframes
distributed computing
machine learning workflows
out-of-core computation
parallel computing
scalable analytics
task scheduling
supportsClusterManager Kubernetes
PBS
SLURM
SSH clusters
YARN
supportsDataFormat CSV
HDF5
JSON
ORC
Parquet
supportsExecutionModel dynamic task graphs
lazy evaluation
supportsLanguage Python
usedFor ETL workflows
data engineering
data science
machine learning pipelines
website https://www.dask.org
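
The "lazy evaluation" and "task graphs" entries above can be illustrated with a toy, standard-library-only sketch. The dictionary layout (keys mapped to literal values or `(function, *argument_keys)` tuples) is loosely modeled on the convention Dask uses for its graphs; the `evaluate` helper is hypothetical and written only for this example. It is not Dask's scheduler and does no parallel execution.

```python
from operator import add, mul

# Toy graph: each key maps to a literal value or a (function, *argument_keys) tuple.
# Nothing is computed when the graph is defined.
graph = {
    "a": 2,
    "b": 3,
    "sum": (add, "a", "b"),
    "prod": (mul, "a", "b"),
    "total": (add, "sum", "prod"),
}


def evaluate(graph, key, cache=None):
    """Compute `key` on demand, evaluating only the tasks it depends on."""
    cache = {} if cache is None else cache
    if key in cache:
        return cache[key]
    task = graph[key]
    if isinstance(task, tuple) and callable(task[0]):
        func, *deps = task
        # Resolve each dependency first, reusing cached results.
        value = func(*(evaluate(graph, dep, cache) for dep in deps))
    else:
        value = task  # literal input
    cache[key] = value
    return value


result = evaluate(graph, "total")  # (2 + 3) + (2 * 3)
```

A real scheduler would also decide which independent tasks (here `sum` and `prod`) can run at the same time on threads, processes, or cluster workers; this sketch only shows the deferred, dependency-driven evaluation.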

Referenced by (3)
