StatisticsGen

E457343

StatisticsGen is a TensorFlow Extended component that computes descriptive statistics over input datasets to support data analysis and validation in machine learning pipelines.

Statements (46)

Predicate Object
instanceOf TFX component
TensorFlow Extended component
belongsTo data analysis stage of ML pipelines
data validation stage of ML pipelines
canCompute feature histograms
feature means
feature standard deviations
feature value counts
missing value statistics
compatibleWith Airflow
Apache Beam runners
Kubeflow Pipelines
Vertex AI Pipelines
configurableWith StatsOptions
developedBy Google
documentationURL https://www.tensorflow.org/tfx/guide/statsgen
hasParameter examples
output
stats_options
hasPurpose compute descriptive statistics over input datasets
support data analysis in machine learning pipelines
support data validation in machine learning pipelines
implementedIn Python
inputType ExampleGen output
tf.Example data
TFRecord files
license Apache License 2.0
outputFormat protocol buffers
outputType ExampleStatistics
statistics artifact
outputUsedBy ExampleValidator
SchemaGen
other downstream TFX components
partOf TFX pipeline
TensorFlow Extended
repositoryURL https://github.com/tensorflow/tfx
runsIn TFX orchestration environment
supports feature-wise statistics computation
per-split statistics computation
slice-based statistics computation
usedFor data validation
detecting data anomalies
exploratory data analysis
schema inference support
usesLibrary TensorFlow Data Validation
tfdv
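
Illustrative sketch (not the TFX or TFDV implementation): the statements above say that StatisticsGen computes feature means, standard deviations, value counts, and missing value statistics, and that it supports per-split computation. The standard-library Python below shows what those quantities look like for a toy dataset. The function names `describe_feature` and `describe_splits`, the dictionary layout, the toy data, and the use of the population standard deviation are all assumptions made for this example. The real component uses TensorFlow Data Validation and emits protocol buffers rather than dictionaries.

```python
import math
from collections import Counter


def describe_feature(values):
    """Summarize one feature column; None marks a missing value."""
    present = [v for v in values if v is not None]
    summary = {
        "num_examples": len(values),
        "num_missing": len(values) - len(present),
        "value_counts": dict(Counter(present)),
    }
    # Mean and standard deviation are reported only for all-numeric columns.
    numeric = [
        v for v in present
        if isinstance(v, (int, float)) and not isinstance(v, bool)
    ]
    if numeric and len(numeric) == len(present):
        mean = sum(numeric) / len(numeric)
        # Assumption: population variance (divide by n), chosen for simplicity.
        variance = sum((v - mean) ** 2 for v in numeric) / len(numeric)
        summary["mean"] = mean
        summary["std_dev"] = math.sqrt(variance)
    return summary


def describe_splits(splits):
    """Map {split: {feature: [values]}} to {split: {feature: summary}}."""
    return {
        split: {name: describe_feature(col) for name, col in features.items()}
        for split, features in splits.items()
    }


# Hypothetical toy data with one numeric and one categorical feature.
splits = {
    "train": {"age": [20, 30, None, 40], "country": ["DE", "US", "US", None]},
    "eval": {"age": [25, 35], "country": ["US", "FR"]},
}
stats = describe_splits(splits)
```

Keeping the summaries separate per split mirrors the per-split statistics computation listed under supports, and lets a downstream check compare, for example, the train and eval distributions of one feature.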

Referenced by (1)

Full triples — surface form annotated when it differs from this entity's canonical label.

TensorFlow Extended hasComponent StatisticsGen