StatisticsGen
E457343
StatisticsGen is a TensorFlow Extended component that computes descriptive statistics over input datasets to support data analysis and validation in machine learning pipelines.
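To make "descriptive statistics" concrete, here is a minimal plain-Python sketch of the kind of per-feature summaries involved: counts, missing values, mean, standard deviation, a histogram, and value counts. It is a conceptual illustration only, not how StatisticsGen or TensorFlow Data Validation computes them. The function names `numeric_stats` and `categorical_stats`, the use of `None` for missing values, and the equal-width bucketing are all assumptions made for this example.

```python
import math
from collections import Counter


def numeric_stats(values, num_buckets=4):
    """Summarize one numeric feature; None marks a missing value (assumption)."""
    present = [v for v in values if v is not None]
    missing = len(values) - len(present)
    if not present:
        return {"count": 0, "missing": missing}

    mean = sum(present) / len(present)
    # Population standard deviation (divides by N, not N - 1).
    variance = sum((v - mean) ** 2 for v in present) / len(present)

    lo, hi = min(present), max(present)
    # Equal-width buckets; fall back to width 1.0 when all values are equal.
    width = (hi - lo) / num_buckets or 1.0
    histogram = [0] * num_buckets
    for v in present:
        # The maximum value is clamped into the last bucket.
        idx = min(int((v - lo) / width), num_buckets - 1)
        histogram[idx] += 1

    return {
        "count": len(present),
        "missing": missing,
        "mean": mean,
        "std_dev": math.sqrt(variance),
        "min": lo,
        "max": hi,
        "histogram": histogram,
    }


def categorical_stats(values):
    """Summarize one categorical feature: value counts and missing values."""
    present = [v for v in values if v is not None]
    return {
        "count": len(present),
        "missing": len(values) - len(present),
        "unique": len(set(present)),
        "value_counts": dict(Counter(present)),
    }
```

For example, `numeric_stats([1.0, 2.0, 3.0, None, 4.0])` reports four present values, one missing value, and a mean of 2.5. The real component produces comparable summaries for every feature and serializes them rather than returning Python dictionaries.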
Statements (46)
| Predicate | Object |
|---|---|
| instanceOf | TFX component; TensorFlow Extended component |
| belongsTo | data analysis stage of ML pipelines; data validation stage of ML pipelines |
| canCompute | feature histograms; feature means; feature standard deviations; feature value counts; missing value statistics |
| compatibleWith | Airflow; Apache Beam runners; Kubeflow Pipelines; Vertex AI Pipelines |
| configurableWith | StatsOptions |
| developedBy | Google |
| documentationURL | https://www.tensorflow.org/tfx/guide/statsgen |
| hasParameter | examples; output; stats_options |
| hasPurpose | compute descriptive statistics over input datasets; support data analysis in machine learning pipelines; support data validation in machine learning pipelines |
| implementedIn | Python |
| inputType | ExampleGen output; tf.Example data; TFRecord files |
| license | Apache License 2.0 |
| outputFormat | protocol buffers |
| outputType | ExampleStatistics; statistics artifact |
| outputUsedBy | ExampleValidator; SchemaGen; other downstream TFX components |
| partOf | TFX pipeline; TensorFlow Extended |
| repositoryURL | https://github.com/tensorflow/tfx |
| runsIn | TFX orchestration environment |
| supports | feature-wise statistics computation; per-split statistics computation; slice-based statistics computation |
| usedFor | data validation; detecting data anomalies; exploratory data analysis; schema inference support |
| usesLibrary | TensorFlow Data Validation; tfdv |
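
The table notes that statistics are computed per split and then consumed by SchemaGen and ExampleValidator for schema inference and anomaly detection. The sketch below illustrates that flow in plain Python under simplifying assumptions: it handles categorical features only, and `split_statistics`, `infer_schema`, and `find_anomalies` are hypothetical helpers written for this example. They are not the TFX or TensorFlow Data Validation APIs and do not reproduce the checks those components actually perform.

```python
from collections import Counter


def split_statistics(splits):
    """Compute per-split, per-feature statistics.

    `splits` maps a split name to a list of dict rows; None or an absent
    key counts as a missing value (assumption for this sketch).
    """
    stats = {}
    for split_name, rows in splits.items():
        features = sorted({name for row in rows for name in row})
        per_feature = {}
        for name in features:
            values = [row.get(name) for row in rows]
            present = [v for v in values if v is not None]
            per_feature[name] = {
                "num_examples": len(rows),
                "missing": len(values) - len(present),
                "value_counts": Counter(present),
            }
        stats[split_name] = per_feature
    return stats


def infer_schema(train_stats):
    """Derive a toy schema: observed value domain and whether a feature is required."""
    return {
        name: {"domain": set(s["value_counts"]), "required": s["missing"] == 0}
        for name, s in train_stats.items()
    }


def find_anomalies(schema, split_stats):
    """Compare one split's statistics against the toy schema."""
    anomalies = []
    for name, spec in schema.items():
        s = split_stats.get(name)
        if s is None:
            anomalies.append((name, "feature missing from split"))
            continue
        if spec["required"] and s["missing"] > 0:
            anomalies.append((name, "unexpected missing values"))
        unseen = set(s["value_counts"]) - spec["domain"]
        if unseen:
            listed = ", ".join(sorted(map(str, unseen)))
            anomalies.append((name, "unexpected values: " + listed))
    return anomalies


# Usage: infer a schema from the train split, then validate the eval split.
splits = {
    "train": [{"color": "red", "size": "S"}, {"color": "blue", "size": "M"}],
    "eval": [{"color": "green", "size": "S"}, {"color": "red", "size": None}],
}
stats = split_statistics(splits)
schema = infer_schema(stats["train"])
anomalies = find_anomalies(schema, stats["eval"])
```

Here the eval split is flagged for a `color` value never seen in training and for a missing `size`. Keeping statistics separate per split is what makes this kind of train-versus-eval comparison possible.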
Referenced by (1)
Full triples — surface form annotated when it differs from this entity's canonical label.