tfx.v1.components.SchemaGen
A TFX SchemaGen component to generate a schema from the training data.
Inherits From: BaseComponent, BaseNode
tfx.v1.components.SchemaGen(
    statistics: tfx.v1.types.BaseChannel,
    infer_feature_shape: Optional[Union[bool, tfx.v1.dsl.experimental.RuntimeParameter]] = True,
    exclude_splits: Optional[List[str]] = None
)
The SchemaGen component uses TensorFlow Data Validation to generate a schema
from input statistics. The following TFX libraries use the schema:
- TensorFlow Data Validation
- TensorFlow Transform
- TensorFlow Model Analysis
In a typical TFX pipeline, the SchemaGen component generates a schema that
downstream pipeline components consume.
Example
# Generates schema based on statistics files.
infer_schema = SchemaGen(statistics=statistics_gen.outputs['statistics'])
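To make the infer_feature_shape option more concrete, here is a minimal pure-Python sketch of what inferring a schema from statistics can look like. It is not the TensorFlow Data Validation implementation: the FeatureStats class, the infer_toy_schema function, and the rule that a shape is only assigned when a feature is always present with a constant number of values are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class FeatureStats:
    """Hypothetical per-feature summary; not a TFDV type."""
    dtype: str            # e.g. "INT", "FLOAT", "BYTES"
    num_examples: int     # examples seen in the split
    num_missing: int      # examples where the feature is absent
    min_num_values: int   # fewest values seen in one example
    max_num_values: int   # most values seen in one example


def infer_toy_schema(
    stats: Dict[str, FeatureStats], infer_feature_shape: bool = True
) -> Dict[str, dict]:
    """Builds a toy schema: one entry per feature with a type and optional shape."""
    schema = {}
    for name, s in stats.items():
        entry = {"type": s.dtype, "shape": None}
        # Assumption: a fixed shape is only safe when the feature is always
        # present and carries the same number of values in every example.
        always_present = s.num_missing == 0
        fixed_length = s.min_num_values == s.max_num_values
        if infer_feature_shape and always_present and fixed_length:
            entry["shape"] = [s.max_num_values]
        schema[name] = entry
    return schema


# Illustrative statistics for two made-up features.
stats = {
    "age": FeatureStats("INT", 100, 0, 1, 1),
    "tags": FeatureStats("BYTES", 100, 12, 1, 5),
}
with_shape = infer_toy_schema(stats, infer_feature_shape=True)
without_shape = infer_toy_schema(stats, infer_feature_shape=False)
```

In this sketch, only "age" receives a shape when infer_feature_shape is True, and no feature receives one when it is False. A feature without a shape corresponds to the case the Args description says a downstream TensorFlow Transform component parses as tf.SparseTensor.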
Component outputs contains:
- schema: Channel of type standard_artifacts.Schema for schema result.
See the SchemaGen guide for more details.
Args |
statistics | A BaseChannel of ExampleStatistics type (required if spec is not passed). This should contain at least a train split. Other splits are currently ignored. Required.
infer_feature_shape | Boolean (or RuntimeParameter) value indicating whether to infer the shape of features. If the feature shape is not inferred, a downstream TensorFlow Transform component that uses the schema will parse input as tf.SparseTensor. Defaults to True if not set.
exclude_splits | Names of splits that will not be taken into consideration when auto-generating a schema. The default (exclude_splits set to None) excludes no splits.
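The exclude_splits default can be read as a simple filter over split names. The sketch below is a hypothetical helper (splits_to_consider is not part of TFX) that only mirrors the documented semantics: None excludes nothing, and any named split is dropped before schema generation.

```python
from typing import List, Optional


def splits_to_consider(
    available_splits: List[str], exclude_splits: Optional[List[str]] = None
) -> List[str]:
    """Hypothetical helper: returns the splits left after exclusion."""
    # None means "exclude nothing", matching the documented default.
    excluded = set(exclude_splits or [])
    return [name for name in available_splits if name not in excluded]


all_splits = ["train", "eval"]
default_case = splits_to_consider(all_splits)
train_only = splits_to_consider(all_splits, exclude_splits=["eval"])
```

Here default_case keeps both splits, while train_only keeps only "train", which corresponds to passing exclude_splits=['eval'] to the component.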
Attributes |
outputs | Component's output channel dict.
Methods
with_node_execution_options
with_node_execution_options(
node_execution_options: utils.NodeExecutionOptions
) -> typing_extensions.Self
Class Variables |
POST_EXECUTABLE_SPEC | None
PRE_EXECUTABLE_SPEC | None
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2024-06-21 UTC.