Artifact that contains the training data.
Inherits From: Artifact
tfx.v1.types.standard_artifacts.Examples(
*args, **kwargs
)
Training data should be brought into the TFX pipeline using components such as ExampleGen. Data in an Examples artifact is split, and each split is stored separately. The file format and payload format are optional custom properties; they must be specified when the data does not use the default formats. See https://www.tensorflow.org/tfx/guide/examplegen#span_version_and_split for details on span, version, and splits.
Properties:
span
: Integer that distinguishes a group of Examples.
version
: Integer that represents updated data.
splits
: A list of split names. For example, ["train", "test"].
File structure:
{uri}/
Split-{split_name1}/
: Files for split
- All direct children files are recognized as the data.
- File format and payload format are determined by custom properties.
Split-{split_name2}/
: Another split...
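The layout above can be illustrated with a small standalone sketch. It does not import TFX: `list_split_files` is a hypothetical helper, and the file name `data-00000-of-00001` is a placeholder chosen for the demo. The sketch only assumes what the layout states, namely that each split lives in `Split-{split_name}/` under the URI and that direct child files are the data.

```python
import os
import tempfile
from typing import Dict, List


def list_split_files(uri: str) -> Dict[str, List[str]]:
    """Map each split name to the sorted direct child files of Split-{name}/."""
    prefix = "Split-"
    result: Dict[str, List[str]] = {}
    for entry in sorted(os.listdir(uri)):
        split_dir = os.path.join(uri, entry)
        if entry.startswith(prefix) and os.path.isdir(split_dir):
            # Only direct children count as data; subdirectories are skipped.
            result[entry[len(prefix):]] = sorted(
                name
                for name in os.listdir(split_dir)
                if os.path.isfile(os.path.join(split_dir, name))
            )
    return result


# Build a throwaway layout with two splits (placeholder file names).
demo_uri = tempfile.mkdtemp()
for split_name in ("train", "test"):
    split_dir = os.path.join(demo_uri, f"Split-{split_name}")
    os.makedirs(split_dir)
    with open(os.path.join(split_dir, "data-00000-of-00001"), "w") as f:
        f.write("")

layout = list_split_files(demo_uri)
print(layout)
```

How the files inside each split directory are decoded is not covered by this sketch; that depends on the file and payload formats recorded on the artifact.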
Commonly used custom properties of the Examples artifact:
file_format
: A string that represents the file format. See tfx/components/util/tfxio_utils.py:make_tfxio for available values.
payload_format
: An int (enum) value of the data payload format. See tfx/proto/example_gen.proto:PayloadFormat for available formats.
Attributes | |
---|---|
splits
|
Methods
path
path(
*, split: str
) -> str
Path to the artifact URI's split subdirectory.
This method does NOT create the directory whose path it returns; the caller must create a directory at the returned path before writing to it.
Args | |
---|---|
split
|
A name of the split, e.g. "train", "validation", "test".
|
Raises | |
---|---|
ValueError
|
If the split is not in self.splits.
|
Returns | |
---|---|
A path to {self.uri}/Split-{split}.
|
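The contract described for `path` (validate the split, return `{uri}/Split-{split}`, create nothing on disk) can be sketched without TFX as follows. `split_path` is a hypothetical stand-in written for illustration, not the library's implementation, and the file name is a placeholder.

```python
import os
import tempfile
from typing import List


def split_path(uri: str, splits: List[str], split: str) -> str:
    """Stand-in for the documented contract; not the TFX implementation."""
    if split not in splits:
        raise ValueError(f"Split {split!r} not found in {splits}")
    # Only builds the path string; nothing is created on disk.
    return os.path.join(uri, f"Split-{split}")


demo_uri = tempfile.mkdtemp()
train_dir = split_path(demo_uri, ["train", "test"], split="train")
existed_before = os.path.isdir(train_dir)  # the directory is not there yet

# The caller creates the directory before writing into it.
os.makedirs(train_dir, exist_ok=True)
with open(os.path.join(train_dir, "data-00000-of-00001"), "w") as f:
    f.write("")

# An unknown split is rejected rather than silently mapped to a new directory.
try:
    split_path(demo_uri, ["train", "test"], split="eval")
    unknown_split_rejected = False
except ValueError:
    unknown_split_rejected = True
```

With a real Examples artifact, the same pattern would apply: take the value returned by `path(split=...)`, create that directory, then write.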
Class Variables | |
---|---|
PROPERTIES |
|
TYPE_NAME |
'Examples'
|