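The Feature Engineering Component of TensorFlow Extended (TFX)
This example Colab notebook provides a somewhat more advanced example of how TensorFlow Transform (tf.Transform) can be used to preprocess data using exactly the same code for both training a model and serving inferences in production.
TensorFlow Transform is a library for preprocessing input data for TensorFlow, including creating features that require a full pass over the training dataset. For example, using TensorFlow Transform you could:
- Normalize an input value by using the mean and standard deviation
- Convert strings to integers by generating a vocabulary over all of the input values
- Convert floats to integers by assigning them to buckets, based on the observed data distribution
TensorFlow has built-in support for manipulations on a single example or a batch of examples. tf.Transform extends these capabilities to support full passes over the entire training dataset.
The output of tf.Transform is exported as a TensorFlow graph which you can use for both training and serving. Using the same graph for both training and serving can prevent skew, since the same transformations are applied in both stages.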
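To make the "full pass" idea concrete, here is a minimal plain-Python sketch. It is not tf.Transform itself, and the helper names `analyze_mean_std` and `make_transform` are hypothetical. The first makes one pass over the whole training column to compute statistics; the second freezes those statistics into a per-example function that can be reused unchanged at serving time.

```python
import math


def analyze_mean_std(values):
    """Full pass over the training data: compute mean and population std."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    return mean, math.sqrt(variance)


def make_transform(mean, std):
    """Freeze the analyzed statistics into a per-example function."""
    def transform(value):
        return (value - mean) / std
    return transform


# Hypothetical training column (illustrative values only).
train_ages = [20.0, 30.0, 40.0, 50.0]
mean, std = analyze_mean_std(train_ages)
normalize = make_transform(mean, std)

# The same frozen function is applied during training and at serving time,
# so both stages see identical transformations.
train_normalized = [normalize(v) for v in train_ages]
serving_normalized = normalize(35.0)
print(train_normalized, serving_normalized)
```

Because the statistics are computed once and then treated as constants, a serving request never needs access to the training data.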
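What we're doing in this example
In this example we'll be processing a widely used dataset containing census data, and training a model to do classification. Along the way we'll be transforming the data using tf.Transform.
Install TensorFlow Transform
!pip install tensorflow-transform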
# This cell is only necessary because packages were installed while python was
# running. It avoids the need to restart the runtime when running in Colab.
import pkg_resources
import importlib
importlib.reload(pkg_resources)
Imports and globals
First import the stuff we need.
import math
import os
import pprint
import pandas as pd
import matplotlib.pyplot as plt
import tensorflow as tf
print('TF: {}'.format(tf.__version__))
import apache_beam as beam
print('Beam: {}'.format(beam.__version__))
import tensorflow_transform as tft
import tensorflow_transform.beam as tft_beam
from tensorflow_transform.keras_lib import tf_keras
print('Transform: {}'.format(tft.__version__))
from tfx_bsl.public import tfxio
from tfx_bsl.coders.example_coder import RecordBatchToExamplesEncoder
TF: 2.15.1
Beam: 2.55.1
Transform: 1.15.0
Next download the data files:
!wget https://storage.googleapis.com/artifacts.tfx-oss-public.appspot.com/datasets/census/adult.data
!wget https://storage.googleapis.com/artifacts.tfx-oss-public.appspot.com/datasets/census/adult.test
train_path = './adult.data'
test_path = './adult.test'
Name our columns
We'll create some handy lists for referencing the columns in our dataset.
CATEGORICAL_FEATURE_KEYS = [
    'workclass',
    'education',
    'marital-status',
    'occupation',
    'relationship',
    'race',
    'sex',
    'native-country',
]
NUMERIC_FEATURE_KEYS = [
    'age',
    'capital-gain',
    'capital-loss',
    'hours-per-week',
    'education-num'
]
ORDERED_CSV_COLUMNS = [
    'age', 'workclass', 'fnlwgt', 'education', 'education-num',
    'marital-status', 'occupation', 'relationship', 'race', 'sex',
    'capital-gain', 'capital-loss', 'hours-per-week', 'native-country', 'label'
]
LABEL_KEY = 'label'
Here's a quick preview of the data:
pandas_train = pd.read_csv(train_path, header=None, names=ORDERED_CSV_COLUMNS)
pandas_train.head(5)
one_row = dict(pandas_train.loc[0])
COLUMN_DEFAULTS = [
    '' if isinstance(v, str) else 0.0
    for v in dict(pandas_train.loc[1]).values()]
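The test data has 1 header line that needs to be skipped, and a trailing "." at the end of each line.
# skiprows=1 drops the header line; header=None keeps the first data row,
# which header=1 would otherwise consume as a column-name row.
pandas_test = pd.read_csv(test_path, skiprows=1, header=None, names=ORDERED_CSV_COLUMNS)
pandas_test.head(5)
testing = os.getenv("WEB_TEST_BROWSER", False)
if testing:
  pandas_train = pandas_train.loc[:1]
  pandas_test = pandas_test.loc[:1]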
Define our features and schema
Let's define a schema based on what types the columns are in our input. Among other things this will help with importing them correctly.
RAW_DATA_FEATURE_SPEC = dict(
    [(name, tf.io.FixedLenFeature([], tf.string))
     for name in CATEGORICAL_FEATURE_KEYS] +
    [(name, tf.io.FixedLenFeature([], tf.float32))
     for name in NUMERIC_FEATURE_KEYS] +
    [(LABEL_KEY, tf.io.FixedLenFeature([], tf.string))]
)
SCHEMA = tft.DatasetMetadata.from_feature_spec(RAW_DATA_FEATURE_SPEC).schema
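[Optional] Encode and decode tf.train.Example protos
This tutorial needs to convert examples from the dataset to and from tf.train.Example protos in a few places.
The hidden encode_example function below converts a dictionary of features from the dataset to a tf.train.Example.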
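The body of encode_example is not shown in this text. The following is a minimal sketch of what such a function might look like, assuming the feature keys defined earlier (repeated here so the block is self-contained) and the standard tf.train.Example proto classes. The hidden original may differ in its details.

```python
import tensorflow as tf

# Repeated from the column definitions so this sketch is self-contained.
CATEGORICAL_FEATURE_KEYS = [
    'workclass', 'education', 'marital-status', 'occupation',
    'relationship', 'race', 'sex', 'native-country',
]
NUMERIC_FEATURE_KEYS = [
    'age', 'capital-gain', 'capital-loss', 'hours-per-week', 'education-num',
]
LABEL_KEY = 'label'


def encode_example(input_features):
  """Sketch: convert a feature dict (or pandas row) to a tf.train.Example."""
  input_features = dict(input_features)
  output_features = {}
  # Categorical features are stored as whitespace-stripped byte strings.
  for key in CATEGORICAL_FEATURE_KEYS:
    value = input_features[key].strip().encode()
    output_features[key] = tf.train.Feature(
        bytes_list=tf.train.BytesList(value=[value]))
  # Numeric features are stored as floats.
  for key in NUMERIC_FEATURE_KEYS:
    value = float(input_features[key])
    output_features[key] = tf.train.Feature(
        float_list=tf.train.FloatList(value=[value]))
  # The label is optional: it is only written when present in the input.
  label_value = input_features.get(LABEL_KEY)
  if label_value is not None:
    output_features[LABEL_KEY] = tf.train.Feature(
        bytes_list=tf.train.BytesList(value=[label_value.strip().encode()]))
  return tf.train.Example(
      features=tf.train.Features(feature=output_features))
```

Casting numeric values with float() is a choice made in this sketch so that integer-typed pandas values are accepted by FloatList.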
Now you can convert dataset examples into Example protos:
tf_example = encode_example(pandas_train.loc[0])
tf_example.features.feature['age']
float_list { value: 39.0 }
serialized_example_batch = tf.constant([
    encode_example(pandas_train.loc[i]).SerializeToString()
    for i in range(3)
])
serialized_example_batch
<tf.Tensor: shape=(3,), dtype=string, numpy= array([b'\n\xf9\x02\n\x0f\n\x03age\x12\x08\x12\x06\n\x04\x00\x00\x1cB\n\x12\n\x05label\x12\t\n\x07\n\x05<=50K\n\x1a\n\x0ehours-per-week\x12\x08\x12\x06\n\x04\x00\x00 B\n#\n\x0enative-country\x12\x11\n\x0f\n\rUnited-States\n\x1a\n\tworkclass\x12\r\n\x0b\n\tState-gov\n\x0f\n\x03sex\x12\x08\n\x06\n\x04Male\n\x18\n\x0ccapital-loss\x12\x08\x12\x06\n\x04\x00\x00\x00\x00\n\x19\n\reducation-num\x12\x08\x12\x06\n\x04\x00\x00PA\n!\n\x0crelationship\x12\x11\n\x0f\n\rNot-in-family\n\x1e\n\noccupation\x12\x10\n\x0e\n\x0cAdm-clerical\n#\n\x0emarital-status\x12\x11\n\x0f\n\rNever-married\n\x11\n\x04race\x12\t\n\x07\n\x05White\n\x1a\n\teducation\x12\r\n\x0b\n\tBachelors\n\x18\n\x0ccapital-gain\x12\x08\x12\x06\n\x04\x00\xe0\x07E', b'\n\x82\x03\n\x12\n\x05label\x12\t\n\x07\n\x05<=50K\n\x1a\n\teducation\x12\r\n\x0b\n\tBachelors\n\x18\n\x0ccapital-gain\x12\x08\x12\x06\n\x04\x00\x00\x00\x00\n(\n\x0emarital-status\x12\x16\n\x14\n\x12Married-civ-spouse\n\x1b\n\x0crelationship\x12\x0b\n\t\n\x07Husband\n\x19\n\reducation-num\x12\x08\x12\x06\n\x04\x00\x00PA\n\x11\n\x04race\x12\t\n\x07\n\x05White\n\x18\n\x0ccapital-loss\x12\x08\x12\x06\n\x04\x00\x00\x00\x00\n#\n\x0enative-country\x12\x11\n\x0f\n\rUnited-States\n!\n\noccupation\x12\x13\n\x11\n\x0fExec-managerial\n!\n\tworkclass\x12\x14\n\x12\n\x10Self-emp-not-inc\n\x0f\n\x03age\x12\x08\x12\x06\n\x04\x00\x00HB\n\x1a\n\x0ehours-per-week\x12\x08\x12\x06\n\x04\x00\x00PA\n\x0f\n\x03sex\x12\x08\n\x06\n\x04Male', 
b'\n\xf5\x02\n\x19\n\reducation-num\x12\x08\x12\x06\n\x04\x00\x00\x10A\n!\n\x0crelationship\x12\x11\n\x0f\n\rNot-in-family\n#\n\noccupation\x12\x15\n\x13\n\x11Handlers-cleaners\n\x0f\n\x03age\x12\x08\x12\x06\n\x04\x00\x00\x18B\n\x18\n\tworkclass\x12\x0b\n\t\n\x07Private\n\x18\n\x0ccapital-gain\x12\x08\x12\x06\n\x04\x00\x00\x00\x00\n\x18\n\x0ccapital-loss\x12\x08\x12\x06\n\x04\x00\x00\x00\x00\n\x12\n\x05label\x12\t\n\x07\n\x05<=50K\n\x0f\n\x03sex\x12\x08\n\x06\n\x04Male\n\x1a\n\x0ehours-per-week\x12\x08\x12\x06\n\x04\x00\x00 B\n\x18\n\teducation\x12\x0b\n\t\n\x07HS-grad\n\x11\n\x04race\x12\t\n\x07\n\x05White\n\x1e\n\x0emarital-status\x12\x0c\n\n\n\x08Divorced\n#\n\x0enative-country\x12\x11\n\x0f\n\rUnited-States'], dtype=object)>
You can also convert batches of serialized Example protos back into a dictionary of tensors:
decoded_tensors = tf.io.parse_example(
    serialized_example_batch,
    features=RAW_DATA_FEATURE_SPEC
)
In some cases the label will not be passed in, so the encode function is written so that the label is optional:
features_dict = dict(pandas_train.loc[0])
features_dict.pop(LABEL_KEY)
LABEL_KEY in features_dict
False
When creating the Example proto it will simply not contain the label key.
no_label_example = encode_example(features_dict)
LABEL_KEY in no_label_example.features.feature.keys()
False
Setting hyperparameters and basic housekeeping
Constants and hyperparameters used for training.
NUM_OOV_BUCKETS = 1
EPOCH_SPLITS = 10
TRAIN_NUM_EPOCHS = 2*EPOCH_SPLITS
NUM_TRAIN_INSTANCES = len(pandas_train)
NUM_TEST_INSTANCES = len(pandas_test)
BATCH_SIZE = 128
STEPS_PER_TRAIN_EPOCH = tf.math.ceil(NUM_TRAIN_INSTANCES/BATCH_SIZE/EPOCH_SPLITS)
EVALUATION_STEPS = tf.math.ceil(NUM_TEST_INSTANCES/BATCH_SIZE)
# Names of temp files
TRANSFORMED_TRAIN_DATA_FILEBASE = 'train_transformed'
TRANSFORMED_TEST_DATA_FILEBASE = 'test_transformed'
EXPORTED_MODEL_DIR = 'exported_model_dir'
if testing:
  TRAIN_NUM_EPOCHS = 1
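The step counts above are plain ceiling arithmetic. The sketch below repeats the calculation with Python's math.ceil. The instance counts are assumptions: 32,561 and 16,281 are the commonly cited sizes of adult.data and adult.test, but the text does not state them, so treat the results as illustrative.

```python
import math

BATCH_SIZE = 128
EPOCH_SPLITS = 10

# Assumed row counts; in the notebook these come from len(pandas_train)
# and len(pandas_test).
num_train_instances = 32561
num_test_instances = 16281

# Each Keras "epoch" covers one tenth of the training data.
steps_per_train_epoch = math.ceil(
    num_train_instances / BATCH_SIZE / EPOCH_SPLITS)
evaluation_steps = math.ceil(num_test_instances / BATCH_SIZE)
train_num_epochs = 2 * EPOCH_SPLITS

print(steps_per_train_epoch, evaluation_steps, train_num_epochs)
```

With EPOCH_SPLITS = 10, the 2 * EPOCH_SPLITS training epochs add up to roughly two full passes over the training data, reported in smaller increments.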
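Preprocessing with tf.Transform
Create a tf.Transform preprocessing_fn
The preprocessing function is the most important concept of tf.Transform. A preprocessing function is where the transformation of the dataset really happens. It accepts and returns a dictionary of tensors, where a tensor means a Tensor or SparseTensor. There are two main groups of API calls that typically form the heart of a preprocessing function:
- TensorFlow Ops: Any function that accepts and returns tensors, which usually means TensorFlow ops. These add TensorFlow operations to the graph that transform raw data into transformed data one feature vector at a time. These will run for every example, during both training and serving.
- TensorFlow Transform Analyzers/Mappers: Any of the analyzers/mappers provided by tf.Transform. These also accept and return tensors, and typically contain a combination of TensorFlow ops and Beam computation. Unlike TensorFlow ops, the Beam computation runs only once, during analysis prior to training, and typically makes a full pass over the entire training dataset. The results become tf.constant tensors that are added to your graph. For example, tft.min computes the minimum of a tensor over the training dataset.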
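Here is a preprocessing_fn for this dataset. It does several things:
- Using tft.scale_to_0_1, it scales the numeric features to the [0,1] range.
- Using tft.compute_and_apply_vocabulary, it computes a vocabulary for each of the categorical features, and returns the integer IDs for each input as a tf.int64. This applies both to string and integer categorical inputs.
- It applies some manual transformations to the data using standard TensorFlow operations. Here these operations are applied to the label but could transform the features as well. The TensorFlow operations do several things:
  - They build a lookup table for the label (the tf.init_scope ensures that the table is only created the first time the function is called).
  - They normalize the text of the label.
  - They convert the label to a one-hot encoding.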
def preprocessing_fn(inputs):
  """Preprocess input columns into transformed columns."""
  # Since we are modifying some features and leaving others unchanged, we
  # start by setting `outputs` to a copy of `inputs`.
  outputs = inputs.copy()
  # Scale numeric columns to have range [0, 1].
  for key in NUMERIC_FEATURE_KEYS:
    outputs[key] = tft.scale_to_0_1(inputs[key])
  # For all categorical columns except the label column, we compute a
  # vocabulary and replace each string with its integer id. The vocabulary
  # is also written to a file named after the feature key.
  for key in CATEGORICAL_FEATURE_KEYS:
    outputs[key] = tft.compute_and_apply_vocabulary(
        tf.strings.strip(inputs[key]),
        num_oov_buckets=NUM_OOV_BUCKETS,
        vocab_filename=key)
  # For the label column we provide the mapping from string to index.
  table_keys = ['>50K', '<=50K']
  with tf.init_scope():
    initializer = tf.lookup.KeyValueTensorInitializer(
        keys=table_keys,
        values=tf.cast(tf.range(len(table_keys)), tf.int64),
        key_dtype=tf.string,
        value_dtype=tf.int64)
    table = tf.lookup.StaticHashTable(initializer, default_value=-1)
  # Remove trailing periods for test data when the data is read with tf.data.
  # label_str = tf.sparse.to_dense(inputs[LABEL_KEY])
  label_str = inputs[LABEL_KEY]
  label_str = tf.strings.regex_replace(label_str, r'\.$', '')
  label_str = tf.strings.strip(label_str)
  data_labels = table.lookup(label_str)
  transformed_label = tf.one_hot(
      indices=data_labels, depth=len(table_keys), on_value=1.0, off_value=0.0)
  outputs[LABEL_KEY] = tf.reshape(transformed_label, [-1, len(table_keys)])
  return outputs
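To make the semantics of these steps concrete, here is a plain-Python sketch of what the function computes for a single column. It is an approximation for illustration, not the tf.Transform implementation. The helper names are hypothetical, tie-breaking in the vocabulary and the constant-column case of scaling are not modeled, and unseen values are mapped to a single extra id (one out-of-vocabulary bucket).

```python
import re
from collections import Counter


def scale_to_0_1_sketch(values):
    """Min-max scale using the minimum and maximum of the whole column."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def compute_vocabulary_sketch(values):
    """Assign ids by descending frequency over the whole column."""
    counts = Counter(v.strip() for v in values)
    return {token: i for i, (token, _) in enumerate(counts.most_common())}


def apply_vocabulary_sketch(value, vocab):
    """Look up an id; unseen tokens share one id just past the vocabulary."""
    return vocab.get(value.strip(), len(vocab))


TABLE_KEYS = ['>50K', '<=50K']


def encode_label_sketch(raw_label):
    """Drop a trailing period, strip whitespace, then one-hot encode."""
    label = re.sub(r'\.$', '', raw_label).strip()
    index = TABLE_KEYS.index(label) if label in TABLE_KEYS else -1
    return [1.0 if i == index else 0.0 for i in range(len(TABLE_KEYS))]


# Hypothetical column values, for illustration only.
scaled = scale_to_0_1_sketch([17.0, 39.0, 90.0])
vocab = compute_vocabulary_sketch(
    [' Private', ' Private', ' State-gov', ' Private', ' State-gov',
     ' Self-emp-inc'])
known_id = apply_vocabulary_sketch(' State-gov', vocab)
unseen_id = apply_vocabulary_sketch(' Never-worked', vocab)
test_style_label = encode_label_sketch(' <=50K.')
print(scaled, vocab, known_id, unseen_id, test_style_label)
```

As with the lookup table's default value of -1 in preprocessing_fn, an unrecognized label in this sketch produces an all-zero vector.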
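Syntax
You're almost ready to put everything together and use Apache Beam to run it.
Apache Beam uses a special syntax to define and invoke transforms. For example, in this line:
result = pass_this | 'name this step' >> to_this_call
The method to_this_call is being invoked and passed the object called pass_this, and this operation will be referred to as name this step in a stack trace. The result of the call to to_this_call is returned in result. You will often see stages of a pipeline chained together like this:
result = apache_beam.Pipeline() | 'first step' >> do_this_first() | 'second step' >> do_this_last()
and since that started with a new pipeline, you can continue like this:
next_result = result | 'doing more stuff' >> another_function()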
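This syntax relies on Python operator overloading. The toy class below is a sketch of the idea only: it is not Beam's implementation, and `Step` is a hypothetical name. `'label' >> step` attaches a name through `__rrshift__`, and `data | step` applies the step through `__ror__`.

```python
class Step:
    """Toy stand-in for a Beam transform: wraps a function over a list."""

    def __init__(self, fn, label=None):
        self.fn = fn
        self.label = label or fn.__name__

    def __rrshift__(self, label):
        # Handles `'some label' >> step`: str has no `>>`, so Python falls
        # back to the right operand's reflected method.
        return Step(self.fn, label)

    def __ror__(self, data):
        # Handles `data | step`: apply the wrapped function to the input.
        print(f'running step: {self.label}')
        return self.fn(data)


def double_all(values):
    return [v * 2 for v in values]


def keep_large(values):
    return [v for v in values if v > 4]


# `>>` binds more tightly than `|`, so each label is attached to its step
# before the steps are chained from left to right.
result = [1, 2, 3] | 'double' >> Step(double_all) | 'filter' >> Step(keep_large)
print(result)
```

Unlike this toy, which runs each step immediately, Beam only records the steps when the expression is evaluated; the work happens when the pipeline runs.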
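Transform the data
Now we're ready to start transforming our data in an Apache Beam pipeline.
- Read in the data using the tfxio.CsvTFXIO CSV reader (to process lines of text in a pipeline use tfxio.BeamRecordCsvTFXIO instead).
- Analyze and transform the data using the preprocessing_fn defined above.
- Write out the result as a TFRecord of Example protos, which we will use for training a model later.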
import tempfile  # Needed for the Beam temp_dir below.
def transform_data(train_data_file, test_data_file, working_dir):
  """Transform the data and write out as a TFRecord of Example protos.
  Read in the data using the CSV reader, and transform it using a
  preprocessing pipeline that scales numeric data and converts categorical data
  from strings to int64 values indices, by creating a vocabulary for each
  category.
  Args:
    train_data_file: File containing training data
    test_data_file: File containing test data
    working_dir: Directory to write transformed data and metadata to
  """
  # The "with" block will create a pipeline, and run that pipeline at the exit
  # of the block.
  with beam.Pipeline() as pipeline:
    with tft_beam.Context(temp_dir=tempfile.mkdtemp()):
      # Create a TFXIO to read the census data with the schema. To do this we
      # need to list all columns in order since the schema doesn't specify the
      # order of columns in the csv.
      # tfxio.CsvTFXIO both reads the CSV files and parses them into TFT
      # inputs. If the raw text lines had to be patched inside the pipeline
      # first, BeamRecordCsvTFXIO, whose .BeamSource() accepts a
      # PCollection[bytes], could be used instead.
      train_csv_tfxio = tfxio.CsvTFXIO(
          file_pattern=train_data_file,
          telemetry_descriptors=[],
          column_names=ORDERED_CSV_COLUMNS,
          schema=SCHEMA)
      # Read in raw data and convert using CSV TFXIO.
      raw_data = (
          pipeline |
          'ReadTrainCsv' >> train_csv_tfxio.BeamSource())
      # Combine data and schema into a dataset tuple. Note that we already used
      # the schema to read the CSV data, but we also need it to interpret
      # raw_data.
      cfg = train_csv_tfxio.TensorAdapterConfig()
      raw_dataset = (raw_data, cfg)
      # The TFXIO output format is chosen for improved performance.
      transformed_dataset, transform_fn = (
          raw_dataset | tft_beam.AnalyzeAndTransformDataset(
              preprocessing_fn, output_record_batches=True))
      # Transformed metadata is not necessary for encoding.
      transformed_data, _ = transformed_dataset
      # Extract transformed RecordBatches, encode and write them to the given
      # directory.
      coder = RecordBatchToExamplesEncoder()
      _ = (
          transformed_data
          | 'EncodeTrainData' >>
          beam.FlatMapTuple(lambda batch, _: coder.encode(batch))
          | 'WriteTrainData' >> beam.io.WriteToTFRecord(
              os.path.join(working_dir, TRANSFORMED_TRAIN_DATA_FILEBASE)))
      # Now apply the transform function to the test data. The header line
      # present in the test data file is skipped here; the trailing period at
      # the end of each line is removed by preprocessing_fn.
      test_csv_tfxio = tfxio.CsvTFXIO(
          file_pattern=test_data_file,
          skip_header_lines=1,
          telemetry_descriptors=[],
          column_names=ORDERED_CSV_COLUMNS,
          schema=SCHEMA)
      raw_test_data = (
          pipeline
          | 'ReadTestCsv' >> test_csv_tfxio.BeamSource())
      raw_test_dataset = (raw_test_data, test_csv_tfxio.TensorAdapterConfig())
      # The TFXIO output format is chosen for improved performance.
      transformed_test_dataset = (
          (raw_test_dataset, transform_fn)
          | tft_beam.TransformDataset(output_record_batches=True))
      # Transformed metadata is not necessary for encoding.
      transformed_test_data, _ = transformed_test_dataset
      # Extract transformed RecordBatches, encode and write them to the given
      # directory.
      _ = (
          transformed_test_data
          | 'EncodeTestData' >>
          beam.FlatMapTuple(lambda batch, _: coder.encode(batch))
          | 'WriteTestData' >> beam.io.WriteToTFRecord(
              os.path.join(working_dir, TRANSFORMED_TEST_DATA_FILEBASE)))
      # Will write a SavedModel and metadata to working_dir, which can then
      # be read by the tft.TFTransformOutput class.
      _ = (
          transform_fn
          | 'WriteTransformFn' >> tft_beam.WriteTransformFn(working_dir))
Run the pipeline
import tempfile
import pathlib
output_dir = os.path.join(tempfile.mkdtemp(), 'keras')
transform_data(train_path, test_path, output_dir)
Wrap up the output directory as a tft.TFTransformOutput:
tf_transform_output = tft.TFTransformOutput(output_dir)
tf_transform_output.transformed_feature_spec()
{'age': FixedLenFeature(shape=[], dtype=tf.float32, default_value=None), 'capital-gain': FixedLenFeature(shape=[], dtype=tf.float32, default_value=None), 'capital-loss': FixedLenFeature(shape=[], dtype=tf.float32, default_value=None), 'education': FixedLenFeature(shape=[], dtype=tf.int64, default_value=None), 'education-num': FixedLenFeature(shape=[], dtype=tf.float32, default_value=None), 'hours-per-week': FixedLenFeature(shape=[], dtype=tf.float32, default_value=None), 'label': FixedLenFeature(shape=[2], dtype=tf.float32, default_value=None), 'marital-status': FixedLenFeature(shape=[], dtype=tf.int64, default_value=None), 'native-country': FixedLenFeature(shape=[], dtype=tf.int64, default_value=None), 'occupation': FixedLenFeature(shape=[], dtype=tf.int64, default_value=None), 'race': FixedLenFeature(shape=[], dtype=tf.int64, default_value=None), 'relationship': FixedLenFeature(shape=[], dtype=tf.int64, default_value=None), 'sex': FixedLenFeature(shape=[], dtype=tf.int64, default_value=None), 'workclass': FixedLenFeature(shape=[], dtype=tf.int64, default_value=None)}
If you look in the directory, you'll see it contains three things:
- The train_transformed and test_transformed data files
- The transform_fn directory (a tf.saved_model)
- The transformed_metadata
The following sections show how to use these artifacts to train a model.
!ls -l {output_dir}
total 15704
-rw-rw-r-- 1 kbuilder kbuilder 5356449 Apr 30 10:49 test_transformed-00000-of-00001
-rw-rw-r-- 1 kbuilder kbuilder 10712569 Apr 30 10:49 train_transformed-00000-of-00001
drwxr-xr-x 4 kbuilder kbuilder 4096 Apr 30 10:49 transform_fn
drwxr-xr-x 2 kbuilder kbuilder 4096 Apr 30 10:49 transformed_metadata
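Using our preprocessed data to train a model using tf_keras
To show how tf.Transform enables us to use the same code for both training and serving, and thus prevent skew, we're going to train a model. To train our model and prepare our trained model for production we need to create input functions. The main difference between our training input function and our serving input function is that training data contains the labels, and production data does not. The arguments and return values are also somewhat different.
Create an input function for training
Running the pipeline in the previous section created TFRecord files containing the transformed data.
The following code uses tf.data.experimental.make_batched_features_dataset and tft.TFTransformOutput.transformed_feature_spec to read these data files as a tf.data.Dataset: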
def _make_training_input_fn(tf_transform_output, train_file_pattern,
                            batch_size):
  """An input function reading from transformed data, converting to model input.
  Args:
    tf_transform_output: Wrapper around output of tf.Transform.
    train_file_pattern: File pattern matching the transformed example files.
    batch_size: Batch size.
  Returns:
    A function that returns the input data for training or eval, in the form
    of a tf.data.Dataset of (features, label) pairs.
  """
  def input_fn():
    return tf.data.experimental.make_batched_features_dataset(
        file_pattern=train_file_pattern,
        batch_size=batch_size,
        features=tf_transform_output.transformed_feature_spec(),
        reader=tf.data.TFRecordDataset,
        label_key=LABEL_KEY,
        shuffle=True)
  return input_fn
train_file_pattern = pathlib.Path(output_dir)/f'{TRANSFORMED_TRAIN_DATA_FILEBASE}*'
input_fn = _make_training_input_fn(
    tf_transform_output=tf_transform_output,
    train_file_pattern=str(train_file_pattern),
    batch_size=10
)
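Below you can see a transformed sample of the data. Note how the numeric columns like education-num and hours-per-week are converted to floats with a range of [0,1], and the string columns have been converted to IDs:
for example, label in input_fn().take(1):
  break
pd.DataFrame(example)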
label
<tf.Tensor: shape=(10, 2), dtype=float32, numpy= array([[0., 1.], [1., 0.], [0., 1.], [0., 1.], [1., 0.], [0., 1.], [1., 0.], [0., 1.], [0., 1.], [0., 1.]], dtype=float32)>
Train, evaluate the model
Build the model
def build_keras_model(working_dir):
  inputs = build_keras_inputs(working_dir)
  encoded_inputs = encode_inputs(inputs)
  stacked_inputs = tf.concat(tf.nest.flatten(encoded_inputs), axis=1)
  output = tf_keras.layers.Dense(100, activation='relu')(stacked_inputs)
  output = tf_keras.layers.Dense(50, activation='relu')(output)
  output = tf_keras.layers.Dense(2)(output)
  model = tf_keras.Model(inputs=inputs, outputs=output)
  return model
def build_keras_inputs(working_dir):
  tf_transform_output = tft.TFTransformOutput(working_dir)
  feature_spec = tf_transform_output.transformed_feature_spec().copy()
  feature_spec.pop(LABEL_KEY)
  # Build the `keras.Input` objects.
  inputs = {}
  for key, spec in feature_spec.items():
    if isinstance(spec, tf.io.VarLenFeature):
      inputs[key] = tf_keras.layers.Input(
          shape=[None], name=key, dtype=spec.dtype, sparse=True)
    elif isinstance(spec, tf.io.FixedLenFeature):
      inputs[key] = tf_keras.layers.Input(
          shape=spec.shape, name=key, dtype=spec.dtype)
    else:
      raise ValueError('Spec type is not supported: ', key, spec)
  return inputs
def encode_inputs(inputs):
  # Note: this relies on the module-level `tf_transform_output` defined earlier.
  encoded_inputs = {}
  for key in inputs:
    feature = tf.expand_dims(inputs[key], -1)
    if key in CATEGORICAL_FEATURE_KEYS:
      num_buckets = tf_transform_output.num_buckets_for_transformed_feature(key)
      encoding_layer = (
          tf_keras.layers.CategoryEncoding(
              num_tokens=num_buckets, output_mode='binary', sparse=False))
      encoded_inputs[key] = encoding_layer(feature)
    else:
      encoded_inputs[key] = feature
  return encoded_inputs
model = build_keras_model(output_dir)
tf_keras.utils.plot_model(model, rankdir='LR', show_shapes=True)
Build the datasets
def get_dataset(working_dir, filebase):
  tf_transform_output = tft.TFTransformOutput(working_dir)
  data_path_pattern = os.path.join(
      working_dir,
      filebase + '*')
  input_fn = _make_training_input_fn(
      tf_transform_output,
      data_path_pattern,
      batch_size=BATCH_SIZE)
  dataset = input_fn()
  return dataset
Train and evaluate the model
def train_and_evaluate(
    model,
    working_dir):
  """Train the model on training data and evaluate on test data.
  Args:
    model: The Keras model to train.
    working_dir: The location of the Transform output.
  Returns:
    The trained model, the training history, and the metric values from the
    model's `evaluate` method.
  """
  train_dataset = get_dataset(working_dir, TRANSFORMED_TRAIN_DATA_FILEBASE)
  validation_dataset = get_dataset(working_dir, TRANSFORMED_TEST_DATA_FILEBASE)
  # Train the model that was passed in rather than rebuilding a new one.
  history = train_model(model, train_dataset, validation_dataset)
  metric_values = model.evaluate(validation_dataset,
                                 steps=EVALUATION_STEPS,
                                 return_dict=True)
  return model, history, metric_values
def train_model(model, train_dataset, validation_dataset):
  model.compile(optimizer='adam',
                loss=tf.losses.CategoricalCrossentropy(from_logits=True),
                metrics=['accuracy'])
  history = model.fit(train_dataset, validation_data=validation_dataset,
                      epochs=TRAIN_NUM_EPOCHS,
                      steps_per_epoch=STEPS_PER_TRAIN_EPOCH,
                      validation_steps=EVALUATION_STEPS)
  return history
model, history, metric_values = train_and_evaluate(model, output_dir)
Epoch 1/20
26/26 [==============================] - 4s 70ms/step - loss: 0.5136 - accuracy: 0.7578 - val_loss: 0.4207 - val_accuracy: 0.8198
Epoch 2/20
26/26 [==============================] - 1s 27ms/step - loss: 0.3934 - accuracy: 0.8185 - val_loss: 0.3671 - val_accuracy: 0.8317
Epoch 3/20
26/26 [==============================] - 1s 27ms/step - loss: 0.3696 - accuracy: 0.8272 - val_loss: 0.3548 - val_accuracy: 0.8365
Epoch 4/20
26/26 [==============================] - 1s 27ms/step - loss: 0.3499 - accuracy: 0.8314 - val_loss: 0.3528 - val_accuracy: 0.8383
Epoch 5/20
26/26 [==============================] - 1s 27ms/step - loss: 0.3503 - accuracy: 0.8401 - val_loss: 0.3478 - val_accuracy: 0.8408
Epoch 6/20
26/26 [==============================] - 1s 27ms/step - loss: 0.3506 - accuracy: 0.8416 - val_loss: 0.3453 - val_accuracy: 0.8411
Epoch 7/20
26/26 [==============================] - 1s 26ms/step - loss: 0.3511 - accuracy: 0.8380 - val_loss: 0.3430 - val_accuracy: 0.8410
Epoch 8/20
26/26 [==============================] - 1s 27ms/step - loss: 0.3349 - accuracy: 0.8434 - val_loss: 0.3441 - val_accuracy: 0.8375
Epoch 9/20
26/26 [==============================] - 1s 26ms/step - loss: 0.3473 - accuracy: 0.8296 - val_loss: 0.3390 - val_accuracy: 0.8425
Epoch 10/20
26/26 [==============================] - 1s 27ms/step - loss: 0.3377 - accuracy: 0.8389 - val_loss: 0.3472 - val_accuracy: 0.8401
Epoch 11/20
26/26 [==============================] - 1s 27ms/step - loss: 0.3446 - accuracy: 0.8383 - val_loss: 0.3403 - val_accuracy: 0.8413
Epoch 12/20
26/26 [==============================] - 1s 26ms/step - loss: 0.3343 - accuracy: 0.8471 - val_loss: 0.3335 - val_accuracy: 0.8447
Epoch 13/20
26/26 [==============================] - 1s 27ms/step - loss: 0.3303 - accuracy: 0.8534 - val_loss: 0.3384 - val_accuracy: 0.8416
Epoch 14/20
26/26 [==============================] - 1s 27ms/step - loss: 0.3400 - accuracy: 0.8407 - val_loss: 0.3340 - val_accuracy: 0.8453
Epoch 15/20
26/26 [==============================] - 1s 27ms/step - loss: 0.3374 - accuracy: 0.8410 - val_loss: 0.3347 - val_accuracy: 0.8448
Epoch 16/20
26/26 [==============================] - 1s 27ms/step - loss: 0.3279 - accuracy: 0.8459 - val_loss: 0.3326 - val_accuracy: 0.8450
Epoch 17/20
26/26 [==============================] - 1s 26ms/step - loss: 0.3184 - accuracy: 0.8474 - val_loss: 0.3341 - val_accuracy: 0.8447
Epoch 18/20
26/26 [==============================] - 1s 26ms/step - loss: 0.3393 - accuracy: 0.8410 - val_loss: 0.3332 - val_accuracy: 0.8433
Epoch 19/20
26/26 [==============================] - 1s 26ms/step - loss: 0.3356 - accuracy: 0.8368 - val_loss: 0.3300 - val_accuracy: 0.8454
Epoch 20/20
26/26 [==============================] - 1s 27ms/step - loss: 0.3283 - accuracy: 0.8438 - val_loss: 0.3298 - val_accuracy: 0.8434
128/128 [==============================] - 1s 4ms/step - loss: 0.3303 - accuracy: 0.8433
plt.plot(history.history['loss'], label='Train')
plt.plot(history.history['val_loss'], label='Eval')
plt.ylim(0,max(plt.ylim()))
plt.legend()
plt.title('Loss');
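Transform new data
In the previous section the training process used the hard copies of the transformed data that were generated by tft_beam.AnalyzeAndTransformDataset in the transform_data function.
To operate on new data you'll need to load the final version of the preprocessing_fn that was saved by tft_beam.WriteTransformFn.
The TFTransformOutput.transform_features_layer method loads the preprocessing_fn SavedModel from the output directory.
Here's a function to load new, unprocessed batches from a source file: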
def read_csv(file_name, batch_size):
  return tf.data.experimental.make_csv_dataset(
      file_pattern=file_name,
      batch_size=batch_size,
      column_names=ORDERED_CSV_COLUMNS,
      column_defaults=COLUMN_DEFAULTS,
      prefetch_buffer_size=0,
      ignore_errors=True)
for ex in read_csv(test_path, batch_size=5):
  break
pd.DataFrame(ex)
Load the tft.TransformFeaturesLayer to transform this data with the preprocessing_fn:
ex2 = ex.copy()
ex2.pop('fnlwgt')
tft_layer = tf_transform_output.transform_features_layer()
t_ex = tft_layer(ex2)
label = t_ex.pop(LABEL_KEY)
pd.DataFrame(t_ex)
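The tft_layer is smart enough to still execute the transformation if only a subset of features is passed in. For example, if you only pass in two features, you'll get just the transformed versions of those features back: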
ex2 = pd.DataFrame(ex)[['education', 'hours-per-week']]
ex2
pd.DataFrame(tft_layer(dict(ex2)))
Here's a more robust version that drops features that are not in the feature spec, and returns a (features, label) pair if the label is in the provided features:
class Transform(tf.Module):
  def __init__(self, working_dir):
    self.working_dir = working_dir
    self.tf_transform_output = tft.TFTransformOutput(working_dir)
    # Use this instance's transform output rather than a module-level global.
    self.tft_layer = self.tf_transform_output.transform_features_layer()
  @tf.function
  def __call__(self, features):
    raw_features = {}
    for key, val in features.items():
      # Skip unused keys
      if key not in RAW_DATA_FEATURE_SPEC:
        continue
      raw_features[key] = val
    # Apply the `preprocessing_fn`.
    transformed_features = self.tft_layer(raw_features)
    if LABEL_KEY in transformed_features:
      # Pop the label and return a (features, labels) pair.
      data_labels = transformed_features.pop(LABEL_KEY)
      return (transformed_features, data_labels)
    else:
      return transformed_features
transform = Transform(output_dir)
t_ex, t_label = transform(ex)
pd.DataFrame(t_ex)
Now you can use Dataset.map to apply that transformation on the fly to new data:
model.evaluate(
    read_csv(test_path, batch_size=5).map(transform),
    steps=EVALUATION_STEPS,
    return_dict=True
)
128/128 [==============================] - 1s 4ms/step - loss: 0.2992 - accuracy: 0.8547 {'loss': 0.2991926074028015, 'accuracy': 0.854687511920929}
导出模型
因此您有一个训练好的模型,以及一个将 preprocessing_fn
应用于新数据的方法。将它们组装成一个新的模型,该模型接受序列化 tf.train.Example
协议缓冲区作为输入。
import pathlib

class ServingModel(tf.Module):
  def __init__(self, model, working_dir):
    self.model = model
    self.working_dir = working_dir
    self.transform = Transform(working_dir)

  @tf.function(input_signature=[tf.TensorSpec(shape=[None], dtype=tf.string)])
  def __call__(self, serialized_tf_examples):
    # Parse the tf.train.Example protos. The label is not available at
    # serving time, so remove it from a copy of the feature spec.
    feature_spec = RAW_DATA_FEATURE_SPEC.copy()
    feature_spec.pop(LABEL_KEY)
    parsed_features = tf.io.parse_example(serialized_tf_examples, feature_spec)
    # Apply the `preprocessing_fn`
    transformed_features = self.transform(parsed_features)
    # Run the model
    outputs = self.model(transformed_features)
    # Format the output: repeat the class names once per example.
    classes_names = tf.constant([['0', '1']])
    classes = tf.tile(classes_names, [tf.shape(outputs)[0], 1])
    return {'classes': classes, 'scores': outputs}

  def export(self, output_dir):
    # Increment the directory number. This is required in order to make this
    # model servable with model_server.
    save_model_dir = pathlib.Path(output_dir)/'model'
    number_dirs = [int(p.name) for p in save_model_dir.glob('*')
                   if p.name.isdigit()]
    version = max([0] + number_dirs) + 1
    save_model_dir = save_model_dir/str(version)

    # Set the signature to make it visible for serving.
    concrete_serving_fn = self.__call__.get_concrete_function()
    signatures = {'serving_default': concrete_serving_fn}

    # Export the model.
    tf.saved_model.save(
        self,
        str(save_model_dir),
        signatures=signatures)
    return save_model_dir
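The directory numbering in `export` can be tried on its own with the standard library. This is a minimal sketch of the same idea, assuming a hypothetical helper name `next_version_dir`; it touches only a temporary directory and saves no model. TensorFlow Serving's model server looks for numeric version subdirectories under the model base path, which is why each export goes into the next free number.

```python
import pathlib
import tempfile


def next_version_dir(model_root):
    """Returns model_root/<N>, with N one more than the largest numeric subdirectory."""
    model_root = pathlib.Path(model_root)
    versions = [int(p.name) for p in model_root.glob("*") if p.name.isdigit()]
    return model_root / str(max([0] + versions) + 1)


with tempfile.TemporaryDirectory() as tmp:
    root = pathlib.Path(tmp) / "model"
    root.mkdir()
    first = next_version_dir(root)       # no versions exist yet
    for name in ("1", "2", "notes"):
        (root / name).mkdir()
    later = next_version_dir(root)       # the non-numeric "notes" is ignored
    print(first.name, later.name)
```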
Build the model and test-run it on the batch of serialized examples:
serving_model = ServingModel(model, output_dir)
serving_model(serialized_example_batch)
{'classes': <tf.Tensor: shape=(3, 2), dtype=string, numpy= array([[b'0', b'1'], [b'0', b'1'], [b'0', b'1']], dtype=object)>, 'scores': <tf.Tensor: shape=(3, 2), dtype=float32, numpy= array([[-1.6049761e+00, 9.7535902e-01], [-5.3329688e-01, -1.6330201e-03], [-1.8765860e+00, 1.5198938e+00]], dtype=float32)>}
Export the model as a SavedModel:
saved_model_dir = serving_model.export(output_dir)
saved_model_dir
INFO:tensorflow:Assets written to: /tmpfs/tmp/tmpckiw2b8s/keras/model/1/assets
PosixPath('/tmpfs/tmp/tmpckiw2b8s/keras/model/1')
Reload the model and test it on the same batch of examples:
reloaded = tf.saved_model.load(str(saved_model_dir))
run_model = reloaded.signatures['serving_default']
run_model(serialized_example_batch)
{'classes': <tf.Tensor: shape=(3, 2), dtype=string, numpy= array([[b'0', b'1'], [b'0', b'1'], [b'0', b'1']], dtype=object)>, 'scores': <tf.Tensor: shape=(3, 2), dtype=float32, numpy= array([[-1.6049761e+00, 9.7535902e-01], [-5.3329688e-01, -1.6330201e-03], [-1.8765860e+00, 1.5198938e+00]], dtype=float32)>}
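Note that `classes` in this output lists every class name for each row rather than a single prediction, so the caller has to pair the names with `scores`. The sketch below shows one way to do that in plain Python, using the score values printed above. It assumes the scores are unnormalized values where a higher score means a more likely class; the model definition is outside this section, so treat that as an assumption. `softmax` and `predicted` are illustrative names.

```python
import math

# Score values copied from the output above; one row per example.
scores = [
    [-1.6049761, 0.97535902],
    [-0.53329688, -0.0016330201],
    [-1.8765860, 1.5198938],
]
class_names = ["0", "1"]


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    shift = max(row)
    exps = [math.exp(x - shift) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


# Pick the class name whose score is largest in each row.
predicted = [class_names[max(range(len(row)), key=row.__getitem__)]
             for row in scores]
probabilities = [softmax(row) for row in scores]

for name, probs in zip(predicted, probabilities):
    print(name, [round(p, 3) for p in probs])
```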
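What we did
In this example we used tf.Transform to preprocess a dataset of census data, and trained a model on the cleaned and transformed data. We also created an input function that can be used when the trained model is deployed to production for inference. By using the same code for both training and inference, we avoid data skew. Along the way we learned how to create an Apache Beam transform that performs the transformations needed to clean the data. We also saw how to use the transformed data to train a model with tf_keras. This is just a small part of what TensorFlow Transform can do! We encourage you to dive into tf.Transform and discover what it can do for you.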
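To make the skew argument concrete, here is a small standard-library sketch of mean and standard-deviation scaling. It does not use tf.Transform; the values and the names `train_ages`, `scale`, and `skewed` are invented for illustration. The statistics are computed once over the training data and reused at serving time, and the last lines show what goes wrong when the serving path recomputes them from its own batch.

```python
import statistics

train_ages = [25.0, 35.0, 45.0, 55.0]

# "Analyze" phase: one full pass over the training data.
mean = statistics.fmean(train_ages)
std = statistics.pstdev(train_ages)


def scale(x):
    """The single transform used for both training and serving."""
    return (x - mean) / std


train_scaled = [scale(x) for x in train_ages]
serving_scaled = scale(45.0)   # same constants, so same value as in training

# Skewed variant: statistics recomputed from the serving batch alone.
serving_batch = [45.0, 65.0]
skewed = (45.0 - statistics.fmean(serving_batch)) / statistics.pstdev(serving_batch)

print(serving_scaled, skewed)
```

The same raw input, 45.0, maps to a positive value with the shared constants and to a negative value in the skewed variant, so the model would see a different feature at serving time than it saw in training.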
[Optional] Using our preprocessed data to train a model using tf.estimator
Create an input function for training
def _make_training_input_fn(tf_transform_output, transformed_examples,
                            batch_size):
  """Creates an input function reading from transformed data.

  Args:
    tf_transform_output: Wrapper around output of tf.Transform.
    transformed_examples: Base filename of examples.
    batch_size: Batch size.

  Returns:
    The input function for training or eval.
  """
  def input_fn():
    """Input function for training and eval."""
    dataset = tf.data.experimental.make_batched_features_dataset(
        file_pattern=transformed_examples,
        batch_size=batch_size,
        features=tf_transform_output.transformed_feature_spec(),
        reader=tf.data.TFRecordDataset,
        shuffle=True)
    transformed_features = tf.compat.v1.data.make_one_shot_iterator(
        dataset).get_next()
    # Extract features and label from the transformed tensors.
    transformed_labels = tf.where(
        tf.equal(transformed_features.pop(LABEL_KEY), 1))
    return transformed_features, transformed_labels[:,1]

  return input_fn
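The `tf.where(tf.equal(..., 1))` followed by `[:,1]` in `input_fn` turns a two-dimensional label into class indices. The plain-Python sketch below mimics that step. It assumes the transformed label is one-hot with exactly one 1 per row; the `preprocessing_fn` that produces it is outside this section, so that is an assumption. `where_equal_one` is a hypothetical helper, not a TensorFlow function.

```python
def where_equal_one(rows):
    """Mimics tf.where(tf.equal(rows, 1)): (row, column) coordinates of every 1."""
    return [(r, c) for r, row in enumerate(rows)
            for c, value in enumerate(row) if value == 1]


one_hot_labels = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]

coords = where_equal_one(one_hot_labels)
# The `[:, 1]` slice keeps only the column, which is the class index.
class_indices = [c for _, c in coords]

print(coords)          # [(0, 0), (1, 1), (2, 1)]
print(class_indices)   # [0, 1, 1]
```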
Create an input function for serving
Let's create an input function that we could use in production, and prepare our trained model for serving.
def _make_serving_input_fn(tf_transform_output):
  """Creates an input function reading from raw data.

  Args:
    tf_transform_output: Wrapper around output of tf.Transform.

  Returns:
    The serving input function.
  """
  raw_feature_spec = RAW_DATA_FEATURE_SPEC.copy()
  # Remove label since it is not available during serving.
  raw_feature_spec.pop(LABEL_KEY)

  def serving_input_fn():
    """Input function for serving."""
    # Get raw features by generating the basic serving input_fn and calling it.
    # Here we generate an input_fn that expects a parsed Example proto to be fed
    # to the model at serving time. See also
    # tf.estimator.export.build_raw_serving_input_receiver_fn.
    raw_input_fn = tf.estimator.export.build_parsing_serving_input_receiver_fn(
        raw_feature_spec, default_batch_size=None)
    serving_input_receiver = raw_input_fn()
    # Apply the transform function that was used to generate the materialized
    # data.
    raw_features = serving_input_receiver.features
    transformed_features = tf_transform_output.transform_raw_features(
        raw_features)
    return tf.estimator.export.ServingInputReceiver(
        transformed_features, serving_input_receiver.receiver_tensors)

  return serving_input_fn
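One detail worth noticing in `_make_serving_input_fn` is the `.copy()` before `.pop(LABEL_KEY)`. The sketch below shows why with a miniature dictionary; `RAW_SPEC` and `LABEL` are hypothetical stand-ins for `RAW_DATA_FEATURE_SPEC` and `LABEL_KEY`, and the string values stand in for real feature-spec objects.

```python
# Hypothetical miniature of RAW_DATA_FEATURE_SPEC; the real values are
# feature-spec objects, which do not matter for this point.
RAW_SPEC = {"age": "float32", "education": "string", "label": "string"}
LABEL = "label"

serving_spec = RAW_SPEC.copy()   # shallow copy: a new dict with the same values
serving_spec.pop(LABEL)          # removes the label from the copy only

print(sorted(serving_spec))      # ['age', 'education']
print(LABEL in RAW_SPEC)         # True
```

Popping from the shared dictionary directly would also remove the label for any later code that still needs it, such as reading labeled training or evaluation data.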
Wrap our input data in FeatureColumns
Our model will expect our data in TensorFlow FeatureColumns.
def get_feature_columns(tf_transform_output):
  """Returns the FeatureColumns for the model.

  Args:
    tf_transform_output: A `TFTransformOutput` object.

  Returns:
    A list of FeatureColumns.
  """
  # Wrap scalars as real valued columns.
  real_valued_columns = [tf.feature_column.numeric_column(key, shape=())
                         for key in NUMERIC_FEATURE_KEYS]

  # Wrap categorical columns.
  one_hot_columns = [
      tf.feature_column.indicator_column(
          tf.feature_column.categorical_column_with_identity(
              key=key,
              num_buckets=(NUM_OOV_BUCKETS +
                           tf_transform_output.vocabulary_size_by_name(
                               vocab_filename=key))))
      for key in CATEGORICAL_FEATURE_KEYS]

  return real_valued_columns + one_hot_columns
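The `num_buckets` argument above is the vocabulary size plus `NUM_OOV_BUCKETS` because out-of-vocabulary strings are mapped to extra integer ids placed after the vocabulary ids. The following sketch illustrates that id layout in plain Python. The vocabulary contents, `NUM_OOV`, and `to_id` are hypothetical, and the hash used here is only illustrative; it is not the hashing that tf.Transform applies.

```python
import zlib

vocabulary = ["Private", "Self-emp", "Federal-gov"]   # hypothetical vocabulary
NUM_OOV = 1                                            # hypothetical NUM_OOV_BUCKETS


def to_id(token):
    """Maps a token to an id in [0, len(vocabulary) + NUM_OOV)."""
    if token in vocabulary:
        return vocabulary.index(token)
    # Illustrative hash only; not tf.Transform's actual OOV hashing.
    return len(vocabulary) + zlib.crc32(token.encode("utf-8")) % NUM_OOV


num_buckets = len(vocabulary) + NUM_OOV
ids = [to_id(t) for t in ["Private", "Federal-gov", "Never-worked"]]

print(num_buckets, ids)
```

Every id, including the one for the unseen string, falls below `num_buckets`, which is the range an identity categorical column needs to know in order to build a one-hot vector of the right width.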
Train, evaluate and export our model
def train_and_evaluate(working_dir, num_train_instances=NUM_TRAIN_INSTANCES,
                       num_test_instances=NUM_TEST_INSTANCES):
  """Train the model on training data and evaluate on test data.

  Args:
    working_dir: Directory to read transformed data and metadata from and to
        write exported model to.
    num_train_instances: Number of instances in train set
    num_test_instances: Number of instances in test set

  Returns:
    The results from the estimator's 'evaluate' method
  """
  tf_transform_output = tft.TFTransformOutput(working_dir)

  run_config = tf.estimator.RunConfig()

  estimator = tf.estimator.LinearClassifier(
      feature_columns=get_feature_columns(tf_transform_output),
      config=run_config,
      loss_reduction=tf.losses.Reduction.SUM)

  # Fit the model using the default optimizer.
  train_input_fn = _make_training_input_fn(
      tf_transform_output,
      os.path.join(working_dir, TRANSFORMED_TRAIN_DATA_FILEBASE + '*'),
      batch_size=BATCH_SIZE)
  estimator.train(
      input_fn=train_input_fn,
      max_steps=TRAIN_NUM_EPOCHS * num_train_instances / BATCH_SIZE)

  # Evaluate model on test dataset.
  eval_input_fn = _make_training_input_fn(
      tf_transform_output,
      os.path.join(working_dir, TRANSFORMED_TEST_DATA_FILEBASE + '*'),
      batch_size=1)

  # Export the model.
  serving_input_fn = _make_serving_input_fn(tf_transform_output)
  exported_model_dir = os.path.join(working_dir, EXPORTED_MODEL_DIR)
  estimator.export_saved_model(exported_model_dir, serving_input_fn)

  return estimator.evaluate(input_fn=eval_input_fn, steps=num_test_instances)
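The `max_steps` expression in `train_and_evaluate` uses true division, so it evaluates to a float rather than a whole number of steps. The arithmetic can be checked by hand as below. The three constant values are assumptions, since the constants are defined outside this section; with these assumed values the rounded-up result is consistent with the checkpoint saved at step 5088 in the training log.

```python
import math

# Assumed values; the real constants are defined outside this section.
TRAIN_NUM_EPOCHS = 20
NUM_TRAIN_INSTANCES = 32561
BATCH_SIZE = 128

# True division yields a float number of steps.
raw_steps = TRAIN_NUM_EPOCHS * NUM_TRAIN_INSTANCES / BATCH_SIZE
# Rounding up counts the final, partially filled batch as a step.
whole_steps = math.ceil(raw_steps)

print(raw_steps)     # 5087.65625
print(whole_steps)   # 5088
```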
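Put it all together
We've created everything we need to preprocess our census data, train a model, and prepare it for serving. So far we've just been getting things ready. It's time to start running!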
import tempfile
temp = os.path.join(tempfile.mkdtemp(), 'estimator')
transform_data(train_path, test_path, temp)
results = train_and_evaluate(temp)
INFO:tensorflow:Assets written to: /tmpfs/tmp/tmpvfol9yyw/tftransform_tmp/7f57f74495a24870877a207197967bb1/assets INFO:tensorflow:Assets written to: /tmpfs/tmp/tmpvfol9yyw/tftransform_tmp/7f57f74495a24870877a207197967bb1/assets INFO:tensorflow:struct2tensor is not available. INFO:tensorflow:struct2tensor is not available. INFO:tensorflow:tensorflow_decision_forests is not available. INFO:tensorflow:tensorflow_decision_forests is not available. INFO:tensorflow:tensorflow_text is not available. INFO:tensorflow:tensorflow_text is not available. INFO:tensorflow:Assets written to: /tmpfs/tmp/tmpvfol9yyw/tftransform_tmp/50532d4a7a7844099ecd59a9a8bb3b64/assets INFO:tensorflow:Assets written to: /tmpfs/tmp/tmpvfol9yyw/tftransform_tmp/50532d4a7a7844099ecd59a9a8bb3b64/assets INFO:tensorflow:struct2tensor is not available. INFO:tensorflow:struct2tensor is not available. INFO:tensorflow:tensorflow_decision_forests is not available. INFO:tensorflow:tensorflow_decision_forests is not available. INFO:tensorflow:tensorflow_text is not available. INFO:tensorflow:tensorflow_text is not available. INFO:tensorflow:struct2tensor is not available. INFO:tensorflow:struct2tensor is not available. INFO:tensorflow:tensorflow_decision_forests is not available. INFO:tensorflow:tensorflow_decision_forests is not available. INFO:tensorflow:tensorflow_text is not available. INFO:tensorflow:tensorflow_text is not available. WARNING:tensorflow:From /tmpfs/tmp/ipykernel_186972/871689286.py:16: RunConfig.__init__ (from tensorflow_estimator.python.estimator.run_config) is deprecated and will be removed in a future version. Instructions for updating: Use tf.keras instead. WARNING:tensorflow:From /tmpfs/tmp/ipykernel_186972/871689286.py:16: RunConfig.__init__ (from tensorflow_estimator.python.estimator.run_config) is deprecated and will be removed in a future version. Instructions for updating: Use tf.keras instead. 
WARNING:tensorflow:From /tmpfs/tmp/ipykernel_186972/2648502843.py:11: numeric_column (from tensorflow.python.feature_column.feature_column_v2) is deprecated and will be removed in a future version. Instructions for updating: Use Keras preprocessing layers instead, either directly or via the `tf.keras.utils.FeatureSpace` utility. Each of `tf.feature_column.*` has a functional equivalent in `tf.keras.layers` for feature preprocessing when training a Keras model. WARNING:tensorflow:From /tmpfs/tmp/ipykernel_186972/2648502843.py:11: numeric_column (from tensorflow.python.feature_column.feature_column_v2) is deprecated and will be removed in a future version. Instructions for updating: Use Keras preprocessing layers instead, either directly or via the `tf.keras.utils.FeatureSpace` utility. Each of `tf.feature_column.*` has a functional equivalent in `tf.keras.layers` for feature preprocessing when training a Keras model. WARNING:tensorflow:From /tmpfs/tmp/ipykernel_186972/2648502843.py:17: categorical_column_with_identity (from tensorflow.python.feature_column.feature_column_v2) is deprecated and will be removed in a future version. Instructions for updating: Use Keras preprocessing layers instead, either directly or via the `tf.keras.utils.FeatureSpace` utility. Each of `tf.feature_column.*` has a functional equivalent in `tf.keras.layers` for feature preprocessing when training a Keras model. WARNING:tensorflow:From /tmpfs/tmp/ipykernel_186972/2648502843.py:17: categorical_column_with_identity (from tensorflow.python.feature_column.feature_column_v2) is deprecated and will be removed in a future version. Instructions for updating: Use Keras preprocessing layers instead, either directly or via the `tf.keras.utils.FeatureSpace` utility. Each of `tf.feature_column.*` has a functional equivalent in `tf.keras.layers` for feature preprocessing when training a Keras model. 
WARNING:tensorflow:From /tmpfs/tmp/ipykernel_186972/2648502843.py:16: indicator_column (from tensorflow.python.feature_column.feature_column_v2) is deprecated and will be removed in a future version. Instructions for updating: Use Keras preprocessing layers instead, either directly or via the `tf.keras.utils.FeatureSpace` utility. Each of `tf.feature_column.*` has a functional equivalent in `tf.keras.layers` for feature preprocessing when training a Keras model. WARNING:tensorflow:From /tmpfs/tmp/ipykernel_186972/2648502843.py:16: indicator_column (from tensorflow.python.feature_column.feature_column_v2) is deprecated and will be removed in a future version. Instructions for updating: Use Keras preprocessing layers instead, either directly or via the `tf.keras.utils.FeatureSpace` utility. Each of `tf.feature_column.*` has a functional equivalent in `tf.keras.layers` for feature preprocessing when training a Keras model. WARNING:tensorflow:From /tmpfs/tmp/ipykernel_186972/871689286.py:18: LinearClassifierV2.__init__ (from tensorflow_estimator.python.estimator.canned.linear) is deprecated and will be removed in a future version. Instructions for updating: Use tf.keras instead. WARNING:tensorflow:From /tmpfs/tmp/ipykernel_186972/871689286.py:18: LinearClassifierV2.__init__ (from tensorflow_estimator.python.estimator.canned.linear) is deprecated and will be removed in a future version. Instructions for updating: Use tf.keras instead. WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.9/site-packages/tensorflow_estimator/python/estimator/head/head_utils.py:54: BinaryClassHead.__init__ (from tensorflow_estimator.python.estimator.head.binary_class_head) is deprecated and will be removed in a future version. Instructions for updating: Use tf.keras instead. 
WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.9/site-packages/tensorflow_estimator/python/estimator/head/head_utils.py:54: BinaryClassHead.__init__ (from tensorflow_estimator.python.estimator.head.binary_class_head) is deprecated and will be removed in a future version. Instructions for updating: Use tf.keras instead. WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.9/site-packages/tensorflow_estimator/python/estimator/canned/linear.py:944: Estimator.__init__ (from tensorflow_estimator.python.estimator.estimator) is deprecated and will be removed in a future version. Instructions for updating: Use tf.keras instead. WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.9/site-packages/tensorflow_estimator/python/estimator/canned/linear.py:944: Estimator.__init__ (from tensorflow_estimator.python.estimator.estimator) is deprecated and will be removed in a future version. Instructions for updating: Use tf.keras instead. WARNING:tensorflow:Using temporary folder as model directory: /tmpfs/tmp/tmp5z0b2qd4 WARNING:tensorflow:Using temporary folder as model directory: /tmpfs/tmp/tmp5z0b2qd4 INFO:tensorflow:Using config: {'_model_dir': '/tmpfs/tmp/tmp5z0b2qd4', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': allow_soft_placement: true graph_options { rewrite_options { meta_optimizer_iterations: ONE } } , '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_checkpoint_save_graph_def': True, '_service': None, '_cluster_spec': ClusterSpec({}), '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, 
'_num_worker_replicas': 1} INFO:tensorflow:Using config: {'_model_dir': '/tmpfs/tmp/tmp5z0b2qd4', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': allow_soft_placement: true graph_options { rewrite_options { meta_optimizer_iterations: ONE } } , '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_checkpoint_save_graph_def': True, '_service': None, '_cluster_spec': ClusterSpec({}), '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1} WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.9/site-packages/tensorflow_estimator/python/estimator/estimator.py:385: StopAtStepHook.__init__ (from tensorflow.python.training.basic_session_run_hooks) is deprecated and will be removed in a future version. Instructions for updating: Use tf.keras instead. WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.9/site-packages/tensorflow_estimator/python/estimator/estimator.py:385: StopAtStepHook.__init__ (from tensorflow.python.training.basic_session_run_hooks) is deprecated and will be removed in a future version. Instructions for updating: Use tf.keras instead. INFO:tensorflow:Calling model_fn. INFO:tensorflow:Calling model_fn. WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.9/site-packages/keras/src/optimizers/legacy/ftrl.py:173: calling Constant.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version. 
Instructions for updating: Call initializer instance with the dtype argument instead of passing it to the constructor WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.9/site-packages/keras/src/optimizers/legacy/ftrl.py:173: calling Constant.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version. Instructions for updating: Call initializer instance with the dtype argument instead of passing it to the constructor WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.9/site-packages/tensorflow_estimator/python/estimator/model_fn.py:250: EstimatorSpec.__new__ (from tensorflow_estimator.python.estimator.model_fn) is deprecated and will be removed in a future version. Instructions for updating: Use tf.keras instead. WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.9/site-packages/tensorflow_estimator/python/estimator/model_fn.py:250: EstimatorSpec.__new__ (from tensorflow_estimator.python.estimator.model_fn) is deprecated and will be removed in a future version. Instructions for updating: Use tf.keras instead. INFO:tensorflow:Done calling model_fn. INFO:tensorflow:Done calling model_fn. WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.9/site-packages/tensorflow_estimator/python/estimator/estimator.py:1416: NanTensorHook.__init__ (from tensorflow.python.training.basic_session_run_hooks) is deprecated and will be removed in a future version. Instructions for updating: Use tf.keras instead. WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.9/site-packages/tensorflow_estimator/python/estimator/estimator.py:1416: NanTensorHook.__init__ (from tensorflow.python.training.basic_session_run_hooks) is deprecated and will be removed in a future version. Instructions for updating: Use tf.keras instead. 
WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.9/site-packages/tensorflow_estimator/python/estimator/estimator.py:1419: LoggingTensorHook.__init__ (from tensorflow.python.training.basic_session_run_hooks) is deprecated and will be removed in a future version. Instructions for updating: Use tf.keras instead. WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.9/site-packages/tensorflow_estimator/python/estimator/estimator.py:1419: LoggingTensorHook.__init__ (from tensorflow.python.training.basic_session_run_hooks) is deprecated and will be removed in a future version. Instructions for updating: Use tf.keras instead. WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.9/site-packages/tensorflow/python/training/basic_session_run_hooks.py:232: SecondOrStepTimer.__init__ (from tensorflow.python.training.basic_session_run_hooks) is deprecated and will be removed in a future version. Instructions for updating: Use tf.keras instead. WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.9/site-packages/tensorflow/python/training/basic_session_run_hooks.py:232: SecondOrStepTimer.__init__ (from tensorflow.python.training.basic_session_run_hooks) is deprecated and will be removed in a future version. Instructions for updating: Use tf.keras instead. WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.9/site-packages/tensorflow_estimator/python/estimator/estimator.py:1456: CheckpointSaverHook.__init__ (from tensorflow.python.training.basic_session_run_hooks) is deprecated and will be removed in a future version. Instructions for updating: Use tf.keras instead. WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.9/site-packages/tensorflow_estimator/python/estimator/estimator.py:1456: CheckpointSaverHook.__init__ (from tensorflow.python.training.basic_session_run_hooks) is deprecated and will be removed in a future version. Instructions for updating: Use tf.keras instead. INFO:tensorflow:Create CheckpointSaverHook. 
INFO:tensorflow:Create CheckpointSaverHook. WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.9/site-packages/tensorflow/python/training/monitored_session.py:579: StepCounterHook.__init__ (from tensorflow.python.training.basic_session_run_hooks) is deprecated and will be removed in a future version. Instructions for updating: Use tf.keras instead. WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.9/site-packages/tensorflow/python/training/monitored_session.py:579: StepCounterHook.__init__ (from tensorflow.python.training.basic_session_run_hooks) is deprecated and will be removed in a future version. Instructions for updating: Use tf.keras instead. WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.9/site-packages/tensorflow/python/training/monitored_session.py:586: SummarySaverHook.__init__ (from tensorflow.python.training.basic_session_run_hooks) is deprecated and will be removed in a future version. Instructions for updating: Use tf.keras instead. WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.9/site-packages/tensorflow/python/training/monitored_session.py:586: SummarySaverHook.__init__ (from tensorflow.python.training.basic_session_run_hooks) is deprecated and will be removed in a future version. Instructions for updating: Use tf.keras instead. INFO:tensorflow:Graph was finalized. INFO:tensorflow:Graph was finalized. INFO:tensorflow:Running local_init_op. INFO:tensorflow:Running local_init_op. INFO:tensorflow:Done running local_init_op. INFO:tensorflow:Done running local_init_op. INFO:tensorflow:Calling checkpoint listeners before saving checkpoint 0... INFO:tensorflow:Calling checkpoint listeners before saving checkpoint 0... INFO:tensorflow:Saving checkpoints for 0 into /tmpfs/tmp/tmp5z0b2qd4/model.ckpt. INFO:tensorflow:Saving checkpoints for 0 into /tmpfs/tmp/tmp5z0b2qd4/model.ckpt. INFO:tensorflow:Calling checkpoint listeners after saving checkpoint 0... 
INFO:tensorflow:Calling checkpoint listeners after saving checkpoint 0... WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.9/site-packages/tensorflow/python/training/monitored_session.py:1455: SessionRunArgs.__new__ (from tensorflow.python.training.session_run_hook) is deprecated and will be removed in a future version. Instructions for updating: Use tf.keras instead. WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.9/site-packages/tensorflow/python/training/monitored_session.py:1455: SessionRunArgs.__new__ (from tensorflow.python.training.session_run_hook) is deprecated and will be removed in a future version. Instructions for updating: Use tf.keras instead. WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.9/site-packages/tensorflow/python/training/monitored_session.py:1454: SessionRunContext.__init__ (from tensorflow.python.training.session_run_hook) is deprecated and will be removed in a future version. Instructions for updating: Use tf.keras instead. WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.9/site-packages/tensorflow/python/training/monitored_session.py:1454: SessionRunContext.__init__ (from tensorflow.python.training.session_run_hook) is deprecated and will be removed in a future version. Instructions for updating: Use tf.keras instead. WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.9/site-packages/tensorflow/python/training/monitored_session.py:1474: SessionRunValues.__new__ (from tensorflow.python.training.session_run_hook) is deprecated and will be removed in a future version. Instructions for updating: Use tf.keras instead. WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.9/site-packages/tensorflow/python/training/monitored_session.py:1474: SessionRunValues.__new__ (from tensorflow.python.training.session_run_hook) is deprecated and will be removed in a future version. Instructions for updating: Use tf.keras instead. 
INFO:tensorflow:loss = 88.72284, step = 0 INFO:tensorflow:loss = 88.72284, step = 0 INFO:tensorflow:global_step/sec: 217.048 INFO:tensorflow:global_step/sec: 217.048 INFO:tensorflow:loss = 38.05179, step = 100 (0.463 sec) INFO:tensorflow:loss = 38.05179, step = 100 (0.463 sec) INFO:tensorflow:global_step/sec: 309.278 INFO:tensorflow:global_step/sec: 309.278 INFO:tensorflow:loss = 62.872578, step = 200 (0.323 sec) INFO:tensorflow:loss = 62.872578, step = 200 (0.323 sec) INFO:tensorflow:global_step/sec: 306.322 INFO:tensorflow:global_step/sec: 306.322 INFO:tensorflow:loss = 43.058277, step = 300 (0.327 sec) INFO:tensorflow:loss = 43.058277, step = 300 (0.327 sec) INFO:tensorflow:global_step/sec: 307.682 INFO:tensorflow:global_step/sec: 307.682 INFO:tensorflow:loss = 33.610596, step = 400 (0.325 sec) INFO:tensorflow:loss = 33.610596, step = 400 (0.325 sec) INFO:tensorflow:global_step/sec: 306.892 INFO:tensorflow:global_step/sec: 306.892 INFO:tensorflow:loss = 49.49376, step = 500 (0.326 sec) INFO:tensorflow:loss = 49.49376, step = 500 (0.326 sec) INFO:tensorflow:global_step/sec: 309.289 INFO:tensorflow:global_step/sec: 309.289 INFO:tensorflow:loss = 39.562958, step = 600 (0.323 sec) INFO:tensorflow:loss = 39.562958, step = 600 (0.323 sec) INFO:tensorflow:global_step/sec: 311.884 INFO:tensorflow:global_step/sec: 311.884 INFO:tensorflow:loss = 39.649498, step = 700 (0.320 sec) INFO:tensorflow:loss = 39.649498, step = 700 (0.320 sec) INFO:tensorflow:global_step/sec: 311.451 INFO:tensorflow:global_step/sec: 311.451 INFO:tensorflow:loss = 40.63858, step = 800 (0.321 sec) INFO:tensorflow:loss = 40.63858, step = 800 (0.321 sec) INFO:tensorflow:global_step/sec: 310.801 INFO:tensorflow:global_step/sec: 310.801 INFO:tensorflow:loss = 56.933117, step = 900 (0.322 sec) INFO:tensorflow:loss = 56.933117, step = 900 (0.322 sec) INFO:tensorflow:global_step/sec: 310.947 INFO:tensorflow:global_step/sec: 310.947 INFO:tensorflow:loss = 43.414566, step = 1000 (0.321 sec) 
INFO:tensorflow:loss = 43.414566, step = 1000 (0.321 sec) INFO:tensorflow:global_step/sec: 307.503 INFO:tensorflow:global_step/sec: 307.503 INFO:tensorflow:loss = 46.722263, step = 1100 (0.326 sec) INFO:tensorflow:loss = 46.722263, step = 1100 (0.326 sec) INFO:tensorflow:global_step/sec: 310.43 INFO:tensorflow:global_step/sec: 310.43 INFO:tensorflow:loss = 42.71798, step = 1200 (0.322 sec) INFO:tensorflow:loss = 42.71798, step = 1200 (0.322 sec) INFO:tensorflow:global_step/sec: 306.606 INFO:tensorflow:global_step/sec: 306.606 INFO:tensorflow:loss = 32.245277, step = 1300 (0.326 sec) INFO:tensorflow:loss = 32.245277, step = 1300 (0.326 sec) INFO:tensorflow:global_step/sec: 304.767 INFO:tensorflow:global_step/sec: 304.767 INFO:tensorflow:loss = 39.286648, step = 1400 (0.328 sec) INFO:tensorflow:loss = 39.286648, step = 1400 (0.328 sec) INFO:tensorflow:global_step/sec: 311.309 INFO:tensorflow:global_step/sec: 311.309 INFO:tensorflow:loss = 47.270004, step = 1500 (0.321 sec) INFO:tensorflow:loss = 47.270004, step = 1500 (0.321 sec) INFO:tensorflow:global_step/sec: 312.664 INFO:tensorflow:global_step/sec: 312.664 INFO:tensorflow:loss = 41.641903, step = 1600 (0.320 sec) INFO:tensorflow:loss = 41.641903, step = 1600 (0.320 sec) INFO:tensorflow:global_step/sec: 314.642 INFO:tensorflow:global_step/sec: 314.642 INFO:tensorflow:loss = 39.352055, step = 1700 (0.318 sec) INFO:tensorflow:loss = 39.352055, step = 1700 (0.318 sec) INFO:tensorflow:global_step/sec: 308.436 INFO:tensorflow:global_step/sec: 308.436 INFO:tensorflow:loss = 42.981514, step = 1800 (0.324 sec) INFO:tensorflow:loss = 42.981514, step = 1800 (0.324 sec) INFO:tensorflow:global_step/sec: 304.007 INFO:tensorflow:global_step/sec: 304.007 INFO:tensorflow:loss = 39.558506, step = 1900 (0.329 sec) INFO:tensorflow:loss = 39.558506, step = 1900 (0.329 sec) INFO:tensorflow:global_step/sec: 308.174 INFO:tensorflow:global_step/sec: 308.174 INFO:tensorflow:loss = 36.912056, step = 2000 (0.325 sec) INFO:tensorflow:loss = 
36.912056, step = 2000 (0.325 sec) INFO:tensorflow:global_step/sec: 305.635 INFO:tensorflow:global_step/sec: 305.635 INFO:tensorflow:loss = 50.084297, step = 2100 (0.327 sec) INFO:tensorflow:loss = 50.084297, step = 2100 (0.327 sec) INFO:tensorflow:global_step/sec: 304.925 INFO:tensorflow:global_step/sec: 304.925 INFO:tensorflow:loss = 34.076836, step = 2200 (0.328 sec) INFO:tensorflow:loss = 34.076836, step = 2200 (0.328 sec) INFO:tensorflow:global_step/sec: 304.67 INFO:tensorflow:global_step/sec: 304.67 INFO:tensorflow:loss = 42.80255, step = 2300 (0.328 sec) INFO:tensorflow:loss = 42.80255, step = 2300 (0.328 sec) INFO:tensorflow:global_step/sec: 304.428 INFO:tensorflow:global_step/sec: 304.428 INFO:tensorflow:loss = 43.28376, step = 2400 (0.328 sec) INFO:tensorflow:loss = 43.28376, step = 2400 (0.328 sec) INFO:tensorflow:global_step/sec: 306.855 INFO:tensorflow:global_step/sec: 306.855 INFO:tensorflow:loss = 52.975185, step = 2500 (0.326 sec) INFO:tensorflow:loss = 52.975185, step = 2500 (0.326 sec) INFO:tensorflow:global_step/sec: 301.499 INFO:tensorflow:global_step/sec: 301.499 INFO:tensorflow:loss = 38.57332, step = 2600 (0.332 sec) INFO:tensorflow:loss = 38.57332, step = 2600 (0.332 sec) INFO:tensorflow:global_step/sec: 304.658 INFO:tensorflow:global_step/sec: 304.658 INFO:tensorflow:loss = 42.026337, step = 2700 (0.328 sec) INFO:tensorflow:loss = 42.026337, step = 2700 (0.328 sec) INFO:tensorflow:global_step/sec: 304.471 INFO:tensorflow:global_step/sec: 304.471 INFO:tensorflow:loss = 49.812424, step = 2800 (0.329 sec) INFO:tensorflow:loss = 49.812424, step = 2800 (0.329 sec) INFO:tensorflow:global_step/sec: 301.243 INFO:tensorflow:global_step/sec: 301.243 INFO:tensorflow:loss = 38.365997, step = 2900 (0.332 sec) INFO:tensorflow:loss = 38.365997, step = 2900 (0.332 sec) INFO:tensorflow:global_step/sec: 303.047 INFO:tensorflow:global_step/sec: 303.047 INFO:tensorflow:loss = 46.136482, step = 3000 (0.330 sec) INFO:tensorflow:loss = 46.136482, step = 3000 
(0.330 sec) INFO:tensorflow:global_step/sec: 309.327 INFO:tensorflow:global_step/sec: 309.327 INFO:tensorflow:loss = 39.838882, step = 3100 (0.323 sec) INFO:tensorflow:loss = 39.838882, step = 3100 (0.323 sec) INFO:tensorflow:global_step/sec: 314.267 INFO:tensorflow:global_step/sec: 314.267 INFO:tensorflow:loss = 41.79177, step = 3200 (0.318 sec) INFO:tensorflow:loss = 41.79177, step = 3200 (0.318 sec) INFO:tensorflow:global_step/sec: 301.294 INFO:tensorflow:global_step/sec: 301.294 INFO:tensorflow:loss = 41.994194, step = 3300 (0.332 sec) INFO:tensorflow:loss = 41.994194, step = 3300 (0.332 sec) INFO:tensorflow:global_step/sec: 308.412 INFO:tensorflow:global_step/sec: 308.412 INFO:tensorflow:loss = 41.158104, step = 3400 (0.324 sec) INFO:tensorflow:loss = 41.158104, step = 3400 (0.324 sec) INFO:tensorflow:global_step/sec: 305.302 INFO:tensorflow:global_step/sec: 305.302 INFO:tensorflow:loss = 35.35069, step = 3500 (0.328 sec) INFO:tensorflow:loss = 35.35069, step = 3500 (0.328 sec) INFO:tensorflow:global_step/sec: 303.808 INFO:tensorflow:global_step/sec: 303.808 INFO:tensorflow:loss = 49.999313, step = 3600 (0.329 sec) INFO:tensorflow:loss = 49.999313, step = 3600 (0.329 sec) INFO:tensorflow:global_step/sec: 312.812 INFO:tensorflow:global_step/sec: 312.812 INFO:tensorflow:loss = 44.52297, step = 3700 (0.320 sec) INFO:tensorflow:loss = 44.52297, step = 3700 (0.320 sec) INFO:tensorflow:global_step/sec: 311.422 INFO:tensorflow:global_step/sec: 311.422 INFO:tensorflow:loss = 31.237823, step = 3800 (0.321 sec) INFO:tensorflow:loss = 31.237823, step = 3800 (0.321 sec) INFO:tensorflow:global_step/sec: 311.942 INFO:tensorflow:global_step/sec: 311.942 INFO:tensorflow:loss = 40.837013, step = 3900 (0.321 sec) INFO:tensorflow:loss = 40.837013, step = 3900 (0.321 sec) INFO:tensorflow:global_step/sec: 310.278 INFO:tensorflow:global_step/sec: 310.278 INFO:tensorflow:loss = 48.289017, step = 4000 (0.322 sec) INFO:tensorflow:loss = 48.289017, step = 4000 (0.322 sec) 
INFO:tensorflow:global_step/sec: 305.809
INFO:tensorflow:loss = 42.82827, step = 4100 (0.327 sec)
INFO:tensorflow:global_step/sec: 309.371
INFO:tensorflow:loss = 49.08073, step = 4200 (0.323 sec)
INFO:tensorflow:global_step/sec: 313.159
INFO:tensorflow:loss = 43.150997, step = 4300 (0.319 sec)
INFO:tensorflow:global_step/sec: 317.596
INFO:tensorflow:loss = 46.704082, step = 4400 (0.315 sec)
INFO:tensorflow:global_step/sec: 316.261
INFO:tensorflow:loss = 42.477634, step = 4500 (0.316 sec)
INFO:tensorflow:global_step/sec: 319.902
INFO:tensorflow:loss = 47.049324, step = 4600 (0.313 sec)
INFO:tensorflow:global_step/sec: 323.097
INFO:tensorflow:loss = 28.26455, step = 4700 (0.310 sec)
INFO:tensorflow:global_step/sec: 318.749
INFO:tensorflow:loss = 30.772062, step = 4800 (0.314 sec)
INFO:tensorflow:global_step/sec: 323.13
INFO:tensorflow:loss = 42.176075, step = 4900 (0.310 sec)
INFO:tensorflow:global_step/sec: 321.773
INFO:tensorflow:loss = 52.00352, step = 5000 (0.311 sec)
INFO:tensorflow:Calling checkpoint listeners before saving checkpoint 5088...
INFO:tensorflow:Saving checkpoints for 5088 into /tmpfs/tmp/tmp5z0b2qd4/model.ckpt.
INFO:tensorflow:Calling checkpoint listeners after saving checkpoint 5088...
INFO:tensorflow:Loss for final step: 33.25688.
WARNING:tensorflow:From /tmpfs/tmp/ipykernel_186972/3233312620.py:20: build_parsing_serving_input_receiver_fn (from tensorflow_estimator.python.estimator.export.export) is deprecated and will be removed in a future version. Instructions for updating: Use tf.keras instead.
WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.9/site-packages/tensorflow_estimator/python/estimator/export/export.py:312: ServingInputReceiver.__new__ (from tensorflow_estimator.python.estimator.export.export) is deprecated and will be removed in a future version. Instructions for updating: Use tf.keras instead.
INFO:tensorflow:struct2tensor is not available.
INFO:tensorflow:tensorflow_decision_forests is not available.
INFO:tensorflow:tensorflow_text is not available.
WARNING:tensorflow:Loading a TF2 SavedModel but eager mode seems disabled.
INFO:tensorflow:Calling model_fn.
WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.9/site-packages/tensorflow_estimator/python/estimator/head/base_head.py:786: ClassificationOutput.__init__ (from tensorflow.python.saved_model.model_utils.export_output) is deprecated and will be removed in a future version. Instructions for updating: Use tf.keras instead.
WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.9/site-packages/tensorflow_estimator/python/estimator/head/binary_class_head.py:561: RegressionOutput.__init__ (from tensorflow.python.saved_model.model_utils.export_output) is deprecated and will be removed in a future version. Instructions for updating: Use tf.keras instead.
WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.9/site-packages/tensorflow_estimator/python/estimator/head/binary_class_head.py:563: PredictOutput.__init__ (from tensorflow.python.saved_model.model_utils.export_output) is deprecated and will be removed in a future version. Instructions for updating: Use tf.keras instead.
INFO:tensorflow:Done calling model_fn.
WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.9/site-packages/tensorflow/python/saved_model/signature_def_utils_impl.py:168: build_tensor_info (from tensorflow.python.saved_model.utils_impl) is deprecated and will be removed in a future version. Instructions for updating: This API was designed for TensorFlow v1. See https://tensorflowcn.cn/guide/migrate for instructions on how to migrate your code to TensorFlow v2.
WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.9/site-packages/tensorflow/python/saved_model/model_utils/export_utils.py:83: get_tensor_from_tensor_info (from tensorflow.python.saved_model.utils_impl) is deprecated and will be removed in a future version. Instructions for updating: This API was designed for TensorFlow v1. See https://tensorflowcn.cn/guide/migrate for instructions on how to migrate your code to TensorFlow v2.
INFO:tensorflow:Signatures INCLUDED in export for Classify: ['serving_default', 'classification']
INFO:tensorflow:Signatures INCLUDED in export for Regress: ['regression']
INFO:tensorflow:Signatures INCLUDED in export for Predict: ['predict']
INFO:tensorflow:Signatures INCLUDED in export for Train: None
INFO:tensorflow:Signatures INCLUDED in export for Eval: None
INFO:tensorflow:Restoring parameters from /tmpfs/tmp/tmp5z0b2qd4/model.ckpt-5088
INFO:tensorflow:Assets added to graph.
INFO:tensorflow:Assets written to: /tmpfs/tmp/tmp4ti4zdkp/estimator/exported_model_dir/temp-1714474233/assets
INFO:tensorflow:SavedModel written to: /tmpfs/tmp/tmp4ti4zdkp/estimator/exported_model_dir/temp-1714474233/saved_model.pb
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Starting evaluation at 2024-04-30T10:50:35
WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.9/site-packages/tensorflow/python/training/evaluation.py:260: FinalOpsHook.__init__ (from tensorflow.python.training.basic_session_run_hooks) is deprecated and will be removed in a future version. Instructions for updating: Use tf.keras instead.
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Restoring parameters from /tmpfs/tmp/tmp5z0b2qd4/model.ckpt-5088
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Evaluation [1628/16280]
INFO:tensorflow:Evaluation [3256/16280]
INFO:tensorflow:Evaluation [4884/16280]
INFO:tensorflow:Evaluation [6512/16280]
INFO:tensorflow:Evaluation [8140/16280]
INFO:tensorflow:Evaluation [9768/16280]
INFO:tensorflow:Evaluation [11396/16280]
INFO:tensorflow:Evaluation [13024/16280]
INFO:tensorflow:Evaluation [14652/16280]
INFO:tensorflow:Evaluation [16280/16280]
INFO:tensorflow:Inference Time : 49.09539s
INFO:tensorflow:Finished evaluation at 2024-04-30-10:51:24
INFO:tensorflow:Saving dict for global step 5088: accuracy = 0.85110563, accuracy_baseline = 0.7637592, auc = 0.90211606, auc_precision_recall = 0.96728647, average_loss = 0.32371244, global_step = 5088, label/mean = 0.7637592, loss = 0.32371244, precision = 0.88235295, prediction/mean = 0.75723934, recall = 0.9289046
INFO:tensorflow:Saving 'checkpoint_path' summary for global step 5088: /tmpfs/tmp/tmp5z0b2qd4/model.ckpt-5088
pprint.pprint(results)
{'accuracy': 0.85110563,
 'accuracy_baseline': 0.7637592,
 'auc': 0.90211606,
 'auc_precision_recall': 0.96728647,
 'average_loss': 0.32371244,
 'global_step': 5088,
 'label/mean': 0.7637592,
 'loss': 0.32371244,
 'precision': 0.88235295,
 'prediction/mean': 0.75723934,
 'recall': 0.9289046}
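To put these numbers in context, it can help to compare accuracy against `accuracy_baseline` and to combine precision and recall into a single score. The sketch below is not part of the original notebook. It copies four values from the printed results into a hypothetical `metrics` dict (named so that it does not overwrite `results`), and it assumes that `accuracy_baseline` is the accuracy of always predicting the majority class, which is consistent with it matching `label/mean` here.

```python
# Values copied from the evaluation results printed above.
# `metrics`, `lift`, and `f1` are illustrative names, not notebook variables.
metrics = {
    'accuracy': 0.85110563,
    'accuracy_baseline': 0.7637592,
    'precision': 0.88235295,
    'recall': 0.9289046,
}

# Accuracy gained over the baseline (assumed: always predict the majority class).
lift = metrics['accuracy'] - metrics['accuracy_baseline']

# F1 is the harmonic mean of precision and recall.
f1 = (2 * metrics['precision'] * metrics['recall']
      / (metrics['precision'] + metrics['recall']))

print('Lift over baseline: {:.4f}'.format(lift))
print('F1: {:.4f}'.format(f1))
```

A positive lift means the model has learned something beyond the label distribution; F1 summarizes the precision/recall trade-off in one number.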