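An Example of a Key Component of TensorFlow Extended (TFX)
TensorFlow Model Analysis (TFMA) is a library for evaluating models across different slices of data. TFMA performs its computations in a distributed manner over large amounts of data using Apache Beam.
This example Colab notebook illustrates how TFMA can be used to investigate and visualize the performance of a model with respect to characteristics of the dataset. We'll use a model that we trained previously, and now you get to explore the results. The model was trained for the Chicago Taxi Example, which uses the Taxi Trips dataset released by the City of Chicago. You can explore the full dataset in the BigQuery UI.
As a modeler and developer, think about how this data is used and about the potential benefits and harm a model's predictions can cause. A model like this could reinforce societal biases and disparities. Is a feature relevant to the problem you want to solve, or will it introduce bias? For more information, read about ML fairness.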
pickup_community_area | fare | trip_start_month |
trip_start_hour | trip_start_day | trip_start_timestamp |
pickup_latitude | pickup_longitude | dropoff_latitude |
dropoff_longitude | trip_miles | pickup_census_tract |
dropoff_census_tract | payment_type | company |
trip_seconds | dropoff_community_area | tips |
Install Jupyter Extensions
jupyter nbextension enable --py widgetsnbextension --sys-prefix
jupyter nbextension install --py --symlink tensorflow_model_analysis --sys-prefix
jupyter nbextension enable --py tensorflow_model_analysis --sys-prefix
Install TensorFlow Model Analysis (TFMA)
# Upgrade pip to the latest, and install TFMA.
pip install -U pip
pip install tensorflow-model-analysis
# This setup was tested with TF 2.10 and TFMA 0.41 (using colab), but it should
# also work with the latest release.
import sys
# Confirm that we're using Python 3
assert sys.version_info.major==3, 'This notebook must be run using Python 3.'
import tensorflow as tf
print('TF version: {}'.format(tf.__version__))
import apache_beam as beam
print('Beam version: {}'.format(beam.__version__))
import tensorflow_model_analysis as tfma
print('TFMA version: {}'.format(tfma.__version__))
TF version: 2.15.1
Beam version: 2.55.1
TFMA version: 0.46.0
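We'll download a tar file that has everything we need. It includes:
- Training and evaluation datasets
- Data schema
- Training and serving saved models (keras and estimator) and eval saved models (estimator).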
# Download the tar file from GCP and extract it
import io, os, tempfile
TAR_NAME = 'saved_models-2.2'
BASE_DIR = tempfile.mkdtemp()
DATA_DIR = os.path.join(BASE_DIR, TAR_NAME, 'data')
MODELS_DIR = os.path.join(BASE_DIR, TAR_NAME, 'models')
SCHEMA = os.path.join(BASE_DIR, TAR_NAME, 'schema.pbtxt')
OUTPUT_DIR = os.path.join(BASE_DIR, 'output')
!curl -O https://storage.googleapis.com/artifacts.tfx-oss-public.appspot.com/datasets/{TAR_NAME}.tar
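!tar xf {TAR_NAME}.tar
# Move the extracted directory into BASE_DIR, where the paths above expect it.
!mv {TAR_NAME} {BASE_DIR}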
!rm {TAR_NAME}.tar
print("Here's what we downloaded:")
!ls -R {BASE_DIR}
Here's what we downloaded:
/tmpfs/tmp/tmpo4hj24ht:
saved_models-2.2

/tmpfs/tmp/tmpo4hj24ht/saved_models-2.2:
data  models  schema.pbtxt

/tmpfs/tmp/tmpo4hj24ht/saved_models-2.2/data:
eval  train

/tmpfs/tmp/tmpo4hj24ht/saved_models-2.2/data/eval:
data.csv

/tmpfs/tmp/tmpo4hj24ht/saved_models-2.2/data/train:
data.csv

/tmpfs/tmp/tmpo4hj24ht/saved_models-2.2/models:
estimator  keras

/tmpfs/tmp/tmpo4hj24ht/saved_models-2.2/models/estimator:
eval_model_dir  serving_model_dir

/tmpfs/tmp/tmpo4hj24ht/saved_models-2.2/models/estimator/eval_model_dir:
1591221811

/tmpfs/tmp/tmpo4hj24ht/saved_models-2.2/models/estimator/eval_model_dir/1591221811:
saved_model.pb  tmp.pbtxt  variables

/tmpfs/tmp/tmpo4hj24ht/saved_models-2.2/models/estimator/eval_model_dir/1591221811/variables:
variables.data-00000-of-00001  variables.index

/tmpfs/tmp/tmpo4hj24ht/saved_models-2.2/models/estimator/serving_model_dir:
checkpoint
eval_chicago-taxi-eval
events.out.tfevents.1591221780.my-pipeline-b57vp-237544850
export
graph.pbtxt
model.ckpt-100.data-00000-of-00001
model.ckpt-100.index
model.ckpt-100.meta

/tmpfs/tmp/tmpo4hj24ht/saved_models-2.2/models/estimator/serving_model_dir/eval_chicago-taxi-eval:
events.out.tfevents.1591221799.my-pipeline-b57vp-237544850

/tmpfs/tmp/tmpo4hj24ht/saved_models-2.2/models/estimator/serving_model_dir/export:
chicago-taxi

/tmpfs/tmp/tmpo4hj24ht/saved_models-2.2/models/estimator/serving_model_dir/export/chicago-taxi:
1591221801

/tmpfs/tmp/tmpo4hj24ht/saved_models-2.2/models/estimator/serving_model_dir/export/chicago-taxi/1591221801:
saved_model.pb  variables

/tmpfs/tmp/tmpo4hj24ht/saved_models-2.2/models/estimator/serving_model_dir/export/chicago-taxi/1591221801/variables:
variables.data-00000-of-00001  variables.index

/tmpfs/tmp/tmpo4hj24ht/saved_models-2.2/models/keras:
0  1  2

/tmpfs/tmp/tmpo4hj24ht/saved_models-2.2/models/keras/0:
saved_model.pb  variables

/tmpfs/tmp/tmpo4hj24ht/saved_models-2.2/models/keras/0/variables:
variables.data-00000-of-00001  variables.index

/tmpfs/tmp/tmpo4hj24ht/saved_models-2.2/models/keras/1:
saved_model.pb  variables

/tmpfs/tmp/tmpo4hj24ht/saved_models-2.2/models/keras/1/variables:
variables.data-00000-of-00001  variables.index

/tmpfs/tmp/tmpo4hj24ht/saved_models-2.2/models/keras/2:
saved_model.pb  variables

/tmpfs/tmp/tmpo4hj24ht/saved_models-2.2/models/keras/2/variables:
variables.data-00000-of-00001  variables.index
Among the things we downloaded is a schema for our data that was created by TensorFlow Data Validation. Let's parse it now so that we can use it with TFMA.
import tensorflow as tf
from google.protobuf import text_format
from tensorflow.python.lib.io import file_io
from tensorflow_metadata.proto.v0 import schema_pb2
from tensorflow.core.example import example_pb2
schema = schema_pb2.Schema()
contents = file_io.read_file_to_string(SCHEMA)
schema = text_format.Parse(contents, schema)
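Use the Schema to Create TFRecords
We need to give TFMA access to our dataset, so let's create a TFRecords file. We can use our schema to create it, since the schema gives us the correct type for each feature.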
import csv

datafile = os.path.join(DATA_DIR, 'eval', 'data.csv')
reader = csv.DictReader(open(datafile, 'r'))
examples = []
for line in reader:
  example = example_pb2.Example()
  # Convert each CSV cell to the type the schema declares for that feature.
  # Empty cells become empty value lists (i.e. missing values).
  for feature in schema.feature:
    key = feature.name
    if feature.type == schema_pb2.FLOAT:
      example.features.feature[key].float_list.value[:] = (
          [float(line[key])] if len(line[key]) > 0 else [])
    elif feature.type == schema_pb2.INT:
      example.features.feature[key].int64_list.value[:] = (
          [int(line[key])] if len(line[key]) > 0 else [])
    elif feature.type == schema_pb2.BYTES:
      example.features.feature[key].bytes_list.value[:] = (
          [line[key].encode('utf8')] if len(line[key]) > 0 else [])
  # Add a new column 'big_tipper' that indicates if tips was > 20% of the fare.
  # TODO(b/157064428): Remove after label transformation is supported for Keras.
  big_tipper = float(line['tips']) > float(line['fare']) * 0.2
  example.features.feature['big_tipper'].float_list.value[:] = [big_tipper]
  examples.append(example)

tfrecord_file = os.path.join(BASE_DIR, 'train_data.rio')
with tf.io.TFRecordWriter(tfrecord_file) as writer:
  for example in examples:
    # Each record is one serialized tf.Example proto.
    writer.write(example.SerializeToString())

!ls {tfrecord_file}
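The per-cell conversion and the `big_tipper` label above can be tried in isolation, without TensorFlow. The following is only a sketch: the helper names `cell_to_values` and `big_tipper_label` and the string type tags are hypothetical stand-ins for the schema enum values, and are not part of TFMA or the notebook.

```python
def cell_to_values(cell, kind):
    """Converts one CSV cell to a list of typed values.

    `kind` is a hypothetical tag ('FLOAT', 'INT' or 'BYTES') standing in for
    the schema feature type. An empty cell yields an empty list, which is how
    a missing value is represented in the loop above.
    """
    if cell == "":
        return []
    if kind == "FLOAT":
        return [float(cell)]
    if kind == "INT":
        return [int(cell)]
    return [cell.encode("utf8")]


def big_tipper_label(tips, fare):
    """Returns 1.0 if the tip is more than 20% of the fare, else 0.0."""
    return float(float(tips) > float(fare) * 0.2)


print(cell_to_values("12", "INT"), cell_to_values("", "FLOAT"))
print(big_tipper_label("3.00", "10.00"), big_tipper_label("1.00", "10.00"))
```

Storing the label as a float mirrors the `float_list` used for `big_tipper` in the record-building loop.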
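Set Up and Run TFMA
TFMA supports a number of different model types, including TF keras models, models based on generic TF2 signature APIs, and TF estimator based models. The get_started guide lists all supported model types and any restrictions. In this example we show how to configure a keras based model as well as an estimator based model that was saved as an EvalSavedModel. See the FAQ for examples of other configurations.
TFMA can compute metrics that were used at training time (i.e. built-in metrics) as well as metrics defined after the model was saved, as part of the TFMA configuration settings. For our keras setup we demonstrate adding metrics and plots manually as part of the configuration (see the metrics guide for information on the metrics and plots that are supported). For the estimator setup we use the built-in metrics that were saved with the model. Our setups also include a number of slicing specs, which are discussed in more detail in the following sections.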
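To make the idea of slicing concrete, here is a minimal plain-Python sketch of what computing a metric "per slice" means: group the examples by a feature value and evaluate each group separately. The examples are made up, and the hand-rolled `precision_recall` helper is an illustration only, not TFMA's metric implementation.

```python
from collections import defaultdict

# Hypothetical examples: (trip_start_hour, label, predicted_label).
examples = [
    (1, 1, 1), (1, 0, 1), (1, 1, 0),
    (3, 1, 1), (3, 0, 0), (3, 1, 1),
]


def precision_recall(rows):
    """Returns (precision, recall) for a list of (label, prediction) pairs."""
    tp = sum(1 for y, p in rows if y == 1 and p == 1)
    fp = sum(1 for y, p in rows if y == 0 and p == 1)
    fn = sum(1 for y, p in rows if y == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else float("nan")
    recall = tp / (tp + fn) if tp + fn else float("nan")
    return precision, recall


# The overall slice uses every example.
overall = precision_recall([(y, p) for _, y, p in examples])

# A feature-key slice groups the examples by each value of that feature.
by_hour = defaultdict(list)
for hour, y, p in examples:
    by_hour[hour].append((y, p))
per_slice = {hour: precision_recall(rows) for hour, rows in by_hour.items()}

print("overall:", overall)
print("per trip_start_hour:", per_slice)
```

A good overall number can hide a weak group, which is why the configurations in this notebook request both an overall slice and per-feature slices.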
After creating a tfma.EvalConfig and a tfma.EvalSharedModel, we can run TFMA using tfma.run_model_analysis. This creates a tfma.EvalResult, which we use later to render our metrics and plots.
import tensorflow_model_analysis as tfma

# Setup tfma.EvalConfig settings
keras_eval_config = text_format.Parse("""
  ## Model information
  model_specs {
    # For keras (and serving models) we need to add a `label_key`.
    label_key: "big_tipper"
  }

  ## Post training metric information. These will be merged with any built-in
  ## metrics from training.
  metrics_specs {
    metrics { class_name: "ExampleCount" }
    metrics { class_name: "AUC" }
    metrics { class_name: "Precision" }
    metrics { class_name: "Recall" }
    metrics { class_name: "MeanPrediction" }
    metrics { class_name: "Calibration" }
    metrics { class_name: "CalibrationPlot" }
    metrics { class_name: "ConfusionMatrixPlot" }
    # ... add additional metrics and plots ...
  }

  ## Slicing information
  slicing_specs {}  # overall slice
  slicing_specs {
    feature_keys: ["trip_start_hour"]
  }
  slicing_specs {
    feature_keys: ["trip_start_day"]
  }
  slicing_specs {
    feature_values: {
      key: "trip_start_month"
      value: "1"
    }
  }
""", tfma.EvalConfig())

# Create a tfma.EvalSharedModel that points at our keras model.
keras_model_path = os.path.join(MODELS_DIR, 'keras', '2')
keras_eval_shared_model = tfma.default_eval_shared_model(
    eval_saved_model_path=keras_model_path,
    eval_config=keras_eval_config)

keras_output_path = os.path.join(OUTPUT_DIR, 'keras')

# Run TFMA
keras_eval_result = tfma.run_model_analysis(
    eval_shared_model=keras_eval_shared_model,
    eval_config=keras_eval_config,
    data_location=tfrecord_file,
    output_path=keras_output_path)
import tensorflow_model_analysis as tfma

# Setup tfma.EvalConfig settings
estimator_eval_config = text_format.Parse("""
  ## Model information
  model_specs {
    # To use EvalSavedModel set `signature_name` to "eval".
    signature_name: "eval"
  }

  ## Post training metric information. These will be merged with any built-in
  ## metrics from training.
  metrics_specs {
    metrics { class_name: "ConfusionMatrixPlot" }
    # ... add additional metrics and plots ...
  }

  ## Slicing information
  slicing_specs {}  # overall slice
  slicing_specs {
    feature_keys: ["trip_start_hour"]
  }
  slicing_specs {
    feature_keys: ["trip_start_day"]
  }
  slicing_specs {
    feature_values: {
      key: "trip_start_month"
      value: "1"
    }
  }
""", tfma.EvalConfig())

# Create a tfma.EvalSharedModel that points at our eval saved model.
estimator_base_model_path = os.path.join(
    MODELS_DIR, 'estimator', 'eval_model_dir')
estimator_model_path = os.path.join(
    estimator_base_model_path, os.listdir(estimator_base_model_path)[0])
estimator_eval_shared_model = tfma.default_eval_shared_model(
    eval_saved_model_path=estimator_model_path,
    eval_config=estimator_eval_config)

estimator_output_path = os.path.join(OUTPUT_DIR, 'estimator')

# Run TFMA
estimator_eval_result = tfma.run_model_analysis(
    eval_shared_model=estimator_eval_shared_model,
    eval_config=estimator_eval_config,
    data_location=tfrecord_file,
    output_path=estimator_output_path)
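Now that we've run the evaluation, let's take a look at our visualizations using TFMA. In the following examples we visualize the results of running the evaluation on the keras model. To view the estimator based model instead, update eval_result_path to point at our estimator_output_path variable.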
eval_result_path = keras_output_path
# eval_result_path = estimator_output_path
eval_result = keras_eval_result
# eval_result = estimator_eval_result
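TFMA provides DataFrame APIs in tfma.experimental.dataframe for loading the materialized output as Pandas DataFrames. To view metrics you can use metrics_as_dataframes(tfma.load_metrics(eval_path)), which returns an object that may contain several DataFrames, one for each metric value type (such as double_value and array_value). Which DataFrames are populated depends on the eval result. Here we show the double_value DataFrame as an example.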
import tensorflow_model_analysis.experimental.dataframe as tfma_dataframe
dfs = tfma_dataframe.metrics_as_dataframes(
    tfma.load_metrics(eval_result_path))

display(dfs.double_value.head())
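Each DataFrame has a column MultiIndex with the top-level columns slices, metric_keys, and metric_values. The exact columns in each group can change according to the payload. We can use the DataFrame.columns API to inspect all the multi-index columns. For example, the slices columns are 'Overall', 'trip_start_day', 'trip_start_hour', and 'trip_start_month', as configured by the slicing_specs in the eval_config.
print(dfs.double_value.columns)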
MultiIndex([(       'slices',  'trip_start_hour'),
            (       'slices',          'Overall'),
            (       'slices',   'trip_start_day'),
            (       'slices', 'trip_start_month'),
            (  'metric_keys',             'name'),
            (  'metric_keys',       'model_name'),
            (  'metric_keys',      'output_name'),
            (  'metric_keys', 'example_weighted'),
            (  'metric_keys',          'is_diff'),
            ('metric_values',     'double_value')],
           )
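To get a feel for a column MultiIndex without running TFMA, the small pandas sketch below builds a frame that only mimics the column layout printed above. All values are hypothetical, and the frame has fewer columns than the real one.

```python
import pandas as pd

# Hypothetical, hand-written values; not output from the tutorial's model.
columns = pd.MultiIndex.from_tuples([
    ("slices", "trip_start_hour"),
    ("slices", "Overall"),
    ("metric_keys", "name"),
    ("metric_values", "double_value"),
])
rows = [
    [1.0, None, "auc", 0.91],
    [3.0, None, "auc", 0.88],
    [None, "", "auc", 0.93],
    [1.0, None, "recall", 0.40],
]
df = pd.DataFrame(rows, columns=columns)

# Top-level groups of the column MultiIndex, in order of first appearance.
top_level = list(df.columns.get_level_values(0).unique())

# Second-level columns that belong to the 'slices' group.
slice_columns = list(df["slices"].columns)

# Keep only rows whose trip_start_hour is 1 or 3, using .loc with a mask.
mask = df[("slices", "trip_start_hour")].isin([1.0, 3.0])
filtered = df.loc[mask]

print(top_level)
print(slice_columns)
print(len(filtered))
```

Selecting with a tuple such as `("slices", "trip_start_hour")` addresses one concrete column, while `df["slices"]` returns the whole group.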
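The DataFrames are verbose by design so that no information from the payload is lost. For direct consumption, however, we may prefer a more concise but lossy layout: slices as rows and metrics as columns. TFMA provides the auto_pivot API for this purpose. The utility pivots on all of the non-unique columns inside metric_keys and, by default, condenses all slices into a single stringified_slices column.
tfma_dataframe.auto_pivot(dfs.double_value).head()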
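Since the output is a DataFrame, any native DataFrame API can be used to slice and dice it. For example, if we are only interested in trip_start_hour values of 1, 3, 5, and 7, and not in trip_start_day, we can use the DataFrame's .loc filtering logic. After filtering, we again use the auto_pivot function to reorganize the DataFrame into the slice vs. metrics view.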
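df_double = dfs.double_value
# Keep only the rows whose trip_start_hour slice value is 1, 3, 5 or 7.
df_filtered = df_double.loc[
    df_double.slices.trip_start_hour.isin([1, 3, 5, 7])]
display(tfma_dataframe.auto_pivot(df_filtered))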
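We can also sort slices by metric value. As an example, we sort the slices in the DataFrame above by ascending AUC, so that we can find the poorly performing slices. This takes two steps: auto-pivot so that slices are represented as rows and metrics as columns, and then sort the pivoted DataFrame by the auc column.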
# Pivoted table sorted by AUC in ascending order.
df_sorted = (
    tfma_dataframe.auto_pivot(df_double)
    .sort_values(by='auc', ascending=True)
)
display(df_sorted.head())
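The two steps (pivot, then sort) can be reproduced on a toy table with plain pandas. This is a rough analogue for illustration, not how `auto_pivot` is implemented. The column names `slice`, `metric`, and `value` and all numbers are hypothetical.

```python
import pandas as pd

# Hypothetical long-form metrics: one row per (slice, metric) pair.
long_df = pd.DataFrame({
    "slice": ["trip_start_hour:1", "trip_start_hour:1",
              "trip_start_hour:3", "trip_start_hour:3",
              "Overall", "Overall"],
    "metric": ["auc", "recall", "auc", "recall", "auc", "recall"],
    "value": [0.91, 0.40, 0.88, 0.35, 0.93, 0.42],
})

# Step 1: pivot so that each slice is a row and each metric is a column.
pivoted = long_df.pivot(index="slice", columns="metric", values="value")

# Step 2: sort ascending by AUC so the weakest slices come first.
by_auc = pivoted.sort_values(by="auc", ascending=True)

print(by_auc)
```

Sorting ascending puts the slices that most need attention at the top of the table.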
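Any plots that were added to the tfma.EvalConfig as post-training metrics_specs can be displayed using tfma.view.render_plot.
As with metrics, plots can be viewed by slice. Unlike metrics, only the plots for a particular slice value can be displayed, so a tfma.SlicingSpec must be used, and it must specify both a slice feature name and a value. If no slice is provided, the plots for the Overall slice are used.
In the example below we display the CalibrationPlot and ConfusionMatrixPlot plots that were computed for the trip_start_hour:1 slice.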
tfma.view.render_plot(
    eval_result,
    tfma.SlicingSpec(feature_values={'trip_start_hour': '1'}))
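You need to monitor and measure your model's performance on an ongoing basis, so that you are aware of changes and can react to them. Let's take a look at how TFMA can help.
Let's load 3 different model runs and use TFMA to see how they compare, using render_time_series.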
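# Note this re-uses the EvalConfig from the keras setup.

# Run eval on each saved model
output_paths = []
for i in range(3):
  # Create a tfma.EvalSharedModel that points at our saved model.
  eval_shared_model = tfma.default_eval_shared_model(
      eval_saved_model_path=os.path.join(MODELS_DIR, 'keras', str(i)),
      eval_config=keras_eval_config)

  output_path = os.path.join(OUTPUT_DIR, 'time_series', str(i))
  output_paths.append(output_path)

  # Run TFMA
  tfma.run_model_analysis(eval_shared_model=eval_shared_model,
                          eval_config=keras_eval_config,
                          data_location=tfrecord_file,
                          output_path=output_path)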
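First, imagine that we trained and deployed our model yesterday, and now we want to see how it is doing on the new data coming in today. The visualization starts by displaying AUC. From the UI you can:
- Add other metrics using the "Add metric series" menu.
- Close unwanted graphs by clicking on x.
- Hover over data points (the ends of line segments in the graph) to get more details.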
eval_results_from_disk = tfma.load_eval_results(output_paths[:2])

tfma.view.render_time_series(eval_results_from_disk)
eval_results_from_disk = tfma.load_eval_results(output_paths)

tfma.view.render_time_series(eval_results_from_disk)
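TFMA can be configured to evaluate multiple models at the same time. Typically this is done to compare a new model against a baseline (such as the currently serving model), to determine how metrics (e.g. AUC) differ relative to the baseline. When thresholds are configured, TFMA also produces a tfma.ValidationResult that records whether the thresholds were met.
Let's re-configure our keras evaluation to compare two models: a candidate and a baseline. We also validate the candidate's performance against the baseline by setting a tfma.MetricThreshold on the AUC metric.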
# Setup tfma.EvalConfig setting
eval_config_with_thresholds = text_format.Parse("""
  ## Model information
  model_specs {
    name: "candidate"
    # For keras we need to add a `label_key`.
    label_key: "big_tipper"
  }
  model_specs {
    name: "baseline"
    # For keras we need to add a `label_key`.
    label_key: "big_tipper"
    is_baseline: true
  }

  ## Post training metric information
  metrics_specs {
    metrics { class_name: "ExampleCount" }
    metrics { class_name: "BinaryAccuracy" }
    metrics { class_name: "BinaryCrossentropy" }
    metrics {
      class_name: "AUC"
      threshold {
        # Ensure that AUC is always > 0.9
        value_threshold {
          lower_bound { value: 0.9 }
        }
        # Ensure that AUC does not drop by more than a small epsilon
        # e.g. (candidate - baseline) > -1e-10 or candidate > baseline - 1e-10
        change_threshold {
          direction: HIGHER_IS_BETTER
          absolute { value: -1e-10 }
        }
      }
    }
    metrics { class_name: "AUCPrecisionRecall" }
    metrics { class_name: "Precision" }
    metrics { class_name: "Recall" }
    metrics { class_name: "MeanLabel" }
    metrics { class_name: "MeanPrediction" }
    metrics { class_name: "Calibration" }
    metrics { class_name: "CalibrationPlot" }
    metrics { class_name: "ConfusionMatrixPlot" }
    # ... add additional metrics and plots ...
  }

  ## Slicing information
  slicing_specs {}  # overall slice
  slicing_specs {
    feature_keys: ["trip_start_hour"]
  }
  slicing_specs {
    feature_keys: ["trip_start_day"]
  }
  slicing_specs {
    feature_keys: ["trip_start_month"]
  }
  slicing_specs {
    feature_keys: ["trip_start_hour", "trip_start_day"]
  }
""", tfma.EvalConfig())

# Create tfma.EvalSharedModels that point at our keras models.
candidate_model_path = os.path.join(MODELS_DIR, 'keras', '2')
baseline_model_path = os.path.join(MODELS_DIR, 'keras', '1')
eval_shared_models = [
  tfma.default_eval_shared_model(
      model_name=tfma.CANDIDATE_KEY,
      eval_saved_model_path=candidate_model_path,
      eval_config=eval_config_with_thresholds),
  tfma.default_eval_shared_model(
      model_name=tfma.BASELINE_KEY,
      eval_saved_model_path=baseline_model_path,
      eval_config=eval_config_with_thresholds),
]

validation_output_path = os.path.join(OUTPUT_DIR, 'validation')

# Run TFMA
eval_result_with_validation = tfma.run_model_analysis(
    eval_shared_models,
    eval_config=eval_config_with_thresholds,
    data_location=tfrecord_file,
    output_path=validation_output_path)
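When running evaluations with one or more models against a baseline, TFMA automatically adds diff metrics for all of the metrics computed during the evaluation. These metrics are named after the corresponding metric, with _diff appended to the metric name.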
tfma.view.render_time_series(eval_result_with_validation)
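The `_diff` naming convention described above can be illustrated with a few lines of plain Python. This is a sketch of the idea only: the helper `add_diff_metrics` and the metric values are hypothetical, and TFMA computes these metrics internally.

```python
def add_diff_metrics(candidate, baseline):
    """Returns the candidate metrics plus `<name>_diff` entries.

    Each diff is candidate minus baseline, for metrics present in both.
    """
    result = dict(candidate)
    for name, value in candidate.items():
        if name in baseline:
            result[name + "_diff"] = value - baseline[name]
    return result


# Hypothetical metric values for the two models.
candidate_metrics = {"auc": 0.75, "recall": 0.5}
baseline_metrics = {"auc": 0.5, "recall": 0.25}

print(add_diff_metrics(candidate_metrics, baseline_metrics))
```

With this sign convention, a positive `auc_diff` means the candidate scores higher than the baseline.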
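Now let's look at the output of our validation checks. To view the validation results we use tfma.load_validation_result. In our example, the validation fails because AUC is below the threshold.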
validation_result = tfma.load_validation_result(validation_output_path)
print(validation_result.validation_ok)
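The two checks configured on AUC earlier (a lower bound on the value, and a bound on the change relative to the baseline) can be written out as a small plain-Python sketch. The function `check_auc` and the sample values are hypothetical. The strict `>` comparisons follow the comments in the config, and whether TFMA treats the boundary as inclusive is an assumption not verified here.

```python
def check_auc(candidate_auc, baseline_auc, lower_bound=0.9, min_change=-1e-10):
    """Mimics the value and change thresholds from the config (a sketch only).

    value threshold:  candidate_auc must exceed `lower_bound`.
    change threshold: (candidate_auc - baseline_auc) must exceed `min_change`,
                      i.e. the candidate may not be meaningfully worse.
    """
    value_ok = candidate_auc > lower_bound
    change_ok = (candidate_auc - baseline_auc) > min_change
    return value_ok and change_ok


# Hypothetical AUC values.
print(check_auc(0.95, 0.94))  # above the bound and better than the baseline
print(check_auc(0.85, 0.80))  # improves on the baseline but below the bound
print(check_auc(0.93, 0.95))  # above the bound but worse than the baseline
```

Both conditions must hold for the check to pass, so a candidate that beats the baseline can still fail if its AUC stays below the lower bound.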