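Using the SavedModel format

A SavedModel contains a complete TensorFlow program, including trained parameters (i.e., tf.Variables) and computation. It does not require the original model building code to run, which makes it useful for sharing or deploying with TFLite, TensorFlow.js, TensorFlow Serving, or TensorFlow Hub.

You can save and load a model in the SavedModel format using the following APIs:

- Low-level tf.saved_model API. This document describes how to use this API in detail.
  - Save: tf.saved_model.save(model, path_to_dir)
  - Load: model = tf.saved_model.load(path_to_dir)
- High-level tf.keras.Model API. Refer to the Keras save and serialize guide.
- If you just want to save/load weights during training, refer to the checkpoints guide.

Creating a SavedModel from Keras

For a quick introduction, this section exports a pre-trained Keras model and serves image classification requests with it. The rest of the guide fills in details and discusses other ways to create SavedModels.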

import os
import tempfile

from matplotlib import pyplot as plt
import numpy as np
import tensorflow as tf

tmpdir = tempfile.mkdtemp()

physical_devices = tf.config.list_physical_devices('GPU')
for device in physical_devices:
  tf.config.experimental.set_memory_growth(device, True)

file = tf.keras.utils.get_file(
    "grace_hopper.jpg",
    "https://storage.googleapis.com/download.tensorflow.org/example_images/grace_hopper.jpg")
img = tf.keras.utils.load_img(file, target_size=[224, 224])
plt.imshow(img)
plt.axis('off')
x = tf.keras.utils.img_to_array(img)
x = tf.keras.applications.mobilenet.preprocess_input(
    x[tf.newaxis,...])

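You'll use an image of Grace Hopper as a running example, and a Keras pre-trained image classification model since it's easy to use. Custom models work too, and are covered in detail later.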

labels_path = tf.keras.utils.get_file(
    'ImageNetLabels.txt',
    'https://storage.googleapis.com/download.tensorflow.org/data/ImageNetLabels.txt')
imagenet_labels = np.array(open(labels_path).read().splitlines())

pretrained_model = tf.keras.applications.MobileNet()
result_before_save = pretrained_model(x)

decoded = imagenet_labels[np.argsort(result_before_save)[0,::-1][:5]+1]

print("Result before saving:\n", decoded)

The top prediction for this image is "military uniform".

mobilenet_save_path = os.path.join(tmpdir, "mobilenet/1/")
tf.saved_model.save(pretrained_model, mobilenet_save_path)

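The save path follows a convention used by TensorFlow Serving where the last path component (1/ here) is a version number for your model; it allows tools like TensorFlow Serving to reason about the relative freshness of models.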
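The version-directory convention can be illustrated with a small sketch. The helper `latest_version_dir` is a hypothetical name and is not part of TensorFlow or TensorFlow Serving; it only assumes that versions are subdirectories with purely numeric names and that the largest number is the newest.

```python
import os
import tempfile


def latest_version_dir(model_base_path):
    """Return the numeric subdirectory with the highest version, or None."""
    versions = [
        name for name in os.listdir(model_base_path)
        if name.isdigit() and os.path.isdir(os.path.join(model_base_path, name))
    ]
    if not versions:
        return None
    # Compare as integers so that "10" counts as newer than "9".
    newest = max(versions, key=int)
    return os.path.join(model_base_path, newest)


# Usage with a throwaway layout: <base>/1, <base>/9, <base>/10.
base = tempfile.mkdtemp()
for version in ("1", "9", "10"):
    os.makedirs(os.path.join(base, version))
print(os.path.basename(latest_version_dir(base)))
```

Comparing the names as integers rather than strings matters once a model reaches double-digit versions.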

You can load the SavedModel back into Python with tf.saved_model.load and see how Admiral Hopper's image is classified.

loaded = tf.saved_model.load(mobilenet_save_path)
print(list(loaded.signatures.keys()))  # ["serving_default"]

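Imported signatures always return dictionaries. To customize signature names and output dictionary keys, see Specifying signatures during export.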

infer = loaded.signatures["serving_default"]
print(infer.structured_outputs)

Running inference from the SavedModel gives the same result as the original model.

labeling = infer(tf.constant(x))[pretrained_model.output_names[0]]

decoded = imagenet_labels[np.argsort(labeling)[0,::-1][:5]+1]

print("Result after saving and loading:\n", decoded)

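Running a SavedModel in TensorFlow Serving

SavedModels are usable from Python (more on that below), but production environments typically use a dedicated service for inference without running Python code. This is easy to set up from a SavedModel using TensorFlow Serving.

See the TensorFlow Serving REST tutorial for an end-to-end tensorflow-serving example.

The SavedModel format on disk

A SavedModel is a directory containing serialized signatures and the state needed to run them, including variable values and vocabularies.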

!ls {mobilenet_save_path}

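The saved_model.pb file stores the actual TensorFlow program, or model, and a set of named signatures, each identifying a function that accepts tensor inputs and produces tensor outputs.

SavedModels may contain multiple variants of the model (multiple v1.MetaGraphDefs, identified with the --tag_set flag to saved_model_cli), but this is rare. APIs that create multiple variants of a model include tf.Estimator.experimental_export_all_saved_models and, in TensorFlow 1.x, tf.saved_model.Builder.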

!saved_model_cli show --dir {mobilenet_save_path} --tag_set serve

The variables directory contains a standard training checkpoint (see the guide to training checkpoints).

!ls {mobilenet_save_path}/variables

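The assets directory contains files used by the TensorFlow graph, for example text files used to initialize vocabulary tables. It is unused in this example.

SavedModels may have an assets.extra directory for any files not used by the TensorFlow graph, for example information for consumers about what to do with the SavedModel. TensorFlow itself does not use this directory.

The fingerprint.pb file contains the fingerprint of the SavedModel, which is composed of several 64-bit hashes that uniquely identify the contents of the SavedModel. The fingerprinting API is currently experimental, but tf.saved_model.experimental.read_fingerprint can be used to read the SavedModel fingerprint into a tf.saved_model.experimental.Fingerprint object.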
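As a quick way to check a directory against the layout described above, the following sketch reports which of the named entries are present. `describe_saved_model_dir` and `KNOWN_ENTRIES` are hypothetical names introduced for illustration; the code only inspects the file system and does not parse or validate any file contents.

```python
import os
import tempfile

# Entry names taken from the description above.
KNOWN_ENTRIES = [
    "saved_model.pb",
    "variables",
    "assets",
    "assets.extra",
    "fingerprint.pb",
]


def describe_saved_model_dir(path):
    """Map each known entry name to whether it exists under `path`."""
    return {
        name: os.path.exists(os.path.join(path, name))
        for name in KNOWN_ENTRIES
    }


# Usage with a stand-in directory that only mimics part of the layout.
fake_dir = tempfile.mkdtemp()
open(os.path.join(fake_dir, "saved_model.pb"), "wb").close()
os.makedirs(os.path.join(fake_dir, "variables"))
for name, present in describe_saved_model_dir(fake_dir).items():
    print(name, "present" if present else "missing")
```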

Saving a custom model

tf.saved_model.save supports saving tf.Module objects and its subclasses, like tf.keras.Layer and tf.keras.Model.

Let's look at an example of saving and restoring a tf.Module.

class CustomModule(tf.Module):

  def __init__(self):
    super(CustomModule, self).__init__()
    self.v = tf.Variable(1.)

  @tf.function
  def __call__(self, x):
    print('Tracing with', x)
    return x * self.v

  @tf.function(input_signature=[tf.TensorSpec([], tf.float32)])
  def mutate(self, new_v):
    self.v.assign(new_v)

module = CustomModule()

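When you save a tf.Module, any tf.Variable attributes, tf.function-decorated methods, and tf.Modules found via recursive traversal are saved. (See the Checkpoint tutorial for more about this recursive traversal.) However, any Python attributes, functions, and data are lost. This means that when a tf.function is saved, no Python code is saved.

If no Python code is saved, how does SavedModel know how to restore the function?

Briefly, tf.function works by tracing the Python code to generate a ConcreteFunction (a callable wrapper around tf.Graph). When saving a tf.function, you're really saving the tf.function's cache of ConcreteFunctions.

To learn more about the relationship between tf.function and ConcreteFunctions, refer to the tf.function guide.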
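The tracing behavior can be observed directly. The sketch below is an illustration rather than part of the original guide: `scale` and `trace_log` are made-up names, and the example assumes default tf.function settings. A Python side effect (appending to a list) runs only while a new ConcreteFunction is being traced, so the list length counts the traces.

```python
import tensorflow as tf

trace_log = []  # Python side effect: only touched while tracing


@tf.function
def scale(x):
    trace_log.append(x.shape)  # runs once per new trace, not per call
    return x * 2.0


scale(tf.constant(1.0))         # first call: traces for a scalar float32
scale(tf.constant(2.0))         # same input signature: reuses the trace
scale(tf.constant([1.0, 2.0]))  # new shape: traces a second function
print(len(trace_log))  # 2
```

Only the traced graphs end up in a SavedModel, which is why a loaded function can be called only with input signatures that were traced before saving.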

module_no_signatures_path = os.path.join(tmpdir, 'module_no_signatures')
module(tf.constant(0.))
print('Saving model...')
tf.saved_model.save(module, module_no_signatures_path)

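Loading and using a custom model

When you load a SavedModel in Python, all tf.Variable attributes, tf.function-decorated methods, and tf.Modules are restored in the same object structure as the original saved tf.Module.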

imported = tf.saved_model.load(module_no_signatures_path)
assert imported(tf.constant(3.)).numpy() == 3
imported.mutate(tf.constant(2.))
assert imported(tf.constant(3.)).numpy() == 6

Because no Python code is saved, calling a tf.function with a new input signature will fail:

imported(tf.constant([3.]))

ValueError: Could not find matching function to call for canonicalized inputs ((,), {}). Only existing signatures are [((TensorSpec(shape=(), dtype=tf.float32, name=u'x'),), {})].

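Basic fine-tuning

Variable objects are available, and you can backpropagate through imported functions. That is enough to fine-tune (i.e., retrain) a SavedModel in simple cases.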

optimizer = tf.keras.optimizers.SGD(0.05)

def train_step():
  with tf.GradientTape() as tape:
    loss = (10. - imported(tf.constant(2.))) ** 2
  variables = tape.watched_variables()
  grads = tape.gradient(loss, variables)
  optimizer.apply_gradients(zip(grads, variables))
  return loss

for _ in range(10):
  # "v" approaches 5, "loss" approaches 0
  print("loss={:.2f} v={:.2f}".format(train_step(), imported.v.numpy()))

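General fine-tuning

A SavedModel from Keras provides more details than a plain __call__ to address more advanced cases of fine-tuning. TensorFlow Hub recommends providing the following of those, if applicable, in SavedModels shared for the purpose of fine-tuning:

- If the model uses dropout or another technique in which the forward pass differs between training and inference (like batch normalization), the __call__ method takes an optional, Python-valued training= argument that defaults to False but can be set to True.
- Next to the __call__ attribute, there are .variables and .trainable_variables attributes with the corresponding lists of variables. A variable that was originally trainable but is meant to be frozen during fine-tuning is omitted from .trainable_variables.
- For the sake of frameworks like Keras that represent weight regularizers as attributes of layers or sub-models, there can also be a .regularization_losses attribute. It holds a list of zero-argument functions whose values are meant for addition to the total loss.

Going back to the initial MobileNet example, you can see some of those in action: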

loaded = tf.saved_model.load(mobilenet_save_path)
print("MobileNet has {} trainable variables: {}, ...".format(
          len(loaded.trainable_variables),
          ", ".join([v.name for v in loaded.trainable_variables[:5]])))

trainable_variable_ids = {id(v) for v in loaded.trainable_variables}
non_trainable_variables = [v for v in loaded.variables
                           if id(v) not in trainable_variable_ids]
print("MobileNet also has {} non-trainable variables: {}, ...".format(
          len(non_trainable_variables),
          ", ".join([v.name for v in non_trainable_variables[:3]])))

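Specifying signatures during export

Tools like TensorFlow Serving and saved_model_cli can interact with SavedModels. To help these tools determine which ConcreteFunctions to use, you need to specify serving signatures. tf.keras.Models automatically specify serving signatures, but you have to explicitly declare a serving signature for custom modules.

By default, no signatures are declared in a custom tf.Module.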

assert len(imported.signatures) == 0

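To declare a serving signature, specify a ConcreteFunction using the signatures keyword argument. When specifying a single signature, its signature key will be 'serving_default', which is saved as the constant tf.saved_model.DEFAULT_SERVING_SIGNATURE_DEF_KEY.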

module_with_signature_path = os.path.join(tmpdir, 'module_with_signature')
call = module.__call__.get_concrete_function(tf.TensorSpec(None, tf.float32))
tf.saved_model.save(module, module_with_signature_path, signatures=call)

imported_with_signatures = tf.saved_model.load(module_with_signature_path)
list(imported_with_signatures.signatures.keys())

To export multiple signatures, pass a dictionary of signature keys to ConcreteFunctions. Each signature key corresponds to one ConcreteFunction.

module_multiple_signatures_path = os.path.join(tmpdir, 'module_with_multiple_signatures')
signatures = {"serving_default": call,
              "array_input": module.__call__.get_concrete_function(tf.TensorSpec([None], tf.float32))}

tf.saved_model.save(module, module_multiple_signatures_path, signatures=signatures)

imported_with_multiple_signatures = tf.saved_model.load(module_multiple_signatures_path)
list(imported_with_multiple_signatures.signatures.keys())

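By default, the output tensor names are fairly generic, like output_0. To control the names of outputs, modify your tf.function to return a dictionary that maps output names to outputs. The names of inputs are derived from the Python function argument names.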

class CustomModuleWithOutputName(tf.Module):
  def __init__(self):
    super(CustomModuleWithOutputName, self).__init__()
    self.v = tf.Variable(1.)

  @tf.function(input_signature=[tf.TensorSpec(None, tf.float32)])
  def __call__(self, x):
    return {'custom_output_name': x * self.v}

module_output = CustomModuleWithOutputName()
call_output = module_output.__call__.get_concrete_function(tf.TensorSpec(None, tf.float32))
module_output_path = os.path.join(tmpdir, 'module_with_output_name')
tf.saved_model.save(module_output, module_output_path,
                    signatures={'serving_default': call_output})

imported_with_output_name = tf.saved_model.load(module_output_path)
imported_with_output_name.signatures['serving_default'].structured_outputs

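Proto-splitting

Due to limits of the protobuf implementation, proto sizes cannot exceed 2GB. This can lead to the following errors when attempting to save very large models: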

ValueError: Message tensorflow.SavedModel exceeds maximum protobuf size of 2GB: ...

google.protobuf.message.DecodeError: Error parsing message as the message exceeded the protobuf limit with type 'tensorflow.GraphDef'

If you wish to save models that exceed the 2GB limit, you'll need to save using the new proto-splitting option:

tf.saved_model.save(
  ...,
  options=tf.saved_model.SaveOptions(experimental_image_format=True)
)

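For more information, refer to the Proto Splitter / Merger library guide.

Load a SavedModel in C++

The C++ version of the SavedModel loader provides an API to load a SavedModel from a path, while allowing SessionOptions and RunOptions. You have to specify the tags associated with the graph to be loaded. The loaded version of SavedModel is referred to as SavedModelBundle and contains the MetaGraphDef and the session within which it is loaded.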

const string export_dir = ...
SavedModelBundle bundle;
...
LoadSavedModel(session_options, run_options, export_dir, {kSavedModelTagTrain},
               &bundle);

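Details of the SavedModel command line interface

You can use the SavedModel Command Line Interface (CLI) to inspect and execute a SavedModel. For example, you can use the CLI to inspect the model's SignatureDefs. The CLI enables you to quickly confirm that the input tensor dtype and shape match the model. Moreover, if you want to test your model, you can use the CLI to do a sanity check by passing in sample inputs in various formats (for example, Python expressions) and then fetching the output.

Install the SavedModel CLI

Broadly speaking, you can install TensorFlow in either of the following two ways:

- By installing a pre-built TensorFlow binary.
- By building TensorFlow from source code.

If you installed TensorFlow through a pre-built TensorFlow binary, then the SavedModel CLI is already installed on your system at pathname bin/saved_model_cli.

If you built TensorFlow from source code, you must run the following additional command to build saved_model_cli: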

$ bazel build //tensorflow/python/tools:saved_model_cli

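Overview of commands

The SavedModel CLI supports the following two commands on a SavedModel:

- show, which shows the computations available from a SavedModel.
- run, which runs a computation from a SavedModel.

`show` command

A SavedModel contains one or more model variants (technically, v1.MetaGraphDefs), identified by their tag-sets. To serve a model, you might wonder what kind of SignatureDefs are in each model variant, and what their inputs and outputs are. The show command lets you examine the contents of the SavedModel in hierarchical order. Here's the syntax: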

usage: saved_model_cli show [-h] --dir DIR [--all]
[--tag_set TAG_SET] [--signature_def SIGNATURE_DEF_KEY]

For example, the following command shows all available tag-sets in the SavedModel:

$ saved_model_cli show --dir /tmp/saved_model_dir
The given SavedModel contains the following tag-sets:
serve
serve, gpu

The following command shows all available SignatureDef keys for a tag-set:

$ saved_model_cli show --dir /tmp/saved_model_dir --tag_set serve
The given SavedModel `MetaGraphDef` contains `SignatureDefs` with the
following keys:
SignatureDef key: "classify_x2_to_y3"
SignatureDef key: "classify_x_to_y"
SignatureDef key: "regress_x2_to_y3"
SignatureDef key: "regress_x_to_y"
SignatureDef key: "regress_x_to_y2"
SignatureDef key: "serving_default"

If there are *multiple* tags in the tag-set, you must specify all tags, each tag separated by a comma. For example:

$ saved_model_cli show --dir /tmp/saved_model_dir --tag_set serve,gpu

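To show all inputs and outputs TensorInfo for a specific SignatureDef, pass the SignatureDef key to the signature_def option. This is very useful when you want to know the tensor key value, dtype, and shape of the input tensors for executing the computation graph. For example: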

$ saved_model_cli show --dir \
/tmp/saved_model_dir --tag_set serve --signature_def serving_default
The given SavedModel SignatureDef contains the following input(s):
  inputs['x'] tensor_info:
      dtype: DT_FLOAT
      shape: (-1, 1)
      name: x:0
The given SavedModel SignatureDef contains the following output(s):
  outputs['y'] tensor_info:
      dtype: DT_FLOAT
      shape: (-1, 1)
      name: y:0
Method name is: tensorflow/serving/predict

To show all available information in the SavedModel, use the --all option. For example:

$ saved_model_cli show --dir /tmp/saved_model_dir --all
MetaGraphDef with tag-set: 'serve' contains the following SignatureDefs:

signature_def['classify_x2_to_y3']:
  The given SavedModel SignatureDef contains the following input(s):
    inputs['inputs'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 1)
        name: x2:0
  The given SavedModel SignatureDef contains the following output(s):
    outputs['scores'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 1)
        name: y3:0
  Method name is: tensorflow/serving/classify

...

signature_def['serving_default']:
  The given SavedModel SignatureDef contains the following input(s):
    inputs['x'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 1)
        name: x:0
  The given SavedModel SignatureDef contains the following output(s):
    outputs['y'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 1)
        name: y:0
  Method name is: tensorflow/serving/predict

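`run` command

Invoke the run command to run a graph computation, passing inputs and then displaying (and optionally saving) the outputs. Here's the syntax: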

usage: saved_model_cli run [-h] --dir DIR --tag_set TAG_SET --signature_def
                           SIGNATURE_DEF_KEY [--inputs INPUTS]
                           [--input_exprs INPUT_EXPRS]
                           [--input_examples INPUT_EXAMPLES] [--outdir OUTDIR]
                           [--overwrite] [--tf_debug]

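The run command provides the following three ways to pass inputs to the model:

- The --inputs option enables you to pass numpy ndarrays in files.
- The --input_exprs option enables you to pass Python expressions.
- The --input_examples option enables you to pass tf.train.Example.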

`--inputs`

To pass input data in files, specify the --inputs option, which takes the following general format:

--inputs <INPUTS>

where *INPUTS* is either of the following formats:

<input_key>=<filename>
<input_key>=<filename>[<variable_name>]

You may pass multiple *INPUTS*. If you do pass multiple inputs, use a semicolon to separate each of the *INPUTS*.
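To make the format concrete, the following sketch splits an --inputs string into its parts. It is an illustration of the grammar described above, not the CLI's own parser; `parse_inputs` and the file paths are hypothetical.

```python
import re

# <input_key>=<filename> with an optional trailing [<variable_name>]
_INPUT_RE = re.compile(
    r"^(?P<key>[^=]+)=(?P<file>[^\[\]]+)(\[(?P<var>[^\[\]]+)\])?$")


def parse_inputs(inputs):
    """Return {input_key: (filename, variable_name or None)}."""
    result = {}
    # Semicolons separate the individual INPUTS entries.
    for item in filter(None, inputs.split(";")):
        match = _INPUT_RE.match(item.strip())
        if match is None:
            raise ValueError("malformed --inputs entry: %r" % item)
        result[match["key"]] = (match["file"], match["var"])
    return result


parsed = parse_inputs("x=/tmp/a.npy;y=/tmp/b.npz[arr]")
print(parsed)
```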

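saved_model_cli uses numpy.load to load the *filename*. The *filename* may be in any of the following formats:

- .npy
- .npz
- pickle format

A .npy file always contains a numpy ndarray. Therefore, when loading from a .npy file, the content is directly assigned to the specified input tensor. If you specify a *variable_name* with that .npy file, the *variable_name* is ignored and a warning is issued.

When loading from a .npz (zip) file, you may optionally specify a *variable_name* to identify the variable within the zip file to load for the input tensor key. If you don't specify a *variable_name*, the SavedModel CLI checks that only one file is included in the zip file and loads it for the specified input tensor key.

When loading from a pickle file, if no *variable_name* is specified in the square brackets, whatever is inside the pickle file is passed to the specified input tensor key. Otherwise, the SavedModel CLI assumes a dictionary is stored in the pickle file and uses the value corresponding to the *variable_name* for the input tensor key.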
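The three loading rules can be sketched in a few lines of numpy. This is a hedged approximation of the behavior described above, not the CLI's source: `load_input` is a hypothetical helper, and selecting the branch by file extension is an assumption made here for simplicity.

```python
import os
import pickle
import tempfile
import warnings

import numpy as np


def load_input(filename, variable_name=None):
    """Hypothetical loader following the rules described above."""
    # allow_pickle=True lets np.load fall back to pickle; trusted files only.
    data = np.load(filename, allow_pickle=True)
    extension = os.path.splitext(filename)[1]
    if extension == ".npy":
        if variable_name is not None:
            warnings.warn("variable_name is ignored for .npy files")
        return data
    if extension == ".npz":
        if variable_name is not None:
            return data[variable_name]
        if len(data.files) != 1:
            raise ValueError("archive holds several arrays; name one")
        return data[data.files[0]]
    # Anything else is treated as a pickle file.
    return data if variable_name is None else data[variable_name]


# Usage: write one file per format, then load each one back.
tmp = tempfile.mkdtemp()
arr = np.arange(3.0)
np.save(os.path.join(tmp, "a.npy"), arr)
np.savez(os.path.join(tmp, "b.npz"), x=arr, y=arr * 2)
with open(os.path.join(tmp, "c.pkl"), "wb") as f:
    pickle.dump({"x": arr}, f)

print(load_input(os.path.join(tmp, "a.npy")))
print(load_input(os.path.join(tmp, "b.npz"), "y"))
print(load_input(os.path.join(tmp, "c.pkl"), "x"))
```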

`--input_exprs`

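To pass inputs through Python expressions, specify the --input_exprs option. This can be useful when you don't have data files lying around, but still want to sanity check the model with some simple inputs that match the dtype and shape of the model's SignatureDefs. For example: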

`<input_key>=[[1],[2],[3]]`

In addition to Python expressions, you may also pass numpy functions. For example:

`<input_key>=np.ones((32,32,3))`

(Note that the numpy module is already available to you as np.)
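One way to picture how such expressions become input arrays is the sketch below. It is not the CLI's implementation: `parse_input_exprs` is a hypothetical helper, separating several entries with semicolons is an assumption carried over from --inputs, and plain `eval` is used only to keep the illustration short.

```python
import numpy as np


def parse_input_exprs(input_exprs):
    """Evaluate '<key>=<expr>;<key>=<expr>' into a dict of numpy arrays."""
    # Only `np` is exposed to the expressions; builtins are removed.
    namespace = {"np": np, "__builtins__": {}}
    result = {}
    for item in filter(None, input_exprs.split(";")):
        key, expr = item.split("=", 1)
        # eval() runs arbitrary code: use it only on expressions you trust.
        result[key.strip()] = np.asarray(eval(expr, namespace))
    return result


feeds = parse_input_exprs("x=[[1],[2],[3]];y=np.ones((32,32,3))")
print(feeds["x"].shape, feeds["y"].shape)
```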

`--input_examples`

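To pass tf.train.Example as inputs, specify the --input_examples option. For each input key, it takes a list of dictionaries, where each dictionary is an instance of tf.train.Example. The dictionary keys are the features and the values are the value lists for each feature. For example: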

`<input_key>=[{"age":[22,24],"education":["BS","MS"]}]`
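For reference, a dictionary like the one above corresponds to a tf.train.Example roughly as sketched here. `dict_to_example` is a hypothetical helper, and choosing the feature type from the first value in each list is an assumption of this sketch rather than documented CLI behavior.

```python
import tensorflow as tf


def dict_to_example(feature_dict):
    """Build a tf.train.Example from {feature_name: list_of_values}."""
    features = {}
    for name, values in feature_dict.items():
        first = values[0]
        if isinstance(first, int):
            feature = tf.train.Feature(
                int64_list=tf.train.Int64List(value=values))
        elif isinstance(first, float):
            feature = tf.train.Feature(
                float_list=tf.train.FloatList(value=values))
        elif isinstance(first, str):
            # Strings are stored as UTF-8 encoded bytes.
            encoded = [v.encode("utf-8") for v in values]
            feature = tf.train.Feature(
                bytes_list=tf.train.BytesList(value=encoded))
        else:
            raise TypeError("unsupported feature type for %r" % name)
        features[name] = feature
    return tf.train.Example(features=tf.train.Features(feature=features))


example = dict_to_example({"age": [22, 24], "education": ["BS", "MS"]})
serialized = example.SerializeToString()  # bytes, ready to feed to a model
print(example)
```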

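Save output

By default, the SavedModel CLI writes output to stdout. If a directory is passed to the --outdir option, the outputs are saved as .npy files named after the output tensor keys under the given directory.

Use --overwrite to overwrite existing output files.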
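The naming scheme for saved outputs can be mimicked with numpy as follows. `save_outputs` is a hypothetical helper, not the CLI's code, and raising an error when a file already exists and overwriting was not requested is an assumption of this sketch.

```python
import os
import tempfile

import numpy as np


def save_outputs(outputs, outdir, overwrite=False):
    """Write each output array to <outdir>/<output_key>.npy."""
    os.makedirs(outdir, exist_ok=True)
    paths = []
    for key, value in outputs.items():
        path = os.path.join(outdir, key + ".npy")
        if os.path.exists(path) and not overwrite:
            raise FileExistsError(path)
        np.save(path, value)
        paths.append(path)
    return paths


outdir = tempfile.mkdtemp()
paths = save_outputs({"y": np.array([[1.0], [2.0]])}, outdir)
print([os.path.basename(p) for p in paths])
```

A file written this way can be read back with numpy.load and, for example, passed to a later run through the --inputs option.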