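Overview
With the **TensorFlow Text Summary API**, you can easily log arbitrary text and view it in TensorBoard. This is useful for sampling and examining your input data, or for recording execution metadata or generated text. You can also log diagnostic data as text, which can help during model development.
In this tutorial, you will try out some basic use cases of the Text Summary API.
Setup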
try:
  # %tensorflow_version only exists in Colab.
  %tensorflow_version 2.x
except Exception:
  pass
# Load the TensorBoard notebook extension.
%load_ext tensorboard
import tensorflow as tf
from datetime import datetime
import json
from packaging import version
import tempfile
print("TensorFlow version: ", tf.__version__)
assert version.parse(tf.__version__).release[0] >= 2, \
"This notebook requires TensorFlow 2.0 or above."
TensorFlow version: 2.5.0-dev20210219
Logging a single piece of text
To understand how the Text Summary API works, you will simply log a bit of text and see how it is presented in TensorBoard.
my_text = "Hello world! 😃"
# Clear out any prior log data.
!rm -rf logs
# Sets up a timestamped log directory.
logdir = "logs/text_basics/" + datetime.now().strftime("%Y%m%d-%H%M%S")
# Creates a file writer for the log directory.
file_writer = tf.summary.create_file_writer(logdir)
# Using the file writer, log the text.
with file_writer.as_default():
  tf.summary.text("first_text", my_text, step=0)
Now use TensorBoard to examine the text. Wait a few seconds for the UI to start.
%tensorboard --logdir logs
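The timestamped log directory built above keeps each run in its own subdirectory, so earlier runs are not overwritten. As a minimal sketch, the same naming scheme can be wrapped in a small function. `make_logdir` and its parameters are hypothetical names introduced here for illustration; they are not part of TensorFlow or TensorBoard.

```python
from datetime import datetime
from typing import Optional


def make_logdir(root: str, run_name: str, now: Optional[datetime] = None) -> str:
    """Build a path of the form '<root>/<run_name>/<YYYYmmdd-HHMMSS>'.

    Hypothetical helper for illustration; not a TensorFlow or TensorBoard API.
    """
    # Accept a fixed time so that the result can be reproduced in tests.
    stamp = (now or datetime.now()).strftime("%Y%m%d-%H%M%S")
    return f"{root}/{run_name}/{stamp}"


logdir_example = make_logdir("logs", "text_basics", datetime(2021, 2, 19, 12, 0, 0))
print(logdir_example)  # logs/text_basics/20210219-120000
```

The returned string could then be passed to `tf.summary.create_file_writer` in the same way as the `logdir` variable in the cell above.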
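Organizing multiple text streams
If you have multiple streams of text, you can keep them in separate namespaces to help organize them, just as you would with scalars or other data.
Note that if you log text at many steps, TensorBoard subsamples the steps it displays to keep the presentation manageable. You can control the sampling rate with the --samples_per_plugin flag.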
# Sets up a second directory to not overwrite the first one.
logdir = "logs/multiple_texts/" + datetime.now().strftime("%Y%m%d-%H%M%S")
# Creates a file writer for the log directory.
file_writer = tf.summary.create_file_writer(logdir)
# Using the file writer, log the text.
with file_writer.as_default():
  with tf.name_scope("name_scope_1"):
    for step in range(20):
      tf.summary.text("a_stream_of_text", f"Hello from step {step}", step=step)
      tf.summary.text("another_stream_of_text", f"This can be kept separate {step}", step=step)
  with tf.name_scope("name_scope_2"):
    tf.summary.text("just_from_step_0", "This is an important announcement from step 0", step=0)
%tensorboard --logdir logs/multiple_texts --samples_per_plugin 'text=5'
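To build intuition for what subsampling 20 logged steps down to 5 might look like, the sketch below uses reservoir sampling, a common technique for keeping a bounded, uniformly chosen subset of a stream. This tutorial does not describe TensorBoard's actual sampling logic, so treat this purely as an illustration; `subsample_steps` and its parameters are hypothetical names.

```python
import random
from typing import List


def subsample_steps(steps: List[int], max_size: int, seed: int = 0) -> List[int]:
    """Keep at most `max_size` steps, chosen uniformly (reservoir sampling).

    Illustrative only; this is not TensorBoard's implementation.
    """
    rng = random.Random(seed)
    reservoir: List[int] = []
    for seen, step in enumerate(steps, start=1):
        if len(reservoir) < max_size:
            # Fill the reservoir with the first `max_size` steps.
            reservoir.append(step)
        else:
            # Replace a kept step with probability max_size / seen.
            slot = rng.randrange(seen)
            if slot < max_size:
                reservoir[slot] = step
    return sorted(reservoir)


kept = subsample_steps(list(range(20)), max_size=5)
print(kept)  # five distinct step numbers between 0 and 19
```

With fewer steps than `max_size`, every step is kept, which matches the intuition that subsampling only matters once many steps have been logged.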
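Markdown interpretation
TensorBoard interprets text summaries as Markdown, since rich formatting can make the data you log easier to read and understand, as shown below. (If you do not want Markdown interpretation, see this issue for workarounds to suppress it.)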
# Sets up a third timestamped log directory under "logs"
logdir = "logs/markdown/" + datetime.now().strftime("%Y%m%d-%H%M%S")
# Creates a file writer for the log directory.
file_writer = tf.summary.create_file_writer(logdir)
some_obj_worth_noting = {
  "tfds_training_data": {
    "name": "mnist",
    "split": "train",
    "shuffle_files": "True",
  },
  "keras_optimizer": {
    "name": "Adagrad",
    "learning_rate": "0.001",
    "epsilon": 1e-07,
  },
  "hardware": "Cloud TPU",
}
# TODO: Update this example when TensorBoard is released with
# https://github.com/tensorflow/tensorboard/pull/4585
# which supports fenced codeblocks in Markdown.
def pretty_json(hp):
  json_hp = json.dumps(hp, indent=2)
  return "".join("\t" + line for line in json_hp.splitlines(True))
markdown_text = """
### Markdown Text

TensorBoard supports basic markdown syntax, including:

    preformatted code

**bold text**

| and | tables |
| ---- | ---------- |
| among | others |
"""
with file_writer.as_default():
  tf.summary.text("run_params", pretty_json(some_obj_worth_noting), step=0)
  tf.summary.text("markdown_jubiliee", markdown_text, step=0)
%tensorboard --logdir logs/markdown
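Because tables are among the Markdown features shown above, flat run parameters can also be logged as a table rather than as preformatted JSON. The following is a minimal sketch under that assumption; `dict_to_markdown_table` is a hypothetical helper, not part of TensorFlow or TensorBoard, and it handles only flat (non-nested) mappings.

```python
from typing import Mapping


def dict_to_markdown_table(params: Mapping[str, object]) -> str:
    """Render a flat mapping as a two-column Markdown table.

    Hypothetical helper for illustration; nested values are simply str()-ed.
    """
    def cell(value: object) -> str:
        # A literal pipe would end the table cell early, so escape it.
        return str(value).replace("|", "\\|")

    lines = ["| key | value |", "| --- | --- |"]
    lines += [f"| {cell(key)} | {cell(value)} |" for key, value in params.items()]
    return "\n".join(lines)


table_text = dict_to_markdown_table({"name": "Adagrad", "learning_rate": "0.001"})
print(table_text)
# | key | value |
# | --- | --- |
# | name | Adagrad |
# | learning_rate | 0.001 |
```

The resulting string could be passed to `tf.summary.text` inside `file_writer.as_default()`, in the same way as `markdown_text` above.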