This tutorial demonstrates training a simple Convolutional Neural Network (CNN) to classify CIFAR images. Because this tutorial uses the Keras Sequential API, creating and training the model takes just a few lines of code.

Import TensorFlow
import tensorflow as tf
from tensorflow.keras import datasets, layers, models
import matplotlib.pyplot as plt
Download and prepare the CIFAR10 dataset

The CIFAR10 dataset contains 60,000 color images in 10 classes, with 6,000 images in each class. The dataset is divided into 50,000 training images and 10,000 testing images. The classes are mutually exclusive and there is no overlap between them.
(train_images, train_labels), (test_images, test_labels) = datasets.cifar10.load_data()
# Normalize pixel values to be between 0 and 1
train_images, test_images = train_images / 255.0, test_images / 255.0
Downloading data from https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz 170498071/170498071 [==============================] - 2s 0us/step
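As a quick sanity check (a short sketch, not part of the original tutorial), you can confirm the shapes described above:

print(train_images.shape)  # (50000, 32, 32, 3)
print(test_images.shape)   # (10000, 32, 32, 3)
print(train_labels.shape)  # (50000, 1): one integer label per image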
Verify the data

To verify that the dataset looks correct, let's plot the first 25 images from the training set and display the class name below each image:
class_names = ['airplane', 'automobile', 'bird', 'cat', 'deer',
               'dog', 'frog', 'horse', 'ship', 'truck']

plt.figure(figsize=(10,10))
for i in range(25):
    plt.subplot(5,5,i+1)
    plt.xticks([])
    plt.yticks([])
    plt.grid(False)
    plt.imshow(train_images[i])
    # The CIFAR labels happen to be arrays,
    # which is why you need the extra index
    plt.xlabel(class_names[train_labels[i][0]])
plt.show()
Create the convolutional base

The 6 lines of code below define the convolutional base using a common pattern: a stack of Conv2D and MaxPooling2D layers.

As input, a CNN takes tensors of shape (image_height, image_width, color_channels), ignoring the batch size. If you are new to these dimensions, color_channels refers to (R,G,B). In this example, you will configure the CNN to process inputs of shape (32, 32, 3), which is the format of CIFAR images. You can do this by passing the argument input_shape to your first layer.
model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
Let's display the architecture of your model so far:
model.summary()
Model: "sequential" _________________________________________________________________ Layer (type) Output Shape Param # ================================================================= conv2d (Conv2D) (None, 30, 30, 32) 896 max_pooling2d (MaxPooling2 (None, 15, 15, 32) 0 D) conv2d_1 (Conv2D) (None, 13, 13, 64) 18496 max_pooling2d_1 (MaxPoolin (None, 6, 6, 64) 0 g2D) conv2d_2 (Conv2D) (None, 4, 4, 64) 36928 ================================================================= Total params: 56320 (220.00 KB) Trainable params: 56320 (220.00 KB) Non-trainable params: 0 (0.00 Byte) _________________________________________________________________
As shown above, the output of every Conv2D and MaxPooling2D layer is a 3D tensor of shape (height, width, channels). The width and height dimensions tend to shrink as you go deeper in the network. The number of output channels for each Conv2D layer is controlled by the first argument (e.g., 32 or 64). Typically, as the width and height shrink, you can afford (computationally) to add more output channels in each Conv2D layer.
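As a rough check of those shapes (a sketch, not part of the original tutorial): with the default 'valid' padding, each 3x3 convolution shrinks the height and width by 2, and each 2x2 max pooling halves them (rounding down).

# Trace the spatial size through the five layers defined above.
n = 32
for op in ['conv', 'pool', 'conv', 'pool', 'conv']:
    n = n - 2 if op == 'conv' else n // 2
print(n)  # 4, matching the final (None, 4, 4, 64) shape in the summary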
Add Dense layers on top

To complete the model, you will feed the last output tensor from the convolutional base (of shape (4, 4, 64)) into one or more Dense layers to perform classification. Dense layers take vectors as input (which are 1D), while the current output is a 3D tensor. First, you will flatten (or unroll) the 3D output to 1D, then add one or more Dense layers on top. CIFAR has 10 output classes, so you use a final Dense layer with 10 outputs.
model.add(layers.Flatten())
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dense(10))
Here's the complete architecture of your model:
model.summary()
Model: "sequential" _________________________________________________________________ Layer (type) Output Shape Param # ================================================================= conv2d (Conv2D) (None, 30, 30, 32) 896 max_pooling2d (MaxPooling2 (None, 15, 15, 32) 0 D) conv2d_1 (Conv2D) (None, 13, 13, 64) 18496 max_pooling2d_1 (MaxPoolin (None, 6, 6, 64) 0 g2D) conv2d_2 (Conv2D) (None, 4, 4, 64) 36928 flatten (Flatten) (None, 1024) 0 dense (Dense) (None, 64) 65600 dense_1 (Dense) (None, 10) 650 ================================================================= Total params: 122570 (478.79 KB) Trainable params: 122570 (478.79 KB) Non-trainable params: 0 (0.00 Byte) _________________________________________________________________
The network summary shows that the (4, 4, 64) outputs are flattened into vectors of shape (1024) before going through two Dense layers.
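The parameter counts in the summary follow directly from these shapes (a worked check, assuming the default bias terms):

assert 4 * 4 * 64 == 1024       # flattened length
assert 1024 * 64 + 64 == 65600  # dense: weights + biases
assert 64 * 10 + 10 == 650      # dense_1: weights + biases

Compile and train the model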
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])

history = model.fit(train_images, train_labels, epochs=10,
                    validation_data=(test_images, test_labels))
Epoch 1/10
1563/1563 [==============================] - 10s 5ms/step - loss: 1.5211 - accuracy: 0.4429 - val_loss: 1.2497 - val_accuracy: 0.5531
Epoch 2/10
1563/1563 [==============================] - 6s 4ms/step - loss: 1.1408 - accuracy: 0.5974 - val_loss: 1.1474 - val_accuracy: 0.6023
Epoch 3/10
1563/1563 [==============================] - 6s 4ms/step - loss: 0.9862 - accuracy: 0.6538 - val_loss: 0.9759 - val_accuracy: 0.6582
Epoch 4/10
1563/1563 [==============================] - 6s 4ms/step - loss: 0.8929 - accuracy: 0.6879 - val_loss: 0.9412 - val_accuracy: 0.6702
Epoch 5/10
1563/1563 [==============================] - 6s 4ms/step - loss: 0.8183 - accuracy: 0.7131 - val_loss: 0.8830 - val_accuracy: 0.6967
Epoch 6/10
1563/1563 [==============================] - 6s 4ms/step - loss: 0.7588 - accuracy: 0.7334 - val_loss: 0.8671 - val_accuracy: 0.7039
Epoch 7/10
1563/1563 [==============================] - 6s 4ms/step - loss: 0.7126 - accuracy: 0.7518 - val_loss: 0.8972 - val_accuracy: 0.6897
Epoch 8/10
1563/1563 [==============================] - 7s 4ms/step - loss: 0.6655 - accuracy: 0.7661 - val_loss: 0.8412 - val_accuracy: 0.7111
Epoch 9/10
1563/1563 [==============================] - 7s 4ms/step - loss: 0.6205 - accuracy: 0.7851 - val_loss: 0.8581 - val_accuracy: 0.7109
Epoch 10/10
1563/1563 [==============================] - 7s 4ms/step - loss: 0.5872 - accuracy: 0.7937 - val_loss: 0.8817 - val_accuracy: 0.7113
Evaluate the model
plt.plot(history.history['accuracy'], label='accuracy')
plt.plot(history.history['val_accuracy'], label = 'val_accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.ylim([0.5, 1])
plt.legend(loc='lower right')
test_loss, test_acc = model.evaluate(test_images, test_labels, verbose=2)
313/313 - 1s - loss: 0.8817 - accuracy: 0.7113 - 655ms/epoch - 2ms/step
print(test_acc)
0.7113000154495239
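The final Dense layer returns raw logits rather than probabilities, which is why the loss was configured with from_logits=True. If you want class probabilities or predicted labels, one option (a sketch, not part of the original tutorial) is to attach a Softmax layer to the trained model:

# Wrap the trained model with a Softmax layer, then take the argmax
# to get a predicted class index per image.
probability_model = tf.keras.Sequential([model, tf.keras.layers.Softmax()])
predictions = probability_model.predict(test_images[:5])
print(predictions.argmax(axis=1))  # predicted class indices for the first 5 test images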
Your simple CNN has achieved a test accuracy of over 70%. Not bad for a few lines of code! For another CNN style, check out the TensorFlow 2 quickstart for experts example, which uses the Keras subclassing API and tf.GradientTape.
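In case it helps to picture what that style looks like, here is a minimal, hypothetical sketch of a subclassed model with tf.GradientTape for a single training step (this is not the linked example's code):

class MyModel(tf.keras.Model):
    def __init__(self):
        super().__init__()
        self.conv = layers.Conv2D(32, 3, activation='relu')
        self.flatten = layers.Flatten()
        self.dense = layers.Dense(10)

    def call(self, x):
        return self.dense(self.flatten(self.conv(x)))

sub_model = MyModel()
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
optimizer = tf.keras.optimizers.Adam()

# One manual optimization step on a small batch.
with tf.GradientTape() as tape:
    logits = sub_model(train_images[:32], training=True)
    loss = loss_fn(train_labels[:32], logits)
grads = tape.gradient(loss, sub_model.trainable_variables)
optimizer.apply_gradients(zip(grads, sub_model.trainable_variables))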