Weight clustering in Keras example


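Overview

Welcome to the end-to-end example for weight clustering, part of the TensorFlow Model Optimization Toolkit.

Other pages

For an introduction to what weight clustering is and to determine whether you should use it (including what is supported), see the overview page.

To quickly find the APIs you need for your use case (beyond fully clustering a model with 16 clusters), see the comprehensive guide.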

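Contents

In this tutorial, you will:

  1. Train a keras model for the MNIST dataset from scratch.
  2. Fine-tune the model by applying the weight clustering API and see the accuracy.
  3. Create 6x smaller TF and TFLite models from clustering.
  4. Create an 8x smaller TFLite model by combining weight clustering and post-training quantization.
  5. See the persistence of accuracy from TF to TFLite.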

Setup

You can run this Jupyter Notebook in your local virtualenv or in colab. For details on setting up dependencies, see the installation guide.

 pip install -q tensorflow-model-optimization
import tensorflow as tf
from tensorflow_model_optimization.python.core.keras.compat import keras

import numpy as np
import tempfile
import zipfile
import os

Train a keras model for MNIST without clustering

# Load MNIST dataset
(train_images, train_labels), (test_images, test_labels) = tf.keras.datasets.mnist.load_data()

# Normalize the input images so that each pixel value is between 0 and 1.
train_images = train_images / 255.0
test_images  = test_images / 255.0

# Define the model architecture.
model = keras.Sequential([
    keras.layers.InputLayer(input_shape=(28, 28)),
    keras.layers.Reshape(target_shape=(28, 28, 1)),
    keras.layers.Conv2D(filters=12, kernel_size=(3, 3), activation=tf.nn.relu),
    keras.layers.MaxPooling2D(pool_size=(2, 2)),
    keras.layers.Flatten(),
    keras.layers.Dense(10)
])

# Train the digit classification model
model.compile(optimizer='adam',
              loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])

model.fit(
    train_images,
    train_labels,
    validation_split=0.1,
    epochs=10
)
2024-03-09 12:34:47.914475: E external/local_xla/xla/stream_executor/cuda/cuda_driver.cc:282] failed call to cuInit: CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected
Epoch 1/10
1688/1688 [==============================] - 21s 4ms/step - loss: 0.2914 - accuracy: 0.9183 - val_loss: 0.1165 - val_accuracy: 0.9680
Epoch 2/10
1688/1688 [==============================] - 7s 4ms/step - loss: 0.1108 - accuracy: 0.9686 - val_loss: 0.0783 - val_accuracy: 0.9788
Epoch 3/10
1688/1688 [==============================] - 7s 4ms/step - loss: 0.0807 - accuracy: 0.9769 - val_loss: 0.0671 - val_accuracy: 0.9833
Epoch 4/10
1688/1688 [==============================] - 7s 4ms/step - loss: 0.0661 - accuracy: 0.9806 - val_loss: 0.0646 - val_accuracy: 0.9840
Epoch 5/10
1688/1688 [==============================] - 7s 4ms/step - loss: 0.0575 - accuracy: 0.9826 - val_loss: 0.0596 - val_accuracy: 0.9850
Epoch 6/10
1688/1688 [==============================] - 7s 4ms/step - loss: 0.0514 - accuracy: 0.9843 - val_loss: 0.0594 - val_accuracy: 0.9852
Epoch 7/10
1688/1688 [==============================] - 7s 4ms/step - loss: 0.0465 - accuracy: 0.9859 - val_loss: 0.0685 - val_accuracy: 0.9820
Epoch 8/10
1688/1688 [==============================] - 7s 4ms/step - loss: 0.0422 - accuracy: 0.9873 - val_loss: 0.0622 - val_accuracy: 0.9845
Epoch 9/10
1688/1688 [==============================] - 7s 4ms/step - loss: 0.0389 - accuracy: 0.9880 - val_loss: 0.0594 - val_accuracy: 0.9837
Epoch 10/10
1688/1688 [==============================] - 7s 4ms/step - loss: 0.0354 - accuracy: 0.9894 - val_loss: 0.0644 - val_accuracy: 0.9843
<tf_keras.src.callbacks.History at 0x7f1be5b37430>

Evaluate the baseline model and save it for later use

_, baseline_model_accuracy = model.evaluate(
    test_images, test_labels, verbose=0)

print('Baseline test accuracy:', baseline_model_accuracy)

_, keras_file = tempfile.mkstemp('.h5')
print('Saving model to: ', keras_file)
keras.models.save_model(model, keras_file, include_optimizer=False)
Baseline test accuracy: 0.98089998960495
Saving model to:  /tmpfs/tmp/tmpigrs28_d.h5
/tmpfs/tmp/ipykernel_29244/3680774635.py:8: UserWarning: You are saving your model as an HDF5 file via `model.save()`. This file format is considered legacy. We recommend using instead the native TF-Keras format, e.g. `model.save('my_model.keras')`.
  keras.models.save_model(model, keras_file, include_optimizer=False)

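Fine-tune the pre-trained model with clustering

The cluster_weights() API is applied to the whole pre-trained model to demonstrate how effectively it reduces the model size after zip compression while keeping decent accuracy. For how best to balance accuracy and compression rate for your use case, see the per-layer example in the comprehensive guide.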
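
As a rough picture of what clustering does to a weight tensor, the sketch below snaps every weight to the nearest of 16 evenly spaced centroids. It is a conceptual illustration only, not the library's implementation: the helper name `cluster_weights_sketch`, the random stand-in kernel, and the choice to leave the centroids fixed are all assumptions made for this example.

```python
import numpy as np


def cluster_weights_sketch(weights, number_of_clusters=16):
    """Replace each weight with its nearest centroid (hypothetical helper)."""
    # Linear initialization: centroids evenly spaced between min and max weight.
    centroids = np.linspace(weights.min(), weights.max(), number_of_clusters)
    # For every weight, find the index of the closest centroid.
    indices = np.abs(weights[..., np.newaxis] - centroids).argmin(axis=-1)
    return centroids[indices], indices, centroids


# Random stand-in for a dense kernel with the same shape as in this tutorial.
rng = np.random.default_rng(0)
dense_kernel = rng.normal(scale=0.05, size=(2028, 10)).astype(np.float32)

clustered, indices, centroids = cluster_weights_sketch(dense_kernel)
print("unique values before:", np.unique(dense_kernel).size)
print("unique values after: ", np.unique(clustered).size)
```

After this step the tensor holds at most 16 distinct values, which is what makes the saved model compress well later on.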

Define the model and apply the clustering API

Before you pass the model to the clustering API, make sure it is trained and shows acceptable accuracy.

import tensorflow_model_optimization as tfmot

cluster_weights = tfmot.clustering.keras.cluster_weights
CentroidInitialization = tfmot.clustering.keras.CentroidInitialization

clustering_params = {
  'number_of_clusters': 16,
  'cluster_centroids_init': CentroidInitialization.LINEAR
}

# Cluster a whole model
clustered_model = cluster_weights(model, **clustering_params)

# Use smaller learning rate for fine-tuning clustered model
opt = keras.optimizers.Adam(learning_rate=1e-5)

clustered_model.compile(
  loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
  optimizer=opt,
  metrics=['accuracy'])

clustered_model.summary()
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 cluster_reshape (ClusterWe  (None, 28, 28, 1)         0         
 ights)                                                          
                                                                 
 cluster_conv2d (ClusterWei  (None, 26, 26, 12)        244       
 ghts)                                                           
                                                                 
 cluster_max_pooling2d (Clu  (None, 13, 13, 12)        0         
 sterWeights)                                                    
                                                                 
 cluster_flatten (ClusterWe  (None, 2028)              0         
 ights)                                                          
                                                                 
 cluster_dense (ClusterWeig  (None, 10)                40586     
 hts)                                                            
                                                                 
=================================================================
Total params: 40830 (239.13 KB)
Trainable params: 20442 (79.85 KB)
Non-trainable params: 20388 (159.28 KB)
_________________________________________________________________
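
The parameter counts in the summary above can be reproduced with a little arithmetic. The sketch below assumes that each clustering wrapper keeps the original kernel and bias, adds one cluster index per kernel weight (non-trainable), and adds one table of 16 centroids (trainable). That breakdown is an assumption chosen because it matches the printed numbers exactly; the summary itself does not itemize the variables.

```python
number_of_clusters = 16

# (kernel weights, bias weights) for the two layers that own parameters.
layers = {
    "conv2d": (3 * 3 * 1 * 12, 12),  # 3x3 kernel, 1 input channel, 12 filters
    "dense": (2028 * 10, 10),        # 13 * 13 * 12 = 2028 inputs, 10 outputs
}

total = trainable = non_trainable = 0
for name, (kernel, bias) in layers.items():
    indices = kernel                # assumed: one cluster index per kernel weight
    centroids = number_of_clusters  # assumed: one centroid table per layer
    layer_params = kernel + bias + indices + centroids
    print(name, layer_params)
    total += layer_params
    trainable += kernel + bias + centroids
    non_trainable += indices

print("total:", total)
print("trainable:", trainable)
print("non-trainable:", non_trainable)
```

Under these assumptions the per-layer counts come out to 244 and 40586, and the totals to 40830, 20442 and 20388, the same figures the summary reports.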

Fine-tune the model and evaluate the accuracy against the baseline

Fine-tune the model with clustering for 1 epoch.

# Fine-tune model
clustered_model.fit(
  train_images,
  train_labels,
  batch_size=500,
  epochs=1,
  validation_split=0.1)
108/108 [==============================] - 3s 17ms/step - loss: 0.0397 - accuracy: 0.9875 - val_loss: 0.0671 - val_accuracy: 0.9833
<tf_keras.src.callbacks.History at 0x7f1be5b9a790>
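
The step counts in the two training logs (1688 for the baseline, 108 for fine-tuning) follow from the data split and the batch size. The short check below assumes the standard 60,000-image MNIST training split and the Keras default batch size of 32 for the baseline run, which does not set `batch_size` explicitly.

```python
import math

train_examples = 60000                      # MNIST training split
validation_examples = train_examples // 10  # validation_split=0.1
fit_examples = train_examples - validation_examples

baseline_steps = math.ceil(fit_examples / 32)    # default batch size
fine_tune_steps = math.ceil(fit_examples / 500)  # batch_size=500

print("examples used for fitting:", fit_examples)
print("baseline steps per epoch:", baseline_steps)
print("fine-tuning steps per epoch:", fine_tune_steps)
```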

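For this example, the loss in test accuracy after clustering is small compared to the baseline: about 0.2 percentage points in the run shown here.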

_, clustered_model_accuracy = clustered_model.evaluate(
  test_images, test_labels, verbose=0)

print('Baseline test accuracy:', baseline_model_accuracy)
print('Clustered test accuracy:', clustered_model_accuracy)
Baseline test accuracy: 0.98089998960495
Clustered test accuracy: 0.9786999821662903

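Create 6x smaller models from clustering

Both strip_clustering and a standard compression algorithm (for example, gzip) are necessary to see the compression benefits of clustering.

First, create a compressible model for TensorFlow. Here, strip_clustering removes all variables that clustering only needs during training (for example, the tf.Variable objects that store the cluster centroids and the indices), which would otherwise add to the model size during inference.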

final_model = tfmot.clustering.keras.strip_clustering(clustered_model)

_, clustered_keras_file = tempfile.mkstemp('.h5')
print('Saving clustered model to: ', clustered_keras_file)
keras.models.save_model(final_model, clustered_keras_file, 
                           include_optimizer=False)
Saving clustered model to:  /tmpfs/tmp/tmpu8mqv83j.h5
WARNING:tensorflow:Compiled the loaded model, but the compiled metrics have yet to be built. `model.compile_metrics` will be empty until you train or evaluate the model.
/tmpfs/tmp/ipykernel_29244/2668672504.py:5: UserWarning: You are saving your model as an HDF5 file via `model.save()`. This file format is considered legacy. We recommend using instead the native TF-Keras format, e.g. `model.save('my_model.keras')`.
  keras.models.save_model(final_model, clustered_keras_file,

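Then, create a compressible model for TFLite. You can convert the clustered model to a format that runs on your target backend. TensorFlow Lite is one example that you can use to deploy to mobile devices.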

clustered_tflite_file = '/tmp/clustered_mnist.tflite'
converter = tf.lite.TFLiteConverter.from_keras_model(final_model)
tflite_clustered_model = converter.convert()
with open(clustered_tflite_file, 'wb') as f:
  f.write(tflite_clustered_model)
print('Saved clustered TFLite model to:', clustered_tflite_file)
INFO:tensorflow:Assets written to: /tmpfs/tmp/tmpdw8boe6k/assets
INFO:tensorflow:Assets written to: /tmpfs/tmp/tmpdw8boe6k/assets
Saved clustered TFLite model to: /tmp/clustered_mnist.tflite
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
W0000 00:00:1709987777.341582   29244 tf_tfl_flatbuffer_helpers.cc:390] Ignored output_format.
W0000 00:00:1709987777.341632   29244 tf_tfl_flatbuffer_helpers.cc:393] Ignored drop_control_dependency.

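Define a helper function that actually compresses the models and measures the compressed size. The helper writes a zip archive with DEFLATE, the same algorithm gzip uses.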

def get_gzipped_model_size(file):
  # Returns the size in bytes of the model file after zip (DEFLATE) compression.
  # Relies on the os, tempfile and zipfile imports from the setup cell.
  _, zipped_file = tempfile.mkstemp('.zip')
  with zipfile.ZipFile(zipped_file, 'w', compression=zipfile.ZIP_DEFLATED) as f:
    f.write(file)

  return os.path.getsize(zipped_file)
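
To see why a standard compressor benefits from clustering, the sketch below compresses two byte buffers with DEFLATE: one holding random float32 weights, the other holding the same weights snapped to 16 evenly spaced centroids. The random weights and the fixed centroids are stand-ins chosen for illustration, so the sizes it prints are not comparable to the model sizes in this tutorial.

```python
import random
import struct
import zlib

random.seed(0)
weights = [random.gauss(0.0, 0.05) for _ in range(20000)]

# 16 evenly spaced centroids between the smallest and largest weight.
lo, hi = min(weights), max(weights)
centroids = [lo + i * (hi - lo) / 15 for i in range(16)]
clustered = [min(centroids, key=lambda c: abs(c - w)) for w in weights]


def deflated_size(values):
    """Pack values as little-endian float32 and return the DEFLATE size in bytes."""
    raw = struct.pack("<%df" % len(values), *values)
    return len(zlib.compress(raw, 9))


print("unclustered, compressed bytes:", deflated_size(weights))
print("clustered, compressed bytes:  ", deflated_size(clustered))
```

Both buffers have the same uncompressed length; only the clustered one consists of a handful of repeating 4-byte patterns, which is what the compressor exploits.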

Compare and see that the models are 6x smaller from clustering

print("Size of gzipped baseline Keras model: %.2f bytes" % (get_gzipped_model_size(keras_file)))
print("Size of gzipped clustered Keras model: %.2f bytes" % (get_gzipped_model_size(clustered_keras_file)))
print("Size of gzipped clustered TFlite model: %.2f bytes" % (get_gzipped_model_size(clustered_tflite_file)))
Size of gzipped baseline Keras model: 78177.00 bytes
Size of gzipped clustered Keras model: 13053.00 bytes
Size of gzipped clustered TFlite model: 12638.00 bytes

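Create an 8x smaller TFLite model by combining weight clustering and post-training quantization

You can apply post-training quantization to the clustered model for additional benefits.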
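
As a rough intuition for where the extra savings come from, the sketch below applies generic symmetric 8-bit quantization to a float32 weight array. This is an illustration under stated assumptions, not a description of the exact scheme the TFLite converter uses; the helper name `quantize_int8` and the random weights are hypothetical.

```python
import numpy as np


def quantize_int8(weights):
    """Symmetric per-tensor quantization; returns (int8 values, scale)."""
    scale = np.abs(weights).max() / 127.0
    quantized = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return quantized, scale


rng = np.random.default_rng(0)
weights = rng.normal(scale=0.05, size=1000).astype(np.float32)

quantized, scale = quantize_int8(weights)
restored = quantized.astype(np.float32) * scale  # dequantize for comparison

print("bytes as float32:", weights.nbytes)
print("bytes as int8:   ", quantized.nbytes)
print("max abs error:   ", np.abs(weights - restored).max())
```

Each weight shrinks from 4 bytes to 1, and the rounding error is bounded by half the quantization step.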

converter = tf.lite.TFLiteConverter.from_keras_model(final_model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_quant_model = converter.convert()

_, quantized_and_clustered_tflite_file = tempfile.mkstemp('.tflite')

with open(quantized_and_clustered_tflite_file, 'wb') as f:
  f.write(tflite_quant_model)

print('Saved quantized and clustered TFLite model to:', quantized_and_clustered_tflite_file)
print("Size of gzipped baseline Keras model: %.2f bytes" % (get_gzipped_model_size(keras_file)))
print("Size of gzipped clustered and quantized TFlite model: %.2f bytes" % (get_gzipped_model_size(quantized_and_clustered_tflite_file)))
INFO:tensorflow:Assets written to: /tmpfs/tmp/tmpzkdsr0o3/assets
INFO:tensorflow:Assets written to: /tmpfs/tmp/tmpzkdsr0o3/assets
W0000 00:00:1709987778.159317   29244 tf_tfl_flatbuffer_helpers.cc:390] Ignored output_format.
W0000 00:00:1709987778.159352   29244 tf_tfl_flatbuffer_helpers.cc:393] Ignored drop_control_dependency.
Saved quantized and clustered TFLite model to: /tmpfs/tmp/tmpylkyahiy.tflite
Size of gzipped baseline Keras model: 78177.00 bytes
Size of gzipped clustered and quantized TFlite model: 9792.00 bytes
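
The "6x" and "8x" figures can be checked directly from the compressed sizes printed in this tutorial. The byte counts below are copied from the outputs above and will differ from run to run.

```python
baseline = 78177  # compressed baseline Keras model, in bytes

sizes = {
    "clustered Keras": 13053,
    "clustered TFLite": 12638,
    "clustered and quantized TFLite": 9792,
}

ratios = {name: baseline / size for name, size in sizes.items()}
for name, ratio in ratios.items():
    print("%s: %.2fx smaller than the baseline" % (name, ratio))
```

For this run the ratios are roughly 5.99x, 6.19x and 7.98x.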

See the persistence of accuracy from TF to TFLite

Define a helper function to evaluate the TFLite model on the test dataset.

def eval_model(interpreter):
  input_index = interpreter.get_input_details()[0]["index"]
  output_index = interpreter.get_output_details()[0]["index"]

  # Run predictions on every image in the "test" dataset.
  prediction_digits = []
  for i, test_image in enumerate(test_images):
    if i % 1000 == 0:
      print('Evaluated on {n} results so far.'.format(n=i))
    # Pre-processing: add batch dimension and convert to float32 to match with
    # the model's input data format.
    test_image = np.expand_dims(test_image, axis=0).astype(np.float32)
    interpreter.set_tensor(input_index, test_image)

    # Run inference.
    interpreter.invoke()

    # Post-processing: remove batch dimension and find the digit with highest
    # probability.
    output = interpreter.tensor(output_index)
    digit = np.argmax(output()[0])
    prediction_digits.append(digit)

  print('\n')
  # Compare prediction results with ground truth labels to calculate accuracy.
  prediction_digits = np.array(prediction_digits)
  accuracy = (prediction_digits == test_labels).mean()
  return accuracy

Evaluate the model, which has been clustered and quantized, and check that the accuracy from TensorFlow persists to the TFLite backend.

interpreter = tf.lite.Interpreter(model_content=tflite_quant_model)
interpreter.allocate_tensors()

test_accuracy = eval_model(interpreter)

print('Clustered and quantized TFLite test_accuracy:', test_accuracy)
print('Clustered TF test accuracy:', clustered_model_accuracy)
INFO: Created TensorFlow Lite XNNPACK delegate for CPU.
WARNING: Attempting to use a delegate that only supports static-sized tensors with a graph that has dynamic-sized tensors (tensor#13 is a dynamic-sized tensor).
Evaluated on 0 results so far.
Evaluated on 1000 results so far.
Evaluated on 2000 results so far.
Evaluated on 3000 results so far.
Evaluated on 4000 results so far.
Evaluated on 5000 results so far.
Evaluated on 6000 results so far.
Evaluated on 7000 results so far.
Evaluated on 8000 results so far.
Evaluated on 9000 results so far.


Clustered and quantized TFLite test_accuracy: 0.9791
Clustered TF test accuracy: 0.9786999821662903
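
To put the three accuracy figures side by side, the snippet below expresses the differences in percentage points. The values are copied from the outputs in this tutorial and are specific to this run.

```python
baseline = 0.98089998960495
clustered = 0.9786999821662903
clustered_quantized_tflite = 0.9791

clustering_cost = (baseline - clustered) * 100
tflite_change = (clustered_quantized_tflite - clustered) * 100

print("clustering vs. baseline: -%.2f points" % clustering_cost)
print("TFLite vs. clustered TF: %+.2f points" % tflite_change)
```

Clustering costs about 0.22 points here, and the quantized TFLite model stays within about 0.04 points of the clustered TensorFlow model.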

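Conclusion

In this tutorial, you saw how to create clustered models with the TensorFlow Model Optimization Toolkit API. More specifically, you worked through an end-to-end example that creates an 8x smaller model for MNIST with a minimal difference in accuracy. We encourage you to try this new capability, which can be particularly important for deployment in resource-constrained environments.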