多任务推荐器

在 TensorFlow.org 上查看 在 Google Colab 中运行 在 GitHub 上查看源代码 下载笔记本

基本检索教程 中,我们使用电影观看作为正向交互信号构建了一个检索系统。

然而,在许多应用程序中,存在多个丰富的反馈来源可供利用。例如,一个电子商务网站可能会记录用户访问产品页面(数量丰富,但信号相对较弱)、图像点击、添加到购物车以及最终的购买行为。它甚至可能会记录购买后的信号,例如评论和退货。

整合所有这些不同的反馈形式对于构建用户喜欢使用且不会以牺牲整体性能为代价优化任何一个指标的系统至关重要。

此外,为多个任务构建一个联合模型可能比构建多个特定于任务的模型产生更好的结果。这在某些数据丰富(例如,点击)而某些数据稀疏(购买、退货、手动评论)的情况下尤其如此。在这些情况下,联合模型可能能够通过称为 迁移学习 的现象,使用从丰富任务中学习到的表示来改进其对稀疏任务的预测。例如,这篇论文 表明,通过添加一个使用丰富点击日志数据的辅助任务,可以显著提高从稀疏用户调查中预测显式用户评分的模型。

在本教程中,我们将使用隐式(电影观看)和显式信号(评分)来构建一个针对 Movielens 的多目标推荐器。

导入

首先,让我们导入必要的库。

pip install -q tensorflow-recommenders
pip install -q --upgrade tensorflow-datasets
import os
import pprint
import tempfile

from typing import Dict, Text

import numpy as np
import tensorflow as tf
import tensorflow_datasets as tfds
2022-12-14 12:23:34.727681: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: libnvinfer.so.7: cannot open shared object file: No such file or directory
2022-12-14 12:23:34.727787: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory
2022-12-14 12:23:34.727798: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.
import tensorflow_recommenders as tfrs

准备数据集

我们将使用 Movielens 100K 数据集。

ratings = tfds.load('movielens/100k-ratings', split="train")
movies = tfds.load('movielens/100k-movies', split="train")

# Select the basic features.
ratings = ratings.map(lambda x: {
    "movie_title": x["movie_title"],
    "user_id": x["user_id"],
    "user_rating": x["user_rating"],
})
movies = movies.map(lambda x: x["movie_title"])
WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.9/site-packages/tensorflow/python/autograph/pyct/static_analysis/liveness.py:83: Analyzer.lamba_check (from tensorflow.python.autograph.pyct.static_analysis.liveness) is deprecated and will be removed after 2023-09-23.
Instructions for updating:
Lambda fuctions will be no more assumed to be used in the statement where they are used, or at least in the same block. https://github.com/tensorflow/tensorflow/issues/56089
WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.9/site-packages/tensorflow/python/autograph/pyct/static_analysis/liveness.py:83: Analyzer.lamba_check (from tensorflow.python.autograph.pyct.static_analysis.liveness) is deprecated and will be removed after 2023-09-23.
Instructions for updating:
Lambda fuctions will be no more assumed to be used in the statement where they are used, or at least in the same block. https://github.com/tensorflow/tensorflow/issues/56089

并重复我们为构建词汇表和将数据拆分为训练集和测试集所做的准备工作

# Randomly shuffle data and split between train and test.
tf.random.set_seed(42)
shuffled = ratings.shuffle(100_000, seed=42, reshuffle_each_iteration=False)

train = shuffled.take(80_000)
test = shuffled.skip(80_000).take(20_000)

movie_titles = movies.batch(1_000)
user_ids = ratings.batch(1_000_000).map(lambda x: x["user_id"])

unique_movie_titles = np.unique(np.concatenate(list(movie_titles)))
unique_user_ids = np.unique(np.concatenate(list(user_ids)))

多任务模型

多任务推荐器有两个关键部分

  1. 它们针对两个或多个目标进行优化,因此具有两个或多个损失函数。
  2. 它们在任务之间共享变量,从而实现迁移学习。

在本教程中,我们将像以前一样定义我们的模型,但我们将有两个任务而不是一个任务:一个预测评分,另一个预测电影观看。

用户和电影模型与以前相同

user_model = tf.keras.Sequential([
  tf.keras.layers.StringLookup(
      vocabulary=unique_user_ids, mask_token=None),
  # We add 1 to account for the unknown token.
  tf.keras.layers.Embedding(len(unique_user_ids) + 1, embedding_dimension)
])

movie_model = tf.keras.Sequential([
  tf.keras.layers.StringLookup(
      vocabulary=unique_movie_titles, mask_token=None),
  tf.keras.layers.Embedding(len(unique_movie_titles) + 1, embedding_dimension)
])

但是,现在我们将有两个任务。第一个是评分任务

tfrs.tasks.Ranking(
    loss=tf.keras.losses.MeanSquaredError(),
    metrics=[tf.keras.metrics.RootMeanSquaredError()],
)

它的目标是尽可能准确地预测评分。

第二个是检索任务

tfrs.tasks.Retrieval(
    metrics=tfrs.metrics.FactorizedTopK(
        candidates=movies.batch(128)
    )
)

与以前一样,此任务的目标是预测用户将观看或不会观看哪些电影。

整合在一起

我们将所有内容整合到一个模型类中。

这里的新组件是 - 由于我们有两个任务和两个损失函数 - 我们需要决定每个损失函数的重要性。我们可以通过为每个损失函数分配一个权重来做到这一点,并将这些权重视为超参数。如果我们为评分任务分配一个较大的损失权重,我们的模型将专注于预测评分(但仍然使用来自检索任务的一些信息);如果我们为检索任务分配一个较大的损失权重,它将专注于检索。

class MovielensModel(tfrs.models.Model):

  def __init__(self, rating_weight: float, retrieval_weight: float) -> None:
    # We take the loss weights in the constructor: this allows us to instantiate
    # several model objects with different loss weights.

    super().__init__()

    embedding_dimension = 32

    # User and movie models.
    self.movie_model: tf.keras.layers.Layer = tf.keras.Sequential([
      tf.keras.layers.StringLookup(
        vocabulary=unique_movie_titles, mask_token=None),
      tf.keras.layers.Embedding(len(unique_movie_titles) + 1, embedding_dimension)
    ])
    self.user_model: tf.keras.layers.Layer = tf.keras.Sequential([
      tf.keras.layers.StringLookup(
        vocabulary=unique_user_ids, mask_token=None),
      tf.keras.layers.Embedding(len(unique_user_ids) + 1, embedding_dimension)
    ])

    # A small model to take in user and movie embeddings and predict ratings.
    # We can make this as complicated as we want as long as we output a scalar
    # as our prediction.
    self.rating_model = tf.keras.Sequential([
        tf.keras.layers.Dense(256, activation="relu"),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(1),
    ])

    # The tasks.
    self.rating_task: tf.keras.layers.Layer = tfrs.tasks.Ranking(
        loss=tf.keras.losses.MeanSquaredError(),
        metrics=[tf.keras.metrics.RootMeanSquaredError()],
    )
    self.retrieval_task: tf.keras.layers.Layer = tfrs.tasks.Retrieval(
        metrics=tfrs.metrics.FactorizedTopK(
            candidates=movies.batch(128).map(self.movie_model)
        )
    )

    # The loss weights.
    self.rating_weight = rating_weight
    self.retrieval_weight = retrieval_weight

  def call(self, features: Dict[Text, tf.Tensor]) -> tf.Tensor:
    # We pick out the user features and pass them into the user model.
    user_embeddings = self.user_model(features["user_id"])
    # And pick out the movie features and pass them into the movie model.
    movie_embeddings = self.movie_model(features["movie_title"])

    return (
        user_embeddings,
        movie_embeddings,
        # We apply the multi-layered rating model to a concatentation of
        # user and movie embeddings.
        self.rating_model(
            tf.concat([user_embeddings, movie_embeddings], axis=1)
        ),
    )

  def compute_loss(self, features: Dict[Text, tf.Tensor], training=False) -> tf.Tensor:

    ratings = features.pop("user_rating")

    user_embeddings, movie_embeddings, rating_predictions = self(features)

    # We compute the loss for each task.
    rating_loss = self.rating_task(
        labels=ratings,
        predictions=rating_predictions,
    )
    retrieval_loss = self.retrieval_task(user_embeddings, movie_embeddings)

    # And combine them using the loss weights.
    return (self.rating_weight * rating_loss
            + self.retrieval_weight * retrieval_loss)

评分专用模型

根据我们分配的权重,模型将编码不同程度的任务平衡。让我们从一个只考虑评分的模型开始。

model = MovielensModel(rating_weight=1.0, retrieval_weight=0.0)
model.compile(optimizer=tf.keras.optimizers.Adagrad(0.1))
cached_train = train.shuffle(100_000).batch(8192).cache()
cached_test = test.batch(4096).cache()
model.fit(cached_train, epochs=3)
metrics = model.evaluate(cached_test, return_dict=True)

print(f"Retrieval top-100 accuracy: {metrics['factorized_top_k/top_100_categorical_accuracy']:.3f}.")
print(f"Ranking RMSE: {metrics['root_mean_squared_error']:.3f}.")
Epoch 1/3
10/10 [==============================] - 7s 319ms/step - root_mean_squared_error: 2.2354 - factorized_top_k/top_1_categorical_accuracy: 3.3750e-04 - factorized_top_k/top_5_categorical_accuracy: 0.0026 - factorized_top_k/top_10_categorical_accuracy: 0.0060 - factorized_top_k/top_50_categorical_accuracy: 0.0305 - factorized_top_k/top_100_categorical_accuracy: 0.0599 - loss: 4.5809 - regularization_loss: 0.0000e+00 - total_loss: 4.5809
Epoch 2/3
10/10 [==============================] - 3s 319ms/step - root_mean_squared_error: 1.1220 - factorized_top_k/top_1_categorical_accuracy: 2.6250e-04 - factorized_top_k/top_5_categorical_accuracy: 0.0025 - factorized_top_k/top_10_categorical_accuracy: 0.0056 - factorized_top_k/top_50_categorical_accuracy: 0.0304 - factorized_top_k/top_100_categorical_accuracy: 0.0601 - loss: 1.2614 - regularization_loss: 0.0000e+00 - total_loss: 1.2614
Epoch 3/3
10/10 [==============================] - 3s 315ms/step - root_mean_squared_error: 1.1170 - factorized_top_k/top_1_categorical_accuracy: 2.6250e-04 - factorized_top_k/top_5_categorical_accuracy: 0.0024 - factorized_top_k/top_10_categorical_accuracy: 0.0057 - factorized_top_k/top_50_categorical_accuracy: 0.0304 - factorized_top_k/top_100_categorical_accuracy: 0.0605 - loss: 1.2500 - regularization_loss: 0.0000e+00 - total_loss: 1.2500
5/5 [==============================] - 3s 185ms/step - root_mean_squared_error: 1.1125 - factorized_top_k/top_1_categorical_accuracy: 5.0000e-04 - factorized_top_k/top_5_categorical_accuracy: 0.0034 - factorized_top_k/top_10_categorical_accuracy: 0.0065 - factorized_top_k/top_50_categorical_accuracy: 0.0309 - factorized_top_k/top_100_categorical_accuracy: 0.0599 - loss: 1.2326 - regularization_loss: 0.0000e+00 - total_loss: 1.2326
Retrieval top-100 accuracy: 0.060.
Ranking RMSE: 1.113.

该模型在预测评分方面表现良好(RMSE 约为 1.11),但在预测哪些电影将被观看或不被观看方面表现不佳:其在 100 处的准确率几乎是仅训练用于预测观看行为的模型的 4 倍。

检索专用模型

现在让我们尝试一个只关注检索的模型。

model = MovielensModel(rating_weight=0.0, retrieval_weight=1.0)
model.compile(optimizer=tf.keras.optimizers.Adagrad(0.1))
model.fit(cached_train, epochs=3)
metrics = model.evaluate(cached_test, return_dict=True)

print(f"Retrieval top-100 accuracy: {metrics['factorized_top_k/top_100_categorical_accuracy']:.3f}.")
print(f"Ranking RMSE: {metrics['root_mean_squared_error']:.3f}.")
Epoch 1/3
10/10 [==============================] - 4s 309ms/step - root_mean_squared_error: 3.6972 - factorized_top_k/top_1_categorical_accuracy: 5.8750e-04 - factorized_top_k/top_5_categorical_accuracy: 0.0056 - factorized_top_k/top_10_categorical_accuracy: 0.0131 - factorized_top_k/top_50_categorical_accuracy: 0.0751 - factorized_top_k/top_100_categorical_accuracy: 0.1483 - loss: 69829.1612 - regularization_loss: 0.0000e+00 - total_loss: 69829.1612
Epoch 2/3
10/10 [==============================] - 3s 301ms/step - root_mean_squared_error: 3.6905 - factorized_top_k/top_1_categorical_accuracy: 0.0010 - factorized_top_k/top_5_categorical_accuracy: 0.0118 - factorized_top_k/top_10_categorical_accuracy: 0.0272 - factorized_top_k/top_50_categorical_accuracy: 0.1425 - factorized_top_k/top_100_categorical_accuracy: 0.2634 - loss: 67466.0661 - regularization_loss: 0.0000e+00 - total_loss: 67466.0661
Epoch 3/3
10/10 [==============================] - 3s 300ms/step - root_mean_squared_error: 3.6877 - factorized_top_k/top_1_categorical_accuracy: 0.0016 - factorized_top_k/top_5_categorical_accuracy: 0.0183 - factorized_top_k/top_10_categorical_accuracy: 0.0391 - factorized_top_k/top_50_categorical_accuracy: 0.1782 - factorized_top_k/top_100_categorical_accuracy: 0.3048 - loss: 66294.5128 - regularization_loss: 0.0000e+00 - total_loss: 66294.5128
5/5 [==============================] - 1s 188ms/step - root_mean_squared_error: 3.6884 - factorized_top_k/top_1_categorical_accuracy: 9.5000e-04 - factorized_top_k/top_5_categorical_accuracy: 0.0093 - factorized_top_k/top_10_categorical_accuracy: 0.0203 - factorized_top_k/top_50_categorical_accuracy: 0.1199 - factorized_top_k/top_100_categorical_accuracy: 0.2330 - loss: 31092.1455 - regularization_loss: 0.0000e+00 - total_loss: 31092.1455
Retrieval top-100 accuracy: 0.233.
Ranking RMSE: 3.688.

我们得到了相反的结果:一个在检索方面表现良好,但在预测评分方面表现不佳的模型。

联合模型

现在让我们训练一个为两个任务分配正权重的模型。

model = MovielensModel(rating_weight=1.0, retrieval_weight=1.0)
model.compile(optimizer=tf.keras.optimizers.Adagrad(0.1))
model.fit(cached_train, epochs=3)
metrics = model.evaluate(cached_test, return_dict=True)

print(f"Retrieval top-100 accuracy: {metrics['factorized_top_k/top_100_categorical_accuracy']:.3f}.")
print(f"Ranking RMSE: {metrics['root_mean_squared_error']:.3f}.")
Epoch 1/3
10/10 [==============================] - 4s 309ms/step - root_mean_squared_error: 2.0230 - factorized_top_k/top_1_categorical_accuracy: 5.8750e-04 - factorized_top_k/top_5_categorical_accuracy: 0.0051 - factorized_top_k/top_10_categorical_accuracy: 0.0123 - factorized_top_k/top_50_categorical_accuracy: 0.0768 - factorized_top_k/top_100_categorical_accuracy: 0.1509 - loss: 69787.3722 - regularization_loss: 0.0000e+00 - total_loss: 69787.3722
Epoch 2/3
10/10 [==============================] - 3s 312ms/step - root_mean_squared_error: 1.3647 - factorized_top_k/top_1_categorical_accuracy: 0.0012 - factorized_top_k/top_5_categorical_accuracy: 0.0120 - factorized_top_k/top_10_categorical_accuracy: 0.0275 - factorized_top_k/top_50_categorical_accuracy: 0.1438 - factorized_top_k/top_100_categorical_accuracy: 0.2642 - loss: 67453.3125 - regularization_loss: 0.0000e+00 - total_loss: 67453.3125
Epoch 3/3
10/10 [==============================] - 3s 309ms/step - root_mean_squared_error: 1.1934 - factorized_top_k/top_1_categorical_accuracy: 0.0016 - factorized_top_k/top_5_categorical_accuracy: 0.0190 - factorized_top_k/top_10_categorical_accuracy: 0.0394 - factorized_top_k/top_50_categorical_accuracy: 0.1771 - factorized_top_k/top_100_categorical_accuracy: 0.3037 - loss: 66299.1676 - regularization_loss: 0.0000e+00 - total_loss: 66299.1676
5/5 [==============================] - 1s 190ms/step - root_mean_squared_error: 1.1100 - factorized_top_k/top_1_categorical_accuracy: 9.5000e-04 - factorized_top_k/top_5_categorical_accuracy: 0.0086 - factorized_top_k/top_10_categorical_accuracy: 0.0210 - factorized_top_k/top_50_categorical_accuracy: 0.1237 - factorized_top_k/top_100_categorical_accuracy: 0.2349 - loss: 31075.5518 - regularization_loss: 0.0000e+00 - total_loss: 31075.5518
Retrieval top-100 accuracy: 0.235.
Ranking RMSE: 1.110.

结果是一个在两个任务上的表现都与每个专用模型大致相同的模型。

进行预测

我们可以使用训练好的多任务模型来获取训练好的用户和电影嵌入,以及预测的评分

trained_movie_embeddings, trained_user_embeddings, predicted_rating = model({
      "user_id": np.array(["42"]),
      "movie_title": np.array(["Dances with Wolves (1990)"])
  })
print("Predicted rating:")
print(predicted_rating)
Predicted rating:
tf.Tensor([[4.604047]], shape=(1, 1), dtype=float32)

虽然这里的结果没有显示出联合模型在这种情况下具有明显的准确性优势,但多任务学习通常是一个非常有用的工具。当我们能够将知识从数据丰富的任务(例如,点击)迁移到密切相关的、数据稀疏的任务(例如,购买)时,我们可以预期获得更好的结果。