Would You like Your Data to be Trained?
A User Controllable Recommendation Framework

Lei Wang, Xu Chen, Zhenhua Dong, Quanyu Dai  

1. Abstract

Recommender systems have been deployed in a large number of real-world applications, profoundly influencing people's daily life and production. Traditional recommender models collect as much user information as possible to accurately estimate user preferences. However, in real-world scenarios, users may not want all of their behaviors to be used in the model training process. For example, a user may want to actively edit her profile by removing the items that were incorrectly clicked or were purchased for other people. In this paper, we study a novel recommendation paradigm, where users are allowed to indicate their "willingness" to let different data train the model, and the models are optimized to maximize a utility that trades off the recommendation performance against the violation of the user "willingness". More specifically, we formulate the recommendation problem as a multiplayer game. Each user is a player, and the player's action is a selection vector representing whether the user would like her interacted items to be used to train the model. To solve this game efficiently, we design an influence function based model, which can approximate the recommendation performance under different actions without re-optimizing the model. In addition, we improve this model by deploying multiple anchor actions for the influence function, which is expected to improve the accuracy of the performance approximation. Finally, we theoretically analyze the convergence rate of our algorithm and demonstrate the superiority of introducing multiple anchor actions. We conduct extensive experiments on both simulated and real-world datasets to demonstrate the effectiveness of our models in balancing recommendation quality and user willingness.

2. Contributions

  • We propose a novel recommendation paradigm, where users can explicitly indicate their willingness to let different items be used to train the model.
  • To solve this problem, we formulate the recommendation task as a multiplayer game and design two influence function based models to solve the game efficiently.
  • We theoretically analyze our models, providing the convergence rate of the learning algorithm and demonstrating the superiority of multiple anchor selection vectors.
  • We conduct extensive experiments on both synthetic and real-world datasets to verify the effectiveness of our models.

3. Main Results

Table 1: Overall comparison between the baselines and our models.


4. Code and Datasets

4.1 Code [link: GitHub]

Table 2: Structure of the code files [main program].


Table 3: Structure of the code files [utils].


4.2 Datasets [link: Google Drive]

Table 4: Statistics of the datasets used in our experiments.


5. Usage

5.1. Download the code

Install the required packages according to requirements.txt (e.g., via pip install -r requirements.txt).


5.2. Prepare the datasets

(1) Directly download the processed datasets used in this paper:

Steam Diginetica Amazon

(2) Use your own datasets:
Ensure that your data is organized according to the format: user_id:token, item_id:token, timestamp:float.

(3) Rename the dataset file to dataset-name.inter and put it into the folder ./dataset/dataset-name/ (note: replace "dataset-name" with your own dataset name).
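The steps above can be sanity-checked with a small script. The helper below is illustrative and not part of the released code; it assumes tab-separated columns with a "field:type" header (a common atomic-file convention), so adjust the delimiter if your files use a different separator.

```python
import csv
import os
import tempfile

# Expected header for a dataset-name.inter file, per the format above.
HEADER = ["user_id:token", "item_id:token", "timestamp:float"]

def write_inter(path, rows):
    # Write a minimal .inter file: header line, then one interaction per row.
    with open(path, "w", newline="") as f:
        writer = csv.writer(f, delimiter="\t")
        writer.writerow(HEADER)
        writer.writerows(rows)

def check_inter(path):
    # Verify the header, the column count, and that timestamps parse as floats.
    with open(path) as f:
        reader = csv.reader(f, delimiter="\t")
        if next(reader) != HEADER:
            raise ValueError("unexpected header")
        for row in reader:
            if len(row) != len(HEADER):
                raise ValueError("wrong number of columns")
            float(row[2])  # timestamp must be a float
    return True

# Lay out ./dataset/dataset-name/dataset-name.inter (here under a temp dir).
folder = os.path.join(tempfile.mkdtemp(), "dataset", "dataset-name")
os.makedirs(folder, exist_ok=True)
path = os.path.join(folder, "dataset-name.inter")
write_inter(path, [("u1", "i10", "1546300800.0"), ("u1", "i42", "1546387200.0")])
ok = check_inter(path)
```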

5.3. Run our framework

Run main.py to train our model; the training parameters can be specified through the config file.
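As an illustration, a training configuration might look like the sketch below. The key names here are hypothetical (only lambda, M, L, and T are parameters described on this page), so consult the config files shipped with the repository for the actual schema.

```python
# Hypothetical config sketch -- the real schema lives in the repo's config files.
config = {
    "dataset": "dataset-name",
    "lambda": 0.5,  # trade-off between recommendation quality and user willingness
    "M": 5,         # number of times the experiment is repeated
    "L": 100,       # iterations of online mirror descent
    "T": 4,         # number of anchor selection vectors
}
```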

6. Detailed parameter search ranges


Here, lambda is the parameter that balances the recommendation quality and the user willingness, M is the number of times we repeat the experiment, L is the number of iterations of online mirror descent, and T is the number of anchors.
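For intuition on the L parameter, the snippet below sketches online mirror descent with the entropic mirror map (exponentiated gradient) on the probability simplex; this is a generic textbook update for illustration, not the repository's implementation.

```python
import numpy as np

def omd_simplex(grad_fn, d, L=100, eta=0.1):
    # Online mirror descent with the entropic mirror map: each of the L
    # iterations takes a multiplicative-weights step and renormalizes.
    x = np.full(d, 1.0 / d)          # start from the uniform point
    for _ in range(L):
        g = grad_fn(x)
        x = x * np.exp(-eta * g)     # exponentiated-gradient update
        x = x / x.sum()              # stay on the probability simplex
    return x

# Sanity check: minimizing a linear loss <c, x> over the simplex should
# concentrate the mass on the smallest coordinate of c (index 1 here).
c = np.array([0.3, 0.1, 0.9])
x_star = omd_simplex(lambda x: c, d=3, L=500, eta=0.5)
```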

7. Runtime Environment

  • System: Linux dell-PowerEdge-R730

  • CPU: Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz

  • CPU Memory: 16G

  • GPU: NVIDIA Corporation GV100 [TITAN V] (rev a1)

  • GPU Memory: 45G

  • PyTorch: 1.7.0

  • CUDA: 11.6