Thanh-Thien Le, Viet Dao, Linh Van Nguyen, Thi-Nhung Nguyen, Linh Ngo Van, Thien Huu Nguyen
Cold-start Recommendation by Personalized Embedding Region Elicitation
1. Introduction
Rating elicitation is a key element for recommender systems to perform well at cold-start, in which the system needs to recommend items to a newly arrived user with no prior knowledge about the user’s preferences. Existing elicitation methods employ a fixed set of items to learn the user’s preferences and then infer the user’s preferences on the remaining items. Using a fixed seed set can limit the performance of the recommender system, since the seed set is unlikely to be optimal for all new users with potentially diverse preferences. This paper addresses this challenge using a two-phase, personalized elicitation scheme. First, the elicitation scheme asks the user to rate a small set of popular items in a “burn-in” phase. Second, it sequentially asks the user to rate adaptively chosen items to refine the preference and the user’s representation. Throughout the process, the system represents the user’s embedding value not by a point estimate but by a region estimate. The value of information obtained by asking for the user’s rating on an item is quantified by the distance from the center of the region in the embedding space that contains, with high confidence, the true embedding value of the user. Finally, the recommendations are successively generated by considering the preference region of the user. We show that each subproblem in the elicitation scheme can be efficiently implemented. Further, we empirically demonstrate the effectiveness of the proposed method against existing rating-elicitation methods on several prominent datasets.
2. Method
Figure 1. When a new user arrives, we use a determinantal point process to query a diverse set of items from the items list to construct the burn-in questionnaire. Subsequently, we use a sequential question-answering procedure to refine the embedding region of the user’s preferences. The recommendation is made using the Chebyshev center of the embedding region, which is consistent with the user’s stated preferences.
This section presents our proposed solution package comprising two distinct phases: a burn-in questionnaire and a sequential and adaptive Q&A process. Additionally, we provide a recommendation module based on the Chebyshev center of the region, which is designed specifically for the recommendation task. As there is no prior information about the user’s preferences, we implement a burn-in phase using a determinantal point process (DPP) to generate a short, static questionnaire for each new user. The DPP balances two criteria: diversity and popularity.
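To make the diversity and popularity trade-off concrete, the following is a minimal sketch of a burn-in selection step. It is not the authors' implementation: it assumes a quality-diversity kernel of the form L[i][j] = q_i * cos(e_i, e_j) * q_j, where q_i is a popularity score and e_i an item embedding, and it replaces DPP sampling with a simple greedy determinant maximization. The function names `build_kernel` and `greedy_dpp` are hypothetical.

```python
import math


def det(matrix):
    """Determinant via Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    result = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result


def build_kernel(embeddings, popularity):
    """Assumed kernel: L[i][j] = q_i * cosine(e_i, e_j) * q_j."""
    def cosine(u, v):
        dot = sum(x * y for x, y in zip(u, v))
        norm_u = math.sqrt(sum(x * x for x in u))
        norm_v = math.sqrt(sum(y * y for y in v))
        return dot / (norm_u * norm_v)

    n = len(embeddings)
    return [
        [popularity[i] * cosine(embeddings[i], embeddings[j]) * popularity[j]
         for j in range(n)]
        for i in range(n)
    ]


def greedy_dpp(kernel, k):
    """Greedily add the item that maximizes the determinant of the
    kernel restricted to the selected items (a stand-in for DPP sampling)."""
    selected = []
    for _ in range(k):
        best_item, best_det = None, 0.0
        for i in range(len(kernel)):
            if i in selected:
                continue
            idx = selected + [i]
            sub = [[kernel[r][c] for c in idx] for r in idx]
            d = det(sub)
            if d > best_det:
                best_item, best_det = i, d
        if best_item is None:  # no remaining item adds volume
            break
        selected.append(best_item)
    return selected


# Toy data (made up): items 0 and 1 are near-duplicates, item 2 is different.
embeddings = [(1.0, 0.0), (0.99, 0.1), (0.0, 1.0)]
popularity = [1.0, 0.9, 0.8]
questionnaire = greedy_dpp(build_kernel(embeddings, popularity), k=2)
```

In this toy example, the second pick skips the more popular item 1 because it is nearly identical to item 0, which is the behavior a diversity-aware questionnaire is meant to have.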
The adaptive Q&A process facilitates the sequential elicitation of user preferences. We assume this phase lasts a fixed number of rounds; in each round, we select items and ask the user for feedback on them. While the user’s true embedding vector is not available to the system, we can characterize the plausible values of the user’s embedding from the user’s feedback. By pairing positively rated items with negatively rated items, we form pairwise preferences and refine the plausible embedding region. This iterative elicitation therefore allows us to increase the accuracy of the preference approximation.
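As an illustration of how a pairwise preference can restrict the plausible embeddings, the sketch below turns each (positive, negative) pair into a linear constraint. It assumes, as an illustrative modeling choice rather than a detail confirmed by the text above, that the user's embedding u is at least as close in Euclidean distance to a liked item p as to a disliked item n. Expanding ||u - p||² ≤ ||u - n||² gives the halfspace 2(n - p)·u ≤ ||n||² - ||p||². The names `preference_cuts` and `is_plausible` are hypothetical.

```python
from itertools import product


def preference_cuts(positives, negatives):
    """Return one halfspace (a, b), meaning a . u <= b, per (positive, negative) pair.

    Assumption: the user's embedding is no farther from a liked item
    than from a disliked item in Euclidean distance.
    """
    cuts = []
    for p, n in product(positives, negatives):
        a = [2.0 * (ni - pi) for pi, ni in zip(p, n)]
        b = sum(ni * ni for ni in n) - sum(pi * pi for pi in p)
        cuts.append((a, b))
    return cuts


def is_plausible(u, cuts, tol=1e-9):
    """Check whether a candidate user embedding satisfies every cut."""
    return all(
        sum(ai * ui for ai, ui in zip(a, u)) <= b + tol
        for a, b in cuts
    )


# Toy data (made up): one liked item and two disliked items in 2D.
cuts = preference_cuts([(1.0, 0.0)], [(-1.0, 0.0), (0.0, -1.0)])
```

Each new answer adds cuts, so the set of embeddings passing `is_plausible` can only shrink or stay the same as the questionnaire proceeds.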
Item recommendation. At any time, our system keeps track of three sets of items, including the positively rated items and the negatively rated items. We generate all valid pairwise preferences by coupling items from the positive and negative sets. Each preference pair delineates a distinct cut in the embedding space, effectively narrowing down the region that contains the new user embedding. To generate item recommendations, we calculate the Euclidean distance from every unqueried item to the aggregated center and recommend the top-ranked items nearest to this center.
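The ranking step described above can be sketched as follows. This is a minimal illustration that takes the region center as given; the function name `recommend`, the dictionary layout, and the toy data are assumptions made for the example.

```python
import math


def recommend(center, item_embeddings, queried, top_k):
    """Rank unqueried items by Euclidean distance to the region center.

    item_embeddings maps item id -> embedding tuple; queried is the set
    of item ids the user has already been asked about.
    """
    candidates = [
        (math.dist(center, emb), item_id)
        for item_id, emb in item_embeddings.items()
        if item_id not in queried
    ]
    candidates.sort()  # nearest first; ties broken by item id
    return [item_id for _, item_id in candidates[:top_k]]


# Toy data (made up): item 1 was already queried, so it is excluded.
items = {1: (0.0, 0.0), 2: (1.0, 1.0), 3: (0.2, 0.1), 4: (5.0, 5.0)}
top_items = recommend((0.0, 0.0), items, queried={1}, top_k=2)
```

Excluding queried items keeps the recommendations from repeating items the user has already rated during elicitation.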
Figure 2. Illustration of our method on a 2D toy example. Recall that a cut in the embedding space is created by pairing a positive item with a negative item. At the initial time step, when no questions have been asked, there are no cuts in the embedding space. At the next time step, we ask the user to rate items 1, 2, and 4, and the user specifies ‘dislike’, ‘like’, and ‘dislike’, respectively. This introduces two cuts in the space, and the initial Chebyshev center is calculated. At the following time step, we ask the user to rate item 5, and the user marks it as a disliked item. As a result, a final cut is constructed by pairing item 2 with item 5. This process concludes with the finalization of the region.
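For reference, the Chebyshev center of a region described by linear cuts has a standard linear-programming formulation. The notation below is an assumption introduced for illustration: the region is written as the set of embeddings u satisfying a_i^T u ≤ b_i for i = 1, ..., m, where each (a_i, b_i) is one cut. The Chebyshev center c is the center of the largest Euclidean ball of radius r that fits inside the region:

```latex
\begin{aligned}
\max_{c \in \mathbb{R}^d,\; r \ge 0} \quad & r \\
\text{s.t.} \quad & a_i^\top c + r \,\lVert a_i \rVert_2 \le b_i, \qquad i = 1, \dots, m.
\end{aligned}
```

When there are few or no cuts, the region may be unbounded, so a practical version would presumably add bounding constraints on c (for example, a box around the item embeddings); that detail is not specified here.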
3. Experiments
3.1. Real-World Experiment
Table 1. Real-user experiments on three datasets.
3.2. Questionnaire Size Analysis
Figure 3. Performance improvements with the dynamic questionnaire size on Amazon-Books and Gowalla datasets.
4. Conclusion
In this paper, we have addressed the problem of cold-start recommendation by proposing a personalized elicitation scheme consisting of two phases. After a short “burn-in” phase, we employ an adaptive elicitation approach in which users are sequentially prompted to rate items that refine their preferences and user representation. Throughout the process, the system represents the user’s preferences as a region estimate rather than a single point, capturing the uncertainty in their preferences. The value of information gained from user ratings is quantified by considering the distance from the center of the region that contains the true embedding value with high confidence. Recommendations are generated by considering the user’s preference region. We have shown that each subproblem in the elicitation scheme can be solved efficiently, and we have conducted empirical evaluations on prominent datasets to showcase the effectiveness of our proposed method compared to existing rating-elicitation approaches.
Overall
Hieu Trung Nguyen*, Duy Nguyen*, Khoa D. Doan, Viet Anh Nguyen