15:00 CEST - Workshop opening
Abstract: Recommender Systems research often focuses on static datasets, with static models and static evaluation scenarios. Practical applications, however, often dictate highly dynamic environments for the recommender to perform in. In this talk, I will present some recent advances that aim to bridge this gap. The first part of this talk will focus on online learning, and how existing state-of-the-art static approaches to recommendation can be adapted to thrive in online settings. Then, I will question some of the core assumptions of the "next-item prediction" paradigm that the field has focused on, and motivate the "bandit learning" paradigm as an alternative that allows us to keep a clear focus on online objectives. In order to achieve true online success, we additionally need to consider the impact of our recommendations and biases that arise from feedback loops.
16:00 - 16:10 CEST - Break
16:10 CEST - Invited talk: NFC: a Deep and Hybrid Item-based Model for Item Cold-Start Recommendation
Cesare Bernardis (speaker) and Paolo Cremonesi
Abstract: In this talk, we present Neural Feature Combiner (NFC), a new item-based, deep learning model for item cold-start recommendation. The model learns to generate a hybrid similarity matrix, taking as input only the content representations of the items. Its architecture, designed to tackle the dynamic nature of the cold-start problem, is composed of two main components. The first maps content features into a low-dimensional embedding space. The second combines the features that compose the embeddings in order to compute the similarity values. The model is trained end-to-end using collaborative similarities as target values. In the talk, we show the results of our experiments, demonstrating that learning from collaborative similarities has several advantages over learning from user-item interactions. We provide empirical evidence that NFC outperforms the state-of-the-art for item cold-start recommendation in multiple scenarios, and we argue that its effectiveness stems from how it exploits collaborative information. Finally, we present a qualitative analysis of the embeddings generated by NFC, showing its ability to provide robust latent representations.
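The two-component idea described above can be sketched in a few lines. The following is a minimal, illustrative forward pass only: the shapes, the ReLU embedding layer, the weighted inner-product combiner, and the random stand-in data are all assumptions, not the actual NFC architecture.

```python
import numpy as np

rng = np.random.default_rng(0)
n_items, n_features, d = 6, 10, 4

# Content representations of items (e.g. TF-IDF of descriptions); random here.
X = rng.random((n_items, n_features))

# Component 1: map content features into a low-dimensional embedding space.
W1 = rng.normal(scale=0.1, size=(n_features, d))
E = np.maximum(X @ W1, 0.0)            # ReLU embeddings, shape (n_items, d)

# Component 2: combine the embedding features into similarity values.
# A simple learnable combination: a weighted inner product between item pairs.
w2 = rng.normal(scale=0.1, size=d)
S_pred = (E * w2) @ E.T                # hybrid similarity matrix (n_items, n_items)

# Training target: a collaborative similarity matrix (e.g. from item-kNN);
# a random symmetric stand-in here.
S_collab = rng.random((n_items, n_items))
S_collab = (S_collab + S_collab.T) / 2

loss = np.mean((S_pred - S_collab) ** 2)   # end-to-end MSE objective
```

Because the similarity matrix is computed from content alone, a newly added cold item only needs its content vector pushed through the same network to obtain similarities to all existing items.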
16:35 CEST - Efficiently Maintaining Next Basket Recommendations under Additions and Deletions of Baskets and Items (download paper)
Benjamin Longxiang Wang (speaker) and Sebastian Schelter
Recommender systems play an important role in helping people find information and make decisions in today's increasingly digitalized societies. However, the wide adoption of such machine learning applications also raises concerns in terms of data privacy. These concerns are addressed by the recent "General Data Protection Regulation" (GDPR) in Europe, which requires companies to delete personal user data upon request, when users enforce their "right to be forgotten". Many researchers argue that this deletion obligation does not only apply to the data stored in primary data stores such as relational databases, but also requires an update of machine learning models whose training set included the personal data to delete. As a consequence, the academic community has started to investigate how to unlearn user data from trained machine learning models in an efficient and timely manner.
We explore this direction in the context of a sequential recommendation task called Next Basket Recommendation (NBR), where the goal is to recommend a set of items based on a user's purchase history. We design efficient algorithms for incrementally and decrementally updating a state-of-the-art next basket recommendation model in response to additions and deletions of user baskets and items. Furthermore, we discuss an efficient, data-parallel implementation of our method in the Spark Structured Streaming system.
We evaluate our implementation on a variety of real-world datasets, where we investigate the impact of our update techniques on several ranking metrics and measure the time to perform model updates. Our results show that our method provides constant update time efficiency with respect to an additional user basket in the incremental case, and linear efficiency in the decremental case where we delete existing baskets. With modest computational resources, we are able to update models with a latency of around 0.2 milliseconds regardless of the history size in the incremental case, and less than one millisecond in the decremental case.
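The asymmetry between the two cases can be illustrated with a toy model. The sketch below is not the paper's algorithm: it uses a simple per-user item-frequency model (class name, data structures, and ranking rule are all made up) to show why adding a basket can be constant-time in the history size while deleting one forces a linear rebuild.

```python
from collections import defaultdict

class NextBasketModel:
    """Toy frequency-based next-basket model (illustrative only)."""

    def __init__(self):
        self.history = defaultdict(list)                      # user -> list of baskets
        self.counts = defaultdict(lambda: defaultdict(int))   # user -> item -> count

    def add_basket(self, user, basket):
        # Incremental update: O(|basket|), independent of history size.
        self.history[user].append(list(basket))
        for item in basket:
            self.counts[user][item] += 1

    def delete_basket(self, user, index):
        # Decremental update: remove one basket, then rebuild the counts
        # from the remaining history -- linear in the history size.
        del self.history[user][index]
        self.counts[user] = defaultdict(int)
        for basket in self.history[user]:
            for item in basket:
                self.counts[user][item] += 1

    def recommend(self, user, k=3):
        # Rank items by descending count, breaking ties alphabetically.
        ranked = sorted(self.counts[user].items(), key=lambda kv: (-kv[1], kv[0]))
        return [item for item, _ in ranked[:k]]
```

For example, after `add_basket("u", ["milk", "bread"])` and `add_basket("u", ["milk", "eggs"])`, the top recommendation for `"u"` is `"milk"`; deleting the first basket drops `"bread"` from the counts entirely.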
16:55 - 17:20 CEST - Break
17:20 CEST - Invited talk: Rank-sensitive Proportional Aggregations in Dynamic Recommendation Scenarios
Štěpán Balcar, Vit Skrhak and Ladislav Peska (speaker)
In this talk, we focus on the problem of proportionality preservation in dynamic recommendation scenarios. Our starting point is the belief that different (e.g. collaborative vs. content-based) recommender systems (RS) may provide complementary views on the user's preferences or needs. By using only a single best-performing RS, we inherently lose the other viewpoints, which may lead to overly narrow recommendations and, in the long run, deteriorate user satisfaction.
Instead, we introduce the FuzzDA framework, which aims to provide an unbiased aggregation of individual RS under the constraints of dynamic recommendation scenarios. The framework consists of three main components: an aggregator, an iterative vote assignment strategy, and a negative implicit feedback incorporation strategy. The aggregator algorithm is based on D'Hondt's algorithm for mandate allocation (with several modifications) and aggregates the outputs of individual RS in a ranking-aware, proportionality-preserving manner w.r.t. the votes assigned to each RS. Vote assignment strategies observe the performance of individual RS (as well as several contextual features) and transform them into the assigned votes. Finally, negative implicit feedback strategies focus on short-term, user-specific discrimination on the item level. In the talk we further report on evaluations of the FuzzDA framework, whose variants were especially successful at maintaining very good trade-offs between iterative novelty and click-through rate, and performed well w.r.t. several diversity metrics.
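The D'Hondt-based allocation can be sketched as follows. This is a simplified illustration of the underlying idea, not FuzzDA itself: it omits the rank-sensitivity modifications, the vote assignment strategies, and the negative feedback handling, and all function and variable names are made up.

```python
def dhondt_aggregate(rankings, votes, k):
    """Fill k recommendation slots from several recommenders' ranked lists,
    proportionally to their votes, via D'Hondt quotients (simplified sketch)."""
    seats = {name: 0 for name in rankings}     # slots already won per recommender
    cursors = {name: 0 for name in rankings}   # next candidate in each ranked list
    result = []
    while len(result) < k:
        if all(cursors[n] >= len(rankings[n]) for n in rankings):
            break                              # every list is exhausted
        # D'Hondt quotient: votes / (slots already won + 1).
        name = max(rankings, key=lambda n: votes[n] / (seats[n] + 1))
        seats[name] += 1
        # Take that recommender's best not-yet-selected item, if any remain.
        while (cursors[name] < len(rankings[name])
               and rankings[name][cursors[name]] in result):
            cursors[name] += 1
        if cursors[name] < len(rankings[name]):
            result.append(rankings[name][cursors[name]])
            cursors[name] += 1
    return result
```

With two recommenders holding votes 2 and 1, the first receives roughly two of every three slots, so the final list reflects both viewpoints in proportion to their assigned votes.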
Abstract: In this work, we release a large and novel dataset of learners engaging with educational videos in the wild. The dataset, named Personalised Educational Engagement with Knowledge Topics (PEEK), is one of the first publicly available datasets that address personalised educational engagement. Educational recommenders have received much less attention than e-commerce and entertainment-related recommenders, even though efficient personalised learning systems could improve learning gains significantly. One of the main challenges in advancing this research direction is the scarcity of large, publicly available datasets. In the PEEK dataset, educational video lectures have been associated with Wikipedia concepts related to the material of the lecture, thus providing a humanly intuitive taxonomy. We believe that granular learner engagement signals, in unison with rich content representations, will pave the way to building powerful personalisation algorithms that will revolutionise educational and informational recommendation systems. Towards this goal, we 1) construct a novel dataset from a popular video lecture repository, 2) identify a set of benchmark algorithms to model engagement, and 3) run extensive experimentation on the PEEK dataset to demonstrate its value. Our experiments with the dataset show promise in building powerful informational recommender systems. The dataset and the supporting code are available at https://github.com/sahanbull/PEEK-Dataset.
18:05 - 18:15 CEST - Break
In modern digital applications, personalized user experience is shaped interactively. In a typical scenario, a recommender system estimates user preferences based on their previous choices. However, these choices are made from only a subset of the available items, selected by the system in line with its current estimates of the user's preferences. In other words, users and the system interact in a feedback loop: the system learns to make future recommendations based on user choices from the alternatives it previously recommended.
In this talk, we describe a Bayesian choice model, the Dirichlet-Luce model, where we assume that choice observations comply with Luce's choice axiom, i.e., the users choose (or tend to choose) from a subset of all items recommended to them. Furthermore, the model is built on a generalization of the Dirichlet distribution as a prior probability distribution over user preferences (the joint distribution of choice probabilities), conjugate to the likelihood induced by the choice observations. We illustrate that the model achieves efficient inference of user preferences, based on the observation that the number of distinct presentations (subsets of the set of all items) that suffices for a "good" preference estimate scales with the number of all items.
The Bayesian construction of the Dirichlet-Luce model leads to a bandit algorithm---based on Thompson sampling---for online learning to recommend. The algorithm achieves low regret, measured in terms of the inherent attractiveness of the items included in the recommendations, compared to several dueling bandits algorithms, a combinatorial bandit algorithm with relative feedback, and a state-of-the-art online learning-to-rank algorithm.
The Dirichlet-Luce model ensures independence of unexplored items. That is, posterior probabilities of preferences for items that were never shown to the users (or are newly introduced to the system) remain invariant regardless of other observed choices. The combination of the Dirichlet-Luce model and the proposed bandit algorithm also eliminates some biases that recommender systems are prone to, such as overestimating user preference for promoted or initially preferred items due to overexposure, or underestimating preference for items underrepresented in the recommendations. As a result, we believe the model has the potential to be reused as a fundamental building block for recommender systems.
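The interaction loop behind the Thompson-sampling bandit can be sketched as follows. Note this sketch substitutes a plain Dirichlet prior for the paper's generalized Dirichlet-Luce prior (and therefore does not provide the unexplored-item guarantees described above); the item count, preference vector, and round count are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(42)

n_items, k, rounds = 5, 2, 2000
true_pref = np.array([0.05, 0.10, 0.15, 0.30, 0.40])  # hidden item attractiveness
alpha = np.ones(n_items)   # parameters of a plain Dirichlet posterior

for _ in range(rounds):
    # Thompson sampling: draw a preference vector from the posterior
    # and recommend the k items it ranks highest.
    theta = rng.dirichlet(alpha)
    shown = np.argsort(-theta)[:k]
    # Luce's choice axiom: the user chooses among the shown items
    # with probability proportional to their preferences.
    p = true_pref[shown] / true_pref[shown].sum()
    choice = rng.choice(shown, p=p)
    alpha[choice] += 1.0   # conjugate posterior count update
```

The feedback loop discussed in the talk is visible here: only items the system chooses to show can ever be chosen, which is exactly why the choice of prior and the invariance property for never-shown items matter.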
Recommender systems have been investigated for many years, with the aim of generating the most accurate recommendations possible. However, available data about new users is often insufficient, leading to inaccurate recommendations; an issue known as the cold-start problem. Active learning can offer a solution: active learning strategies proactively select items and ask users to rate them. This way, detailed user preferences can be acquired and, as a result, more accurate recommendations can be offered to the user.
In this study, we compare five active learning algorithms, combined with three different predictor algorithms, which estimate how much the user would like the items they are asked to rate. In addition, two modes of selecting the items are tested: batch mode (all items at once) and sequential mode (the items one by one). Evaluating the recommender in terms of rating prediction, decision support, and item ranking showed that sequential mode produces the most accurate recommendations for dense data sets. Differences between the active learning algorithms are small. For most active learners, the best predictor turned out to be FunkSVD in combination with sequential mode.
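For reference, FunkSVD is the classic SGD-trained matrix factorization predictor. The sketch below shows the core update rule on made-up toy ratings; the hyperparameters, data, and dimensions are illustrative assumptions and not the study's actual configuration.

```python
import numpy as np

rng = np.random.default_rng(7)

# Toy data: (user, item, rating) triples on a 1-5 scale.
ratings = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0),
           (1, 2, 1.0), (2, 1, 2.0), (2, 2, 5.0)]
n_users, n_items, d = 3, 3, 2

P = rng.normal(scale=0.1, size=(n_users, d))   # latent user factors
Q = rng.normal(scale=0.1, size=(n_items, d))   # latent item factors
lr, reg = 0.02, 0.02

# FunkSVD: SGD over the observed ratings only, minimizing squared
# prediction error with L2 regularization on the factors.
for _ in range(500):
    for u, i, r in ratings:
        err = r - P[u] @ Q[i]
        P[u] += lr * (err * Q[i] - reg * P[u])
        Q[i] += lr * (err * P[u] - reg * Q[i])

# Train RMSE after fitting; predictions for unrated items come from P[u] @ Q[i].
rmse = np.sqrt(np.mean([(r - P[u] @ Q[i]) ** 2 for u, i, r in ratings]))
```

In an active learning loop, such predicted ratings for not-yet-rated items are what the selection strategy uses to decide which item to ask the user about next.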