9:00 - Workshop opening
Abstract: Real-time recommendation of Twitter users based on the content of their profiles is a very challenging task. Traditional IR methods such as TF-IDF fail to handle large datasets efficiently. In this paper we present a scalable approach that allows real-time recommendation of users based on their tweets. Our model builds a graph of terms, driven by the fact that users sharing similar interests will share similar terms. We show how this model can be encoded as a compact binary footprint that allows very fast comparison and ranking, taking full advantage of modern CPU architectures. We validate our approach through an empirical evaluation against Apache Lucene’s implementation of TF-IDF. We show that our approach is on average two hundred times faster than a standard optimised implementation of TF-IDF, with a precision of 58%. The work presented here has been published in The Web Intelligence Journal, volume 14, number 1.
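The abstract does not describe how the binary footprint is built, so the following is only a minimal sketch of the general idea, under stated assumptions: each term is hashed to one bit of a fixed-width integer, and two footprints are compared by counting the bits they share. The function names (`footprint`, `similarity`, `rank`), the 64-bit width, and the use of MD5 as the hash are all hypothetical choices for illustration, not the authors' method.

```python
import hashlib


def footprint(terms, n_bits=64):
    """Build a hypothetical binary footprint: one bit set per hashed term."""
    fp = 0
    for term in terms:
        digest = hashlib.md5(term.encode("utf-8")).digest()
        bit = int.from_bytes(digest[:8], "big") % n_bits
        fp |= 1 << bit
    return fp


def similarity(a, b):
    """Count the bits set in both footprints (bitwise AND, then popcount)."""
    return bin(a & b).count("1")


def rank(query_fp, user_fps):
    """Return user ids ordered from most to least similar to the query."""
    return sorted(
        user_fps,
        key=lambda user: similarity(query_fp, user_fps[user]),
        reverse=True,
    )
```

Because the comparison reduces to one AND and one population count per pair, it maps onto single machine instructions on most modern CPUs, which is one plausible reading of the speed claim in the abstract.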
10:10 - Best contributions announcement
10:20 - 11:00 Coffee break
11:00 - Happiness is a Choice: Sentiment and Activity-Aware Location Recommendation (Best contribution runner-up)
Jia Wang, Yungang Feng, Elham Naghizadeh, Lida Rashidi, Kwan Hui Lim and Kate Lee
Abstract: Studying large, widely spread Twitter data has laid the foundation for many novel applications, from predicting natural disasters and epidemics to understanding urban dynamics. Recent studies have focused on exploring people's emotional response to their urban environment, e.g., green spaces versus built-up areas, through analysing the sentiment of tweets within that area. Since green spaces have the capacity to improve citizens' well-being, we developed a system that is capable of recommending green spaces to users. Our system is unique in the sense that the recommendations are tailored with regard to users' preferred activity as well as the degree of positive sentiment in each green space. We show that the incoming flow of tweets can be used to refine the recommendations over time. Furthermore, we implemented a web-based, user-friendly interface to solicit user inputs and display recommendation results.
Abstract: Recommender systems are powerful personalization tools which have seen widespread adoption across the Internet. However, it is recognized that by emphasizing personalization through the optimization of accuracy-driven metrics, the issue of over-personalization emerges, with negative effects on the user experience. An increasingly popular countermeasure to the problem is to diversify the recommendations, even at the cost of reducing the accuracy of the recommender system. In this paper, we investigate and develop a solution that addresses the problem in the context of a movie application domain. We present a user-centric conceptual framework to enhance diversity along four related dimensions, namely global coverage, local coverage, novelty, and redundancy. The proposed solution is designed to diversify users' profiles, modeled on categorical preferences, within the same group in the recommendation filtering. We evaluate our approach on the Movielens dataset and show that our algorithm yields better results compared to random selection of distant neighbors and performs comparably to one of the current state-of-the-art solutions.
Abstract: Crowdsourcing is an approach whereby employers call online for workers with different capabilities to process a task for monetary reward. With a vast number of tasks posted every day, satisfying the workers, employers, and service providers who are the stakeholders of any crowdsourcing system is critical to its success. To achieve this, the system should address three objectives: (1) match the worker with suitable tasks that fit the worker’s interests and skills and raise the worker’s rewards and rating, (2) give the employer more acceptable solutions with lower cost and time and raise the employer’s rating, and (3) raise the rate of accepted tasks, which will raise the aggregated commissions to the service provider and improve the average rating of the registered users (employers and workers) accordingly. For these objectives, we present a mechanism design that is capable of reaching holistic satisfaction using a multi-objective recommendation system. In contrast, all previous crowdsourcing recommendation systems are designed to address one stakeholder, who could be either the worker or the employer. Moreover, our unique contribution is to consider each stakeholder to be self-serving. Considering selfish behavior from every stakeholder, we provide a more qualified recommendation for each stakeholder.
Abstract: Modern search engines present result pages composed of the two most prominent types of information: sponsored and organic search results. The whole-page results must satisfy the user's information need, while displaying sponsored ads alongside the search results has become a key monetization strategy for the platform. Against this backdrop, a basic question has received comparatively little attention: how many ads are good enough to get higher user satisfaction and better monetization? Most search engines always display a fixed number of ads or use heuristic rules to determine the number of ads. In this paper, we formulate the task of finding the best number of ads as a linear programming optimization problem, which we solve with a novel online algorithm. We have conducted several offline experiments and tested our approach on the Alibaba e-commerce platform. The experimental results show that the platform could achieve higher revenue and more clicks simultaneously with the proposed algorithm.
12:20 - 13:40 Lunch break
Abstract: Recommender systems are designed to identify the items that a user will like or find useful based on the user’s prior preferences and activities. These systems have become ubiquitous and are an essential tool for information filtering and (e-)commerce. Over the years, collaborative filtering, which derives these recommendations by leveraging past activities of groups of users, has emerged as the most prominent approach for solving this problem. This talk will present some of our recent work towards improving the performance of collaborative filtering-based recommender systems and understanding some of their fundamental limitations and characteristics. It will start by analyzing how the ratings that users provide to a set of items relate to their ratings of the set’s individual items and, using these insights, will present rating prediction approaches that utilize distant supervision. It will then discuss extensions to approaches based on sparse linear and latent factor models that postulate that users’ preferences are a combination of global and local preferences, which are shown to lead to better user modeling and, as such, improved prediction performance. Finally, the talk will conclude by discussing what can be accurately predicted by latent factor approaches and by analyzing the estimation error of sparse linear and latent factor models and how its characteristics impact the performance of top-N recommendation algorithms.
14:40 - Dynamic Local Models for Online Recommendation (Best contribution)
Marie Al-Ghossein, Talel Abdessalem and Anthony Barré
Abstract: With the explosion of the volume of user-generated data, designing online recommender systems that learn from data streams has become essential. These systems rely on incremental learning that continuously updates models as new observations arrive, and they should be able to adapt to drifts in real time. User preferences evolve over time and tracking their evolution is not an easy task. In addition to the low number of observations available per user, the preferences change at different moments and in different ways for each individual. In this paper, we propose a novel approach based on local models to address this problem. Local models are known for their ability to capture diverse preferences among user subsets. Our approach automatically detects the drift of preferences that leads a user to adopt a behavior closer to the users of another subset, and adjusts the models accordingly. Our experiments on real-world datasets show promising results and demonstrate the effectiveness of using local models to adapt to changes in user preferences.
15:00 - 15:40 Coffee break
Abstract: Recommender systems try to predict which items a user will prefer. Traditional models for recommendation only take into account the user-item interaction, usually expressed by explicit ratings. However, these days, web services continuously generate data about users and items, and it would be an advantage to incorporate this data into the model. Also, ratings may not be given, in which case we can only infer the implicit preference of the users. In this work, we propose an incremental matrix co-factorization model with implicit user feedback, considering a real-world data-stream scenario. This model can be seen as an extension of conventional Matrix Factorization that includes additional dimensions to be decomposed in the common latent factor space. Our experimental results show an improvement in the accuracy of the recommendation task after incorporating an additional dimension in three music domain datasets. Furthermore, we perform a statistically robust evaluation of the learning process for our fully stream-based implementations.
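As a rough illustration of what incremental matrix factorization over implicit feedback can look like, here is a minimal sketch that is not taken from the paper. It assumes positive-only feedback, where every observed (user, item) pair is treated as a target value of 1, and applies one stochastic gradient step per incoming observation. It covers only the plain user-item case, not the co-factorization with additional dimensions that the abstract proposes. The class name `IncrementalMF` and all hyperparameter values are hypothetical.

```python
import random


class IncrementalMF:
    """Hypothetical stream-based matrix factorization for positive-only feedback."""

    def __init__(self, k=8, lr=0.05, reg=0.01, seed=0):
        self.k, self.lr, self.reg = k, lr, reg
        self.rng = random.Random(seed)
        self.P = {}  # user id -> latent vector
        self.Q = {}  # item id -> latent vector

    def _vec(self, table, key):
        # Lazily create a small random vector for unseen users or items.
        if key not in table:
            table[key] = [self.rng.gauss(0, 0.1) for _ in range(self.k)]
        return table[key]

    def score(self, user, item):
        p, q = self._vec(self.P, user), self._vec(self.Q, item)
        return sum(a * b for a, b in zip(p, q))

    def update(self, user, item):
        # One SGD step towards a predicted score of 1 for the observed pair.
        p, q = self._vec(self.P, user), self._vec(self.Q, item)
        err = 1.0 - sum(a * b for a, b in zip(p, q))
        for f in range(self.k):
            pf, qf = p[f], q[f]
            p[f] += self.lr * (err * qf - self.reg * pf)
            q[f] += self.lr * (err * pf - self.reg * qf)
```

In a streaming setting, `update` would be called once per incoming interaction and `score` used to rank candidate items for a user between updates.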
Abstract: Traditional collaborative filtering and content-based approaches try to learn a static recommendation model in a batch fashion. These approaches are far from suitable in highly dynamic recommendation scenarios such as news recommendation and computational advertising. These domains are characterized by very fluid item and user sets. Furthermore, in the era of big data, content features tend to be high-dimensional, which poses a further challenge for traditional online learning algorithms (e.g., multi-armed bandits) that are mostly designed for low-dimensional data. In this work we address the aforementioned problems by investigating an approximated adaptive contextual bandit learner. Our model takes into account the problem of finding the real low-dimensional manifold spanned by data content features, with respect to the real high-dimensional ones. By storing the data in a proper lower-dimensional space, we are then able to reduce the computational costs without losing too much in terms of recommendation quality. With this work we provide an overview of the main properties, the adopted techniques and preliminary experimental results measured over a synthetic dataset. We also discuss a drawback of the proposed method that may appear in a specific worst-case scenario.
Abstract: In the context of news recommendations, many time-aware approaches have been proposed. These approaches have tried to capture the recency of news with respect to their short life span, by using either decaying weights on past articles or even forgetting them. However, most of these approaches have failed to consider sessions, which encapsulate the articles that a user has interacted with in a short time period. In this paper, we provide news recommendations based on user sessions to reveal their short-term intentions. We also combine content-based with collaborative filtering to deal with the severe data sparsity problem that exists in our real-life data set. We have observed experimentally that the users' interests evolve over time and that our strategies can adapt quickly to these changes.