A Stream-based Resource for Multi-Dimensional Evaluation of Recommender Algorithms
Abstract
Recommender system research has evolved to focus on developing algorithms capable of high performance in online systems. This development requires a new class of resources. Today's researchers need access to infrastructure that enables multi-dimensional evaluation of recommender systems. In other words, they need to analyze algorithms with respect to both functional requirements (such as prediction accuracy) and non-functional requirements (such as speed). They also need to subject algorithms to realistic conditions in online A/B tests. We introduce two resources supporting such evaluation methodologies: the new data set of stream recommendation interactions released for CLEF NewsREEL 2017, and the new Open Recommendation Platform (ORP). The data set allows researchers to study a stream recommendation problem closely by "replaying" it locally, and ORP makes it possible to take this evaluation "live" in a living lab scenario. Specifically, ORP allows researchers to deploy their algorithms in a live stream to carry out A/B tests. To our knowledge, NewsREEL is the first online recommender system resource ever to be put at the disposal of the research community. In order to encourage others to develop comparable resources for a wide range of domains, we present a list of practical lessons learned in the development of the data set and ORP.