A Highly Available Real-time News Recommender based on Apache Spark
Abstract
Recommending news articles is a challenging task because the set of available articles changes continuously and user preferences depend on context. In addition, news recommenders must meet strict requirements with respect to response time and scalability. Traditional recommender approaches are optimized for the analysis of static data sets. News recommendation scenarios, which are characterized by continuous changes, a high volume of messages, and tight time constraints, call for alternative approaches. In this work, we present a highly scalable recommender system optimized for stream processing. Our system is built on Apache Spark, which enables the distributed processing of recommendation requests and thereby ensures the scalability of our approach. We evaluate the system in the CLEF NewsREEL challenge. The evaluation shows that our approach is suitable for the news recommendation scenario and provides high-quality results while satisfying the tight time constraints.