A Highly Available Real-time News Recommender based on Apache Spark

Abstract

Recommending news articles is challenging because the set of available articles changes continuously and user preferences depend on context. In addition, news recommenders must meet strict requirements for response time and scalability. Traditional recommender approaches are optimized for the analysis of static data sets; news recommendation scenarios, characterized by continuous change, a high volume of messages, and tight time constraints, call for alternative approaches. In this work we present a highly scalable recommender system optimized for stream processing. The system is built on Apache Spark, which enables the distributed processing of recommendation requests and ensures the scalability of our approach. We evaluate the system in the CLEF NewsREEL challenge. The evaluation shows that our approach is suitable for the news recommendation scenario and provides high-quality results while satisfying the tight time constraints.

@inproceedings{DomannLommatzsch:RealTimeRecommenderBasedOnSpark,
author = {Jaschar Domann and Andreas Lommatzsch},
title = {A Highly Available Real-time News Recommender based on Apache Spark},
booktitle = {{CLEF}'17: Proceedings of the 8th International Conference of the {CLEF} Initiative},
year = {2017},
isbn = {978-3-319-65813-1},
numpages = {12},
location = {Dublin, Ireland},
note = {LNCS vol. 10456},
publisher = {Springer International Publishing}
}

Authors: Jaschar Domann, Andreas Lommatzsch
Category: Conference Paper
Year: 2017
Location: CLEF 2017: Proceedings of the 8th International Conference of the CLEF Initiative, Dublin, Ireland