Analyzing Social Bookmarking Systems: A del.icio.us Cookbook
Abstract
Social bookmarking systems have recently gained interest among researchers in the areas of data mining and web intelligence, as they provide a vast amount of user-generated annotations and reflect the interests of millions of people. In this paper, we discuss our initial findings obtained from analyzing a large corpus of almost 150 million bookmarks found at del.icio.us. Apart from investigating bookmarking and tagging patterns in this data, we discuss evidence that social bookmarking systems are vulnerable to spamming and that their data hence needs to be preprocessed before any insightful analysis can be carried out. We present a method that limits the influence of spam in social bookmarking analysis, and we provide conclusions and directions for future research.