Collaborative Filtering for Orkut Communities: Discovery of User Latent Behavior
Venue
18th International Conference on World Wide Web (WWW), ACM (2009), pp. 681-690
Publication Year
2009
Authors
Wen-Yen Chen, Jon Chu, Junyi Luan, Hongjie Bai, Edward Chang
Abstract
Users of social networking services can connect with each other by forming
communities for online interaction. Yet as the number of communities hosted by such
websites grows over time, users have even greater need for effective community
recommendations in order to meet more users. In this paper, we investigate two
algorithms from very different domains and evaluate their effectiveness for
personalized community recommendation. First is association rule mining (ARM),
which discovers associations between sets of communities that are shared across
many users. Second is latent Dirichlet allocation (LDA), which models
user-community co-occurrences using latent aspects. In comparing LDA with ARM, we
are interested in discovering whether modeling low-rank latent structure is more
effective for recommendations than directly mining rules from the observed data. We
experiment on an Orkut data set consisting of 492,104 users and 118,002
communities. We show that LDA consistently performs better than ARM using the top-k
recommendations ranking metric, and we analyze examples of the latent information
learned by LDA to explain this finding. To efficiently handle the large-scale data
set, we parallelize LDA on distributed computers and demonstrate our parallel
implementation's scalability with varying numbers of machines.
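
The contrast the abstract draws between mining rules from observed co-memberships and scoring through latent aspects can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration rather than the authors' implementation: the toy membership data, the restriction to single-antecedent rules, the choice to score a candidate by summed rule support, the hand-set (not learned) values of theta_u and phi, and all function names.

```python
from collections import defaultdict
from itertools import permutations

# Hypothetical toy data: user -> set of communities the user has joined.
memberships = {
    "u1": {"jazz", "blues", "vinyl"},
    "u2": {"jazz", "blues"},
    "u3": {"jazz", "vinyl"},
    "u4": {"chess", "go"},
    "u5": {"jazz", "blues", "chess"},
}


def mine_pair_rules(memberships, min_support=2):
    """Mine single-antecedent rules a -> b as {(a, b): (support, confidence)}."""
    single = defaultdict(int)  # number of users in each community
    pair = defaultdict(int)    # number of users in both a and b
    for comms in memberships.values():
        for c in comms:
            single[c] += 1
        for a, b in permutations(sorted(comms), 2):
            pair[(a, b)] += 1
    return {
        (a, b): (count, count / single[a])
        for (a, b), count in pair.items()
        if count >= min_support
    }


def arm_scores(user_comms, rules):
    """Score unjoined communities by summed support of matching rules (assumed scheme)."""
    scores = defaultdict(float)
    for (a, b), (support, _confidence) in rules.items():
        if a in user_comms and b not in user_comms:
            scores[b] += support
    return dict(scores)


def lda_scores(user_comms, theta_u, phi):
    """Score unjoined communities as sum over z of P(z | user) * P(community | z)."""
    communities = {c for dist in phi.values() for c in dist}
    return {
        c: sum(theta_u[z] * phi[z].get(c, 0.0) for z in theta_u)
        for c in communities
        if c not in user_comms
    }


def top_k(scores, k):
    """Return the k highest-scoring communities, ties broken alphabetically."""
    return sorted(scores, key=lambda c: (-scores[c], c))[:k]


# Hand-set latent parameters standing in for a trained LDA model (assumption).
theta_u = {"music": 0.9, "games": 0.1}  # P(aspect | user)
phi = {                                  # P(community | aspect)
    "music": {"jazz": 0.4, "blues": 0.3, "vinyl": 0.2, "chess": 0.05, "go": 0.05},
    "games": {"jazz": 0.05, "blues": 0.05, "vinyl": 0.1, "chess": 0.4, "go": 0.4},
}

rules = mine_pair_rules(memberships)
print(top_k(arm_scores({"jazz"}, rules), 2))
print(top_k(lda_scores({"jazz"}, theta_u, phi), 2))
```

In this toy setting the rule-based scorer can only rank communities that co-occur with the user's existing memberships often enough to pass the support threshold, whereas the latent-aspect scorer assigns a nonzero score to every unjoined community.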
