Collaborative Filtering and the Missing at Random Assumption
Benjamin Marlin, Richard Zemel, Sam Roweis, Malcolm Slaney
Rating prediction is an important application and a popular research topic in collaborative filtering. However, the validity of both the learning algorithms and the standard testing procedures rests on the assumption that missing ratings are missing at random (MAR). In this paper we present the results of a user study in which we collect a random sample of ratings from current users of an online radio service. An analysis of the rating data collected in the study shows that the sample of random ratings has markedly different properties from ratings of user-selected songs. When asked to report on their own rating behaviour, a large number of users indicate that they believe their opinion of a song does affect whether they choose to rate that song, a violation of the MAR condition. Finally, we present experimental results showing that incorporating an explicit model of the missing data mechanism can lead to significant improvements in prediction performance on the random sample of ratings.
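As an illustration of why a MAR violation matters, the following is a minimal simulation sketch and is not the model or data from the paper. It assumes a hypothetical distribution of true ratings on a 1 to 5 scale and a hypothetical probability of a user choosing to rate a song that increases with the rating value. The function name `simulate_rating_bias` and all numeric values are assumptions chosen only for illustration.

```python
import random


def simulate_rating_bias(n_ratings=100_000, n_random=5_000, seed=0):
    """Compare the mean of user-selected ratings with a random sample.

    All distributions below are hypothetical, not taken from the paper.
    """
    rng = random.Random(seed)
    values = [1, 2, 3, 4, 5]
    # Assumed distribution of true opinions: most songs are disliked.
    true_weights = [0.35, 0.25, 0.20, 0.12, 0.08]
    # Assumed chance that a user chooses to rate a song, given the
    # opinion of it. Dependence on the rating value violates MAR.
    p_observe = {1: 0.02, 2: 0.03, 3: 0.05, 4: 0.15, 5: 0.30}

    all_ratings = rng.choices(values, weights=true_weights, k=n_ratings)

    # Ratings the users selected themselves (non-random missingness).
    user_selected = [r for r in all_ratings if rng.random() < p_observe[r]]

    # Ratings collected for randomly chosen songs (missing at random).
    random_sample = rng.sample(all_ratings, n_random)

    def mean(xs):
        return sum(xs) / len(xs)

    return {
        "true_mean": mean(all_ratings),
        "user_selected_mean": mean(user_selected),
        "random_sample_mean": mean(random_sample),
    }


if __name__ == "__main__":
    for name, value in simulate_rating_bias().items():
        print(f"{name}: {value:.3f}")
```

Under these assumed settings, the user-selected ratings overstate the average opinion, while the random sample tracks it closely. This is the kind of discrepancy between user-selected and randomly sampled ratings that the abstract describes.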
PDF Link: /papers/07/p267-marlin.pdf
AUTHOR = "Benjamin Marlin and Richard Zemel and Sam Roweis and Malcolm Slaney",
TITLE = "Collaborative Filtering and the Missing at Random Assumption",
BOOKTITLE = "Proceedings of the Twenty-Third Conference on Uncertainty in Artificial Intelligence (UAI-07)",
PUBLISHER = "AUAI Press",
ADDRESS = "Corvallis, Oregon",
YEAR = "2007",
PAGES = "267--275"