Learning to Rank with Selection Bias in Personal Search
Venue
Proceedings of SIGIR 2016, ACM (to appear)
Publication Year
2016
Authors
Xuanhui Wang, Michael Bendersky, Donald Metzler, Marc Najork
Abstract
Click-through data has proven to be a critical resource for improving search
ranking quality. Though a large amount of click data can be easily collected by
search engines, various biases make it difficult to fully leverage this type of
data. In the past, many click models have been proposed and successfully used to
estimate the relevance of individual query-document pairs in the context of web
search. These click models typically require a large quantity of clicks for each
individual pair, which makes them difficult to apply in systems where click data
is highly sparse due to personalized corpora and information needs, e.g., personal
search. In this paper, we study how to leverage sparse click data in personal
search. We introduce a novel selection bias problem and address it within the
learning-to-rank framework. We propose several bias estimation methods,
including a novel query-dependent one that captures queries with similar results
and can successfully deal with sparse data. We empirically demonstrate that
learning-to-rank that accounts for query-dependent selection bias yields
significant improvements in search effectiveness through online experiments with
one of the world's largest personal search engines.
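
Illustrative sketch (not taken from the paper): the abstract does not spell out how the estimated bias enters training. One common way to account for selection bias in learning-to-rank is inverse propensity weighting, where each clicked result's loss is scaled by the inverse of the estimated probability that its position was examined. The Python below assumes that approach, and assumes a simple global position bias estimated from click-through rates on randomized result orderings. The function names, the `floor` parameter, and the pairwise logistic loss are hypothetical choices made for illustration only.

```python
import math


def estimate_position_bias(impressions, clicks, floor=0.01):
    """Estimate per-position bias relative to the top position.

    Assumes impressions[k] and clicks[k] are counts gathered at position k
    from randomized result orderings, so that differences in click-through
    rate reflect position rather than relevance (an assumption of this sketch).
    """
    ctr = [c / i if i > 0 else 0.0 for c, i in zip(clicks, impressions)]
    top = ctr[0]
    if top <= 0:
        # No signal at the top position: fall back to no correction.
        return [1.0] * len(ctr)
    # Clip from below so the inverse weights stay bounded.
    return [max(r / top, floor) for r in ctr]


def ipw_pairwise_loss(scores, clicked_index, bias):
    """Inverse-propensity-weighted pairwise logistic loss for one query.

    scores: model scores for the displayed results, in displayed order.
    clicked_index: position of the single clicked result.
    bias: per-position bias estimates, e.g. from estimate_position_bias.
    """
    weight = 1.0 / bias[clicked_index]
    clicked_score = scores[clicked_index]
    loss = 0.0
    for j, score in enumerate(scores):
        if j != clicked_index:
            # Penalize unclicked results scored close to or above the clicked one.
            loss += math.log1p(math.exp(-(clicked_score - score)))
    return weight * loss


bias = estimate_position_bias([100, 100, 100], [50, 25, 10])
loss = ipw_pairwise_loss([0.0, 0.0], clicked_index=1, bias=bias)
```

Under these assumptions, a click at a lower position counts for more because that position is less likely to have been examined. A query-dependent variant, in the spirit of the one named in the abstract, could keep a separate bias vector per group of similar queries instead of a single global one; how those groups are formed is not described here.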
