Trust, but Verify: Predicting Contribution Quality for Knowledge Base Construction and Curation
Venue
WSDM (2014) (to appear)
Publication Year
2014
Authors
Chun How Tan, Eugene Agichtein, Panos Ipeirotis, Evgeniy Gabrilovich
Abstract
The largest publicly available knowledge repositories, such as Wikipedia and
Freebase, owe their existence and growth to volunteer contributors around the
globe. While the majority of contributions are correct, errors can still creep in,
due to editors' carelessness, misunderstanding of the schema, malice, or even lack
of accepted ground truth. If left undetected, inaccuracies often degrade the
experience of users and the performance of applications that rely on these
knowledge repositories. We present a new method, CQUAL, for automatically
predicting the quality of contributions submitted to a knowledge base.
Significantly expanding upon previous work, our method holistically exploits a
variety of signals, including the user's domains of expertise as reflected in her
prior contribution history, and the historical accuracy rates of different types of
facts. In a large-scale human evaluation, our method exhibits precision of 91% at
80% recall. Our model verifies whether a contribution is correct immediately after
it is submitted, significantly alleviating the need for post-submission human
reviewing.
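
To make the reported operating point concrete, the following minimal sketch shows one common way to read a figure such as "precision of 91% at 80% recall": rank contributions by a predicted score, lower the acceptance threshold until the target recall is reached, and report the precision at that threshold. It assumes each contribution has a model score and a human-assigned label. The function name, the choice of which class counts as positive, and the toy data are illustrative assumptions, not details taken from the paper.

```python
from typing import List, Tuple


def precision_at_recall(
    scores: List[float], labels: List[bool], target_recall: float
) -> Tuple[float, float]:
    """Return (precision, threshold) at the highest score threshold
    whose recall is at least target_recall.

    Hypothetical helper for illustration only: scores are model
    confidences, labels are human judgments (True = positive class).
    """
    if not 0.0 < target_recall <= 1.0:
        raise ValueError("target_recall must be in (0, 1]")
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    total_positive = sum(labels)
    if total_positive == 0:
        raise ValueError("at least one positive label is required")

    # Rank contributions from most to least confident.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)

    true_positives = 0
    for count, (score, label) in enumerate(ranked, start=1):
        true_positives += int(label)
        # Items with tied scores fall on the same side of any threshold,
        # so only evaluate once the whole tie group has been counted.
        if count < len(ranked) and ranked[count][0] == score:
            continue
        if true_positives / total_positive >= target_recall:
            return true_positives / count, score

    # Unreachable for valid input: recall is 1.0 after the last item.
    raise RuntimeError("target recall was not reached")


if __name__ == "__main__":
    # Toy data, invented for this example.
    toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5]
    toy_labels = [True, True, False, True, False]
    precision, threshold = precision_at_recall(toy_scores, toy_labels, 1.0)
    print(f"precision={precision:.2f} at threshold={threshold}")
```

On the toy data, reaching full recall requires accepting the top four items, one of which is labeled negative, so precision at that point is 3 out of 4.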
