Knowledge Base Completion via Search-Based Question Answering
Venue
WWW (2014)
Publication Year
2014
Authors
Robert West, Evgeniy Gabrilovich, Kevin Murphy, Shaohua Sun, Rahul Gupta, Dekang Lin
Abstract
Over the past few years, massive amounts of world knowledge have been accumulated
in publicly available knowledge bases, such as Freebase, NELL, and YAGO. Yet
despite their seemingly huge size, these knowledge bases are highly incomplete.
For example, over 70% of people included in Freebase have no known place of birth,
and 99% have no known ethnicity. In this paper, we propose a way to leverage
existing Web-search-based question-answering technology to fill in the gaps in
knowledge bases in a targeted way. In particular, for each entity attribute, we
learn the best set of queries to ask, such that the answer snippets returned by the
search engine are most likely to contain the correct value for that attribute. For
example, if we want to find Frank Zappa's mother, we could ask the query "who is
the mother of Frank Zappa". However, this is likely to return "The Mothers of
Invention", which was the name of his band. Our system learns that it should (in
this case) add disambiguating terms, such as Zappa's place of birth, in order to
make it more likely that the search results contain snippets mentioning his mother.
Our system also learns how many different queries to ask for each attribute, since
in some cases, asking too many can hurt accuracy (by introducing false positives).
We discuss how to aggregate candidate answers across multiple queries, ultimately
returning probabilistic predictions for possible values for each attribute.
Finally, we evaluate our system and show that it is able to extract a large number
of facts with high confidence.
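
The query-building and answer-aggregation steps described in the abstract can be
illustrated with a minimal sketch. It is not the paper's method: the function
names (build_queries, aggregate), the query format, and the sum-and-normalize
scoring rule are all assumptions made for illustration, and the candidate
answers and scores in the usage lines are invented placeholders.

```python
from collections import defaultdict


def build_queries(subject, relation_phrase, extra_terms, max_queries):
    """Build search strings for one entity attribute (hypothetical format).

    The first query is the bare subject/relation pair; each later query
    appends one disambiguating term, such as a known place of birth.
    max_queries caps the list, reflecting the abstract's point that asking
    too many queries can hurt accuracy.
    """
    queries = [f"{subject} {relation_phrase}"]
    for term in extra_terms:
        queries.append(f"{subject} {relation_phrase} {term}")
    return queries[:max_queries]


def aggregate(answers_per_query):
    """Combine scored candidates from several queries into probabilities.

    answers_per_query holds one list of (answer, score) pairs per query.
    Scores are summed per answer and normalized to sum to 1. This simple
    rule is an assumption, not the aggregation used in the paper.
    """
    totals = defaultdict(float)
    for candidates in answers_per_query:
        for answer, score in candidates:
            totals[answer] += score
    norm = sum(totals.values())
    if norm == 0:
        return {}
    return {answer: score / norm for answer, score in totals.items()}


# Usage with placeholder candidates and made-up scores.
queries = build_queries("Frank Zappa", "mother", ["Baltimore"], max_queries=2)
per_query = [
    [("candidate_a", 1.0), ("candidate_b", 1.0)],  # results for queries[0]
    [("candidate_a", 2.0)],                        # results for queries[1]
]
probabilities = aggregate(per_query)
```

Capping the number of queries and normalizing the summed scores mirrors the two
ideas the abstract names: limiting how many queries are asked per attribute and
returning probabilistic predictions rather than a single answer.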
