A Scalable Gibbs Sampler for Probabilistic Entity Linking
Venue
Advances in Information Retrieval (ECIR 2014), Springer International Publishing, pp. 335-346
Publication Year
2014
Authors
Neil Houlsby, Massimiliano Ciaramita
Abstract
Entity linking involves labeling phrases in text with their referent entities, such
as Wikipedia or Freebase entries. This task is challenging due to the large number
of possible entities, in the millions, and heavy-tailed mention ambiguity. We
formulate the problem in terms of probabilistic inference within a topic model,
where each topic is associated with a Wikipedia article. To deal with the large
number of topics, we propose a novel, efficient Gibbs sampling scheme that can also
incorporate side information, such as the Wikipedia graph. This conceptually simple
probabilistic approach achieves state-of-the-art performance in entity linking on
the Aida-CoNLL dataset.
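
To make the idea of Gibbs sampling over entity "topics" concrete, the following is a minimal illustrative sketch and not the sampler from the paper. It assumes that each mention has a short list of candidate entities, that the mention-given-entity weights are fixed in advance (for example, estimated from anchor text), and that only the per-mention assignments are resampled. The function name `gibbs_link`, its parameters, and the smoothing value `alpha` are hypothetical, and the Wikipedia-graph side information mentioned in the abstract is omitted.

```python
import random
from collections import defaultdict


def gibbs_link(docs, candidates, prior, alpha=0.1, sweeps=50, seed=0):
    """Toy Gibbs sampler for entity assignments (illustrative assumptions only).

    docs:       list of documents, each a list of mention strings
    candidates: dict mapping a mention to its list of candidate entity ids
    prior:      dict mapping (mention, entity) to a fixed positive weight
    alpha:      smoothing added to the per-document entity counts
    """
    rng = random.Random(seed)

    # Random initialisation, restricted to each mention's candidate list.
    assign, doc_counts = [], []
    for doc in docs:
        counts = defaultdict(int)
        z = []
        for mention in doc:
            entity = rng.choice(candidates[mention])
            z.append(entity)
            counts[entity] += 1
        assign.append(z)
        doc_counts.append(counts)

    for _ in range(sweeps):
        for d, doc in enumerate(docs):
            counts = doc_counts[d]
            for i, mention in enumerate(doc):
                # Remove the current assignment from the document counts.
                counts[assign[d][i]] -= 1
                cands = candidates[mention]
                # Conditional weight: (document count + alpha) * fixed mention weight.
                # Only the candidates are scored, not every entity in the catalogue.
                weights = [(counts[e] + alpha) * prior[(mention, e)] for e in cands]
                new_entity = rng.choices(cands, weights=weights, k=1)[0]
                assign[d][i] = new_entity
                counts[new_entity] += 1

    return assign
```

Restricting each update to a mention's candidate list keeps the cost of a step proportional to the mention's ambiguity rather than to the total number of entities, which is one plausible way to cope with millions of topics; the paper's actual scheme may differ.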
