Posterior vs. Parameter Sparsity in Latent Variable Models
Venue
Advances in Neural Information Processing Systems 22 (2009), pp. 664-672
Publication Year
2009
Authors
João Graça, Kuzman Ganchev, Ben Taskar, Fernando Pereira
Abstract
In this paper we explore the problem of biasing unsupervised models to favor
sparsity. We extend the posterior regularization framework [8] to encourage the
model to achieve posterior sparsity on the unlabeled training data. We apply this
new method to learn first-order HMMs for unsupervised part-of-speech (POS) tagging,
and show that HMMs learned this way consistently and significantly outperform both
EM-trained HMMs and HMMs with a sparsity-inducing Dirichlet prior trained by
variational EM. We evaluate these HMMs on three languages — English, Bulgarian and
variational EM. We evaluate these HMMs on three languages — English, Bulgarian and
Portuguese — under four conditions. We find that our method always improves
degrades performance in most cases. We increase accuracy with respect to EM by
2.5%-8.7% absolute, and we see improvements even in a semi-supervised condition where
a limited dictionary is provided.
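
The abstract does not spell out the form of the posterior-sparsity penalty. Below is a minimal, illustrative Python sketch of one way such a penalty can be computed, assuming an ℓ1/ℓ∞ form over per-word-type tag posteriors (for each word type and tag, take the maximum posterior over that word type's occurrences, then sum); the function and variable names are hypothetical and not taken from the paper.

```python
import numpy as np

def l1_linf_posterior_penalty(posteriors, word_types):
    """Illustrative l1/l_inf posterior-sparsity penalty.

    posteriors: array of shape (num_tokens, num_tags); row i is the model's
        posterior tag distribution for token i.
    word_types: length-num_tokens sequence mapping each token to its word type.

    For each (word type, tag) pair, take the max posterior over all occurrences
    of that word type (l_inf), then sum these maxima over types and tags (l1).
    Smaller values mean each word type concentrates its mass on fewer tags.
    """
    posteriors = np.asarray(posteriors, dtype=float)
    word_types = np.asarray(word_types)
    penalty = 0.0
    for w in np.unique(word_types):
        rows = posteriors[word_types == w]   # all occurrences of word type w
        penalty += rows.max(axis=0).sum()    # max over occurrences, sum over tags
    return penalty
```

A penalty of this kind would be added to the training objective on the unlabeled data, so that decreasing it pushes each word type toward using only a small number of tags, which is the posterior-sparsity bias the abstract describes.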
