Systematic Comparison of Phrase Table Pruning Techniques
Abstract: When trained on very large parallel corpora, the phrase
table component of a machine translation system grows to consume vast computational
resources. In this paper, we introduce a novel pruning criterion that places phrase
table pruning on a sound theoretical foundation. Systematic experiments on four
language pairs under various data conditions show that our principled approach is
superior to existing ad hoc pruning methods.