Towards Acoustic Model Unification Across Dialects
Venue
2016 IEEE Workshop on Spoken Language Technology
Publication Year
2016
Authors
Austin Waters, Meysam Bastani, Mohamed G. Elfeky, Pedro Moreno, Xavier Velez
Abstract
Research has shown that acoustic model performance typically decreases when
evaluated on a dialectal variation of the same language that was not used during
training. Similarly, models simultaneously trained on a group of dialects tend to
under-perform when compared to dialect-specific models. In this paper, we report on
our efforts towards building a unified acoustic model that can serve a
multi-dialectal language. Two techniques are presented: distillation and multitask
learning (MTL). In distillation, we use an ensemble of dialect-specific acoustic
models and distill its knowledge into a single model. In MTL, we train a unified
acoustic model that learns to distinguish dialects as a side task. We show
that both techniques are superior to the naive model that is trained on all
dialectal data, reducing word error rates by 4.2% and 0.6%, respectively. While
achieving this improvement, neither technique degrades the performance of the
dialect-specific models by more than 3.4%.
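
The two training objectives named in the abstract can be illustrated with a minimal, framework-free sketch. It is not the authors' implementation: the abstract does not say how the ensemble's outputs are combined or how the side task is weighted, so the uniform averaging of teacher posteriors, the `side_weight` value, and all function names below are assumptions made for illustration only.

```python
import math


def softmax(logits):
    """Convert raw scores into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(target_probs, predicted_probs, eps=1e-12):
    """Cross-entropy between a soft target and a predicted distribution."""
    return -sum(t * math.log(p + eps) for t, p in zip(target_probs, predicted_probs))


def ensemble_soft_targets(teacher_logits):
    """Combine the teachers' posteriors for one frame.

    Assumption: a uniform average over the dialect-specific teachers.
    """
    posteriors = [softmax(logits) for logits in teacher_logits]
    n = len(posteriors)
    return [sum(p[i] for p in posteriors) / n for i in range(len(posteriors[0]))]


def distillation_loss(student_logits, teacher_logits):
    """Train the single student to match the ensemble's soft targets."""
    targets = ensemble_soft_targets(teacher_logits)
    return cross_entropy(targets, softmax(student_logits))


def mtl_loss(state_logits, state_label, dialect_logits, dialect_label, side_weight=0.1):
    """Main acoustic-state loss plus a weighted dialect-classification side loss.

    Assumption: side_weight is a hypothetical hyperparameter, not a value
    taken from the paper.
    """
    state_ce = -math.log(softmax(state_logits)[state_label])
    dialect_ce = -math.log(softmax(dialect_logits)[dialect_label])
    return state_ce + side_weight * dialect_ce
```

In this sketch the distillation loss is smallest when the student reproduces the averaged teacher posteriors, and the MTL loss reduces to the ordinary acoustic-state loss when `side_weight` is zero.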