Recent Advances in Google Real-time HMM-driven Unit Selection Synthesizer
Venue
INTERSPEECH 2016, Sep 8-12, San Francisco, USA, pp. 2238-2242
Publication Year
2016
Authors
Xavi Gonzalvo, Siamak Tazari, Chun-an Chan, Markus Becker, Alexander Gutkin, Hanna Silen
Abstract
This paper presents advances in Google's hidden Markov model (HMM)-driven unit
selection speech synthesis system. We describe several improvements to the run-time
system; these include minimal latency, high quality, and a fast refresh cycle for new
voices. Traditionally, unit selection synthesizers are limited in terms of the
amount of data they can handle and the real applications they are built for. This
limitation is even more critical for real-life, large-scale applications where high quality is
expected and low latency is required given the available computational resources.
In this paper, we present an optimized engine to handle a large database at runtime and
a composite unit search approach for combining diphones and phrase-based units. In
addition, we present a new voice building strategy for handling big databases while
keeping building times low.