This paper proposes a novel acoustic model based on neural networks for statistical
parametric speech synthesis. The neural network outputs parameters of a non-zero
mean Gaussian process, which defines a probability density function of a speech
waveform given linguistic features. The mean and covariance functions of the
Gaussian process represent deterministic (voiced) and stochastic (unvoiced)
components of a speech waveform, whereas the previous approach modeled only the
unvoiced component. Experimental results show that the proposed approach can
generate speech waveforms that approximate natural speech waveforms.
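
As an illustrative sketch of the model described above, in generic notation (the symbols x, l, and \lambda are assumptions introduced here, not necessarily the paper's own): for a waveform segment x, linguistic features l, and network parameters \lambda, the density of a non-zero mean Gaussian process over a finite segment can be written as

\[
p(\boldsymbol{x} \mid \boldsymbol{l}, \lambda)
  = \mathcal{N}\bigl(\boldsymbol{x};\, \boldsymbol{\mu}_{\lambda}(\boldsymbol{l}),\, \boldsymbol{\Sigma}_{\lambda}(\boldsymbol{l})\bigr),
\]

where the mean vector \boldsymbol{\mu}_{\lambda}(\boldsymbol{l}) would carry the deterministic (voiced) component and the covariance matrix \boldsymbol{\Sigma}_{\lambda}(\boldsymbol{l}) the stochastic (unvoiced) component. Under this notation, fixing \boldsymbol{\mu}_{\lambda}(\boldsymbol{l}) = \boldsymbol{0} leaves only the covariance term, that is, a model of the stochastic component alone.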