Speech Processing
66 Publications
-
Accurate and Compact Large Vocabulary Speech Recognition on Mobile Devices
Xin Lei, Andrew Senior, Alexander Gruenstein, Jeffrey Sorensen
Interspeech (2013) (to appear)
-
An Empirical study of learning rates in deep neural networks for speech recognition
Andrew Senior, Georg Heigold, Marc'Aurelio Ranzato, Ke Yang
Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), IEEE, Vancouver, CA (2013) (to appear)
-
Deep Neural Networks with Auxiliary Gaussian Mixture Models for Real-Time Speech Recognition
Xin Lei, Hui Lin, Georg Heigold
Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), IEEE, Vancouver, CA (2013) (to appear)
-
Empirical Exploration of Language Modeling for the google.com Query Stream as Applied to Mobile Voice Search
Ciprian Chelba, Johan Schalkwyk
Mobile Speech and Advanced Natural Language Solutions, Springer Science+Business Media, New York (2013), pp. 197-229
-
Language Model Verbalization for Automatic Speech Recognition
Hasim Sak, Françoise Beaufays, Kaisuke Nakajima, Cyril Allauzen
Proc ICASSP, IEEE (2013) (to appear)
-
Language Modeling Capitalization
Françoise Beaufays, Brian Strope
Proc ICASSP, IEEE (2013) (to appear)
-
Large Scale Distributed Acoustic Modeling With Back-off N-grams
Ciprian Chelba, Peng Xu, Fernando Pereira, Thomas Richardson
IEEE Transactions on Audio, Speech and Language Processing, vol. 21 (2013), pp. 1158-1169
-
Monitoring the Effects of Temporal Clipping on VoIP Speech Quality
Andrew Hines, Jan Skoglund, Anil Kokaram, Naomi Harte
Interspeech 2013 (to appear)
-
Multiframe Deep Neural Networks for Acoustic Modeling
Vincent Vanhoucke, Matthieu Devin, Georg Heigold
Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), IEEE, Vancouver, CA (2013)
-
Multilingual acoustic models using distributed deep neural networks
Georg Heigold, Vincent Vanhoucke, Andrew Senior, Patrick Nguyen, Marc'Aurelio Ranzato, Matthieu Devin, Jeff Dean
Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), IEEE, Vancouver, CA (2013)
-
On Rectified Linear Units For Speech Processing
M.D. Zeiler, M. Ranzato, R. Monga, M. Mao, K. Yang, Q.V. Le, P. Nguyen, A. Senior, V. Vanhoucke, J. Dean, G.E. Hinton
38th International Conference on Acoustics, Speech and Signal Processing (ICASSP), Vancouver (2013)
-
Recurrent Neural Networks for Voice Activity Detection
Thad Hughes, Keir Mierle
ICASSP, IEEE (2013), pp. 7378-7382
-
Robustness of Speech Quality Metrics to Background Noise and Network Degradations: Comparing VISQOL, PESQ and POLQA
Andrew Hines, Jan Skoglund, Anil Kokaram, Naomi Harte
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), IEEE (2013), pp. 3697-3701
-
Speaker Adaptation of Context Dependent Deep Neural Networks
International Conference on Acoustics, Speech, and Signal Processing (2013) (to appear)
-
Speech and Natural Language: Where Are We Now And Where Are We Headed?
Mobile Voice Conference, San Francisco (2013)
-
Statistical Parametric Speech Synthesis Using Deep Neural Networks
Heiga Zen, Andrew Senior, Mike Schuster
Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), IEEE (2013), pp. 7962-7966
-
Application Of Pretrained Deep Neural Networks To Large Vocabulary Speech Recognition
Navdeep Jaitly, Patrick Nguyen, Andrew Senior, Vincent Vanhoucke
Proceedings of Interspeech 2012
-
Building adaptive dialogue systems via Bayes-adaptive POMDP
Shaowei Png, Joelle Pineau, B. Chaib-draa
IEEE Journal of Selected Topics in Signal Processing, vol. 6(8) (2012), pp. 917-927
-
Chapter 17: Uncertainty Decoding, in Virtanen, Singh, & Raj (Eds.), Techniques for Noise Robustness in Automatic Speech Recognition
Wiley (2012), pp. 463-485
-
Continuous Space Discriminative Language Modeling
Puyang Xu, Sanjeev Khudanpur, Maider Lehr, Emily Prud’hommeaux, Nathan Glenn, Damianos Karakos, Brian Roark, Kenji Sagae, Murat Saraclar, Izhak Shafran, Dan Bikel, Chris Callison-Burch, Yuan Cao, Keith Hall, Eva Hasler, Philipp Koehn, Adam Lopez, Matt Post, Darcey Riley
ICASSP 2012
-
Deep Neural Networks for Acoustic Modeling in Speech Recognition
Geoffrey Hinton, Li Deng, Dong Yu, George Dahl, Abdel-rahman Mohamed, Navdeep Jaitly, Andrew Senior, Vincent Vanhoucke, Patrick Nguyen, Tara Sainath, Brian Kingsbury
Signal Processing Magazine (2012)
-
Distributed Acoustic Modeling with Back-off N-grams
Ciprian Chelba, Peng Xu, Fernando Pereira, Thomas Richardson
Proceedings of ICASSP 2012, IEEE, pp. 4129-4132
-
Distributed Discriminative Language Models for Google Voice Search
Preethi Jyothi, Leif Johnson, Ciprian Chelba, Brian Strope
Proceedings of ICASSP 2012, IEEE, pp. 5017-5021
-
Estimating Word-Stability During Incremental Speech Recognition
Ian McGraw, Alexander Gruenstein
Interspeech (2012)
-
Google's Cross-Dialect Arabic Voice Search
Fadi Biadsy, Pedro J. Moreno, Martin Jansche
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2012), pp. 4441-4444
-
Hallucinated N-Best Lists for Discriminative Language Modeling
Kenji Sagae, Maider Lehr, Emily Tucker Prud’hommeaux, Puyang Xu, Nathan Glenn, Damianos Karakos, Sanjeev Khudanpur, Brian Roark, Murat Saraçlar, Izhak Shafran, Daniel M. Bikel, Chris Callison-Burch, Yuan Cao, Keith Hall, Eva Hasler, Philipp Koehn, Adam Lopez, Matt Post, Darcey Riley
Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) (2012)
-
Haptic Voice Recognition Grand Challenge
K. Sim, S. Zhao, K. Yu, H. Liao
14th ACM International Conference on Multimodal Interaction (2012)
-
Improved Prediction of Nearly-Periodic Signals
Bastiaan Kleijn, Jan Skoglund
International Workshop on Acoustic Signal Enhancement 2012 (IWAENC2012)
-
Investigations on Exemplar-Based Features for Speech Recognition Towards Thousands of Hours of Unsupervised, Noisy Data
Georg Heigold, Patrick Nguyen, Mitchel Weintraub, Vincent Vanhoucke
Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), IEEE, Kyoto, Japan (2012), pp. 4437-4440
-
Japanese and Korean Voice Search
Mike Schuster, Kaisuke Nakajima
International Conference on Acoustics, Speech and Signal Processing, IEEE (2012), pp. 5149-5152
-
Language Modeling for Automatic Speech Recognition Meets the Web: Google Search by Voice
Ciprian Chelba, Johan Schalkwyk, Boulos Harb, Carolina Parada, Cyril Allauzen, Leif Johnson, Michael Riley, Peng Xu, Preethi Jyothi, Thorsten Brants, Vida Ha, Will Neveitt
University of Toronto (2012)
-
Large Scale Language Modeling in Automatic Speech Recognition
Ciprian Chelba, Dan Bikel, Maria Shugrina, Patrick Nguyen, Shankar Kumar
Google (2012)
-
Large-scale Discriminative Language Model Reranking for Voice Search
Preethi Jyothi, Leif Johnson, Ciprian Chelba, Brian Strope
Proceedings of the NAACL-HLT 2012 Workshop: Will We Ever Really Replace the N-gram Model? On the Future of Language Modeling for HLT, Association for Computational Linguistics, pp. 41-49
-
Learning improved linear transforms for speech recognition
Andrew Senior, Youngmin Cho, Jason Weston
ICASSP, IEEE (2012)
-
Music Models for Music-Speech Separation
Thad Hughes, Trausti Kristjansson
ICASSP, IEEE (2012), pp. 4917-4920
-
Optimal Size, Freshness and Time-frame for Voice Search Vocabulary
Google (2012)
-
Recognition of Multilingual Speech in Mobile Applications
Hui Lin, Jui-Ting Huang, Francoise Beaufays, Brian Strope, Yun-hsuan Sung
ICASSP (2012)
-
Semi-supervised Discriminative Language Modeling for Turkish ASR
Murat Saraçlar, Daniel M. Bikel, Keith Hall, Kenji Sagae
2012 IEEE International Conference on Acoustics, Speech, and Signal Processing Proceedings, IEEE, Kyoto, Japan
-
Spectral Intersections for Non-Stationary Signal Separation
Trausti Kristjansson, Thad Hughes
Proceedings of InterSpeech 2012, Portland, OR
-
Speech/Nonspeech Segmentation in Web Videos
Proceedings of InterSpeech 2012
-
VISQOL: The Virtual Speech Quality Objective Listener
Andrew Hines, Jan Skoglund, Anil Kokaram, Naomi Harte
International Workshop on Acoustic Signal Enhancement 2012 (IWAENC2012)
-
Voice Query Refinement
Cyril Allauzen, Edward Benson, Ciprian Chelba, Michael Riley, Johan Schalkwyk
Interspeech (2012)
-
A Web-Based Tool for Developing Multilingual Pronunciation Lexicons
Samantha Ainsley, Linne Ha, Martin Jansche, Ara Kim, Masayuki Nanzawa
12th Annual Conference of the International Speech Communication Association (Interspeech 2011), pp. 3331-3332
-
Bayesian Language Model Interpolation for Mobile Speech Input
Interspeech 2011, pp. 1429-1432
-
Deploying Google Search by Voice in Cantonese
Yun-hsuan Sung, Martin Jansche, Pedro Moreno
12th Annual Conference of the International Speech Communication Association (Interspeech 2011), pp. 2865-2868
-
Improving the speed of neural networks on CPUs
Vincent Vanhoucke, Andrew Senior, Mark Z. Mao
Deep Learning and Unsupervised Feature Learning Workshop, NIPS 2011
-
Language Modeling for Automatic Speech Recognition Meets the Web: Google Search by Voice
Ciprian Chelba, Johan Schalkwyk, Boulos Harb, Carolina Parada, Cyril Allauzen, Michael Riley, Peng Xu, Thorsten Brants, Vida Ha, Will Neveitt
OGI/OHSU Seminar Series, Portland, Oregon, USA (2011)
-
Recognizing English Queries in Mandarin Voice Search
Hung-An Chang, Yun-hsuan Sung, Brian Strope, Francoise Beaufays
ICASSP (2011)
-
Speech Retrieval
Ciprian Chelba, Timothy J. Hazen, Bhuvana Ramabhadran, Murat Saraçlar
Spoken Language Understanding, John Wiley and Sons, Ltd (2011), pp. 417-446
-
Unsupervised Testing Strategies for ASR
Brian Strope, Doug Beeferman, Alexander Gruenstein, Xin Lei
Interspeech 2011, pp. 1685-1688
-
Challenges in Automatic Speech Recognition
Ciprian Chelba, Johan Schalkwyk, Michiel Bacchiani
Interspeech 2010
-
Decision Tree State Clustering with Word and Syllable Features
Hank Liao, Chris Alberti, Michiel Bacchiani, Olivier Siohan
Interspeech, ISCA (2010), pp. 2958-2961
-
Discriminative Topic Segmentation of Text and Speech
Mehryar Mohri, Pedro Moreno, Eugene Weinstein
International Conference on Artificial Intelligence and Statistics (AISTATS) (2010)
-
Google Search by Voice: A Case Study
Johan Schalkwyk, Doug Beeferman, Francoise Beaufays, Bill Byrne, Ciprian Chelba, Mike Cohen, Maryam Garrett, Brian Strope
Advances in Speech Recognition: Mobile Environments, Call Centers and Clinics, Springer (2010), pp. 61-90
-
On-Demand Language Model Interpolation for Mobile Speech Input
Brandon Ballinger, Cyril Allauzen, Alexander Gruenstein, Johan Schalkwyk
Interspeech (2010), pp. 1812-1815
-
Search by Voice in Mandarin Chinese
Jiulong Shan, Genqing Wu, Zhihong Hu, Xiliu Tang, Martin Jansche, Pedro J. Moreno
Interspeech 2010, pp. 354-357
-
Unsupervised Discovery and Training of Maximally Dissimilar Cluster Models
Francoise Beaufays, Vincent Vanhoucke, Brian Strope
Proc Interspeech (2010)
-
A new quality measure for topic segmentation of text and speech
Mehryar Mohri, Pedro J. Moreno, Eugene Weinstein
Conference of the International Speech Communication Association (Interspeech) (2009)
-
Restoring Punctuation and Capitalization in Transcribed Speech
Agustín Gravano, Martin Jansche, Michiel Bacchiani
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) (2009), pp. 4741-4744
-
Revisiting Graphemes with Increasing Amounts of Data
Yun-Hsuan Sung, Thad Hughes, Francoise Beaufays, Brian Strope
ICASSP, IEEE (2009)
-
Web-derived Pronunciations
Arnab Ghoshal, Martin Jansche, Sanjeev Khudanpur, Michael Riley, Morgan Ulinski
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) (2009), pp. 4289-4292
-
Deploying GOOG-411: Early Lessons in Data, Measurement, and Testing
Michiel Bacchiani, Francoise Beaufays, Johan Schalkwyk, Mike Schuster, Brian Strope
Proc. ICASSP (2008)
-
Retrieval and Browsing of Spoken Content
Ciprian Chelba, Timothy J. Hazen, Murat Saraçlar
Signal Processing Magazine, IEEE, vol. 25 (2008), pp. 39-49
-
Speech Recognition with Weighted Finite-State Transducers
Mehryar Mohri, Fernando C. N. Pereira, Michael Riley
Handbook on Speech Processing and Speech Communication, Part E: Speech recognition, Springer-Verlag, Heidelberg, Germany (2008)
-
Speech Recognition with Weighted Finite-State Transducers
Mehryar Mohri, Fernando C. N. Pereira, Michael Riley
Handbook on Speech Processing and Speech Communication, Part E: Speech recognition, Springer-Verlag, Heidelberg, Germany (2007)
