Ciprian Chelba

Ciprian Chelba is a Research Scientist with Google. Previously he worked as a Researcher in the Speech Technology Group at Microsoft Research.

His research interests are in statistical modeling of natural language and speech. Recent projects include Google Audio Indexing; indexing, ranking and snippeting of speech content; language modeling for Google Search by Voice; and the Android IME predictive keyboard.

Google Publications

  •    

    Empirical Exploration of Language Modeling for the google.com Query Stream as Applied to Mobile Voice Search

    Ciprian Chelba, Johan Schalkwyk

    Mobile Speech and Advanced Natural Language Solutions, Springer Science+Business Media, New York (2013), pp. 197-229

  •    

    Large Scale Distributed Acoustic Modeling With Back-off N-grams

    Ciprian Chelba, Peng Xu, Fernando Pereira, Thomas Richardson

    ICSI, Berkeley, California (2013)

  •    

    Large Scale Distributed Acoustic Modeling With Back-off N-grams

    Ciprian Chelba, Peng Xu, Fernando Pereira, Thomas Richardson

    IEEE Transactions on Audio, Speech and Language Processing, vol. 21 (2013), pp. 1158-1169

  •    

    Speech and Natural Language: Where Are We Now And Where Are We Headed?

    Ciprian Chelba

    Mobile Voice Conference, San Francisco (2013)

  •    

    Distributed Acoustic Modeling with Back-off N-grams

    Ciprian Chelba, Peng Xu, Fernando Pereira, Thomas Richardson

    Proceedings of ICASSP 2012, IEEE, pp. 4129-4132

  •    

    Distributed Discriminative Language Models for Google Voice Search

    Preethi Jyothi, Leif Johnson, Ciprian Chelba, Brian Strope

    Proceedings of ICASSP 2012, IEEE, pp. 5017-5021

  •    

    Language Modeling for Automatic Speech Recognition Meets the Web: Google Search by Voice

    Ciprian Chelba, Johan Schalkwyk, Boulos Harb, Carolina Parada, Cyril Allauzen, Leif Johnson, Michael Riley, Peng Xu, Preethi Jyothi, Thorsten Brants, Vida Ha, Will Neveitt

    University of Toronto (2012)

  •    

    Large Scale Language Modeling in Automatic Speech Recognition

    Ciprian Chelba, Dan Bikel, Maria Shugrina, Patrick Nguyen, Shankar Kumar

    Google (2012)

  •    

    Large-scale Discriminative Language Model Reranking for Voice Search

    Preethi Jyothi, Leif Johnson, Ciprian Chelba, Brian Strope

    Proceedings of the NAACL-HLT 2012 Workshop: Will We Ever Really Replace the N-gram Model? On the Future of Language Modeling for HLT, Association for Computational Linguistics, pp. 41-49

  •    

    Optimal Size, Freshness and Time-frame for Voice Search Vocabulary

    Maryam Kamvar, Ciprian Chelba

    Google (2012)

  •   

    Voice Query Refinement

    Cyril Allauzen, Edward Benson, Ciprian Chelba, Michael Riley, Johan Schalkwyk

    Interspeech (2012)

  •    

    Language Modeling for Automatic Speech Recognition Meets the Web: Google Search by Voice

    Ciprian Chelba, Johan Schalkwyk, Boulos Harb, Carolina Parada, Cyril Allauzen, Michael Riley, Peng Xu, Thorsten Brants, Vida Ha, Will Neveitt

    OGI/OHSU Seminar Series, Portland, Oregon, USA (2011)

  •   

    Speech Retrieval

    Ciprian Chelba, Timothy J. Hazen, Bhuvana Ramabhadran, Murat Saraçlar

    Spoken Language Understanding, John Wiley and Sons, Ltd (2011), pp. 417-446

  •    

    Challenges in Automatic Speech Recognition

    Ciprian Chelba, Johan Schalkwyk, Michiel Bacchiani

    Interspeech 2010

  •   

    Google Search by Voice: A Case Study

    Johan Schalkwyk, Doug Beeferman, Francoise Beaufays, Bill Byrne, Ciprian Chelba, Mike Cohen, Maryam Garrett, Brian Strope

    Advances in Speech Recognition: Mobile Environments, Call Centers and Clinics, Springer (2010), pp. 61-90

  •   

    Model Combination for Machine Translation

    John DeNero, Shankar Kumar, Ciprian Chelba, Franz Och

    Proceedings of the North American Chapter of the Association for Computational Linguistics (NAACL) (2010), pp. 975-983

  •    

    Query Language Modeling for Voice Search

    Ciprian Chelba, Johan Schalkwyk, Thorsten Brants, Vida Ha, Boulos Harb, Will Neveitt, Carolina Parada, Peng Xu

    Proceedings of the 2010 IEEE Workshop on Spoken Language Technology, IEEE, pp. 127-132

  •    

    Statistical Language Modeling

    Ciprian Chelba

    The Handbook of Computational Linguistics and Natural Language Processing, Wiley-Blackwell, John Wiley & Sons Ltd, The Atrium, Southern Gate, Chichester, West Sussex, PO19 8SQ United Kingdom (2010), pp. 74-104

  •    

    Study on Interaction between Entropy Pruning and Kneser-Ney Smoothing

    Ciprian Chelba, Thorsten Brants, Will Neveitt, Peng Xu

    Proceedings of Interspeech (2010), pp. 2242-2245

  •    

    An Audio Indexing System for Election Video Material

    Christopher Alberti, Michiel Bacchiani, Ari Bezman, Ciprian Chelba, Anastassia Drofa, Hank Liao, Pedro Moreno, Ted Power, Arnaud Sahuguet, Maria Shugrina, Olivier Siohan

    Proceedings of ICASSP (2009), pp. 4873-4876

  •   

    Back-off Language Model Compression

    Boulos Harb, Ciprian Chelba, Jeffrey Dean, Sanjay Ghemawat

    Proceedings of Interspeech 2009, International Speech Communication Association (ISCA), pp. 325-355

  •   

    Retrieval and Browsing of Spoken Content

    Ciprian Chelba, Timothy J. Hazen, Murat Saraçlar

    Signal Processing Magazine, IEEE, vol. 25 (2008), pp. 39-49

Previous Publications

  •  

    Acoustic Sensitive Language Model Perplexity for Automatic Speech Recognition

    Ciprian Chelba

    Proceedings of Machine Learning Workshop, Snowbird, UT (2006)

  •  

    Adaptation of Maximum Entropy Capitalizer: Little Data Can Help a Lot

    Ciprian Chelba, Alex Acero

    Computer Speech and Language, vol. 20 (2006), pp. 382-399

  •  

    Integration of Metadata in Spoken Document Search Using Position Specific Posterior Lattices

    Jorge Silva, Ciprian Chelba, Alex Acero

    Proceedings of the IEEE International Workshop on Spoken Language Technology, IEEE, Palm Beach, Aruba (2006), to appear

  •  

    Pruning Analysis of the Position Specific Posterior Lattices for Spoken Document Search

    Jorge Silva Sanchez, Ciprian Chelba, Alex Acero

    Proceedings of ICASSP'06, IEEE, Toulouse, France (2006), to appear

  •  

    Soft Indexing of Speech Content for Search in Spoken Documents

    Ciprian Chelba, Jorge Silva, Alex Acero

    Computer Speech and Language (2006), pp. 458-478

  •   

    Towards Spoken-Document Retrieval for the Internet: Lattice Indexing For Large-Scale Web-Search Architectures

    Zheng-Yu Zhou, Peng Yu, Ciprian Chelba, Frank Seide

    Proceedings of the Human Language Technology Conference of the NAACL, Main Conference, Association for Computational Linguistics, New York City, USA (2006), pp. 415-422

  •   

    Indexing Uncertainty for Spoken Document Search

    Ciprian Chelba, Alex Acero

    Proceedings of Eurospeech, ISCA, Lisbon, Portugal (2005), pp. 61-64

  •   

    Position Specific Posterior Lattices for Indexing Speech

    Ciprian Chelba, Alex Acero

    Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL'05), Association for Computational Linguistics, Ann Arbor, Michigan (2005), pp. 443-450

  •   

    SPEECH OGLE: Indexing Uncertainty for Spoken Document Search

    Ciprian Chelba, Alex Acero

    Proceedings of the ACL Interactive Poster and Demonstration Sessions, Association for Computational Linguistics, Ann Arbor, Michigan (2005), pp. 41-44

  •  

    Adaptation of Maximum Entropy Capitalizer: Little Data Can Help a Lot

    Ciprian Chelba, Alex Acero

    Proceedings of EMNLP, Barcelona, Spain (2004), pp. 285-292

  •  

    Conditional Maximum Likelihood Estimation of Naive Bayes Probability Models

    Ciprian Chelba, Alex Acero

    Microsoft Research, Redmond, WA (2004)

  •  

    Conditional Maximum Likelihood Estimation of Naive Bayes Probability Models Using Rational Function Growth Transform

    Ciprian Chelba, Alex Acero

    Proceedings of Machine Learning Workshop, Snowbird, UT (2004)

  •  

    Parsing Conversational Speech Using Enhanced Segmentation

    Jeremy G. Kahn, Mari Ostendorf, Ciprian Chelba

    HLT-NAACL 2004: Short Papers, Association for Computational Linguistics, Boston, Massachusetts, USA, pp. 125-128

  •  

    Discriminative Training of N-gram Classifiers for Speech and Text Routing

    Ciprian Chelba, Alex Acero

    Proceedings of Eurospeech 2003, Geneva, Switzerland, pp. 1-4

  •  

    Speech Utterance Classification

    C. Chelba, M. Mahajan, A. Acero

    Proceedings of ICASSP, Hong Kong (2003), pp. 280-283

  •   

    A Study on Richer Syntactic Dependencies for Structured Language Modeling

    Peng Xu, Ciprian Chelba, Frederick Jelinek

    ACL, http://www.aclweb.org/ (2002), pp. 191-198

  •  

    Growth Transform for Conditional Maximum Likelihood Estimation of Log-linear Models

    Milind Mahajan, Ciprian Chelba

    Microsoft Research, Redmond, WA (2002)

  •  

    Mutual Information Phone Clustering for Decision Tree Induction

    C. Chelba, R. Morton

    Proc. Int. Conf. on Spoken Language Processing, Denver, Colorado (2002)

  •  

    Information Extraction Using the Structured Language Model

    Ciprian Chelba, Milind Mahajan

    Proceedings of EMNLP, Pittsburgh, Pennsylvania (2001), pp. 74-81

  •   

    Portability of Syntactic Structure for Language Modeling

    Ciprian Chelba

    Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, IEEE, www.ieee.org (2001)

  •  

    Richer Syntactic Dependencies for Structured Language Modeling

    C. Chelba, P. Xu

    Proc. of the IEEE Workshop on Automatic Speech Recognition and Understanding, Madonna di Campiglio, Italy (2001)

  •   

    Exploiting Syntactic Structure for Natural Language Modeling

    Ciprian Chelba

    The Johns Hopkins University, www.jhu.edu (2000)

  •  

    Structured Language Modeling

    Ciprian Chelba, Frederick Jelinek

    Computer Speech and Language, vol. 14 (2000), pp. 283-332

  •  

    Putting Language into Language Modeling

    Frederick Jelinek, Ciprian Chelba

    Proceedings of Eurospeech'99, Budapest, Hungary (1999)

  •  

    Recognition performance of a structured language model

    C. Chelba, F. Jelinek

    Proceedings of Eurospeech, Budapest, Hungary (1999)

  •   

    Structured Language Modeling for Speech Recognition

    Ciprian Chelba, Frederick Jelinek

    Proceedings of NLDB (1999)

  •  

    Exploiting Syntactic Structure for Language Modeling

    Ciprian Chelba, Frederick Jelinek

    Proceedings of COLING-ACL (1998), pp. 225-231

  •   

    Refinement of a Structured Language Model

    Ciprian Chelba, Frederick Jelinek

    Proceedings of ICAPR (1998)

  •  

    A Structured Language Model

    Ciprian Chelba

    Proceedings of ACL-EACL, Madrid, Spain (1997), pp. 498-500, student section

  •  

    Structure and Performance of a Dependency Language Model

    C. Chelba, D. Engle, F. Jelinek, V. Jimenez, S. Khudanpur, L. Mangu, H. Printz, E. S. Ristad, R. Rosenfeld, A. Stolcke, D. Wu

    Proceedings of Eurospeech, Rhodes, Greece (1997), pp. 2775-2778