Brian Roark

Brian Roark is a computational linguist working on various topics in natural language processing. His research interests include syntactic parsing of text and speech; language modeling for automatic speech recognition and other applications; supervised and unsupervised learning of language and parsing models; and text entry, accessibility, and augmentative & alternative communication (AAC).

Before joining Google, he was a faculty member for 9 years in the Center for Spoken Language Understanding (CSLU) at Oregon Health & Science University (OHSU) – part of what used to be the Oregon Graduate Institute (OGI). Before that, he was in the Speech Algorithms Department at AT&T Labs - Research from 2001 to 2004. He received his Ph.D. from the Department of Cognitive and Linguistic Sciences at Brown University in 2001.

More information, including publications, CV and other links, can be found on his external webpage.

Google Publications

Previous Publications

  •   

    Applications of Lexicographic Semirings to Problems in Speech and Language Processing

    Richard Sproat, Mahsa Yarmohammadi, Izhak Shafran, Brian Roark

    Computational Linguistics, vol. 40 (2014)

  •  

    Continuous Space Discriminative Language Modeling

    Puyang Xu, Sanjeev Khudanpur, Maider Lehr, Emily Prud’hommeaux, Nathan Glenn, Damianos Karakos, Brian Roark, Kenji Sagae, Murat Saraclar, Izhak Shafran, Dan Bikel, Chris Callison-Burch, Yuan Cao, Keith Hall, Eva Hasler, Philipp Koehn, Adam Lopez, Matt Post, Darcey Riley

    ICASSP 2012

  •  

    Hallucinated N-Best Lists for Discriminative Language Modeling

    Kenji Sagae, Maider Lehr, Emily Tucker Prud’hommeaux, Puyang Xu, Nathan Glenn, Damianos Karakos, Sanjeev Khudanpur, Brian Roark, Murat Saraçlar, Izhak Shafran, Daniel M. Bikel, Chris Callison-Burch, Yuan Cao, Keith Hall, Eva Hasler, Philipp Koehn, Adam Lopez, Matt Post, Darcey Riley

    Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) (2012)

  •  

    Semi-supervised discriminative language modeling for Turkish ASR

    Arda Çelebi, Hasim Sak, Erinç Dikici, Murat Saraclar, Maider Lehr, Emily Tucker Prud'hommeaux, Puyang Xu, Nathan Glenn, Damianos Karakos, Sanjeev Khudanpur, Brian Roark, Kenji Sagae, Izhak Shafran, Daniel M. Bikel, Chris Callison-Burch, Yuan Cao, Keith B. Hall, Eva Hasler, Philipp Koehn, Adam Lopez, Matt Post, Darcey Riley

    ICASSP (2012), pp. 5025-5028

  •   

    The OpenGrm Open-Source Finite-State Grammar Software Libraries

    Brian Roark, Richard Sproat, Cyril Allauzen, Michael Riley, Jeffrey Sorensen, Terry Tai

    ACL (System Demonstrations) (2012), pp. 61-66

  •  

    Beam-Width Prediction for Efficient Context-Free Parsing

    Nathan Bodenstab, Aaron Dunlop, Keith Hall, Brian Roark

    Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics, Association for Computational Linguistics (2011)

  •  

    MAP adaptation of stochastic grammars

    M. Bacchiani, M. Riley, B. Roark, R. Sproat

    Computer Speech and Language, vol. 20 (2006), pp. 41-68

  •   

    Probabilistic Context-Free Grammar Induction Based on Structural Zeros

    Mehryar Mohri, Brian Roark

    Proceedings of the Seventh Meeting of the Human Language Technology conference - North American Chapter of the Association for Computational Linguistics (HLT-NAACL 2006), New York, NY

  •   

    The design principles and algorithms of a weighted grammar library

    Cyril Allauzen, Mehryar Mohri, Brian Roark

    Int. J. Found. Comput. Sci., vol. 16 (2005), pp. 403-421

  •   

    A General Weighted Grammar Library

    Cyril Allauzen, Mehryar Mohri, Brian Roark

    Ninth International Conference on Automata (CIAA 2004), Kingston, Canada, July 22-24, 2004, Springer-Verlag, Berlin-NY (2005)

  •   

    The Design Principles and Algorithms of a Weighted Grammar Library

    Cyril Allauzen, Mehryar Mohri, Brian Roark

    International Journal of Foundations of Computer Science, vol. 16 (2005)

  •  

    A General Weighted Grammar Library

    Cyril Allauzen, Mehryar Mohri, Brian Roark

    CIAA (2004), pp. 23-34

  •   

    A Generalized Construction of Integrated Speech Recognition Transducers

    Cyril Allauzen, Mehryar Mohri, Brian Roark, Michael Riley

    Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2004), Montreal, Canada

  •  

    Improved name recognition with meta-data dependent name networks

    S. Maskey, M. Bacchiani, B. Roark, R. Sproat

    Proceedings of the International Conference on Acoustics, Speech and Signal Processing (2004)

  •  

    Language model adaptation with MAP estimation and the perceptron algorithm

    M. Bacchiani, B. Roark, M. Saraclar

    Proceedings of the HLT-NAACL (2004)

  •  

    Meta-data Conditional Language Modeling

    M. Bacchiani, B. Roark

    Proceedings of the International Conference on Acoustics, Speech and Signal Processing (2004)

  •   

    A General Weighted Grammar Library

    Cyril Allauzen, Mehryar Mohri, Brian Roark

    Proceedings of the Ninth International Conference on Automata (CIAA 2004), Kingston, Ontario, Canada

  •   

    Generalized Algorithms for Constructing Statistical Language Models

    Cyril Allauzen, Mehryar Mohri, Brian Roark

    ACL (2003)

  •   

    Supervised and unsupervised PCFG adaptation to novel domains

    Brian Roark, Michiel Bacchiani

    HLT-NAACL (2003)

  •  

    Unsupervised Language Model Adaptation

    M. Bacchiani, B. Roark

    Proceedings of the International Conference on Acoustics, Speech and Signal Processing (2003)

  •   

    Generalized Algorithms for Constructing Statistical Language Models

    Cyril Allauzen, Mehryar Mohri, Brian Roark

    41st Meeting of the Association for Computational Linguistics (ACL 2003), Proceedings of the Conference, Sapporo, Japan