Machine Translation
Machine Translation is a great example of how cutting-edge research and world-class infrastructure come together at Google. We focus our research on statistical translation techniques that improve with more data and generalize well to new languages. Our large-scale computing infrastructure lets us experiment rapidly with new models trained on web-scale data and significantly improve translation quality. This research backs the translations served at translate.google.com, allowing our users to translate text, web pages and even speech. Deployed within a wide range of Google services such as Gmail, Books, Android and web search, Google Translate is a high-impact, research-driven product that bridges the language barrier and makes it possible to explore the multilingual web in 90 languages. Exciting research challenges abound as we pursue human-quality translation and develop machine translation systems for new languages.
41 Publications
-
Addressing the Rare Word Problem in Neural Machine Translation
Thang Luong, Ilya Sutskever, Quoc V. Le, Oriol Vinyals, Wojciech Zaremba
ACL (2015)
-
Efficient Top-Down BTG Parsing for Machine Translation Preordering
Tetsuji Nakagawa
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), Association for Computational Linguistics (2015), pp. 208-218
-
Pushdown automata in statistical machine translation
Cyril Allauzen, Bill Byrne, Adrià de Gispert, Gonzalo Iglesias, Michael Riley
Computational Linguistics, vol. 40 (2014), pp. 687-723
-
Enlisting the Ghost: Modeling Empty Categories for Machine Translation
Bing Xiang, Xiaoqiang Luo, Bowen Zhou
Proceedings of ACL, ACL (2013), pp. 822-831
-
Exploiting Similarities among Languages for Machine Translation
Tomas Mikolov, Quoc V. Le, Ilya Sutskever
ARXIV (2013)
-
Source-Side Classifier Preordering for Machine Translation
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing (EMNLP '13) (2013)
-
A Class-Based Agreement Model For Generating Accurately Inflected Translations
Spence Green, John DeNero
50th Annual Meeting of the Association for Computational Linguistics (ACL 2012)
-
A Systematic Comparison of Phrase Table Pruning Techniques
Richard Zens, Daisy Stanton, Peng Xu
Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, Association for Computational Linguistics, Jeju Island, Korea, pp. 972-983
-
Joern Wuebker, Hermann Ney, Richard Zens
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics, Association for Computational Linguistics, Jeju, Republic of Korea (2012), pp. 28-32
-
Unsupervised Translation Sense Clustering
Mohit Bansal, John DeNero, Dekang Lin
North American Chapter of the Association for Computational Linguistics (NAACL) (2012)
-
A Lightweight Evaluation Framework for Machine Translation Reordering
David Talbot, Hideto Kazawa, Hiroshi Ichikawa
Proceedings of the 6th Workshop on Statistical Machine Translation (2011), pp. 468-476
-
Binarized Forest to String Translation
Hao Zhang, Licheng Fang, Peng Xu, Xiaoyun Wu
ACL (2011), pp. 835-845
-
Hierarchical Phrase-Based Translation Representations
Gonzalo Iglesias, Cyril Allauzen, William Byrne, Adrià de Gispert, Michael Riley
Proceedings of EMNLP 2011
-
Inducing Sentence Structure from Parallel Corpora for Reordering
Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing (EMNLP), Association for Computational Linguistics
-
Language-independent Compound Splitting with Morphological Operations
Klaus Macherey, Andrew M. Dai, David Talbot, Ashok C. Popat, Franz Och
ACL HLT 2011, pp. 10
-
Model-Based Aligner Combination Using Dual Decomposition
Proceedings of the Association for Computational Linguistics (ACL), 2011
-
Training a Parser for Machine Translation Reordering
Jason Katz-Brown, Slav Petrov, Ryan McDonald, Franz Och, David Talbot, Hiroshi Ichikawa, Masakazu Seno
Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing (EMNLP '11)
-
Dmitriy Genzel, Ashok C. Popat, Nemanja Spasojevic, Michael Jahr, Andrew Senior, Eugene Ie, Frank Yung-Fong Tang
ICDAR-2011
-
Ashish Venugopal, Jakob Uszkoreit, David Talbot, Franz Och, Juri Ganitkevitch
Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing (EMNLP), Association for Computational Linguistics
-
Automatically Learning Source-side Reordering Rules for Large Scale Machine Translation
COLING-2010
-
Large Scale Parallel Document Mining for Machine Translation
Jakob Uszkoreit, Jay Ponte, Ashok Popat, Moshe Dubiner
Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010), Coling 2010 Organizing Committee, Beijing, China, pp. 1101-1109
-
Model Combination for Machine Translation
John DeNero, Shankar Kumar, Ciprian Chelba, Franz Och
Proceedings of the North American Chapter of the Association for Computational Linguistics (NAACL) (2010), pp. 975-983
-
The Handbook of Computational Linguistics and Natural Language Processing, Wiley-Blackwell, John Wiley & Sons Ltd, The Atrium, Southern Gate, Chichester, West Sussex, PO19 8SQ United Kingdom (2010), pp. 74-104
-
Syntax based reordering with automatically derived rules for improved statistical machine translation
Karthik Visweswariah, Jiri Navratil, Jeffrey Sorensen, Vijil Chenthamarakshan, Nanda Kambhatla
Proceedings of the 23rd International Conference on Computational Linguistics, Association for Computational Linguistics, Stroudsburg, PA, USA (2010), pp. 1119-1127
-
“Poetic” Statistical Machine Translation: Rhyme and Meter
Dmitriy Genzel, Jakob Uszkoreit, Franz Och
EMNLP (2010), pp. 158-166
-
Compiling a massive, multilingual dictionary via probabilistic inference
Mausam, Stephen Soderland, Oren Etzioni, Daniel S. Weld, Michael Skinner, Jeff Bilmes
ACL-IJCNLP '09: Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 1, Association for Computational Linguistics, Morristown, NJ, USA (2009), pp. 262-270
-
Creating a High-Quality Machine Translation System for a Low-Resource Language: Yiddish
Dmitriy Genzel, Klaus Macherey, Jakob Uszkoreit
MT Summit XII (2009)
-
Efficient Minimum Error Rate Training and Minimum Bayes-Risk Decoding for Translation Hypergraphs and Lattices
Shankar Kumar, Wolfgang Macherey, Chris Dyer, Franz Och
Proceedings of the 47th Annual Meeting of the ACL and the 4th IJCNLP of the AFNLP, ACL and AFNLP (2009), pp. 163-171
-
Learning linear ordering problems for better translation
Roy Tromble, Jason Eisner
EMNLP '09: Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, Association for Computational Linguistics, Morristown, NJ, USA, pp. 1007-1016
-
Using a dependency parser to improve SMT for subject-object-verb languages
Peng Xu, Jaeho Kang, Michael Ringgaard, Franz Och
NAACL '09: Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Association for Computational Linguistics, Morristown, NJ, USA, pp. 245-253
-
A systematic comparison of phrase-based, hierarchical and syntax-augmented statistical MT
Andreas Zollmann, Ashish Venugopal, Franz Josef Och, Jay Ponte
Proceedings of the 22nd International Conference on Computational Linguistics (COLING) (2008)
-
Distributed Word Clustering for Large Scale Class-Based Language Modeling in Machine Translation
Jakob Uszkoreit, Thorsten Brants
Proceedings of ACL-08: HLT, Association for Computational Linguistics, Columbus, Ohio (2008), pp. 755-762
-
Lattice Minimum Bayes-Risk Decoding for Statistical Machine Translation
Roy Tromble, Shankar Kumar, Franz Och, Wolfgang Macherey
Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing, Association for Computational Linguistics, pp. 620-629
-
Lattice-based Minimum Error Rate Training for Statistical Machine Translation
Wolfgang Macherey, Franz Och, Ignacio Thayer, Jakob Uszkoreit
Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing (EMNLP), Association for Computational Linguistics, pp. 725-734
-
Mining Parenthetical Translations from the Web by Word Alignment
Dekang Lin, Shaojun Zhao, Benjamin Van Durme, Marius Pasca
Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies (ACL-2008), Columbus, Ohio, pp. 994-1002
-
Using Word Space Models for Enriching Multilingual Lexical Resources and Detecting the Relation Between Morphological and Semantic Composition
Adil Toumouh, Dominic Widdows, Ahmed Lehireche
International Conference on Web and Information Technologies (ICWIT '08) (2008), pp. 195-201
-
An Empirical Study on Computing Consensus Translations from Multiple Machine Translation Systems
Wolfgang Macherey, Franz J. Och
Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL), Association for Computational Linguistics, 209 N. Eighth Street, East Stroudsburg, PA, USA, pp. 986-995
-
Improving Word Alignment with Bridge Languages
Shankar Kumar, Franz Och, Wolfgang Macherey
Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, Association for Computational Linguistics, 209 N. Eighth Street, East Stroudsburg, PA, USA (2007)
-
Inversion transduction grammar for joint phrasal translation modeling
Colin Cherry, Dekang Lin
SSST '07: Proceedings of the NAACL-HLT 2007/AMTA Workshop on Syntax and Structure in Statistical Translation, Association for Computational Linguistics, Morristown, NJ, USA, pp. 17-24
-
Large Language Models in Machine Translation
Thorsten Brants, Ashok C. Popat, Peng Xu, Franz J. Och, Jeffrey Dean
Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL), pp. 858-867
-
A path-based transfer model for machine translation
Dekang Lin
COLING '04: Proceedings of the 20th international conference on Computational Linguistics, Association for Computational Linguistics, Morristown, NJ, USA (2004), pp. 625
