Watermarking the Outputs of Structured Prediction with an application in Statistical Machine Translation
Venue
Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing (EMNLP), Association for Computational Linguistics
Publication Year
2011
Authors
Ashish Venugopal, Jakob Uszkoreit, David Talbot, Franz Och, Juri Ganitkevitch
Abstract
We propose a general method to watermark and probabilistically identify the
structured outputs of machine learning algorithms. Our method is robust to local
editing operations and provides well-defined trade-offs between the ability to
identify algorithm outputs and the quality of the watermarked output. Unlike
previous work in the field, our approach does not rely on controlling the inputs to
the algorithm and provides probabilistic guarantees on the ability to identify
collections of results from one’s own algorithm. We present an application in
statistical machine translation, where machine-translated output is watermarked at
minimal loss in translation quality and detected with high recall.
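
The abstract does not spell out the mechanism, so the following is only a minimal sketch of one way such a scheme could work, not the authors' implementation. It assumes that the system produces several candidate outputs with quality scores, that a hash of each candidate yields a pseudo-random bit string, and that the watermark is a bias toward candidates whose hash contains many 1 bits. Detection then tests a collection of outputs against the null hypothesis that each bit is 1 with probability 0.5. All names (`bit_counts`, `choose_watermarked`, `detection_p_value`, `max_quality_loss`) are hypothetical, and the choice of SHA-256 and of a normal approximation to the binomial tail are assumptions made for illustration.

```python
import hashlib
import math


def bit_counts(text):
    """Return (number of 1 bits, total bits) in the SHA-256 digest of text."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    ones = sum(bin(byte).count("1") for byte in digest)
    return ones, len(digest) * 8


def choose_watermarked(candidates, max_quality_loss):
    """Pick the candidate with the most 1 bits among those close to the best.

    candidates: list of (text, quality_score) pairs, higher score is better.
    max_quality_loss: largest score drop from the best candidate we accept
    (assumed knob for the quality vs. detectability trade-off).
    """
    best_score = max(score for _, score in candidates)
    allowed = [c for c in candidates if best_score - c[1] <= max_quality_loss]
    return max(allowed, key=lambda c: bit_counts(c[0])[0])[0]


def detection_p_value(outputs):
    """One-sided p-value for 'more 1 bits than chance' over a collection.

    Uses a normal approximation to the binomial tail under the null
    hypothesis that every hash bit is 1 with probability 0.5.
    """
    if not outputs:
        raise ValueError("need at least one output to test")
    ones = total = 0
    for text in outputs:
        o, t = bit_counts(text)
        ones += o
        total += t
    z = (ones - 0.5 * total) / math.sqrt(0.25 * total)
    return 0.5 * math.erfc(z / math.sqrt(2))


# Toy candidate lists (invented strings and scores, for illustration only).
nbest_lists = [
    [("the house is small", 1.00), ("the house is little", 0.97), ("the home is small", 0.95)],
    [("he went to school", 1.00), ("he has gone to school", 0.96), ("he walked to school", 0.90)],
]
plain = [max(cands, key=lambda c: c[1])[0] for cands in nbest_lists]
marked = [choose_watermarked(cands, max_quality_loss=0.05) for cands in nbest_lists]
```

In this sketch, a larger `max_quality_loss` lets the selector stray further from the top-scoring candidate, which strengthens the detectable bias at some cost in quality. Because the test pools bits over many outputs, it is a statement about collections of results rather than about any single sentence.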
