Watermarking the Outputs of Structured Prediction with an application in Statistical
Abstract: We propose a general method to watermark and
probabilistically identify the structured outputs of machine learning algorithms. Our
method is robust to local editing operations and provides well-defined trade-offs between
the ability to identify algorithm outputs and the quality of the watermarked output.
Unlike previous work in the field, our approach does not rely on controlling the inputs
to the algorithm and provides probabilistic guarantees on the ability to identify
collections of results from one’s own algorithm. We present an application in
statistical machine translation, where machine-translated output is watermarked at
minimal loss in translation quality and detected with high recall.