Structured Prediction Theory Based on Factor Graph Complexity
Abstract
We present a general theoretical analysis of structured prediction with a series
of new results. We give new data-dependent margin guarantees for structured
prediction that hold for a very wide family of loss functions and for a general family
of hypotheses with an arbitrary factor graph decomposition. These are the tightest margin
bounds known for both standard multi-class and general structured prediction
problems. Our guarantees are expressed in terms of a data-dependent complexity
measure, factor graph complexity, which we show can be estimated from data and
bounded in terms of familiar quantities for several commonly used hypothesis sets
along with a sparsity measure for features and graphs. Our proof techniques include
generalizations of Talagrand's contraction lemma that may be of independent
interest.
We further extend our theory by leveraging the principle of Voted Risk Minimization
(VRM) and show that learning is possible even with complex factor graphs. We
present new learning bounds for this advanced setting, which we use to design two
new algorithms, Voted Conditional Random Field (VCRF) and Voted Structured
Boosting (StructBoost). These algorithms can make use of complex features and
factor graphs and yet benefit from favorable learning guarantees. We also report
the results of experiments with VCRF on several datasets to validate our theory.