A discourse typically involves numerous entities, but few are mentioned more than
once. Distinguishing those that die out after just one mention (singleton) from
those that lead longer lives (coreferent) would dramatically simplify the
hypothesis space for coreference resolution models, leading to increased
performance. To realize these gains, we build a classifier for predicting the
singleton/coreferent distinction. The model’s feature representations synthesize
linguistic insights about the factors affecting discourse entity lifespans
(especially negation, modality, and attitude predication) with existing results
about the benefits of “surface” (part-of-speech and n-gram-based) features for
coreference resolution. The model is effective in its own right, and the feature
representations help to identify the anchor phrases in bridging anaphora as well.
Furthermore, incorporating the model into two very different state-of-the-art
coreference resolution systems, one rule-based and the other learning-based, yields
significant performance improvements.
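
To illustrate why filtering singletons shrinks the hypothesis space, the following is a minimal sketch and not the model described here. The `Mention` type, its `negated` flag, and the `toy_is_singleton` rule are hypothetical stand-ins for the learned classifier and its features; the only point is that removing predicted singletons before pairing reduces the number of candidate mention pairs a coreference system must consider.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Mention:
    """Hypothetical mention record; `negated` stands in for a real feature."""
    text: str
    negated: bool = False


def candidate_pairs(mentions):
    """All unordered mention pairs a pairwise resolver would have to score."""
    return list(combinations(mentions, 2))


def prune_singletons(mentions, is_singleton):
    """Keep only the mentions that the predicate does not flag as singletons."""
    return [m for m in mentions if not is_singleton(m)]


def toy_is_singleton(mention):
    """Assumed toy rule: treat mentions under negation as singletons.

    A trained classifier would replace this function.
    """
    return mention.negated


mentions = [
    Mention("a car"),
    Mention("no driver", negated=True),
    Mention("the car"),
    Mention("it"),
    Mention("no witnesses", negated=True),
]

all_pairs = candidate_pairs(mentions)
kept = prune_singletons(mentions, toy_is_singleton)
pruned_pairs = candidate_pairs(kept)

print(len(all_pairs), len(pruned_pairs))
```

In this toy input, five mentions produce ten candidate pairs, while the three mentions that survive the filter produce three.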