The Virtues of Peer Pressure: A Simple Method for Discovering High-Value Mistakes
Much of the recent success of neural networks can be attributed to the deeper architectures that have become prevalent. However, the deeper architectures often yield unintelligible solutions, require enormous amounts of labeled data, and still remain brittle and easily broken. In this paper, we present a method to efficiently and intuitively discover input instances that are misclassified by well-trained neural networks. As in previous studies, we can identify instances that are so similar to previously seen examples such that the transformation is visually imperceptible. Additionally, unlike in previous studies, we can also generate mistakes that are significantly different from any training sample, while, importantly, still remaining in the space of samples that the network should be able to classify correctly. This is achieved by training a basket of N “peer networks” rather than a single network. These are similarly trained networks that serve to provide consistency pressure on each other. When an example is found for which a single network, S, disagrees with all of the other N − 1 networks, which are consistent in their prediction, that example is a potential mistake for S. We present a simple method to find such examples and demonstrate it on two visual tasks. The examples discovered yield realistic images that clearly illuminate the weaknesses of the trained models, as well as provide a source of numerous, diverse, labeled-training samples.