The Difficulty of Training Deep Architectures and the Effect of Unsupervised Pre-Training

   Abstract