Jump to Content

Pedestrian Detection with a Large-Field-Of-View Deep Network

Alex Krizhevsky
Proceedings of ICRA 2015

Abstract

Pedestrian detection is of crucial importance to autonomous driving applications. Methods based on deep learning have shown significant improvements in accuracy, which makes them particularly suitable for applications, such as pedestrian detection, where reducing miss rate is very important. Although they are accurate, their runtime has been at best in seconds per image, which makes them not practical for onboard applications. We present here a Large-Field-Of-View (LFOV) deep network for pedestrian detection, that can achieve high accuracy and is designed to make deep networks work faster for detection problems. The idea of the proposed Large-Field-of-View deep network is to learn to make classification decisions simultaneously and accurately at multiple locations. The LFOV network processes larger image areas at much faster speeds than typical deep networks have been able to do, and can intrinsically reuse computations. Our pedestrian detection solution, which is a combination of a LFOV network and a standard deep network, works at 280 ms per image on GPU and achieves 35.85 average miss rate on the Caltech Pedestrian Detection Benchmark.