Automatically Scheduling Halide Image Processing Pipelines
Venue
SIGGRAPH 2016 (2016)
Publication Year
2016
Authors
Andrew Adams, Dillon Sharlet, Jonathan Ragan-Kelley, Kayvon Fatahalian, Ravi Teja Mullapudi
Abstract
The Halide image processing language has proven to be an effective system for
authoring high-performance image processing code. Halide programmers need only
provide a high-level strategy for mapping an image processing pipeline to a
parallel machine (a schedule), and the Halide compiler carries out the mechanical
task of generating platform-specific code that implements the schedule.
Unfortunately, designing high-performance schedules for complex image processing
pipelines requires substantial knowledge of modern hardware architecture and
code-optimization techniques. In this paper we provide an algorithm for
automatically generating high-performance schedules for Halide programs. Our
solution extends the function bounds analysis already present in the Halide
compiler to automatically perform locality- and parallelism-enhancing global program
transformations typical of those employed by expert Halide developers. The
algorithm does not require costly (and often impractical) auto-tuning, and, in
seconds, generates schedules for a broad set of image processing benchmarks that
are performance-competitive with, and often better than, schedules manually
authored by expert Halide developers on server and mobile CPUs, as well as GPUs.
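
To make the idea of a schedule concrete, here is a minimal sketch in plain C++. It is not Halide code and not the paper's algorithm. It evaluates one hypothetical two-stage pipeline under two hand-written schedules. The pipeline, the function names run_breadth_first and run_tiled, and the tile size are all assumptions made for illustration. Both schedules compute the same result; they differ only in when the intermediate stage is computed and how much of it is stored, which is the kind of locality trade-off a schedule expresses.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical two-stage pipeline, used only for illustration:
//   producer(x) = in(x) + in(x + 1)
//   consumer(x) = producer(x) + producer(x + 1)
// The input must hold n + 2 samples to produce n outputs.

// Schedule A: compute the whole producer stage first, then the consumer.
// No redundant work, but the intermediate buffer is as large as the image.
std::vector<int> run_breadth_first(const std::vector<int>& in, std::size_t n) {
    std::vector<int> producer(n + 1);
    for (std::size_t x = 0; x < n + 1; ++x) {
        producer[x] = in[x] + in[x + 1];
    }
    std::vector<int> out(n);
    for (std::size_t x = 0; x < n; ++x) {
        out[x] = producer[x] + producer[x + 1];
    }
    return out;
}

// Schedule B: split the output into tiles and compute only the producer
// values each tile needs, right before they are consumed. The intermediate
// buffer stays small (better locality) at the cost of recomputing one
// producer value at each interior tile boundary. Requires tile > 0.
std::vector<int> run_tiled(const std::vector<int>& in, std::size_t n,
                           std::size_t tile) {
    std::vector<int> out(n);
    std::vector<int> scratch(tile + 1);
    for (std::size_t x0 = 0; x0 < n; x0 += tile) {
        const std::size_t len = std::min(tile, n - x0);
        // Producer region required by this tile: indices [x0, x0 + len].
        for (std::size_t i = 0; i < len + 1; ++i) {
            scratch[i] = in[x0 + i] + in[x0 + i + 1];
        }
        for (std::size_t i = 0; i < len; ++i) {
            out[x0 + i] = scratch[i] + scratch[i + 1];
        }
    }
    return out;
}
```

In this sketch the required producer region for each tile is worked out by hand. Deriving such regions automatically is, roughly, the role the abstract assigns to function bounds analysis, and choosing between options like the two above for every stage of a large pipeline is the decision the paper sets out to automate.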
