MAO - an Extensible Micro-Architectural Optimizer
Venue
Proceedings of the 8th annual IEEE/ACM international symposium on Code generation and optimization, ACM (2011)
Publication Year
2011
Authors
Robert Hundt, Easwaran Raman, Martin Thuresson, Neil Vachharajani
Abstract
Performance matters, and so do repeatability and predictability. The
micro-architectures of today's processors have become so complex that they contain
many undocumented, poorly understood, and even puzzling performance cliffs. Small
changes in the instruction stream, such as the insertion of a single NOP
instruction, can lead to significant performance deltas, exposing compiler and
performance optimization efforts to what appears to be unwanted randomness. This paper
presents MAO, an extensible micro-architectural assembly-to-assembly optimizer,
which seeks to address this problem for x86/64 processors. In essence, MAO is a
thin wrapper around a common open-source assembler infrastructure. It offers basic
operations, such as creation or modification of instructions; simple data-flow
analysis; and advanced infrastructure, such as loop recognition and a repeated
relaxation algorithm to compute instruction addresses and lengths. This
infrastructure enables a plethora of passes for pattern matching,
alignment-specific optimizations, peephole optimizations, experiments (such as
random insertion of NOPs),
and fast prototyping of more sophisticated optimizations. MAO can be integrated
into any compiler that emits assembly code, or can be used standalone. MAO can be
used to discover micro-architectural details semi-automatically. Initial
performance results are encouraging.
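
The repeated relaxation idea mentioned in the abstract can be illustrated with a minimal sketch. This is not MAO's implementation: the `Insn` class and `relax` function are hypothetical names, and the model covers only unconditional x86 jumps, which encode in 2 bytes with an 8-bit displacement or 5 bytes with a 32-bit displacement. Every jump starts out short, addresses are recomputed, and any jump whose target is out of range is widened, until nothing changes.

```python
from dataclasses import dataclass
from typing import List, Optional

SHORT_JMP = 2  # x86 `jmp rel8`: opcode byte + 8-bit displacement
LONG_JMP = 5   # x86 `jmp rel32`: opcode byte + 32-bit displacement


@dataclass
class Insn:
    """Hypothetical toy instruction; not a MAO data structure."""
    size: int                     # current encoded length in bytes
    target: Optional[int] = None  # index of the jump target, None for non-jumps


def relax(insns: List[Insn]) -> List[int]:
    """Return each instruction's start address once sizes stop changing."""
    while True:
        # Lay out all instructions using the current size estimates.
        addrs = [0]
        for insn in insns:
            addrs.append(addrs[-1] + insn.size)

        changed = False
        for i, insn in enumerate(insns):
            if insn.target is None or insn.size == LONG_JMP:
                continue
            # The displacement is measured from the end of the jump.
            disp = addrs[insn.target] - addrs[i + 1]
            if not -128 <= disp <= 127:
                insn.size = LONG_JMP  # sizes only grow, so the loop terminates
                changed = True

        if not changed:
            return addrs[:-1]


# Widening the second jump pushes the first jump's target out of range,
# so a second pass is needed before the layout is stable.
program = [
    Insn(SHORT_JMP, target=3),
    Insn(SHORT_JMP, target=4),
    Insn(124),
    Insn(4),
    Insn(1),
]
addresses = relax(program)
```

In this example the first pass widens only the second jump. That shifts the first jump's target by three bytes and forces a second pass, which is why a single sweep over the instructions is not enough to fix addresses and lengths.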
