Publication Data
Evaluating Online Ad Campaigns in a Pipeline: Causal Models at Scale
Abstract: Display ads proliferate on the web, but are they effective?
Or are they irrelevant in light of all the other advertising that people see? We
describe a way to answer these questions, quickly and accurately, without randomized
experiments, surveys, focus groups or expert data analysts. Doubly robust estimation
protects against the selection bias that is inherent in observational data, and a
nonparametric test that is based on irrelevant outcomes provides further defense.
Simulations based on realistic scenarios show that the resulting estimates are more
robust to selection bias than traditional alternatives, such as regression modeling or
propensity scoring. Moreover, computations are fast enough that all processing, from
data retrieval through estimation, testing, validation and report generation, proceeds
in an automated pipeline, without anyone needing to see the raw data.
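
To make the idea of doubly robust estimation concrete, here is a minimal sketch of one common form, the augmented inverse-probability-weighted (AIPW) estimator. It is not taken from the paper and is not presented as the authors' implementation. The function names (fit_logistic, fit_linear, doubly_robust_ate, simulate), the linear and logistic model choices, the clipping threshold and the simulated data are all assumptions made for illustration. The estimator combines an outcome model with a propensity model for ad exposure, and remains consistent if either of the two models is correctly specified.

```python
import numpy as np

def fit_logistic(X, t, n_iter=25):
    """Fit logistic regression by Newton-Raphson (X includes an intercept column)."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        w = p * (1.0 - p)
        grad = X.T @ (t - p)
        # Tiny ridge term keeps the Hessian invertible (illustrative choice).
        hess = (X * w[:, None]).T @ X + 1e-8 * np.eye(X.shape[1])
        beta += np.linalg.solve(hess, grad)
    return beta

def fit_linear(X, y):
    """Ordinary least squares coefficients."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def doubly_robust_ate(x, t, y, clip=0.01):
    """AIPW estimate of the average effect of exposure t on outcome y."""
    X = np.column_stack([np.ones(len(t)), x])
    # Propensity model: probability of being exposed given covariates.
    e = 1.0 / (1.0 + np.exp(-X @ fit_logistic(X, t)))
    e = np.clip(e, clip, 1.0 - clip)  # guard against extreme weights
    # Outcome models fitted separately on exposed and unexposed users.
    m1 = X @ fit_linear(X[t == 1], y[t == 1])
    m0 = X @ fit_linear(X[t == 0], y[t == 0])
    # Outcome-model difference plus inverse-probability-weighted residuals.
    psi = m1 - m0 + t * (y - m1) / e - (1.0 - t) * (y - m0) / (1.0 - e)
    return psi.mean()

def simulate(n=20000, effect=0.5, seed=0):
    """Hypothetical observational data in which exposure depends on covariates."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=(n, 2))
    p = 1.0 / (1.0 + np.exp(-(0.8 * x[:, 0] - 0.5 * x[:, 1])))
    t = (rng.random(n) < p).astype(float)
    y = effect * t + 1.0 * x[:, 0] + 0.5 * x[:, 1] + rng.normal(size=n)
    return x, t, y

if __name__ == "__main__":
    x, t, y = simulate()
    naive = y[t == 1].mean() - y[t == 0].mean()
    print("naive difference in means:", naive)
    print("doubly robust estimate:   ", doubly_robust_ate(x, t, y))
```

In this simulated setting the users most likely to be exposed also tend to have higher outcomes, so the naive difference in means is biased by selection, while the doubly robust estimate should land near the simulated effect of 0.5.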
