Publication Data

   Evaluating Online Ad Campaigns in a Pipeline: Causal Models at Scale

Abstract: Display ads proliferate on the web, but are they effective? Or are they irrelevant in light of all the other advertising that people see? We describe a way to answer these questions, quickly and accurately, without randomized experiments, surveys, focus groups or expert data analysts. Doubly robust estimation protects against the selection bias that is inherent in observational data, and a nonparametric test that is based on irrelevant outcomes provides further defense. Simulations based on realistic scenarios show that the resulting estimates are more robust to selection bias than traditional alternatives, such as regression modeling or propensity scoring. Moreover, computations are fast enough that all processing, from data retrieval through estimation, testing, validation and report generation, proceeds in an automated pipeline, without anyone needing to see the raw data.
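
To make the idea of doubly robust estimation concrete, the following is a minimal sketch, not the pipeline described in the abstract. It assumes the augmented inverse-probability-weighted (AIPW) form of the estimator, a single confounder, a logistic propensity model and linear outcome models. The simulated data, the function names (`fit_logistic`, `fit_linear`, `doubly_robust_ate`, `simulate`) and the true ad effect of 0.5 are all hypothetical and chosen only for illustration.

```python
import numpy as np


def fit_logistic(X, t, iters=25):
    """Fit a logistic regression by Newton-Raphson (X includes an intercept)."""
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        w = p * (1.0 - p)
        grad = X.T @ (t - p)
        # Tiny ridge term keeps the Hessian invertible.
        hess = (X * w[:, None]).T @ X + 1e-8 * np.eye(X.shape[1])
        beta = beta + np.linalg.solve(hess, grad)
    return beta


def fit_linear(X, y):
    """Ordinary least squares coefficients."""
    return np.linalg.lstsq(X, y, rcond=None)[0]


def doubly_robust_ate(x, t, y):
    """AIPW estimate of the average effect of exposure t on outcome y."""
    X = np.column_stack([np.ones(len(x)), x])
    # Propensity model: probability of being exposed given the covariate.
    e = 1.0 / (1.0 + np.exp(-X @ fit_logistic(X, t)))
    e = np.clip(e, 0.01, 0.99)  # guard against extreme weights
    # Outcome models, fit separately on exposed and unexposed users.
    m1 = X @ fit_linear(X[t == 1], y[t == 1])
    m0 = X @ fit_linear(X[t == 0], y[t == 0])
    # Outcome-model contrast plus inverse-probability-weighted residuals.
    psi = m1 - m0 + t * (y - m1) / e - (1 - t) * (y - m0) / (1 - e)
    return psi.mean()


def simulate(n=20000, seed=0):
    """Hypothetical data: users with high x are both more likely to see
    the ad and more likely to respond, which creates selection bias."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    p_exposed = 1.0 / (1.0 + np.exp(-x))
    t = (rng.random(n) < p_exposed).astype(float)
    y = 1.0 + 2.0 * x + 0.5 * t + rng.normal(size=n)  # true effect: 0.5
    return x, t, y


x, t, y = simulate()
naive = y[t == 1].mean() - y[t == 0].mean()
dr = doubly_robust_ate(x, t, y)
print(f"naive difference in means: {naive:.2f}")
print(f"doubly robust estimate:    {dr:.2f}")
```

In this simulation the naive difference in means is inflated by selection, while the doubly robust estimate should land close to the simulated effect. The estimator combines two models, and it remains consistent if either the propensity model or the outcome model is correctly specified, which is the sense in which it is "doubly robust".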