What’s your ML test score? A rubric for ML production systems
Reliable Machine Learning in the Wild - NIPS 2016
Eric Breck, Shanqing Cai, Eric Nielsen, Michael Salib, D. Sculley
Using machine learning in real-world production systems is complicated by a host of
issues not found in small toy examples or even large offline research experiments.
Testing and monitoring are key considerations for assessing the
production-readiness of an ML system. But how much testing and monitoring is
enough? We present an ML Test Score rubric, based on a set of actionable tests, to
help quantify how thoroughly a production ML system is tested and monitored.