Towards A Rigorous Science of Interpretable Machine Learning
Finale Doshi-Velez, Been Kim
As machine learning systems become ubiquitous, there has been a surge of interest
in interpretable machine learning: systems that provide explanations for their
outputs. These explanations are often used to qualitatively assess other criteria
such as safety or non-discrimination. However, despite the interest in
interpretability, there is very little consensus on what interpretable machine
learning is and how it should be measured. In this position paper, we first define
interpretability and describe when interpretability is needed (and when it is not).
Next, we suggest a taxonomy for rigorous evaluation and expose open questions
towards a more rigorous science of interpretable machine learning.