Grader variability and the importance of reference standards for evaluating machine learning models for diabetic retinopathy

   Abstract