Systematic Analysis of Challenge-Driven Improvements in Molecular Prognostic Models for Breast Cancer
Venue
Science Translational Medicine, vol. 5, no. 181 (2013), 181re1
Publication Year
2013
Authors
Adam Margolin, Erhan Bilal, Erich Huang, Ben Sauerwine, Nicole Deflaux, Lamia Youseff, Tyler Pirtle, Craig Citro, Joseph L. Hellerstein
Abstract
Although molecular prognostics in breast cancer are among the most successful
examples of translating genomic analysis to clinical applications, optimal
approaches to breast cancer clinical risk prediction remain controversial. The Sage
Bionetworks–DREAM Breast Cancer Prognosis Challenge (BCC) is a crowdsourced
research study for breast cancer prognostic modeling using genome-scale data. The
BCC provided a community of data analysts with a common platform for data access
and blinded evaluation of model accuracy in predicting breast cancer survival on
the basis of gene expression data, copy number data, and clinical covariates. This
approach offered the opportunity to assess whether a crowdsourced community
Challenge would generate models of breast cancer prognosis commensurate with or
exceeding current best-in-class approaches. The BCC comprised multiple rounds of
blinded evaluations on held-out portions of data on 1981 patients, resulting in
more than 1400 models submitted as open source code. Participants then retrained
their models on the full data set of 1981 samples and submitted up to five models
for validation in a newly generated data set of 184 breast cancer patients.
Analysis of the BCC results suggests that the best-performing modeling strategy
outperformed previously reported methods in blinded evaluations; model performance
was consistent across several independent evaluations; and aggregating
community-developed models achieved performance on par with the best-performing
individual models.
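
The abstract names neither the scoring metric nor the aggregation method, so the following is only a minimal sketch under stated assumptions: survival predictions are scored with a concordance index, and community models are combined by averaging each patient's rank across models. The function names (`concordance_index`, `to_ranks`, `aggregate_by_mean_rank`) and the five-patient toy data are hypothetical and are not taken from the Challenge.

```python
from itertools import combinations


def concordance_index(times, events, risks):
    """Fraction of comparable patient pairs that the risk scores order correctly.

    A pair is comparable when the patient with the shorter follow-up time
    had an observed event (was not censored). Tied risks count as half.
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # equal follow-up times: skipped in this simplified version
        first, second = (i, j) if times[i] < times[j] else (j, i)
        if not events[first]:
            continue  # earlier patient was censored, so the pair is not comparable
        comparable += 1
        if risks[first] > risks[second]:
            concordant += 1.0
        elif risks[first] == risks[second]:
            concordant += 0.5
    return concordant / comparable


def to_ranks(scores):
    """Convert scores to ranks 0..n-1 (ties broken by position, for simplicity)."""
    order = sorted(range(len(scores)), key=lambda k: scores[k])
    ranks = [0] * len(scores)
    for rank, idx in enumerate(order):
        ranks[idx] = rank
    return ranks


def aggregate_by_mean_rank(model_scores):
    """Combine several models by averaging each patient's rank across models."""
    rank_lists = [to_ranks(s) for s in model_scores]
    n_patients = len(rank_lists[0])
    return [
        sum(ranks[k] for ranks in rank_lists) / len(rank_lists)
        for k in range(n_patients)
    ]


# Toy data, invented for illustration only (not Challenge data).
times = [2, 4, 6, 8, 10]    # follow-up time per patient
events = [1, 1, 0, 1, 0]    # 1 = event observed, 0 = censored
model_a = [0.9, 0.4, 0.5, 0.3, 0.1]  # predicted risk from hypothetical model A
model_b = [0.6, 0.8, 0.2, 0.5, 0.3]  # predicted risk from hypothetical model B

ensemble = aggregate_by_mean_rank([model_a, model_b])
print(concordance_index(times, events, model_a))
print(concordance_index(times, events, model_b))
print(concordance_index(times, events, ensemble))
```

Rank averaging is used here because it makes scores from different models comparable without assuming they share a scale. It is one simple aggregation choice among many and should not be read as the procedure used in the study.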
