Classifying with Confidence From Incomplete Test Data

Nathan Parrish

Hyrum S. Anderson

Maya R. Gupta

Dun Yu Hsiao

Journal of Machine Learning Research (JMLR), vol. 14 (2013)


Abstract

We consider the problem of classifying a test sample given incomplete information about it. This problem arises naturally when data about the test sample is collected over time, or when costs must be incurred to collect the data. For example, in a distributed sensor network only a fraction of the sensors may have reported measurements at a certain time, and additional time, power, bandwidth, or some other cost must be incurred to collect the complete data to classify. A practical goal is to assign a class label as soon as enough data is available to make a good decision. We formalize this goal through the notion of reliability, defined as the probability that a label assigned to the incomplete data matches the label that would be assigned to the complete data. We propose a method that classifies incomplete data only if a reliability threshold is met. Our approach models the complete data as a random variable whose distribution depends on the current incomplete data and the (complete) training data. The method differs from standard imputation strategies in that our focus is on determining the reliability of the classification decision, rather than just the class label. In a set of experiments on time-series datasets, where the goal is to classify each time series as early as possible while still guaranteeing that the reliability threshold is met, we show that the method provides useful estimates of the reliability of the imputed class labels.
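The reliability idea in the abstract can be illustrated with a small sketch. This is not the authors' method: it assumes a simple stand-in model in which the unobserved part of the test sample is distributed over the corresponding parts of the training samples, weighted by a Gaussian kernel on the observed part, and it assumes a nearest-centroid classifier for complete data. The function names, the `bandwidth` parameter, and the toy data are all hypothetical.

```python
import numpy as np

def nearest_centroid_label(x, centroids):
    """Label of the class centroid closest to the complete sample x."""
    return int(np.argmin(np.linalg.norm(centroids - x, axis=1)))

def classify_with_reliability(x_obs, train_X, centroids,
                              threshold=0.9, bandwidth=1.0):
    """Return (label or None, estimated reliability) for a partial sample.

    x_obs holds the first k features of the test sample; the rest are
    unobserved. Assumption (not from the paper): the complete sample is
    modeled as x_obs followed by the tail of a training sample, drawn
    with kernel weights based on distance between observed prefixes.
    """
    x_obs = np.asarray(x_obs, dtype=float)
    k = len(x_obs)
    n = len(train_X)

    # Kernel weights over training samples, based on the observed prefix.
    d2 = np.sum((train_X[:, :k] - x_obs) ** 2, axis=1)
    w = np.exp(-(d2 - d2.min()) / (2.0 * bandwidth ** 2))
    w /= w.sum()

    # Candidate completions: observed prefix plus each training tail.
    completions = np.hstack([np.tile(x_obs, (n, 1)), train_X[:, k:]])
    labels = np.array([nearest_centroid_label(c, centroids)
                       for c in completions])

    # Probability mass that each label receives under the completion model.
    mass = np.bincount(labels, weights=w, minlength=len(centroids))
    label = int(np.argmax(mass))
    reliability = float(mass[label])

    # Commit to a label only if the reliability threshold is met.
    return (label if reliability >= threshold else None), reliability

# Toy time series: both classes start at 0 and diverge afterwards.
train_X = np.array([[0, 0, 0, 0], [0, 0, 0, 0],
                    [0, 4, 4, 4], [0, 4, 4, 4]], dtype=float)
centroids = np.array([[0, 0, 0, 0], [0, 4, 4, 4]], dtype=float)

early = classify_with_reliability([0.0], train_X, centroids)
later = classify_with_reliability([0.0, 4.0], train_X, centroids)
print(early, later)
```

In this toy setup the first observation is shared by both classes, so the estimated reliability stays below the threshold and no label is returned; once the second observation arrives, nearly all of the probability mass agrees on one label and the sample is classified early.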

