Classification with imperfect training labels
Accepted version
Abstract
We study the effect of imperfect training data labels on the performance of classification methods. In a general setting, where the probability that an observation in the training dataset is mislabelled may depend on both the feature vector and the true label, we bound the excess risk of an arbitrary classifier trained with imperfect labels in terms of its excess risk for predicting a noisy label. This reveals conditions under which a classifier trained with imperfect labels remains consistent for classifying uncorrupted test data points. Furthermore, under stronger conditions, we derive detailed asymptotic properties for the popular
Journal ISSN
1464-3510
Sponsorship
Engineering and Physical Sciences Research Council (EP/P031447/1)