Image Classification with Deep Learning in the Presence of Noisy Labels: A Survey

Image classification systems have recently made a giant leap with the
advancement of deep neural networks. However, these systems require a large
amount of labeled data to be adequately trained. Gathering a correctly
annotated dataset
is not always feasible due to several factors, such as the cost of the
labeling process or the difficulty of classifying data correctly, even for
experts. Because of these practical challenges, label noise is a common problem
in real-world datasets, and numerous methods to train deep neural networks with
label noise have been proposed in the literature. Although deep neural networks are
known to be relatively robust to label noise, their tendency to overfit data
makes them vulnerable to memorizing even random noise. Therefore, it is crucial
to consider the existence of label noise and develop algorithms that counter
its adverse effects in order to train deep neural networks effectively. Even though
an extensive survey of machine learning techniques under label noise exists,
the literature lacks a comprehensive survey of methodologies centered
explicitly around deep learning in the presence of noisy labels. This paper
presents these algorithms and categorizes them into two groups:
noise-model-based and noise-model-free methods. Algorithms in the
first group aim to estimate the noise structure and use this information to
avoid the adverse effects of noisy labels. In contrast, methods in the second
group aim to be inherently noise-robust, using approaches such as robust
losses, regularizers, or other learning paradigms.
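
To make the distinction between the two groups concrete, the following is a minimal sketch rather than an implementation of any particular surveyed method. It assumes a noise transition matrix T, with T[i, j] read as the probability that clean class i is observed as noisy class j, for the noise-model-based case, and uses mean absolute error as one example of a bounded loss for the noise-model-free case. The function names and the choice of NumPy are illustrative assumptions.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the class axis.
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def forward_corrected_ce(logits, noisy_labels, T):
    """Noise-model-based sketch: cross-entropy on noisy labels after mapping
    clean-class probabilities through an assumed transition matrix T,
    where T[i, j] = P(noisy = j | clean = i)."""
    p_clean = softmax(logits)
    p_noisy = p_clean @ T  # predicted distribution over noisy labels
    idx = np.arange(len(noisy_labels))
    return -np.mean(np.log(p_noisy[idx, noisy_labels] + 1e-12))

def mae_loss(logits, noisy_labels):
    """Noise-model-free sketch: mean absolute error between the softmax
    output and the one-hot label, which equals 2 * (1 - p_label) and is
    therefore bounded in [0, 2]."""
    p = softmax(logits)
    idx = np.arange(len(noisy_labels))
    return np.mean(2.0 * (1.0 - p[idx, noisy_labels]))
```

With T set to the identity matrix, the first function reduces to ordinary cross-entropy. In practice T is unknown and has to be estimated, which is exactly the step that noise-model-based methods focus on.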