Abstract
The absence of large annotated datasets to train deep neural networks (DNNs) is an issue since manual annotation is time-consuming, expensive, and error-prone. Semi-supervised learning techniques can address the problem by propagating pseudo labels from supervised to unsupervised samples. However, they still require training and validation sets with many supervised samples. This work proposes a methodology, namely Deep Feature Annotation (DeepFA), that dismisses the validation set and uses very few supervised samples (e.g., 1% of the dataset). DeepFA modifies the feature spaces of a DNN along with meta-pseudo-labeling iterations in a 2D non-linear projection space using the most confidently labeled samples of an optimum-path forest semi-supervised classifier. We present a comprehensive study on DeepFA and a new variant that detects the best DNN model for generalization during the pseudo-labeling iterations. We evaluate the components of DeepFA on eight datasets, finding the best DeepFA approach and showing that it outperforms self-pseudo-labeling.
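The core loop the abstract describes, i.e., propagating labels from a few supervised samples to the most confidently classified unsupervised ones in a low-dimensional projection space, can be sketched as follows. This is a minimal, hypothetical illustration: the nearest-centroid rule with a distance-margin confidence stands in for the semi-supervised optimum-path forest classifier used in the paper, and `feats` is assumed to already be a 2D non-linear projection (e.g., from t-SNE) of the DNN features.

```python
import numpy as np

def pseudo_label_step(feats, labels, conf_frac=0.5):
    """One confidence-based pseudo-labeling step on projected features.

    feats: (n, d) array, assumed to be a 2D non-linear projection of
           DNN features (the projection itself is not shown here).
    labels: (n,) int array; -1 marks unlabeled samples.
    conf_frac: fraction of the most confident unlabeled samples to accept.

    Note: a simple nearest-centroid classifier is used here only as a
    stand-in for the optimum-path forest classifier of the paper.
    """
    classes = np.unique(labels[labels >= 0])
    # Class centroids estimated from the currently labeled samples.
    centroids = np.stack([feats[labels == c].mean(axis=0) for c in classes])
    unl = np.where(labels < 0)[0]
    if unl.size == 0:
        return labels
    # Distance of each unlabeled sample to each class centroid.
    dist = np.linalg.norm(feats[unl, None, :] - centroids[None, :, :], axis=2)
    pred = classes[dist.argmin(axis=1)]
    # Confidence: margin between the two nearest centroids.
    d_sorted = np.sort(dist, axis=1)
    margin = d_sorted[:, 1] - d_sorted[:, 0] if len(classes) > 1 else -d_sorted[:, 0]
    # Accept only the most confident fraction of pseudo labels.
    order = np.argsort(-margin)[:max(1, int(conf_frac * unl.size))]
    out = labels.copy()
    out[unl[order]] = pred[order]
    return out
```

Iterating this step until no unlabeled samples remain mimics the pseudo-labeling iterations of the method; in the actual pipeline the DNN feature space would be updated between iterations, which this sketch omits.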
Original language | English |
---|---|
Article number | 109649 |
Pages (from-to) | 1-16 |
Number of pages | 16 |
Journal | Pattern Recognition |
Volume | 141 |
Early online date | 29 Apr 2023 |
DOIs | |
Publication status | Published - Sept 2023 |
Bibliographical note
Funding Information: The authors acknowledge FAPESP grants #2014/12236-1, #2019/10705-8, #2022/12668-5, CAPES grants with Finance Code 001, and CNPq grants #303808/2018-7.
Publisher Copyright:
© 2023
Keywords
- Pseudo-labeling
- Deep feature annotation
- Semi-supervised learning
- Feature space projection
- Data annotation