TY - GEN
T1 - Linking Data Separation, Visual Separation, and Classifier Performance Using Multidimensional Projections
AU - Benato, Bárbara C.
AU - Falcão, Alexandre X.
AU - Telea, Alexandru C.
N1 - Publisher Copyright: © The Author(s), under exclusive license to Springer Nature Switzerland AG 2024.
PY - 2024/8/22
Y1 - 2024/8/22
N2 - Understanding how data separation (DS), visual separation (VS), and classifier performance (CP) relate to each other is important for applications in both machine learning and information visualization. A recent study showed that, for a specific machine learning pipeline using a given multidimensional projection technique, high DS leads to high VS and, in turn, high CP. However, whether these correlations hold for other projection techniques was left open. We fill this gap by evaluating ten projection techniques in a pipeline that uses three contrastive learning methods (SimCLR, SupCon, and their combination) to produce latent spaces and then train and test classifiers on five image datasets from a real-world application involving human intestinal parasites. Our work identifies two classes of projection techniques: one leading to poor VS and, in turn, poor CP regardless of the available DS, and the other showing a good DS-VS-CP correlation. We argue that this latter group of projections is a useful instrument in classifier engineering tasks.
AB - Understanding how data separation (DS), visual separation (VS), and classifier performance (CP) relate to each other is important for applications in both machine learning and information visualization. A recent study showed that, for a specific machine learning pipeline using a given multidimensional projection technique, high DS leads to high VS and, in turn, high CP. However, whether these correlations hold for other projection techniques was left open. We fill this gap by evaluating ten projection techniques in a pipeline that uses three contrastive learning methods (SimCLR, SupCon, and their combination) to produce latent spaces and then train and test classifiers on five image datasets from a real-world application involving human intestinal parasites. Our work identifies two classes of projection techniques: one leading to poor VS and, in turn, poor CP regardless of the available DS, and the other showing a good DS-VS-CP correlation. We argue that this latter group of projections is a useful instrument in classifier engineering tasks.
KW - Contrastive learning
KW - Data separation
KW - Dimensionality reduction algorithms
KW - Embedded pseudolabeling
KW - Image classification
KW - Semi-supervised learning
KW - Visual separation
UR - http://www.scopus.com/inward/record.url?scp=85202594236&partnerID=8YFLogxK
U2 - 10.1007/978-3-031-66743-5_11
DO - 10.1007/978-3-031-66743-5_11
M3 - Conference contribution
AN - SCOPUS:85202594236
SN - 978-3-031-66742-8
T3 - Communications in Computer and Information Science
SP - 229
EP - 255
BT - Computer Vision, Imaging and Computer Graphics Theory and Applications - 18th International Joint Conference on Computer Vision, Imaging and Computer Graphics, VISIGRAPP 2023, Revised Selected Papers
A2 - de Sousa, A. Augusto
A2 - Bashford-Rogers, Thomas
A2 - Paljic, Alexis
A2 - Ziat, Mounia
A2 - Hurter, Christophe
A2 - Purchase, Helen
A2 - Radeva, Petia
A2 - Farinella, Giovanni Maria
A2 - Bouatouch, Kadi
PB - Springer
T2 - 18th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications, VISIGRAPP 2023
Y2 - 19 February 2023 through 21 February 2023
ER -