CL-SR: Boosting Imbalanced Image Classification with Contrastive Learning and Synthetic Minority Oversampling Technique Based on Rough Set Theory Integration

Bibliographic Details
Published in: Applied Sciences (Basel)
Main Authors: Gao, Xiaoling; Jamil, Nursuriati; Ramli, Muhammad Izzad
Format: Article
Language: English
Published: MDPI 2024
Online Access: https://www-webofscience-com.uitm.idm.oclc.org/wos/woscc/full-recordWOS:001376255500001
Description
Summary: Image recognition models often struggle with class imbalance, which can impede their performance. To overcome this issue, researchers have extensively used resampling methods, traditionally focused on tabular datasets. Unlike the original SMOTE, which generates synthetic samples at the data level, this paper introduces a novel strategy that combines contrastive learning with the Synthetic Minority Oversampling Technique based on Rough Set Theory (SMOTE-RSB), specifically tailored for imbalanced image datasets. Our method leverages contrastive learning to refine representation learning and balance features, thus effectively mitigating the challenges of imbalanced image classification. We begin by extracting features using a pre-trained contrastive learning encoder. SMOTE-RSB is then applied to these features to augment underrepresented classes and reduce irrelevant features. We evaluated our approach on several modified benchmark datasets, including CIFAR-10, SVHN, and ImageNet-LT, achieving notable improvements: an F1 score of 72.43% and a Gmean of 82.53% on the long-tailed CIFAR-10 dataset, F1 scores of up to 79.57% and a Gmean of 88.20% on various SVHN datasets, and a Top-1 accuracy of 68.67% on ImageNet-LT. Both qualitative and quantitative results confirm the effectiveness of our method in managing imbalance in image datasets. Additional ablation studies exploring various contrastive learning models and oversampling techniques highlight the flexibility and efficiency of our approach across different settings, underscoring its significant potential for enhancing imbalanced image classification.
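The following is a minimal sketch of the feature-space pipeline the abstract describes (encoder-based feature extraction, minority oversampling on the features, then classification), assuming PyTorch/torchvision, imbalanced-learn, and scikit-learn are installed. It is not the authors' implementation: an ImageNet-pretrained ResNet-18 stands in for their contrastive-learning encoder, plain SMOTE stands in for SMOTE-RSB (the rough-set boundary filtering is not available in imbalanced-learn), standard balanced CIFAR-10 stands in for the paper's long-tailed variant, and helper names such as extract_features are illustrative.

# Sketch: (1) extract features with a frozen pretrained encoder,
# (2) oversample minority classes in feature space, (3) train a linear classifier.
import numpy as np
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, models, transforms
from imblearn.over_sampling import SMOTE          # stand-in for SMOTE-RSB
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score

device = "cuda" if torch.cuda.is_available() else "cpu"

# Stand-in encoder: ImageNet-pretrained ResNet-18 with its classifier head removed.
# The paper instead uses a contrastive-learning encoder (e.g., SimCLR/MoCo-style).
encoder = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
encoder.fc = nn.Identity()
encoder.eval().to(device)

preprocess = transforms.Compose([
    transforms.Resize(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

@torch.no_grad()
def extract_features(dataset, batch_size=256):
    """Run the frozen encoder over a dataset and return (features, labels) arrays."""
    loader = DataLoader(dataset, batch_size=batch_size, shuffle=False)
    feats, labels = [], []
    for x, y in loader:
        feats.append(encoder(x.to(device)).cpu().numpy())
        labels.append(y.numpy())
    return np.concatenate(feats), np.concatenate(labels)

# Standard CIFAR-10 is balanced; the paper evaluates a long-tailed subset,
# which would be built by subsampling the training split per class.
train_set = datasets.CIFAR10(root="./data", train=True, download=True, transform=preprocess)
test_set = datasets.CIFAR10(root="./data", train=False, download=True, transform=preprocess)

X_train, y_train = extract_features(train_set)
X_test, y_test = extract_features(test_set)

# Oversample underrepresented classes in the learned feature space.
X_bal, y_bal = SMOTE(random_state=0).fit_resample(X_train, y_train)

# Simple linear classifier on the balanced features (stand-in for the paper's classifier).
clf = LogisticRegression(max_iter=2000).fit(X_bal, y_bal)
print("macro-F1:", f1_score(y_test, clf.predict(X_test), average="macro"))

Oversampling after the encoder, rather than on raw images, is what makes SMOTE-style interpolation plausible here: nearest-neighbor interpolation between minority-class embeddings is far better behaved than interpolating pixels.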
ISSN: 2076-3417
DOI: 10.3390/app142311093