AN OPTIMIZED SUPPORT VECTOR MACHINE WITH GENETIC ALGORITHM FOR IMBALANCED DATA CLASSIFICATION

In supervised machine learning, class imbalance commonly occurs when the number of examples representing one class is much lower than in the other classes. Because imbalanced data may produce suboptimal classification models, minority examples are frequently misclassified and...


Bibliographic Details
Published in: Jurnal Teknologi
Main Author: Shamsudin H.; Yusof U.K.; Haijie Y.; Isa I.S.
Format: Article
Language: English
Published: Penerbit UTM Press 2023
Online Access: https://www.scopus.com/inward/record.uri?eid=2-s2.0-85163709116&doi=10.11113%2fjurnalteknologi.v85.19695&partnerID=40&md5=b961262df83e604e982eb7c7fbf3cbfc
id 2-s2.0-85163709116
author Shamsudin H.; Yusof U.K.; Haijie Y.; Isa I.S.
title AN OPTIMIZED SUPPORT VECTOR MACHINE WITH GENETIC ALGORITHM FOR IMBALANCED DATA CLASSIFICATION
publishDate 2023
container_title Jurnal Teknologi
container_volume 85
container_issue 4
doi_str_mv 10.11113/jurnalteknologi.v85.19695
url https://www.scopus.com/inward/record.uri?eid=2-s2.0-85163709116&doi=10.11113%2fjurnalteknologi.v85.19695&partnerID=40&md5=b961262df83e604e982eb7c7fbf3cbfc
description In supervised machine learning, class imbalance commonly occurs when the number of examples representing one class is much lower than in the other classes. Because imbalanced data may produce suboptimal classification models, minority examples are frequently misclassified and the best performance is hard to achieve. This study proposes an improved support vector machine (SVM) method for imbalanced data, named SVM-GA, which optimizes the SVM algorithm with a Genetic Algorithm (GA) over a synthetic minority oversampling technique. Besides considering the best sampling method in the optimized SVM, the experimental results show that the proposed method improves by 97% compared to the baseline model and selected optimized models. The proposed model performed significantly better, outperforming the baseline model and other SVM-based models tuned with Grid search and Randomized search in most cases, especially on datasets with extremely rare cases. © 2023, Penerbit UTM Press. All rights reserved.
publisher Penerbit UTM Press
issn 0127-9696
language English
format Article
accesstype All Open Access; Gold Open Access
collection Scopus