Bayesian Optimization Cost-Sensitive XGBoost Learning Algorithm for Imbalanced Data in Semiconductor Industry
This paper proposes an improved ensemble learning model based on extreme gradient boosting (XGBoost) with Bayesian optimization cost-sensitive learning algorithm for dealing with highly imbalanced data in the semiconductor process to achieve the highest possible pass and fail accuracy or recall for...
Published in: | JORDAN JOURNAL OF ELECTRICAL ENGINEERING |
---|---|
Main Authors: | Shamsudin, Haziqah; Yusof, Umi Kalsom; Kashif, Fizza; Isa, Iza Sazanita |
Format: | Article |
Language: | English |
Published: | TAFILA TECHNICAL UNIV (TTU) 2023 |
Subjects: | Computer Science; Engineering |
Online Access: | https://www-webofscience-com.uitm.idm.oclc.org/wos/woscc/full-record/WOS:001114780100006 |
author |
Shamsudin, Haziqah; Yusof, Umi Kalsom; Kashif, Fizza; Isa, Iza Sazanita |
---|---|
title |
Bayesian Optimization Cost-Sensitive XGBoost Learning Algorithm for Imbalanced Data in Semiconductor Industry |
container_title |
JORDAN JOURNAL OF ELECTRICAL ENGINEERING |
language |
English |
format |
Article |
description |
This paper proposes an improved ensemble learning model that combines extreme gradient boosting (XGBoost) with a Bayesian optimization cost-sensitive learning algorithm to handle highly imbalanced data from the semiconductor process, with the aim of achieving the highest possible pass and fail accuracy, or recall, in classification. Most existing models are biased toward the majority class and neglect the minority class. The proposed Bayesian optimization cost-sensitive XGBoost model is configured for application to the semiconductor dataset. Experimental results on a benchmark semiconductor industry dataset show pass and fail accuracies of 91.46% and 23.08%, respectively. This confirms that the proposed model is significant for imbalanced cases in semiconductor applications. Moreover, the investigation shows that the proposed model not only maintains the performance of the majority class but also classifies the minority class well. |
publisher |
TAFILA TECHNICAL UNIV (TTU) |
issn |
2409-9600 2409-9619 |
publishDate |
2023 |
container_volume |
9 |
container_issue |
4 |
doi_str_mv |
10.5455/jjee.204-1671971895 |
topic |
Computer Science; Engineering |
topic_facet |
Computer Science; Engineering |
accesstype |
|
id |
WOS:001114780100006 |
url |
https://www-webofscience-com.uitm.idm.oclc.org/wos/woscc/full-record/WOS:001114780100006 |
record_format |
wos |
collection |
Web of Science (WoS) |
_version_ |
1809678578129829888 |
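
The description above reports separate pass and fail accuracies, which amount to per-class recall. The following is a minimal sketch of how such figures can be computed from true and predicted labels. The label encoding (0 = pass, 1 = fail), the helper name `per_class_recall`, and the toy labels are assumptions made for illustration only; they are not taken from the paper or its dataset.

```python
from collections import Counter


def per_class_recall(y_true, y_pred):
    """Return {label: fraction of that label's samples predicted correctly}."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}


# Toy labels (hypothetical, not the paper's data): 0 = pass, 1 = fail.
y_true = [0] * 8 + [1] * 2
y_pred = [0] * 7 + [1] + [1, 0]

recall = per_class_recall(y_true, y_pred)

# Overall accuracy pools both classes, so the majority class dominates it.
overall = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

print("pass recall:", recall[0])
print("fail recall:", recall[1])
print("overall accuracy:", overall)
```

Reporting the two recalls side by side, as the description does, exposes minority-class performance that a single overall accuracy figure would tend to hide on imbalanced data.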