Bayesian Optimization Cost-Sensitive XGBoost Learning Algorithm for Imbalanced Data in Semiconductor Industry

This paper proposes an improved ensemble learning model based on extreme gradient boosting (XGBoost) with Bayesian optimization cost-sensitive learning algorithm for dealing with highly imbalanced data in the semiconductor process to achieve the highest possible pass and fail accuracy or recall for...

Bibliographic Details
Published in: JORDAN JOURNAL OF ELECTRICAL ENGINEERING
Main Authors: Shamsudin, Haziqah; Yusof, Umi Kalsom; Kashif, Fizza; Isa, Iza Sazanita
Format: Article
Language: English
Published: TAFILA TECHNICAL UNIV (TTU) 2023
Online Access: https://www-webofscience-com.uitm.idm.oclc.org/wos/woscc/full-record/WOS:001114780100006
Description: This paper proposes an improved ensemble learning model that combines extreme gradient boosting (XGBoost) with a Bayesian-optimized cost-sensitive learning algorithm to handle highly imbalanced data from the semiconductor process, with the aim of achieving the highest possible pass and fail accuracy (recall) in classification. Most existing models are biased toward the majority class and neglect the minority class. The proposed Bayesian optimization cost-sensitive XGBoost model is configured for the semiconductor dataset. Experimental results on a benchmark semiconductor industry dataset show pass and fail accuracies of 91.46% and 23.08%, respectively. According to the authors, this confirms that the proposed model is significant for imbalanced cases in semiconductor applications. The investigation also indicates that the proposed model not only maintains performance on the majority class but also classifies the minority class well.
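The pass and fail accuracies quoted above are per-class figures, which the abstract equates with recall. The following is a minimal sketch of how such per-class recalls can be computed, together with the majority-to-minority count ratio that is a common starting value for a minority-class cost (for example, XGBoost's `scale_pos_weight` parameter). The function names and the labels are hypothetical and chosen for illustration only; this is not the authors' implementation, and the paper tunes its costs by Bayesian optimization rather than using a fixed ratio.

```python
from collections import Counter


def per_class_recall(y_true, y_pred):
    """Return {label: share of true `label` samples that were predicted as `label`}."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}


def minority_weight(y_true, minority_label):
    """Majority-to-minority count ratio, a common starting cost for the minority class."""
    counts = Counter(y_true)
    majority = sum(n for label, n in counts.items() if label != minority_label)
    return majority / counts[minority_label]


# Hypothetical labels for illustration only (not the paper's data).
y_true = ["pass"] * 8 + ["fail"] * 2
y_pred = ["pass"] * 7 + ["fail"] + ["fail", "pass"]

recalls = per_class_recall(y_true, y_pred)
weight = minority_weight(y_true, "fail")
print(recalls, weight)
```

Reporting recall separately for each class, rather than a single overall accuracy, keeps a poorly classified minority class from being hidden by a large majority class.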
ISSN: 2409-9600; 2409-9619
Volume: 9
Issue: 4
DOI: 10.5455/jjee.204-1671971895
Subjects: Computer Science; Engineering
ID: WOS:001114780100006
Collection: Web of Science (WoS)