Type 2 Diabetes Mellitus Prediction Using Data Mining Approach

Bibliographic Details
Published in: 2023 IEEE International Conference on Computing, ICOCO 2023
Main Authors: Halias A.F.; Saiful N.H.; Ibrahim N.; Muhamad Jamil S.A.; Mansor M.M.; Ul-Saufie A.Z.; Md Ghani N.A.
Format: Conference paper
Language: English
Published: Institute of Electrical and Electronics Engineers Inc. 2023
Online Access: https://www.scopus.com/inward/record.uri?eid=2-s2.0-85184854222&doi=10.1109%2fICOCO59262.2023.10398078&partnerID=40&md5=63b7333b0b31e61b0f6f558f6418ae2c
id 2-s2.0-85184854222
DOI: 10.1109/ICOCO59262.2023.10398078
Diabetes mellitus is a chronic, long-term condition that significantly impacts public health and socioeconomic growth worldwide. Risk prediction models can benefit clinical management decisions by identifying patients at higher risk of developing type 2 diabetes. Such prediction is critical for people at higher risk of type 2 diabetes mellitus, because it can guide healthcare and lifestyle changes. This study therefore aims to uncover significant risk factors for type 2 diabetes mellitus classification. A second goal is to identify the best prediction model by comparing the accuracy rate of each model. The study employs several classification models, including logistic regression, decision tree and naïve Bayes. Several feature selection methods are applied to the classification models: forward selection, backward elimination and optimized selection (evolutionary). The analysis used the Diabetes BRFSS2015 dataset, obtained from the Kaggle website. The dataset consists of 76,902 observations with 20 explanatory variables and one dichotomous target variable. The findings show that only eight of the 20 risk factors are identified as significant in the prediction models: Age, GenHlth, Sex, HvyAlcoholConsump, HighBP, HighChol, NoDocbcNoCost, and Veggies. Furthermore, among the nine prediction models, logistic regression with optimized selection had the highest accuracy rate, 75.61%. Logistic regression with optimized selection is therefore the best of the models compared for predicting the prevalence of type 2 diabetes. The study hopes to promote awareness and provide more insight into the risk factors for type 2 diabetes. Early and accurate prediction of type 2 diabetes could lead to prompt, effective treatment and fewer complications. © 2023 IEEE.
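To make the forward selection and accuracy comparison described in the abstract concrete, the following is a minimal sketch in plain Python. It is not the authors' implementation: the abstract does not state the software used, and the helper names (accuracy, fit_predict_nb, forward_selection), the Bernoulli form of naïve Bayes, and the eight-row toy data are all assumptions made for illustration. For brevity the sketch scores each feature subset on the training rows; a real analysis would presumably use a held-out split or cross-validation.

```python
import math
from typing import Callable, Sequence

Row = dict[str, int]  # hypothetical record: binary feature name -> 0 or 1


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the true labels."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("label sequences must be non-empty and equal in length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def fit_predict_nb(train_rows: Sequence[Row], train_y: Sequence[int],
                   test_rows: Sequence[Row], features: Sequence[str]) -> list[int]:
    """Bernoulli naive Bayes with Laplace smoothing for 0/1 features and labels."""
    classes = (0, 1)
    log_prior: dict[int, float] = {}
    p_one: dict[tuple[int, str], float] = {}
    for c in classes:
        rows_c = [r for r, y in zip(train_rows, train_y) if y == c]
        log_prior[c] = math.log((len(rows_c) + 1) / (len(train_y) + 2))
        for f in features:
            # Smoothed estimate of P(feature = 1 | class = c).
            p_one[(c, f)] = (sum(r[f] for r in rows_c) + 1) / (len(rows_c) + 2)
    predictions = []
    for r in test_rows:
        scores = {}
        for c in classes:
            s = log_prior[c]
            for f in features:
                p = p_one[(c, f)]
                s += math.log(p if r[f] == 1 else 1.0 - p)
            scores[c] = s
        # Ties resolve to class 0, the first class listed.
        predictions.append(max(classes, key=lambda c: scores[c]))
    return predictions


def forward_selection(features: Sequence[str],
                      score: Callable[[list[str]], float],
                      min_gain: float = 1e-9) -> tuple[list[str], float]:
    """Greedily add the feature that most improves the score; stop when none helps."""
    selected: list[str] = []
    best = score(selected)
    remaining = list(features)
    while remaining:
        scored = [(score(selected + [f]), f) for f in remaining]
        top_score, top_feature = max(scored, key=lambda pair: pair[0])
        if top_score - best <= min_gain:
            break
        selected.append(top_feature)
        remaining.remove(top_feature)
        best = top_score
    return selected, best


# Toy data (invented): the label equals HighBP, and Noise is uninformative.
rows = [{"HighBP": bp, "Noise": nz} for bp in (1, 0) for nz in (0, 1, 0, 1)]
labels = [r["HighBP"] for r in rows]


def training_accuracy(subset: list[str]) -> float:
    return accuracy(labels, fit_predict_nb(rows, labels, rows, subset))


selected, best = forward_selection(["HighBP", "Noise"], training_accuracy)
print(selected, best)  # ['HighBP'] 1.0
```

Backward elimination follows the same pattern in reverse: start from the full feature set and repeatedly drop the feature whose removal costs the least accuracy. Comparing classifiers then amounts to passing a different fit-and-predict function into the scoring closure and keeping the configuration with the highest accuracy.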
record_format scopus
collection Scopus