Risk of Cardiovascular Heart Disease Using Data Mining Approach
Cardiovascular heart disease (CVD) stands as the primary global cause of death, with its prevalence increasing notably with age. This study utilised a data mining model to identify crucial risk variables for CVD, selecting the most effective model among decision trees, logistic regression, and artif...
Published in: | 2024 5th International Conference on Artificial Intelligence and Data Sciences, AiDAS 2024 - Proceedings |
---|---|
Main Author: | Shafri N.A.A. |
Format: | Conference paper |
Language: | English |
Published: | Institute of Electrical and Electronics Engineers Inc., 2024 |
Online Access: | https://www.scopus.com/inward/record.uri?eid=2-s2.0-85209656603&doi=10.1109%2fAiDAS63860.2024.10730336&partnerID=40&md5=7adac5601deb3e49f427011c88c85059 |
id |
2-s2.0-85209656603 |
---|---|
author |
Shafri N.A.A.; Nizam N.A.A.M.; Hussin S.A.S.; Zahid Z. |
title |
Risk of Cardiovascular Heart Disease Using Data Mining Approach |
publishDate |
2024 |
container_title |
2024 5th International Conference on Artificial Intelligence and Data Sciences, AiDAS 2024 - Proceedings |
doi_str_mv |
10.1109/AiDAS63860.2024.10730336 |
url |
https://www.scopus.com/inward/record.uri?eid=2-s2.0-85209656603&doi=10.1109%2fAiDAS63860.2024.10730336&partnerID=40&md5=7adac5601deb3e49f427011c88c85059 |
description |
Cardiovascular heart disease (CVD) is the leading cause of death worldwide, and its prevalence increases notably with age. This study used data mining to identify the key risk variables for CVD and selected the best-performing model among decision trees, logistic regression, and artificial neural networks. The candidate risk factors included sex, age, education, smoking habits, medical history, and physiological measures. After the dataset was analysed and cleaned, the models were validated using confusion matrices and ROC curves. Variable importance was determined in SAS Enterprise Miner, which showed that age significantly affects CVD risk in the decision tree model. In logistic regression, age emerged as the most important variable, with the lowest p-value (p = 0.0036). The artificial neural network (ANN) highlighted seven variables with high R-squared values, indicating their contribution to CVD risk. The results indicate that the ANN achieved the highest sensitivity and accuracy, while the decision tree achieved the highest specificity. In conclusion, the comparative analysis identifies the ANN as the best model for identifying CVD risk factors in the given dataset. © 2024 IEEE. |
publisher |
Institute of Electrical and Electronics Engineers Inc. |
language |
English |
format |
Conference paper |
record_format |
scopus |
collection |
Scopus |
_version_ |
1820775439303442432 |
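
The description above compares models by sensitivity, specificity, and accuracy derived from confusion matrices. As a minimal sketch of how those three measures follow from a confusion matrix, the Python below computes them from binary labels. It is not the authors' SAS Enterprise Miner workflow; the names `ConfusionMatrix` and `confusion_matrix` and the example labels are hypothetical and chosen only for illustration, and no figures from the paper are reproduced.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class ConfusionMatrix:
    """Counts for a binary classifier (1 = CVD present, 0 = CVD absent)."""
    tp: int  # true positives
    fp: int  # false positives
    tn: int  # true negatives
    fn: int  # false negatives

    def sensitivity(self) -> float:
        # Share of actual positives that the model flags as positive.
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        # Share of actual negatives that the model flags as negative.
        return self.tn / (self.tn + self.fp)

    def accuracy(self) -> float:
        # Share of all cases that the model classifies correctly.
        return (self.tp + self.tn) / (self.tp + self.fp + self.tn + self.fn)


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int]) -> ConfusionMatrix:
    """Tally the four cells; assumes both classes occur in y_true."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return ConfusionMatrix(tp=tp, fp=fp, tn=tn, fn=fn)


# Illustrative labels only (hypothetical, not data from the study).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1]
cm = confusion_matrix(y_true, y_pred)
print(cm.sensitivity(), cm.specificity(), cm.accuracy())
```

A model can lead on one measure and trail on another, which is consistent with the record's statement that the ANN had the highest sensitivity and accuracy while the decision tree had the highest specificity.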