Risk of Cardiovascular Heart Disease Using Data Mining Approach

Bibliographic Details
Published in: 2024 5th International Conference on Artificial Intelligence and Data Sciences, AiDAS 2024 - Proceedings
Main Author: Shafri N.A.A.; Nizam N.A.A.M.; Hussin S.A.S.; Zahid Z.
Format: Conference paper
Language: English
Published: Institute of Electrical and Electronics Engineers Inc. 2024
Online Access: https://www.scopus.com/inward/record.uri?eid=2-s2.0-85209656603&doi=10.1109%2fAiDAS63860.2024.10730336&partnerID=40&md5=7adac5601deb3e49f427011c88c85059
id 2-s2.0-85209656603
DOI: 10.1109/AiDAS63860.2024.10730336
Cardiovascular heart disease (CVD) is the leading cause of death worldwide, and its prevalence increases notably with age. This study used a data mining approach to identify the key risk variables for CVD, selecting the most effective model among decision trees, logistic regression, and artificial neural networks based on performance. The risk factors comprised variables such as sex, age, education, smoking habits, medical history, and physiological measures. After the dataset was analysed and cleaned, the models were validated using confusion matrices and ROC curves. SAS Enterprise Miner was used to determine variable importance, which showed that age significantly affects CVD risk in the decision tree model. In logistic regression, age emerged as the most important variable, with the lowest p-value (p = 0.0036). The artificial neural network (ANN) highlighted seven variables with high R-squared values, indicating their contribution to CVD risk. The results indicate that the ANN achieved the highest sensitivity and accuracy, while the decision tree achieved the highest specificity. In conclusion, the comparative analysis identifies the ANN as the best model for identifying CVD risk factors in the given dataset. © 2024 IEEE.
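The evaluation measures named in the abstract (sensitivity, specificity, and accuracy) are all derived from a confusion matrix. The following is a minimal Python sketch of those definitions only; it is not the authors' SAS Enterprise Miner workflow. The class name, the helper function, and the example labels are hypothetical and do not come from the study.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class ConfusionMatrix:
    """Counts for a binary classifier (1 = CVD, 0 = no CVD). Hypothetical helper."""
    tp: int  # true positives
    fp: int  # false positives
    tn: int  # true negatives
    fn: int  # false negatives

    @property
    def sensitivity(self) -> float:
        # Share of actual positives that the model flags as positive.
        positives = self.tp + self.fn
        return self.tp / positives if positives else float("nan")

    @property
    def specificity(self) -> float:
        # Share of actual negatives that the model flags as negative.
        negatives = self.tn + self.fp
        return self.tn / negatives if negatives else float("nan")

    @property
    def accuracy(self) -> float:
        # Share of all cases that the model classifies correctly.
        total = self.tp + self.fp + self.tn + self.fn
        return (self.tp + self.tn) / total if total else float("nan")


def confusion_from_labels(y_true: Sequence[int], y_pred: Sequence[int]) -> ConfusionMatrix:
    """Tally a confusion matrix from paired true and predicted binary labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return ConfusionMatrix(tp=tp, fp=fp, tn=tn, fn=fn)


# Illustrative labels only; these are not data from the study.
example_true = [1, 1, 1, 0, 0, 0, 0, 1]
example_pred = [1, 0, 1, 0, 0, 1, 0, 1]
example_cm = confusion_from_labels(example_true, example_pred)
```

Comparing models on these three measures separately, as the abstract does, matters because a model can rank first on sensitivity while another ranks first on specificity.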