Hypertension Prediction in Adolescents Using Anthropometric Measurements: Do Machine Learning Models Perform Equally Well?

The use of anthropometric measurements in machine learning algorithms for hypertension prediction enables the development of simple, non‐invasive prediction models. However, different machine learning algorithms were utilized in conjunction with various anthropometric data, either alone or in combin...

Bibliographic Details
Published in:	Applied Sciences (Switzerland)
Main Authors:	Chai S.S.; Goh K.L.; Cheah W.L.; Chang Y.H.R.; Ng G.W.
Format:	Article
Language:	English
Published:	MDPI 2022
Online Access:	https://www.scopus.com/inward/record.uri?eid=2-s2.0-85124048353&doi=10.3390%2fapp12031600&partnerID=40&md5=3bf191d30724e2a933e6c61159115d7a

id	2-s2.0-85124048353
spelling	2-s2.0-85124048353 Chai S.S.; Goh K.L.; Cheah W.L.; Chang Y.H.R.; Ng G.W. Hypertension Prediction in Adolescents Using Anthropometric Measurements: Do Machine Learning Models Perform Equally Well? 2022 Applied Sciences (Switzerland) 12 3 10.3390/app12031600 https://www.scopus.com/inward/record.uri?eid=2-s2.0-85124048353&doi=10.3390%2fapp12031600&partnerID=40&md5=3bf191d30724e2a933e6c61159115d7a The use of anthropometric measurements in machine learning algorithms for hypertension prediction enables the development of simple, non‐invasive prediction models. However, different machine learning algorithms were utilized in conjunction with various anthropometric data, either alone or in combination with other biophysical and lifestyle variables. It is essential to assess the impacts of the chosen machine learning models using simple anthropometric measurements. We developed and tested 13 machine learning methods of neural network, ensemble, and classical categories to predict hypertension in adolescents using only simple anthropometric measurements. The imbalanced dataset of 2461 samples with 30.1% hypertension subjects was first partitioned into 90% for training and 10% for validation. The training dataset was reduced to eight simple anthropometric measurements: age, C index, ethnicity, gender, height, location, parental hypertension, and waist circumference using correlation coefficient. The Synthetic Minority Oversampling Technique (SMOTE) combined with random under‐sampling was used to balance the dataset. The models with optimal hyperparameters were assessed using accuracy, precision, sensitivity, specificity, F1‐score, misclassification rate, and AUC on the testing dataset. Across all seven performance measures, no model consistently outperformed the others. LightGBM was the best model for all six performance metrics, except sensitivity, whereas Decision Tree was the worst. We proposed using Bayes’ Theorem to assess the models’ applicability in the Sarawak adolescent population, resulting in the top four models being LightGBM, Random Forest, XGBoost, and CatBoost, and the bottom four models being Logistic Regression, LogitBoost, SVM, and Decision Tree. This study demonstrates that the choice of machine learning models has an effect on the prediction outcomes. © 2022 by the authors. Licensee MDPI, Basel, Switzerland. MDPI 20763417 English Article All Open Access; Gold Open Access
author	Chai S.S.; Goh K.L.; Cheah W.L.; Chang Y.H.R.; Ng G.W.
spellingShingle	Chai S.S.; Goh K.L.; Cheah W.L.; Chang Y.H.R.; Ng G.W. Hypertension Prediction in Adolescents Using Anthropometric Measurements: Do Machine Learning Models Perform Equally Well?
author_facet	Chai S.S.; Goh K.L.; Cheah W.L.; Chang Y.H.R.; Ng G.W.
author_sort	Chai S.S.; Goh K.L.; Cheah W.L.; Chang Y.H.R.; Ng G.W.
title	Hypertension Prediction in Adolescents Using Anthropometric Measurements: Do Machine Learning Models Perform Equally Well?
title_short	Hypertension Prediction in Adolescents Using Anthropometric Measurements: Do Machine Learning Models Perform Equally Well?
title_full	Hypertension Prediction in Adolescents Using Anthropometric Measurements: Do Machine Learning Models Perform Equally Well?
title_fullStr	Hypertension Prediction in Adolescents Using Anthropometric Measurements: Do Machine Learning Models Perform Equally Well?
title_full_unstemmed	Hypertension Prediction in Adolescents Using Anthropometric Measurements: Do Machine Learning Models Perform Equally Well?
title_sort	Hypertension Prediction in Adolescents Using Anthropometric Measurements: Do Machine Learning Models Perform Equally Well?
publishDate	2022
container_title	Applied Sciences (Switzerland)
container_volume	12
container_issue	3
doi_str_mv	10.3390/app12031600
url	https://www.scopus.com/inward/record.uri?eid=2-s2.0-85124048353&doi=10.3390%2fapp12031600&partnerID=40&md5=3bf191d30724e2a933e6c61159115d7a
description	The use of anthropometric measurements in machine learning algorithms for hypertension prediction enables the development of simple, non‐invasive prediction models. However, different machine learning algorithms were utilized in conjunction with various anthropometric data, either alone or in combination with other biophysical and lifestyle variables. It is essential to assess the impacts of the chosen machine learning models using simple anthropometric measurements. We developed and tested 13 machine learning methods of neural network, ensemble, and classical categories to predict hypertension in adolescents using only simple anthropometric measurements. The imbalanced dataset of 2461 samples with 30.1% hypertension subjects was first partitioned into 90% for training and 10% for validation. The training dataset was reduced to eight simple anthropometric measurements: age, C index, ethnicity, gender, height, location, parental hypertension, and waist circumference using correlation coefficient. The Synthetic Minority Oversampling Technique (SMOTE) combined with random under‐sampling was used to balance the dataset. The models with optimal hyperparameters were assessed using accuracy, precision, sensitivity, specificity, F1‐score, misclassification rate, and AUC on the testing dataset. Across all seven performance measures, no model consistently outperformed the others. LightGBM was the best model for all six performance metrics, except sensitivity, whereas Decision Tree was the worst. We proposed using Bayes’ Theorem to assess the models’ applicability in the Sarawak adolescent population, resulting in the top four models being LightGBM, Random Forest, XGBoost, and CatBoost, and the bottom four models being Logistic Regression, LogitBoost, SVM, and Decision Tree. This study demonstrates that the choice of machine learning models has an effect on the prediction outcomes. © 2022 by the authors. Licensee MDPI, Basel, Switzerland.
publisher	MDPI
issn	20763417
language	English
format	Article
accesstype	All Open Access; Gold Open Access
record_format	scopus
collection	Scopus
_version_	1809678025106653184
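
Note on the evaluation measures named in the description: the sketch below shows how six of the seven listed measures (all except AUC) follow from a confusion matrix, and one common way Bayes' Theorem can turn sensitivity and specificity into the probability that a positive prediction is correct at a given population prevalence. It is an illustrative sketch only and not necessarily the authors' formulation. The function names and the counts in the usage note are hypothetical and are not results from the article.

```python
def confusion_metrics(tp, fp, tn, fn):
    """Compute six threshold-based measures from confusion-matrix counts.

    tp/fp/tn/fn are counts of true positives, false positives,
    true negatives, and false negatives (hypothetical inputs).
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)   # true positive rate (recall)
    specificity = tn / (tn + fp)   # true negative rate
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
        "misclassification_rate": 1 - accuracy,
    }


def posterior_positive(sensitivity, specificity, prevalence):
    """Bayes' Theorem: P(hypertensive | model predicts hypertensive).

    prevalence is the assumed prior P(hypertensive) in the target population.
    """
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)
```

For example, with made-up counts tp=60, fp=22, tn=150, fn=14, calling `confusion_metrics(60, 22, 150, 14)` gives the six measures, and `posterior_positive(m["sensitivity"], m["specificity"], 0.301)` applies the 30.1% prevalence reported for the dataset as the prior. A lower assumed prevalence yields a lower posterior for the same sensitivity and specificity, which is why a population-specific prior matters when comparing models.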

Similar Items