Identification of blood-based transcriptomics biomarkers for Alzheimer's disease using statistical and machine learning classifier
Alzheimer's disease (AD) is a neurodegenerative disorder characterised by the gradual progression of memory loss, impairment of cognitive function, and progressive disability. This study aims to identify potential transcriptomics biomarkers that characterise AD patients in Malaysia. Th...
Published in: | Informatics in Medicine Unlocked |
---|---|
Main Author: | Abdullah M.N. |
Format: | Article |
Language: | English |
Published: | Elsevier Ltd, 2022 |
Online Access: | https://www.scopus.com/inward/record.uri?eid=2-s2.0-85138033436&doi=10.1016%2fj.imu.2022.101083&partnerID=40&md5=3e96f501effffc42fe3e202d2643d02b |
id |
2-s2.0-85138033436 |
---|---|
author |
Abdullah M.N.; Wah Y.B.; Abdul Majeed A.B.; Zakaria Y.; Shaadan N. |
title |
Identification of blood-based transcriptomics biomarkers for Alzheimer's disease using statistical and machine learning classifier |
publishDate |
2022 |
container_title |
Informatics in Medicine Unlocked |
container_volume |
33 |
container_issue |
|
doi_str_mv |
10.1016/j.imu.2022.101083 |
url |
https://www.scopus.com/inward/record.uri?eid=2-s2.0-85138033436&doi=10.1016%2fj.imu.2022.101083&partnerID=40&md5=3e96f501effffc42fe3e202d2643d02b |
description |
Alzheimer's disease (AD) is a neurodegenerative disorder characterised by the gradual progression of memory loss, impairment of cognitive function, and progressive disability. This study aims to identify potential transcriptomics biomarkers that characterise AD patients in Malaysia. The sample comprises 92 AD patients and 92 non-AD subjects, with 22,254 genes. Boruta feature selection, used to reduce the dimensionality of the transcriptomics dataset, selected 68 genes. The classification performance of four statistical classifiers and three machine learning (ML) classifiers was evaluated based on sensitivity, precision, accuracy, and F-measure. On the test set, the F-measures for elastic net logistic regression (LR) (mean = 0.9, sd = 0.05) and random forest (mean = 0.79, sd = 0.06) were the highest compared with the other classifiers, while naïve Bayes had the lowest F-measure (mean = 0.74, sd = 0.07). The elastic net logistic regression results identified 16 potential biomarkers for AD patients in Malaysia (4 novel, 7 upregulated, and 5 downregulated). The elastic net logistic regression model with 16 transcript genes achieved 81.59% accuracy and 85.19% sensitivity. The F-measure statistic for this model was 0.8159. © 2022 The Author(s) |
publisher |
Elsevier Ltd |
issn |
23529148 |
language |
English |
format |
Article |
accesstype |
All Open Access; Gold Open Access |
record_format |
scopus |
collection |
Scopus |
_version_ |
1809677783039737856 |
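
The description above reports sensitivity, precision, accuracy, and F-measure without stating how they are computed. The following is a minimal sketch of the standard definitions, assuming a binary AD versus non-AD confusion matrix. The function name `classification_metrics` and the example counts are hypothetical and are not taken from the article; they do not reproduce the reported results.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return standard binary-classification metrics from confusion-matrix counts.

    tp: AD patients correctly classified as AD (true positives)
    fp: non-AD subjects classified as AD (false positives)
    tn: non-AD subjects correctly classified as non-AD (true negatives)
    fn: AD patients classified as non-AD (false negatives)
    """
    total = tp + fp + tn + fn
    # Sensitivity (recall): share of actual AD cases that were detected.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Precision: share of predicted AD cases that are actual AD cases.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Accuracy: share of all subjects classified correctly.
    accuracy = (tp + tn) / total if total else 0.0
    # F-measure (F1): harmonic mean of precision and sensitivity.
    denom = precision + sensitivity
    f_measure = 2 * precision * sensitivity / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "precision": precision,
        "accuracy": accuracy,
        "f_measure": f_measure,
    }


# Hypothetical counts for illustration only (not data from the study).
metrics = classification_metrics(tp=8, fp=2, tn=6, fn=2)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Because the F-measure is a harmonic mean, it is pulled towards the lower of precision and sensitivity, so it generally differs from accuracy unless the error counts happen to balance.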