Identification of blood-based transcriptomics biomarkers for Alzheimer's disease using statistical and machine learning classifier

Alzheimer's disease (AD) is a neurodegenerative disorder characterised by gradual memory loss, impaired cognitive function, and progressive disability. This study aims to identify potential transcriptomics biomarkers that characterise AD patients in Malaysia. Th...

Bibliographic Details
Published in: Informatics in Medicine Unlocked
Main Authors: Abdullah M.N.; Wah Y.B.; Abdul Majeed A.B.; Zakaria Y.; Shaadan N.
Format: Article
Language: English
Published: Elsevier Ltd 2022
Online Access: https://www.scopus.com/inward/record.uri?eid=2-s2.0-85138033436&doi=10.1016%2fj.imu.2022.101083&partnerID=40&md5=3e96f501effffc42fe3e202d2643d02b
ID: 2-s2.0-85138033436
Volume: 33
DOI: 10.1016/j.imu.2022.101083
Description: Alzheimer's disease (AD) is a neurodegenerative disorder characterised by gradual memory loss, impaired cognitive function, and progressive disability. This study aims to identify potential transcriptomics biomarkers that characterise AD patients in Malaysia. The sample comprised 92 AD patients and 92 non-AD subjects, with 22,254 genes. Boruta feature selection, a method for reducing the dimensionality of the transcriptomics dataset, selected 68 genes. The classification performance of four statistical classifiers and three machine learning (ML) classifiers was evaluated on sensitivity, precision, accuracy, and F-measure. On the test set, the F-measures for elastic net logistic regression (LR) (mean = 0.9, sd = 0.05) and random forest (mean = 0.79, sd = 0.06) were the highest compared with the other ML classifiers, while naïve Bayes had the lowest F-measure (mean = 0.74, sd = 0.07). The elastic net logistic regression results identified 16 potential biomarkers for AD patients in Malaysia (4 novel, 7 upregulated, and 5 downregulated). The elastic net logistic regression model with 16 transcript genes achieved 81.59% accuracy and 85.19% sensitivity, with an F-measure of 0.8159. © 2022 The Author(s)
ISSN: 23529148
Access Type: All Open Access; Gold Open Access
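
The abstract reports sensitivity, precision, accuracy, and F-measure for each classifier. As a rough illustration of how these four statistics relate, the sketch below computes them from confusion-matrix counts using their standard definitions. The class name `ConfusionCounts`, the function names, and the example counts are all hypothetical; they are not taken from the article, and the code is not the authors' evaluation pipeline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts for a binary AD vs. non-AD classifier (hypothetical container)."""
    tp: int  # AD subjects predicted as AD
    fp: int  # non-AD subjects predicted as AD
    fn: int  # AD subjects predicted as non-AD
    tn: int  # non-AD subjects predicted as non-AD


def sensitivity(c: ConfusionCounts) -> float:
    """Share of true AD subjects that the classifier detects (recall)."""
    return c.tp / (c.tp + c.fn)


def precision(c: ConfusionCounts) -> float:
    """Share of AD predictions that are correct."""
    return c.tp / (c.tp + c.fp)


def accuracy(c: ConfusionCounts) -> float:
    """Share of all subjects classified correctly."""
    return (c.tp + c.tn) / (c.tp + c.fp + c.fn + c.tn)


def f_measure(c: ConfusionCounts) -> float:
    """Harmonic mean of precision and sensitivity (F1)."""
    p, r = precision(c), sensitivity(c)
    return 2 * p * r / (p + r)


# Illustrative counts only; these are NOT the study's results.
example = ConfusionCounts(tp=9, fp=3, fn=1, tn=7)

if __name__ == "__main__":
    print(f"sensitivity = {sensitivity(example):.4f}")
    print(f"precision   = {precision(example):.4f}")
    print(f"accuracy    = {accuracy(example):.4f}")
    print(f"F-measure   = {f_measure(example):.4f}")
```

Because the F-measure is a harmonic mean, it always lies between precision and sensitivity and is pulled toward the lower of the two. The sketch assumes every denominator is non-zero, so it does not handle a classifier that predicts no positives.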