Association features of smote and rose for drug addiction relapse risk
Drug addiction is a major problem in many countries, with rehabilitation and treatment clinics playing a critical role in aiding drug addicts’ recovery. Thus, the issue requires an effective automated system that can predict the likelihood of relapse in addicts. The system uses a dataset to train an...
Published in: | Journal of King Saud University - Computer and Information Sciences |
---|---|
Main Author: | Selamat N.A.; Abdullah A.; Mat Diah N. |
Format: | Article |
Language: | English |
Published: | King Saud bin Abdulaziz University 2022 |
Online Access: | https://www.scopus.com/inward/record.uri?eid=2-s2.0-85133815944&doi=10.1016%2fj.jksuci.2022.06.012&partnerID=40&md5=614b8c8b77819d9ab2ba794176d6be85 |
id |
2-s2.0-85133815944 |
---|---|
author |
Selamat N.A.; Abdullah A.; Mat Diah N. |
title |
Association features of smote and rose for drug addiction relapse risk |
publishDate |
2022 |
container_title |
Journal of King Saud University - Computer and Information Sciences |
container_volume |
34 |
container_issue |
9 |
doi_str_mv |
10.1016/j.jksuci.2022.06.012 |
url |
https://www.scopus.com/inward/record.uri?eid=2-s2.0-85133815944&doi=10.1016%2fj.jksuci.2022.06.012&partnerID=40&md5=614b8c8b77819d9ab2ba794176d6be85 |
description |
Drug addiction is a major problem in many countries, with rehabilitation and treatment clinics playing a critical role in aiding drug addicts’ recovery. Thus, the issue requires an effective automated system that can predict the likelihood of relapse in addicts. The system uses a dataset to train and test a machine learning algorithm for the automatic classification of drug patients. Nonetheless, training a machine learning classifier on imbalanced classes can increase overfitting and hinder generalization performance. The study proposed an association feature rule to combine the two most common over-sampling techniques, Synthetic Minority Over-Sampling Technique (SMOTE) and Random Over-Sampling Examples (ROSE), to balance the number of samples between classes, extending the problem feature space. Accordingly, the Random Forest algorithm is employed to classify new instances. Cross-experiment results on the Validi drug relapse dataset showed that the proposed combination approach outperforms the popular over-sampling and under-sampling approaches, indicating that the selected set of association features helps the relapse classification task. © 2022 The Authors |
publisher |
King Saud bin Abdulaziz University |
issn |
13191578 |
language |
English |
format |
Article |
record_format |
scopus |
collection |
Scopus |
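The description above mentions balancing classes by combining SMOTE and ROSE. As a rough illustration only, the sketch below shows the two generic over-sampling ideas in plain Python: SMOTE-style interpolation between minority neighbours, and ROSE-style smoothed bootstrap (resampling with added noise). It does not reproduce the article's association feature rule or its Random Forest classifier. The function names `smote_like`, `rose_like`, and `balance`, the fixed `bandwidth`, and the even split between the two generators are all assumptions made for this example.

```python
import random


def smote_like(minority, n_new, k=3, rng=None):
    """SMOTE-style sketch: interpolate between a minority point and a near neighbour."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = rng or random.Random(0)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of `base` by squared Euclidean distance
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, other)))
    return synthetic


def rose_like(minority, n_new, bandwidth=0.1, rng=None):
    """ROSE-style sketch: resample a minority point and add Gaussian noise.

    The fixed `bandwidth` is an assumption; ROSE itself derives the kernel
    width from the data.
    """
    rng = rng or random.Random(0)
    return [
        tuple(x + rng.gauss(0.0, bandwidth) for x in rng.choice(minority))
        for _ in range(n_new)
    ]


def balance(majority, minority, rng=None):
    """Grow the minority class to the majority size, half per generator (assumed split)."""
    rng = rng or random.Random(0)
    deficit = len(majority) - len(minority)
    if deficit <= 0:
        return list(minority)
    half = deficit // 2
    return (
        list(minority)
        + smote_like(minority, half, rng=rng)
        + rose_like(minority, deficit - half, rng=rng)
    )


if __name__ == "__main__":
    majority = [(float(i), float(i)) for i in range(10)]
    minority = [(0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
    balanced = balance(majority, minority)
    print(len(majority), len(balanced))
```

In practice a maintained library implementation would normally be preferred over hand-written generators; the sketch is only meant to make the two resampling ideas concrete.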