Association features of SMOTE and ROSE for drug addiction relapse risk

Bibliographic Details
Published in: Journal of King Saud University - Computer and Information Sciences
Main Authors: Selamat N.A.; Abdullah A.; Mat Diah N.
Format: Article
Language: English
Published: King Saud bin Abdulaziz University 2022
Online Access: https://www.scopus.com/inward/record.uri?eid=2-s2.0-85133815944&doi=10.1016%2fj.jksuci.2022.06.012&partnerID=40&md5=614b8c8b77819d9ab2ba794176d6be85
id 2-s2.0-85133815944
Drug addiction is a major problem in many countries, and rehabilitation and treatment clinics play a critical role in aiding drug addicts’ recovery. The issue therefore calls for an effective automated system that can predict the likelihood of relapse in addicts. Such a system uses a dataset to train and test a machine learning algorithm for the automatic classification of drug patients. However, training a machine learning classifier is complicated by imbalanced classes, which can aggravate overfitting and hinder generalization performance. The study proposes an association feature rule that combines two of the most common over-sampling techniques, Synthetic Minority Over-Sampling Technique (SMOTE) and Random Over-Sampling Examples (ROSE), to balance the number of samples between classes while extending the problem feature space. The Random Forest algorithm is then employed to classify new instances. Cross-experiment results on the Validi drug relapse dataset show that the proposed combination approach outperforms the popular over-sampling and under-sampling approaches, indicating that the selected set of association features helps the relapse classification task. © 2022 The Authors
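
The abstract names the two over-samplers but does not spell out the association feature rule, so the following is only a minimal Python sketch of the two underlying generators, not the authors' method. It assumes numeric feature vectors and at least two minority samples. The function names, the neighbour count `k`, the noise `bandwidth`, and the simple alternation between generators are all illustrative assumptions. The Random Forest step is omitted.

```python
import math
import random


def euclidean(p, q):
    """Euclidean distance between two equal-length points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def smote_sample(minority, k, rng):
    """One SMOTE-style point: interpolate between a random minority point
    and one of its k nearest minority neighbours."""
    base = rng.choice(minority)
    # Rank the remaining minority points by distance to the base point.
    others = [p for p in minority if p is not base]
    others.sort(key=lambda p: euclidean(base, p))
    neighbour = rng.choice(others[:k])
    gap = rng.random()  # position along the segment base -> neighbour
    return tuple(b + gap * (n - b) for b, n in zip(base, neighbour))


def rose_sample(minority, bandwidth, rng):
    """One ROSE-style point: a smoothed bootstrap draw, i.e. a random
    minority point plus Gaussian noise (fixed bandwidth is an assumption)."""
    base = rng.choice(minority)
    return tuple(b + rng.gauss(0.0, bandwidth) for b in base)


def balance(majority, minority, k=3, bandwidth=0.1, seed=0):
    """Grow the minority class to the majority size.

    Alternating between the two generators is a hypothetical stand-in for
    the paper's combination rule, which the abstract does not describe.
    """
    rng = random.Random(seed)
    synthetic = []
    needed = len(majority) - len(minority)
    for i in range(needed):
        if i % 2 == 0:
            synthetic.append(smote_sample(minority, k, rng))
        else:
            synthetic.append(rose_sample(minority, bandwidth, rng))
    return minority + synthetic
```

Calling `balance(majority, minority)` returns the original minority samples followed by enough synthetic ones to match the majority class size. SMOTE points always lie on a segment between two real minority samples, whereas ROSE points may fall slightly outside the observed range because of the added noise.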

container_volume 34
container_issue 9
doi_str_mv 10.1016/j.jksuci.2022.06.012
issn 13191578
record_format scopus
collection Scopus