Summary: | Drug addiction is a major problem in many countries, and rehabilitation and treatment clinics play a critical role in supporting addicts' recovery. An effective automated system that predicts the likelihood of relapse would therefore be valuable. Such a system trains and tests a machine learning classifier on a dataset of patients in drug treatment. However, the training data for this task have imbalanced classes, which can increase overfitting and hinder generalization performance. This study proposes an association feature rule that combines two of the most common over-sampling techniques, the Synthetic Minority Over-Sampling Technique (SMOTE) and Random Over-Sampling Examples (ROSE), to balance the number of samples between classes while extending the feature space of the problem. The Random Forest algorithm is then used to classify new instances. Cross-experiment results on the Validi drug relapse dataset show that the proposed combination outperforms popular over-sampling and under-sampling approaches, indicating that the selected set of association features helps the relapse classification task. © 2022 The Authors
|
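The over-sampling idea mentioned in the summary can be illustrated with a minimal sketch of SMOTE-style interpolation, written in plain Python. This is not the authors' implementation: it does not reproduce the association feature rule, ROSE, or the Random Forest step. The function name `smote_oversample`, its parameters, and the toy data are all hypothetical and chosen only for illustration. The sketch assumes numeric features and at least two minority samples.

```python
import random
from collections import Counter


def smote_oversample(X, y, minority_label, k=3, seed=0):
    """Append synthetic minority rows until the minority class matches
    the size of the largest class (hypothetical helper, SMOTE-style)."""
    rng = random.Random(seed)
    minority = [x for x, label in zip(X, y) if label == minority_label]
    n_needed = max(Counter(y).values()) - len(minority)

    def sq_dist(a, b):
        # Squared Euclidean distance between two feature vectors.
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    synthetic = []
    for _ in range(n_needed):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point, excluding itself.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sq_dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        # New point lies on the segment between base and neighbour.
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])

    return X + synthetic, y + [minority_label] * n_needed


# Toy, made-up data: label 1 is the minority class (3 rows vs 5 rows).
X = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
     [1.0, 1.0], [1.1, 0.9], [0.9, 1.2], [1.2, 1.1], [1.0, 0.8]]
y = [1, 1, 1, 0, 0, 0, 0, 0]

X_bal, y_bal = smote_oversample(X, y, minority_label=1)
print(Counter(y_bal))
```

Each synthetic row is a random point on the line segment between a minority sample and one of its nearest minority neighbours, which is the core of SMOTE. In practice the balanced data would be produced on the training split only and then passed to a classifier such as a Random Forest, so that synthetic rows never leak into the test set.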