Summary: | Educational data mining has gained significant traction for its role in assessing students' academic achievement. However, ensuring that algorithms are compatible with the selected dataset requires a comprehensive analysis of those algorithms. This study developed machine learning models that use students' online learning activities to classify their academic performance. In the data cleaning stage, VarianceThreshold was used to discard features whose values are all zero. Data preprocessing combined feature selection and oversampling: information gain was used to select features, and the synthetic minority oversampling technique (SMOTE) was used to address class imbalance. In the classification phase, three supervised machine learning algorithms were implemented, namely k-nearest neighbors (KNN), multi-layer perceptron (MLP), and logistic regression (LR), with 3-fold cross-validation to enhance robustness. The classifiers were further refined through hyperparameter tuning with GridSearchCV. Accuracy, precision, recall, and F1-score were measured for each classifier. Both MLP and LR achieved scores of 100% across all metrics, while KNN showed a noticeable performance improvement after hyperparameter tuning. © 2024 Institute of Advanced Engineering and Science. All rights reserved.
|
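The feature-cleaning and feature-selection steps named in the summary can be illustrated with a minimal, standard-library-only sketch. It is not the study's implementation. It assumes discrete-valued features, uses a simple constant-column filter as a stand-in for a zero-variance threshold, and computes information gain as the label entropy minus the label entropy conditioned on the feature. The function names, feature names, and toy data are all hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(feature, labels):
    """Entropy of the labels minus their entropy conditioned on a discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def rank_features(columns, labels):
    """Drop constant columns, then sort the rest by information gain (highest first)."""
    informative = {name: col for name, col in columns.items() if len(set(col)) > 1}
    scores = {name: information_gain(col, labels) for name, col in informative.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy data: three activity features for six students.
columns = {
    "forum_posts": [0, 0, 0, 0, 0, 0],    # all zeros, so it is discarded
    "quiz_attempts": [1, 1, 1, 2, 2, 2],  # separates the two classes exactly
    "logins": [1, 2, 1, 2, 1, 2],         # only weakly related to the label
}
labels = ["fail", "fail", "fail", "pass", "pass", "pass"]

ranking = rank_features(columns, labels)
for name, score in ranking:
    print(f"{name}: {score:.4f}")
```

On this toy data the all-zero column is removed before scoring, and the feature that splits the classes exactly ranks first. Continuous features such as activity counts would need to be binned first, or scored with an estimator designed for continuous inputs.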