Development of machine learning algorithms in student performance classification based on online learning activities


Bibliographic Details
Published in: International Journal of Electrical and Computer Engineering
Main Author: Alias M.A.H.; Aziz M.A.A.; Hambali N.; Taib M.N.
Format: Article
Language: English
Published: Institute of Advanced Engineering and Science 2024
Online Access: https://www.scopus.com/inward/record.uri?eid=2-s2.0-85206088644&doi=10.11591%2fijece.v14i6.pp7126-7136&partnerID=40&md5=53a004de08c33d835afcde4b91a18fdd
Description
Summary: Educational data mining has gained significant traction for its role in assessing students' academic achievement. However, to ensure that the algorithms are compatible with the selected dataset, a comprehensive analysis of the algorithms is required. This study developed machine learning algorithms that use students' online learning activities to classify their academic performance. In the data cleaning stage, VarianceThreshold was used to discard features consisting entirely of zeros. Data preprocessing combined feature selection and oversampling: information gain was used to select features, and the synthetic minority oversampling technique (SMOTE) was used to address class imbalance. In the classification phase, three supervised machine learning algorithms were implemented, namely k-nearest neighbors (KNN), multi-layer perceptron (MLP), and logistic regression (LR), with 3-fold cross-validation to enhance robustness. The classifiers were then refined through hyperparameter tuning with GridSearchCV. Accuracy, precision, recall, and F1-score were measured for each classifier. Both MLP and LR achieved 100% on all four metrics, while KNN showed a noticeable performance improvement after hyperparameter tuning. © 2024 Institute of Advanced Engineering and Science. All rights reserved.
ISSN: 2088-8708
DOI: 10.11591/ijece.v14i6.pp7126-7136
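
Illustrative note: the summary names information gain for feature selection and SMOTE for class imbalance. The sketch below shows, in plain Python, the idea behind each technique. It is not the authors' code, and it does not use the libraries named in the record. The function names (`entropy`, `information_gain`, `smote_like`), the parameter `k`, and the toy data are all assumptions made for illustration. The information gain shown here is the textbook entropy-based definition for a discrete feature, and `smote_like` is a simplified version of SMOTE's interpolation step.

```python
import math
import random
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    return sum(-(n / total) * math.log2(n / total)
               for n in Counter(labels).values())


def information_gain(feature, labels):
    """Label entropy minus the entropy left after splitting on a discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    # Weighted average entropy of the labels within each feature value.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def smote_like(minority, n_new, k=2, seed=0):
    """Simplified SMOTE step (assumed form): each synthetic point lies on the
    segment between a random minority sample and one of its k nearest
    minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


# Hypothetical toy data: one activity flag per student and a pass/fail label.
labels = ["pass", "pass", "fail", "fail"]
informative_flag = [0, 0, 1, 1]    # perfectly separates the two classes
uninformative_flag = [0, 1, 0, 1]  # tells us nothing about the class

gain_informative = information_gain(informative_flag, labels)
gain_uninformative = information_gain(uninformative_flag, labels)

# Hypothetical minority-class samples with two numeric features each.
minority_samples = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
new_samples = smote_like(minority_samples, n_new=5)
```

A feature with higher information gain is ranked as more useful for classification, and the synthetic minority samples would be appended to the training data before a classifier is fitted. In practice a maintained library implementation would normally be preferred over a hand-written sketch like this one.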