Student performance classification: a comparison of feature selection methods based on online learning activities

Bibliographic Details
Published in: International Journal of Electrical and Computer Engineering
Main Author: Alias M.A.H.; Aziz M.A.A.; Hambali N.; Taib M.N.
Format: Article
Language: English
Published: Institute of Advanced Engineering and Science 2024
Online Access: https://www.scopus.com/inward/record.uri?eid=2-s2.0-85195190065&doi=10.11591%2fijece.v14i4.pp4675-4685&partnerID=40&md5=1571a2be942ebed4a74e406791892710
Description
Summary: The classification of student performance involves categorizing students' performance using input data such as demographic information and examination results. However, our study introduces a novel approach by emphasizing students' online learning activities as a rich data source. To avoid misinterpretation during classification, we present a study comparing several feature selection (FS) methods combined with an artificial neural network (ANN) for classifying students' performance based on their online learning activities. First, we tackled the issue of missing values by cleaning the data using a variance threshold. Feature selection techniques were then implemented, encompassing both filter-based (information gain, chi-square, Pearson correlation) and wrapper-based sequential selection (forward and backward) techniques. In the classification stage, a multi-layer perceptron (MLP) was used with the default hyperparameters, and 5-fold cross-validation along with the synthetic minority oversampling technique (SMOTE) was applied to each method. We evaluated each feature selection method's performance using key metrics: accuracy, precision, recall, and F1-score. The outcomes highlighted information gain and sequential selection (forward and backward) as the top-performing methods, all achieving 100% accuracy. This research underscores the potential of leveraging online learning activities for robust student performance classification within the specified constraints. © 2024 Institute of Advanced Engineering and Science. All rights reserved.
ISSN: 2088-8708
DOI: 10.11591/ijece.v14i4.pp4675-4685
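
Illustrative note: the summary names information gain as one of the filter-based feature selection methods. This record does not describe the authors' implementation, so the following is only a minimal plain-Python sketch of how information gain can rank discrete features against a class label. The feature names (`logins`, `forum_posts`, `constant`), the toy data, and the helper functions are all hypothetical and are not taken from the article.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of discrete labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(feature, labels):
    """Entropy of the labels minus their expected entropy after splitting on the feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def rank_features(columns, labels):
    """Return (name, gain) pairs sorted from most to least informative."""
    gains = {name: information_gain(col, labels) for name, col in columns.items()}
    return sorted(gains.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy data: binned online-activity levels for six students.
columns = {
    "logins": ["high", "high", "high", "low", "low", "low"],
    "forum_posts": ["high", "low", "high", "low", "high", "low"],
    "constant": ["x", "x", "x", "x", "x", "x"],
}
labels = ["pass", "pass", "pass", "fail", "fail", "fail"]

ranking = rank_features(columns, labels)
for name, gain in ranking:
    print(f"{name}: {gain:.3f}")
```

In this toy data, `logins` separates the two classes perfectly and ranks first, while the constant column carries no information and ranks last. A filter method would keep the top-ranked features before passing them to a classifier; continuous activity counts would need to be binned first, or scored with an estimator designed for continuous inputs.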