Predicting Student Drop-Out in Higher Institution Using Data Mining Techniques

Bibliographic Details
Published in: Journal of Physics: Conference Series
Main Author: Wan Yaacob W.F.; Mohd Sobri N.; Nasir S.A.M.; Wan Yaacob W.F.; Norshahidi N.D.; Wan Husin W.Z.
Format: Conference paper
Language: English
Published: Institute of Physics Publishing 2020
Online Access: https://www.scopus.com/inward/record.uri?eid=2-s2.0-85086722625&doi=10.1088%2f1742-6596%2f1496%2f1%2f012005&partnerID=40&md5=c3a97ef4990c662af106f14ca2e1b6c7
Description
Summary: The increasing number of students dropping out is a major concern for higher educational institutions, as it not only imposes costs on the students but also wastes public funds. Thus, it is imperative to understand which students are at risk of dropping out and which factors contribute to higher dropout rates. This can be done using educational data mining. In this paper, we describe the use of data mining techniques to predict dropout among Computer Science undergraduate students after 3 years of enrolment at Universiti Teknologi MARA. The experimental results showed that the selected algorithms can achieve reliable classification accuracy in predicting dropouts. Decision tree, logistic regression, random forest, K-nearest neighbour and neural network algorithms were compared to propose the best model. The results showed that some of the machine learning algorithms are able to establish effective predictive models from student retention data. The logistic regression model was found to be the best learner for predicting student dropout, and it identified subjects that are potential causes of dropout. In addition, we also present some findings related to data exploration. © 2020 Published under licence by IOP Publishing Ltd.
ISSN: 1742-6588
DOI: 10.1088/1742-6596/1496/1/012005