Predicting Student Drop-Out in Higher Institution Using Data Mining Techniques

The increasing number of students dropping out is a major concern for higher education institutions, as it imposes costs on students and wastes public funds. Thus, it is imperative to understand which students are at risk of dropping out and what are the factors tha...

Bibliographic Details
Published in: Journal of Physics: Conference Series
Main Author: Wan Yaacob W.F.; Mohd Sobri N.; Nasir S.A.M.; Norshahidi N.D.; Wan Husin W.Z.
Format: Conference paper
Language: English
Published: Institute of Physics Publishing 2020
Online Access: https://www.scopus.com/inward/record.uri?eid=2-s2.0-85086722625&doi=10.1088%2f1742-6596%2f1496%2f1%2f012005&partnerID=40&md5=c3a97ef4990c662af106f14ca2e1b6c7
id 2-s2.0-85086722625
author Wan Yaacob W.F.; Mohd Sobri N.; Nasir S.A.M.; Norshahidi N.D.; Wan Husin W.Z.
title Predicting Student Drop-Out in Higher Institution Using Data Mining Techniques
publishDate 2020
container_title Journal of Physics: Conference Series
container_volume 1496
container_issue 1
doi_str_mv 10.1088/1742-6596/1496/1/012005
url https://www.scopus.com/inward/record.uri?eid=2-s2.0-85086722625&doi=10.1088%2f1742-6596%2f1496%2f1%2f012005&partnerID=40&md5=c3a97ef4990c662af106f14ca2e1b6c7
description The increasing number of students dropping out is a major concern for higher education institutions, as it imposes costs on students and wastes public funds. It is therefore imperative to understand which students are at risk of dropping out and which factors contribute to higher dropout rates. This can be done using educational data mining. In this paper, we describe the use of data mining techniques to predict dropout among Computer Science undergraduate students after 3 years of enrolment at Universiti Teknologi MARA. The experimental results showed that the selected algorithms can achieve reliable classification accuracy in predicting dropouts. Decision tree, logistic regression, random forest, K-nearest neighbour and neural network algorithms were compared to propose the best model. The results showed that some of the machine learning algorithms are able to establish effective predictive models from student retention data. The logistic regression model was found to be the best learner for predicting dropout students, with potential subject causes identified. In addition, we present some findings from data exploration. © 2020 Published under licence by IOP Publishing Ltd.
publisher Institute of Physics Publishing
issn 17426588
language English
format Conference paper
accesstype All Open Access; Gold Open Access
record_format scopus
collection Scopus