Dropout early warning systems for high school students using machine learning

Research output: Contribution to journal › Article › peer-review

105 Scopus citations


Student dropout is a serious problem for students, society, and policy makers. Predictive modeling using machine learning has great potential for developing early warning systems that identify students at risk of dropping out in advance and help them. In this study, we use random forests, a machine learning method, to predict which students are at risk of dropping out. The data are a sample of 165,715 high school students from the 2014 National Education Information System (NEIS), a national system for educational administration information that connects, through the Internet, around 12,000 elementary and secondary schools, 17 city/provincial offices of education, and the Ministry of Education in Korea. Our predictive model showed excellent performance in predicting student dropout across various performance metrics for binary classification. The results demonstrate the benefit of applying machine learning to students’ big data in education. We briefly overview machine learning in general and the random forests model, and present the performance metrics used to evaluate our predictive model.
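The abstract refers to "various performance metrics for binary classification" without listing them. As a rough illustration only, the sketch below computes several commonly used ones (accuracy, precision, recall, specificity, F1) from true and predicted labels. The choice of metrics, the function name `binary_metrics`, and the toy labels are assumptions made for this example; they are not taken from the study or its data.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute common binary classification metrics.

    Labels are assumed to be 1 (dropout) and 0 (no dropout).
    A ratio with a zero denominator is reported as 0.0 (a convention
    chosen for this sketch).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Confusion-matrix counts.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    def ratio(num: float, den: float) -> float:
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)  # also called sensitivity
    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "precision": precision,
        "recall": recall,
        "specificity": ratio(tn, tn + fp),
        "f1": ratio(2 * precision * recall, precision + recall),
    }


# Hypothetical toy labels, for illustration only (not data from the study).
toy_true = [1, 1, 0, 0, 0, 0, 1, 0]
toy_pred = [1, 0, 0, 0, 1, 0, 1, 0]
toy_scores = binary_metrics(toy_true, toy_pred)
```

Because dropouts are typically a small minority of students, accuracy alone can look high even for a model that flags no one, which is one reason to report recall and precision alongside it.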

Original language: English
Pages (from-to): 346-353
Number of pages: 8
Journal: Children and Youth Services Review
State: Published - Jan 2019

Bibliographical note

Publisher Copyright:
© 2018 Elsevier Ltd


Keywords

  • Big data
  • Dropout
  • Machine learning
  • Predictive model
  • Random forests model

