Applications of Machine Learning Algorithms for Examining the Impact of COVID-19 on the Dropout Rate for High Schools in the State of Illinois

Authors

Ngantchou, Claude

Issue Date

2025-12

Type

Dissertation

Language

en

Keywords

High School Dropout, Business, Engineering, Science, & Technological Innovation

Abstract

High school dropout remains a persistent and pressing issue in the United States and globally. This quantitative, non-experimental study aimed to develop a predictive model for high school dropout rates in the state of Illinois before and during the COVID-19 pandemic, covering the years 2017–2019 and 2020–2022, respectively. Publicly available datasets from the Illinois State Board of Education were analyzed using multiple linear regression, random forest, and XGBoost models to assess the impact of the pandemic and to identify which school-level features most strongly predict dropout outcomes. The study applied the CRISP-DM framework and interpreted results through the lens of survival analysis theory to address the problem of academic attrition over time. Research questions and hypotheses were tested using the three predictive models. The analysis identified mobility rate, COVID-19 period, and low-income enrollment as the most influential predictors of high school dropout, with mobility rate emerging as the top signal across models. Model performance was evaluated using R², mean absolute error (MAE), and root mean squared error (RMSE). The XGBoost model offered the best balance of predictive accuracy and computational efficiency, making it the most effective and preferred model for this study. Recommendations for future research are grounded in the study’s predictive scope and methodological limitations. Proposed next steps include evaluating model performance over extended timeframes, incorporating post–COVID-19 data, and exploring additional demographic and school-level predictors using time-aware validation and stratified replication. These extensions will strengthen the generalizability and practical value of predictive modeling for dropout prevention in educational settings.
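
The three evaluation metrics named in the abstract (R², MAE, and RMSE) can be illustrated with a minimal sketch. This is not the dissertation's code: the function name `regression_metrics` and the sample values are hypothetical, chosen only to show how each metric is computed from observed and predicted dropout rates.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (r2, mae, rmse) for paired observed and predicted values."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")

    mean_true = sum(y_true) / n
    residuals = [t - p for t, p in zip(y_true, y_pred)]

    ss_res = sum(r * r for r in residuals)              # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares

    mae = sum(abs(r) for r in residuals) / n            # mean absolute error
    rmse = math.sqrt(ss_res / n)                        # root mean squared error
    # R² is undefined when the observed values have no variance
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")
    return r2, mae, rmse


# Made-up dropout rates (percent), for illustration only
observed = [2.0, 4.0, 6.0, 8.0]
predicted = [2.5, 3.5, 6.5, 7.5]
r2, mae, rmse = regression_metrics(observed, predicted)
print(f"R2={r2:.2f}  MAE={mae:.2f}  RMSE={rmse:.2f}")
```

MAE and RMSE are expressed in the same units as the dropout rate, while R² is unitless, which is one reason the three are commonly reported together when comparing models.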