Author Type

Graduate Student

Date of Award

Spring 3-19-2026

Document Type

Dissertation

Publication Status

Version of Record

Submission Date

April 2026

Department

Computer and Electrical Engineering and Computer Science

College Granting Degree

College of Engineering and Computer Science

Department Granting Degree

Electrical Engineering and Computer Science

Degree Name

Doctor of Philosophy (PhD)

Thesis/Dissertation Advisor [Chair]

Mihaela Cardei

Abstract

Early identification of students at risk of academic failure is essential for timely pedagogical interventions and for reducing dropout rates. While Artificial Intelligence (AI) has significantly advanced predictive modeling in education, two primary challenges persist: effectively modeling the complex, evolving relationships within heterogeneous educational data, and ensuring the reliability of model outputs for high-stakes decision-making. This dissertation addresses these challenges by proposing a comprehensive framework for early and continuous student performance prediction, applied to the Open University Learning Analytics (OULA) dataset. First, we introduce a Heterogeneous Graph Neural Network (HGNN) approach that uses metapath structures to capture latent interactions between diverse educational entities. By integrating dynamic assessment features, this model achieves a 68.6% F1 score within the first 8% of the semester and reaches up to 89.5% near the semester’s end, outperforming traditional Machine Learning (ML) baselines. Second, we investigate the temporal evolution of student performance by comparing static models against temporal HGNN and ML architectures. To evaluate the impact of different model components, we conduct extensive experiments on the OULA dataset, including a comparison of simple vs. cumulative student performance features, an ablation study on feature assignment, and metapath selection. The results indicate that accounting for the cumulative nature of student data yields improvements of up to 10.1% in F1 scores early in the term, confirming the value of temporal feature engineering. Finally, to bridge the gap between predictive power and practical accountability, we incorporate Conformal Prediction (CP) to quantify model uncertainty. By formulating student success as a temporal multiclass task (Inadequate, Deficient, Satisfactory, and Excellent), we move beyond deterministic point estimates.
We evaluate post-hoc calibration techniques, including Temperature Scaling and Dirichlet Calibration, to refine these prediction sets. The results show that even at 8% semester completion, the framework maintains a class-conditional coverage of at least 88% at a significance level of α = 0.1, providing a reliable safety net for human-in-the-loop interventions. This statistically valid safety net means that even when early-semester data is sparse and point estimates are unstable, the true outcome remains within the predicted set with high probability. By transitioning from broad sets to precise singletons as the semester progresses, the framework provides a reliable, calibrated trigger for tiered, human-in-the-loop instructional interventions. This work combines graph-based structural learning with temporal dynamics and calibrated uncertainty to propose a robust and trustworthy system for proactive educational support.
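
To make the idea of class-conditional coverage concrete, the following is a minimal sketch of split conformal prediction with one threshold per class. It is not the dissertation's implementation: the nonconformity score (one minus the probability assigned to the true class), the function names, and the synthetic probabilities that stand in for model outputs are all assumptions chosen for illustration. Only the four class names and α = 0.1 come from the abstract.

```python
import numpy as np

CLASSES = ["Inadequate", "Deficient", "Satisfactory", "Excellent"]

def class_conditional_thresholds(cal_probs, cal_labels, alpha=0.1):
    """Per-class conformal thresholds from a held-out calibration set.

    Assumed nonconformity score: 1 - probability of the true class.
    """
    n_classes = cal_probs.shape[1]
    # Scores never exceed 1, so a threshold of 1.0 always includes the class.
    thresholds = np.ones(n_classes)
    for k in range(n_classes):
        scores = np.sort(1.0 - cal_probs[cal_labels == k, k])
        n = scores.size
        # Finite-sample corrected rank of the calibration score to use.
        rank = int(np.ceil((n + 1) * (1.0 - alpha)))
        if n > 0 and rank <= n:
            thresholds[k] = scores[rank - 1]
        # Otherwise too few calibration examples: keep the fallback of 1.0.
    return thresholds

def prediction_sets(probs, thresholds):
    """Boolean matrix; entry (i, k) is True when class k is in student i's set."""
    return (1.0 - probs) <= thresholds

# Synthetic stand-in for classifier outputs (hypothetical data, not OULA).
rng = np.random.default_rng(0)
n_students = 2000
labels = rng.integers(0, len(CLASSES), size=n_students)
logits = rng.normal(size=(n_students, len(CLASSES)))
logits[np.arange(n_students), labels] += 2.0  # make the true class likelier
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

# First half calibrates the thresholds, second half is held out for evaluation.
cal_probs, cal_labels = probs[:1000], labels[:1000]
test_probs, test_labels = probs[1000:], labels[1000:]

thresholds = class_conditional_thresholds(cal_probs, cal_labels, alpha=0.1)
sets = prediction_sets(test_probs, thresholds)

for k, name in enumerate(CLASSES):
    mask = test_labels == k
    coverage = sets[mask, k].mean()  # how often the true class is in the set
    print(f"{name}: coverage={coverage:.3f}, mean set size={sets[mask].sum(axis=1).mean():.2f}")
```

Under this scheme a set shrinks toward a singleton as the predicted probabilities become more concentrated, which is one way the transition from broad sets to singletons over the semester could arise. The coverage guarantee holds on average over exchangeable calibration and test data, not for every individual student.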
