Date of Award
Spring 4-17-2026
Document Type
Dissertation
Publication Status
Version of Record
Submission Date
April 2026
Department
Complex Systems and Brain Sciences
College Granting Degree
Charles E. Schmidt College of Science
Department Granting Degree
Center for Complex Systems
Degree Name
Doctor of Philosophy (PhD)
Thesis/Dissertation Advisor [Chair]
Ashkaan K. Fahimipour
Thesis/Dissertation Co-Chair
Mark M. Bailey
Abstract
This dissertation asks, ‘How do we know that the machines we design and build are trying to achieve the goals that we intended?’ At present, state-of-the-art general learning algorithms almost universally use deep neural networks, which remain notoriously difficult to interpret and control. This has led to unexpected failures of neural network-based systems when they encounter data outside their training distributions. To address these failures, I present a refinement of a general learning algorithm for robotic applications that is both interpretable and robust to catastrophic changes. I show that Deterministic Artificial Intelligence enables near-exact torque prediction on a control moment gyroscope, even in an out-of-distribution environment with simulated damage. For non-robotic applications that still require neural networks, I present an alternative post-hoc interpretability technique that combines Shapley values and manifold learning methods to characterize the learned goals of neural networks. I show that the learned goals of certain deep neural networks lie near a low-dimensional manifold within the space of possible strategies. I also replicate a common failure of deep neural networks by showing that their performance degrades predictably when agents are deployed to complex environments that differ from their training distributions. Finally, I provide the first known implementation of Finite Factored Sets as a computable program, which allows for the discernment of temporal relationships between arbitrary variables, including those that are deterministic functions of one another. This program enables models of causality that work across scales of abstraction. Together, this work offers three distinct pathways to verify whether general learning algorithms are pursuing the goals their designers intended.
Recommended Citation
Hoover, Stephen Hendry, "TOWARD SOLVING ALIGNMENT FAILURES IN GENERAL LEARNING ALGORITHMS" (2026). Electronic Theses and Dissertations. 258.
https://digitalcommons.fau.edu/etd_general/258