Author Type

Graduate Student

Date of Award

Fall 10-24-2025

Document Type

Dissertation

Publication Status

Version of Record

Submission Date

November 2025

Department

Computer and Electrical Engineering and Computer Science

College Granting Degree

College of Engineering and Computer Science

Degree Name

Doctor of Philosophy (PhD)

Thesis/Dissertation Advisor [Chair]

Zhen Ni

Abstract

With the rapid progress of reinforcement learning (RL) and its applications in robotics, autonomous driving, and energy systems, learning reliable reward functions has become a central challenge. Traditional RL relies heavily on handcrafted reward functions, which are often infeasible to design in complex real-world environments. Inverse reinforcement learning (IRL) provides an alternative by recovering reward functions from expert demonstrations. However, existing IRL methods suffer from inefficiency, poor robustness under perturbations, and limited generalization to unseen scenarios.

This dissertation focuses on advancing IRL from three complementary perspectives: computational efficiency, robustness, and generalization. Specifically, the following problems are investigated:

1. Computational Efficiency: We develop a novel, efficient IRL (e-IRL) framework that reformulates reward recovery through feature expectation matching (sketched below), eliminating the need for state visitation estimation. This reduces memory and computation costs while scaling effectively to high-dimensional environments.

2. Robustness: To address perturbed environments, we introduce a multi-virtual-agent IRL (MVIRL) framework. By training multiple agents across parallel perturbed environments with weight-sharing and data aggregation (sketched below), MVIRL learns robust reward functions that remain stable under noise, gravity variations, and adversarial conditions.

3. Generalization and Personalization: We propose a contrastive IRL (CIRL) framework that integrates self-supervised contrastive representation learning with maximum entropy IRL. CIRL leverages momentum encoders and a reward-regularized contrastive loss (sketched below) to improve sample efficiency and adapt to personalized driving styles, ensuring both robustness and adaptability.
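The full text is under embargo, so the e-IRL algorithm itself cannot be reproduced here. The snippet below is only a minimal, generic sketch of feature expectation matching for a linear reward r(s) = w · phi(s), in the spirit of classical apprenticeship-learning IRL; the function names, the gradient-style weight update, and the abstract rollout_policy hook are illustrative assumptions, not the dissertation's method.

```python
import numpy as np

def feature_expectations(trajectories, phi, gamma=0.99):
    """Discounted average feature counts over a set of trajectories.

    trajectories: list of lists of (state, action) pairs
    phi: feature map, state -> np.ndarray of shape (d,)
    """
    mu = None
    for traj in trajectories:
        disc, acc = 1.0, 0.0
        for state, _action in traj:
            acc = acc + disc * phi(state)
            disc *= gamma
        mu = acc if mu is None else mu + acc
    return mu / len(trajectories)

def irl_feature_matching(expert_trajs, rollout_policy, phi, d,
                         lr=0.1, iters=50, gamma=0.99):
    """Push the learner's feature expectations toward the expert's,
    assuming a linear reward r(s) = w . phi(s).

    rollout_policy(w) must return trajectories collected under a policy
    (approximately) optimal for the current weights w -- this inner RL
    step is left abstract here.
    """
    mu_expert = feature_expectations(expert_trajs, phi, gamma)
    w = np.zeros(d)
    for _ in range(iters):
        learner_trajs = rollout_policy(w)
        mu_learner = feature_expectations(learner_trajs, phi, gamma)
        w += lr * (mu_expert - mu_learner)   # match feature expectations
        w /= max(np.linalg.norm(w), 1e-8)    # keep the weights bounded
    return w
```

The inner rollout_policy step is where the actual efficiency gains claimed for e-IRL would have to come from; the update above only illustrates the matching objective.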
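Likewise, the following is only a rough sketch of the training setup the abstract describes for MVIRL: several perturbed copies of an environment, one shared policy rolled out in all of them, and the transitions pooled into a single dataset for reward learning. It assumes a Gymnasium-style env interface; the perturbation model (additive observation noise) and every function name are placeholders.

```python
import numpy as np

def make_perturbed_envs(make_env, noise_scales):
    """Create one environment copy per noise level (a simple stand-in for
    the perturbations -- noise, gravity changes -- named in the abstract)."""
    return [(make_env(), sigma) for sigma in noise_scales]

def collect_aggregated_rollouts(envs, shared_policy, horizon=200):
    """Roll out a single shared policy ("weight-sharing") in every perturbed
    environment and pool all transitions ("data aggregation") into one
    dataset that a downstream reward learner can consume."""
    dataset = []
    for env, sigma in envs:
        obs, _info = env.reset()
        for _ in range(horizon):
            noisy_obs = obs + sigma * np.random.randn(*np.shape(obs))
            action = shared_policy(noisy_obs)
            obs, _reward, terminated, truncated, _info = env.step(action)
            dataset.append((noisy_obs, action))
            if terminated or truncated:
                obs, _info = env.reset()
    return dataset
```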
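Finally, a generic PyTorch sketch of the ingredients named for CIRL: a MoCo-style momentum encoder update and an InfoNCE contrastive term combined with a placeholder reward regularizer. The exact CIRL objective is not given in the abstract, so the regularization target (expert_mask), the loss weights, and all module names are assumptions.

```python
import torch
import torch.nn.functional as F

def momentum_update(encoder_q, encoder_k, m=0.99):
    """Exponential moving average of the query encoder into the key encoder."""
    with torch.no_grad():
        for p_q, p_k in zip(encoder_q.parameters(), encoder_k.parameters()):
            p_k.data.mul_(m).add_(p_q.data, alpha=1.0 - m)

def contrastive_irl_loss(encoder_q, encoder_k, reward_head, obs_a, obs_b,
                         expert_mask, temperature=0.1, reg_weight=0.1):
    """InfoNCE between two augmented views plus a reward-consistency term.

    obs_a, obs_b: two augmented views of the same batch of observations
    expert_mask: 1 for expert-visited states, 0 otherwise (an illustrative
    regularization target; the actual CIRL regularizer may differ).
    """
    feats = encoder_q(obs_a)
    q = F.normalize(feats, dim=1)                      # queries
    with torch.no_grad():
        k = F.normalize(encoder_k(obs_b), dim=1)       # keys (no gradient)
    logits = q @ k.t() / temperature                   # similarity matrix
    labels = torch.arange(q.size(0), device=q.device)  # positives on diagonal
    info_nce = F.cross_entropy(logits, labels)

    # Reward-regularized term: favor higher predicted reward on expert states.
    rewards = reward_head(feats).squeeze(-1)
    reward_reg = F.binary_cross_entropy_with_logits(rewards, expert_mask.float())
    return info_nce + reg_weight * reward_reg
```

In practice encoder_k would be initialized as a deep copy of encoder_q, and momentum_update would be called after each optimizer step.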

Available for download on Tuesday, November 10, 2026
