Author Type

Graduate Student

Semester Award Granted

Summer 2025

Submission Date

August 2025

Document Type

Thesis

Degree Name

Master of Science (MS)

Thesis/Dissertation Advisor [Chair]

Georgios Sklivanitis

Abstract

This thesis investigates monocular vision-based navigation for autonomous ground robots in indoor environments. It explores the effectiveness of mapless navigation strategies using image inputs and evaluates both modular and end-to-end deep learning techniques. A complete implementation pipeline was developed on a Clearpath Dingo robot to compare three approaches: classical optical flow (Farneback), modular deep learning with motion planning, and an end-to-end model.

The modular methods predict region-wise obstacle depths using a ResNet-18-based model trained on a custom indoor dataset and pass those depths to a motion planner. The end-to-end approach uses a ResNet-18 architecture to jointly predict steering angles and collision probabilities. All models were trained and evaluated on custom datasets collected along varied indoor trajectories, totaling over 100,000 images.
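As an illustration of how region-wise depth predictions could feed a motion planner, the following is a minimal sketch only. The abstract does not describe the planner used in the thesis, so the three-region layout (left, center, right), the function name `plan_motion`, and all thresholds and speeds are hypothetical assumptions.

```python
def plan_motion(depths_m, stop_distance_m=0.5, cruise_speed=0.3, turn_rate=0.8):
    """Map three region depths (left, center, right), in meters, to a command.

    Returns (linear_velocity, angular_velocity). A positive angular velocity
    means turning left. All parameter values are illustrative assumptions,
    not values taken from the thesis.
    """
    left, center, right = depths_m

    if center < stop_distance_m:
        # Obstacle too close ahead: stop and rotate toward the more open side.
        angular = turn_rate if left >= right else -turn_rate
        return 0.0, angular

    # Path ahead is clear: drive forward and bias the heading toward
    # whichever side region reports more free space.
    total = left + right
    bias = 0.0 if total == 0 else (left - right) / total
    return cruise_speed, turn_rate * bias


if __name__ == "__main__":
    print(plan_motion((2.0, 0.3, 1.0)))  # blocked ahead, more room on the left
    print(plan_motion((1.0, 2.0, 1.0)))  # clear and symmetric
```

Keeping the planner separate from the depth model in this way is one reason a modular design can be inspected step by step: the intermediate depths can be checked independently of the command they produce.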

Experimental evaluation focused on both offline performance (e.g., RMSE, F1-score) and real-time deployment on the Dingo platform. Results indicate that while classical vision methods lack robustness, deep learning-based approaches demonstrate higher reliability and adaptability. Notably, modular models are easier to interpret and to integrate with planning systems, while end-to-end models offer faster inference and smoother navigation.
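For reference, the two offline metrics named above can be computed as in the sketch below. It uses the standard definitions of RMSE and the binary F1-score; which outputs each metric was applied to in the thesis is not stated here, so the pairing suggested in the comments (RMSE for regression outputs such as steering angle or depth, F1 for collision classification) is an assumption, and the function names are illustrative.

```python
import math


def rmse(predicted, actual):
    """Root-mean-square error between two equal-length numeric sequences.

    Assumed use: regression outputs such as steering angle or depth.
    """
    if len(predicted) != len(actual) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def f1_score(predicted_labels, true_labels):
    """Binary F1-score, where label 1 is the positive class.

    Assumed use: collision (1) versus no-collision (0) predictions.
    """
    pairs = list(zip(predicted_labels, true_labels))
    tp = sum(1 for p, t in pairs if p == 1 and t == 1)
    fp = sum(1 for p, t in pairs if p == 1 and t == 0)
    fn = sum(1 for p, t in pairs if p == 0 and t == 1)
    if tp == 0:
        # No true positives: precision or recall is zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(rmse([0.0, 0.0], [3.0, 4.0]))
    print(f1_score([1, 1, 0, 0], [1, 0, 1, 0]))
```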

This study validates the feasibility of deploying monocular vision-driven systems for indoor mapless navigation, highlighting trade-offs between interpretability, performance, and real-time constraints. The outcomes contribute to the broader field of robot autonomy using vision-only sensing, especially in low-cost, compute-constrained settings.
