Author Type

Graduate Student

Date of Award

Fall 11-25-2025

Document Type

Thesis

Publication Status

Version of Record

Submission Date

December 2025

Department

Computer and Electrical Engineering and Computer Science

Degree Name

Master of Science (MS)

Thesis/Dissertation Advisor [Chair]

Behnaz Ghoraani

Abstract

This thesis presents a comparative evaluation of two deep learning-based methods for extracting 2D and 3D gait metrics from monocular video: a 2D-to-3D lifting approach combining AlphaPose for keypoint detection with MotionBERT for temporal 3D reconstruction, and a direct 3D estimation method using MeTRAbs. Videos of two healthy adults walking across a ProtoKinetics Zeno™ Walkway were captured laterally at 4K/60 fps using an iPhone 16. Ground-truth spatiotemporal gait parameters were obtained from the pressure-sensitive mat, enabling quantitative validation via Mean Absolute Error (MAE) and Pearson correlation. MeTRAbs consistently outperformed the lifting pipeline, achieving MAE below 2% for temporal metrics (e.g., 1.58% for step time, 1.09% for gait cycle time) and strong correlations (r > 0.90, p < 0.001). Spatial metrics showed MAE of 15.28% (step length) and 15.84% (stride length), with superior robustness to occlusions. AlphaPose+MotionBERT exhibited higher spatial errors (20.40% and 18.47%, respectively) and weaker correlations, though it remained viable for velocity (9.34% MAE) and low-resource settings. The results highlight direct 3D estimation as the preferred method for clinical precision and biomechanical fidelity, while 2D-to-3D lifting offers a lightweight alternative for scalable, non-intrusive monitoring. Future work should expand to pathological gaits, multi-view fusion, and edge-optimized models to broaden real-world applicability.
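The abstract's validation metrics (Mean Absolute Error as a percentage of the ground-truth mean, and the Pearson correlation coefficient) can be sketched as below. This is a minimal illustration, not the thesis's actual evaluation code; the step-time values are hypothetical placeholders standing in for per-step video estimates and mat ground truth.

```python
import math

def mae_percent(estimates, ground_truth):
    """Mean Absolute Error expressed as a percentage of the ground-truth mean."""
    errors = [abs(e - g) for e, g in zip(estimates, ground_truth)]
    return 100.0 * (sum(errors) / len(errors)) / (sum(ground_truth) / len(ground_truth))

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical per-step step times in seconds: video-derived vs. mat ground truth
video_step_times = [0.52, 0.55, 0.53, 0.56, 0.54]
mat_step_times   = [0.53, 0.54, 0.53, 0.57, 0.55]

print(f"Step-time MAE: {mae_percent(video_step_times, mat_step_times):.2f}%")
print(f"Pearson r:     {pearson_r(video_step_times, mat_step_times):.3f}")
```

An agreement of MAE below 2% with r > 0.90, as reported for MeTRAbs's temporal metrics, would indicate that the video pipeline tracks the pressure-mat reference closely on a step-by-step basis.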
