Author Type

Graduate Student

Date of Award

Fall 11-15-2025

Document Type

Dissertation

Publication Status

Version of Record

Submission Date

November 2025

Department

Computer and Electrical Engineering and Computer Science

Degree Name

Doctor of Philosophy (PhD)

Thesis/Dissertation Advisor [Chair]

Michael DeGiorgio

Abstract

Detecting genomic regions influenced by adaptive processes is fundamental to understanding adaptation and its connection to modern human traits. Traditional summary-statistic and likelihood methods capture only simple evolutionary scenarios, while deep learning models such as convolutional neural networks, though powerful, often lose spatial information and lack interpretability. This dissertation introduces three complementary frameworks that overcome these limitations by integrating interpretable machine learning with spatially structured genomic data. The first framework, T-REx, employs tensor decomposition to extract informative features from haplotype alignment images, enabling accurate classification of selective sweeps with classical learning models. The second framework, α-DAWG, applies wavelet and curvelet decompositions to capture directional and frequency-based signatures of selection, achieving performance comparable to convolutional neural networks while remaining interpretable and robust. The third framework, SKINET, introduces a trend-filtered kernel within a support vector machine, preserving spatial autocovariation and extending to quantitative inference of adaptive parameters. Applied to human genomic data, these frameworks not only detect classical signatures of selective sweeps, such as those at LCT and MCM6, but also uncover novel candidates, including FAM177A1 and PTPRJ, that reveal interesting disease associations. Collectively, these methods demonstrate that interpretable, structure-aware machine learning frameworks can advance the frontier of adaptive inference by combining predictive power with evolutionary insight.
