May 2024, for CSCI 5461: Functional Genomics, Systems Biology, and Bioinformatics. In collaboration with Jacob Bridge, Aaron Castle, Annabelle Coler, Isabel Constable, and Kyla Hagen.
ABSTRACT: Due to a progressive disruption of cellular homeostasis and communication, chronic disease risk increases dramatically with age. As a result, according to the “Geroscience Hypothesis,” therapies that directly target drivers of aging could dramatically improve geriatric health outcomes. Although many age-associated pathways have been described, there is significant disagreement regarding their relative contributions to distinct tissue and disease phenotypes, complicating the drug development process. We hypothesize that drivers of aging can be inferred through gene network analysis, as the upstream regulators responsible for coordinated patterns of age-dependent expression. By applying regression models to RNA-seq data from over 3,000 human donors, encompassing myriad ages, tissues, and disease states, we isolate tissue-independent expression changes associated with healthy aging. We then identify characteristic variations associated with 16 geriatric diseases, and construct an interactive network of coordinately regulated gene sets. Critical nodes within the network display substantial functional agreement with known hallmarks of aging, reinforcing the utility of our approach. Annotation of network outputs also allows for a granular assessment of putative drivers, with the potential to streamline identification of robust therapeutic targets.
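For illustration, a minimal sketch of the kind of regression step described above: a per-gene linear model of expression against donor age, with tissue entering as a covariate so the age coefficient captures change shared across tissues. The column names, expression-matrix layout, and multiple-testing correction here are assumptions, not the project's actual pipeline.

```python
# Hypothetical sketch: per-gene linear model of expression vs. donor age.
import pandas as pd
import statsmodels.formula.api as smf

def age_associated_genes(expr: pd.DataFrame, meta: pd.DataFrame, alpha=0.05):
    """expr: samples x genes (log-normalized counts); meta: samples x [age, tissue]."""
    records = []
    for gene in expr.columns:
        df = meta.assign(y=expr[gene].values)
        # Tissue enters as a fixed effect so the age coefficient reflects
        # tissue-independent, age-dependent expression change.
        fit = smf.ols("y ~ age + C(tissue)", data=df).fit()
        records.append((gene, fit.params["age"], fit.pvalues["age"]))
    out = pd.DataFrame(records, columns=["gene", "age_slope", "p_value"])
    # Bonferroni correction across all genes tested (illustrative choice).
    return out[out.p_value < alpha / len(expr.columns)].sort_values("age_slope")
```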
December 2024, for CSCI 5561: Computer Vision. In collaboration with Evan Krainess, Hady Kotifani, and William Anderson.
ABSTRACT: This project investigates the potential of Neural Radiance Fields (NeRF) for depth estimation, focusing on their ability to reconstruct 3D scenes and generate high-quality depth maps from 2D image inputs. Depth estimation is essential for various applications, including robotics, augmented and virtual reality, autonomous vehicles, and computer vision. NeRF offers a promising alternative to traditional depth estimation techniques, excelling in generating accurate reconstructions of complex scenes. The focus of our work was on improving the implementation and performance of NeRF-based depth estimation rather than tackling challenges like sparse imaging or transparent object detection, as explored by advanced techniques such as SinNeRF and Dex-NeRF. Instead, we concentrated on reimplementing and optimizing the TinyNeRF algorithm in PyTorch, which allowed us to enhance the computational efficiency and quality of depth outputs. Key achievements of this project include accelerating the NeRF algorithm, creating custom functions to generate diverse depth outputs, and improving the accuracy of these outputs. Notably, we applied Gaussian blurring to the input data to mitigate noise and refine depth calculation, resulting in more precise and visually coherent depth maps. By iteratively refining our approach, we observed significant improvements in both computational speed and output quality. Our work demonstrates how fundamental adjustments, such as enhanced preprocessing and optimized implementation, can elevate the performance of NeRF in depth estimation tasks. This study underscores the practicality of NeRF for generating detailed depth maps and sets the foundation for future applications in 3D scene reconstruction and beyond.
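As a rough sketch of where depth comes from in a NeRF pipeline: the same per-sample compositing weights used to render color along a ray also yield an expected ray depth, which is what a NeRF-derived depth map aggregates. Tensor names and shapes below are assumptions, not the project's TinyNeRF code.

```python
# Illustrative sketch: expected depth from NeRF volume-rendering weights.
import torch

def render_depth(sigmas: torch.Tensor, t_vals: torch.Tensor) -> torch.Tensor:
    """sigmas: (n_rays, n_samples) densities; t_vals: (n_rays, n_samples) sample depths."""
    # Distances between adjacent samples; pad the final interval.
    deltas = t_vals[:, 1:] - t_vals[:, :-1]
    deltas = torch.cat([deltas, 1e10 * torch.ones_like(deltas[:, :1])], dim=-1)
    alphas = 1.0 - torch.exp(-sigmas * deltas)            # opacity per sample
    # Transmittance: probability the ray reaches sample i unoccluded.
    trans = torch.cumprod(1.0 - alphas + 1e-10, dim=-1)
    trans = torch.cat([torch.ones_like(trans[:, :1]), trans[:, :-1]], dim=-1)
    weights = trans * alphas                              # compositing weights
    return (weights * t_vals).sum(dim=-1)                 # expected ray depth
```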
May 2025, for CSCI 5527: Deep Learning: Models, Computation, and Applications. In collaboration with Anthony Brogni, Ajitesh Parthasarathy, and Victor Hofstetter.
ABSTRACT: This paper presents a novel hybrid approach for converting handwritten mathematical expressions into syntactically correct LaTeX code. We combine an image-to-LaTeX model with a language-model-based postprocessor to address the challenges of accurately recognizing complex mathematical notation. Our system comprises a ResNet-34-based CNN encoder for feature extraction from handwritten images, a Transformer decoder for LaTeX sequence generation, and a fine-tuned Phi-4-mini language model for syntax correction. Trained on the MathWriting dataset with data augmentation techniques, our model achieves 85.59% accuracy with beam search decoding alone, increasing to 86.22% with LLM post-processing. This work provides a foundation for future improvements in mathematical content recognition systems for educational and research applications.
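A minimal sketch of the encoder-decoder shape described above, with positional encodings omitted for brevity; the layer sizes, vocabulary interface, and ResNet truncation point are assumptions rather than the trained model's actual configuration.

```python
# Hedged sketch: ResNet-34 feature encoder + Transformer decoder for LaTeX tokens.
import torch
import torch.nn as nn
from torchvision.models import resnet34

class Img2LaTeX(nn.Module):
    def __init__(self, vocab_size: int, d_model: int = 256):
        super().__init__()
        backbone = resnet34(weights=None)
        # Keep the convolutional stages only; drop pooling and the classifier head.
        self.encoder = nn.Sequential(*list(backbone.children())[:-2])
        self.proj = nn.Conv2d(512, d_model, kernel_size=1)
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerDecoderLayer(d_model, nhead=8, batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, num_layers=4)
        self.out = nn.Linear(d_model, vocab_size)

    def forward(self, images, tokens):
        feats = self.proj(self.encoder(images))           # (B, d, H', W')
        memory = feats.flatten(2).transpose(1, 2)         # (B, H'*W', d)
        tgt = self.embed(tokens)                          # (B, T, d)
        # Causal mask so each position attends only to earlier tokens.
        T = tokens.size(1)
        mask = torch.triu(torch.full((T, T), float("-inf"),
                                     device=tokens.device), diagonal=1)
        return self.out(self.decoder(tgt, memory, tgt_mask=mask))
```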
November 2024, for CSCI 5619: Virtual Reality and 3D Interaction. In collaboration with Aaron Castle.
ABSTRACT: The ElevateVR Drone Simulator is a virtual reality project designed to deliver an immersive and realistic FPV (First-Person View) drone experience, offering an accessible alternative to costly and high-risk real-world drone racing. The simulator features three distinct modes: racing, time trials, and freeplay, each designed to provide a unique and engaging playstyle. Players can explore custom-designed environments with interactive objects and obstacles. Built in Unity using its physics engine, the project focuses on simulating realistic drone dynamics, including gravity, drag, and wind resistance. The racing mode features AI competitors, while the time trial mode emphasizes precision and speed; both take place on racetracks with custom layouts. The freeplay mode allows players to explore maps at their leisure, offering opportunities to practice skills or simply enjoy the environments. Custom objects and maps, developed specifically for this simulator, enhance immersion and provide creative challenges tailored to each mode. The development process prioritized responsive controls and mechanics and stable performance across all playstyles. User testing confirmed the realism of the flight mechanics, the intuitiveness of the controls, and the fun of the gameplay. Success is measured by the simulator's ability to offer distinct, polished modes of play, realistic FPV visuals, and reliable technical functionality. ElevateVR demonstrates the potential of VR technology to deliver an authentic FPV drone experience while giving users accessible tools for practice and entertainment without the financial burden.
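The simulator itself runs on Unity's physics engine (C#); purely for illustration, here is a language-agnostic Python sketch of the force model named above, combining gravity, quadratic drag, and a wind term via semi-implicit Euler integration. All constants and names are illustrative, not the project's actual tuning.

```python
# Illustrative force model: gravity + quadratic drag relative to the wind.
import numpy as np

GRAVITY = np.array([0.0, -9.81, 0.0])   # m/s^2
DRAG_COEFF = 0.3                        # lumped quadratic drag coefficient
MASS = 0.5                              # kg (illustrative quadcopter mass)

def step(pos, vel, thrust, wind, dt=0.02):
    """One semi-implicit Euler step; thrust and wind are world-space vectors."""
    rel_vel = vel - wind                             # airspeed, not groundspeed
    drag = -DRAG_COEFF * np.linalg.norm(rel_vel) * rel_vel
    accel = GRAVITY + (thrust + drag) / MASS
    vel = vel + accel * dt                           # update velocity first...
    pos = pos + vel * dt                             # ...then position
    return pos, vel
```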
May 2025, for CSCI 5541: Natural Language Processing. In collaboration with Anthony Brogni, Ajitesh Parthasarathy, and Tanmay Patwardhan.
ABSTRACT: We present a targeted investigation into the ability of fine-tuned small language models (SLMs) to break mono-alphabetic substitution ciphers. Using Qwen2.5-0.5B and Phi-4-mini-instruct as our base models, we generated a bespoke training corpus by programmatically encrypting “Human vs AI” dataset text samples with random mono-alphabetic mappings. We then applied Group-Relative Policy Optimization (GRPO) fine-tuning and evaluated a novel test-time scaling mechanism designed to scale performance during inference. Our experiments measured accuracy using RapidFuzz similarity (a fuzzy string-similarity metric), comparing GRPO alone, as well as GRPO combined with test-time scaling, against initial baselines. We found that GRPO fine-tuning yielded improvements in the decryption accuracy of SLMs on this task. The test-time scaling mechanism was evaluated for its ability to reduce ambiguous substitutions, and its effect on overall performance is discussed. Our results quantify the effectiveness and highlight the limitations of these methods for lightweight cryptanalysis using SLMs, illuminating remaining failure modes and informing approaches for future research.
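For illustration, the corpus-generation and scoring steps might look like the following sketch; dataset handling, sample counts, and the exact RapidFuzz configuration are assumptions.

```python
# Hedged sketch: random mono-alphabetic ciphers + fuzzy scoring of decryptions.
import random
import string
from rapidfuzz import fuzz

def random_cipher() -> dict:
    """A random mono-alphabetic mapping over the lowercase alphabet."""
    shuffled = random.sample(string.ascii_lowercase, 26)
    return dict(zip(string.ascii_lowercase, shuffled))

def encrypt(text: str, mapping: dict) -> str:
    # Non-letters (spaces, punctuation) pass through unchanged.
    return "".join(mapping.get(c, c) for c in text.lower())

def similarity(prediction: str, reference: str) -> float:
    """RapidFuzz ratio in [0, 100]: fuzzy match between a decryption and the truth."""
    return fuzz.ratio(prediction, reference)

mapping = random_cipher()
ciphertext = encrypt("attack at dawn", mapping)
print(ciphertext, similarity("attack at dawn", "attack at dusk"))
```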
December 2024, for CSCI 5525: Machine Learning: Analysis and Methods. In collaboration with Ryan Roche and Evan Voogd.
ABSTRACT: In this paper, we provide a comprehensive analysis comparing the performance of classical and deep learning techniques on classifying water as potable or non-potable. We present the results of training sklearn's implementations of SVM and Random Forest on our dataset, with both models achieving slightly below 70% accuracy on a hold-out test set. We also provide the results of our attempts at modifying the Random Forest architecture to better suit this dataset, comparing it to SVM, multilayer perceptron, and Transformer-based approaches.
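A minimal sketch of the classical baselines reported above; the file path, column names, and hyperparameters are assumptions about the water-potability dataset rather than the project's actual setup.

```python
# Hedged sketch: sklearn SVM and Random Forest baselines with a hold-out split.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

df = pd.read_csv("water_potability.csv").dropna()      # path is illustrative
X, y = df.drop(columns="Potability"), df["Potability"]
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

for name, model in [
    ("SVM", make_pipeline(StandardScaler(), SVC())),   # scaling matters for SVM
    ("Random Forest", RandomForestClassifier(n_estimators=200, random_state=0)),
]:
    model.fit(X_tr, y_tr)
    print(name, model.score(X_te, y_te))               # hold-out accuracy
```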
May 2024, for CSCI 4511W: Introduction to Artificial Intelligence. In collaboration with Marie Milanowski.
ABSTRACT: One of the most fundamental problems in computing is navigating and solving complex systems with search algorithms. These problems take many forms, including mazes, images, data searching, and numerous others. Good pathfinding relies on speed, efficiency, and the ability to navigate complex situations. Procedural two-dimensional (2D) maze generation and solving is one way to test these algorithms. By analyzing different metrics, the strengths and weaknesses of each search algorithm can be derived, and the best algorithm for mazes of particular sizes and types can be determined.
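As a concrete example of one algorithm such a comparison might test, here is a sketch of breadth-first search on a 2D grid maze that reports both the shortest path length and the number of nodes expanded, one plausible metric for the analysis described above. The maze encoding (0 = open, 1 = wall) is an assumption.

```python
# Illustrative sketch: BFS on a grid maze, tracking path length and work done.
from collections import deque

def bfs(maze, start, goal):
    """Return (path_length, nodes_expanded), or (None, nodes_expanded) if unsolvable."""
    rows, cols = len(maze), len(maze[0])
    frontier, seen, expanded = deque([(start, 0)]), {start}, 0
    while frontier:
        (r, c), dist = frontier.popleft()
        expanded += 1
        if (r, c) == goal:
            return dist, expanded
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if (0 <= nr < rows and 0 <= nc < cols
                    and maze[nr][nc] == 0 and (nr, nc) not in seen):
                seen.add((nr, nc))
                frontier.append(((nr, nc), dist + 1))
    return None, expanded

maze = [[0, 0, 1],
        [1, 0, 1],
        [0, 0, 0]]
print(bfs(maze, (0, 0), (2, 2)))   # shortest path length and nodes expanded
```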
Disclaimer: All of these papers are for school projects and none of them are published.