This page hosts and highlights projects I’ve done over the years. All of my code is public and lives in my GitHub repo. Whenever possible, I make the data public as well; however, some work-related projects use data that cannot be shared due to FERPA regulations or other privacy concerns.
If you have any questions or feedback, feel free to reach out to me.
In response to COVID-19, Eckerd College decided to implement a block schedule for the fall semester of 2020. That is, rather than taking courses simultaneously, students would take one course at a time. However, the decision to implement the block schedule was made after students had registered for the fall semester. Rather than forcing all students to re-register, I wrote a script, using a genetic algorithm, to design a schedule that would minimize the number of course registration conflicts.
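To illustrate the general idea, here is a minimal sketch of a genetic algorithm that assigns courses to blocks. It is not the script used for the project: the function names, the four-block default, the conflict measure, and all tuning parameters are assumptions made for the example.

```python
import random

def count_conflicts(schedule, registrations):
    """Count, per student, the registered courses that collide in a block.

    A student with k courses in the same block contributes k - 1 conflicts.
    """
    conflicts = 0
    for courses in registrations:
        blocks = [schedule[c] for c in courses]
        conflicts += len(blocks) - len(set(blocks))
    return conflicts

def evolve(courses, registrations, n_blocks=4, pop_size=60,
           generations=200, mutation_rate=0.1, seed=0):
    """Search for a course-to-block mapping with few registration conflicts."""
    rng = random.Random(seed)

    def fitness(schedule):
        return count_conflicts(schedule, registrations)

    # Start from random schedules: each course gets a random block.
    population = [{c: rng.randrange(n_blocks) for c in courses}
                  for _ in range(pop_size)]

    for _ in range(generations):
        population.sort(key=fitness)
        if fitness(population[0]) == 0:
            break  # a conflict-free schedule was found
        # Keep the better half unchanged (elitism) and breed the rest.
        parents = population[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Uniform crossover: each course inherits its block from one parent.
            child = {c: (a[c] if rng.random() < 0.5 else b[c]) for c in courses}
            # Mutation: occasionally move a course to a random block.
            for c in courses:
                if rng.random() < mutation_rate:
                    child[c] = rng.randrange(n_blocks)
            children.append(child)
        population = parents + children

    return min(population, key=fitness)

# Hypothetical registrations: each inner list is one student's courses.
courses = ["BIO", "CHEM", "MATH", "ART"]
registrations = [["BIO", "CHEM"], ["BIO", "MATH", "ART"], ["CHEM", "ART"]]
best = evolve(courses, registrations)
print(best, count_conflicts(best, registrations))
```

Because the better half of each generation is carried over unchanged, the best schedule found never gets worse from one generation to the next.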
Despite being a significant part of a student’s application, the college admissions essay receives little attention from an analytics standpoint. Here, I modeled the topic themes of college admissions essays using LDA to examine whether an applicant’s essay could be used to predict enrollment and/or retention.
Prior to the normal semester, incoming first-year students at Eckerd College take a three-week course as part of Autumn Term. These condensed courses are special-topic courses taught by faculty from across the college. In the past, students were placed into these sections more or less at random. I wrote a script, a modified implementation of the assignment problem, to place students into Autumn Term sections based on their academic interests and the discipline of the Autumn Term faculty.
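As a rough illustration of the assignment-problem framing, the sketch below expands each section into individual seats and picks the seating with the fewest interest mismatches. The names, the 0/1 cost, and the brute-force search are assumptions for the example, not the project's actual method; brute force only works for a handful of students.

```python
from itertools import permutations

def place_students(interests, sections):
    """Place each student in a section, minimizing discipline mismatches.

    interests: {student: discipline}
    sections:  {section name: (discipline, capacity)}
    """
    # Expand every section into one entry per available seat.
    seats = [name for name, (_, capacity) in sections.items()
             for _ in range(capacity)]
    students = list(interests)

    def cost(student, section):
        # 0 when the student's interest matches the section's discipline.
        return 0 if interests[student] == sections[section][0] else 1

    def total_cost(seating):
        return sum(cost(s, sec) for s, sec in zip(students, seating))

    # Exhaustively try every way of seating the students (tiny inputs only).
    best = min(permutations(seats, len(students)), key=total_cost)
    return dict(zip(students, best))

# Hypothetical students and sections.
interests = {"Ana": "biology", "Ben": "history", "Cy": "biology"}
sections = {"Marine Life": ("biology", 1), "Florida History": ("history", 2)}
placement = place_students(interests, sections)
print(placement)
```

At realistic sizes, the same cost matrix could be handed to a polynomial-time solver such as `scipy.optimize.linear_sum_assignment` instead of enumerating permutations.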
Peer institutions are used in higher education for benchmarking purposes. I describe a simple framework for identifying similar institutions.
This project involved scraping data from ESPN and Baseball Reference as part of a larger pet project. You can download the raw and clean data using the Dropbox link above.
A quick analysis of what a team needs to do in order to win in a standard, head-to-head, categories-based fantasy baseball league.
An analysis where I quantify the performance of players in order to better understand their contribution.
A classification project using a bag-of-words approach to predict a wine’s variety based on its description.
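A bag-of-words model treats each description as an unordered count of its words. The sketch below pairs that representation with a multinomial naive Bayes classifier; the choice of classifier, the whitespace tokenizer, and the sample descriptions are assumptions for illustration, not the project's actual model or data.

```python
import math
from collections import Counter, defaultdict

def tokenize(text):
    """Lowercase and split on whitespace (no punctuation handling)."""
    return text.lower().split()

def train(descriptions, varieties):
    """Count how often each word appears in descriptions of each variety."""
    word_counts = defaultdict(Counter)
    class_counts = Counter(varieties)
    for text, variety in zip(descriptions, varieties):
        word_counts[variety].update(tokenize(text))
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, class_counts, vocab

def predict(text, model):
    """Return the variety with the highest log-probability under naive Bayes."""
    word_counts, class_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_variety, best_score = None, -math.inf
    for variety, n_docs in class_counts.items():
        counts = word_counts[variety]
        denom = sum(counts.values()) + len(vocab)  # add-one smoothing
        score = math.log(n_docs / total_docs)
        for word in tokenize(text):
            if word in vocab:  # words never seen in training are skipped
                score += math.log((counts[word] + 1) / denom)
        if score > best_score:
            best_variety, best_score = variety, score
    return best_variety

# Made-up training descriptions.
descriptions = [
    "crisp citrus and green apple",
    "bright lemon citrus finish",
    "dark cherry and firm tannins",
    "ripe plum cherry and oak",
]
varieties = ["Riesling", "Riesling", "Cabernet", "Cabernet"]
model = train(descriptions, varieties)
print(predict("citrus and lemon", model))
```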
This project uses simulated annealing to automate feature selection for predictive models. It considers both main effects and two-way interactions to streamline the modeling process and reduce manual trial-and-error.
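The search can be sketched roughly as follows. This is a generic simulated annealing loop over main effects and two-way interactions, not the project's code: the function names, the cooling schedule, and the toy scoring function are all assumptions. In practice `score` would be something like a cross-validated model metric.

```python
import math
import random
from itertools import combinations

def candidate_terms(features):
    """Main effects as 1-tuples plus every two-way interaction as a 2-tuple."""
    return [(f,) for f in features] + list(combinations(features, 2))

def anneal(terms, score, steps=2000, start_temp=1.0, cooling=0.995, seed=0):
    """Maximize score(selected_terms) by flipping one term in or out per step."""
    rng = random.Random(seed)
    current = frozenset(t for t in terms if rng.random() < 0.5)
    current_score = score(current)
    best, best_score = current, current_score
    temp = start_temp
    for _ in range(steps):
        # Neighbor: toggle a single randomly chosen term.
        proposal = current ^ {rng.choice(terms)}
        proposal_score = score(proposal)
        delta = proposal_score - current_score
        # Always accept improvements; accept worse moves with a probability
        # that shrinks as the temperature cools.
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            current, current_score = proposal, proposal_score
            if current_score > best_score:
                best, best_score = current, current_score
        temp *= cooling
    return best, best_score

# Toy stand-in for a model metric: reward matching a hidden "true" term set.
features = ["gpa", "aid", "distance"]
terms = candidate_terms(features)
true_terms = {("gpa",), ("gpa", "aid")}

def toy_score(selected):
    return -len(selected ^ true_terms)

best, best_score = anneal(terms, toy_score)
print(sorted(best), best_score)
```

Accepting an occasional worse move early on is what lets the search escape local optima; as the temperature drops, the loop behaves more and more like greedy hill climbing.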
In this project, I developed a series of Wordle solvers using Python. I explore a couple of different strategies to improve the solvers.
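Most Wordle solvers share the same core: score a guess against a possible answer, then keep only the words consistent with the feedback received. The sketch below shows that core under assumed names and an assumed `g`/`y`/`b` (green/yellow/gray) encoding; it does not reproduce the specific strategies explored in the project.

```python
from collections import Counter

def feedback(guess, answer):
    """Return a pattern of g (green), y (yellow), b (gray) for a guess."""
    result = ["b"] * len(guess)
    remaining = Counter()
    # First pass: mark greens and tally the answer letters left unmatched.
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            result[i] = "g"
        else:
            remaining[a] += 1
    # Second pass: mark yellows, using up each unmatched letter only once.
    for i, g in enumerate(guess):
        if result[i] != "g" and remaining[g] > 0:
            result[i] = "y"
            remaining[g] -= 1
    return "".join(result)

def filter_candidates(candidates, guess, pattern):
    """Keep the words that would have produced this pattern for the guess."""
    return [w for w in candidates if feedback(guess, w) == pattern]

# Hypothetical word list and one round of play.
words = ["abide", "crane", "speed", "spend", "bleed"]
pattern = feedback("speed", "abide")
print(pattern, filter_candidates(words, "speed", pattern))
```

The two-pass scoring matters for repeated letters: the second `e` in `speed` is gray against `abide` because the answer contains only one `e`.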
This project analyzes crossword puzzles to identify shared patterns and reused motifs across different grids. By extracting and comparing grid-based “motifs” using a sliding window approach, it uncovers potential duplication or similarities in puzzle construction.
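A sliding window comparison of this kind might look like the sketch below. It assumes grids are lists of equal-length strings with `#` for black squares and `.` for open squares, and that a motif is a square sub-grid; the real project may define motifs differently.

```python
def windows(grid, size=3):
    """Yield every size-by-size sub-grid as a tuple of row strings."""
    rows, cols = len(grid), len(grid[0])
    for r in range(rows - size + 1):
        for c in range(cols - size + 1):
            yield tuple(grid[r + i][c:c + size] for i in range(size))

def shared_motifs(grid_a, grid_b, size=3):
    """Return the set of sub-grids that appear in both grids."""
    return set(windows(grid_a, size)) & set(windows(grid_b, size))

# Two tiny hypothetical grids.
grid_a = ["#..",
          "...",
          "..#"]
grid_b = ["..#",
          "...",
          "#.."]
print(shared_motifs(grid_a, grid_b, size=2))
```

Storing each window as a tuple of strings makes it hashable, so comparing two grids reduces to a set intersection.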