Granted: Building Better Genomic Tools—For Everyone

Assistant Professor of Computer Science Sara Mathieson works with students on developing machine learning models to create more inclusive genomic datasets. Photos by Paola Nogueras.
A federal grant supports Assistant Professor Sara Mathieson’s students as they unlock the potential of machine learning to model genomes more accurately and equitably. This is Granted, Haverford’s series on the impact of federal funding.
Assistant Professor of Computer Science Sara Mathieson
Source of funding: National Institutes of Health (NIH)
Amount: $350,141
About my work: In many areas of genetics, simulated data is essential for validating and comparing methods. However, many simulated datasets are designed with European-ancestry populations in mind. This NIH grant seeks to develop machine learning methods that will translate to any population, thus creating datasets that mirror real human populations. Downstream, we use these datasets to learn about evolution and health-related genomic features.
My students have gained numerous research skills, including working with large, complex datasets, developing algorithms, analyzing results, and creating visualizations. They have also written up their results and given oral presentations. Finally, they have used modern machine learning libraries, a skill that is useful in both industry and academia.

Darsha Mehta '25: My undergraduate research experiences have been transformative, shaping my academic path and igniting a passion I never anticipated. They solidified my aspiration to make research a cornerstone of my future career. My simultaneous work in both Professor Amy Cooke's and Professor Sara Mathieson's labs was instrumental in this realization, ultimately inspiring me to pursue an independent major in computational biology, a testament to the impact of these opportunities.
I did my thesis work in Professor Mathieson's lab, focusing on cutting-edge state-space models to better understand gene regulation and genetic interactions. I had the privilege of building a model to predict genetic features within DNA sequences, as well as complex traits influenced by many mutations whose relationships we don't fully understand. My research involved adapting these architectures to make them more efficient and incorporating DNA-specific modifications. This computationally intensive work was made possible by our college's computer clusters and Google's Cloud TPUs. Access to these advanced computational resources was crucial in transforming these challenging projects into successful and deeply enriching learning experiences.
Federal grants can be life-changing for faculty, students, and those who benefit from their research. Granted, a series about federally funded research, showcases the sort of projects that need your financial support to offset recent, unplanned cuts in government funding. Learn more about how to support research at Haverford.