I am a computational biologist by training and currently a postdoc advised by Rafael Irizarry in the Department of Biostatistics at the Harvard T.H. Chan School of Public Health and the Department of Biostatistics and Computational Biology at the Dana-Farber Cancer Institute. I am also an affiliate postdoc at the Broad Institute.
I am broadly interested in open problems in high-throughput biology with direct applications to health and disease, from study design through analysis. My core area of research is genomic data science, where I draw on concepts from biology, machine learning, computer science, and statistics to develop methods for learning from massive, usually public, datasets. During my PhD, I sought to understand tumor evolution and heterogeneity by applying ideas from computational phylogenetics to genomic datasets. My current focus is on delineating sources of bias and technical variability in publicly available DNA and RNA sequencing data. I have also worked on computationally prioritizing disease-associated microbial gene products from metagenomic data.