About

For work I build artificial intelligence systems. In past lives I was a researcher at Harvard Medical School; the first two startups I joined (Invitae and Freenome) both became unicorns; and I co-founded Ravel, a deep learning cancer screening company for which we raised $9.5M.

I currently run a team of ML research engineers where we build ML models and software for different companies. We've worked on ML projects in lots of domains: LLMs, voice-cloning, stable diffusion, computer vision, financial time series, reinforcement learning, genetics, etc.

I'm searching for something meaningful in AI/AGI research to work on!

For fun I love exploring different forms of movement (strength, mobility, dance, rollerblading, gymnastics), a sauna or cold plunge, and an interesting discussion. I was a nationally ranked chess player as a kid.

Work

  • 2021-2023: Head of ML at Eventum AI. Worked on LLMs, voice-cloning, reinforcement learning, stable diffusion, computer vision, financial time series, etc. Managed a team of ML research engineers working for various high-growth startups like Sanas, PlaiLabs, and Deepcell.
  • 2021: Machine Learning Research Consultant. Helped Invitae with models and methods for a non-invasive prenatal screening test, which was productionized shortly after.
  • 2018-2021: Co-founder and Principal Machine Learning Scientist of Ravel Biotechnologies. I applied neural architectures from NLP, vision, and audio to high-throughput genetic sequencing data to build a test to detect the presence of breast cancer in cell-free DNA.
  • 2017-2018: Senior Computational Biologist - Machine Learning team at Freenome. Developed neural network and probabilistic models for colon cancer detection. Freenome went on to raise $1B from venture capitalists.
  • 2013-2017: Senior Bioinformatician at Invitae. Developed machine learning and Bayesian probabilistic algorithms for genetic diagnostics. Main developer and first patent author of Invitae's variant calling system, which determined what mutations were in a patient's genome. Patented another algorithm which allowed us to detect mutations in parts of the genome that competitors could not. Invitae IPOed, became worth $6.5B at its height, and my code helped millions of patients.
  • 2010-2013: Senior Research Associate at Harvard Medical School. Built machine learning and bioinformatics systems to analyze high-throughput genetic and health data. Worked with one of the first groups to do clinical exome sequencing and interpretation, out of Boston Children's Hospital.

Education

  • 2010-now: Self study - Favorites include Richard Sutton's Reinforcement Learning, Hands on Deep Reinforcement Learning, Kevin Murphy's Probabilistic Machine Learning, and Bishop's Pattern Recognition and Machine Learning.
  • 2010-2013: Research at Harvard Medical School - 3 years of bioinformatics research in the Lab for Personalized Medicine under Dr. Peter Tonellato. Taught practicums for graduate and medical students in bioinformatics. I was not enrolled as a student.
  • 2010: Graduated from University of Arizona with two degrees, Computer Science and Physiology.

Patents

  • Systems and Processes of Identifying Genetic Variations. US USSN 15/711,760 · Filed Sep 21, 2017. This is Invitae's Variant Calling pipeline, for which I was first author.
  • METHODS, SYSTEMS AND PROCESSES OF IDENTIFYING GENETIC VARIATION IN HIGHLY SIMILAR GENES. US 20160300014 · Issued Oct 13, 2016. This allowed Invitae to call variants in highly paralogous genes such as PMS2. Second of two authors.

Publications

  • Evaluation of cfDNA as an early detection assay for dense tissue breast cancer. Nature · May 19, 2022. Middle Author.
  • Cell-free DNA fragments inform epigenomic mechanisms for early detection of breast cancer (Using Deep Learning). Apr 10, 2021. Poster, Cancer Research. First Author.
  • Predicting gene expression from plasma cell-free DNA using both the fragment length and fragment position (Using Deep Learning). Jul 1, 2019. AACR. Second Author.
  • Machine learning enables detection of early-stage colorectal cancer by whole-genome sequencing of plasma cell-free DNA. Jan 1, 2019 BMC Cancer. Middle Author.
  • Early Stage Colorectal Cancer Detection Using Artificial Intelligence and Whole-Genome Sequencing of Cell-Free DNA in a Retrospective Cohort of 1,040 Patients. American Journal of Gastroenterology. Middle author.
  • Quantitative Determination of SMN2 Copy Number using Next Generation Sequencing and Correlation to Disease Severity (S5.002). Neurology · Jan 1, 2018. Middle Author.
  • COSMOS: cloud enabled NGS analysis. Jan 28, 2015. BMC. Middle Author.
  • COSMOS: Python library for massively parallel workflows. Bioinformatics · Jun 30, 2014. First Author.
  • TRANSCRIPTIONAL SUBCLASSES FROM PRIMARY HUMAN GLIOBLASTOMA MULTIFORME CELL LINES DEMONSTRATE PROGNOSTIC VALUE. Neuro-Oncology · Jan 1, 2012. Middle Author.
  • Biomedical Cloud Computing With Amazon Web Services. PLOS Computational Biology · Aug 25, 2011. Middle Author.

Misc

  • College: Captained my college's ultimate frisbee team, which placed 5th at nationals and won the national sportsmanship award.
  • College: Part-time middle school elective and substitute teacher at Paulo Freire Freedom School (chess club, web design, solar-powered electric cars, math, PE).
  • Age 12-15: Taught myself to code and built a Unix web hosting company that had ~25 clients.
  • Age 12: In my third year playing chess, placed 44th at Nationals (1st at Regionals and 2nd at State). Placed 10th at Bughouse Nationals (2v2 chess).