Hi! I am a software engineer and a bioinformatician. I work at OpenAI. I build workflow management systems and online infrastructure services at scale, whether to serve scientists building DNA sequencers, cloud computing platforms for clinical cancer and prenatal diagnostics, population genomics for drug discovery and epidemiology, or deep research workflows to enable AI-driven scientific inference. Before OpenAI, I worked at Color, where I was lucky to help build the future of genetic disease risk modeling, population genomics, and healthcare delivery, as well as provide emergency diagnostic services during the COVID-19 pandemic.

Prior to Color, I worked at the Chan Zuckerberg Initiative, where I helped build infrastructure for the CZID project (also known as IDseq); a lot of this work is available as open source in the SWIPE and czid-workflows repositories.

My areas of deep expertise are in Python, Linux, networking and distributed systems, workflow management systems, CUDA and other compute architectures (inference optimization), Linux containerization (Docker) and orchestration (Kubernetes), statistical modeling with probabilistic graphical models, API design, cloud infrastructure design (I have 15 years of experience and expert-level familiarity with many parts of AWS, GCP, and Azure), and developer tooling (Codex, Cursor, VSCode, CI/CD platforms, agentic AI information architecture, Terraform/IaC). While I am not a cryptography, security, or compliance expert, I have built (and still maintain) reference implementations of cryptographic algorithms and know enough to evaluate work, identify and fix problems in these domains, and build systems that prioritize actual security and reliability.

As a scientist, I am an expert in genome sequence analysis, including mapping, assembly, annotation, variant calling and interpretation, clinical reporting, sequencing technology R&D, metagenomics, phylogenomics/evolutionary genomics, epigenomics/methylation detection, and single-molecule sequencing. I have a deep interest and long-term research objective in evolutionary analysis of large eukaryotic genomes, and in particular transposable element-mediated evolution of gene regulatory networks. In my spare time, I develop tools and read research toward that objective.

Before my industry career, I graduated from UC Berkeley with a triple major in computer science, mathematics, and statistics. I then went to grad school at Georgia Tech and graduated with a PhD in bioinformatics. My undergraduate research work was at Lawrence Berkeley Lab with Inna Dubchak and Mike Brudno. Together with Mike’s advisor Serafim Batzoglou, we created the first global multiple genome alignment of multiple large eukaryotic genomes. My graduate advisor was Joshua Weitz, and my thesis title was “Algorithm development for next generation sequencing-based metagenome analysis”. You can read more exciting details in my CV. But a lot of what I learned during my time in grad school, I actually learned at Black Knight Martial Arts.

A few notable infrastructure projects that I implemented include an LXC-based cloud PaaS virtualization service which continues to power millions of jobs and Docker apps on the DNAnexus Titan platform; an independent implementation of SAML, OAuth2/OIDC, and their underlying technologies for single sign-on applications; an AWS IAM-based symbolic RBAC PDP service; and multiple API designs, CLI tools, and developer productivity tools for a variety of products. On the science side, I have mostly been involved in integration and tuning of existing genomics and machine learning software written by people smarter than me, but you can see some of my concrete contributions on my GitHub and Google Scholar profiles.

I live in San Francisco. In my spare time I do a lot of the usual suspect activities such as biking, running, photography, and selecting the proper power tools for home improvement.