Software, Physics, Data, Mountains

Hi, I’m Mark…

I’m a Data Plumber. I spend most days designing/tuning/fixing/banging-on data pipelines and the infrastructure behind data-intensive applications of all sorts.

Originally trained as a Physicist (Ph.D. researching how quantum computers can learn to tolerate noise and errors), I’ve since had an incredible career out in the world where Data Science meets DevOps and infrastructure engineering.

I’ve been lucky enough to work for companies like Canonical (the folks behind the Ubuntu operating system), Infochimps, DigitalOcean, and now Google: places where talented teams of people are innovating and leading open-source communities and the tech industry as a whole.

My passions at the moment include:

Scientific AI/ML. Using various deep or hybrid models to accelerate scientific high-performance computing (HPC) workloads
Operationalizing AI. MLOps and test-driven approaches to the data pipelines used to develop, train, and serve AI models in production environments

Past Obsessions include:

Training data scientists. I co-created and taught a core course, Fundamentals of Data Engineering, for data scientists in Berkeley’s MIDS program
Bringing test-driven methodologies to data pipeline development… keeping pipelines strongly connected to the actual problems that need to be solved
and then pretty much anything to do with AI in infrastructure automation

More Info

Social:

Keys:

GPG public key
SSH public keys on GitHub

Code:

Resume: