Hi, I’m Mark…
I’m a Data Plumber. I spend most days designing/tuning/fixing/banging-on data pipelines and the infrastructure behind data-intensive applications of all sorts.
Originally trained as a Physicist (Ph.D. researching how quantum computers can learn to tolerate noise and errors), I’ve since had an incredible career out in the world where Data Science meets DevOps and infrastructure engineering.
I’ve been lucky enough to work for companies like Canonical (the folks behind the Ubuntu operating system), Infochimps, DigitalOcean, and now Google: places where talented teams of people are innovating and leading open-source communities and the tech industry as a whole.
My passions at the moment include:
- Scientific AI/ML. Using various deep or hybrid models to accelerate scientific high-performance computing (HPC) workloads
- Operationalizing AI. MLOps and test-driven approaches to the data pipelines used to develop, train, and serve AI models in production environments
Past obsessions include:
- Training data scientists. I co-created and taught a core course, Fundamentals of Data Engineering, for data scientists in Berkeley’s MIDS program
- Adopting test-driven methodologies into building data pipelines… keeping pipelines strongly connected to the actual problems that need to be solved
- and then pretty much anything to do with AI in infrastructure automation
More Info
Social:
Keys:
Code:
Resume: