Building production-grade ML systems on Kubernetes. I care about reliability, observability, and security — not just model accuracy.
Senior Platform & SRE Engineer with 9+ years of experience designing large-scale cloud platforms, Kubernetes infrastructure, and high-volume observability stacks.
I’m drawn to the intersection of reliability and machine learning: where production discipline meets model chaos. I write about the trade-offs, the failures, and the tooling that actually holds up under load.
Open to conversations about MLOps, platform engineering, SRE, or just building stuff on Kubernetes. Find me on GitHub or send me an email.