Platform & MLOps
Engineer
I get models off laptops and onto clusters that don't fall over.
Production infrastructure for AI workloads and distributed systems — Kubernetes, GPU scheduling, GitOps, observability. MSc AI finishing August 2026.Available for fully remote B2B contracts starting September 2026.
What I Build
Resilient infrastructure for distributed systems and AI workloads. The boring fundamentals, done well.
Kubernetes
& EKS
Cluster operations, GPU node pools, zero-downtime upgrades, right-sizing for cost
Observability
Stack
Prometheus, Grafana, Loki, distributed tracing, alerting
GitOps
& CI/CD
ArgoCD, pipeline automation, deployment strategies, security scanning
Security
Automation
SAST/SCA integration, runtime security, policy-as-code
AWS
Cloud
EKS, IAM, VPC, Route 53, S3, MSK, RDS — the boring fundamentals
MLOps
& AI Infrastructure
Model serving on K8s, GPU scheduling, reproducible training pipelines, drift monitoring
Data
Platforms
Kafka, stream processing, training data pipelines, schema evolution
Or just look at what I've shipped.
View Case StudiesFeatured Work
Four things I've owned end-to-end. What they are, what changed, and a few decisions worth flagging.
Heimdall
Deployment intelligence platform
The dashboard the platform team checks every morning. Answers one question: where is my ticket right now? Used daily by 20+ engineers across 17 services.
Pipeline Platform
Shared CI/CD library
One Bitbucket pipeline library, imported by every Java and Node service. Tests live in their own repo, promotion belongs to ArgoCD. ~400 deploys/month across 20 services on a single .ci/builds.yaml.
Observability Stack
Self-hosted monitoring
Prometheus, Grafana and Loki for 20 services across four environments. Built it ourselves because the commercial quotes were ~£100k and we already had the cluster capacity.
Smart Home on K3s
Self-hosted home automation
Single-node Kubernetes cluster on a Raspberry Pi 5, GitOps-reconciled by ArgoCD, observable end-to-end through Prometheus and Grafana. Twenty-plus lights, plugs and sensors. Zero ports exposed to the internet. Same discipline I apply at work, sized to a flat.
Let's talk.
For teams that need specialised infrastructure for AI workloads, GPU-aware Kubernetes, or a platform that holds up under production load. Outside IR35 or international B2B equivalent. I usually reply within a day.