Platform & MLOps Engineer

I get models off laptops and onto clusters that don't fall over.

Production infrastructure for AI workloads and distributed systems: Kubernetes, GPU scheduling, GitOps, observability. MSc AI finishing August 2026. Available for fully remote B2B contracts starting September 2026.

Kubernetes, MLOps, AWS, GitOps, Observability, PyTorch

What I Build

Resilient infrastructure for distributed systems and AI workloads. The boring fundamentals, done well.

Kubernetes & EKS

Cluster operations, GPU node pools, zero-downtime upgrades, right-sizing for cost

Observability Stack

Prometheus, Grafana, Loki, distributed tracing, alerting

GitOps & CI/CD

ArgoCD, pipeline automation, deployment strategies, security scanning

Security Automation

SAST/SCA integration, runtime security, policy-as-code

AWS Cloud

EKS, IAM, VPC, Route 53, S3, MSK, RDS — the boring fundamentals

MLOps & AI Infrastructure

Model serving on K8s, GPU scheduling, reproducible training pipelines, drift monitoring

Data Platforms

Kafka, stream processing, training data pipelines, schema evolution

Or just look at what I've shipped.

Featured Work

Four things I've owned end-to-end. What they are, what changed, and a few decisions worth flagging.

Heimdall

Deployment intelligence platform

The dashboard the platform team checks every morning. Answers one question: where is my ticket right now? Used daily by 20+ engineers across 17 services.

17 services tracked

20+ engineers daily

10 min data freshness

Python, Flask, TimescaleDB, Prometheus, ArgoCD, Kubernetes

heimdall

$ curl heimdall/api/v1/debug | jq .

collection.age_seconds: 142

db_pool.checked_out: 2 / 10

circuit_breakers: all closed
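As a rough illustration only (not Heimdall's actual code), a freshness check against that debug output could look like the sketch below. It assumes the 10-minute freshness figure maps to a 600-second limit on `collection.age_seconds`, and that the endpoint returns nested JSON; the function name, constant and payload shape are all hypothetical.

```python
# Hypothetical sketch: names and payload shape are assumptions, not Heimdall's real API.
FRESHNESS_LIMIT_SECONDS = 10 * 60  # assumed from the "10 min data freshness" figure


def is_fresh(debug_payload: dict, limit_seconds: int = FRESHNESS_LIMIT_SECONDS) -> bool:
    """Return True when the last collection run is younger than the limit."""
    # Assumed shape: {"collection": {"age_seconds": <number>}, ...}
    age = debug_payload["collection"]["age_seconds"]
    return age < limit_seconds


# Sample value taken from the terminal output above.
sample = {"collection": {"age_seconds": 142}}
print(is_fresh(sample))  # True: 142 s is under the 600 s limit
```

A check like this would fit a liveness probe or an alert rule, so stale data shows up before anyone reads the dashboard.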

Pipeline Platform

Shared CI/CD library

One Bitbucket pipeline library, imported by every Java and Node service. Tests live in their own repo; promotion belongs to ArgoCD. ~400 deploys/month across 20 services, each onboarded with a single .ci/builds.yaml file.

20 services, one library

~400 deploys/month

1 file to onboard

Bitbucket Shared Pipelines, ArgoCD, Image Updater, Kubernetes, Kustomize

pipeline-platform

$ cat .ci/builds.yaml

service: payments-api

import: java-shared-pipeline:1.4.0

→ Image Updater handles the rest

Observability Stack

Self-hosted monitoring

Prometheus, Grafana and Loki for 20 services across four environments. Built it ourselves because the commercial quotes were ~£100k and we already had the cluster capacity.

~£5k/yr vs ~£100k commercial

~25 dashboards

50+ alerts, runbook each

Prometheus, Grafana, Loki, Thanos, Alertmanager

observability

$ prometheus targets

20/20 targets healthy

25 dashboards active

50+ alert rules configured

Smart Home on K3s

Self-hosted home automation

Single-node Kubernetes cluster on a Raspberry Pi 5, GitOps-reconciled by ArgoCD, observable end-to-end through Prometheus and Grafana. Twenty-plus lights, plugs and sensors. Zero ports exposed to the internet. Same discipline I apply at work, sized to a flat.

Single-node K3s + ArgoCD + Prometheus

20+ lights, plugs and sensors

0 ports exposed to the internet

K3s, ArgoCD, Home Assistant, Zigbee2MQTT, Prometheus, Grafana, Tailscale

smart-home

$ kubectl get apps -n argocd

home-assistant Synced Healthy

zigbee2mqtt Synced Healthy

prometheus + grafana Synced Healthy

Let's talk.

For teams that need specialised infrastructure for AI workloads, GPU-aware Kubernetes, or a platform that holds up under production load. Outside IR35 or international B2B equivalent. I usually reply within a day.

Say hello: jack@devlinops.com

Available for fully remote B2B contracts starting September 2026
