Work

Engineering case studies from two decades in the industry.

OTTER: Transforming on-call with multi-agent AI incident triage

Architecting a system that ingests production incidents, queries telemetry through 33 MCP tools, and learns from engineer feedback to automate root cause analysis.

ai-infrastructure agents observability 2024-2025

Building a coding agent before the category existed

An auto-PR-from-tickets system at Microsoft, and what it taught us about the architecture decisions the industry would rediscover years later.

ai-infrastructure developer-tools agents 2023-2025

Observability at hyperscale: telemetry anomaly detection across datacenters

Building KQL pipelines to detect SSO latency anomalies across datacenter hops, and solving the meta-problem of telemetry systems becoming their own bottleneck.

observability infrastructure kusto 2023-2025

Copilot for Whiteboard: AI-powered collaboration at Microsoft scale

Bringing LLM-powered suggestions and automations to real-time collaboration across Web, iOS, Android, Surface Hub, and Teams Rooms.

ai-infrastructure collaboration cross-platform 2021-2023

OnePayment: Unifying payment processing across acquired brands at Expedia

How we consolidated five different payment systems across Expedia's portfolio brands into a single platform over three years.

payments platform migration 2016-2021

PVR: rearchitecting the payment vault's data layer for exponential growth

A task to tune a purge batch job turned into a multi-quarter migration from SQL Server to DynamoDB, saving $1.4M per year and eliminating compliance risk permanently.

payments architecture migration 2019-2021