For API, platform, and developer-experience teams.

Agent experience audits for developer-facing products

I use AI agents to uncover the documentation, onboarding, API, and workflow issues that affect both agents and human developers.

See a published audit · Get your baseline audit

I test whether agents can discover your docs, authenticate, make API calls, recover from errors, and complete real workflows. The gaps that trip up agents often slow down human developers too — DX Audit helps you see where that friction lives and what to fix first.

dx-audit

$ dx-audit run --baseline

✓ Discover docs

✓ Authenticate

✓ Create resource

⚠ Update record (finding)

✓ Handle error

4 passed · 1 finding · 0 blocked

Run by Matt Steen — product leader, twenty years shipping APIs, integrations, and self-serve experiences. Previously Head of Product in ecommerce; before that, e-learning, insurance, and telecoms.

Published Audits

Stripe

Universal Baseline

Tested March 2026

Opus 4.6 · Sonnet 4.6

3 Positive findings · 2 Observer notes · 2 Minor problems · 2 Major problems

Read report →

Notion

Universal Baseline

Tested March 2026

Opus 4.6 · Sonnet 4.6

4 Positive findings · 3 Minor problems · 1 Major problem · 1 Critical problem

Read report →

GitHub

Universal Baseline

Tested March 2026

Opus 4.6 · Sonnet 4.6

4 Positive findings · 7 Minor problems

Read report →

See patterns across audits →

Rigour

How I test

Every new service starts with a baseline audit: a standardised six-task suite (Discover, Onboard, Core Task, Error Handling, Cleanup, Reflection) run with two Claude models at different capability tiers (Opus and Sonnet). The contrast between tiers separates obvious breakdowns, where even the top tier fails, from subtle friction that only the lower tier hits.
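The tier contrast can be pictured as a small decision rule. The sketch below is illustrative only: the `Outcome` enum, the `classify` function, and the label strings are hypothetical names chosen for this example, not part of the dx-audit tooling, and the "inconsistent" case is an assumption about how a top-tier-only failure might be treated.

```python
from enum import Enum


class Outcome(Enum):
    """Result of one model attempting one task (hypothetical, for illustration)."""
    PASSED = "passed"
    FAILED = "failed"


def classify(top_tier: Outcome, lower_tier: Outcome) -> str:
    """Label a task by comparing the two capability tiers.

    Both tiers fail      -> obvious breakdown (even the top tier fails)
    Only lower tier fails -> subtle friction
    Only top tier fails   -> treated here as inconsistent (an assumption)
    """
    if top_tier is Outcome.FAILED and lower_tier is Outcome.FAILED:
        return "obvious breakdown"
    if lower_tier is Outcome.FAILED:
        return "subtle friction"
    if top_tier is Outcome.FAILED:
        return "inconsistent, worth a re-run"
    return "no issue observed"


if __name__ == "__main__":
    # Made-up outcomes for three of the six baseline tasks, not real audit data.
    example = {
        "Discover": (Outcome.PASSED, Outcome.PASSED),
        "Core Task": (Outcome.PASSED, Outcome.FAILED),
        "Cleanup": (Outcome.FAILED, Outcome.FAILED),
    }
    for task, (top, lower) in example.items():
        print(f"{task}: {classify(top, lower)}")
```

In a real report each label would sit alongside the observed agent behaviour. The rule only shows why running two tiers gives more signal than running one.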

In client engagements I go further: aligning on the outcomes that matter most to your team and designing task suites around your key workflows and integration paths.

Every report documents the starting state, fixture policy, and credential timing so the conditions are reproducible and the results are auditable. These are case studies, not a benchmark leaderboard.
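As a rough illustration of what recording those conditions could look like, here is a minimal sketch. The `RunConditions` class, its field names, and the example values are assumptions made for this example; the actual report format may differ.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class RunConditions:
    """The three conditions the text says every report documents (hypothetical shape)."""
    starting_state: str
    fixture_policy: str
    credential_timing: str


def to_report_header(conditions: RunConditions) -> str:
    """Serialise the conditions as stable, sorted JSON so two runs can be compared."""
    return json.dumps(asdict(conditions), indent=2, sort_keys=True)


if __name__ == "__main__":
    # Invented example values, not taken from any published audit.
    conditions = RunConditions(
        starting_state="fresh account, no existing resources",
        fixture_policy="agent creates its own test data",
        credential_timing="API key provided before the Onboard task",
    )
    print(to_report_header(conditions))
```

Freezing the record and sorting the keys keeps the header identical across runs whose conditions match, which makes differences between two audits easy to spot.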

Read the full methodology →

Who

About

Matt Steen

I'm Matt. I've spent twenty years building products where usability determines adoption — most recently running a product org shipping self-serve platforms, APIs, and integrations across three regions and eight languages. Before that, e-learning, insurance, and telecoms. DX Audit applies the methods I've used on human users — rigorous observation of real people completing real tasks — to a new kind of user: AI agents working with developer-facing products.

See the work. Then talk.

Read a published audit, or get in touch about auditing your own product.
