For API, platform, and developer-experience teams.

Agent experience audits for developer-facing products

We use AI agents to uncover the documentation, onboarding, API, and workflow issues that affect both agents and human developers.

We test whether agents can discover your docs, authenticate, make API calls, recover from errors, and complete real workflows. The gaps that trip up agents often slow down human developers too — DX Audit helps you see where that friction lives and what to fix first.

Published audits

Stripe

Universal Baseline

Tested March 2026

Opus 4.6 · Sonnet 4.6

3 Positive findings · 2 Observer notes · 2 Minor problems · 2 Major problems
Read report →

Notion

Universal Baseline

Tested March 2026

Opus 4.6 · Sonnet 4.6

4 Positive findings · 3 Minor problems · 1 Major problem · 1 Critical problem
Read report →

GitHub

Universal Baseline

Tested March 2026

Opus 4.6 · Sonnet 4.6

4 Positive findings · 7 Minor problems
Read report →

How we test

Every new service starts with a baseline audit — a standardised six-task suite (Discover, Onboard, Core Task, Error Handling, Cleanup, Reflection) run with real AI agents. Two models, two full runs, divergences recorded.
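The baseline described above can be pictured as a run matrix: every combination of model, run, and task. The sketch below is purely illustrative — the task and model names come from this page, but the structure and function names are hypothetical, not DX Audit's actual tooling.

```python
# Hypothetical sketch of a baseline audit's run matrix: six standard
# tasks, two models, two full runs per model. Names are illustrative.

from itertools import product

TASKS = ["Discover", "Onboard", "Core Task", "Error Handling", "Cleanup", "Reflection"]
MODELS = ["Opus 4.6", "Sonnet 4.6"]
RUNS = 2

def baseline_matrix():
    """Enumerate every (model, run, task) cell in one baseline audit."""
    return [
        {"model": model, "run": run, "task": task}
        for model, run, task in product(MODELS, range(1, RUNS + 1), TASKS)
    ]

matrix = baseline_matrix()
print(len(matrix))  # 2 models x 2 runs x 6 tasks = 24 cells
```

Recording each cell separately is what makes divergences visible: the same task can be compared across runs of one model, or across the two models.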

In client engagements we go further: we align on the outcomes that matter most to your team, design task suites around your key workflows, and pair agent audits with moderated usability tests involving human developers to uncover friction in self-service dashboards, developer portals, and onboarding flows.

Every report documents the starting state, fixture policy, and credential timing so the conditions are reproducible and the results are auditable.

Read the full methodology →

About

See the work. Then talk.

Read a published audit, or get in touch about auditing your own product.