Your frontier model is doing intern work.
Every sub-task your agent spawns — grepping 200 files, running the test suite, applying a spec’d-out patch — runs on your most expensive model by default. ax keeps the frontier model where it earns its keep and flags the mechanical work down a tier. Same work lands; you just get more of it before your usage window resets. ax learns the fix from your own history and proves the result — all on your laptop.
The default leaves money on the table · one real machine, 14 days of receipts
$2,301 of sub-task spend on fable/opus vs $83 on sonnet, on this machine: roughly 28:1. Run ax cost split to see yours.
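The ratio behind those two figures is simple arithmetic. A minimal sketch, using only the two numbers quoted above; the variable names are ours, and the share is computed over these two figures only, not over any other models the table might list:

```python
# Figures quoted above: one machine, 14 days of sub-task spend (USD).
frontier_spend = 2301.0  # sub-task spend on fable/opus
cheaper_spend = 83.0     # sub-task spend on sonnet

# How many dollars went to the expensive tier for each dollar on the cheaper one.
ratio = frontier_spend / cheaper_spend

# Share of these two figures that landed on the expensive tier.
frontier_share = frontier_spend / (frontier_spend + cheaper_spend)

print(f"{ratio:.1f}:1")         # 27.7:1, i.e. roughly 28:1
print(f"{frontier_share:.1%}")  # 96.5%
```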
Measure. Nudge. Tune. Verify.
Four commands close the loop — see where the expensive defaults hide, get warned before the next one, learn the fix from what your agents actually do, then reprice against the tokens your sub-tasks actually burned.
ax cost split --days=7
Breaks your spend into main session vs the sub-tasks your agent spawns, by model. The 28:1 above came out of this table; run it to see your ratio.
ax hooks install ~/.ax/hooks/route-dispatch.ts --providers=claude
The route-dispatch hook warns in the moment, right as a routine sub-task is about to run on the expensive default.
ax routing tune
Finds the routine work you keep overpaying for, from what your agents actually do. Deterministic pattern-matching, no AI guessing.
ax dispatches --candidates
Reprices every overbilled sub-task against what the cheaper model would have cost, from the actual tokens it burned. Your savings, not projections.
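To make "deterministic pattern-matching" concrete, here is a minimal sketch of the general idea: fixed rules applied in a fixed order, so the same sub-task description always gets the same label. This is not ax's actual rule set; the class names, the regexes, and the classify helper are all hypothetical, chosen only to mirror the examples on this page.

```python
import re
from typing import Optional

# HYPOTHETICAL rules for illustration only; ax's real patterns are not shown on this page.
# Order matters: the first matching rule wins.
RULES = [
    ("file-search", re.compile(r"\b(grep|find|search|list)\b.*\bfiles?\b", re.I)),
    ("verification", re.compile(r"\b(run|rerun)\b.*\b(tests?|test suite)\b", re.I)),
]


def classify(description: str) -> Optional[str]:
    """Return the first matching mechanical class, or None if nothing matches."""
    for name, pattern in RULES:
        if pattern.search(description):
            return name
    return None  # unmatched work is left alone


print(classify("grep 200 files for the deprecated call"))  # file-search
print(classify("Run the test suite"))                      # verification
print(classify("Review the auth design"))                  # None
```

Because there is no model in the loop, a rule either matches or it does not, and anything unmatched is simply not flagged.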
Same work landed. Fewer windows burned.
The honest answer to “how do I know it didn’t get worse” is: you measure it. Hold the work constant — commits landed, turns taken — and watch the plan window stretch. ax routing impact captures a routing-off block and a routing-on block live on your machine, then diffs them.
$ ax routing impact report
matched work across both blocks
                                    routing off   routing on
────────────────────────────────────────────────────────────
assistant turns (work proxy)                312          318
commits landed                               11           11
5h-window utilization burned              1.00x        0.82x
────────────────────────────────────────────────────────────
→ 1.22x more work per plan window, same output landed
Illustrative shape, not a recorded receipt — this is the one number on the page that’s yours to capture, not ours to claim. Run the two blocks on your machine (ax routing impact begin --arm=off … end, then the same with --arm=on) and ax routing impact report fills in your own off-vs-on diff. Your plan window is the budget, so the receipt is in windows, not invoices.
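The headline number in that report follows from the utilization row. A minimal sketch of the arithmetic, using the illustrative figures above; the Block type and its field names are hypothetical, and this is not how ax itself computes the report:

```python
from dataclasses import dataclass


@dataclass
class Block:
    """One measured block. Field names are hypothetical."""
    turns: int                 # assistant turns (work proxy)
    commits: int               # commits landed
    window_utilization: float  # 5h-window utilization burned, relative to routing off


def work_per_window_gain(off: Block, on: Block) -> float:
    """How much more matched work fits in one plan window with routing on."""
    if off.commits != on.commits:
        raise ValueError("blocks are not matched on commits landed")
    return off.window_utilization / on.window_utilization


# Illustrative figures from the report above, not a recorded receipt.
off = Block(turns=312, commits=11, window_utilization=1.00)
on = Block(turns=318, commits=11, window_utilization=0.82)

gain = work_per_window_gain(off, on)
print(f"{gain:.2f}x more work per plan window")  # 1.22x more work per plan window
```

The commit check is the point of the comparison: the gain only means something when both blocks landed the same output.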
One machine, verbatim.
Over 30 days: 20 patterns of routine work found, $591.57 of addressable spend. Reprice the last 14 days against the cheaper model and $512.91 is recoverable — the same machine as the numbers above. ax routing tune --dry-run prints yours in one command.
$ ax routing tune --dry-run --days=30
20 proposals
addressable spend: $591.57 (30 days)
apply non-judgment: ax routing tune --days=30
brief: ax routing tune --emit-brief

$ ax dispatches --candidates --days=14
total est savings: $512.91
top classes: well-specified-impl ($222.78), spec-review ($69.75), bug-fix ($62.08)
Honest numbers, on purpose: “addressable spend” is what the flagged sub-tasks actually cost over the window — yours included. ax reprices what already happened, from the actual tokens burned, and never reports fabricated projected savings.
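Repricing from actual tokens can be sketched in a few lines: take the tokens each flagged sub-task already burned, price them at both tiers, and sum the difference per class. Everything below is an assumption for illustration: the prices are placeholder numbers rather than any provider's real rates, and the SubTask type, tier names, and token counts are invented.

```python
from dataclasses import dataclass

# HYPOTHETICAL prices in USD per million tokens. Placeholders, not real rates.
PRICE_PER_MTOK = {
    "frontier": {"input": 10.0, "output": 50.0},
    "cheaper": {"input": 2.0, "output": 10.0},
}


@dataclass
class SubTask:
    """A sub-task that already ran. Field names are hypothetical."""
    task_class: str
    input_tokens: int
    output_tokens: int


def cost(task: SubTask, tier: str) -> float:
    """Price the tokens this sub-task actually burned at the given tier."""
    price = PRICE_PER_MTOK[tier]
    total = task.input_tokens * price["input"] + task.output_tokens * price["output"]
    return total / 1_000_000


def reprice(tasks: list[SubTask]) -> dict[str, float]:
    """Sum, per class, the gap between what was paid and the cheaper-tier cost."""
    savings: dict[str, float] = {}
    for task in tasks:
        delta = cost(task, "frontier") - cost(task, "cheaper")
        savings[task.task_class] = savings.get(task.task_class, 0.0) + delta
    return savings


# Invented token counts, for illustration only.
history = [
    SubTask("bug-fix", input_tokens=1_000_000, output_tokens=100_000),
    SubTask("bug-fix", input_tokens=500_000, output_tokens=0),
    SubTask("spec-review", input_tokens=2_000_000, output_tokens=200_000),
]

by_class = reprice(history)
print(by_class)                # {'bug-fix': 16.0, 'spec-review': 24.0}
print(sum(by_class.values()))  # 40.0
```

Nothing here is forecast: every input is a token count from work that already happened, which is why the result is a reprice and not a projection.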
Routing only touches the mechanical lanes.
Your obvious objection: won’t quality drop? The honest guarantee isn’t “it can’t” — it’s that the risk is bounded, visible, and reversible. ax only flags deterministic mechanical classes (bounded), you own the routing table and see every class (visible), and the impact receipt catches a regression before it compounds (reversible). Your reviews, design calls, and plans never leave the frontier model.
flagged to tier down
- File search & repo recon (billed at opus rates today)
- Well-specified implementation (the spec did the thinking)
- Bug fixes with a repro (mechanical once located)
- Verification & test runs (pass/fail, not judgment)
stays on the frontier model
- Code review (quality gate, never flagged)
- Design & UX (taste does not tier down)
- Planning & architecture (the expensive mistakes)
- Audits (the thing checking the cheap work)
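The two lists above amount to a small routing table. A minimal sketch, assuming a plain set-based table; well-specified-impl and bug-fix are class names that appear in the output earlier on this page, while the other class names, the tier labels, and suggested_tier are hypothetical:

```python
# Classes flagged to tier down. Only "well-specified-impl" and "bug-fix"
# appear verbatim on this page; the other names are hypothetical.
MECHANICAL = {"file-search", "well-specified-impl", "bug-fix", "verification"}

# Classes that stay on the frontier model (names hypothetical).
JUDGMENT = {"code-review", "design", "planning", "audit"}


def suggested_tier(task_class: str) -> str:
    """Suggest a tier for a sub-task class; anything not listed stays on frontier."""
    if task_class in MECHANICAL:
        return "cheaper"
    # Judgment classes and unknown classes both default to the frontier model,
    # so a class the table has never seen is never tiered down by accident.
    return "frontier"


print(suggested_tier("bug-fix"))        # cheaper
print(suggested_tier("code-review"))    # frontier
print(suggested_tier("something-new"))  # frontier
```

Defaulting unknown work to the frontier model is the design choice that keeps the risk bounded: only classes you have explicitly listed can move.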
Route the expensive model where it earns its keep.
Everything runs local. Your transcripts never leave your machine.
$ curl -fsSL https://ax.necmttn.com/install | bash
$ ax ingest # first run: build the graph from your history
$ ax routing tune