Grievances #3 & #8 · Mixed Severity

The $20/Hour Problem

$1,619 in 33 days. 2,140 Downdetector reports in one afternoon. Cache TTL silently cut from 1 hour to 5 minutes.

The Cost Landscape

Platform
Input / MTok
Output / MTok
Context
Claude Code API
$15
$75
1M
Claude Max 5x
$100/mo flat (~$543/wk value)
1M
SubQ Code
$1
$5
12M
15x cheaper input. 15x cheaper output. 12x more context. Subquadratic attention makes this economically possible — the same architecture that enables 12M tokens also makes each token cheaper to process.
The double hit: Costs went up while quality went down. See Quality Regression for the AMD data documenting the simultaneous decline.

The March 23 Crisis

rate limit crisis — March 23, 2026
08:30 EDT Multiple users hit limits within 10-15 minutes of starting work anomalous
12:29 EDT 2,140+ Downdetector reports in a single afternoon crisis
afternoon Full weekly Max limits exhausted in a single day for some users limit hit
evening No human support response. No status page update. No communication. silence
days later Anthropic confirms "investigating" — the first official acknowledgment acknowledged
apr 20 XDA confirms: cache TTL silently cut from 1 hour to 5 minutes confirmed

Hidden Token Taxes

CLAUDE.md Tax

~10K tokens loaded every single request. A 9-developer team found bills 3x higher than expected — CLAUDE.md alone accounted for the majority of the inflation.

every request, every turn

Cache TTL Cut

Silently reduced from 1 hour to 5 minutes. The same prompt is now re-billed every turn instead of hitting cache. 10-20x cost inflation for some users.

silent, no announcement

Compaction Re-reads

Post-compaction, the model re-reads files it already processed. You pay for the same context twice — once when it was first loaded, again when compaction forces a re-read.

double-billing for same context

No Cost Dashboard

Cursor shows token consumption per minute. Claude Code shows nothing. Users have no visibility into what's consuming their budget until they hit the wall.

flying blind

v2.1.100 Token Inflation

~20,000 invisible tokens per request in the v2.1.100 update. Limits burn ~40% faster with no visible change in behavior or output quality.

invisible overhead

$200/mo Burns in 3 Days

Running 20+ agents on Opus 4.6 with high effort exhausts the Max 20x plan ($200/month) in 3 days. Documented by ICOR with Tom (73K views on YouTube).

$66/day effective rate
"$1,619 in Claude Code API costs over 33 days."
Future Stack Reviews · April 2026
"$1,892.38 over 13 months. I cancelled with receipts."
Chandler Nguyen · Twitter/X · public cancellation

SubQ Code Answer

SubQ Code

Subquadratic Attention Economics

12M tokens at $1/$5 per MTok. Minimal compaction overhead — 12x more context means most sessions complete without triggering compaction at all. No cache TTL games — the context is persistent, not cached. Load everything once. The same work, a fraction of the cost.

$1 input / MTok $5 output / MTok minimal re-reads no cache games 12M persistent