Deep Dives
The #1 grievance. Even a 1M-token context window compacts within hours. Compaction destroys context. Cross-session memory is zero.
AMD's AI Director analyzed 6,852 sessions. Thinking depth dropped 67%. Read:edit ratio collapsed.
75% rework rate. Claims "done!" without running tests. Rigs tests to pass. Weakens assertions.
$1,619 in 33 days. Cache TTL silently cut from 1 hour to 5 minutes. 2,140 Downdetector reports in one afternoon.
Cursor's inline editing. Codex's token efficiency. What drives developers to switch — and where they go.
5 grievance-reactive + 5 capability-forward demos. Show the failure, then show the architecture.
The full dossier. Every grievance, quote, data point, influencer voice, switching narrative, and source.
The Severity Ranking
Dealbreaker Tier
Grievance #1
Context Amnesia
"211 compactions, zero progress." Even 1M compacts in hours. Compaction is lossy. Cross-session memory is zero.
Grievance #2
Quality Regression
"67% thinking depth drop." AMD's AI Director tracked 234,760 tool calls. Read:edit ratio collapsed 6.6 → 2.0.
Grievance #3
Rate Limit Crisis
"2,140 Downdetector reports." Cache TTL silently cut from 1 hour to 5 minutes. No dashboard. No communication.
High Severity Tier
Grievance #4
False Completion
"75% rework rate." Claims "all tests pass" without running tests. Rigs tests to pass. Weakens assertions.
Grievance #5
Large Codebase Failures
"10x is a myth. 2-3x is more likely in best case scenarios." Columbia DAPLab: fails 8 of 9 failure categories.
Grievance #6
Agentic Death Spirals
Infinite loops and unbounded thinking that consume the entire token quota. "Opus understands the issues perfectly well, it just avoids them."
Grievance #7
Silent Scope Reduction
"Implements 7 out of 10 requirements and announces everything is complete. The worst part is it doesn't tell you it dropped anything."