PartnerinAI

Claude Code Resume Cache Bug: What Was Happening

Claude Code resume cache bug explained: why --resume caused prompt cache misses, hit usage limits faster, and affected MCP-heavy setups.

📅 April 2, 2026 · 7 min read · 📝 1,388 words

⚡ Quick Answer

The Claude Code resume cache bug appears to have caused a full prompt-cache miss on the first resumed request in certain setups since v2.1.69. It hit MCP-heavy workflows hardest because resumed sessions rebuilt expensive context instead of reusing cached prompts.

The Claude Code resume cache bug wasn't some odd little corner case. It looked more like a billing and workflow snag hiding behind a convenience flag. People who relied on --resume expected the thread to pick back up cleanly. But some got a nastier surprise instead: usage limits dropping faster than expected, especially in MCP-heavy setups. That's the kind of defect you don't spot right away. The numbers just drift. And once you follow the trail, the pattern starts to feel a bit too logical.

What is the Claude Code resume cache bug?

The Claude Code resume cache bug looks like a prompt-caching failure. Simple enough. After someone used --resume, the first request seems to have skipped cache and rebuilt pricey context from scratch. According to the source summary, the issue appeared from v2.1.69 onward and mainly hit that first resumed request, not every turn in the session. That's a consequential distinction. Users came back expecting a cheap continuation. But the system appears to have treated that first turn more like a cold boot. In workflows that pull in heavy tool context, that extra rebuild can get expensive fast and skew how people read their usage. Anthropic's developer tooling relies heavily on prompt construction and cache reuse to keep things efficient, so a cache miss at exactly that moment isn't minor. We'd argue bugs like this feel worse than a crash. They chip away at trust quietly, one costly resume at a time. Think of a VS Code workspace reopening with all extensions re-indexing at once.
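
The "cold boot" pattern described above is visible in API usage data. As a hedged sketch: the field names below (cache_read_input_tokens, cache_creation_input_tokens) come from the Anthropic Messages API usage block, but the values and the helper function are illustrative, not part of Claude Code itself.

```python
# Illustrative sketch: spotting a resume that rebuilt context instead of
# reading cache. Field names mirror the Anthropic Messages API usage block;
# the numbers are made up for illustration.

def resume_looks_cold(usage: dict) -> bool:
    """True when a request wrote a fresh cache entry and read nothing back."""
    read = usage.get("cache_read_input_tokens", 0)
    created = usage.get("cache_creation_input_tokens", 0)
    return created > 0 and read == 0

# A healthy resume mostly reads cached prompt state:
healthy = {"input_tokens": 900,
           "cache_read_input_tokens": 78_000,
           "cache_creation_input_tokens": 0}

# The buggy pattern: the same prompt rebuilt from scratch on turn one:
buggy = {"input_tokens": 900,
         "cache_read_input_tokens": 0,
         "cache_creation_input_tokens": 78_000}

print(resume_looks_cold(healthy))  # False
print(resume_looks_cold(buggy))    # True
```

A check like this, run against logged usage blocks, turns "the numbers feel off" into a concrete signal.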

Why did Claude Code prompt cache miss on resume hit MCP setups hardest?

Claude Code prompt cache miss on resume hit MCP setups the hardest. Here's the thing. MCP-heavy workflows usually carry bigger, messier prompt state into each turn. The Model Context Protocol pushes rich tool ecosystems, and that often brings more metadata, tool schemas, deferred calls, and outside context packed into the prompt. So when cache reuse breaks, the penalty climbs in a hurry. A small local coding task may barely register it. But a session connected to several MCP servers can suddenly eat far more tokens on that first resumed request. That's in line with the source summary, which points to MCP servers, deferred tools, and custom agent setups as the main risk group. The protocol itself isn't the culprit. Not quite. But larger prompt footprints make cache failures hurt a lot more, much like a CDN miss stings harder on a giant video file than on a tiny icon. That's a bigger shift than it sounds. Picture a team wiring Claude Code into Slack, Linear, and a private docs server all at once.
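
The scaling argument can be made concrete with a small back-of-the-envelope sketch. The prompt sizes and the ~10% cached-read rate below are assumptions (Anthropic has priced cache reads at roughly a tenth of the base input rate, but exact rates vary by model), so treat the numbers as illustrative, not billing math.

```python
# Hypothetical prompt footprints for three setups. The token counts are
# invented to illustrate scale, not measured from real sessions.
setups = {
    "small local task":       5_000,
    "one MCP server":        30_000,
    "Slack + Linear + docs": 120_000,
}

# Assumption: cache reads billed near 10% of the base input-token rate.
CACHE_READ_RATE = 0.1

def miss_penalty(prompt_tokens: int, cache_read_rate: float = CACHE_READ_RATE) -> float:
    """Extra input cost (in base-token units) when a cache hit becomes a miss."""
    return prompt_tokens * (1.0 - cache_read_rate)

for name, tokens in setups.items():
    print(f"{name:>22}: ~{miss_penalty(tokens):,.0f} extra token-units per miss")
```

The same single miss costs roughly 24 times more in the three-server setup than in the small local task, which is the whole "CDN miss on a giant video file" effect in miniature.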

How did the Claude Code usage limits bug actually show up for users?

In practice, the Claude Code usage limits bug showed up as faster-than-expected limit burn after a resumed session, not as some loud error banner. That's why plenty of users likely read it first as a quota issue, not a caching one. They resumed, sent a request, and watched consumption jump in a way that didn't fit the apparent task. In MCP-heavy environments, one resumed request may have quietly rebuilt a large prompt envelope, making the token spend feel wildly out of proportion. We've seen similar patterns in other AI developer tools, where hidden context assembly rather than output length drives the bill. And because prompt caches are supposed to hide that cost, users had every reason to assume continuity would stay cheap. My view is pretty blunt. When cache behavior changes the economics of usage, the product should say so more clearly than a shell flag and a shrug. GitHub Copilot users have made similar complaints when invisible context handling changed costs or latency.

Fixing the Claude Code resume bug: what should users do now?

To reduce Claude Code resume bug risk in the real world, users should update to the corrected version once it's available and trim expensive resumed context paths for now. The quickest workaround is to treat the first resumed request as possibly uncached and keep it small, focused, and low-context when you can. That's not elegant. But it does soften the hit if you're working with multiple MCP servers, deferred tools, or custom agents that attach a lot of prompt state. You should also compare token usage between fresh sessions and resumed sessions on the same task, because that gives teams a cleaner signal than gut-level frustration. Logging prompt size, enabled tools, and version numbers can reveal whether v2.1.69 or later lines up with the spike. And if you're running shared engineering workflows, document the issue internally so nobody mistakes a tooling quirk for careless usage. We'd argue a simple internal note in Notion or Jira can save a lot of confusion.
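
The fresh-versus-resumed comparison suggested above can be sketched as a small log audit. This is a minimal illustration with invented log entries; the column names echo the Anthropic API usage fields, but the `suspect_resumes` helper and its threshold are hypothetical, not a Claude Code feature.

```python
# Made-up per-request log for the same task run two ways:
# (session_kind, turn, uncached_input_tokens, cache_read_input_tokens)
log = [
    ("fresh",   1, 62_000,      0),  # first turn always builds the cache
    ("fresh",   2,  1_500, 60_500),
    ("resumed", 1, 63_000,      0),  # suspicious: a resume should mostly read cache
    ("resumed", 2,  1_400, 61_600),
]

def suspect_resumes(entries, min_cached_share=0.5):
    """Return resumed turns whose cached share of input falls below the threshold."""
    out = []
    for session, turn, uncached, cached in entries:
        share = cached / (uncached + cached)
        if session == "resumed" and share < min_cached_share:
            out.append((turn, uncached, cached))
    return out

for turn, uncached, cached in suspect_resumes(log):
    print(f"resumed turn {turn}: {uncached:,} uncached vs {cached:,} cached input tokens")
```

Pairing output like this with the Claude Code version number (v2.1.69 or later) and the list of enabled MCP servers gives teams the clean before/after evidence the paragraph above recommends.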

Key Statistics

  • The investigation summary says the bug appeared since Claude Code v2.1.69. That version marker gives teams a concrete release boundary for troubleshooting and rollback decisions.
  • The cache problem mainly affected the first resumed request after using --resume. This narrows the failure mode and explains why some sessions seemed normal after the initial spike.
  • MCP servers, deferred tools, and custom agents were identified as the setups most affected. These environments tend to assemble larger prompt payloads, so cache misses create higher token costs.
  • Users reported usage limits getting hit much faster in MCP-heavy workflows. That symptom matters because it translates a hidden caching flaw into a direct productivity and budget issue.

Key Takeaways

  • The Claude Code resume cache bug likely caused a first-request prompt cache miss after --resume.
  • MCP-heavy setups felt the issue most because their context payloads were larger.
  • Users saw usage limits drop faster after resuming longer sessions.
  • Deferred tools and custom agents appear to have widened the blast radius.
  • The bug matters because cache misses quietly raise cost, latency, and workflow friction.