Your CLAUDE.md Is Costing You Thousands of Tokens Per Turn
Your CLAUDE.md Is Costing You Thousands of Tokens Per Turn
I hit the wall. Conversations dying mid-feature, responses slowing down, costs climbing. I was getting great work out of Claude Code — but I was burning through tokens like I had infinite context.
Turns out, most of the damage was self-inflicted.
The hidden tax you're paying every turn
Claude Code loads your CLAUDE.md files on every single turn. Not once per session — every turn. The loading chain walks from your working directory up to ~/.claude/CLAUDE.md, concatenating every CLAUDE.md it finds along the way.
I ran the numbers on mine:
| Source | Size | ~Tokens | Loaded when? |
|---|---|---|---|
| Project CLAUDE.md | 18.7 KB | ~6,000 | Every turn |
| Parent dir CLAUDE.md | 2.9 KB | ~1,000 | Every turn |
| Global CLAUDE.md | 4.7 KB | ~1,500 | Every turn |
| System prompt + tools | ~20 KB | ~6,000 | Every turn |
| Fixed floor | ~14,500 | Before you say a word |
That's ~14,500 tokens of overhead before Claude even reads your message. Over a 50-turn conversation, that's ~725K tokens just on context that never changes.
My project CLAUDE.md alone — 18.7KB of stack descriptions, command references, code organization maps, and domain conventions — was eating 6,000 tokens per turn. Most of that information was only relevant 10% of the time.
The fix: slim router + on-demand references
The core insight is simple: not everything in CLAUDE.md needs to be there.
CLAUDE.md should contain only what Claude needs to know on every single turn — the stuff that prevents catastrophic mistakes. Everything else should live in files Claude reads on demand when the task requires it.
What stays in CLAUDE.md
- Project identity — what is this repo, what are the major components (3-5 lines)
- Safety rails — the "don't screw this up" bullets. Auth patterns that must not be broken, deployment constraints, data migration rules
- Critical environment setup — like a non-obvious PATH requirement that breaks every shell command
- Pointers — "Read
backend/CLAUDE.mdwhen working on the backend"
What moves out
- Full stack descriptions (derivable from package.json / Gemfile)
- Command reference lists
- Code organization maps
- Domain conventions and model templates
- Current project status (changes every session anyway)
I moved all of this into per-component CLAUDE.md files that Claude reads once at the start of a task — not 50 times per conversation.
Result: 18.7KB → 2.6KB. An 86% reduction in per-turn overhead.
The memory file trap
Claude Code's auto-memory system is great for preserving context across sessions. But memory files accumulate. Mine had grown to 78KB across 13 files — and several of them were 10-12KB each, packed with full code examples and war stories.
The audit:
- Delete duplicates. Two of my memory files contained information already in CLAUDE.md. Gone.
- Condense to rules. A 12KB file about framework gotchas became 2KB of key rules without the code examples and commit-by-commit narratives. The rule is what matters, not the story of how you learned it.
- Move reference material. Detailed API endpoint maps and algorithm documentation belong in project docs, not memory. Memory should be a pointer: "the algorithm is documented at
path/to/file."
Result: 78KB → 21KB across 12 files. 73% reduction.
Workflow discipline
Beyond the files, three habits were burning tokens:
Oversized plan documents. One of my implementation plans was 54KB — a small novel. Plans should be 15KB max: task list, key decisions, edge cases. Reference existing patterns by name instead of inlining code.
Reading entire files. Claude Code's Read tool accepts offset and limit parameters. If you need 20 lines from a 500-line file, grep for the line range first, then read just that section. Don't dump entire files into context.
Unnecessary sub-agents. Every sub-agent starts with a fresh context window and gets its own full briefing. For tasks that take fewer than 5 tool calls, just do them in the main context.
How to audit yours
Want to check your overhead? Look at your CLAUDE.md file sizes:
ls -lh ~/.claude/CLAUDE.md ./CLAUDE.md
If your project CLAUDE.md is over 5KB, you're carrying dead weight on every turn. Rough math: 1KB ≈ 300 tokens, multiplied by every turn in your conversation.
The results
| Source | Before | After | Per-session savings |
|---|---|---|---|
| CLAUDE.md (×50 turns) | ~300K tokens | ~40K tokens | 260K |
| Memory files | ~25K tokens | ~7K tokens | 18K |
| Leaner plans + reads | Variable | ~30% less | ~40K |
Conversations last longer before hitting context limits. Responses come back faster. Costs go down. And the quality of work hasn't changed — Claude still has access to all the same information, it just loads it when it needs it instead of carrying it every turn.
Thirty minutes of auditing. No code changes. No feature loss. Just less waste.