Posted by u/procrastinator_eng
I accidentally burned ~$6,000 of Claude usage overnight with one command.
Last week I woke up to an email saying my Claude usage limit was gone. I hadn't done anything unusual, or so I thought. After digging through the local session logs, I found the culprit: a single /loop command I had set the night before to check my open PRs every 30 minutes. I forgot about it. It ran 46 times over 26 hours, unattended, overnight, on claude-opus-4-7. Two sessions (the loop and a long analytics session I had left open) together burned through roughly $6,000 before I woke up.

Here's the thing, though: the Anthropic dashboard still showed only a fraction of that when I checked it manually. The dashboard has a multi-day reporting lag, so I had no idea anything was wrong until the limit email landed.

***Why did it cost so much? The part most people don't know.***

Every Claude API call sends your entire conversation history, not just the latest message. Turn 1 sends a few hundred tokens. Turn 46 sends 800,000 tokens. The context window limit is just a ceiling; you pay for everything sent on every turn.

To make this cheaper, Anthropic uses prompt caching: if your conversation history was already sent recently, it is served from cache, and a cache read costs about 12.5× less than writing that same history to the cache. The catch: cache entries expire after \~5 minutes of inactivity (the window used to be 1 hour).

So here's what happens with /loop 30m:

* Loop fires → history gets cached → 30 minutes pass → cache expires
* Loop fires again → cache is gone → the entire conversation must be re-cached from scratch at the expensive write rate
* Each iteration also adds its own output to the conversation, so the next re-cache is even larger

By hour 20, the conversation had grown to \~800K tokens. Every overnight iteration was paying to re-cache 800K tokens at the expensive write rate. The actual PR check responses were a rounding error compared to this.

***What I'd do differently***

1. Always add a stop condition to /loop. Instead of: /loop 30m check my PRs.
Write: /loop 30m check my PRs, stop when all are merged or after 3 hours. Claude will terminate the loop itself when the condition is met.
2. Use Sonnet for unattended tasks, not Opus: Opus is roughly 5× more expensive per output token. For automated polling tasks like PR checks, Sonnet handles it fine. Save Opus for the work where you're actually present and the quality difference matters.
3. Don't trust the dashboard as a real-time budget gauge: Anthropic's usage dashboard can lag by days. By the time it shows a spike, the money is already spent. The limit notification email may be your only real-time signal.
4. Know that long-lived sessions aren't free: Keeping one big session alive for automated tasks doesn't save money through caching; it makes things worse. Every automated call with a gap >5 minutes pays to re-cache the entire growing context. Starting a fresh session is often cheaper.
5. max\_turns is not a loop limiter: max\_turns caps the tool-call chain within a single iteration. It has no effect on how many times the loop fires. The only built-in expiry on /loop is a 7-day auto-deletion.
6. The loop runs in the main conversation, so if you keep using the same session and the loop then starts executing, more tokens than necessary are read from and written to the cache on every iteration.

Edit: Thanks, everyone, for the overwhelming response, and for focusing on "the post is AI-written, so it's slop and the author is an idiot". Now, based on a few comments, let me add more details:

1. I agree with everyone that I should have used hooks, but corporate generally blocks third-party MCPs for security reasons, so there is no easy way to hook external events into local sessions. I will take "use bash scripts over claude loop" seriously, though.
2. This was not a single session or a single loop command. What I meant by "single command" is /loop itself. I use Claude on VMs and on my local machine, so the loop command was running across different sessions in parallel.
3. I agree that the "most people don't know" line was not a good way to start the post, but it referred to the combination of the loop and the cache window being restricted to 5 minutes. I have used loops before, but a 5-minute vs. 1-hour cache affects the price a lot. You can find many open issues on Claude related to this change.
4. The goal of this post was to share a TIL moment about using short, uncapped loops or schedules with Claude, and to point out that cache reads/writes can affect your token cost more than anything else. But it looks like we are very far from there.
5. Thanks to the guy who shared the Medium blog post on Pyramid writing. I will definitely use it for the next post.
6. To be honest, I am quite disappointed that 90% of people care more about the post being written by AI than about the actual issue. But I guess I get it: everyone is exhausted from reading AI slop.
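
For anyone who wants to sanity-check the re-cache math from the post, here is a back-of-the-envelope sketch in Python. It is a toy model, not a reconstruction of the actual bill: the base price is a placeholder, the 1.25× write and 0.10× read multipliers are assumptions chosen to match the 12.5× gap mentioned above, and the function names and growth numbers are made up for illustration. It also counts only one request per iteration, while a real iteration can contain a whole tool-call chain.

```python
# Toy cost model for a polling loop whose prompt cache may expire between runs.
# Every constant below is an illustrative assumption, not real billing data.

BASE_INPUT_PER_MTOK = 5.0   # placeholder base input price, USD per million tokens
CACHE_WRITE_MULT = 1.25     # assumed cost of writing context to the cache
CACHE_READ_MULT = 0.10      # assumed cost of reading context from the cache
CACHE_TTL_MIN = 5           # cache lifetime described in the post


def iteration_cost(context_tokens, gap_minutes):
    """Input-side cost (USD) of sending the full context once.

    Simplification: a warm cache is billed entirely as a read, a cold cache
    entirely as a write. Output tokens are ignored.
    """
    cache_is_warm = gap_minutes < CACHE_TTL_MIN
    mult = CACHE_READ_MULT if cache_is_warm else CACHE_WRITE_MULT
    return context_tokens / 1_000_000 * BASE_INPUT_PER_MTOK * mult


def loop_cost(iterations, gap_minutes, start_tokens, tokens_added_per_iter):
    """Total input-side cost (USD) of a loop whose context grows every iteration."""
    total = 0.0
    context = start_tokens
    for _ in range(iterations):
        total += iteration_cost(context, gap_minutes)
        context += tokens_added_per_iter  # each run appends its own output
    return total


if __name__ == "__main__":
    # Hypothetical growth: start small, add ~17K tokens per run, end near 800K.
    cold = loop_cost(46, gap_minutes=30, start_tokens=20_000, tokens_added_per_iter=17_000)
    warm = loop_cost(46, gap_minutes=4, start_tokens=20_000, tokens_added_per_iter=17_000)
    print(f"30-minute gap (cache always cold): ${cold:.2f}")
    print(f"4-minute gap (cache always warm):  ${warm:.2f}")
    print(f"cold / warm ratio: {cold / warm:.1f}x")
```

The absolute dollar figures depend entirely on the placeholder price, so the useful output is the ratio between the two runs and how fast the per-iteration cost climbs as the context grows. Multiply by the number of parallel sessions and the number of requests inside each iteration to get closer to a real workload.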