Anthropic shares how to make Claude code better with a harness

LLMs 863 points 135 comments 1 month ago

I just read Anthropic's new blog post about harness design for Claude. The author addresses two main problems Claude faces when working for extended periods:

- Context anxiety: loss of coherence over long periods
- Self-evaluation bias: Claude often praises its own work even when the quality isn't good

The solution is to use multiple agents working together, drawing ideas from GANs:

- Generator: creates code and design
- Evaluator: provides critical evaluation and feedback

Frontend: use 4 scoring criteria (emphasizing aesthetics and creativity) to avoid generic designs. After 5-15 revisions, the result is much more beautiful and unique.

Full-stack: use 3 agents (Planner, Generator, Evaluator).

Comparison on the same game development requirements:

- Running alone: fast, but the game has serious bugs.
- Using a harness: more time-consuming and expensive, but significantly higher quality, with a beautiful interface, a playable game, and added AI support.

The article also suggests that when the model becomes more powerful (like Opus 4.6), unnecessary harness elements should be removed.

Link: https://www.anthropic.com/engineering/harness-design-long-running-apps

Anyone using Claude to code or build agents should give this a try.
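For anyone who wants to picture the generator/evaluator loop, here is a rough sketch of how I imagine it, not Anthropic's actual code. Everything in it is my own assumption: the names `run_harness`, `Review`, `ToyGenerator`, and `toy_evaluate` are hypothetical, the single 0-to-1 score stands in for the scoring criteria, and the 0.8 pass threshold is arbitrary. The toy functions replace real model calls so the loop runs on its own; the cap of 15 rounds just mirrors the upper end of the 5-15 revisions mentioned in the post.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class Review:
    score: float   # assumed scale: 0.0 (bad) to 1.0 (great)
    feedback: str  # critique handed back to the generator


def run_harness(
    task: str,
    generate: Callable[[str, str], str],
    evaluate: Callable[[str, str], Review],
    max_rounds: int = 15,     # assumption: upper end of "5-15 revisions"
    pass_score: float = 0.8,  # assumption: arbitrary acceptance threshold
) -> Tuple[str, Review, int]:
    """Alternate a generator and a separate evaluator until the draft passes
    or the round budget runs out. Returns (best draft, its review, rounds used)."""
    feedback = ""
    best_draft, best_review = "", Review(score=-1.0, feedback="")
    rounds_used = 0
    for rounds_used in range(1, max_rounds + 1):
        draft = generate(task, feedback)   # generator never grades itself
        review = evaluate(task, draft)     # evaluator only critiques
        if review.score > best_review.score:
            best_draft, best_review = draft, review
        if review.score >= pass_score:
            break
        feedback = review.feedback         # feed the critique into the next round
    return best_draft, best_review, rounds_used


class ToyGenerator:
    """Stand-in for a model call: each call just bumps a revision counter."""

    def __init__(self) -> None:
        self.revision = 0

    def __call__(self, task: str, feedback: str) -> str:
        self.revision += 1
        return f"{task} (revision {self.revision})"


def toy_evaluate(task: str, draft: str) -> Review:
    """Stand-in for an evaluator model: later revisions score higher."""
    revision = int(draft.rsplit(" ", 1)[1].rstrip(")"))
    score = min(1.0, 0.3 * revision)
    return Review(score, "" if score >= 0.8 else "needs more polish")


if __name__ == "__main__":
    draft, review, rounds = run_harness("build a game", ToyGenerator(), toy_evaluate)
    print(rounds, draft, round(review.score, 2))
```

In a real setup you would swap the two toy callables for separate model calls with different prompts. As I understand the post, the point of the split is that the evaluator gets its own critical instructions instead of the generator grading its own work.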
