
Battlecat AI — Built on the AI Maturity Framework

How an Anthropic Staff Engineer Built an AI That Codes Like a Senior Developer
L3 Supervisor · Practice · Advanced · 6 min read

A staff engineer at Anthropic just shared his secret weapon: a Claude.md file that transforms AI from a coding assistant into an autonomous developer. His five-rule system has one goal—make AI code like someone who actually ships production software.

Tags: agentic coding · autonomous AI · system prompts · AI workflows · code quality · Claude Code

Boris Cherny doesn't treat his Claude.md file like documentation. He treats it like a contract.

As a staff engineer at Anthropic, Cherny has cracked something most developers miss: the difference between an AI that writes code and an AI that ships code. His personal Claude configuration file—which loads automatically in every coding session—has turned his AI assistant into something closer to an autonomous senior developer.

Why This Matters: The Gap Between Code and Software

Most developers use AI like a glorified autocomplete. They ask for a function, get some code, copy-paste it, and hope it works. That's not how experienced engineers operate, and it's definitely not how you build reliable software at scale.

The problem isn't that AI can't write good code—it's that most people don't know how to structure AI workflows for software engineering instead of just code generation. There's a massive difference.

"Most people treat Claude.md as notes. Boris treats it as a contract."

Cherny's approach acknowledges something crucial: coding isn't just about syntax. It's about planning, testing, debugging, and continuous improvement. The same disciplines that separate junior developers from staff engineers.


Rule One: Plan Mode First—No Code Without Strategy

The first rule in Cherny's system sounds almost painfully obvious, but it's where most AI coding sessions fall apart:

Before touching any code, write the plan. If something goes wrong mid-task, stop and replan. Never push through.

This isn't about perfectionism—it's about preventing the classic AI coding trap. You know the one: you ask for a feature, the AI starts generating code, hits a snag halfway through, and suddenly you're debugging a half-finished solution you don't fully understand.

Experienced engineers don't code their way out of confusion. They step back and plan their way out.

How This Looks in Practice

Instead of: "Add user authentication to my Express app"

Cherny's approach would be:

  1. Plan: Define authentication requirements, choose strategy (JWT vs sessions), identify integration points
  2. Review plan: Does this align with existing architecture? What could go wrong?
  3. Execute: Only then start writing code
  4. Monitor: If implementation deviates from plan, stop and replan

The "never push through" rule is what separates professional development from weekend hackathons.
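
In a Claude.md file, this rule can be encoded as a short imperative block. The wording below is a sketch, not Cherny's actual text, which hasn't been published in full:

```markdown
## Rule 1: Plan mode first
- Before writing any code, produce a numbered plan and wait for approval.
- If the implementation deviates from the plan, STOP. Never push through.
- Replan from the point of deviation, then resume.
```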


Rule Two: Sub-Agents for Complex Problems

Here's where Cherny's system gets sophisticated: "Offload complex work and keep the main context clean. Throw more compute at it."

This rule recognizes that even the best AI models have context limits. When you're working on a complex feature that touches multiple files, handles edge cases, and requires careful testing, you can quickly overwhelm a single conversation thread.

Cherny's solution? Create specialized sub-agents for complex subtasks.

The Sub-Agent Strategy

  • Main agent: Handles overall architecture and coordination
  • Testing agent: Focused purely on test creation and validation
  • Debug agent: Specialized in log analysis and root cause investigation
  • Refactor agent: Handles code cleanup and optimization

This isn't just about managing complexity—it's about specialization. Each sub-agent can maintain focused context on its specific domain, leading to better decisions and cleaner code.

Think of sub-agents like specialized team members, not just different chat windows.
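
Claude Code lets you define custom sub-agents as markdown files with YAML frontmatter (typically under `.claude/agents/`). Here's a hedged sketch of what a dedicated debug agent might look like; the frontmatter fields follow Claude Code's documented format, but the prompt body is illustrative:

```markdown
---
name: debug-agent
description: Investigates failures via logs and finds root causes. Use proactively when a test fails or an error appears.
tools: Read, Grep, Bash
---

You are a debugging specialist. Given a failure:
1. Read the relevant logs before touching any code.
2. Trace the error to its root cause; never patch symptoms.
3. Propose a fix plus a regression test that would have caught the bug.
```

Because each sub-agent runs with its own context window, the main conversation stays clean while the specialist does the deep work.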


Rule Three: The Self-Improvement Loop

"Every lesson goes into tasks/lessons.md as a rule. Next session reads it and applies it. This drops the mistake rate over time."

This might be Cherny's most brilliant insight. He's created an institutional memory system for his AI workflows.

Most AI conversations are ephemeral. You solve a problem, close the chat, and next week you're explaining the same context all over again. Cherny has built a learning system that actually learns.

Building Your Own Lesson System

  1. Capture mistakes immediately: When something goes wrong, document both the error and the solution in lessons.md
  2. Include context: Not just what went wrong, but why it went wrong
  3. Make it actionable: Each lesson should include a specific rule or check to prevent recurrence
  4. Reference in prompts: Your system prompt should instruct the AI to check lessons before starting new tasks

Example lesson entry:

```markdown
## Database Migration Failures (2024-01-15)
**Problem**: Migration failed because we didn't check for existing data
**Root cause**: Assumed clean database state
**Rule**: Always check for existing data before schema changes
**Check**: Run `SELECT COUNT(*) FROM table_name` before migration
```

The self-improvement loop transforms AI from a tool you use to a system that gets better at your specific work.
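
Wiring the loop into the workflow takes only a couple of lines in your Claude.md or system prompt. The exact wording here is illustrative:

```markdown
## Before every task
Read `tasks/lessons.md` and apply every rule in it.

## After every mistake
Append an entry to `tasks/lessons.md` with: Problem, Root cause, Rule, Check.
```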


Rule Four: Prove That It Works

"Never mark a task as complete without running tests and checking logs. Ask yourself: would a staff engineer approve this?"

This rule institutionalizes something that separates professional development from amateur hour: verification.

It's not enough that code runs. It's not enough that it passes a quick manual test. Cherny's standard is: would a staff engineer approve this in a code review?

The Staff Engineer Standard

  • Tests exist and pass: Unit tests, integration tests, edge cases covered
  • Logs are clean: No unexpected warnings or errors
  • Error handling: Graceful failure modes are implemented
  • Documentation: Key decisions and gotchas are documented
  • Performance: No obvious bottlenecks or resource leaks

This isn't perfectionism—it's professionalism. The difference between code that works in development and code that works in production.

The question "would a staff engineer approve this?" is a forcing function for quality.
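
As a Claude.md rule, the staff engineer standard works best phrased as a completion gate. A sketch, with the checklist items drawn from the list above:

```markdown
## Definition of done
A task is complete only when ALL of the following hold:
- [ ] Tests exist, cover edge cases, and pass
- [ ] Logs are clean: no unexpected warnings or errors
- [ ] Failure modes are handled gracefully
- [ ] Key decisions and gotchas are documented
If any box is unchecked, the task is NOT done.
Ask: would a staff engineer approve this in code review?
```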


Rule Five: Autonomous Bug Fixing

"When given a bug, just fix it. No hand-holding. Go to the logs, find the root cause, and solve it."

This final rule might be the most ambitious: turning AI from an assistant into an autonomous debugger.

Most people use AI for debugging like a junior developer: they describe the problem, paste some code, and ask for suggestions. Cherny's approach treats AI like a senior developer: give it the bug report and expect it to handle the investigation.

Autonomous Debugging Workflow

  1. Log analysis: AI examines logs to understand failure patterns
  2. Root cause investigation: Traces the problem back to its source
  3. Solution implementation: Fixes the underlying issue, not just symptoms
  4. Verification: Confirms the fix resolves the original problem
  5. Prevention: Adds tests or checks to prevent recurrence

This requires a sophisticated prompt structure and clear guidelines about how to investigate problems systematically.

Autonomous bug fixing is the closest thing to having an AI pair programmer who actually understands debugging methodology.
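
Encoded as a Claude.md rule, the workflow above might read like this (illustrative wording, not Cherny's published text):

```markdown
## Bug reports
When given a bug, fix it without hand-holding:
1. Reproduce the failure and read the logs.
2. Trace the problem to its root cause; do not patch symptoms.
3. Implement the fix, then verify the original report is resolved.
4. Add a test or check that prevents recurrence.
Only ask the user when logs and code genuinely cannot answer the question.
```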


The Bottom Line

Cherny's Claude.md system works because it doesn't treat AI like a smart autocomplete—it treats AI like a professional developer with professional standards. The five rules create a framework for software engineering, not just code generation. Plan first, manage complexity with sub-agents, learn from mistakes systematically, verify everything, and debug autonomously. This isn't about getting AI to write more code; it's about getting AI to ship better software. The difference between a coding assistant and an AI engineer isn't the model—it's the system.

Try This Now

  1. Create a `claude.md` file with your coding standards and load it in every AI session
  2. Set up a `tasks/lessons.md` file to capture and learn from coding mistakes systematically
  3. Implement the "staff engineer approval" check before marking any AI-generated code complete
  4. Design sub-agent workflows for complex tasks like testing, debugging, and refactoring


Sources (1)

  • https://www.tiktok.com/t/ZP8qKFS9a