
Battlecat AI — Built on the AI Maturity Framework

How an Anthropic Staff Engineer Built an AI That Codes Like a Senior Developer
L3 Supervisor · Practice · Advanced · 6 min read

A staff engineer at Anthropic just shared his secret weapon: a Claude.md file that transforms AI from a coding assistant into an autonomous developer. His five-rule system has one goal—make AI code like someone who actually ships production software.

Tags: agentic coding · autonomous AI · system prompts · AI workflows · code quality · Claude Code

Boris Cherny doesn't treat his Claude.md file like documentation. He treats it like a contract.

As a staff engineer at Anthropic, Cherny has cracked something most developers miss: the difference between an AI that writes code and an AI that ships code. His personal Claude configuration file—which loads automatically in every coding session—has turned his AI assistant into something closer to an autonomous senior developer.

Why This Matters: The Gap Between Code and Software

Most developers use AI like a glorified autocomplete. They ask for a function, get some code, copy-paste it, and hope it works. That's not how experienced engineers operate, and it's definitely not how you build reliable software at scale.

The problem isn't that AI can't write good code—it's that most people don't know how to structure AI workflows for software engineering instead of just code generation. There's a massive difference.

"Most people treat Claude.md as notes. Boris treats it as a contract."

Cherny's approach acknowledges something crucial: coding isn't just about syntax. It's about planning, testing, debugging, and continuous improvement. The same disciplines that separate junior developers from staff engineers.


Rule One: Plan Mode First—No Code Without Strategy

The first rule in Cherny's system sounds almost painfully obvious, but it's where most AI coding sessions fall apart:

Before touching any code, write the plan. If something goes wrong mid-task, stop and replan. Never push through.

This isn't about perfectionism—it's about preventing the classic AI coding trap. You know the one: you ask for a feature, the AI starts generating code, hits a snag halfway through, and suddenly you're debugging a half-finished solution you don't fully understand.

Experienced engineers don't code their way out of confusion. They step back and plan their way out.

How This Looks in Practice

Instead of: "Add user authentication to my Express app"

Cherny's approach would be:

  1. Plan: Define authentication requirements, choose strategy (JWT vs sessions), identify integration points
  2. Review plan: Does this align with existing architecture? What could go wrong?
  3. Execute: Only then start writing code
  4. Monitor: If implementation deviates from plan, stop and replan

The "never push through" rule is what separates professional development from weekend hackathons.
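
In a Claude.md file, this rule can be encoded as a short imperative block. The wording below is a sketch, not Cherny's actual text, which hasn't been published in full:

```markdown
## Rule 1: Plan mode first
- Before writing any code, produce a numbered plan and wait for approval.
- If the implementation deviates from the plan, STOP. Never push through.
- Replan from the point of deviation, then resume.
```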


Rule Two: Sub-Agents for Complex Problems

Here's where Cherny's system gets sophisticated: "Offload complex work and keep the main context clean. Throw more compute at it."

This rule recognizes that even the best AI models have context limits. When you're working on a complex feature that touches multiple files, handles edge cases, and requires careful testing, you can quickly overwhelm a single conversation thread.

Cherny's solution? Create specialized sub-agents for complex subtasks.

The Sub-Agent Strategy

  • Main agent: Handles overall architecture and coordination
  • Testing agent: Focused purely on test creation and validation
  • Debug agent: Specialized in log analysis and root cause investigation
  • Refactor agent: Handles code cleanup and optimization

This isn't just about managing complexity—it's about specialization. Each sub-agent can maintain focused context on its specific domain, leading to better decisions and cleaner code.

Think of sub-agents like specialized team members, not just different chat windows.
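
Claude Code lets you define custom sub-agents as markdown files with YAML frontmatter (typically under `.claude/agents/`). Here's a hedged sketch of what a dedicated debug agent might look like; the frontmatter fields follow Claude Code's documented format, but the prompt body is illustrative:

```markdown
---
name: debug-agent
description: Investigates failures via logs and finds root causes. Use proactively when a test fails or an error appears.
tools: Read, Grep, Bash
---

You are a debugging specialist. Given a failure:
1. Read the relevant logs before touching any code.
2. Trace the error to its root cause; never patch symptoms.
3. Propose a fix plus a regression test that would have caught the bug.
```

Because each sub-agent runs with its own context window, the main conversation stays clean while the specialist does the deep work.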


Rule Three: The Self-Improvement Loop

"Every lesson goes into tasks/lessons.md as a rule. Next session reads it and applies it. This drops the mistake rate over time."

This might be Cherny's most brilliant insight. He's created an institutional memory system for his AI workflows.

Most AI conversations are ephemeral. You solve a problem, close the chat, and next week you're explaining the same context all over again. Cherny has built a learning system that actually learns.

Building Your Own Lesson System

  1. Capture mistakes immediately: When something goes wrong, document both the error and the solution in lessons.md
  2. Include context: Not just what went wrong, but why it went wrong
  3. Make it actionable: Each lesson should include a specific rule or check to prevent recurrence
  4. Reference in prompts: Your system prompt should instruct the AI to check lessons before starting new tasks

Example lesson entry:

```markdown
## Database Migration Failures (2024-01-15)
**Problem**: Migration failed because we didn't check for existing data
**Root cause**: Assumed clean database state
**Rule**: Always check for existing data before schema changes
**Check**: Run `SELECT COUNT(*) FROM table_name` before migration
```

The self-improvement loop transforms AI from a tool you use to a system that gets better at your specific work.
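
Wiring the loop into the workflow takes only a couple of lines in your Claude.md or system prompt. The exact wording here is illustrative:

```markdown
## Before every task
Read `tasks/lessons.md` and apply every rule in it.

## After every mistake
Append an entry to `tasks/lessons.md` with: Problem, Root cause, Rule, Check.
```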


Rule Four: Prove That It Works

"Never mark a task as complete without running tests and checking logs. Ask yourself: would a staff engineer approve this?"

This rule institutionalizes something that separates professional development from amateur hour: verification.

It's not enough that code runs. It's not enough that it passes a quick manual test. Cherny's standard is: would a staff engineer approve this in a code review?

The Staff Engineer Standard

  • Tests exist and pass: Unit tests, integration tests, edge cases covered
  • Logs are clean: No unexpected warnings or errors
  • Error handling: Graceful failure modes are implemented
  • Documentation: Key decisions and gotchas are documented
  • Performance: No obvious bottlenecks or resource leaks

This isn't perfectionism—it's professionalism. The difference between code that works in development and code that works in production.

The question "would a staff engineer approve this?" is a forcing function for quality.
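
As a Claude.md rule, the staff engineer standard works best phrased as a completion gate. A sketch, with the checklist items drawn from the list above:

```markdown
## Definition of done
A task is complete only when ALL of the following hold:
- [ ] Tests exist, cover edge cases, and pass
- [ ] Logs are clean: no unexpected warnings or errors
- [ ] Failure modes are handled gracefully
- [ ] Key decisions and gotchas are documented
If any box is unchecked, the task is NOT done.
Ask: would a staff engineer approve this in code review?
```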


Rule Five: Autonomous Bug Fixing

"When given a bug, just fix it. No hand-holding. Go to the logs, find the root cause, and solve it."

This final rule might be the most ambitious: turning AI from an assistant into an autonomous debugger.

Most people use AI for debugging like a junior developer: they describe the problem, paste some code, and ask for suggestions. Cherny's approach treats AI like a senior developer: give it the bug report and expect it to handle the investigation.

Autonomous Debugging Workflow

  1. Log analysis: AI examines logs to understand failure patterns
  2. Root cause investigation: Traces the problem back to its source
  3. Solution implementation: Fixes the underlying issue, not just symptoms
  4. Verification: Confirms the fix resolves the original problem
  5. Prevention: Adds tests or checks to prevent recurrence

This requires a sophisticated prompt structure and clear guidelines about how to investigate problems systematically.

Autonomous bug fixing is the closest thing to having an AI pair programmer who actually understands debugging methodology.
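
Encoded as a Claude.md rule, the workflow above might read like this (illustrative wording, not Cherny's published text):

```markdown
## Bug reports
When given a bug, fix it without hand-holding:
1. Reproduce the failure and read the logs.
2. Trace the problem to its root cause; do not patch symptoms.
3. Implement the fix, then verify the original report is resolved.
4. Add a test or check that prevents recurrence.
Only ask the user when logs and code genuinely cannot answer the question.
```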


The Bottom Line

Cherny's Claude.md system works because it doesn't treat AI like a smart autocomplete—it treats AI like a professional developer with professional standards. The five rules create a framework for software engineering, not just code generation. Plan first, manage complexity with sub-agents, learn from mistakes systematically, verify everything, and debug autonomously. This isn't about getting AI to write more code; it's about getting AI to ship better software. The difference between a coding assistant and an AI engineer isn't the model—it's the system.

Try This Now

  1. Create a `claude.md` file with your coding standards and load it in every AI session
  2. Set up a `tasks/lessons.md` file to capture and learn from coding mistakes systematically
  3. Implement the "staff engineer approval" check before marking any AI-generated code complete
  4. Design sub-agent workflows for complex tasks like testing, debugging, and refactoring


Sources (1)

  • https://www.tiktok.com/t/ZP8qKFS9a