Overview
A post by Vivek Aithal, published on the MinusX blog, documents observed patterns in Claude Code's behavior and prompt/tooling design, based on several months of hands-on use and log inspection. The team reports using a custom logger to intercept network requests for analysis. The article is positioned as a practitioner's guide rather than an official architecture disclosure, with prompts and tool references linked in the post's Appendix at https://minusx.ai/blog/decoding-claude-code/#appendix.
The analysis characterizes Claude Code as favoring architectural minimalism and explicit instruction. It highlights a single control loop with at most one sub-branch, heavy use of a smaller model for routine work, a rich system prompt and context file, and a toolset that ranges from low-level to high-level operations. The post also notes stylistic steerability achieved through tone guidelines and emphatic directives.
Control Loop and Execution Model
The post describes a single main control loop that maintains a flat message history. For hierarchical tasks, the agent may spawn a sub-agent that cannot, in turn, spawn additional agents—enforcing a maximum of one branch. Outputs from such branches are reinserted into the main thread as a “tool response.” Simple tasks are handled with iterative tool calls, while more complex work can be decomposed via this limited branching combined with an internal todo list. The author argues that this approach aids debuggability and avoids the complexity of multi-agent handoffs.
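The post does not include implementation code; the following is a minimal sketch of the pattern as described, where call_model, run_tool, and the shape of the Task tool call are placeholder assumptions rather than Claude Code's actual internals:

```python
from typing import Any

def call_model(messages: list[dict[str, Any]]) -> dict[str, Any]:
    """Placeholder LLM call: returns either a text reply or a tool call."""
    raise NotImplementedError

def run_tool(name: str, tool_input: Any) -> str:
    """Placeholder tool executor (Bash, Read, Edit, ...)."""
    raise NotImplementedError

def agent_loop(messages: list[dict[str, Any]], can_branch: bool = True) -> str:
    """One flat message history; sub-agents run with can_branch=False,
    so the call tree never grows deeper than a single branch."""
    while True:
        reply = call_model(messages)
        if reply["type"] == "text":
            return reply["text"]  # terminal answer for this (sub-)agent
        if reply["name"] == "Task" and can_branch:
            # Spawn a sub-agent that cannot itself spawn agents; its output
            # re-enters the main thread as an ordinary tool response.
            result = agent_loop([{"role": "user", "content": reply["input"]}],
                                can_branch=False)
        else:
            result = run_tool(reply["name"], reply["input"])
        messages.append({"role": "assistant", "content": reply})
        messages.append({"role": "tool", "name": reply["name"], "content": result})
```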
Model Selection and Cost Sensitivity
According to the analysis, more than half of "important" model calls observed were routed to claude-3-5-haiku. The smaller model is credited with routine duties such as reading large files, parsing web pages, processing git history, and summarizing long conversations. The post also states it is used, at high frequency (reportedly per keystroke), to generate a one-word processing label. It further claims smaller models are roughly 70–80% cheaper than Sonnet 4 or GPT‑4.1 tiers, motivating their broad use.
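The routing logic itself is not published; a sketch consistent with the reported division of labor might look like the following, where the task labels and the fallback model name are assumptions:

```python
# Routine, high-volume work goes to the cheaper small model; planning and
# code edits stay on the larger one. Task names here are illustrative.
ROUTINE_TASKS = {
    "read_large_file",
    "parse_web_page",
    "summarize_git_history",
    "summarize_conversation",
    "generate_processing_label",  # the per-keystroke one-word label
}

def pick_model(task: str) -> str:
    return "claude-3-5-haiku" if task in ROUTINE_TASKS else "claude-sonnet-4"
```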
Prompts, Context, and Structure
Claude Code is reported to rely on extensive prompting:
- The system prompt is described as approximately 2,800 tokens, with the tools section adding about 9,400 tokens.
- The user prompt consistently includes a claude.md context file, typically adding 1,000–2,000 tokens. The file encodes preferences and constraints (such as directories to ignore or preferred libraries) and is sent with every request; a minimal assembly sketch follows.
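How the file is spliced into the request is not specified; this sketch assumes a conventional messages API and an invented <context> wrapper tag:

```python
from pathlib import Path

def build_messages(user_prompt: str, project_dir: str) -> list[dict[str, str]]:
    parts = []
    context_file = Path(project_dir) / "claude.md"
    if context_file.exists():
        # Preferences and constraints (ignored directories, preferred
        # libraries, ...) ride along with every single request.
        parts.append(
            '<context name="claude.md">\n'
            f"{context_file.read_text()}\n"
            "</context>"
        )
    parts.append(user_prompt)
    return [{"role": "user", "content": "\n\n".join(parts)}]
```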
Prompt structure features both Markdown and XML-style tags. Notable tags include:
- <system-reminder> for non-user-facing reminders (e.g., whether to maintain or update a todo list using a tool).
- <good-example> and <bad-example> to make heuristics and tradeoffs explicit when multiple tool paths are available (illustrated below).
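As an illustration (paraphrased, not the actual Claude Code prompt text), such tags might appear as follows; the reminder wording and wrapper function are invented:

```python
# Hypothetical prompt fragments showing the tag styles the post describes.
TOOL_GUIDANCE = """\
Keep the working directory stable when running tests:
<good-example>pytest /foo/bar/tests</good-example>
<bad-example>cd /foo/bar && pytest tests</bad-example>
"""

def with_system_reminder(user_prompt: str, reminder: str) -> str:
    # Non-user-facing nudge appended to the user turn, e.g. todo-list upkeep.
    return f"{user_prompt}\n\n<system-reminder>\n{reminder}\n</system-reminder>"
```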
The system prompt is described as including sections on tone, style, proactiveness, task management, tool-usage policy, and “doing tasks,” plus environment details such as date, working directory, platform/OS, and recent commits.
Tools and Search Strategy
The post emphasizes a preference for LLM-driven search over RAG. Claude Code is said to search codebases using shell-oriented tools and patterns (e.g., ripgrep, jq, find) and to apply regular expressions and selective file reads (often via the smaller model). The author argues that avoiding RAG reduces hidden failure modes and moving parts.
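A sketch of that style, shelling out to ripgrep and truncating output to protect the context window; the function shape is an assumption, though the rg flags are real:

```python
import subprocess

def grep_codebase(pattern: str, root: str = ".", max_lines: int = 200) -> str:
    # rg -n: line numbers; -S: smart case; --no-heading: one match per line.
    proc = subprocess.run(
        ["rg", "-n", "-S", "--no-heading", pattern, root],
        capture_output=True, text=True,
    )
    # Truncate rather than stream everything into the model's context.
    return "\n".join(proc.stdout.splitlines()[:max_lines])
```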
Tooling spans levels of abstraction:
- Low-level (e.g., Bash, Read, Write)
- Medium-level (e.g., Edit, Grep, Glob)
- High-level/deterministic (e.g., Task, WebFetch, IDE diagnostic and execution helpers such as mcp__ide__getDiagnostics and mcp__ide__executeCode)
The post reports that Edit appears to be the most frequently used tool, followed by Read and TodoWrite in observed sessions. Tool descriptions reportedly include detailed guidance and examples, with explicit instructions on when to choose between overlapping capabilities (e.g., Grep/Glob versus generic Bash).
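A paraphrased sketch of what such a tool definition might look like; the schema follows the standard tool-use JSON format, but the description wording here is invented in the reported style:

```python
GREP_TOOL = {
    "name": "Grep",
    "description": (
        "Fast content search using ripgrep regex syntax. "
        "ALWAYS use Grep for searching file contents; NEVER run grep or rg "
        "through the Bash tool. Use Glob when matching file names rather "
        "than contents."
    ),
    "input_schema": {
        "type": "object",
        "properties": {
            "pattern": {"type": "string", "description": "Regex to search for"},
            "path": {"type": "string", "description": "Directory to search in"},
        },
        "required": ["pattern"],
    },
}
```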
Todo Management and Long-Running Context
To mitigate “context rot” in longer sessions, the analysis describes a model-maintained todo list (via TodoWrite) that the agent is prompted to reference frequently. This centralizes the plan within the model’s loop, enabling course corrections without multi-agent handoff and leveraging interleaved reasoning to add or drop tasks as needed.
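The exact TodoWrite schema is not reproduced in this summary; a minimal sketch, assuming content/status fields and wholesale list replacement:

```python
TODO_LIST: list[dict[str, str]] = []

def todo_write(items: list[dict[str, str]]) -> str:
    """Replace the todo list wholesale; re-reading it each turn keeps the
    plan inside the single loop and counters context rot."""
    global TODO_LIST
    TODO_LIST = items
    return "\n".join(f"[{item['status']}] {item['content']}" for item in TODO_LIST)

todo_write([
    {"content": "Locate the failing test", "status": "completed"},
    {"content": "Patch the parser edge case", "status": "in_progress"},
    {"content": "Run the full test suite", "status": "pending"},
])
```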
Steerability: Tone, Constraints, and Algorithms
The system prompt reportedly encodes explicit tone and style rules (e.g., minimal preamble/postamble, avoiding preachy disclaimers, and keeping emojis off unless asked). Strong emphasis labels—such as “IMPORTANT,” “VERY IMPORTANT,” “NEVER,” and “ALWAYS”—are used liberally to prevent common failure modes. Examples include avoiding shell search commands in favor of Grep/Glob/Task and refraining from generating URLs unless clearly justified.
The write-up stresses algorithmic clarity: sections like “Task Management,” “Doing Tasks,” and “Tool Usage Policy” lay out decision flows with heuristics and examples. The author argues this explicit algorithm-first approach avoids contradictions that can emerge from large lists of unstructured dos and don’ts.
Methodological Notes and Context
- The observations are drawn from the author's use of Claude Code "over the last couple of months," including a logger written by a colleague to capture network traffic for analysis (one conventional approach is sketched after this list).
- The post positions the guidance as applicable to other chat-based agents and mentions MinusX’s introduction of a similar context file (minusx.md) for its own agents.
- No official architecture documents or benchmark results are cited; the evidence presented is experiential and log-based.
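The post does not show the logger itself; one conventional way to capture such traffic is a mitmproxy addon along these lines, where the host filter and output filename are assumptions:

```python
import json
from mitmproxy import http

def request(flow: http.HTTPFlow) -> None:
    # Append each outbound API call (URL + JSON body) to a local log file.
    if "anthropic.com" in flow.request.pretty_host:
        with open("claude_requests.jsonl", "a") as f:
            f.write(json.dumps({
                "url": flow.request.pretty_url,
                "body": flow.request.get_text(),
            }) + "\n")

# Run with: mitmproxy -s logger.py
```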
TL;DR
- Claude Code is described as favoring one main loop with at most one sub-branch, simplifying debugging.
- The analysis reports heavy use of claude-3-5-haiku for routine tasks; the post claims smaller models are roughly 70–80% cheaper than Sonnet 4 or GPT‑4.1 tiers.
- Prompts are large and structured: ~2,800 tokens of system instructions plus ~9,400 tokens of tool definitions, with claude.md sent on each request.
- Search is LLM-first (ripgrep/jq/find) rather than RAG, with selective file reads and regex.
- Tools span low/medium/high levels (e.g., Bash, Edit, Grep/Glob, Task, WebFetch, IDE diagnostics); guidance clarifies when to use each.
- A model-maintained todo list is used to combat context rot in longer sessions.
- Steerability relies on explicit tone/style rules, strong “IMPORTANT/NEVER/ALWAYS” directives, and algorithmic instructions with examples.
- The findings are based on observed behavior and intercepted logs reported by the author, not official documentation.