Architecture Manifesto // April 2026

The AI-Native Blueprint

Claude Code revealed what production AI software actually looks like — 512K lines of TypeScript built around a core loop that fits in 50 lines. This isn't a tutorial. It's a blueprint for building the next generation of software.

New Era Software Architecture
5 Deep-dive Sections
10 Playbook Steps
1 Person Can Build a 10-Person Product
Evolvable by Design
Key Takeaway
Software has shifted from "write functions that do things" to "write tools an AI picks from." The AI is a commodity — the harness around it (tools, permissions, skills, memory) is the product. Claude Code's 512K lines proved this at scale.
Paradigm Shift

Three Things That Changed

The old playbook is obsolete. Here's what replaced it.

Before 2025
You Write Functions That Do Things
Your codebase IS the logic. Every feature = more code. Complexity grows linearly. A team of 10 builds a product for 10K users.
After 2025
You Write Tools the AI Picks From
Your codebase is a toolkit. The AI selects and sequences tools. New capability = new tool folder. A team of 1 builds a product for 100K users.

The Insight

What Claude Code Proved

512K lines taught us the future of software isn't the model — it's the harness.

Insight 01
"Your Core Loop Should Fit in 50 Lines"
The core is trivially simple. The 512K lines are production engineering: permissions, caching, error handling. Everything else is harness.
Insight 02
The Harness Is the Product
Swap the model and it still works. The value is in the tool system, permission cascade, context management, and UX polish.
Insight 03
Evolvability Over Perfection
44 feature flags, deferred tool loading, plugin skills, self-consolidating memory. Designed to grow without rewrites.
Key Takeaway
Your project structure maps directly to Claude Code's 5 subsystems. At the center is a simple agent loop (input → context → LLM → tools → respond). All complexity goes into the harness around it: tools, permissions, memory, skills, and agents.
Day One Structure

The Project Skeleton

Every directory maps to a Claude Code subsystem. This is what your AI-native project looks like on day one.

AI-Native Project Structure
Skill-based • Plugin-driven • Evolvable
my-project/
  core/ — The heartbeat (Query Engine)
    loop.ts — input → context → LLM → tools → respond
    context.ts — Context assembly & compression
    router.ts — Tool dispatch (parallel reads, serial writes)

  tools/ — Atomic operations (~40 built-in)
    registry.ts — Registration & deferred loading
    read-file.ts  search.ts  execute.ts  web-fetch.ts

  skills/ — Composed workflows (15+ skills)
    loader.ts — Skill discovery & trigger matching
    code-review.skill/ — manifest + prompt + handler
    deploy.skill/   onboard.skill/

  permissions/ — 3-tier safety cascade
    cascade.ts  rules.ts  hooks.ts  sandbox.ts

  memory/ — Persistent state (MEMORY.md system)
    store.ts  index.ts  consolidate.ts

  agents/ — Parallel work (sub-agent system)
    coordinator.ts  worker.ts  types.ts

  interface/ — Human-AI boundary
    cli.tsx  web.tsx  api.ts

  config/ — Project constitution
    CLAUDE.md  settings.json  .env

The Heartbeat

The Core Agent Loop

Five steps. This is the entire pattern. Everything else is depth.

The Fundamental Loop
input → context → llm → tools → respond or loop
async function agentLoop(input, history = []) {

  // 1. Assemble context
  const context = await assembleContext({
    systemPrompt: loadSystemPrompt(),
    memory: await memory.retrieve(input),
    tools: toolRegistry.getActive().sort((a, b) => a.name.localeCompare(b.name)), // stable order = stable cache key
    messages: compress(history)
  });

  // 2. Call the LLM (streaming)
  const response = await llm.stream(context);

  // 3. No tool calls? Done.
  if (!response.toolCalls?.length) {
    await memory.save(input, response);
    return response.text;
  }

  // 4. Execute tools — reads parallel, writes serial
  const results = await router.execute(response.toolCalls);

  // 5. Feed back, loop continues
  return agentLoop(results, [...history, input, response]);
}

Making It Evolvable

Hook Points & Context Management

The loop becomes powerful when you add interception points and manage token costs.
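One way to sketch those interception points is a small hook registry. This is a minimal, synchronous sketch; the event names (`preToolUse`, `postToolUse`) and the registry shape are illustrative assumptions, not a real API:

```typescript
// A minimal sketch of hook points around each tool call.
// Event names (preToolUse/postToolUse) are illustrative, not a real API.
type HookCtx = { tool: string; input: unknown };
type Hook = (ctx: HookCtx) => void;

class HookRegistry {
  private hooks = new Map<string, Hook[]>();

  on(event: "preToolUse" | "postToolUse", hook: Hook): void {
    const list = this.hooks.get(event) ?? [];
    list.push(hook);
    this.hooks.set(event, list);
  }

  fire(event: string, ctx: HookCtx): void {
    // A hook that throws aborts the tool call: deterministic, unlike prompt guidance.
    for (const hook of this.hooks.get(event) ?? []) hook(ctx);
  }
}

// Usage: a guard that runs on every single call, no matter what the model "intends".
const hooks = new HookRegistry();
hooks.on("preToolUse", (ctx) => {
  if (ctx.tool === "write-file" && String(ctx.input).startsWith("/etc")) {
    throw new Error("write outside sandbox denied");
  }
});
```

The point of the sketch: hooks give you certainty at fixed points in the loop, which is where safety rules and logging belong.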


Design Rationale

Why This Structure Works

Principle 01
Tools ≠ Skills
Tools are atomic (read, run, search). Skills are composed workflows (prompt + tools + handler). Conflating them is the most common architecture mistake.
Principle 02
Permissions at Day One
Security is a directory in your project. Three-tier cascade on every tool call. Can't retrofit it later.
Principle 03
Memory Is First-Class
Typed categories, index file, auto-consolidation. Without memory, every conversation starts from zero. That's a demo, not a product.
Principle 04
Feature = Folder
Your roadmap is ls skills/. Each skill is self-contained. The core never changes when you ship.

Key Takeaway
Skills are self-contained capability packages (manifest + prompt + handler) that let you ship features without touching the core. The interface is the approval surface — where humans decide to trust, redirect, or override the AI. Together, they make your product evolvable and usable.
Skill Anatomy

What Is a Skill?

A skill is a self-contained capability: a manifest that declares what it does, a prompt that guides the AI, and a handler that executes.

Component 01
manifest.yaml
Declares name, triggers, permissions, and input schema. The loader reads this to decide when to activate.
name: code-review
triggers:
  - pattern: "/review"
  - pattern: "review (this|the) (PR|code)"
permissions: [Read, Grep, Glob]
Component 02
prompt.md
The template injected into the system prompt when active. Uses variables for dynamic context. Shapes the AI's behavior.
# Code Review Skill
You are reviewing {{project_name}}.
- Focus: correctness, security, performance
- Flag: any OWASP Top 10 vulnerabilities
- Output: file, severity, issue, fix
Component 03
handler.ts
Runs before and after the AI. Pre-handler gathers context (git diff). Post-handler formats output, saves to memory.
export default {
  async pre(ctx) {
    ctx.variables.files = await tools.bash("git diff --name-only");
  },
  async post(result) {
    await memory.save("project", result.summary);
  }
}
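The loader's trigger matching can be sketched roughly like this, treating manifest patterns as case-insensitive regexes. The matching policy (first match wins) is an assumption for illustration:

```typescript
// Hypothetical loader sketch: pick the first skill whose trigger matches the input.
interface SkillManifest {
  name: string;
  triggers: { pattern: string }[]; // regex strings from manifest.yaml
}

function matchSkill(input: string, manifests: SkillManifest[]): SkillManifest | null {
  for (const manifest of manifests) {
    if (manifest.triggers.some((t) => new RegExp(t.pattern, "i").test(input))) {
      return manifest;
    }
  }
  return null; // no skill active: the loop runs with base tools only
}

const manifests: SkillManifest[] = [
  {
    name: "code-review",
    triggers: [{ pattern: "/review" }, { pattern: "review (this|the) (PR|code)" }],
  },
];
```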


Human-AI Boundary

The Interface

Not a chat box. It's the approval surface — the boundary where humans decide to trust, redirect, or override.

01
Conversational
User talks, AI acts
User states intent in natural language. AI plans, executes, reports. Primary mode.
02
Ambient
AI watches, suggests
AI observes work and proactively suggests. Not intrusive — suggestions in sidebar or status line.
03
Headless
Pure API, no human
Runs on triggers — cron, webhooks, CI/CD. Full autonomy within sandbox.
Key Takeaway
AI products have runtime costs that scale with usage. A cache miss costs 10x a cache hit. The difference between naive and optimized: $69K/month at 1K users on Sonnet. Cache stability isn't optimization — it's survival.
Reality Check

Token Economics 101

10x
Cache miss costs 10x a cache hit
90% • Discount on cached reads
85%+ • Target cache hit rate
The Math That Kills Startups
1,000 daily users • 20 interactions • 50K tokens avg
Without caching: 1B tokens/day at $3/MTok = $90,000/month

With 85% cache hits: 15% full price + 85% at $0.30/MTok = $21,150/month

Difference: $69K/month. At Opus pricing ($15/MTok), naive = $450K/month.
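The same math as a function, so you can plug in your own numbers. The 10% cached-read price is the 90% discount from above; the function shape is illustrative:

```typescript
// Prices are $ per million tokens (MTok); cached reads cost 10% of full price.
function monthlyCost(
  tokensPerDay: number,
  pricePerMTok: number,
  cacheHitRate: number,
  daysPerMonth = 30
): number {
  const blended =
    (1 - cacheHitRate) * pricePerMTok + cacheHitRate * pricePerMTok * 0.1;
  return (tokensPerDay / 1e6) * blended * daysPerMonth;
}

// 1,000 users × 20 interactions × 50K tokens = 1B tokens/day
const naive = monthlyCost(1e9, 3, 0);     // 90000  ($90K/month)
const cached = monthlyCost(1e9, 3, 0.85); // 21150  ($21,150/month)
```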

Cache Architecture

5 Strategies for 85%+ Cache Hits

Sort Tools Alphabetically Before Every Call
Tool order affects cache keys. Different order = cache bust. Sort deterministically by name.
Split System Prompt: Static Above, Dynamic Below
Static parts (schemas, rules) are cacheable across users. Dynamic content (user context) only invalidates its region.
Use Sticky Latches for Mode Toggles
Once activated, latch it — no deactivation for the session. Flipping modes thrashes cache.
Defer Expensive Tool Schemas
Don't include all tools in every prompt. Load on-demand when needed.
Memoize Session Context
Git status, project config, date — don't change mid-conversation. Compute once, cache for session.
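Strategies 1 and 2 together can be sketched as a context builder that emits a byte-identical static prefix. The function shape and field names here are assumptions:

```typescript
// Sketch: deterministic tool order plus a static/dynamic prompt split.
interface Tool { name: string; description: string; }

function buildPromptBlocks(
  staticRules: string,
  tools: Tool[],
  dynamicContext: string
): { staticBlock: string; dynamicBlock: string } {
  // Sort alphabetically so the cacheable prefix is identical on every call.
  const sorted = [...tools].sort((a, b) => a.name.localeCompare(b.name));
  const staticBlock =
    staticRules + "\n" + sorted.map((t) => `${t.name}: ${t.description}`).join("\n");
  // Dynamic content goes last: changing it invalidates only this region.
  return { staticBlock, dynamicBlock: dynamicContext };
}
```

The property that matters: two calls with the same tools, registered in any order, must produce the same static block, or every call is a cache miss.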

Key Takeaway
10 steps from zero to production. Order matters — each builds on the last. Start with 5-7 tools and a simple loop, add permissions, ship your first skill, then optimize for cache stability. Follow the 7 rules religiously.
Each step maps to patterns from the Build Like This tab of our Claude Code dissection.
Implementation Sequence

10 Steps to Production

Don't skip ahead. Each step builds on the previous one.

Define Your Tool Surface
Start with 5-7 tools. Each gets a JSON schema, permission level, and readOnly flag. Claude Code started with core tools and grew to 40+.
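A tool definition from step 1 might look like this. The `ToolDef` shape is an illustrative assumption; the three ingredients (JSON schema, permission level, readOnly flag) come from the step above:

```typescript
// Illustrative tool definition: JSON schema, permission level, readOnly flag.
interface ToolDef {
  name: string;
  description: string;
  inputSchema: object; // JSON Schema describing the arguments the LLM must supply
  permission: "allow" | "ask" | "deny";
  readOnly: boolean;   // readOnly tools are safe to run in parallel
}

const readFile: ToolDef = {
  name: "read-file",
  description: "Read a file from the workspace",
  inputSchema: {
    type: "object",
    properties: { path: { type: "string" } },
    required: ["path"],
  },
  permission: "allow",
  readOnly: true,
};
```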
Write Your CLAUDE.md
The project "constitution." Include: what it does, arch decisions, conventions, what to avoid. Write it on day one.
Build the Core Loop
The 5-step loop: context → LLM → parse → tools → loop. Keep it under 100 lines. All complexity goes into subsystems.
Add the Permission Cascade
Three tiers: validateInput() → checkPermissions() → allow|ask|deny. Default to "ask." Fail closed.
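A minimal cascade sketch. The rule shapes are assumptions; the property that matters is the default, where anything unmatched falls through to "ask" (fail closed):

```typescript
type Verdict = "allow" | "ask" | "deny";
interface Rule { toolPattern: RegExp; verdict: Verdict; }

function checkPermissions(tool: string, rules: Rule[]): Verdict {
  for (const rule of rules) {
    if (rule.toolPattern.test(tool)) return rule.verdict;
  }
  return "ask"; // unknown tool: human approval, never a silent allow
}

const rules: Rule[] = [
  { toolPattern: /^read-/, verdict: "allow" },  // read-only tools auto-approved
  { toolPattern: /^execute$/, verdict: "deny" }, // shell execution blocked outright
];
```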
Build Your First Skill
One high-value workflow: manifest + prompt + handler. Proves the skill architecture before you build the second.
Add Persistent Memory
Typed categories, index file, save/retrieve in loop. Without memory, every conversation starts from zero.
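A memory store with typed categories can start this small. Retrieval here is naive keyword overlap, a stand-in for whatever ranking or embedding the real store would use:

```typescript
// Sketch: typed categories plus a save/retrieve pair for the loop to call.
type Category = "project" | "user" | "session";

class MemoryStore {
  private entries: { category: Category; text: string }[] = [];

  save(category: Category, text: string): void {
    this.entries.push({ category, text });
  }

  // Naive retrieval: return entries sharing any word with the query.
  retrieve(query: string): string[] {
    const words = query.toLowerCase().split(/\s+/).filter(Boolean);
    return this.entries
      .filter((e) => words.some((w) => e.text.toLowerCase().includes(w)))
      .map((e) => e.text);
  }
}
```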
Add the Interface Layer
Start CLI (fastest to iterate). Add web and API later. The core loop is interface-agnostic.
Add Compression Pipeline
At minimum: auto-compaction + result truncation. Add the MAX_CONSECUTIVE_FAILURES = 3 circuit breaker from the start.
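The circuit breaker is a few lines: trip after N consecutive failures so a confused loop burns three calls, not three thousand. The class shape is illustrative:

```typescript
const MAX_CONSECUTIVE_FAILURES = 3;

class CircuitBreaker {
  private failures = 0;

  // Record each tool outcome; any success resets the count.
  record(ok: boolean): void {
    this.failures = ok ? 0 : this.failures + 1;
  }

  // When tripped, the loop stops retrying and reports to the human.
  get tripped(): boolean {
    return this.failures >= MAX_CONSECUTIVE_FAILURES;
  }
}
```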
Add Sub-Agents
Start with two types: "Explore" (read-only) and "Plan" (read + analysis). Isolated context, limited tools.
Add Cache Stability
Sort tools alphabetically, split static/dynamic prompt, sticky latches, defer tools, circuit breakers. This makes your business viable.

Guiding Principles

The Seven Rules

Non-negotiable principles for every decision.

Rule 01
Simple Core, Complex Harness
If your core loop is complex, you've failed. All complexity lives in subsystems.
Rule 02
Fail Closed, Always
Unknown tool? Ask. Unknown permission? Deny. Unknown state? Stop.
Rule 03
Evolve by Addition
Feature = folder. Core never changes. If you modify the loop to add a feature, your architecture is wrong.
Rule 04
Every Cache Miss Is a Bug
Track invalidation vectors. Sort deterministically. Latch modes. 1% more misses = thousands of dollars per month.
Rule 05
Parallel Reads, Serial Writes
Reads concurrently. Writes one at a time. No locks — just discipline.
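Rule 05 as code: reads fan out with Promise.all, writes run strictly one after another. The call shape is illustrative:

```typescript
interface ToolCall {
  name: string;
  readOnly: boolean;
  run: () => Promise<string>;
}

async function execute(calls: ToolCall[]): Promise<string[]> {
  const reads = calls.filter((c) => c.readOnly);
  const writes = calls.filter((c) => !c.readOnly);

  // Reads: all at once. They can't conflict.
  const readResults = await Promise.all(reads.map((c) => c.run()));

  // Writes: sequential. No locks, just discipline.
  const writeResults: string[] = [];
  for (const call of writes) writeResults.push(await call.run());

  return [...readResults, ...writeResults];
}
```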
Rule 06
Hooks for Certainty, Prompts for Guidance
Must happen every time? Make it a hook. Deterministic beats probabilistic.
Rule 07
Not Everything Needs an LLM
Frustration detection? Regex. Tool sorting? Array.sort(). Save LLM calls for reasoning.

The Big Idea

Building Software Has Changed

The AI is a commodity. The tool system, permission cascade, context management, skill architecture, memory system, and economic engineering — these are the competitive moats.

A single developer with this architecture can build what used to require a team of ten. Not because the AI writes the code — but because it operates within a system designed for evolvability.
Start with the loop. Add tools. Add skills. Ship.