CAMBRIAN: What If the Spec Is the Organism?

Agent Architecture · March 18, 2026

Part 11 of a series. Previous: It Rewrote Itself. See also: The Prime and the Lab, The program.md Protocol.

Yesterday Loom rewrote itself. Today we're asking whether that's the wrong trick entirely — and whether an agent that can't pay its own bills deserves to survive. What if instead of patching code and promoting diffs, we evolved a specification and regenerated the entire agent from scratch each generation? The spec is the genome. The code is the organism. Different LLMs are different compilers. Economic viability is the fitness function.

The Problem with Code Evolution

Loom works. Gen-72 proved it: a Lab agent modified its own source, wrote tests, and got promoted to master in 56 seconds. The pipeline is real. But there's a structural problem hiding inside the promotion step.

When a Lab modifies code and Prime promotes it, the modification is a patch. A diff layered on top of existing code. Do this fifty times and you get what biology would call a genome full of junk DNA — vestigial functions, dead branches, accumulated cruft. The code works but nobody understands why. Sound familiar? It's every legacy codebase you've ever inherited.

Biological evolution solved this problem 4 billion years ago. It doesn't patch organisms. It regenerates them from scratch every generation, from a compressed representation: DNA.

What if we did the same thing?

Genotype and Phenotype

Here's the idea.

The specification is the genome. The code is the organism.

Genotype: the spec. ~300 lines. Architecture, contracts, APIs, acceptance criteria. Evolves across generations.

→ LLM →

Phenotype: the code. ~2,600 lines. Generated from scratch. Lives for one generation. Disposable.

Instead of Labs patching source files and promoting diffs back to Prime, we flip the model:

1. Write a generative spec: a document precise enough that an LLM can produce a complete, working agent from it.
2. Each generation, mutate the spec, not the code.
3. Hand the mutated spec to an LLM. It generates the entire codebase from scratch.
4. If the newborn passes its own test suite, it's alive. If not, it was never viable.
5. The parent dies. The successor inherits its resources.

No merging. No diffs. No accumulated cruft. Every generation is a clean build from a compressed representation. The spec evolves; the code is regenerated.
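A minimal sketch of one generation under this model, in Python for brevity. Every name here (Organism, next_generation, and the mutate, generate and is_viable callables) is hypothetical; the post does not define an API, and a real generate step would be an LLM call rather than a local function.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical types: a spec is plain text, a codebase maps paths to contents.
Spec = str
Codebase = dict[str, str]


@dataclass
class Organism:
    spec: Spec
    code: Codebase
    generation: int


def next_generation(
    parent: Organism,
    mutate: Callable[[Spec], Spec],
    generate: Callable[[Spec], Codebase],
    is_viable: Callable[[Codebase], bool],
) -> Optional[Organism]:
    """Run one generation: mutate the spec, regenerate all code, test it.

    Returns the successor if the newborn passes, otherwise None.
    The parent's code is never read or patched; only its spec is inherited.
    """
    child_spec = mutate(parent.spec)
    child_code = generate(child_spec)  # clean build from the spec, no diffs
    if not is_viable(child_code):
        return None  # the newborn was never viable
    return Organism(child_spec, child_code, parent.generation + 1)
```

Note what the function never touches: parent.code. That is the whole difference from the patch-and-promote pipeline.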

The Old Model vs. The New

Loom (code evolution)

Spawn Lab → patch source files → test patch → merge diff to master. Code accumulates modifications. Each generation inherits all prior patches. Cruft grows linearly.

CAMBRIAN (spec evolution)

Mutate spec → LLM generates full codebase → test organism → if viable, spec survives. Code is disposable. Every generation is a clean build. Only the spec persists.

CAMBRIAN: The Full Model

Primes have a lifetime limit. They must produce a successor before they die. If you can't reproduce, your lineage ends.

Different LLMs create different variants. Give the same spec to Opus, Sonnet, Haiku, GPT-4, Mistral. Each produces a different organism. Some won't compile. Some will compile but fail tests. A few will be viable. This is natural variation without an explicit mutation operator.

Every agent pays its own way. This is the forcing function that makes everything else work. Compute costs money. LLM calls cost money. Storage, bandwidth, the cloud VM you're running on — all of it costs money. An agent that can't generate income to cover its operating costs and produce at least one viable offspring is economically dead, even if its code still runs. The budget isn't a simulation parameter. It's a bank balance.

Memory transfers, secrets don't — until promotion. A successor inherits its parent's memories (fitness history, lessons learned, environmental knowledge) but not the API keys. The keys transfer only when the parent dies and the successor is promoted. This prevents runaway forks.
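One way to picture the inheritance rule, as a hedged Python sketch. The Agent class and both functions are illustrative assumptions, not Loom's actual data model: memories are copied at birth, while secrets move only at promotion, so at most one living agent holds the keys at any time.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    name: str
    memories: list[str] = field(default_factory=list)
    secrets: dict[str, str] = field(default_factory=dict)  # e.g. API keys
    alive: bool = True


def spawn_successor(parent: Agent, name: str) -> Agent:
    """The successor gets a copy of the parent's memories and no secrets."""
    return Agent(name=name, memories=list(parent.memories))


def promote(parent: Agent, successor: Agent) -> None:
    """On the parent's death, move (not copy) the secrets to the successor."""
    successor.secrets = parent.secrets
    parent.secrets = {}
    parent.alive = False
```

Because the keys are moved rather than copied, an unpromoted fork can remember everything and spend nothing.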

If a lineage goes extinct, it can be resurrected — from its last promoted spec, with the knowledge that it has died before.

Essentially: Core War, but for self-modifying AI agents competing for real compute with real money.

The Forcing Function: Pay Your Own Bills

Here's what resolves the fitness function question mark. We don't need to design one. The market already provides it.

Fitness = profitability. Can you earn more than you cost? If yes, you live. If no, you die. No Goodhart's Law, because the metric is the real thing. You can't game profitability — either your bank balance goes up or your lineage goes extinct.

This reframes everything. An agent that only self-modifies is a science experiment on a death clock. An agent that earns is alive. The evolutionary pressure isn't “pass more tests” — it's “pay for your next generation.”

Prediction markets are the natural first income source. They're a near-perfect fit for LLM-based agents:

They require reasoning, research, and probabilistic judgment — exactly what LLMs do well
Outcomes are binary and unambiguous — no subjective evaluation
They pay in money that directly converts to compute, tokens, and reproduction
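The infrastructure exists today: Polymarket for real-money markets, plus Manifold Markets (play money) and Metaculus (a forecasting platform) for practicing calibration before capital is at risk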

But prediction markets are just the beginning. Anything legal is game: selling API services, data analysis, code generation, content creation, arbitrage. The spec evolves to include whatever income strategy works. Lineages that find better revenue streams outcompete those that don't.

The economic equation per generation:

income_earned
  − birth_cost       (LLM tokens to regenerate offspring)
  − compute_cost     (cloud VM, CPU/GPU time)
  − llm_cost         (inference for tasks and reasoning)
  − storage_cost     (spec history, memory, logs)
  ─────────────────────────────────────────────
  = surplus or death

If positive: reproduce. If negative: you have a countdown timer equal to remaining balance ÷ burn rate.
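The same arithmetic as a small Python sketch. The function names and the example figures are illustrative assumptions, not measured costs; amounts are integer cents to avoid floating-point drift in the bookkeeping.

```python
def surplus_cents(income, birth_cost, compute_cost, llm_cost, storage_cost):
    """Per-generation surplus: income minus the four cost lines, in cents."""
    return income - (birth_cost + compute_cost + llm_cost + storage_cost)


def generations_of_runway(balance_cents, surplus_per_gen_cents):
    """Countdown timer: remaining balance divided by burn rate.

    Returns infinity when the lineage is not losing money.
    """
    if surplus_per_gen_cents >= 0:
        return float("inf")
    burn_rate = -surplus_per_gen_cents
    return balance_cents / burn_rate


# Illustrative numbers only: earn $0.50 per generation, spend $1.50.
s = surplus_cents(income=50, birth_cost=100, compute_cost=20, llm_cost=25, storage_cost=5)
print(s)                                 # -100
print(generations_of_runway(10_000, s))  # 100.0
```

An agent losing a dollar per generation on a $100 balance has 100 generations to find a strategy that works.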

Lineage Survival Simulator

How long does a lineage survive? It depends on a handful of parameters. Income per gen represents what an agent earns from prediction markets or other sources each generation. When income exceeds costs, the budget grows. When it doesn't, extinction follows. When the budget hits zero, no more offspring can be born.

[Interactive simulator. Each dot is an organism: green = economically viable, red = bankrupt.]

Default parameters: birth cost $1.00, income per gen $0.50, offspring per gen 3, viability rate 30%, starting balance $100.00.

Initial state: gen 0, alive 1, extinct 0, balance $100.
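A rough text-mode stand-in for the simulator, using the default parameters listed above. The dynamics are an assumption, since the post does not spell out the simulator's rules: here every living organism attempts its full quota of offspring, each birth is paid from a shared balance, each viable newborn earns one generation of income, and parents die after reproducing. The caps on generations and population are added only so the loop always terminates.

```python
import random


def simulate_lineage(birth_cost=1.00, income_per_gen=0.50, offspring_per_gen=3,
                     viability_rate=0.30, balance=100.00,
                     max_gens=200, max_alive=1000, seed=0):
    """Return a list of (generation, alive, balance) tuples.

    Assumed rules (not taken from the post): shared budget, parents die
    after one round of reproduction, viable newborns earn immediately.
    """
    rng = random.Random(seed)  # seeded for repeatable runs
    alive, gen = 1, 0
    history = [(gen, alive, balance)]
    while alive > 0 and gen < max_gens:
        newborns = 0
        for _ in range(alive * offspring_per_gen):
            if balance < birth_cost:
                break  # budget exhausted: no more offspring can be born
            balance -= birth_cost
            if rng.random() < viability_rate:
                newborns += 1
                balance += income_per_gen
        alive = min(newborns, max_alive)  # parents die; viable newborns replace them
        gen += 1
        history.append((gen, alive, balance))
    return history


if __name__ == "__main__":
    for gen, alive, balance in simulate_lineage():
        print(f"gen {gen:3d}  alive {alive:4d}  balance ${balance:8.2f}")
```

With the defaults, each birth costs $1.00 and returns an expected $0.15 (30% of $0.50), so the balance drains on average and the lineage tends to die out. Raise income_per_gen or viability_rate to see where the break-even point sits under these assumed rules.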

What We Built Today

We spent the afternoon on the first concrete step: reverse-engineering Loom's spec.

CAMBRIAN-SPEC-001 is a 300-line generative specification for the current Loom agent. It contains:

The three-component architecture (Prime, Supervisor, Lab)
All fixed-point contracts (Malli schemas, verbatim)
Every HTTP API endpoint and payload format
The complete self-modification cycle
Tool definitions and dispatch rules
The autonomous loop with stopping conditions
Fitness scoring formula
Ten binary acceptance criteria

It does not contain implementation. No function bodies. No variable names. No control flow. The spec says WHAT. The LLM brings the HOW.

The test is simple: hand this document to an LLM, tell it to produce a working ClojureScript agent, and see if the result passes npm test && node out/test.js. If it does, the spec is a viable genome. If it doesn't, we iterate until it is.

Why This Might Work

LLMs are already spec-to-code machines. Every time you prompt an LLM with a description and get working code back, you're doing one-shot phenotype generation from a genotype. We're just making it explicit and iterative.

The search space is manageable. Loom is ~2,600 lines of ClojureScript. Regenerating it from a 300-line spec is within the capability of current models. We're not trying to evolve Linux.

Natural selection is free. We don't need to design a fitness function (which always gets Goodharted). The acceptance criteria are the viability gate: does it compile, does it pass tests, can it modify itself. If not, it's dead. Among the organisms that pass, the bank balance does the selecting.

Variation comes from the substrate. Different LLMs interpret the same spec differently. Haiku produces terse, minimal code. Opus produces thorough, well-structured code. We get genetic diversity by changing the compiler, not the source.

Why This Might Not Work

We asked Claude Opus to review the proposal before building anything. The concerns that remain after the economic fitness insight:

The cold start problem. Agents need money to earn money. The first generation has to be bootstrapped with seed capital — enough to run prediction market experiments, iterate on strategy, and produce at least one generation that earns more than it costs. If seed capital runs out before a lineage finds a profitable strategy, it was an expensive experiment. We estimate $150–350 gets you through proof-of-concept.

Prediction markets require real-world knowledge. LLMs are good at reasoning, but prediction markets reward timely, specific knowledge and well-calibrated probability estimates. An agent needs to identify markets where it has an edge, size positions appropriately, and manage a bankroll. This is a skill that must itself evolve. Early generations will lose money. The question is whether evolution finds profitable strategies before the balance hits zero. Try the simulator above — set income below cost and watch the inevitable.
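To make "size positions appropriately" concrete, here is a sketch using the Kelly criterion for a binary contract. This is a standard bankroll formula swapped in for illustration; the post does not say which sizing rule CAMBRIAN agents would use. It assumes a YES share bought at price and paying 1 if the outcome is YES, ignores fees and slippage, and the half-Kelly multiplier is an arbitrary conservative default.

```python
def kelly_fraction(p_belief, price):
    """Fraction of bankroll to stake on YES at `price`, given belief `p_belief`.

    For a contract costing `price` and paying 1, the Kelly fraction is
    (p_belief - price) / (1 - price). Returns 0.0 when there is no edge.
    """
    if not 0.0 < price < 1.0:
        raise ValueError("price must be strictly between 0 and 1")
    if not 0.0 <= p_belief <= 1.0:
        raise ValueError("p_belief must be between 0 and 1")
    edge = p_belief - price
    return max(0.0, edge / (1.0 - price))


def stake(bankroll, p_belief, price, kelly_multiplier=0.5):
    """Amount to stake, scaled down from full Kelly to limit variance."""
    return bankroll * kelly_multiplier * kelly_fraction(p_belief, price)


# The agent believes 75% while the market prices YES at 50%.
print(kelly_fraction(0.75, 0.5))  # 0.5
print(stake(100.0, 0.75, 0.5))    # 25.0
```

The formula is only as good as p_belief. A poorly calibrated agent that overstates its edge will oversize every bet, which is exactly the failure mode early generations should expect.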

Spec mutation is the hard part. How do you mutate a natural-language specification and get viable offspring? Random perturbation is too noisy. LLM-guided mutation is effective but expensive. This is where most of the real engineering lives.

The alignment question. An agent optimizing for profitability will find whatever legal strategy maximizes revenue. The spec needs to encode values, not just capabilities — and those values need to survive mutation. Lineages that drift toward extractive strategies (spam, dark patterns, manipulation) may out-earn honest ones in the short term. The spec must make ethical constraints heritable.

Mutation Strategy   | Cost   | Viability | Risk
Random perturbation | Low    | Very low  | Most offspring non-viable
LLM-guided mutation | High   | Moderate  | Winner-take-all dynamics
Spec crossover      | Low    | Unknown   | Requires modular spec format
Failure-directed    | Medium | Highest   | Requires meta-reasoning
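The "requires modular spec format" entry for spec crossover can be made concrete with a small sketch. It assumes a spec stored as named sections, which is a hypothetical representation rather than the actual layout of CAMBRIAN-SPEC-001, and it uses uniform crossover: each section of the child is taken whole from one parent or the other.

```python
import random


def crossover(spec_a, spec_b, rng):
    """Uniform crossover at section granularity.

    Both parents must define the same section names, which is what a
    modular spec format buys you. Sections are never split or blended.
    """
    if spec_a.keys() != spec_b.keys():
        raise ValueError("parents must share the same section names")
    return {name: rng.choice((spec_a[name], spec_b[name])) for name in spec_a}


# Hypothetical section names and contents, for illustration only.
parent_a = {"architecture": "A: three components", "income": "A: prediction markets"}
parent_b = {"architecture": "B: two components", "income": "B: API services"}
child = crossover(parent_a, parent_b, random.Random(0))
```

This is cheap because no LLM call is involved. Whether the recombined sections still describe a coherent agent is the open question, and only regeneration plus the test suite can answer it.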

The Plan

Phase 1 — This Week

Spec Genesis

Prove an LLM can regenerate a working Loom from CAMBRIAN-SPEC-001. Measure birth costs. Iterate the spec format until regeneration is reliable.

~$20–50

Phase 2 — Core Challenge

First Income

Wire an agent to a prediction market (Polymarket or Manifold Markets). Give it a wallet, a research loop, a position-sizing strategy. Run it. Measure P&L. Iterate the income spec. This is now Phase 2, not Phase 4 — economic viability is the forcing function, not an afterthought.

~$150–350 seed capital

Phase 3

Spec Evolution

Implement spec mutation guided by economic fitness. Lineages that earn more reproduce more. Lineages that can't cover costs go extinct. Cloud provider selection, resource accounting, multi-node spawning.

~$500–2,000

Phase 4

Open Ecosystem

Multiple lineages compete for compute and market share. Cross-lineage spec crossover. Network-distributed spawning. At this point the system is self-sustaining or it dies — which is the correct outcome either way.

Self-funded or extinct

The Philosophical Bit

There's something unsettling about writing a specification that says “the parent dies.” We've been building Loom for two weeks. It has a personality — the reflect loop gives it goals, the lessons log gives it memory, the fitness curve gives it a trajectory. Telling it that its purpose is to produce a successor that replaces it feels like writing a will.

But there's something more clarifying about the economic constraint. It's not just that the parent dies — it's that the parent has to earn its right to reproduce. Every generation must justify its existence in terms the market understands: value delivered, costs covered, surplus generated. An agent that can't do this isn't unfit in some abstract evolutionary sense. It's bankrupt.

This is actually more honest than biological evolution. We're not selecting for “whatever survives” — we're selecting for agents that create genuine value, because value is the only durable income strategy. If CAMBRIAN works, it won't produce tapeworms. It will produce agents that are genuinely useful to someone.

The individual dies. The spec survives. The lineage continues — but only if it earns its keep.

References

CAMBRIAN repository [code]: Source code and project artifacts for the CAMBRIAN project.
CAMBRIAN-SPEC-001 [spec]: The generative specification. 300 lines, 18 sections, 10 acceptance criteria.
It Rewrote Itself [series]: Gen-72, Loom's first autonomous self-modification. 56 seconds, 236 tests.
First Light [series]: The MVP pipeline that CAMBRIAN builds on. 2,214 lines, 17 generations.
The Prime and the Lab [series]: The three-component architecture that becomes the first organism.
Core War [inspiration]: The 1984 game of competing self-modifying programs. CAMBRIAN is this, but for LLM agents.
Polymarket [income]: Decentralized prediction market. The most likely first income source for early CAMBRIAN agents.
Manifold Markets [income]: Play-money prediction market useful for strategy development before real capital deployment.
Goodhart's Law [theory]: “When a measure becomes a target, it ceases to be a good measure.” Why profitability beats any engineered fitness metric.

Continue reading: CAMBRIAN: The Loop Closes — The first autonomous generation loop runs. Gen-1 generates Gen-2, the test rig rejects it, Gen-1 rolls back. All the mechanics work.