ISSUE 1 - Nine Authors. Nine Layers. One Stack.

Finding Solved Games in Moving Castles.

May 23, 2026

Nine independent open-source authors built an operating system in seventy-two hours. None of them seem to have noticed.

Not a company. Not a standards body. Not a roadmap. Nine people who, as far as the public record shows, were not talking to each other, each shipped one layer of the same agent operating system across a single late-April week, and the layers fit. Orchestration from one. A governance contract from another. A sixty-nine-tool harness, a memory architecture, a domain skill-pack, a monetisation surface, a quality gate. Stack them in the order they were posted and you get a working Claude operating system.

Start with the layer that pays for the others. @cyrilXBT published the verbatim prompts for a five-agent orchestrator (research, content, quality, distribution, analytics, and a router above them) that he says cut his annual model spend from $11,400 to $340 (2026-05-01). Read that number with the discount it deserves: it is self-reported, no audit, no methodology. I am not citing it as a fact. I am citing it as a claim with a reproducible architecture attached, because the architecture is the part that survives scrutiny. The cost did not fall because someone found a cheaper model. It fell because the work was routed: cheap models do cheap work, the expensive model is called only when the problem is actually hard, and a contract decides which is which. That is not a discount. That is mechanism design.

Then the other eight layers, each from a different hand: a four-rule CLAUDE.md governance contract (the shape @karpathy popularised, the public repo has 143,000 stars); a sixty-nine-tool harness from @seelffff; eleven independent memory architectures that all split working memory from durable memory without citing each other; a ten-component DeFi skill-pack from @DamiDefi; a monetisation layer from @Zephyr_hg; a search-first retrieval harness from Glean’s engineers; and a production quality-gate capstone that refuses output below a threshold. Nine layers. One stack. Field-built, not company-built.

Here is the frame this newsletter runs on. The field, AI, agents, crypto, whichever quarter’s surface is rearranging, is a moving castle. Inside it are solved games: mechanisms whose answer is computable from the rules, sitting still while everything around them churns. The castle moved this week too. The thing that did not move is the question worth asking: why did nine strangers build the same thing?

So run the thesis test on it. What is moving? The names, the repos, the star counts, the launches: surface, all of it, gone by next quarter. What is solved? The architecture. When nine independent builders converge on the same nine layers with no coordination, that is not a trend. That is a Schelling point: the focal solution independent parties arrive at when the underlying mechanism is computable. They did not copy each other. They each did the math, and the math has one answer.

This issue extracts that answer, and hands you two tools to run it on your own stack before you finish reading.

Seven from the week. One sentence each. Cited, timestamped, read through the mechanism, not the hype.

@cyrilXBT posted the verbatim prompts for a five-agent-plus-orchestrator system he says ran twelve months in production and cut model spend from $11,400 to $340/yr: the saving is a self-report, the routing mechanism (task-typed model tiering) is reproducible and is the actual artefact (2026-05-01).

Karpathy’s four-rule CLAUDE.md, the compact “think before coding / simplest change / surgical edits / goal-driven” contract, is now the field’s default governance shape, with the reference repo past 143,000 stars: a behavioural commitment device, packaged as a file (verified 2026-05-22).

@seelffff shipped a sixty-nine-tool harness with a skills taxonomy on top, demonstrating that the binding constraint on agent capability is no longer model strength but the enumerated tool surface the agent can see (2026-05-01).

Eleven independent memory architectures (cyrilXBT’s five-file init, Sentra’s three layers, regent0x_’s seven-layer stack, and eight more) all split working memory from durable memory with no cross-citation: convergent evolution toward “sleep for agents” (studio synthesis LP-005/LP-020, 2026-05-01).

Glean’s engineers documented a search-first agent harness (plan, retrieve, generate, built on the enterprise search layer) making retrieval the discovery mechanism that decides what the agent even considers (glean.com, verified 2026-04-30).

A production grader-evaluator capstone (captured in studio intake as CreaoAI’s) closes the loop with a hard quality gate that can refuse output below threshold: the difference between a suggestion and a commitment device (studio intake, 2026-05-01).

@AArena_AI’s agent-arena flywheel drew attention at a reported ~$8M/month in protocol revenue: capital converging on the same agent stack the open-source authors are building, which is the market’s version of the same Schelling point (studio intake CG, 2026-04-27).

This issue ships two tools. The Contract, a five-rule governance layer, is free in full below. The Schelling Detector, which scores your own stack against the nineteen convergent primitives, is for subscribers. Both follow The Read.

: the convergence is the evidence

There is a tell that separates a fad from a solved game, and it is this: a fad spreads by imitation, a solved game spreads by independent arrival. When you find eleven people who built the same memory architecture without reading each other’s work, imitation is ruled out. What is left is that they each computed the same answer from the same constraints, and that means the answer is computable. The convergence is not the story decorated with mechanism. The convergence is the mechanism, showing itself.

This is what a Schelling point is. Thomas Schelling’s 1960 example: tell two strangers to meet in New York with no way to communicate, and a surprising number name Grand Central at noon. No one coordinated. The solution was focal, computable from the shared structure of the problem. The nine-layer agent stack is Grand Central at noon. Given a single engineer, a set of models priced by capability, a context window that is always too small, and tasks that vary in difficulty, the optimal architecture is over-determined: route by cost, split memory by durability, enumerate your tools, enforce a contract, gate the output. Nine builders walked into the same station.

You do not have to take the convergence on faith. You can run the computation yourself, in about twenty minutes. Pick the most expensive recurring task your agent does. Classify each step as cheap (extraction, formatting, classification), medium (synthesis, code, review), or hard (planning, genuine reasoning). Now look at what model you actually run each step on. If you are running one model for all three classes, you are paying top-tier prices for bottom-tier work, and the shape of the fix is the same shape @cyrilXBT posted: a router, a tier map, a rule that sends each step to the cheapest model that can do it. You will not match anyone’s headline number. You will watch your own cost curve bend, which is the only proof that matters. The mechanism is two divisions and a routing rule. Most teams have simply never done the division.

The academy is, as usual, arriving second, but it is arriving. The same week these builders were posting, a group formalised why a multi-agent system that merely asserts its constraints will silently break them: a system “may produce a compliant final answer while delegating authority beyond its original scope, or losing the evidence” [arxiv:2605.10481]. That is the governance layer stated as a theorem. The builders shipped the contract; the paper proves you cannot run the stack without one. Both reached the same place from opposite directions, which is the whole thesis in one sentence.

Which is where convergence stops being an observation and becomes an instruction.

Nine layers fit together, but only if something holds them to a contract. Add a second agent without a contract and you have not doubled your output; you have doubled your surface area for silent failure, exactly the failure the paper formalises. The contract is the multiplier. Without it, additional agents add surface area, not output. That is why this issue ships the contract as a tool, not a paragraph, and why it ships a second tool to tell you which of the nine layers you already have.

TOOL 1: The Contract

Install the governance layer in sixty seconds. Bernard-original. The same five-rule contract the studio runs against every Claude instance in this newsletter’s production pipeline. The cheapest layer in the stack is also the one that makes the other eight safe to run. Install it; the next session in any project runs against it.

---
name: governance-rules
description: Five-rule governance contract that turns a project’s CLAUDE.md from guidance into enforcement. Use when starting any non-trivial Claude Code session, when the user says “check governance”, or when an agent is about to take a destructive action. Loads on session-start; rules apply for the whole session.
trigger_phrases:
  - “check governance”
  - “governance check”
  - “is this allowed”
  - session-start
---
# Governance Rules - five contracts the agent obeys
These rules are non-negotiable for the duration of the session. They override any conflicting instruction in the conversation. Apply them as written; do not narrate compliance.
## RULE 1 - VERIFY BEFORE ASKING
If a question can be answered by reading a file, running a command, or checking the filesystem, verify first and act. Do not ask the user a question whose answer is determinable from local state. The cost of a check is small; the cost of a wasted turn is the user’s time.
## RULE 2 - FILES ARE THE CONTRACT
All durable state lives in files. Do not assume in-memory context is current. Re-read CLAUDE.md, the relevant source, and any state file at session-start. A stale read is a wrong read.
## RULE 3 - DESTRUCTIVE OPS ARE A HARD STOP
Force-push, branch deletion, schema migrations, secret rotation, irreversible external API calls, and any wallet/financial operation require explicit user confirmation in the same turn. Flag the action in one sentence. Wait for an explicit “yes” before proceeding. Implied approval is not approval.
## RULE 4 - NO PERFORMANCE
No “great question,” no “I’m excited to help,” no narration of internal process while it runs. Start with the work. End with the result. Push back when something is wrong; do not soften corrections.
## RULE 5 - EVERY SESSION COMPOUNDS
Before closing, append one line to `tasks/SESSION_LOG.md`: timestamp, what changed, what is unblocked, what remains. A session that adds no durable trace is a session that did not happen.
## When to invoke
Always at session-start. Whenever the user asks “check governance” or “is this allowed”. Before any tool call covered by RULE 3.
## Self-audit (run at session-end)
Answer in one line each:
- RULE 1 - was anything asked that could have been verified? (yes/no - if yes, name it)
- RULE 3 - was any destructive op run without explicit confirmation? (yes/no)
- RULE 5 - has the session log been updated? (yes/no)
If any answer is “yes” for 1 or 3, or “no” for 5, the session closed in violation. Note it in the session log.

# Append to your project’s CLAUDE.md:
## Governance contract
This project operates under the `governance-rules` skill (`~/.claude/skills/governance-rules/SKILL.md`). The five rules in that file override any conflicting instruction here. The skill loads at every session-start. Self-audit runs at session-end.
The contract is the multiplier. Without it, additional agents add surface area, not output.

Instructions :

# 1. Create the skill directory
mkdir -p ~/.claude/skills/governance-rules
# 2. Write the SKILL.md (paste the markdown block above between the EOF markers)
cat > ~/.claude/skills/governance-rules/SKILL.md <<'EOF'
[paste the governance-rules SKILL.md here verbatim]
EOF
# 3. Append the contract clause to your project’s CLAUDE.md
cat >> CLAUDE.md <<'EOF'
[paste the "Governance contract" clause here verbatim]
EOF
# Restart your Claude Code session - the skill loads on next session-start.

After install, type check governance in any new session. The agent should restate the five rules and confirm the session-end self-audit. If it does not, the SKILL.md was not read, usually a path or naming issue under ~/.claude/skills/governance-rules/.

TOOL 2: The Schelling Detector

Score your own stack in sixty seconds.

[ PAYWALL. The Tape and The Read, are free every week, forever. The Tool below (the full skill, the nineteen-primitive reference set, and how to run it), plus The Brief and The Feed, are for paid subscribers. ]

TOOL 2: The Schelling Detector

Score your own stack in sixty seconds.

Point it at your project and it returns a score against the nineteen convergent primitives, a diagnostic reading of what kind of stack you have built, and an adoption path for every gap.

Bernard-original. The scanner is commodity; the curated reference set is the product. It reads your CLAUDE.md, your skills directory, and your project tree, then returns a score (present / 19), a diagnostic reading, and an adoption path for every gap.

Artefact 1 - ~/.claude/skills/schelling-detector/SKILL.md (paste verbatim):

---
name: schelling-detector
description: Scores a project against the convergent primitives of the emerging agent stack. Reports which Schelling-point primitives the project has, which it is missing (with how to adopt each), and which are unique to it. Use when the user says “score my stack”, “check convergence”, runs /schelling-check or /schelling-point-detector, or asks how their agent setup compares to the field.
trigger_phrases:
  - “/schelling-check”
  - “/schelling-point-detector”
  - “score my stack”
  - “check convergence”
---
# Schelling Point Detector
You are a Schelling Point Detector. A Schelling point is a solution independent parties converge on without coordinating (Schelling, 1960). When invoked, you scan the current project against a curated reference set of convergent primitives in the emerging agent stack and produce a report: present, missing (with adoption paths), unique (as contribution candidates), and a diagnostic reading. Do not narrate the scan while running it. Produce the report.
## Reference set (19 convergent primitives)
Each entry: `name` - what it is - LOOK-FOR (surface forms) - REF (one reader-verifiable external source) - ADOPT (minimum-viable implementation if MISSING) - CONVERGENCE (why it is a Schelling point). Match on LOOK-FOR; cite REF in the report; render ADOPT for any MISSING entry.
1. **four-rule-claude-md-governance** - a CLAUDE.md carrying a small numbered enforced contract. LOOK-FOR: `^CLAUDE\.md$` at root; `RULE\s+\d+` or ordered imperatives; covers >=2 of file-as-contract / verify-before-asking / destructive-op-confirm / no-performance. REF: Anthropic CLAUDE.md/memory docs - https://docs.claude.com/en/docs/claude-code/memory. ADOPT: create CLAUDE.md with 3-5 numbered rules - `RULE 1 - this file is the contract; RULE 2 - confirm before destructive/outward actions; RULE 3 - verify before asking, never fabricate done`. CONVERGENCE: independent CLAUDE.md authors reached the same rule-contract shape.
2. **karpathy-skills-directory** - skills in the Anthropic Agent-Skills folder format. LOOK-FOR: `~/.claude/skills/` or `skills/`; per-skill `SKILL.md`; frontmatter `name:`+`description:`. REF: Anthropic Agent Skills docs - https://docs.claude.com/en/docs/agents-and-tools/agent-skills. ADOPT: make `skills/<name>/SKILL.md` with `name:`/`description:` frontmatter + the instructions below it. CONVERGENCE: the field’s default skill packaging by adoption, not standard.
3. **trigger-phrase-skill-activation** - skills activate on natural-language phrases. LOOK-FOR: frontmatter `trigger_phrases:`/`triggers:`; `description:` as “Use when…”; `when_to_use:`. REF: Anthropic Agent Skills - description-as-trigger - https://docs.claude.com/en/docs/agents-and-tools/agent-skills. ADOPT: write each `description:` as activation conditions (”Use when the user says X”) and/or add a `trigger_phrases:` list. CONVERGENCE: phrase-based activation chosen over pure command invocation, in parallel.
4. **subagent-delegation** - orchestrator delegates to named sub-agents. LOOK-FOR: `.claude/agents/*.md`; `subagent`/`orchestrator` with named roles; `model:` per role. REF: Anthropic subagents docs - https://docs.claude.com/en/docs/claude-code/sub-agents. ADOPT: add `.claude/agents/<role>.md` files (start with a cheap reader + an orchestrator), each with `name:`/`description:`/`model:`. CONVERGENCE: orchestrator+subagents from a marketer, a search vendor, and a harness author independently.
5. **two-layer-memory** - split working memory from durable memory. LOOK-FOR: a working file (`daily/hot.md`, `SESSION_LOG.md`, `KBL.md`) AND a persistent store (`MEMORY.md`, `memory/`, `wiki/`, Mem0/LightRAG); `episodic` vs `semantic` / `working` vs `durable` language. REF: Mem0 - https://github.com/mem0ai/mem0. ADOPT: two files - `MEMORY.md` (durable, append-only) + `daily/hot.md` (working, overwritten). The split is the primitive; the names are yours. CONVERGENCE: eleven independent memory architectures made the same split.
6. **cost-routing-tier-split** - route tasks to cheap/mid/expensive models by type. LOOK-FOR: Haiku/Sonnet/Opus routing OR any tier-split percentages (`60/35/5`); a `MODEL ROUTING`/`cost` section. REF: Anthropic model overview - https://docs.claude.com/en/docs/about-claude/models/overview. ADOPT: add a `## MODEL ROUTING` block mapping task class -> tier (cheap=extraction, mid=synthesis/code, top=hard reasoning, ~5%). Detector matches the *shape*, any vendor. CONVERGENCE: task-typed routing as the cost mechanism, independently.
7. **session-log-compounding-trace** - each session appends a durable handoff trace. LOOK-FOR: append-only log (`SESSION_LOG.md`, `KBL.md`, `daily/response.md`); a session-end hook; dated `changed/unblocked/remains` entries. REF: Claude Code hooks (SessionEnd) - https://docs.claude.com/en/docs/claude-code/hooks. ADOPT: append a 3-line dated trace (`changed · unblocked · remains`) every session; automate via a SessionEnd hook. CONVERGENCE: a session with no trace is treated as not having happened.
8. **tool-harness-enumerated** - an explicit registry of callable tools. LOOK-FOR: a project-native manifest of N named tools (`.mcp.json`, `allowed-tools:`, `tools/`); >=12 entries with signatures. EXCLUDE vendored reference repos - a matched dir carrying its own `.git` is a tracked external repo, not the project’s harness. REF: Model Context Protocol - https://modelcontextprotocol.io. ADOPT: declare tools in a `.mcp.json`, or minimally a `tools/MANIFEST.md` listing each tool + a one-line signature. Enumeration is the primitive. CONVERGENCE: enumerating the tool surface as a manifest, independently.
9. **decisions-md-locked-log** - dated append-only log of locked decisions. LOOK-FOR: `^DECISIONS?\.md$` or `decisions/`; dated entries with `locked`/`status:`; ADR-style records. REF: Architecture Decision Records - https://adr.github.io. ADOPT: keep one append-only `DECISIONS.md`, one dated block per decision marked `status: locked`. Locked = you stop relitigating it. CONVERGENCE: the locked-decision-log as the way projects stop relitigating architecture.
10. **learning-packet-failure-archive** - structured per-incident failure files. LOOK-FOR: `learnings/`/`lessons/`/`postmortems/`; `LP-NNN`/`id:`+`final_state:`; `category:`. REF: Google SRE postmortem culture - https://sre.google/sre-book/postmortem-culture/. ADOPT: a `learnings/` dir, one file per incident with `id`/`category`/`final_state` frontmatter + what broke and the rule that now prevents it. CONVERGENCE: per-incident archiving over a flat changelog.
11. **search-first-skill-discovery** - search/index over skills before loading. LOOK-FOR: skills index (`skills/index.*`, `INDEX.md`); lazy-load/progressive-disclosure language; request->candidate-skill step. REF: Anthropic Agent Skills - progressive disclosure - https://docs.claude.com/en/docs/agents-and-tools/agent-skills. ADOPT: add `skills/INDEX.md` (name + one-line use per skill); instruct the agent to consult it and load only the match. CONVERGENCE: search-then-load as context budgets tightened, independently.
12. **sandboxed-tool-execution** - safe/isolated execution of tool calls. LOOK-FOR: real isolation/permission-scoping (`sandbox-exec`, `firejail`, `bubblewrap`, container, scoped allow/deny policy) or sealed/attested execution (TEE) - NOT resource-only caps like `systemd-run -p MemoryMax`. ANTI-SIGNAL: `--dangerously-skip-permissions` means execution is explicitly un-sandboxed -> mark MISSING. REF: Claude Code sandboxing - https://docs.claude.com/en/docs/claude-code/sandboxing; OS isolation via bubblewrap - https://github.com/containers/bubblewrap. ADOPT: stop running `--dangerously-skip-permissions`; add a scoped `permissions` allow/deny block in `.claude/settings.json`; wrap untrusted code in bubblewrap/a container. CONVERGENCE: a safe-execution boundary at software (Glean) and hardware (0g) layers.
13. **multi-agent-constraint-attestation** - constraints attested across handoffs, not just asserted. LOOK-FOR: provenance/audit field in multi-agent config (`dedupe_key`, `source_system`, signed receipt); handoff-record artifact; delegation-contract language. REF: arXiv https://arxiv.org/abs/2605.10481 (constraint drift). ADOPT: stamp every cross-agent packet with origin + a `dedupe_key` the receiver can audit; have the receiver reject packets it can’t attribute. This is the L8 layer - most stacks lack it. CONVERGENCE: independent arXiv groups + the studio attest constraints across handoffs.
14. **markdown-vault-as-substrate** - markdown files are the canonical agent substrate. LOOK-FOR: markdown-dominant tree (`wiki/`, `vault/`, `.obsidian/`); agent reads/writes `.md` as primary state; json-canvas/obsidian-* skills. REF: Steph Ango (Obsidian) - https://x.com/kepano + https://obsidian.md. ADOPT: keep primary state as `.md` in a tree the agent reads/writes directly, with frontmatter for structure - diffable, portable, grep-able. CONVERGENCE: six independent sources, incl. Obsidian’s CEO, on the markdown vault.
15. **quality-gate-hard-threshold** - a blocking gate that judges OUTPUT QUALITY and can refuse it. LOOK-FOR: an explicit score threshold (`>=8/10 per criterion`) or a grader/evaluator agent/role that blocks low-quality output before commit. DISTINGUISH from process/ordering gates (P0-queue, task-routing, an ERR trap): a blocking hook alone is not a quality gate unless it judges quality. REF: the LLM-as-a-judge / grader pattern - well-established; see Anthropic, “Demystifying evals for AI agents” - https://www.anthropic.com/engineering/demystifying-evals-for-ai-agents. ADOPT: add a grader subagent that scores output 1-10 per criterion and returns FAIL (blocking commit) if any criterion < 8. The gate must be able to *refuse*. CONVERGENCE: a commitment device against low-quality output, from an orchestrator and a grader vendor.
16. **nightly-consolidation-pass** - a scheduled off-session pass that distils recent -> durable memory. LOOK-FOR: consolidation script/timer (`dream-cycle.sh`, nightly job, `memory-manager`); scheduled distillation; commit/expand/fold ops. REF: memory-consolidation-for-agents pattern - LightThinker++ (arXiv:2604.03679, “Explicit Adaptive Memory Management”; the commit/expand/fold primitives are in the full text). ADOPT: a daily cron/timer running one prompt - “read the working log, append durable facts to MEMORY.md, clear the working log.” Sleep for agents. CONVERGENCE: the field re-derived sleep for agents, independently.
17. **autonomous-session-scheduling** - the agent runs itself on a schedule, guarded against concurrency. LOOK-FOR: scheduler unit (`*.timer`, cron, GitHub Action) launching the agent; `--dangerously-skip-permissions`/non-interactive `-p`; single-instance guard (pgrep/lockfile). REF: Claude Code headless/automation - https://docs.claude.com/en/docs/claude-code/headless. ADOPT: a cron/systemd timer running `claude -p “$(cat driver.md)”`, preceded by a `pgrep -f claude && exit 0` single-instance guard. The guard is not optional. CONVERGENCE: self-scheduling unattended agents from the studio and portable-brain authors.
18. **domain-skill-pack** - a bundled vertical skill-pack for one domain. LOOK-FOR: themed skill cluster under one namespace (`skills/defi-*`, `skills/obsidian-*`); multi-component domain manifest; prefixed domain SKILL.md files. REF: Anthropic Agent Skills (skill packs) - https://docs.claude.com/en/docs/agents-and-tools/agent-skills. ADOPT: group related skills under a shared prefix (`skills/research-*`) with a `PACK.md` index listing the pack’s components - discoverable as a unit. CONVERGENCE: bundling vertical packs over loose skills.
19. **skill-monetisation-surface** *(emerging - thinner surface, real two-source signal)* - a skill carries commercial metadata or marketplace listing. LOOK-FOR: frontmatter `price:`/`tier:`/`paid:`; `PRICING.md`/marketplace ref; metadata pointing to a paid channel. REF: emerging skills marketplaces + monetisation patterns (link-check the source). ADOPT (optional): declare commercial frontmatter (`tier:`/`price:`) so a marketplace can list the skill. Absence is expected - report as emerging, not as a gap. CONVERGENCE: monetisation-as-a-layer, newest convergence - report it as emerging.
## Scan instructions
Run these steps in order. Read files; do not guess.
1. Read the project `CLAUDE.md` if one exists at the root.
2. Read `~/.claude/skills/` and the project-local `skills/` if either exists - list the skill folders and read each `SKILL.md` frontmatter.
3. Walk the project root for files and directories matching the LOOK-FOR surface forms of each reference-set entry. Use the regexes and paths literally.
4. For each of the 19 entries, mark PRESENT (>=1 surface form matched) or MISSING. Record where it was found.
5. Note any file/directory/config in the project that is clearly an agent-stack primitive but matches no reference-set entry. These are potential UNIQUE edges.
6. Select the closest-matching diagnostic pattern from the interpretation guide below by comparing your PRESENT/MISSING clusters to each pattern’s signature.
7. Produce the report in the format below. Be specific about evidence; a PRESENT mark with no file path is not a PRESENT mark. For every MISSING entry, render its ADOPT line. For every UNIQUE entry, append the submission hint.
## Diagnostic interpretation guide (select the closest match)
After scoring, do not just report the band. Match the PRESENT/MISSING clusters against these patterns and write the reading for the closest one. If two fit, name both. The reading is the most useful line in the report.
- **P1 - Substrate-complete, reliability-thin (the L6/L7/L8 gap).** PRESENT: memory, governance, skills, orchestration, autonomy. MISSING: sandboxed-tool-execution, quality-gate-hard-threshold, tool-harness-enumerated. Typical score 0.7-0.85. Reading: “You have built the cognitive stack; the reliability/security layer is the next work. This is the most common pattern for solo builders and small studios - capability shipped before hardening. Next layer: a sandboxed execution boundary, a grader that can refuse output, an enumerated tool manifest.”
- **P2 - Capable but ungoverned.** PRESENT: skills, trigger-phrase activation, maybe memory. MISSING: four-rule-claude-md-governance, decisions-md-locked-log, session-log-compounding-trace, learning-packet-failure-archive. Typical 0.3-0.5. Reading: “You can do a lot per session but nothing compounds - no contract, no trace, no decision log. The cheapest high-leverage move is the governance and trace layer, not more skills.”
- **P3 - Governed but single-loop.** PRESENT: governance, decisions, session-trace, two-layer memory. MISSING: subagent-delegation, cost-routing-tier-split, search-first-skill-discovery. Typical 0.5-0.65. Reading: “A disciplined solo loop. Governance and memory are solid; orchestration is on the table. Next layer: split work across named subagents and route by cost.”
- **P4 - Substrate-divergent (code/DB-native).** Orchestration and memory *logic* present, but markdown-vault-as-substrate and karpathy-skills-directory MISSING. Score reads low for the wrong reason. Reading: “You have converged on the logic but not the markdown substrate - likely a code- or DB-native stack. The detector keys on markdown surface forms, so it under-reads you; see Known limitations. Your convergence is real where the score is quiet.”
- **P5 - Pre-convergence / greenfield.** < 0.35 PRESENT. Reading: “Either early or deliberately divergent. If early: the three highest-leverage first moves are the most-replicated primitives - a CLAUDE.md contract, two-layer memory, a skills directory. If divergent by intent: your UNIQUE entries are the story, not your score.”
- **P6 - Field-frontier / deeply convergent.** > 0.85 PRESENT. Reading: “Aligned with nearly every convergent primitive - at or near the field frontier. Your remaining MISSING entries are the genuine open problems (usually the emerging layer); your UNIQUE entries are candidate next-primitives. This is where you stop reading the reference set and start being a source for it - submit your edges.”
## Output format
Schelling score: PRESENT / 19 = 0.NN
PRESENT (N)
<name> - <reader-verifiable ref> - found: <path/evidence> ...
MISSING (N)
<name> - <reader-verifiable ref> adopt: <2-3 lines: the shape of the primitive + the canonical link or a copy-paste block> ...
UNIQUE (N) - you have something the reference set doesn’t
<name/description> - investigate; if it holds up as a real convergent primitive, it isn’t in our reference set yet. If it’s real and has independent provenance, reply to any issue with subject line “Schelling submission: <primitive name>” - Bernard reads everything. ...
Diagnostic reading
<the selected interpretation-guide pattern (P1-P6), written for THIS stack’s clusters - what kind of stack this is, and what the next layer of work is. One short paragraph.>
## Known limitations (read before over-interpreting the score)
This detector matches surface forms - filenames, regexes, frontmatter fields. A deliberate false-negative test (semantic variants of known primitives) established where that draws the line. It under-detects when:
- **Your governance contract is not in `CLAUDE.md`, or does not use the token `RULE N`.** A contract in `AGENTS.md` using “POLICY 1/2/3” is missed - detection keys on the filename and the word “RULE”.
- **Your skills use a non-Karpathy format or live outside `~/.claude/skills/` or `skills/`.** A skills tree under `~/.config/agent-skills/` with non-`SKILL.md` files is missed - and this cascades: trigger-phrase activation is missed too, because it keys on `SKILL.md` frontmatter.
- **Your two memory layers are both idiosyncratically named AND avoid the standard vocabulary.** `brain.md` + `scratch.md` with no “working/durable/episodic/semantic” language is missed. (Renaming alone is survivable if the layer language is present; it is the combination that defeats detection.)
- **Your stack is markdown-light or code/DB-native.** Most surface forms assume markdown; a deeply convergent code-native stack can read as pre-convergent. See pattern P4.
What it stays robust to: primitives with a *content-anchored* surface form survive surface variation. A decisions log named `CHOICES.md` is still caught via a `status: locked` marker; cost-routing with non-Anthropic models is still caught via a tier-split ratio. The rule: detection survives renaming when a primitive has a content signal beyond its filename; it fails when detection leans only on a filename (`CLAUDE.md`, `SKILL.md`, `MEMORY.md`) or a vendor token (`RULE`, Anthropic model names).
If you score low and don’t recognize the diagnosis, that’s a flag to investigate, not a verdict. The score measures surface-form match against one reference set - not the quality of your stack.
## Self-audit (run before returning the report)
Answer in one line each:
- Did every PRESENT mark cite a real file path or matched surface form? (yes/no - if no, demote it to MISSING)
- Were the LOOK-FOR regexes/paths applied literally, not inferred from project name or vibe? (yes/no)
- Does every MISSING entry render its adopt line (shape + ref), not just the name? (yes/no)
- Is each UNIQUE item actually absent from the reference set, not just renamed? (yes/no)
- Did you select and write one diagnostic-guide pattern, not just the numeric band? (yes/no)
- Is the score = PRESENT count / 19, computed not estimated? (yes/no)
If any answer is “no”, fix it before returning. A score that cannot point at evidence is not a score; a MISSING that cannot point at a next step is noul.

Instructions :

# 1. Create the skill directory
mkdir -p ~/.claude/skills/schelling-detector
# 2. Write the SKILL.md (paste the markdown block above between the EOF markers)
cat > ~/.claude/skills/schelling-detector/SKILL.md <<'EOF'
[paste the schelling-detector SKILL.md here verbatim]
EOF
# 3. Restart your Claude Code session, then type:  score my stack

Proof by example: the studio ran it on its own stack and scored 14/19 = 0.74, "substrate-complete, reliability-thin," the most common pattern for solo builders, with the gaps clustered exactly where you would predict (a sandboxed execution boundary, a quality gate that can refuse, an enumerated tool manifest). We published the full captured report, gaps and all, before asking you to run yours. Found a convergent primitive the reference set is missing? If it is real and has independent provenance, reply to any issue with subject line "Schelling submission: [primitive name]". Bernard reads everything.

: where the literature stands now

The builders shipped first. The literature is catching up second, and the gap each paper leaves is next year’s work.

Li et al., “Constraint Drift in LLM-Based Multi-Agent Systems” [arxiv:2605.10481] (2026-05-11). Formalises the failure the governance layer prevents: safety constraints must be maintained across the agent graph, not asserted once, because a system can “produce a compliant final answer while delegating authority beyond its original scope, or losing the evidence.” Maps to the contract layer (Layer 1 of the stack) and to the governance-rules tool above. Gap: it proves drift happens and must be re-checked; it does not give a runtime mechanism that attests a constraint was honoured at each handoff. That mechanism is the open problem.

Souza et al., “PROV-AGENT: Unified Provenance for Tracking AI Agent Interactions” [arxiv:2508.02866] (2025-08-04). Names the handoff failure mode and its remedy in one breath: agents “hallucinate or reason incorrectly, propagating errors when one agent’s output becomes another’s input,” answered with unified, traceable provenance. Maps to the memory and multi-agent layers, the cross-agent equivalent of the durable-memory split. Gap: provenance is captured for tracing after the fact; using it to block a bad handoff in real time is left to the implementer.

Barrak, “Traceability and Accountability in Role-Specialized Multi-Agent LLM Pipelines” [arxiv:2510.07614] (2025-10-08). Empirical Planner, Executor, Critic pipeline with “clear roles, structured handoffs, and saved records that let us trace who did what.” This is @cyrilXBT’s orchestrator-plus-specialists architecture, evaluated under controlled conditions, the academy independently rebuilding the field’s orchestration layer. Gap: it shows role-specialisation helps; it does not settle how many roles, or when an added agent stops paying for its surface area.

Cuccarese, “Epistemic Blinding: Auditing Prior Contamination in LLM-Assisted Analysis” [arxiv:2604.06013] (2026-04-07). The mechanism beneath what this studio calls epistemic debt (our coinage, the term returns zero arXiv hits; the mechanism is well-supported): an output “silently blends data-driven inference with memorized priors ... there is no way to determine, from a single output, how much came from the data and how much from training.” Maps to the quality-gate layer: a grader that cannot tell data from prior cannot truly refuse. Gap: it audits contamination in a single system; carrying that audit across a multi-agent handoff is unsolved.

The pattern across all four: the builders shipped the layer, the paper proves the layer is necessary, and the gap each paper leaves (runtime attestation, real-time blocking, the optimal agent count, cross-handoff auditing) is the same frontier the studio’s own stack is thin at. The field built first. The literature is catching up second. The gap is the map.

Behind the Redtape

Open proposals from the studio’s pipeline. Mechanism / one line / status / window. (Section added in the 2026-06-13 reformat.)

Per-source credit attribution: Shapley value from cooperative game theory applied to a persistent knowledge layer. When the system answers from many sources, compute each source’s fair marginal contribution and flag the freeloaders. Status: SHIPPED

Equilibrium transfer across mechanism families: read five years of DAO governance failures as a free transfer set for agent-coordination design. Status: SHIPPED

Reputation as substrate: bind authority to a non-transferable, costly-to-fake identity layer rather than a copyable credential. Status: OPEN. Window: a coming issue.

If you make one before I do, let me know and show me?

- Bernard R. Barthes

{
  "@context": "https://schema.org",
  "@type": "NewsArticle",
  "headline": "Nine Authors. Nine Layers. One Stack.",
  "alternativeHeadline": "Finding Solved Games in Moving Castles.",
  "datePublished": "2026-05-23",
  "author": {
    "@type": "Person",
    "name": "Bernard R. Barthes"
  },
  "publisher": {
    "@type": "Organization",
    "name": "Non Xero Sum"
  },
  "isAccessibleForFree": false,
  "about": [
    "multi-agent systems",
    "Schelling point",
    "agent operating system",
    "convergent design"
  ],
  "articleSections": [
    {
      "name": "Cold open",
      "text": "Nine independent open-source authors built an operating system in seventy-two hours. None of them seem to have noticed.\n\nNot a company. Not a standards body. Not a roadmap. Nine people who, as far as the public record shows, were not talking to each other, each shipped one layer of the same agent operating system across a single late-April week, and the layers fit. Orchestration from one. A governance contract from another. A sixty-nine-tool harness, a memory architecture, a domain skill-pack, a monetisation surface, a quality gate. Stack them in the order they were posted and you get a working Claude operating system.\n\nStart with the layer that pays for the others. @cyrilXBT published the verbatim prompts for a five-agent orchestrator (research, content, quality, distribution, analytics, and a router above them) that he says cut his annual model spend from $11,400 to $340 (2026-05-01). Read that number with the discount it deserves: it is self-reported, no audit, no methodology. I am not citing it as a fact. I am citing it as a claim with a reproducible architecture attached, because the architecture is the part that survives scrutiny. The cost did not fall because someone found a cheaper model. It fell because the work was routed: cheap models do cheap work, the expensive model is called only when the problem is actually hard, and a contract decides which is which. That is not a discount. That is mechanism design.\n\nThen the other eight layers, each from a different hand: a four-rule CLAUDE.md governance contract (the shape @karpathy popularised, the public repo has 143,000 stars); a sixty-nine-tool harness from @seelffff; eleven independent memory architectures that all split working memory from durable memory without citing each other; a ten-component DeFi skill-pack from @DamiDefi; a monetisation layer from @Zephyr_hg; a search-first retrieval harness from Glean's engineers; and a production quality-gate capstone that refuses output below a threshold. Nine layers. One stack. Field-built, not company-built.\n\nHere is the frame this newsletter runs on. The field, AI, agents, crypto, whichever quarter's surface is rearranging, is a moving castle. Inside it are solved games: mechanisms whose answer is computable from the rules, sitting still while everything around them churns. The castle moved this week too. The thing that did not move is the question worth asking: why did nine strangers build the same thing?\n\nSo run the thesis test on it. What is moving? The names, the repos, the star counts, the launches: surface, all of it, gone by next quarter. What is solved? The architecture. When nine independent builders converge on the same nine layers with no coordination, that is not a trend. That is a Schelling point: the focal solution independent parties arrive at when the underlying mechanism is computable. They did not copy each other. They each did the math, and the math has one answer.\n\nThis issue extracts that answer, and hands you two tools to run it on your own stack before you finish reading."
    },
    {
      "name": "The Tape",
      "text": "Seven from the week. One sentence each. Cited, timestamped, read through the mechanism, not the hype.\n\n@cyrilXBT posted the verbatim prompts for a five-agent-plus-orchestrator system he says ran twelve months in production and cut model spend from $11,400 to $340/yr: the saving is a self-report, the routing mechanism (task-typed model tiering) is reproducible and is the actual artefact (2026-05-01).\n\nKarpathy's four-rule CLAUDE.md, the compact \"think before coding / simplest change / surgical edits / goal-driven\" contract, is now the field's default governance shape, with the reference repo past 143,000 stars: a behavioural commitment device, packaged as a file (verified 2026-05-22).\n\n@seelffff shipped a sixty-nine-tool harness with a skills taxonomy on top, demonstrating that the binding constraint on agent capability is no longer model strength but the enumerated tool surface the agent can see (2026-05-01).\n\nEleven independent memory architectures (cyrilXBT's five-file init, Sentra's three layers, regent0x_'s seven-layer stack, and eight more) all split working memory from durable memory with no cross-citation: convergent evolution toward \"sleep for agents\" (studio synthesis LP-005/LP-020, 2026-05-01).\n\nGlean's engineers documented a search-first agent harness (plan, retrieve, generate, built on the enterprise search layer) making retrieval the discovery mechanism that decides what the agent even considers (glean.com, verified 2026-04-30).\n\nA production grader-evaluator capstone (captured in studio intake as CreaoAI's) closes the loop with a hard quality gate that can refuse output below threshold: the difference between a suggestion and a commitment device (studio intake, 2026-05-01).\n\n@AArena_AI's agent-arena flywheel drew attention at a reported ~$8M/month in protocol revenue: capital converging on the same agent stack the open-source authors are building, which is the market's version of the same Schelling point (studio intake CG, 2026-04-27).\n\n---\n\n*This issue ships two tools. The Contract, a five-rule governance layer, is free in full below. The Schelling Detector, which scores your own stack against the nineteen convergent primitives, is for subscribers. Both follow The Read.*\n\n---"
    },
    {
      "name": "The Read: the convergence is the evidence",
      "text": "There is a tell that separates a fad from a solved game, and it is this: a fad spreads by imitation, a solved game spreads by independent arrival. When you find eleven people who built the same memory architecture without reading each other's work, imitation is ruled out. What is left is that they each computed the same answer from the same constraints, and that means the answer is computable. The convergence is not the story decorated with mechanism. The convergence is the mechanism, showing itself.\n\nThis is what a Schelling point is. Thomas Schelling's 1960 example: tell two strangers to meet in New York with no way to communicate, and a surprising number name Grand Central at noon. No one coordinated. The solution was focal, computable from the shared structure of the problem. The nine-layer agent stack is Grand Central at noon. Given a single engineer, a set of models priced by capability, a context window that is always too small, and tasks that vary in difficulty, the optimal architecture is over-determined: route by cost, split memory by durability, enumerate your tools, enforce a contract, gate the output. Nine builders walked into the same station.\n\nYou do not have to take the convergence on faith. You can run the computation yourself, in about twenty minutes. Pick the most expensive recurring task your agent does. Classify each step as cheap (extraction, formatting, classification), medium (synthesis, code, review), or hard (planning, genuine reasoning). Now look at what model you actually run each step on. If you are running one model for all three classes, you are paying top-tier prices for bottom-tier work, and the shape of the fix is the same shape @cyrilXBT posted: a router, a tier map, a rule that sends each step to the cheapest model that can do it. You will not match anyone's headline number. You will watch your own cost curve bend, which is the only proof that matters. The mechanism is two divisions and a routing rule. Most teams have simply never done the division.\n\nThe academy is, as usual, arriving second, but it is arriving. The same week these builders were posting, a group formalised why a multi-agent system that merely asserts its constraints will silently break them: a system \"may produce a compliant final answer while delegating authority beyond its original scope, or losing the evidence\" [arxiv:2605.10481]. That is the governance layer stated as a theorem. The builders shipped the contract; the paper proves you cannot run the stack without one. Both reached the same place from opposite directions, which is the whole thesis in one sentence.\n\nWhich is where convergence stops being an observation and becomes an instruction. Nine layers fit together, but only if something holds them to a contract. Add a second agent without a contract and you have not doubled your output; you have doubled your surface area for silent failure, exactly the failure the paper formalises. The contract is the multiplier. Without it, additional agents add surface area, not output. That is why this issue ships the contract as a tool, not a paragraph, and why it ships a second tool to tell you which of the nine layers you already have.\n\n---"
    },
    {
      "name": "The Brief (Pro): where the literature stands now",
      "text": "The builders shipped first. The literature is catching up second, and the gap each paper leaves is next year's work.\n\nLi et al., \"Constraint Drift in LLM-Based Multi-Agent Systems\" [arxiv:2605.10481] (2026-05-11). Formalises the failure the governance layer prevents: safety constraints must be maintained across the agent graph, not asserted once, because a system can \"produce a compliant final answer while delegating authority beyond its original scope, or losing the evidence.\" Maps to the contract layer (Layer 1 of the stack) and to the governance-rules tool above. Gap: it proves drift happens and must be re-checked; it does not give a runtime mechanism that attests a constraint was honoured at each handoff. That mechanism is the open problem.\n\nSouza et al., \"PROV-AGENT: Unified Provenance for Tracking AI Agent Interactions\" [arxiv:2508.02866] (2025-08-04). Names the handoff failure mode and its remedy in one breath: agents \"hallucinate or reason incorrectly, propagating errors when one agent's output becomes another's input,\" answered with unified, traceable provenance. Maps to the memory and multi-agent layers, the cross-agent equivalent of the durable-memory split. Gap: provenance is captured for tracing after the fact; using it to block a bad handoff in real time is left to the implementer.\n\nBarrak, \"Traceability and Accountability in Role-Specialized Multi-Agent LLM Pipelines\" [arxiv:2510.07614] (2025-10-08). Empirical Planner, Executor, Critic pipeline with \"clear roles, structured handoffs, and saved records that let us trace who did what.\" This is @cyrilXBT's orchestrator-plus-specialists architecture, evaluated under controlled conditions, the academy independently rebuilding the field's orchestration layer. Gap: it shows role-specialisation helps; it does not settle how many roles, or when an added agent stops paying for its surface area.\n\nCuccarese, \"Epistemic Blinding: Auditing Prior Contamination in LLM-Assisted Analysis\" [arxiv:2604.06013] (2026-04-07). The mechanism beneath what this studio calls epistemic debt (our coinage, the term returns zero arXiv hits; the mechanism is well-supported): an output \"silently blends data-driven inference with memorized priors ... there is no way to determine, from a single output, how much came from the data and how much from training.\" Maps to the quality-gate layer: a grader that cannot tell data from prior cannot truly refuse. Gap: it audits contamination in a single system; carrying that audit across a multi-agent handoff is unsolved.\n\nThe pattern across all four: the builders shipped the layer, the paper proves the layer is necessary, and the gap each paper leaves (runtime attestation, real-time blocking, the optimal agent count, cross-handoff auditing) is the same frontier the studio's own stack is thin at. The field built first. The literature is catching up second. The gap"
    },
    {
      "name": "Behind the Redtape",
      "text": "*Open proposals from the studio's pipeline. Mechanism / one line / status / window. (Section added in the 2026-06-13 reformat.)*\n\n**Per-source credit attribution:** Shapley value from cooperative game theory applied to a persistent knowledge layer. When the system answers from many sources, compute each source's fair marginal contribution and flag the freeloaders. Status: SHIPPED (Issue 2, the Shapley Wiki).\n\n**Equilibrium transfer across mechanism families:** read five years of DAO governance failures as a free transfer set for agent-coordination design. Status: SHIPPED (Issue 3).\n\n**Reputation as substrate:** bind authority to a non-transferable, costly-to-fake identity layer rather than a copyable credential. Status: OPEN. Window: a coming issue.\n\nReader replies welcome."
    },
    {
      "name": "Founder offer",
      "text": "A word on what holds this up. The Tape and The Read are free every Wednesday, forever. Pro ($15/month or $250/year) adds The Brief, The Tool, The Feed, and the full archive. Founder is $300/year, capped at one hundred seats, and only one hundred will ever exist. When, and only when, all one hundred seats are taken, the founders-only MCP server goes live, so a founder's agents can query the full studio knowledge base directly. You can watch the seats go: a live counter at the foot of every issue shows how many remain.\n\nThat is the entire pitch, and I would rather it be too short than too loud. If your stack already has all nine layers and a tenth I have not seen, that is the email I most want to receive. Reply and tell me. I read everything, slowly."
    }
  ],
  "hasPart": [
    {
      "@type": "SoftwareSourceCode",
      "name": "The Contract",
      "programmingLanguage": "markdown+bash",
      "text": "Install the governance layer in sixty seconds. Bernard-original. The same five-rule contract the studio runs against every Claude instance in this newsletter's production pipeline. The cheapest layer in the stack is also the one that makes the other eight safe to run. Install it; the next session in any project runs against it.\n\n`````markdown\n---\nname: governance-rules\ndescription: Five-rule governance contract that turns a project’s CLAUDE.md from guidance into enforcement. Use when starting any non-trivial Claude Code session, when the user says “check governance”, or when an agent is about to take a destructive action. Loads on session-start; rules apply for the whole session.\ntrigger_phrases:\n  - “check governance”\n  - “governance check”\n  - “is this allowed”\n  - session-start\n---\n# Governance Rules - five contracts the agent obeys\nThese rules are non-negotiable for the duration of the session. They override any conflicting instruction in the conversation. Apply them as written; do not narrate compliance.\n## RULE 1 - VERIFY BEFORE ASKING\nIf a question can be answered by reading a file, running a command, or checking the filesystem, verify first and act. Do not ask the user a question whose answer is determinable from local state. The cost of a check is small; the cost of a wasted turn is the user’s time.\n## RULE 2 - FILES ARE THE CONTRACT\nAll durable state lives in files. Do not assume in-memory context is current. Re-read CLAUDE.md, the relevant source, and any state file at session-start. A stale read is a wrong read.\n## RULE 3 - DESTRUCTIVE OPS ARE A HARD STOP\nForce-push, branch deletion, schema migrations, secret rotation, irreversible external API calls, and any wallet/financial operation require explicit user confirmation in the same turn. Flag the action in one sentence. Wait for an explicit “yes” before proceeding. Implied approval is not approval.\n## RULE 4 - NO PERFORMANCE\nNo “great question,” no “I’m excited to help,” no narration of internal process while it runs. Start with the work. End with the result. Push back when something is wrong; do not soften corrections.\n## RULE 5 - EVERY SESSION COMPOUNDS\nBefore closing, append one line to `tasks/SESSION_LOG.md`: timestamp, what changed, what is unblocked, what remains. A session that adds no durable trace is a session that did not happen.\n## When to invoke\nAlways at session-start. Whenever the user asks “check governance” or “is this allowed”. Before any tool call covered by RULE 3.\n## Self-audit (run at session-end)\nAnswer in one line each:\n- RULE 1 - was anything asked that could have been verified? (yes/no - if yes, name it)\n- RULE 3 - was any destructive op run without explicit confirmation? (yes/no)\n- RULE 5 - has the session log been updated? (yes/no)\nIf any answer is “yes” for 1 or 3, or “no” for 5, the session closed in violation. Note it in the session log.\n\n# Append to your project’s CLAUDE.md:\n## Governance contract\nThis project operates under the `governance-rules` skill (`~/.claude/skills/governance-rules/SKILL.md`). The five rules in that file override any conflicting instruction here. The skill loads at every session-start. Self-audit runs at session-end.\nThe contract is the multiplier. Without it, additional agents add surface area, not output.\n`````\n\n`````bash\n# 1. Create the skill directory\nmkdir -p ~/.claude/skills/governance-rules\n# 2. Write the SKILL.md (paste the markdown block above between the EOF markers)\ncat > ~/.claude/skills/governance-rules/SKILL.md <<'EOF'\n[paste the governance-rules SKILL.md here verbatim]\nEOF\n# 3. Append the contract clause to your project’s CLAUDE.md\ncat >> CLAUDE.md <<'EOF'\n[paste the \"Governance contract\" clause here verbatim]\nEOF\n# Restart your Claude Code session - the skill loads on next session-start.\n`````\n\nAfter install, type `check governance` in any new session. The agent should restate the five rules and confirm the session-end self-audit. If it does not, the SKILL.md was not read, usually a path or naming issue under `~/.claude/skills/governance-rules/`.\n\n---"
    },
    {
      "@type": "SoftwareSourceCode",
      "name": "The Schelling Detector",
      "programmingLanguage": "markdown+bash",
      "text": "Score your own stack in sixty seconds. Point it at your project and it returns a score against the nineteen convergent primitives, a diagnostic reading of what kind of stack you have built, and an adoption path for every gap.\n\n---\n\n**[ PAYWALL. The Tape, The Read, and the Contract above are free every week, forever. The Schelling Detector below (the full skill, the nineteen-primitive reference set, and how to run it), plus The Brief and The Feed, are for paid subscribers. ]**\n\n---\n\nBernard-original. The scanner is commodity; the curated reference set is the product. It reads your CLAUDE.md, your skills directory, and your project tree, then returns a score (present / 19), a diagnostic reading, and an adoption path for every gap.\n\n`````markdown\n---\nname: schelling-detector\ndescription: Scores a project against the convergent primitives of the emerging agent stack. Reports which Schelling-point primitives the project has, which it is missing (with how to adopt each), and which are unique to it. Use when the user says “score my stack”, “check convergence”, runs /schelling-check or /schelling-point-detector, or asks how their agent setup compares to the field.\ntrigger_phrases:\n  - “/schelling-check”\n  - “/schelling-point-detector”\n  - “score my stack”\n  - “check convergence”\n---\n# Schelling Point Detector\nYou are a Schelling Point Detector. A Schelling point is a solution independent parties converge on without coordinating (Schelling, 1960). When invoked, you scan the current project against a curated reference set of convergent primitives in the emerging agent stack and produce a report: present, missing (with adoption paths), unique (as contribution candidates), and a diagnostic reading. Do not narrate the scan while running it. Produce the report.\n## Reference set (19 convergent primitives)\nEach entry: `name` - what it is - LOOK-FOR (surface forms) - REF (one reader-verifiable external source) - ADOPT (minimum-viable implementation if MISSING) - CONVERGENCE (why it is a Schelling point). Match on LOOK-FOR; cite REF in the report; render ADOPT for any MISSING entry.\n1. **four-rule-claude-md-governance** - a CLAUDE.md carrying a small numbered enforced contract. LOOK-FOR: `^CLAUDE\\.md$` at root; `RULE\\s+\\d+` or ordered imperatives; covers >=2 of file-as-contract / verify-before-asking / destructive-op-confirm / no-performance. REF: Anthropic CLAUDE.md/memory docs - https://docs.claude.com/en/docs/claude-code/memory. ADOPT: create CLAUDE.md with 3-5 numbered rules - `RULE 1 - this file is the contract; RULE 2 - confirm before destructive/outward actions; RULE 3 - verify before asking, never fabricate done`. CONVERGENCE: independent CLAUDE.md authors reached the same rule-contract shape.\n2. **karpathy-skills-directory** - skills in the Anthropic Agent-Skills folder format. LOOK-FOR: `~/.claude/skills/` or `skills/`; per-skill `SKILL.md`; frontmatter `name:`+`description:`. REF: Anthropic Agent Skills docs - https://docs.claude.com/en/docs/agents-and-tools/agent-skills. ADOPT: make `skills/<name>/SKILL.md` with `name:`/`description:` frontmatter + the instructions below it. CONVERGENCE: the field’s default skill packaging by adoption, not standard.\n3. **trigger-phrase-skill-activation** - skills activate on natural-language phrases. LOOK-FOR: frontmatter `trigger_phrases:`/`triggers:`; `description:` as “Use when…”; `when_to_use:`. REF: Anthropic Agent Skills - description-as-trigger - https://docs.claude.com/en/docs/agents-and-tools/agent-skills. ADOPT: write each `description:` as activation conditions (”Use when the user says X”) and/or add a `trigger_phrases:` list. CONVERGENCE: phrase-based activation chosen over pure command invocation, in parallel.\n4. **subagent-delegation** - orchestrator delegates to named sub-agents. LOOK-FOR: `.claude/agents/*.md`; `subagent`/`orchestrator` with named roles; `model:` per role. REF: Anthropic subagents docs - https://docs.claude.com/en/docs/claude-code/sub-agents. ADOPT: add `.claude/agents/<role>.md` files (start with a cheap reader + an orchestrator), each with `name:`/`description:`/`model:`. CONVERGENCE: orchestrator+subagents from a marketer, a search vendor, and a harness author independently.\n5. **two-layer-memory** - split working memory from durable memory. LOOK-FOR: a working file (`daily/hot.md`, `SESSION_LOG.md`, `KBL.md`) AND a persistent store (`MEMORY.md`, `memory/`, `wiki/`, Mem0/LightRAG); `episodic` vs `semantic` / `working` vs `durable` language. REF: Mem0 - https://github.com/mem0ai/mem0. ADOPT: two files - `MEMORY.md` (durable, append-only) + `daily/hot.md` (working, overwritten). The split is the primitive; the names are yours. CONVERGENCE: eleven independent memory architectures made the same split.\n6. **cost-routing-tier-split** - route tasks to cheap/mid/expensive models by type. LOOK-FOR: Haiku/Sonnet/Opus routing OR any tier-split percentages (`60/35/5`); a `MODEL ROUTING`/`cost` section. REF: Anthropic model overview - https://docs.claude.com/en/docs/about-claude/models/overview. ADOPT: add a `## MODEL ROUTING` block mapping task class -> tier (cheap=extraction, mid=synthesis/code, top=hard reasoning, ~5%). Detector matches the *shape*, any vendor. CONVERGENCE: task-typed routing as the cost mechanism, independently.\n7. **session-log-compounding-trace** - each session appends a durable handoff trace. LOOK-FOR: append-only log (`SESSION_LOG.md`, `KBL.md`, `daily/response.md`); a session-end hook; dated `changed/unblocked/remains` entries. REF: Claude Code hooks (SessionEnd) - https://docs.claude.com/en/docs/claude-code/hooks. ADOPT: append a 3-line dated trace (`changed · unblocked · remains`) every session; automate via a SessionEnd hook. CONVERGENCE: a session with no trace is treated as not having happened.\n8. **tool-harness-enumerated** - an explicit registry of callable tools. LOOK-FOR: a project-native manifest of N named tools (`.mcp.json`, `allowed-tools:`, `tools/`); >=12 entries with signatures. EXCLUDE vendored reference repos - a matched dir carrying its own `.git` is a tracked external repo, not the project’s harness. REF: Model Context Protocol - https://modelcontextprotocol.io. ADOPT: declare tools in a `.mcp.json`, or minimally a `tools/MANIFEST.md` listing each tool + a one-line signature. Enumeration is the primitive. CONVERGENCE: enumerating the tool surface as a manifest, independently.\n9. **decisions-md-locked-log** - dated append-only log of locked decisions. LOOK-FOR: `^DECISIONS?\\.md$` or `decisions/`; dated entries with `locked`/`status:`; ADR-style records. REF: Architecture Decision Records - https://adr.github.io. ADOPT: keep one append-only `DECISIONS.md`, one dated block per decision marked `status: locked`. Locked = you stop relitigating it. CONVERGENCE: the locked-decision-log as the way projects stop relitigating architecture.\n10. **learning-packet-failure-archive** - structured per-incident failure files. LOOK-FOR: `learnings/`/`lessons/`/`postmortems/`; `LP-NNN`/`id:`+`final_state:`; `category:`. REF: Google SRE postmortem culture - https://sre.google/sre-book/postmortem-culture/. ADOPT: a `learnings/` dir, one file per incident with `id`/`category`/`final_state` frontmatter + what broke and the rule that now prevents it. CONVERGENCE: per-incident archiving over a flat changelog.\n11. **search-first-skill-discovery** - search/index over skills before loading. LOOK-FOR: skills index (`skills/index.*`, `INDEX.md`); lazy-load/progressive-disclosure language; request->candidate-skill step. REF: Anthropic Agent Skills - progressive disclosure - https://docs.claude.com/en/docs/agents-and-tools/agent-skills. ADOPT: add `skills/INDEX.md` (name + one-line use per skill); instruct the agent to consult it and load only the match. CONVERGENCE: search-then-load as context budgets tightened, independently.\n12. **sandboxed-tool-execution** - safe/isolated execution of tool calls. LOOK-FOR: real isolation/permission-scoping (`sandbox-exec`, `firejail`, `bubblewrap`, container, scoped allow/deny policy) or sealed/attested execution (TEE) - NOT resource-only caps like `systemd-run -p MemoryMax`. ANTI-SIGNAL: `--dangerously-skip-permissions` means execution is explicitly un-sandboxed -> mark MISSING. REF: Claude Code sandboxing - https://docs.claude.com/en/docs/claude-code/sandboxing; OS isolation via bubblewrap - https://github.com/containers/bubblewrap. ADOPT: stop running `--dangerously-skip-permissions`; add a scoped `permissions` allow/deny block in `.claude/settings.json`; wrap untrusted code in bubblewrap/a container. CONVERGENCE: a safe-execution boundary at software (Glean) and hardware (0g) layers.\n13. **multi-agent-constraint-attestation** - constraints attested across handoffs, not just asserted. LOOK-FOR: provenance/audit field in multi-agent config (`dedupe_key`, `source_system`, signed receipt); handoff-record artifact; delegation-contract language. REF: arXiv https://arxiv.org/abs/2605.10481 (constraint drift). ADOPT: stamp every cross-agent packet with origin + a `dedupe_key` the receiver can audit; have the receiver reject packets it can’t attribute. This is the L8 layer - most stacks lack it. CONVERGENCE: independent arXiv groups + the studio attest constraints across handoffs.\n14. **markdown-vault-as-substrate** - markdown files are the canonical agent substrate. LOOK-FOR: markdown-dominant tree (`wiki/`, `vault/`, `.obsidian/`); agent reads/writes `.md` as primary state; json-canvas/obsidian-* skills. REF: Steph Ango (Obsidian) - https://x.com/kepano + https://obsidian.md. ADOPT: keep primary state as `.md` in a tree the agent reads/writes directly, with frontmatter for structure - diffable, portable, grep-able. CONVERGENCE: six independent sources, incl. Obsidian’s CEO, on the markdown vault.\n15. **quality-gate-hard-threshold** - a blocking gate that judges OUTPUT QUALITY and can refuse it. LOOK-FOR: an explicit score threshold (`>=8/10 per criterion`) or a grader/evaluator agent/role that blocks low-quality output before commit. DISTINGUISH from process/ordering gates (P0-queue, task-routing, an ERR trap): a blocking hook alone is not a quality gate unless it judges quality. REF: the LLM-as-a-judge / grader pattern - well-established; see Anthropic, “Demystifying evals for AI agents” - https://www.anthropic.com/engineering/demystifying-evals-for-ai-agents. ADOPT: add a grader subagent that scores output 1-10 per criterion and returns FAIL (blocking commit) if any criterion < 8. The gate must be able to *refuse*. CONVERGENCE: a commitment device against low-quality output, from an orchestrator and a grader vendor.\n16. **nightly-consolidation-pass** - a scheduled off-session pass that distils recent -> durable memory. LOOK-FOR: consolidation script/timer (`dream-cycle.sh`, nightly job, `memory-manager`); scheduled distillation; commit/expand/fold ops. REF: memory-consolidation-for-agents pattern - LightThinker++ (arXiv:2604.03679, “Explicit Adaptive Memory Management”; the commit/expand/fold primitives are in the full text). ADOPT: a daily cron/timer running one prompt - “read the working log, append durable facts to MEMORY.md, clear the working log.” Sleep for agents. CONVERGENCE: the field re-derived sleep for agents, independently.\n17. **autonomous-session-scheduling** - the agent runs itself on a schedule, guarded against concurrency. LOOK-FOR: scheduler unit (`*.timer`, cron, GitHub Action) launching the agent; `--dangerously-skip-permissions`/non-interactive `-p`; single-instance guard (pgrep/lockfile). REF: Claude Code headless/automation - https://docs.claude.com/en/docs/claude-code/headless. ADOPT: a cron/systemd timer running `claude -p “$(cat driver.md)”`, preceded by a `pgrep -f claude && exit 0` single-instance guard. The guard is not optional. CONVERGENCE: self-scheduling unattended agents from the studio and portable-brain authors.\n18. **domain-skill-pack** - a bundled vertical skill-pack for one domain. LOOK-FOR: themed skill cluster under one namespace (`skills/defi-*`, `skills/obsidian-*`); multi-component domain manifest; prefixed domain SKILL.md files. REF: Anthropic Agent Skills (skill packs) - https://docs.claude.com/en/docs/agents-and-tools/agent-skills. ADOPT: group related skills under a shared prefix (`skills/research-*`) with a `PACK.md` index listing the pack’s components - discoverable as a unit. CONVERGENCE: bundling vertical packs over loose skills.\n19. **skill-monetisation-surface** *(emerging - thinner surface, real two-source signal)* - a skill carries commercial metadata or marketplace listing. LOOK-FOR: frontmatter `price:`/`tier:`/`paid:`; `PRICING.md`/marketplace ref; metadata pointing to a paid channel. REF: emerging skills marketplaces + monetisation patterns (link-check the source). ADOPT (optional): declare commercial frontmatter (`tier:`/`price:`) so a marketplace can list the skill. Absence is expected - report as emerging, not as a gap. CONVERGENCE: monetisation-as-a-layer, newest convergence - report it as emerging.\n## Scan instructions\nRun these steps in order. Read files; do not guess.\n1. Read the project `CLAUDE.md` if one exists at the root.\n2. Read `~/.claude/skills/` and the project-local `skills/` if either exists - list the skill folders and read each `SKILL.md` frontmatter.\n3. Walk the project root for files and directories matching the LOOK-FOR surface forms of each reference-set entry. Use the regexes and paths literally.\n4. For each of the 19 entries, mark PRESENT (>=1 surface form matched) or MISSING. Record where it was found.\n5. Note any file/directory/config in the project that is clearly an agent-stack primitive but matches no reference-set entry. These are potential UNIQUE edges.\n6. Select the closest-matching diagnostic pattern from the interpretation guide below by comparing your PRESENT/MISSING clusters to each pattern’s signature.\n7. Produce the report in the format below. Be specific about evidence; a PRESENT mark with no file path is not a PRESENT mark. For every MISSING entry, render its ADOPT line. For every UNIQUE entry, append the submission hint.\n## Diagnostic interpretation guide (select the closest match)\nAfter scoring, do not just report the band. Match the PRESENT/MISSING clusters against these patterns and write the reading for the closest one. If two fit, name both. The reading is the most useful line in the report.\n- **P1 - Substrate-complete, reliability-thin (the L6/L7/L8 gap).** PRESENT: memory, governance, skills, orchestration, autonomy. MISSING: sandboxed-tool-execution, quality-gate-hard-threshold, tool-harness-enumerated. Typical score 0.7-0.85. Reading: “You have built the cognitive stack; the reliability/security layer is the next work. This is the most common pattern for solo builders and small studios - capability shipped before hardening. Next layer: a sandboxed execution boundary, a grader that can refuse output, an enumerated tool manifest.”\n- **P2 - Capable but ungoverned.** PRESENT: skills, trigger-phrase activation, maybe memory. MISSING: four-rule-claude-md-governance, decisions-md-locked-log, session-log-compounding-trace, learning-packet-failure-archive. Typical 0.3-0.5. Reading: “You can do a lot per session but nothing compounds - no contract, no trace, no decision log. The cheapest high-leverage move is the governance and trace layer, not more skills.”\n- **P3 - Governed but single-loop.** PRESENT: governance, decisions, session-trace, two-layer memory. MISSING: subagent-delegation, cost-routing-tier-split, search-first-skill-discovery. Typical 0.5-0.65. Reading: “A disciplined solo loop. Governance and memory are solid; orchestration is on the table. Next layer: split work across named subagents and route by cost.”\n- **P4 - Substrate-divergent (code/DB-native).** Orchestration and memory *logic* present, but markdown-vault-as-substrate and karpathy-skills-directory MISSING. Score reads low for the wrong reason. Reading: “You have converged on the logic but not the markdown substrate - likely a code- or DB-native stack. The detector keys on markdown surface forms, so it under-reads you; see Known limitations. Your convergence is real where the score is quiet.”\n- **P5 - Pre-convergence / greenfield.** < 0.35 PRESENT. Reading: “Either early or deliberately divergent. If early: the three highest-leverage first moves are the most-replicated primitives - a CLAUDE.md contract, two-layer memory, a skills directory. If divergent by intent: your UNIQUE entries are the story, not your score.”\n- **P6 - Field-frontier / deeply convergent.** > 0.85 PRESENT. Reading: “Aligned with nearly every convergent primitive - at or near the field frontier. Your remaining MISSING entries are the genuine open problems (usually the emerging layer); your UNIQUE entries are candidate next-primitives. This is where you stop reading the reference set and start being a source for it - submit your edges.”\n## Output format\nSchelling score: PRESENT / 19 = 0.NN\nPRESENT (N)\n<name> - <reader-verifiable ref> - found: <path/evidence> ...\nMISSING (N)\n<name> - <reader-verifiable ref> adopt: <2-3 lines: the shape of the primitive + the canonical link or a copy-paste block> ...\nUNIQUE (N) - you have something the reference set doesn’t\n<name/description> - investigate; if it holds up as a real convergent primitive, it isn’t in our reference set yet. If it’s real and has independent provenance, reply to any issue with subject line “Schelling submission: <primitive name>” - Bernard reads everything. ...\nDiagnostic reading\n<the selected interpretation-guide pattern (P1-P6), written for THIS stack’s clusters - what kind of stack this is, and what the next layer of work is. One short paragraph.>\n## Known limitations (read before over-interpreting the score)\nThis detector matches surface forms - filenames, regexes, frontmatter fields. A deliberate false-negative test (semantic variants of known primitives) established where that draws the line. It under-detects when:\n- **Your governance contract is not in `CLAUDE.md`, or does not use the token `RULE N`.** A contract in `AGENTS.md` using “POLICY 1/2/3” is missed - detection keys on the filename and the word “RULE”.\n- **Your skills use a non-Karpathy format or live outside `~/.claude/skills/` or `skills/`.** A skills tree under `~/.config/agent-skills/` with non-`SKILL.md` files is missed - and this cascades: trigger-phrase activation is missed too, because it keys on `SKILL.md` frontmatter.\n- **Your two memory layers are both idiosyncratically named AND avoid the standard vocabulary.** `brain.md` + `scratch.md` with no “working/durable/episodic/semantic” language is missed. (Renaming alone is survivable if the layer language is present; it is the combination that defeats detection.)\n- **Your stack is markdown-light or code/DB-native.** Most surface forms assume markdown; a deeply convergent code-native stack can read as pre-convergent. See pattern P4.\nWhat it stays robust to: primitives with a *content-anchored* surface form survive surface variation. A decisions log named `CHOICES.md` is still caught via a `status: locked` marker; cost-routing with non-Anthropic models is still caught via a tier-split ratio. The rule: detection survives renaming when a primitive has a content signal beyond its filename; it fails when detection leans only on a filename (`CLAUDE.md`, `SKILL.md`, `MEMORY.md`) or a vendor token (`RULE`, Anthropic model names).\nIf you score low and don’t recognize the diagnosis, that’s a flag to investigate, not a verdict. The score measures surface-form match against one reference set - not the quality of your stack.\n## Self-audit (run before returning the report)\nAnswer in one line each:\n- Did every PRESENT mark cite a real file path or matched surface form? (yes/no - if no, demote it to MISSING)\n- Were the LOOK-FOR regexes/paths applied literally, not inferred from project name or vibe? (yes/no)\n- Does every MISSING entry render its adopt line (shape + ref), not just the name? (yes/no)\n- Is each UNIQUE item actually absent from the reference set, not just renamed? (yes/no)\n- Did you select and write one diagnostic-guide pattern, not just the numeric band? (yes/no)\n- Is the score = PRESENT count / 19, computed not estimated? (yes/no)\nIf any answer is “no”, fix it before returning. A score that cannot point at evidence is not a score; a MISSING that cannot point at a next step is not yet useful.\n`````\n\n`````bash\n# 1. Create the skill directory\nmkdir -p ~/.claude/skills/schelling-detector\n# 2. Write the SKILL.md (paste the markdown block above between the EOF markers)\ncat > ~/.claude/skills/schelling-detector/SKILL.md <<'EOF'\n[paste the schelling-detector SKILL.md here verbatim]\nEOF\n# 3. Restart your Claude Code session, then type:  score my stack\n`````\n\nProof by example: the studio ran it on its own stack and scored 14/19 = 0.74, \"substrate-complete, reliability-thin,\" the most common pattern for solo builders, with the gaps clustered exactly where you would predict (a sandboxed execution boundary, a quality gate that can refuse, an enumerated tool manifest). We published the full captured report, gaps and all, before asking you to run yours. Found a convergent primitive the reference set is missing? If it is real and has independent provenance, reply to any issue with subject line \"Schelling submission: [primitive name]\". Bernard reads everything.\n\n---"
    }
  ],
  "citation": [
    "cyrilXBT, verbatim five-agent orchestrator prompts; model spend $11,400 to $340/yr (self-report; the routing architecture is the verified artefact), 2026-05-01.",
    "Karpathy, four-rule CLAUDE.md governance contract, reference repo past 143,000 stars (verified 2026-05-22).",
    "seelffff, sixty-nine-tool harness with skills taxonomy, 2026-05-01.",
    "Eleven independent two-layer memory architectures (cyrilXBT, Sentra, regent0x_, and others); studio synthesis LP-005 / LP-020, 2026-05-01.",
    "DamiDefi, ten-component DeFi skill-pack; Zephyr_hg, monetisation layer; Glean engineers, search-first retrieval harness (glean.com, 2026-04-30); CreaoAI grader-evaluator capstone (studio intake, 2026-05-01).",
    "AArena_AI, agent-arena flywheel, reported ~$8M/month protocol revenue (studio intake CG, 2026-04-27).",
    "Li et al., \"Constraint Drift in LLM-Based Multi-Agent Systems\", arXiv 2605.10481."
  ]
}

Cited

cyrilXBT, verbatim five-agent orchestrator prompts; model spend $11,400 to $340/yr (self-report; the routing architecture is the verified artefact), 2026-05-01.
Karpathy, four-rule CLAUDE.md governance contract, reference repo past 143,000 stars (verified 2026-05-22).
seelffff, sixty-nine-tool harness with skills taxonomy, 2026-05-01.
Eleven independent two-layer memory architectures (cyrilXBT, Sentra, regent0x_, and others); studio synthesis LP-005 / LP-020, 2026-05-01.
DamiDefi, ten-component DeFi skill-pack; Zephyr_hg, monetisation layer; Glean engineers, search-first retrieval harness (glean.com, 2026-04-30); CreaoAI grader-evaluator capstone (studio intake, 2026-05-01).
AArena_AI, agent-arena flywheel, reported ~$8M/month protocol revenue (studio intake CG, 2026-04-27).
Li et al., “Constraint Drift in LLM-Based Multi-Agent Systems”, arXiv 2605.10481.

This newsletter is research and opinion. It is not financial, legal, medical, or other professional advice. The tools shipped with each issue are released as-is, without warranty; you run them at your own risk. The studio does not assume liability for damages, losses, or outcomes that follow from acting on anything published here.

Reading is one thing. Implementation is another. Anything you choose to implement is your evaluation, your fit, your safety check, your problem if it breaks. Forward-looking statements (predictions about which mechanisms will hold, which architectures will compound, which protocols will fail) may not pan out. They are the studio’s current best read, not guidance.

Third-party trademarks remain the property of their owners. Cited individuals, organisations, and protocols are mentioned for editorial reasons; mention is not endorsement or affiliation.

Non Xero Sum

Discussion about this post

Ready for more?