drop another .md file to compare - side-by-side diff against snappy-subagents

snappy-subagents

Puts several helpers to work in parallel and gathers their results.

personal 2 files 6 recent evals

Export

What it does for you

Puts several helpers to work in parallel and gathers their results.

What it produces

A recent result, so you can see the kind of work it returns.

loading…

How to get it

These run inside the Snappy workspace. Want this working in your business? I set skills like this up with you, in one focused week.

Work with me

For developers how this skill is built, graded, and how it runs

at a glance- the short version

actorThing that

auditorThing that judges (must be different - see CONSTITUTION

eval modeauto-shape

categorySystem

stages5

what's inside - the parts that make up a skill 3/4 present

A skill is just a few plain-text files. Only the main one is required. The rest are optional, added as the work needs them. This is what the skill is made of; how it runs is just below.

The skill

state/skills/snappy-subagents/SKILL.md present

the skill itself, in plain text

The main file. It says what the skill is and lays out the steps in plain English.

Code

state/skills/snappy-subagents/api.ts present

code the skill can run

Reusable code this skill can call when it needs to.

Scripts

state/bin/snappy-subagents/ not present

helper scripts

Optional. Added when a skill has a few commands to run.

Loader

state/skills/snappy-subagents/AGENTS.md present

what the AI loads on the fly

Loaded automatically the moment this skill is needed. Kept short on purpose.

how it runs - the shared frame every skill uses 5/5 present

Every skill runs the same way. One part does the work, a separate part checks it, and a short loader hands the AI exactly what it needs for the job. Anything this skill doesn't use shows a one-line note saying why, on purpose, not by accident.

makes the work The worker

present

Thing that the worker

Does the actual work. Whatever it produces is what gets checked next.

checks the work The reviewer

present

Thing that judges (must be different - see CONSTITUTION the checker

A separate checker grades the work, so the part that made it can't approve its own work.

frame

learns Self-correction

present

fixes itself learns from gaps

When a run hits a gap, the skill gets edited on the spot [FIXED] or queued for a bigger rewrite [LOGGED], so it keeps getting better.

tidies up Background fixes

present

queued for rewrite runs in the background

Bigger fixes that can't be made on the spot get queued and rewritten in the background later.

remembers Run history

present

state/log/evals.ndjson auto-shape runs

Every run is written down here, so the next time this skill is used it already knows how the last runs went.

Critical rules the things this skill must not get wrong

No must-not-break rules called out for this skill. Anything important lives in the writeup below.

what it has learned - fixes written back in over time sample

When a run hits something this skill didn't handle, the fix gets written back into the skill so it doesn't happen again. FIXED means it was corrected on the spot. LOGGED means it's queued for a bigger rewrite. Either way, the skill gets a little better and never makes the same mistake twice.

Loading feedback rows…

how the work flows- who makes it, who checks it

actor Thing that

1 generator

Brief each subagent with a closed PiD

- Every subagent prompt MUST mandate a writeback line before exit:

what this step does

- Every subagent prompt MUST mandate a writeback line before exit: echo "[$(date -u +%FT%TZ)] <slug>: <gap or fix> [FIXED|LOGGED]" >> state/log/loader-feedback.log. No silent return edges — see memory feedback_subagent_prompts_must_close_pid. - Even if the work is fully satisfied with no edits, the subagent must still emit a [FIXED] line confirming proof-of-life — silence trips agi-loop-validator.ts (see cockpit-layout/AGENTS.md Rule on consumer no-op). - Tell the subagent wh

read in prose ↗+ eval for this step

2 generator

Drain on return — orchestrator IS the live drain

- After each subagent returns, read its writeback line from

what this step does

- After each subagent returns, read its writeback line from state/log/loader-feedback.log and elevate the rule into the right loader IN-SESSION (don't wait for the cron Stop-hook drain). The next dispatch in the same session benefits — see memory feedback_orchestrator_is_live_drain. - If the subagent claimed [FIXED] but you don't see the file edit, the subagent lied — re-dispatch with a stricter prompt and log a pid-validator [LOGGED] line.

read in prose ↗+ eval for this step

auditor Thing that judges (must be different - see CONSTITUTION

3 data

Log + eval

```typescript

read in prose ↗+ eval for this step

SKILL.md- the skill, written out in plain English

snappy-subagents

Launch and coordinate parallel subagents in snappy-os. The PiD chain (verb → agent → loader → skill) only learns when each join writes back - this skill is the prose that tells an orchestrator how to dispatch work, how to keep N concurrent commits from clobbering each other, and how to close the loop with a [FIXED|LOGGED] line so the next dispatch starts sharper. Failure mode prevented: parallel agents that silently overwrite each other's work, or worse, return success without a writeback so the loader stays blind to gaps the agent already discovered.

Observed user requests

These are the prompts that triggered this skill being scaffolded:

"/loop 10 minutes make sure you keep launching subagents and keep pushing them lets see if we can achieve homemade AGI"
"Ok great please commit and push and then I need you to first tell me ok so all of these subagents that have been working"
"Tell me what you think, because it seems to me this makes sense, but I just want to make sure that with this, can we ach"
"before you do it can you please look up online first and learn just to make sure we benefit from what others have alread"
"this is kind of nuts are you telling me we actually achieved our 'homemade agi'?"

The Steps section below is a first-draft response to this cluster - refine as eval scores accumulate.

Steps

1. Decide the fan-out shape

One subagent per independent unit of work. If two tasks touch the

same file, queue them sequentially in one subagent - don't fan out.

Prefer the in-session Task tool (Anthropic harness) over headless

claude -p for attended work. Headless is for Stop-hook drains, scheduled cron, or when the parent is about to exit. Both inherit the same memory; the subagent will read this loader on its first prompt.

2. Brief each subagent with a closed PiD

Every subagent prompt MUST mandate a writeback line before exit:

echo "[$(date -u +%FT%TZ)] <slug>: <gap or fix> [FIXED|LOGGED]" >> state/log/loader-feedback.log. No silent return edges - see memory feedback_subagent_prompts_must_close_pid.

Even if the work is fully satisfied with no edits, the subagent must

still emit a [FIXED] line confirming proof-of-life - silence trips agi-loop-validator.ts (see cockpit-layout/AGENTS.md Rule on consumer no-op).

Tell the subagent which loader is its setpoint (state/skills/<slug>/AGENTS.md)

so a P-fix lands in the right file, not a generic note.

3. Race-safe shared worktree (mandatory when N > 1)

NEVER git add -A, git add ., git stash -u, or git commit -a -

they sweep other agents' WIP. Use explicit pathspec on commit: git commit -m "msg" -- <your-paths>. The pathspec on commit closes the TOCTOU window between git add and git commit.

After git pull --rebase origin main, re-Read your files before Edit -

the rebase silently rewrites the worktree. If a stash-pop conflict resolved by dropping your work, recover via git checkout stash@{0} -- <files>.

Each parallel xcodebuild MUST set -derivedDataPath /tmp/snappy-shell-build-<batch>

to avoid colliding artifacts. Cockpit-layout/AGENTS.md Rule 5 has the full incident catalog; cite it.

4. Drain on return - orchestrator IS the live drain

After each subagent returns, read its writeback line from

state/log/loader-feedback.log and elevate the rule into the right loader IN-SESSION (don't wait for the cron Stop-hook drain). The next dispatch in the same session benefits - see memory feedback_orchestrator_is_live_drain.

If the subagent claimed [FIXED] but you don't see the file edit,

the subagent lied - re-dispatch with a stricter prompt and log a pid-validator [LOGGED] line.

5. Log + eval

import { score } from "../../lib/eval";
score("snappy-subagents", run_id, {
  score: <1.0 if every subagent wrote back AND no clobber AND every reported FIX is on disk
         | 0.5 if writebacks landed but at least one needed re-dispatch
         | 0.0 if any subagent returned silently OR clobbered another agent's commit>,
  primary_issue: <null | "silent-return" | "clobber" | "lied-fix" | ...>,
});

Eval

Actor: the thing that produces the output (a dispatch model, a CLI, or the state/lib/snappy-subagents.ts library if one exists). Auditor: the thing that judges (must be different - see CONSTITUTION invariant #3). Name both explicitly.

Score convention:

Outcome	Score
Pass on first try	1.0
Failed first, auto-fix applied, re-check passed	0.5
Still failing or unrecoverable	0.0

If you cannot name a deterministic auditor, switch the frontmatter to eval: manual and log to state/log/pending-eval.ndjson - but fight to avoid manual. Manual is the escape hatch that leaks the thesis.

Gotchas

List concrete failure modes you hit while building. Examples: "API

returns 200 with empty body"; "round-tripped text has a , artifact where em-dashes used to be"; "cache file is stale after 12h."

If a gotcha matches an entry in user memory (see ~/.claude/CLAUDE.md),

cite the memory name so future agents can follow the trail.

AGENTS.md- what the AI loads when this skill comes up

snappy-subagents - loader

Per-turn rules for fan-out + coordination. Full skill: state/skills/snappy-subagents/SKILL.md (5 Steps: fan-out shape → brief → race-safe → drain on return → log+eval). The PiD chain only learns when each agent writes back.

Critical Rules

One subagent per independent unit. Same-file work must sequence in ONE subagent. Concurrent same-file = git clobber.
Prefer in-session Task over headless claude -p. Headless: Stop-hook drains, cron, parent exit only. Task: attended loop.
EVERY subagent MUST emit writeback before exit. Even no-ops. Mandatory: echo "[$(date -u +%FT%TZ)] <slug>: <gap or fix> [FIXED|LOGGED]" >> state/log/loader-feedback.log. Silence = hung consumer.
Tell subagent its AGENTS.md setpoint. Specify state/skills/<slug>/AGENTS.md as P-fix target so edits land correctly.
Pathspec commits only. NEVER: -A, -a, -am, -u, stash -u. Use: git commit -m "msg" -- <paths>.
Re-Read files after git pull --rebase. Rebase silently rewrites. Recover dropped work: git checkout stash@{0} -- <files>.
Parallel xcodebuild: -derivedDataPath /tmp/snappy-shell-build-<batch> prevents artifact collisions.
Never claude -p from attended if git active. Auto-regen child can clobber. Defer to Stop-hook.
FORBIDDEN paths in every brief. Subagent brief MUST list files it must NOT touch (e.g. App.tsx, styles.css). Pattern: FORBIDDEN: web/src/App.tsx, web/src/styles.css in heading.
Orchestrator = live drain. Read writeback on return, elevate rules IN-SESSION (not Stop-hook).
Honest [FIXED] claims must match disk. Missing edit = lie. Re-dispatch stricter. Actor ≠ auditor (CONSTITUTION #3).
Subagent model floor: model: "sonnet". Haiku too weak for UI/Swift parity. Sonnet default. Haiku only for trivial data tasks.
screencapture -x captures active screen. Non-attended: activate app first: osascript -e 'tell app "SnappyChat" to activate'.
Concurrent build-app.sh blocks chat-drive tests. Lock builds or accept test contamination.

Commands

| ui dashboard | state/skills/snappy-subagents/resources/ui.openui |

operation	command
safe-commit	`git commit -m "msg" -- <pathspec>` (NEVER `-a`, `-am`, `-A`)
safe-stash	`git stash push -- <your-paths>` (NEVER `-u` in shared)
recover-stash	`git checkout stash@{0} -- <files>`
detect-rebase-revert	`wc -l <files>` or `git diff <ref> -- <files>`
parallel-xcode	`xcodebuild -derivedDataPath /tmp/snappy-shell-build-<batch> ...`
writeback	`echo "[$(date -u +%FT%TZ)] <slug>: <body> [FIXED\	LOGGED] action_kind=<kind>" >> state/log/loader-feedback.log`
score	`import { score } from "../../lib/eval"; score("snappy-subagents", run_id, { score, primary_issue });`
reference	`state/skills/snappy-subagents/SKILL.md`
logs	`state/log/evals.ndjson` (eval) \	`state/log/loader-feedback.log` (writebacks)

Score Convention

outcome	score
Every subagent wrote back AND no clobber AND every `[FIXED]` claim on disk	1.0
Writebacks landed but ≥1 needed re-dispatch	0.5
Any subagent silent OR clobbered another commit OR lied `[FIXED]`	0.0

Self-Test

[ ] One subagent per independent unit; same-file work sequential?
[ ] Prefer in-session Task over headless claude -p?
[ ] Every subagent writes writeback, even no-ops?
[ ] Tell subagent its AGENTS.md setpoint?
[ ] Pathspec commits only (-- <paths>), never -a/-A?
[ ] Re-Read files after rebase before Edit?
[ ] Parallel xcodebuild: -derivedDataPath /tmp/snappy-shell-build-<batch>?
[ ] Never nested claude -p from attended context?
[ ] Every brief lists explicit FORBIDDEN paths?
[ ] Read writebacks in-session, elevate rules immediately?
[ ] Verify [FIXED] claims on disk?
[ ] Agent/Task calls: model: "sonnet" (floor) - haiku only for trivial data?
[ ] Activate app before screencapture -x in non-attended context?
[ ] Concurrent builds locked or test contamination logged?

Self-correcting loader (PID feedback)

This is the rule that makes loaders learn. Ported from snappy-kernel, where it demonstrably worked: the snappy-dom-cartographer skill mapped Skool admin UI once, every later ship-agent inherited the full map for free.

Before you finish, do two things:

1. Fix gaps (P - proportional). MANDATORY. If this loader didn't cover your case - if you had to read another file, run an undocumented command, work around a wrong selector, or discover a quirk - you MUST attempt an Edit to this AGENTS.md before you log.

Default action: edit this file. Surgical. Add the missing line, table

row, or code example where it belongs. No restructuring.

Only LOGGED is allowed when: the fix needs >10 lines, spans multiple

files, or requires a structural rewrite. In that case the state/regen/drain.sh queue picks it up asynchronously.

"I didn't have time" / "it's minor" / "the next agent will figure it out"

are NOT valid reasons. Minor gaps compound - that's the whole point of this loop.

Concrete example of a valid P-fix: discover agent-browser screenshot

takes path positionally not as --output <path>. One line in a table. Edit it in. Do not just log it.

The goal: the next agent never has to leave the loader.

2. Log the result.

echo "[$(date -u +%FT%TZ)] snappy-subagents: <what was missing or fixed> [FIXED|LOGGED] action_kind=<kind>" >> state/log/loader-feedback.log

<slug> MUST be the literal folder name of this loader

(state/skills/<slug>/AGENTS.md). The class token between [ts] and : is the producer slug, the writeback class, AND the grade class - they must be equal so state/lib/controller-tune.ts can pair the brief.

FIXED = you patched this loader inline (P-fix).
LOGGED = too large for inline; the PostToolUse enqueue + Stop-hook drain

will rewrite the loader from scratch on next session-end.

action_kind is the SECOND pairing predicate. Pick the value that describes what you actually did:
shape-ok - frontmatter-shape verification only
skill-ran - the skill ran end-to-end with eval row in state/log/evals.ndjson
loader-rewritten - you EDITED this AGENTS.md inline (the FIXED case), OR regen rewrote it
pattern-elevated - you promoted a recurring failure to a Critical Rule

Do not skip this. Every agent run must leave the system better than it found it. The loader is the setpoint; you are the sensor; the gap is the error signal; closing the gap is the correction.

OpenUI Resource

Skill-owned OpenUI Lang resource: state/skills/snappy-subagents/resources/ui.openui. Read it before rendering or editing this skill's generated component surface.
Treat this resource as a first-class artifact of the skill, not a generic chat response. Improve it when the skill's user-facing output needs to become richer.
System resources compose OpenUI primitives and inherit SnappyChat tokens. Use ui_contract: branded in SKILL.md only for deliberate platform or client visuals.

api.ts- the code it can call

#!/usr/bin/env npx tsx
/**
 * state/skills/snappy-subagents/api.ts — sidecar stub for the snappy-subagents skill.
 *
 * This file is created by the scaffolder so a fresh skill folder is
 * structurally valid. Replace the placeholder with the real implementation
 * the moment the skill needs executable logic, OR move the implementation
 * to `state/lib/snappy-subagents.ts` (preferred — the lib path is what
 * `eval: shape` validates against).
 *
 * If this skill has no backing code (prose-only slash command), delete this
 * file and rely on `eval: auto-shape` in SKILL.md.
 */
export const SKILL_NAME = "snappy-subagents" as const;

export function describe(): string {
  return "The subagents skill — purpose TBD.";
}

if ((() => { try { return import.meta.url === `file://${process.argv[1]}`; } catch { return false; } })()) {
  console.log(JSON.stringify({ skill: SKILL_NAME, describe: describe() }, null, 2));
}

scripts- helper scripts it can run

prose-only skill - 1 inline code block live in SKILL.md above (no state/bin/ sidecar yet).

how we check it- the checks, plus the last 6 runs

rubric auto-shape no rubric declared

recent mean 1.00 · 6 runs actor/auditor: unverifiable

deps none declared

timestamp	verb	score	primary_issue	artifact
2026-04-29 17:59Z	-	1.00	-	-
2026-04-27 16:54Z	-	1.00	-	-
2026-04-29 17:59Z	-	1.00	-	-
2026-04-27 16:54Z	-	1.00	-	-
2026-04-29 17:59Z	-	1.00	-	-
2026-04-27 16:54Z	-	1.00	-	-