snappy-os — the full stack, named honestly

This is the total description. Every layer has one on-disk home. If a word here isn't backed by a file path, that's a bug in this doc — flag it and fix it. Written 2026-04-22 after repeated drift into taxonomy words that didn't match the code.

Ground rules for this document:


1. The atom: one skill

A skill has four parts on disk. They support each other. Any one can be absent if the skill doesn't need it — a pure-prose skill is fine with just the .md. A skill that calls APIs needs the .ts. A skill that loads every turn needs the .agents.md.

1a. state/skills/<name>.md — the Skill (Anthropic primitive)

The canonical file. Matches the Anthropic Agent Skills spec: YAML frontmatter (name, description, optional category, triggers, etc.) + progressive-disclosure body. This is the file a user could copy out of snappy-os and drop into any Anthropic-spec consumer (Claude Code, agentskills.io reader, Cloudflare Marketplace).

In snappy-os the folder-based SKILL.md layout is flattened to one file per skill (see state/index.md). 135 of these currently. This is the portable atom.
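As a rough illustration of the file shape described above, the sketch below splits a skill file into its frontmatter fields and its body. It is not the snappy-os reader: it assumes the frontmatter is flat `key: value` lines between two `---` lines (a real consumer would use a YAML parser), and the name `splitSkill` and the sample skill are hypothetical.

```typescript
// Sketch only: split a skill file into frontmatter fields and body.
// Assumption: frontmatter is flat `key: value` lines between two `---` lines.
interface ParsedSkill {
  meta: Record<string, string>;
  body: string;
}

function splitSkill(text: string): ParsedSkill {
  const lines = text.split("\n");
  if ((lines[0] ?? "").trim() !== "---") {
    return { meta: {}, body: text }; // no frontmatter: whole file is body
  }
  const end = lines.findIndex((line, i) => i > 0 && line.trim() === "---");
  if (end === -1) throw new Error("unterminated frontmatter");

  const meta: Record<string, string> = {};
  for (const line of lines.slice(1, end)) {
    const colon = line.indexOf(":");
    if (colon === -1) continue; // skip lines this sketch cannot parse
    meta[line.slice(0, colon).trim()] = line.slice(colon + 1).trim();
  }
  return { meta, body: lines.slice(end + 1).join("\n").trim() };
}

// Hypothetical skill file, invented for this example.
const sample = [
  "---",
  "name: docs",
  "description: Read and edit documents",
  "---",
  "# docs",
  "Run the library module to list documents.",
].join("\n");

const parsed = splitSkill(sample);
console.log(parsed.meta.name, "|", parsed.body.split("\n")[0]);
```

Nested keys such as a triggers list would need a real YAML parser; the sketch only shows that name and description sit in the header and everything else is body.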

1b. state/lib/<name>.ts — api.ts (the library module)

This is the TypeScript module the skill's prose refers to when it says "run npx tsx state/lib/<name>.ts <verb>." Internally, every one of these files headers itself as snappy-<name>/api.ts. Example, verbatim from the tree:

So the word the code uses for itself is api.ts. The filename on disk is the skill slug (docs.ts, mine.ts) because a flat state/lib/ directory is easier to scan than state/lib/docs/api.ts × 101 directories. But inside, each file:

101 api.ts modules currently in state/lib/. Includes substrate modules (eval.ts, env.ts, log.ts, dispatch.ts, agents.ts) and per-skill modules (docs.ts, mine.ts, ffmpeg.ts, drive.ts, freshbooks.ts, ...).

When you read api.ts in any snappy-os doc or message, it means: state/lib/<slug>.ts, the TypeScript library module for that skill, shebanged and dual-purpose (importable AND runnable). Not a "runner," not a "service layer," not a "sidecar" in the abstract. api.ts.
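The "importable AND runnable" shape can be sketched roughly as below. This is a hypothetical module, not one from the tree: the verbs `count` and `upper` are invented, and the shebang line and the entry-point guard that a real api.ts would carry are left out so the sketch stays runtime-neutral. The point is only that the same exported functions back both the import surface and the CLI verbs.

```typescript
// Sketch of the dual-purpose shape: exported functions plus a CLI verb dispatcher.
// Hypothetical module; the verbs below are invented for illustration.

export function count(text: string): number {
  return text.split(/\s+/).filter((word) => word.length > 0).length;
}

export function upper(text: string): string {
  return text.toUpperCase();
}

type VerbHandler = (args: string[]) => string;

// CLI verbs map onto the same functions other modules would import.
const verbs = new Map<string, VerbHandler>([
  ["count", (args) => String(count(args.join(" ")))],
  ["upper", (args) => upper(args.join(" "))],
]);

// `argv` is the argument list after the script name, e.g. ["count", "a", "b"].
export function dispatch(argv: string[]): string {
  const [verb, ...rest] = argv;
  const handler = verb === undefined ? undefined : verbs.get(verb);
  if (!handler) {
    const known = Array.from(verbs.keys()).join(", ");
    throw new Error(`unknown verb: ${verb ?? "(none)"}; expected one of ${known}`);
  }
  return handler(rest);
}

console.log(dispatch(["upper", "hello"]));
```

A caller that imports the module uses `count` and `upper` directly; a CLI invocation would hand its arguments to `dispatch`.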

1c. state/bin/<name> or state/bin/<name>/* — scripts

Separate from api.ts. Where api.ts is a library (TypeScript, importable, internally dispatches CLI verbs), state/bin/ is where shell scripts and standalone scripts live. 64 entries currently. Two shapes:

Why two places? state/lib/<name>.ts is for things meant to be imported by other api.ts modules. state/bin/<name>/* is for operational scripts — one-shots, crons, glue shell. Same skill can have both: api.ts in lib for the library surface, state/bin/<skill>/ for the ops scripts that call it.

This is the split the prose-sidecar-drift lint enforces: a skill with ≥3 executable commands in prose needs either state/bin/<slug>/* or state/lib/<slug>.ts. Not both required; at least one.
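The rule can be written down as a small predicate. The sketch below is an assumption about the rule's logic only, not the lint's code: how the real lint counts "executable commands in prose" is not shown here, so the count arrives as an input, and the names `SkillFacts` and `sidecarDrift` are hypothetical.

```typescript
// Sketch of the rule: >= 3 executable commands in prose requires a sidecar
// in state/bin/<slug>/ or state/lib/<slug>.ts (at least one, not both).
interface SkillFacts {
  slug: string;
  proseCommandCount: number; // assumption: counted elsewhere by the lint
  hasBin: boolean;           // state/bin/<slug> or state/bin/<slug>/* exists
  hasLib: boolean;           // state/lib/<slug>.ts exists
}

const COMMAND_THRESHOLD = 3;

// Returns a human-readable finding, or null when the skill passes.
function sidecarDrift(skill: SkillFacts): string | null {
  if (skill.proseCommandCount < COMMAND_THRESHOLD) return null;
  if (skill.hasBin || skill.hasLib) return null;
  return (
    `${skill.slug}: ${skill.proseCommandCount} commands in prose but no ` +
    `state/bin/${skill.slug}/ or state/lib/${skill.slug}.ts`
  );
}

console.log(sidecarDrift({ slug: "demo", proseCommandCount: 4, hasBin: false, hasLib: false }));
```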

1d. state/skills/<name>.agents.md — the loader

Short per-turn context the agent sees when the skill is relevant. Inspired by Vercel's AGENTS.md pattern (the Dec 2025 Vercel blog post: "always-loaded AGENTS.md outperforms on-demand skills, 100% vs 79% pass rate"). snappy-os adapts this by making the loader keyword-gated (only injected when the agent's prompt touches the skill's triggers) instead of always-on. A skill without an .agents.md just doesn't get preloaded — the agent has to read the full .md to learn it.

The loader is the short path. The .md is the long path. Same content distilled.
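Keyword gating can be sketched as a filter over the skills' triggers. The matching rule below (case-insensitive substring) is an assumption; the real hook is a shell script and its matching logic is not reproduced here. The skill names and triggers are invented for the example.

```typescript
// Sketch: decide which loaders to inject for a given prompt.
// Assumption: a trigger matches when it appears in the prompt, ignoring case.
interface LoaderEntry {
  name: string;       // skill slug
  triggers: string[]; // keywords from the skill's frontmatter
}

function shouldInject(prompt: string, triggers: string[]): boolean {
  const haystack = prompt.toLowerCase();
  return triggers.some((trigger) => {
    const needle = trigger.trim().toLowerCase();
    return needle !== "" && haystack.includes(needle);
  });
}

function loadersFor(prompt: string, skills: LoaderEntry[]): string[] {
  return skills.filter((s) => shouldInject(prompt, s.triggers)).map((s) => s.name);
}

// Hypothetical skills, for illustration only.
const skills: LoaderEntry[] = [
  { name: "ffmpeg", triggers: ["video", "transcode"] },
  { name: "docs", triggers: ["document"] },
];

console.log(loadersFor("Transcode this video to mp4", skills));
```

A skill with no loader never appears in this list, which matches the text: the agent then has to read the full .md.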

1e. summary table for one skill

| Part | Path | What it is | Used by |
|---|---|---|---|
| Skill (the atom) | state/skills/<name>.md | Anthropic-spec markdown + YAML frontmatter | Any Anthropic-compatible reader; the long path |
| api.ts | state/lib/<name>.ts | TypeScript library module with shebang, exports + CLI dispatch | Other api.ts imports; npx tsx CLI |
| scripts | state/bin/<name>/* or state/bin/<name>.{sh,ts} | Shell / standalone scripts for ops | Crons, hooks, user invocations |
| Loader | state/skills/<name>.agents.md | Short keyword-gated per-turn context | The snappy-os-inject.sh hook |

Absent layers are fine. A PID-rule skill might have only .md + .agents.md. A pure library skill might have only .md + api.ts.


2. The cross-cutting parts (same across all skills)

2a. state/regen/footer.md — the self-correction rule

One file. Appended to every regenerated loader. Reads (verbatim excerpt):

  1. Fix gaps (P — proportional). MANDATORY. If this loader didn't cover your case — if you had to read another file, run an undocumented command, work around a wrong selector, or discover a quirk — you MUST attempt an Edit to this .agents.md before you log. Only LOGGED is allowed when: the fix needs >10 lines, spans multiple files, or requires a structural rewrite. In that case the state/regen/drain.sh queue picks it up asynchronously.
  2. Log the result. echo "[$(date -u +%FT%TZ)] <skill-name>: <what was missing or fixed> [FIXED|LOGGED]" >> state/log/agents-md-feedback.log

This is the PID loop's setpoint. The loader is the setpoint, the agent is the sensor, the gap is the error signal, closing the gap is the correction. Ported from snappy-kernel's dom-cartographer skill where it demonstrably worked (that skill mapped Skool admin UI once; every later ship-agent inherited the map for free).
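For code that writes the same feedback line from TypeScript instead of the shell echo, a minimal formatter might look like the sketch below. The function name `feedbackLine` is hypothetical; the line shape and the second-resolution UTC timestamp follow the footer's `date -u +%FT%TZ` format.

```typescript
// Sketch: build one line in the agents-md-feedback.log format quoted above.
type Outcome = "FIXED" | "LOGGED";

function feedbackLine(
  skill: string,
  note: string,
  outcome: Outcome,
  now: Date = new Date(),
): string {
  // toISOString() includes milliseconds; drop them to match `date -u +%FT%TZ`.
  const ts = now.toISOString().replace(/\.\d{3}Z$/, "Z");
  return `[${ts}] ${skill}: ${note} [${outcome}]`;
}

console.log(feedbackLine("docs", "added missing selector", "FIXED"));
```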

2b. state/hooks/* — the wiring

12 shell scripts. These are what make the loaders fire. The entry point is state/hooks/snappy-os-inject.sh — the hook body. Claude Code and Codex each have their own wiring:

Other hooks in state/hooks/:

Parity is measured, not claimed. 62 rows in state/log/parity.ndjson as of 2026-04-18:

npx tsx state/lint/parity-test.ts refreshes.


3. Feedback ledgers (write side of the PID loop)

Every skill run emits exactly one row: either an eval or a friction, never both. These rows are feedback for the next agent. They are not a dashboard. program.md §6 is explicit about this: "The row is feedback for the next agent, not a dashboard."

3a. state/log/evals.ndjson — eval rows

Written by score() in state/lib/eval.ts. Shape, per row:

skill, score (0.0 / 0.5 / 1.0 — no floats between), actor_session_id,
auditor_session_id, ts, run_id, primary_issue?, writer_id?, touched?, ...
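Read as a type, the row shape might look like the sketch below. This is not copied from state/lib/eval.ts: the field names come from the list above, but the field types (for example `touched` as a string array) are assumptions, and the trailing "..." fields are left out.

```typescript
// Sketch of the eval row shape; types are assumed, not taken from eval.ts.
type Score = 0 | 0.5 | 1; // no floats between

interface EvalRow {
  skill: string;
  score: Score;
  actor_session_id: string;
  auditor_session_id: string;
  ts: string;     // assumption: ISO-8601 timestamp
  run_id: string;
  primary_issue?: string;
  writer_id?: string;
  touched?: string[]; // assumption: list of touched paths
}

// Runtime guard for the three allowed score values.
function isScore(value: unknown): value is Score {
  return value === 0 || value === 0.5 || value === 1;
}

console.log(isScore(0.5), isScore(0.7));
```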

Read back by:

3b. state/log/frictions.ndjson — friction rows

Failure rows. Written by friction() in state/lib/eval.ts. A row carries area, severity (P0/P1/P2), surface (file path the gap was found in), expected, actual, repro command, and fix. The prose-sidecar-drift lint writes friction rows here; so do the other lints.

3c. state/log/diagnostics.ndjson — quarantined evals

Skills listed in state/lib/eval-quarantine.json are diagnostic harnesses (contract-test, view-toggle stub) — they emit fixed-shape rows that would pollute real trends. score() auto-routes them here so evals.ndjson stays a real-skill signal. (Audit P1-8, 2026-04-19.)
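The routing decision reduces to a set lookup. The sketch below assumes the quarantine file yields a plain list of skill names; its real shape is not shown in this document, and the function name `ledgerFor` and the skill name `contract-test` used as a key are illustrative.

```typescript
// Sketch: pick the ledger file for an eval row.
// Assumption: the quarantine list is a set of skill names.
const EVALS_LEDGER = "state/log/evals.ndjson";
const DIAGNOSTICS_LEDGER = "state/log/diagnostics.ndjson";

function ledgerFor(skill: string, quarantined: ReadonlySet<string>): string {
  return quarantined.has(skill) ? DIAGNOSTICS_LEDGER : EVALS_LEDGER;
}

// Hypothetical quarantine contents.
const quarantine = new Set<string>(["contract-test"]);

console.log(ledgerFor("contract-test", quarantine));
console.log(ledgerFor("docs", quarantine));
```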

3d. state/log/agents-md-feedback.log — the LOGGED stream

The third line of the PID loop. When a gap is too big to fix inline (>10 lines, spans multiple files, or requires a structural rewrite), the agent appends a [LOGGED] line here; the drain hook reads it on the next Stop and regenerates the loader from scratch.


4. The PID loop — three honest modes per run

For every skill turn:

  1. [FIXED] — inline edit. The agent hit a gap ≤10 lines. It edits the .agents.md directly (one line in a table, a missing example) and appends [FIXED] to state/log/agents-md-feedback.log. Surgical. No restructuring.
  2. [LOGGED] — queue for drain. The gap is too big for inline. The agent appends [LOGGED] with the description to agents-md-feedback.log. Also, if a SKILL.md was edited, PostToolUse:Edit|Write fires enqueue-skill-regen.sh, which writes the slug to state/log/regen-pending.txt.
  3. Drain on Stop. snappy-os-stop.sh + drain-skill-regen.sh run at end of turn (or snappy-os-auto-regen.sh in autopilot mode). They read the queue and the LOGGED lines, dispatch a headless Claude to rewrite affected .agents.md files, clear the queue.

No fourth mode. "Every run scores itself" would be a fourth mode that does not exist. Scoring is a separate subprocess (see §5), not self-report.
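The choice between the first two modes follows directly from the footer's thresholds and can be sketched as a predicate. The `Gap` type and `chooseMode` name are hypothetical; how an agent estimates the size of a gap is left open.

```typescript
// Sketch: choose [FIXED] or [LOGGED] from the thresholds stated in the footer.
interface Gap {
  linesNeeded: number;        // size of the fix in lines
  filesTouched: number;       // number of files the fix would span
  structuralRewrite: boolean; // fix needs restructuring, not a surgical edit
}

function chooseMode(gap: Gap): "FIXED" | "LOGGED" {
  const tooBigForInline =
    gap.linesNeeded > 10 || gap.filesTouched > 1 || gap.structuralRewrite;
  return tooBigForInline ? "LOGGED" : "FIXED";
}

console.log(chooseMode({ linesNeeded: 2, filesTouched: 1, structuralRewrite: false }));
```

Drain is not a choice the agent makes per gap; it runs at Stop over whatever was queued.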


5. Actor ≠ auditor — how it's enforced

This is the one rule that has to be right. program.md §5: "The thing that generates output cannot be the thing that grades it."

5a. The identity

sessionId() in state/lib/eval.ts:

export function sessionId(): string {
  if (cachedSessionId !== null) return cachedSessionId;
  const rnd = Math.random().toString(36).slice(2, 8);
  cachedSessionId = `s-${process.pid}-${rnd}`;
  return cachedSessionId;
}

The id is s-<process.pid>-<6 random chars>, cached per process. A different OS process gets a different pid and therefore a different id; the random suffix makes a collision unlikely even if the OS later reuses a pid. An agent turn that generates output is one pid. The lint/check subprocess spawned afterward to grade that output is a different pid. The ids differ because the pids differ.

5b. The cutoff and the check

ACTOR_AUDITOR_REQUIRED_AFTER = "2026-04-20T18:00:00Z". Rows written after this instant must carry actor_session_id and auditor_session_id, and the two values must differ.

Enforcement is state/lint/eval-row-mandatory.ts --publish. It scans every row in evals.ndjson written post-cutoff; if a row is missing one of the ids, or the two are equal, the lint exits with status 1. This is a ship gate: --publish is wired into pre-ship CI.
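The shape check described here can be sketched as below. This is an illustration of the rule, not the lint's source: `rowViolates` and `exitCode` are hypothetical names, and treating a row with an unparseable ts as exempt is a simplification of this sketch.

```typescript
// Sketch of the post-cutoff actor/auditor check.
const ACTOR_AUDITOR_REQUIRED_AFTER = "2026-04-20T18:00:00Z";

interface LedgerRow {
  ts: string;
  actor_session_id?: string;
  auditor_session_id?: string;
}

function rowViolates(row: LedgerRow): boolean {
  const cutoff = Date.parse(ACTOR_AUDITOR_REQUIRED_AFTER);
  const written = Date.parse(row.ts);
  // Rows at or before the cutoff (or with an unparseable ts) are exempt here.
  if (!(written > cutoff)) return false;
  const actor = row.actor_session_id;
  const auditor = row.auditor_session_id;
  // Violation: either id missing or empty, or both ids equal.
  return !actor || !auditor || actor === auditor;
}

// Mirrors the ship gate: any violating row yields a non-zero status.
function exitCode(rows: LedgerRow[]): number {
  return rows.some(rowViolates) ? 1 : 0;
}

console.log(exitCode([
  { ts: "2026-04-21T09:00:00Z", actor_session_id: "s-100-abc123", auditor_session_id: "s-101-def456" },
]));
```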

5c. Honest scope of what's enforced

So: the enforcement is shape-level. The property we care about (different pids in practice) is maintained because the default path through sessionId() produces distinct ids for distinct subprocesses, and callers use the default.


6. The lint suite — what is actually checked

46 scripts in state/lint/. The ones that matter for the contract:

What the lint suite is not: it is not provenance-level. It does not prove an eval row was written by the true grader process. It proves the row has the right shape.


7. Sync — two layers, both bidirectional

7a. git — code and schema

What lives here:

Command: git push origin main from one machine; git pull --rebase origin main on the others. Every change to anything under state/ requires rebase-before-push (rule in CLAUDE.md).

7b. gateway + manifest — runtime artifacts

What lives here:

Command: node bin/cli.js push --auto (default scope=state). Worker writes through to DO Spaces. Each write records sha+size in state/log/sync-manifest.json. A per-batch probe verifies the write landed; on drift, a friction row is logged (root cause of the 4 silent-drop incidents on 2026-04-18).

Verification: npx tsx state/lint/sync-integrity.ts --gateway --manifest.
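The drift check amounts to comparing what the manifest recorded against what a probe sees remotely. The sketch below assumes the manifest maps a path to `{ sha, size }`; the actual layout of sync-manifest.json and the probe's output are not shown in this document, so `Manifest`, `findDrift`, and the sample entries are all hypothetical.

```typescript
// Sketch: report paths whose remote sha/size disagree with the manifest.
// Assumption: manifest shape is path -> { sha, size }.
interface ManifestEntry {
  sha: string;
  size: number;
}
type Manifest = Record<string, ManifestEntry>;

function findDrift(recorded: Manifest, probed: Manifest): string[] {
  const drifted: string[] = [];
  for (const [path, local] of Object.entries(recorded)) {
    const remote: ManifestEntry | undefined = probed[path];
    // Missing remotely, or present with a different sha or size: drift.
    if (!remote || remote.sha !== local.sha || remote.size !== local.size) {
      drifted.push(path);
    }
  }
  return drifted.sort();
}

// Invented entries for illustration.
const recorded: Manifest = {
  "state/log/a.ndjson": { sha: "aaa", size: 10 },
  "state/log/b.ndjson": { sha: "bbb", size: 20 },
};
const probed: Manifest = {
  "state/log/a.ndjson": { sha: "aaa", size: 10 },
};

console.log(findDrift(recorded, probed));
```

Each drifted path would correspond to one friction row in the scheme described above.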


8. Recipes and engagement — opt-in bundles

state/recipes/*.md — 4 recipes currently:

A recipe is not a plugin. It's a named bundle of hook behaviors declared in markdown. To engage a recipe, its name goes into state/engaged.json:

{
  "recipes": ["ambient-sync", "pid-loop", "autopilot", "evolve"],
  "last_changed": "2026-04-18T22:25:00.000Z",
  "by": "pod-b-evolve-audit"
}

Hooks and cron workers check engaged.json before acting. Empty recipes array = quiet harness (safety valve). This is the user's explicit opt-in. Nothing runs unless it's named in this file.
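The engagement check a hook or cron worker performs can be sketched as below. The function name `isEngaged` is hypothetical, and the sketch takes the file contents as a string so that it stays independent of how each hook actually reads engaged.json.

```typescript
// Sketch: is a recipe named in engaged.json?
interface Engaged {
  recipes: string[];
  last_changed?: string;
  by?: string;
}

function isEngaged(engagedJson: string, recipe: string): boolean {
  const parsed = JSON.parse(engagedJson) as Partial<Engaged>;
  // A missing or empty recipes array means a quiet harness: nothing is engaged.
  return Array.isArray(parsed.recipes) && parsed.recipes.includes(recipe);
}

const engaged = JSON.stringify({
  recipes: ["ambient-sync", "pid-loop", "autopilot", "evolve"],
  last_changed: "2026-04-18T22:25:00.000Z",
  by: "pod-b-evolve-audit",
});

console.log(isEngaged(engaged, "pid-loop"));
console.log(isEngaged('{"recipes": []}', "pid-loop"));
```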


9. Cron — two classes

From program.md §8:

state/lint/cron-coherence.ts and cron-drift.ts enforce the class split.


10. Environment — where credentials live

.env.cache at the repo root is the canonical file. snappy-os owns it. Loaded by state/lib/env.ts. Every api.ts that hits an external API reads credentials through env.ts.

~/.claude/skills/snappy-settings/.env.cache is a symlink pointing at the repo-root file (back-compat for the old path). If either is missing or broken, anything that hits an external API fails — fix the symlink direction first.

Setup from scratch is documented in CLAUDE.md under "New machine setup."


11. Seed-owned vs user-owned

program.md §9 lists which files the snappy-os project owns (ship with every clone) vs which files belong to the user's tenant (their specific skills, their eval history, their engaged recipes).

The seed manifest is state/lib/seed-manifest.ts. A user pulling snappy-os updates gets new seed files; their own state/log/* and user skills are untouched.


12. Distribution shape — Option A / Option B (no "plugin" word)

snappy-os is a scaffolded tree that lives in the user's git. Not a plugin. Not a package you install once. The tree IS the system. Updates come via git pull.

Two distribution flavors the user can opt into, both building on the same Anthropic-spec skill atoms:

Option A — full tree

The user clones snappy-os, gets all 12 layers above. Keyword-gated loaders via hooks. The PID loop self-heals gaps. Evals ledger. Frictions ledger. Recipe engagement. Sync to gateway. Works across Claude Code, Codex, openclaw, Gemini CLI, Cursor, Windsurf (parity measured in parity.ndjson).

Option B — plain skills only

The user brings their own agent runtime (Claude Code, or any Anthropic-spec consumer) and wants nothing more than the portable state/skills/*.md files. They can copy out single skills and drop them into ~/.claude/skills/<name>/. No loader, no hooks, no ledger. Pure Anthropic-spec. Still useful — the skill atom stands alone.

The goal for skills.snappy.ai: help a user who comes in wanting AI to "have hands and eyes in X" (automate a process, scrape a site, render images, etc.). Help them assemble the skills they need. They choose whether to take the full tree (Option A) or just the atoms (Option B). We don't impose.

The hero tagline, confirmed 2026-04-22: treat markdown like code.


13. Index of every named file

Grouped by layer. Every word in this doc has a file.

The atom

The cross-cutting layer

Ledgers

Eval contract

Recipes and engagement

Env

Schema and catalog

Distribution


14. What is NOT in the system (so future doc writers don't invent it)