snappy-sync
description: "Triggers on prompt mention of 'snappy-sync'."
What it does for you
Keeps your setup matched across every machine you use.
What it produces
A recent result, so you can see the kind of work it returns.
How to get it
These run inside the Snappy workspace. Want this working in your business? I set skills like this up with you, in one focused week.
For developers - how this skill is built, graded, and how it runs
at a glance - the short version
what's inside - the parts that make up a skill 2/4 present
A skill is just a few plain-text files. Only the main one is required. The rest are optional, added as the work needs them. This is what the skill is made of; how it runs is just below.
state/skills/snappy-sync/SKILL.md
present
state/lib/snappy-sync.ts
not present
state/bin/snappy-sync/
not present
state/skills/snappy-sync/AGENTS.md
present
how it's graded - what counts as a good run 4 criteria · 3 deterministic · 1 judge
Each row is one thing a good run has to get right. deterministic means a quick check decides, pass or fail. judge means the AI reads the result and rates it. Grading each piece on its own (instead of one overall score) shows exactly where a run fell short, so the fix is obvious.
how it runs - the shared frame every skill uses 5/5 present
Every skill runs the same way. One part does the work, a separate part checks it, and a short loader hands the AI exactly what it needs for the job. Anything this skill doesn't use shows a one-line note saying why, on purpose, not by accident.
npx tsx state/lint/sync-integrity.ts --gateway --manifest
state/log/evals.ndjson
- NEVER touch DO Spaces directly. The Worker is the only ingress; client never holds DO creds.
- NEVER pass tenant_id from the client - Worker recomputes it server-side; client values are ignored.
- ALWAYS pass explicit --scope program or --scope sources to push human-owned layers. Default scope is state only.
- ALWAYS acquire the per-runtime debounce lock at /tmp/snappy-os-debounce-<runtime>.lock before push. If <60s old and --auto, exit mode:"debounced" (still scores 1.0).
- A stale lock (mtime >120s) gets force-removed before exit, otherwise a wedged process blocks every subsequent push.
what it has learned - fixes written back in over time sample
When a run hits something this skill didn't handle, the fix gets written back into the skill so it doesn't happen again. FIXED means it was corrected on the spot. LOGGED means it's queued for a bigger rewrite. Either way, the skill gets a little better and never makes the same mistake twice.
how the work flows - who makes it, who checks it
bin/cli.js push|pull|status|doctor [--scope state|sources|program|all] [--repo os|kernel|both] [--auto] [--dry]
state/bin/sync/manifest-compare.ts - independent re-fetch of remote manifest, asserts match modulo deny-list
SKILL.md - the skill, written out in plain English
snappy-sync
The sync subsystem skill. Wraps bin/cli.js push|pull|status|doctor with a uniform contract so every caller (Stop hook, SessionStart hook, cron tick, doctor cron, manual operator) routes through the same gate + log + eval flow. The Worker is the only ingress to DO Spaces; this skill never touches DO directly. Failure modes prevented: silent drift between machines, cross-tenant write attempts, hook-driven push storms, and dirty-tree pushes that overwrite in-flight work.
Purpose
Keep every machine's view of the canonical in agreement without a human in the loop. Push uploads local diffs through the Worker. Pull applies remote diffs locally. Status diffs the two manifests without touching either side. Doctor runs every section-A lint and parity-matrix cell; its exit code is the failure count.
Inputs
| Field | Type | Notes |
|---|---|---|
| action | string | push / pull / status / doctor |
| scope | string | state (default for --auto), all (default for manual), sources / program require explicit scope |
| repo | string | os (default), kernel, both |
| auto | boolean | Skips work if debounce lock is fresher than 60s |
| dry | boolean | Logs the would-do list, no Worker call |
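The defaults in the Inputs table can be sketched as a small resolver. This is a minimal sketch, not the skill's actual code: the type names and `resolveInputs` are hypothetical, and the defaults are taken only from the table above (`state` for `--auto`, `all` for manual, `os` for repo).

```typescript
// Hypothetical types mirroring the Inputs table; none of these names come from the skill itself.
type Action = "push" | "pull" | "status" | "doctor";
type Scope = "state" | "sources" | "program" | "all";
type Repo = "os" | "kernel" | "both";

interface SyncInputs {
  action: Action;
  scope?: Scope;
  repo?: Repo;
  auto?: boolean;
  dry?: boolean;
}

interface ResolvedInputs {
  action: Action;
  scope: Scope;
  repo: Repo;
  auto: boolean;
  dry: boolean;
}

// Fill in defaults as the table states them:
// scope defaults to "state" under --auto and to "all" for manual runs; repo defaults to "os".
function resolveInputs(input: SyncInputs): ResolvedInputs {
  const auto = input.auto ?? false;
  return {
    action: input.action,
    scope: input.scope ?? (auto ? "state" : "all"),
    repo: input.repo ?? "os",
    auto,
    dry: input.dry ?? false,
  };
}
```

An explicit `scope` always wins over the default, which is how `sources` and `program` would be reached.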
Steps
Sub-flow: push
- Acquire debounce lock at /tmp/snappy-os-debounce-<runtime>.lock. If fresher than 60s and --auto set, log mode:"debounced" and exit.
- Walk local tree under ~/projects/snappy-os/.
- Apply state/bin/sync/sync-rules.json allow/deny lists. Refuse on uncovered state/log/* paths.
- Compute SHA-256 per file (or read script per Phase 0 strategy).
- Diff against state/log/sync-manifest.json.
- For each changed file: acquire upload lock, POST /_push with the delta + body, release upload lock.
- Update local manifest. Log one row to state/log/sync-events.ndjson.
- Batch-wide round-trip probe (mandatory). Run npx tsx state/lint/sync-integrity.ts --gateway --manifest before returning 0. This re-fetches the remote manifest and asserts every pushed file's SHA matches - not just the first. Partial coverage (first-file-only spot-check) is the root cause of four silent-drop incidents on 2026-04-18 (gateway-stale-statusline, worker-stale-program, stale-pull-blocked, and an unnamed sibling). Any drift here fails the push; do not exit 0.
- If sync-integrity.ts surfaces drift: append one state/log/frictions.ndjson row per drifted path (not one row for the batch). The breaker-queue processes one row per run; a batch of 12 drifted files must produce 12 rows so the autopilot cannot close the issue by fixing only the first path.
- Run state/bin/pid-aggregate.ts if state/log/evals.ndjson changed since last aggregate.
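The hash-and-diff steps of the push flow can be illustrated with a short sketch. It assumes the manifest is a flat map from path to hex SHA-256, which the source does not confirm; `sha256Hex` and `changedPaths` are hypothetical helper names, and file bodies are passed in memory to keep the example self-contained.

```typescript
import { createHash } from "node:crypto";

// Assumed manifest shape: path -> hex SHA-256. The real sync-manifest.json may differ.
type Manifest = Record<string, string>;

// Hash one file body with SHA-256 and return the lowercase hex digest.
function sha256Hex(body: string | Uint8Array): string {
  return createHash("sha256").update(body).digest("hex");
}

// Return the sorted list of paths whose current hash differs from the manifest entry,
// including paths the manifest has never seen.
function changedPaths(
  files: Record<string, string | Uint8Array>,
  manifest: Manifest,
): string[] {
  return Object.entries(files)
    .filter(([path, body]) => manifest[path] !== sha256Hex(body))
    .map(([path]) => path)
    .sort();
}
```

Only the paths returned here would go on to the upload step; files deleted locally are not covered by this sketch.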
Sub-flow: pull
- GET /_pull?since=<manifest_sha>&repo=<repo> → manifest delta + bodies.
- Refuse if remote manifest version > local code understands (Phase 13).
- Honor --scope. program/sources require explicit scope.
- For each remote-newer file: write to local. Update local manifest.
- Log one row to state/log/sync-events.ndjson.
- KV cache-buster on auto-heal. The Worker KV cache has a 5-minute TTL. When pull --auto enters the auto-heal path (local diverged from remote), append ?refresh=1 to the /_pull request so the Worker bypasses the KV hit and reads the freshly-written blob. Without this, the heal loop can re-read a 5-min-stale manifest and loop indefinitely.
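The pull request URL, including the auto-heal cache-buster, might be built as follows. This is a sketch under assumptions: `buildPullUrl` is a hypothetical helper, and the Worker base URL in the usage note is a placeholder, not the real endpoint. Only the `since`, `repo`, and `refresh=1` parameters come from the steps above.

```typescript
type Repo = "os" | "kernel" | "both";

// Build the /_pull URL. When the caller is on the auto-heal path, add refresh=1
// so the Worker skips its KV cache and reads the freshly written manifest.
function buildPullUrl(
  workerBase: string, // placeholder base URL supplied by the caller
  sinceSha: string,
  repo: Repo,
  autoHeal: boolean,
): string {
  const url = new URL("/_pull", workerBase);
  url.searchParams.set("since", sinceSha);
  url.searchParams.set("repo", repo);
  if (autoHeal) {
    url.searchParams.set("refresh", "1");
  }
  return url.toString();
}
```

For example, `buildPullUrl("https://worker.example", "abc123", "os", true)` yields `https://worker.example/_pull?since=abc123&repo=os&refresh=1`.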
Sub-flow: status
Diff local manifest vs GET /_pull?since=0. Print summary by repo: added, modified, deleted, unchanged. No writes.
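A read-only status summary could look like the sketch below. The manifest shape (path to SHA) and the direction of the labels are assumptions: here "added" means present locally but not remotely, and "deleted" means present remotely but not locally. `summarizeStatus` is a hypothetical name.

```typescript
// Assumed manifest shape: path -> hex SHA-256.
type Manifest = Record<string, string>;

interface StatusSummary {
  added: string[];
  modified: string[];
  deleted: string[];
  unchanged: string[];
}

// Compare two manifests without writing to either side.
function summarizeStatus(local: Manifest, remote: Manifest): StatusSummary {
  const summary: StatusSummary = { added: [], modified: [], deleted: [], unchanged: [] };
  const paths = new Set([...Object.keys(local), ...Object.keys(remote)]);
  for (const path of [...paths].sort()) {
    const localSha = local[path];
    const remoteSha = remote[path];
    if (remoteSha === undefined) {
      summary.added.push(path); // only exists locally (assumed meaning of "added")
    } else if (localSha === undefined) {
      summary.deleted.push(path); // only exists remotely (assumed meaning of "deleted")
    } else if (localSha !== remoteSha) {
      summary.modified.push(path);
    } else {
      summary.unchanged.push(path);
    }
  }
  return summary;
}
```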
Sub-flow: doctor
Run every section-A lint (sync-freshness, manifest-drift, symlink-check, sync-integrity, runtime-parity, do-spaces-reachable, worker-serves-canonical, no-github-in-product, three-layer-respect, eval-row-mandatory, sync-rules-coverage, aggregate-schema, worker-bundle-size). Exit code = failure count. POST _alert if called via the doctor cron and any failures.
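The "exit code = failure count" rule can be sketched as a small runner. The `Lint` interface and `runDoctor` are hypothetical; treating a lint that throws as a failure is an assumption, and the `lint-failed:<name>` label follows the primary_issue format used elsewhere in this document.

```typescript
// Hypothetical lint shape: a name plus a check that returns true on pass.
interface Lint {
  name: string;
  run: () => boolean;
}

interface DoctorResult {
  exitCode: number; // equals the number of failed lints
  issues: string[]; // one "lint-failed:<name>" entry per failure
}

// Run every lint, never stop at the first failure, and count the failures.
function runDoctor(lints: Lint[]): DoctorResult {
  const issues: string[] = [];
  for (const lint of lints) {
    let passed = false;
    try {
      passed = lint.run();
    } catch {
      passed = false; // assumption: a crashing lint counts as a failure
    }
    if (!passed) {
      issues.push(`lint-failed:${lint.name}`);
    }
  }
  return { exitCode: issues.length, issues };
}
```

A caller would pass `exitCode` to `process.exit`; alerting from the doctor cron is left out of this sketch.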
Log + eval
append("chain", { run_id, skill: "snappy-sync", action,
files, bytes, dur_ms, manifest_after, trigger });
score("snappy-sync", run_id, {
score: ok ? 1.0 : 0.0,
primary_issue: ok ? null : reason,
files, bytes, dur_ms,
});
Eval
Actor: bin/cli.js invoking the Worker over HTTPS.
Auditor: npx tsx state/lint/sync-integrity.ts --gateway --manifest - batch-wide re-fetch of the remote manifest, asserts every file's SHA matches (not just the first). A first-file-only spot-check is the verified root cause of four silent-drop incidents on 2026-04-18 and must never substitute for the full probe.
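The batch-wide check described for the auditor can be sketched as a comparison over every pushed file, producing one row per drifted path. This is not the real sync-integrity.ts: `findDrift` and the row field names are hypothetical, and both manifests are assumed to be flat path-to-SHA maps.

```typescript
// Hypothetical friction row; the real frictions.ndjson schema is not shown in this document.
interface FrictionRow {
  skill: "snappy-sync";
  primary_issue: "batch-drift";
  path: string;
  expected: string; // SHA the client pushed
  actual: string | null; // SHA the remote manifest reports, or null if the path is missing
}

// Check every pushed file against the re-fetched remote manifest, not just the first one.
function findDrift(
  pushed: Record<string, string>,
  remote: Record<string, string>,
): FrictionRow[] {
  const rows: FrictionRow[] = [];
  for (const [path, expected] of Object.entries(pushed)) {
    const actual = remote[path] ?? null;
    if (actual !== expected) {
      rows.push({ skill: "snappy-sync", primary_issue: "batch-drift", path, expected, actual });
    }
  }
  return rows;
}

// Serialize one NDJSON line per drifted path, so a batch of N drifted files yields N rows.
function toNdjson(rows: FrictionRow[]): string {
  return rows.map((row) => JSON.stringify(row)).join("\n");
}
```

An empty result means the push may return 0; any row means the push fails.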
| Outcome | Score | primary_issue |
|---|---|---|
| Push or pull clean, manifests converge | 1.0 | null |
| Debounce-suppressed | 1.0 | null (mode:"debounced") |
| Worker 401 | 0.0 | auth-rejected |
| Worker 403 (cross-tenant) | 0.0 | tenant-violation |
| Worker 409 (concurrent shared write) | 0.0 | conflict-409 |
| Worker 5xx | 0.0 | worker-error |
| Local SHA mismatch after pull | 0.0 | pull-corruption |
| Batch integrity drift (any file past first) | 0.0 | batch-drift |
| Lint failure during doctor | 0.0 | lint-failed:<name> |
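The HTTP rows of the table above can be expressed as a small classifier. This is a sketch only: `classifyWorkerStatus` is a hypothetical name, it covers just the status codes the table lists, and it throws for anything else rather than guessing a label.

```typescript
interface EvalOutcome {
  score: number; // 1.0 or 0.0, as in the table
  primary_issue: string | null;
}

// Map a Worker HTTP status to the score and primary_issue from the outcome table.
function classifyWorkerStatus(status: number): EvalOutcome {
  if (status >= 200 && status < 300) {
    return { score: 1.0, primary_issue: null };
  }
  if (status === 401) return { score: 0.0, primary_issue: "auth-rejected" };
  if (status === 403) return { score: 0.0, primary_issue: "tenant-violation" };
  if (status === 409) return { score: 0.0, primary_issue: "conflict-409" };
  if (status >= 500 && status < 600) {
    return { score: 0.0, primary_issue: "worker-error" };
  }
  // The table does not say how other statuses are scored, so this sketch refuses to guess.
  throw new RangeError(`status ${status} is not covered by the outcome table`);
}
```

A 2xx response alone is not proof of a clean run; the batch-drift and pull-corruption rows depend on SHA checks that happen after the response.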
Failure modes
- Debounce stale lock - mtime over 120s gets force-removed before exit; otherwise a wedged process blocks every subsequent push.
- Tenant prefix drift - Worker recomputes tenant_id server-side; any client-supplied tenant_id is ignored.
- Manifest version mismatch - pull refuses, prints upgrade prompt. Never silently overwrites local with newer-schema bytes.
- Conflict 409 - Worker returns expected SHA; client must re-pull, re-apply locally, re-push. Loser bytes preserved at state/log/sync-conflicts/.
- Cross-runtime debounce collision - three Claude Code windows on one machine share the same lock. Rate limit is per-runtime, not per-process.
- First-file-only probe false confidence - a push returning HTTP 200 does not mean all files landed. The per-batch sync-integrity.ts --gateway --manifest probe is mandatory; skipping it and spot-checking only the first file in the batch is the root cause of four silent-drop incidents on 2026-04-18. Score 0.0 (batch-drift) if any file past the first fails the SHA check.
- Stale KV read during auto-heal - the Worker KV cache has a 5-min TTL. Pass ?refresh=1 on /_pull requests issued from the auto-heal path so the Worker reads the freshly-written blob rather than a stale cached manifest.
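The two debounce thresholds (60s for suppression, 120s for a stale lock) can be sketched as a pure decision function. The names are hypothetical, and two behaviors are assumptions because the source does not state them: a lock younger than 60s does not block a manual (non `--auto`) run, and a lock between 60s and 120s old lets the run proceed.

```typescript
type LockDecision = "proceed" | "debounced" | "stale-removed";

const DEBOUNCE_MS = 60_000; // lock fresher than this suppresses --auto runs
const STALE_MS = 120_000; // lock older than this is treated as wedged

// Decide what to do with the per-runtime lock, given its mtime (or null if absent).
function decideLock(
  lockMtimeMs: number | null,
  nowMs: number,
  auto: boolean,
): LockDecision {
  if (lockMtimeMs === null) return "proceed";
  const ageMs = nowMs - lockMtimeMs;
  if (ageMs > STALE_MS) return "stale-removed"; // caller force-removes the lock file
  if (auto && ageMs < DEBOUNCE_MS) return "debounced"; // logged as mode:"debounced", scores 1.0
  return "proceed"; // assumption: manual runs and 60s-120s locks are not blocked
}
```

Keeping the decision separate from the filesystem calls makes the thresholds easy to test; the caller would read the mtime with `fs.statSync` and remove a stale lock with `fs.rmSync`.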
Rubric
criteria:
  - name: sync_operation_success
    kind: deterministic
    check: "The 'snappy-sync' skill execution logs a 'score': 1.0 and 'primary_issue': null in 'state/log/evals.ndjson' for successful 'push' or 'pull' operations, or if 'mode:\"debounced\"' for auto-triggered actions."
  - name: no_batch_drift_on_push
    kind: deterministic
    check: "A 'push' operation must exit with a non-zero status code if 'npx tsx state/lint/sync-integrity.ts --gateway --manifest' detects any SHA mismatch for pushed files."
  - name: doctor_lint_failures_reported
    kind: deterministic
    check: "A 'doctor' action must produce an exit code equal to the count of failed lint checks and log 'primary_issue': 'lint-failed:<name>' for each failure in 'state/log/evals.ndjson'."
  - name: manifest_integrity_after_push
    kind: judge
    check: "Examine 'state/log/sync-events.ndjson' to verify that the 'manifest_after' field accurately reflects all local changes pushed, and that 'npx tsx state/lint/sync-integrity.ts' would pass if re-run immediately after the push."
AGENTS.md - what the AI loads when this skill comes up
snappy-sync - loader
Per-turn rules for the snappy-sync skill. Full reference: state/skills/snappy-sync/SKILL.md. Do not skip these.
Critical Rules
- NEVER touch DO Spaces directly. The Worker is the only ingress; client never holds DO creds.
- NEVER pass tenant_id from the client - Worker recomputes it server-side; client values are ignored.
- ALWAYS pass explicit --scope program or --scope sources to push human-owned layers. Default scope is state only.
- ALWAYS acquire the per-runtime debounce lock at /tmp/snappy-os-debounce-<runtime>.lock before push. If <60s old and --auto, exit mode:"debounced" (still scores 1.0).
- A stale lock (mtime >120s) gets force-removed before exit, otherwise a wedged process blocks every subsequent push.
Commands
| ui dashboard | state/skills/snappy-sync/resources/ui.openui |
|invoke: bin/cli.js push|pull|status|doctor [--scope state|sources|program|all] [--repo os|kernel|both] [--auto] [--dry]
|verify: state/bin/sync/manifest-compare.ts - independent re-fetch of remote manifest, asserts match modulo deny-list
|doctor: runs every section-A lint; exit code = failure count; POSTs _alert if called via doctor cron
|eval log: state/log/evals.ndjson (skill: "snappy-sync")
OpenUI Resource
- Skill-owned OpenUI Lang resource: state/skills/snappy-sync/resources/ui.openui. Read it before rendering or editing this skill's generated component surface.
- Treat this resource as a first-class artifact of the skill, not a generic chat response. Improve it when the skill's user-facing output needs to become richer.
- System resources compose OpenUI primitives and inherit SnappyChat tokens. Use ui_contract: branded in SKILL.md only for deliberate platform or client visuals.
Known Pitfalls
- Conflict 409 - Worker returns expected SHA; client must re-pull, re-apply locally, re-push. Loser bytes preserved at state/log/sync-conflicts/.
- Manifest version mismatch - pull refuses, prints upgrade prompt. Never silently overwrites local with newer-schema bytes.
- Cross-runtime debounce collision - three Claude Code windows on one machine share the same lock. Rate limit is per-runtime, not per-process.
- sources/program scope require explicit --scope flag. The default-state rule prevents an auto-regen worker from rewriting human-owned layers.
- After push: run state/bin/pid-aggregate.ts if state/log/evals.ndjson changed since last aggregate.
primary_issue ladder
auth-rejected (401) · tenant-violation (403) · conflict-409 · worker-error (5xx) · pull-corruption (SHA mismatch) · lint-failed:<name> (doctor)
Self-Test
An agent reading this should correctly:
- [ ] Refuse to write to DO Spaces directly
- [ ] Add --scope program when pushing program.md
- [ ] Know the auditor is state/bin/sync/manifest-compare.ts, not the cli's own response
Self-report
If this loader fell short, append a line:
echo "[$(date -u +%FT%TZ)] snappy-sync: <what was missing>" >> state/log/loader-feedback.log
<!-- kernel-ok: historical attribution for the PID-loader pattern. -->
Self-correcting loader (PID feedback)
This is the rule that makes loaders learn. Ported from snappy-kernel, where it demonstrably worked: the snappy-dom-cartographer skill mapped Skool admin UI once, every later ship-agent inherited the full map for free.
Before you finish, do two things:
1. Fix gaps (P - proportional). MANDATORY. If this loader didn't cover your case - if you had to read another file, run an undocumented command, work around a wrong selector, or discover a quirk - you MUST attempt an Edit to this AGENTS.md before you log.
- Default action: edit this file. Surgical. Add the missing line, table row, or code example where it belongs. No restructuring.
- Only LOGGED is allowed when: the fix needs >10 lines, spans multiple files, or requires a structural rewrite. In that case the state/regen/drain.sh queue picks it up asynchronously.
- "I didn't have time" / "it's minor" / "the next agent will figure it out" are NOT valid reasons. Minor gaps compound - that's the whole point of this loop.
- Concrete example of a valid P-fix: discover agent-browser screenshot takes path positionally not as --output <path>. One line in a table. Edit it in. Do not just log it.
- The goal: the next agent never has to leave the loader.
2. Log the result.
echo "[$(date -u +%FT%TZ)] snappy-sync: <what was missing or fixed> [FIXED|LOGGED]" >> state/log/loader-feedback.log
FIXED = you patched this loader inline (P-fix).
LOGGED = too large for inline; the PostToolUse enqueue + Stop-hook drain will rewrite the loader from scratch on next session-end.
Do not skip this. Every agent run must leave the system better than it found it. The loader is the setpoint; you are the sensor; the gap is the error signal; closing the gap is the correction.
api.ts - the code it can call
⚠ no api.ts - this skill has no typed action surface
scripts - helper scripts it can run
prose-only skill - 2 inline code blocks live in SKILL.md above (no state/bin/ sidecar yet).
how we check it - the checks, plus the last 10 runs
| timestamp | verb | score | primary_issue | artifact |
|---|---|---|---|---|
| 2026-04-25 04:11Z | - | 1.00 | - | - |
| 2026-04-21 15:59Z | - | 1.00 | - | - |
| 2026-04-21 15:57Z | - | 1.00 | - | - |
| 2026-04-21 03:53Z | - | 1.00 | - | - |
| 2026-04-25 04:11Z | - | 1.00 | - | - |
| 2026-04-21 15:59Z | - | 1.00 | - | - |
| 2026-04-21 15:57Z | - | 1.00 | - | - |
| 2026-04-21 03:53Z | - | 1.00 | - | - |
| 2026-04-25 04:11Z | - | 1.00 | - | - |
| 2026-04-21 15:59Z | - | 1.00 | - | - |