PID loop
What this layer does
Phase 7 wires the per-machine PID loop (already verified working in Phase 0) into the cross-machine substrate. Per-machine evals stay local. An anonymized aggregate ships through the Worker. Quorum promotion runs as a Worker scheduled handler — when ≥3 distinct tenants score a staged rev ≥0.85 across ≥5 runs, the rev promotes to public canonical without manual review.
Files involved
- state/bin/auto-regen.sh: Stop hook PID body. Picks up <skill>.ready markers and dispatches regen subagents.
- state/bin/pid-detect.ts: reads local + remote evals via fetchEvals(); writes briefs to state/log/regen-queue/.
- state/bin/pid-drain.ts: marks queued briefs as .ready.
- state/bin/pid-aggregate.ts: anonymizes state/log/evals.ndjson into state/log/evals.aggregate.ndjson per the locked schema. Runs at the end of push --auto.
- state/lib/eval.ts: score() writer; fetchEvals() reader.
- state/log/regen-queue/: synced (in SYNC_ALLOW); naming uses <skill>.<machine_id>.md to avoid cross-machine collisions.
- ~/projects/snappy-skills/src/quorum.ts: Worker scheduled handler firing every minute.
Aggregate row schema (locked v1)
type AggRow = {
  _v: 1;
  ts: string;              // ISO-8601
  skill: string;           // normalized from `skill` or legacy `verb`
  score: number;           // 0..1
  run_id: string;          // opaque
  tenant: string;          // sha256(SNAPPY_MASTER_KEY).first(12)
  cost_usd_cents?: number; // rounded to nearest cent; sub-cent rounds up to 1
  ok: boolean;
};
Aggregator drops every other field. state/lint/aggregate-schema.ts fails on unknown keys.
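As a rough illustration of the schema above, the following sketch maps a raw per-machine row onto an AggRow and checks for unknown keys. It is not the actual pid-aggregate.ts or aggregate-schema.ts code. The RawEval shape, the cost_usd field name, the helper names, and the choice of a hex digest for the tenant hash are all assumptions made for the example, and "normalized" is read here as a plain fallback from skill to verb.

```typescript
import { createHash } from "node:crypto";

type AggRow = {
  _v: 1;
  ts: string;
  skill: string;
  score: number;
  run_id: string;
  tenant: string;
  cost_usd_cents?: number;
  ok: boolean;
};

// Hypothetical raw row: the real evals.ndjson shape is not shown in this document.
type RawEval = {
  ts: string;
  skill?: string;
  verb?: string; // legacy name for `skill`
  score: number;
  run_id: string;
  cost_usd?: number; // assumed field name
  ok: boolean;
  [extra: string]: unknown; // machine-local fields that must not leave the machine
};

const ALLOWED_KEYS = new Set([
  "_v", "ts", "skill", "score", "run_id", "tenant", "cost_usd_cents", "ok",
]);

// First 12 characters of the SHA-256 digest; hex encoding is an assumption.
function tenantId(masterKey: string): string {
  return createHash("sha256").update(masterKey).digest("hex").slice(0, 12);
}

// Nearest cent, except that any positive sub-cent cost becomes 1.
function toCents(costUsd: number): number {
  const cents = costUsd * 100;
  if (cents > 0 && cents < 1) return 1;
  return Math.round(cents);
}

// Copy only the schema fields; everything else on the raw row is dropped.
function toAggRow(raw: RawEval, masterKey: string): AggRow {
  const skill = raw.skill ?? raw.verb;
  if (skill === undefined) throw new Error("row has neither skill nor verb");
  const row: AggRow = {
    _v: 1,
    ts: raw.ts,
    skill,
    score: raw.score,
    run_id: raw.run_id,
    tenant: tenantId(masterKey),
    ok: raw.ok,
  };
  if (raw.cost_usd !== undefined) row.cost_usd_cents = toCents(raw.cost_usd);
  return row;
}

// Lint-style check: returns the keys that are not part of the locked schema.
function unknownKeys(row: object): string[] {
  return Object.keys(row).filter((k) => !ALLOWED_KEYS.has(k));
}
```

Building the output row field by field, rather than deleting keys from a copy of the input, means a newly added local field cannot leak into the aggregate by default.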
What syncs vs what doesn't
| File | Direction | Why |
|---|---|---|
| state/log/evals.ndjson | NEVER syncs | per-machine ephemeral; in SYNC_DENY |
| state/log/evals.aggregate.ndjson | both ways | anonymized PID signal; in SYNC_ALLOW |
| state/log/regen-queue/ | both ways | briefs flow between tenants |
| state/skills/<name>.md | both ways | PID-rewritten skill pages flow through standard sync |
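The table can be read as an allow/deny rule in which deny wins. The sketch below is only an illustration of that reading: the real format of SYNC_ALLOW and SYNC_DENY is not shown in this document, so the array representation, the trailing-slash prefix convention, the shouldSync name, and the inclusion of state/skills/ in the allow list are all assumptions.

```typescript
// Hypothetical representation of the two lists; entries ending in "/" match by prefix.
const SYNC_DENY: string[] = ["state/log/evals.ndjson"];
const SYNC_ALLOW: string[] = [
  "state/log/evals.aggregate.ndjson",
  "state/log/regen-queue/",
  "state/skills/", // assumption: the table only says these flow through standard sync
];

// A path syncs only if it is allowed and not denied; deny is checked first.
function shouldSync(path: string): boolean {
  const matches = (entry: string): boolean =>
    entry.endsWith("/") ? path.startsWith(entry) : path === entry;
  if (SYNC_DENY.some(matches)) return false;
  return SYNC_ALLOW.some(matches);
}
```

Checking the deny list first keeps evals.ndjson local even if a broader allow entry were added later.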
Quorum promotion
- PID rewrite produced on tenant T. T pushes the rev to s3://robert-storage/snappy-os-staging/<skill>/<rev_id>/.
- Worker scheduled handler reads state/log/evals.aggregate.ndjson for any staged rev_ids.
- When ≥3 distinct tenants have scored <skill>:<rev_id> ≥0.85 across ≥5 total runs → promote. Copy snappy-os-staging/<skill>/<rev_id>/ → snappy-os/skills/<skill>/. Write changelog row. Invalidate KV.
- Robert override: any rev tagged manual_review:robert in frontmatter bypasses quorum and stays gated until Robert approves via /snappy-ops.
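The promotion rule above can be sketched as a pure function. This is one possible reading, not the quorum.ts implementation: it counts only rows scoring at least 0.85 and requires at least 3 distinct tenants and at least 5 distinct run_ids among them. The locked aggregate schema has no rev_id field, so the sketch assumes the caller has already narrowed the rows to a single staged <skill>:<rev_id>. Whether the ok flag should also gate a row is not stated, and it is ignored here. The ScoredRun type and meetsQuorum name are hypothetical.

```typescript
// Only the aggregate fields the rule needs (hypothetical helper type).
type ScoredRun = {
  tenant: string;
  run_id: string;
  score: number;
};

// Thresholds taken from the text above.
const MIN_TENANTS = 3;
const MIN_SCORE = 0.85;
const MIN_RUNS = 5;

// `rows` is assumed to contain evals for exactly one staged rev.
function meetsQuorum(rows: ScoredRun[]): boolean {
  const passing = rows.filter((r) => r.score >= MIN_SCORE);
  const tenants = new Set(passing.map((r) => r.tenant));
  const runs = new Set(passing.map((r) => r.run_id));
  return tenants.size >= MIN_TENANTS && runs.size >= MIN_RUNS;
}
```

Using sets for both counts means a tenant that pushes the same run twice does not move the rev any closer to promotion.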
Operational gotchas
- Tenants are opted in to telemetry by default. Setting SNAPPY_TELEMETRY=0 disables the pid-aggregate.ts push entirely.
- Brief naming <skill>.<machine_id>.md prevents two tenants proposing simultaneous rewrites from clobbering each other.
- The staged rev DOES survive a tenant going offline: quorum reads evals.aggregate.ndjson, which other tenants continue to push.
- Rounding cost_usd_cents up at sub-cent keeps the lint bound tight; never emit fractional cents.
- Per-machine evals.ndjson MUST stay in SYNC_DENY: leaking it would expose run_ids tied to specific machines.
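Two of the gotchas above are small enough to show directly: the telemetry switch and the brief naming scheme. The sketch below is illustrative only. The helper names are hypothetical, and treating every value other than "0" (including an unset variable) as enabled is an assumption, since the text only specifies what SNAPPY_TELEMETRY=0 does.

```typescript
// Assumption: only the literal "0" disables the aggregate push.
function telemetryEnabled(env: Record<string, string | undefined>): boolean {
  return env["SNAPPY_TELEMETRY"] !== "0";
}

// Brief path following the <skill>.<machine_id>.md convention.
function briefPath(skill: string, machineId: string): string {
  return `state/log/regen-queue/${skill}.${machineId}.md`;
}
```

Because the machine id is part of the filename, two machines proposing a rewrite of the same skill write to different paths, so a two-way sync of the queue directory never has to merge them.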
How to verify it's working
- After a Stop hook fires, state/log/regen-queue/ gains a fresh brief; on the next push, evals.aggregate.ndjson gains a row with the new run.
- curl https://skills.snappy.ai/_status shows last_quorum_promotion updating when ≥3 tenants converge.
- A staged rev seeded with synthetic evals from 3 tenants promotes within 60s of the third tenant's push.
- state/lint/aggregate-schema.ts exits 0 on the local aggregate file.