PID audit — 2026-04-18
Executive verdict
PID is PARTIALLY working. Breaker + fixer + precheck stages are wired, coded, and have recent eval rows. The loop-closure stage (re-test / A4) is spec-only: regen.sh exists as a script but has zero callers, is not cron-wired, and state/log/regen-deltas.ndjson does not exist. On top of that, the breaker and fixer crontab entries are currently PAUSED (commented # PAUSED 2026-04-18: lines in the live crontab), so even the stages that work are not running autonomously right now.
Stages present vs. planned
The plan (plans/read-the-plan-nifty-wolf.md) names six A-primitives for the PID loop. Here is the receipts column next to each:
| # | Primitive | Script | Wired? | Observed evidence |
|---|---|---|---|---|
| A1 | Quota-aware exit | fix.sh:184-195 | yes | grep fixer-quota hits in fix.sh, early-exit path intact |
| A2 | Already-fixed precheck | fix.sh:32-75 | yes in code, 0 runs | grep fixer-precheck pattern in fix.sh is live, but evals.ndjson contains zero fixer-precheck rows. Either cron never fires with an eligible row, or cron is paused (see below). |
| A3 | Self-rewrite loader | fix.sh:209-252 | yes | Appends Known Pitfall to <area>.agents.md, commits, pushes. Skips already-fixed / quota-reset-needed. |
| A4a | Recurring-code detect | fix.sh:77-103 | yes | Reads state/log/pid-trends.ndjson, flags IS_RECURRING_CODE=1 when area has recurring-area signal <1h old. |
| A4b/c | Autofix branch push | fix.sh:143-172, 254-303 | yes | Branches autofix/<area>-<ts>, logs to state/log/notify.ndjson. |
| A4 | Re-test gate | regen.sh | no (orphan, spec only, no caller) | Script at state/bin/autopilot/regen.sh:1-123. grep -rn for regen.sh in the repo returns only the README, wiki docs, and the script itself. auto-regen.sh (the Stop hook) dispatches a rewriter but does not call regen.sh. crontab -l contains 0 references. state/log/regen-deltas.ndjson does not exist. The loop never measures whether the rewrite helped. |
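
The A4a check in the table (flag an area when it has a recurring-area signal less than one hour old) can be sketched as below. This is a minimal illustration, not the logic in fix.sh:77-103. pid-trends.ndjson has no documented schema, so the field names kind, area, and ts are assumptions, and the sample rows are made up.

```typescript
// Hypothetical row shape: pid-trends.ndjson has no documented schema,
// so `kind`, `area`, and `ts` are assumed field names.
interface TrendRow {
  kind: string; // e.g. "recurring-area"
  area: string;
  ts: string; // ISO-8601 timestamp
}

const ONE_HOUR_MS = 60 * 60 * 1000;

// True when `area` has a recurring-area trend less than one hour old.
function isRecurringArea(rows: TrendRow[], area: string, at: Date): boolean {
  return rows.some((row) => {
    if (row.kind !== "recurring-area" || row.area !== area) return false;
    const ageMs = at.getTime() - Date.parse(row.ts);
    // An unparseable timestamp yields NaN, which fails both comparisons.
    return ageMs >= 0 && ageMs < ONE_HOUR_MS;
  });
}

// Made-up sample rows for illustration only.
const sampleTrends: TrendRow[] = [
  { kind: "recurring-area", area: "stale-pull-blocked", ts: "2026-04-18T05:40:00Z" },
  { kind: "recurring-area", area: "typecheck-broken", ts: "2026-04-18T03:00:00Z" },
  { kind: "friction-spike", area: "smoke-ubuntu", ts: "2026-04-18T05:50:00Z" },
];
const checkTime = new Date("2026-04-18T06:00:00Z");

console.log(isRecurringArea(sampleTrends, "stale-pull-blocked", checkTime)); // true
console.log(isRecurringArea(sampleTrends, "typecheck-broken", checkTime)); // false
```

The second call returns false because that signal is three hours old. A friction-spike row never counts, whatever its age.
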
Supporting infra:
- state/bin/pid-detect.ts: wired, called from auto-regen.sh.
- state/bin/pid-drain.ts: wired, called from auto-regen.sh.
- state/lint/pid-trends.ts: wired; live output shows 15 trends detected this run (recurring-area=5, friction-spike=7, eval-decline=3).
Observed per-verb eval means (autopilot skill)
From jq -s 'group_by(.verb)' <(jq -c 'select(.skill=="autopilot")' state/log/evals.ndjson):
[
{ "verb": "breaker-tick", "n": 34, "mean": 1.000 },
{ "verb": "fixer-tick", "n": 28, "mean": 0.964 },
{ "verb": "dispatch-tick", "n": 2, "mean": 0.000 },
{ "verb": "tick", "n": 1, "mean": 1.000 }
]
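
The per-verb summary above amounts to a filter, a group-by, and a mean. A minimal TypeScript sketch of that aggregation follows. The skill and verb fields come from the jq filter; score is a hypothetical name for the numeric eval value, and the sample rows are made up.

```typescript
// Assumed row shape. `skill` and `verb` come from the jq filter above;
// `score` is a hypothetical name for the numeric eval value.
interface EvalRow {
  skill: string;
  verb: string;
  score: number;
}

interface VerbMean {
  verb: string;
  n: number;
  mean: number;
}

// Parse newline-delimited JSON, skipping blank lines.
function parseEvalRows(text: string): EvalRow[] {
  return text
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line) as EvalRow);
}

// Group one skill's rows by verb; return count and mean, largest n first.
function meansByVerb(rows: EvalRow[], skill: string): VerbMean[] {
  const totals: Record<string, { n: number; sum: number }> = Object.create(null);
  for (const row of rows) {
    if (row.skill !== skill) continue;
    const bucket = totals[row.verb] || { n: 0, sum: 0 };
    bucket.n += 1;
    bucket.sum += row.score;
    totals[row.verb] = bucket;
  }
  const out: VerbMean[] = [];
  for (const verb of Object.keys(totals)) {
    const t = totals[verb];
    if (t) out.push({ verb, n: t.n, mean: t.sum / t.n });
  }
  return out.sort((a, b) => b.n - a.n);
}

// Made-up sample rows for illustration only.
const sampleEvals = [
  '{"skill":"autopilot","verb":"fixer-tick","score":1}',
  '{"skill":"autopilot","verb":"fixer-tick","score":0}',
  '{"skill":"autopilot","verb":"breaker-tick","score":1}',
  '{"skill":"snappy-fix","verb":"pick","score":1}',
].join("\n");

console.log(meansByVerb(parseEvalRows(sampleEvals), "autopilot"));
```

If scores are binary (0 or 1), the reported fixer-tick mean of 0.964 over n = 28 is consistent with 27 passing runs, since 27/28 ≈ 0.964.
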
Plus snappy-fix.verb=pick rows (not the autopilot cron — operator-invoked): recent mean is 1.0 across 15+ runs; rows repeatedly cite skill-silent-meeting-triage, stale-pull-blocked, gateway-drift-batch — which matches the top recurring-area signals in pid-trends.ndjson:
| Area | Recurring rows |
|---|---|
| meeting-followup | 36 |
| stale-pull-blocked | 34 |
| typecheck-broken | 18 |
| smoke-ubuntu | 18 |
| linkedin-post-scraper | 18 |
| cli-phantom-verbs | 18 |
These are the areas where the fixer is closing rows that keep re-opening — the exact symptom A4 is supposed to catch. A4 is not wired.
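
For reference, the measurement a re-test gate needs is small: compare the mean score before a rewrite with the mean after it. The sketch below is one way to express that. It is not the logic in regen.sh, whose contents this audit does not reproduce, and the row shape and the returned delta record are hypothetical.

```typescript
// Hypothetical shapes: neither the eval timestamp field nor the
// regen-deltas.ndjson schema is documented in this audit.
interface ScoredRun {
  ts: string; // ISO-8601 timestamp of the eval row
  score: number;
}

interface RegenDelta {
  before: number | null;
  after: number | null;
  delta: number | null; // after minus before; null if either side is empty
}

function meanOrNull(values: number[]): number | null {
  if (values.length === 0) return null;
  let sum = 0;
  for (const v of values) sum += v;
  return sum / values.length;
}

// Split runs at the rewrite timestamp and compare the two means.
function regenDelta(runs: ScoredRun[], rewriteTs: string): RegenDelta {
  const cut = Date.parse(rewriteTs);
  const before = meanOrNull(runs.filter((r) => Date.parse(r.ts) < cut).map((r) => r.score));
  const after = meanOrNull(runs.filter((r) => Date.parse(r.ts) >= cut).map((r) => r.score));
  const delta = before !== null && after !== null ? after - before : null;
  return { before, after, delta };
}

// Made-up runs: two before the rewrite, two after.
const sampleRuns: ScoredRun[] = [
  { ts: "2026-04-18T01:00:00Z", score: 0 },
  { ts: "2026-04-18T02:00:00Z", score: 1 },
  { ts: "2026-04-18T04:00:00Z", score: 1 },
  { ts: "2026-04-18T05:00:00Z", score: 1 },
];

console.log(regenDelta(sampleRuns, "2026-04-18T03:00:00Z"));
// { before: 0.5, after: 1, delta: 0.5 }
```

A positive delta suggests the rewrite helped. A null delta means there are not yet runs on both sides of the rewrite, so nothing can be concluded.
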
Cron reality check
crontab -l | grep autopilot:
# PAUSED 2026-04-18: 0,30 * * * * /Users/robertboulos/projects/snappy-os/state/bin/autopilot/break.sh >> /Users/robertboulos/.claude/logs/snappy-os-breaker.log 2>&1
# PAUSED 2026-04-18: 15,45 * * * * /Users/robertboulos/projects/snappy-os/state/bin/autopilot/fix.sh >> /Users/robertboulos/.claude/logs/snappy-os-fixer.log 2>&1
Both autopilot crons are commented out as of today. The 34 breaker-tick and 28 fixer-tick eval rows are from before the pause (latest fixer-tick: 2026-04-18T05:16:54Z; latest breaker-tick: 2026-04-18T06:09:52Z). Nothing about autopilot has run autonomously since ~06:09 UTC today.
No regen.sh crontab entry has ever existed (grep regen in crontab: 0 hits).
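
The pause check above can be automated by classifying crontab lines. The sketch below takes the text of crontab -l as a string and reports which autopilot scripts are paused. The "# PAUSED <date>:" prefix is taken from the two lines shown; the function and type names are hypothetical.

```typescript
interface CronEntry {
  script: string; // e.g. "break.sh"
  paused: boolean;
}

// Matches the "# PAUSED <date>:" prefix seen in the crontab output above.
const PAUSED_PREFIX = /^#\s*PAUSED\s+\d{4}-\d{2}-\d{2}:\s*/;

// Return one entry per crontab line that invokes a script under /autopilot/.
function autopilotEntries(crontab: string): CronEntry[] {
  const entries: CronEntry[] = [];
  for (const raw of crontab.split("\n")) {
    const line = raw.trim();
    const paused = PAUSED_PREFIX.test(line);
    const body = line.replace(PAUSED_PREFIX, "");
    // Skip blank lines and ordinary comments.
    if (body === "" || body.charAt(0) === "#") continue;
    const match = body.match(/\/autopilot\/([\w.-]+\.sh)/);
    const script = match ? match[1] : undefined;
    if (script) entries.push({ script, paused });
  }
  return entries;
}

// The two lines from the live crontab, as quoted in this audit.
const liveCrontab = [
  "# PAUSED 2026-04-18: 0,30 * * * * /Users/robertboulos/projects/snappy-os/state/bin/autopilot/break.sh >> /Users/robertboulos/.claude/logs/snappy-os-breaker.log 2>&1",
  "# PAUSED 2026-04-18: 15,45 * * * * /Users/robertboulos/projects/snappy-os/state/bin/autopilot/fix.sh >> /Users/robertboulos/.claude/logs/snappy-os-fixer.log 2>&1",
].join("\n");

console.log(autopilotEntries(liveCrontab));
// [ { script: "break.sh", paused: true }, { script: "fix.sh", paused: true } ]
```

An empty result for regen.sh is expected, because no crontab line references it.
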
Documentation score
End-to-end docs live in four places:
- state/wiki/pid-loop.md: quorum / cross-machine PID (phase 7).
- state/wiki/autopilot.md: goal-file loop (the older single-cron model).
- state/skills/snappy-autopilot-twocron.md: breaker/fixer split (the current design).
- state/skills/pid-regen.md: detect/drain/dispatch brief flow.
Grade against Robert's 5 criteria:
| Criterion | Covered? | Where |
|---|---|---|
| (a) Stages | partial | snappy-autopilot-twocron.md covers breaker/fixer; no single page covers regen + re-test. regen.sh has header prose only. |
| (b) Triggers | partial | autopilot.md lists wake_on; pid-regen.md says "Stop hook". Crontab entries live in snappy-autopilot-twocron.md. |
| (c) How to tune thresholds | missing | pid-detect.ts rules (fails >= 3, runs >= 5 && mean <= 0.75, zeros >= 2) are documented in pid-regen.md but no page explains how to change them. They are hardcoded in pid-detect.ts. |
| (d) How to pause | yes | state/engaged.json recipe gate; documented in autopilot.md:100, snappy-autopilot-twocron.md:79. |
| (e) What logs go where | partial | evals.ndjson, breakage-report.ndjson, frictions.ndjson covered. regen-deltas.ndjson is specced in regen.sh header but never produced, and pid-trends.ndjson is never mentioned in any doc prose. |
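
For criterion (c), one way to make the thresholds tunable is to lift them into a single config object with overrides. The sketch below encodes the three rules quoted from pid-regen.md (fails >= 3, runs >= 5 && mean <= 0.75, zeros >= 2). It is not the code in pid-detect.ts. The field names, rule labels, and override mechanism are assumptions, and the stats are made up.

```typescript
// Hypothetical config shape; the defaults are the values quoted in pid-regen.md.
interface Thresholds {
  minFails: number;
  minRuns: number;
  maxMean: number;
  minZeros: number;
}

const DEFAULT_THRESHOLDS: Thresholds = { minFails: 3, minRuns: 5, maxMean: 0.75, minZeros: 2 };

// Hypothetical per-skill stats; the real input to pid-detect.ts is not shown in this audit.
interface SkillStats {
  fails: number;
  runs: number;
  mean: number;
  zeros: number;
}

// Return the labels of every rule that fires. The labels are illustrative.
function triggeredRules(stats: SkillStats, overrides: Partial<Thresholds> = {}): string[] {
  const t: Thresholds = { ...DEFAULT_THRESHOLDS, ...overrides };
  const hits: string[] = [];
  if (stats.fails >= t.minFails) hits.push("fails");
  if (stats.runs >= t.minRuns && stats.mean <= t.maxMean) hits.push("low-mean");
  if (stats.zeros >= t.minZeros) hits.push("zeros");
  return hits;
}

// Made-up stats for illustration only.
const sampleStats: SkillStats = { fails: 1, runs: 6, mean: 0.7, zeros: 1 };

console.log(triggeredRules(sampleStats)); // [ "low-mean" ]
console.log(triggeredRules(sampleStats, { maxMean: 0.6 })); // []
```

With the defaults the low-mean rule fires. Tightening maxMean to 0.6 silences it, which is the kind of change a "how to tune" page would need to describe.
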
Options surface — cli.js autopilot
From node bin/cli.js autopilot:
cli.js autopilot [--status|--break|--fix|--tick] twocron detection/remediation wrapper
| Flag | Action | File |
|---|---|---|
| --status | Prints open row counts by severity from breakage-report.ndjson. Observed: p0=0 p1=7 p2=3 (total 10) today. | bin/cli.js:1008-1017 |
| --break | Runs state/bin/autopilot/break.sh once. | bin/cli.js:54,1013 |
| --fix | Runs state/bin/autopilot/fix.sh once. | bin/cli.js:55,1013 |
| --tick | Runs break then fix sequentially. | bin/cli.js:1022-1028 |
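
The --status summary in the table is a count of open rows per severity. A minimal sketch of that tally follows. It is not the code at bin/cli.js:1008-1017. The row fields severity and open are assumptions about breakage-report.ndjson, and the sample rows are made up.

```typescript
type Severity = "p0" | "p1" | "p2";

// Hypothetical row shape for breakage-report.ndjson.
interface BreakageRow {
  severity: Severity;
  open: boolean;
}

// Count open rows per severity and format them like the observed output.
function statusLine(rows: BreakageRow[]): string {
  const counts: Record<Severity, number> = { p0: 0, p1: 0, p2: 0 };
  for (const row of rows) {
    if (row.open) counts[row.severity] += 1;
  }
  const total = counts.p0 + counts.p1 + counts.p2;
  return `p0=${counts.p0} p1=${counts.p1} p2=${counts.p2} (total ${total})`;
}

// Made-up rows for illustration only.
const sampleBreakage: BreakageRow[] = [
  { severity: "p1", open: true },
  { severity: "p1", open: true },
  { severity: "p1", open: false },
  { severity: "p2", open: true },
];

console.log(statusLine(sampleBreakage)); // p0=0 p1=2 p2=1 (total 3)
```
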
Hidden levers (not in --help, but real):
- SNAPPY_REGEN_MIN_INTERVAL env (auto-regen.sh:42): rate-limit seconds, default 900 (15 min).
- state/engaged.json: engage/disengage autopilot / pid-loop recipes.
- pid-detect.ts --stats / pid-detect.ts --dry (per pid-regen.md:80).
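
The SNAPPY_REGEN_MIN_INTERVAL lever is a rate limit in seconds with a default of 900. The sketch below shows one plausible reading of such a gate: skip the run if the last one finished less than the interval ago. It assumes that interpretation and is not the logic at auto-regen.sh:42. The function names and the fallback for invalid values are hypothetical.

```typescript
const DEFAULT_MIN_INTERVAL_S = 900; // 15 minutes, the documented default

// Resolve the interval from an env-style string, falling back to the default
// when the value is unset, blank, negative, or not a number (assumed behavior).
function minIntervalSeconds(envValue: string | undefined): number {
  if (envValue === undefined || envValue.trim() === "") return DEFAULT_MIN_INTERVAL_S;
  const parsed = Number(envValue);
  return isFinite(parsed) && parsed >= 0 ? parsed : DEFAULT_MIN_INTERVAL_S;
}

// True when enough time has passed since the last run, or there was no last run.
function shouldRun(lastRunEpochS: number | null, nowEpochS: number, minIntervalS: number): boolean {
  if (lastRunEpochS === null) return true;
  return nowEpochS - lastRunEpochS >= minIntervalS;
}

const intervalS = minIntervalSeconds(undefined); // 900
console.log(shouldRun(1000, 1000 + 600, intervalS)); // false: only 10 minutes elapsed
console.log(shouldRun(1000, 1000 + 900, intervalS)); // true: exactly 15 minutes elapsed
console.log(shouldRun(1000, 1000 + 600, minIntervalSeconds("300"))); // true: interval lowered to 5 minutes
```
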
Gaps in the options surface:
- No cli.js autopilot --regen <skill> passthrough for regen.sh, even though the script exists and takes a single arg.
- No way to see --status for regen deltas (because the deltas file doesn't exist).
- No --help page describes the engagement gate or SNAPPY_REGEN_MIN_INTERVAL.
Gaps (concrete list)
1. A4 re-test gate is orphan. state/bin/autopilot/regen.sh is coded but called by nothing. Neither bin/cli.js, state/bin/auto-regen.sh, state/bin/autopilot/fix.sh, nor crontab -l references it. Consequence: the loop cannot measure whether a rewrite improved a skill.
2. state/log/regen-deltas.ndjson does not exist. Direct consequence of #1. The PID-loop-closure artifact is unobservable.
3. Autopilot crons are paused. Breaker + fixer entries in crontab -l are commented with # PAUSED 2026-04-18:. Nothing will run autonomously until those lines are uncommented.
4. No canonical end-to-end doc. Four pages each cover one slice; no page shows breaker → fixer → detect → drain → dispatch → regen → deltas as one diagram.
5. Thresholds are hardcoded, not tunable via docs. The pid-detect.ts rules (fails >= 3, runs >= 5 && mean <= 0.75, zeros >= 2) are prose-documented in pid-regen.md but there is no "how to change these" page.
6. pid-trends.ndjson is undocumented. 230 rows on disk, 15 fresh this run, consumed by fix.sh A4a, but no wiki page explains the schema.
Honesty section
- A1 (quota guard): wired and tested; see the fixer-quota grep pattern.
- A2 (already-fixed precheck): code is wired, **but zero fixer-precheck eval rows have ever been written** (evals.ndjson contains 0). Either the precheck condition never fired before the crons were paused, or the precheck fires silently. Worth a probe when crons resume.
- A3 (self-rewrite loader): wired. No direct receipts (would require a git-log search for fixer: self-write pitfall commits).
- A4a (recurring detect): wired; IS_RECURRING_CODE logic present.
- A4b/c (autofix branch): wired; branch-push logic present.
- A4 proper (re-test gate): SPEC-ONLY. regen.sh exists, is well-commented, and has zero callers. regen-deltas.ndjson does not exist on disk.
One-line verdict: PID loop is three-quarters wired. The part that closes the loop (measure whether the rewrite helped) is spec-only, and the part that runs autonomously (cron) is paused.