PID audit — 2026-04-18

Executive verdict

PID is PARTIALLY working. Breaker + fixer + precheck stages are wired, coded, and have recent eval rows. The loop-closure stage (re-test / A4) is spec-only: regen.sh exists as a script but has zero callers, is not cron-wired, and state/log/regen-deltas.ndjson does not exist. On top of that, the breaker and fixer crontab entries are currently PAUSED (commented out with a "# PAUSED 2026-04-18:" prefix in the live crontab), so even the stages that work are not running autonomously right now.


Stages present vs. planned

The plan (plans/read-the-plan-nifty-wolf.md) names six A-primitives for the PID loop. Here is the receipts column next to each:

| # | Primitive | Script | Wired? | Observed evidence |
|---|---|---|---|---|
| A1 | Quota-aware exit | fix.sh:184-195 | yes | grep fixer-quota hits in fix.sh, early-exit path intact |
| A2 | Already-fixed precheck | fix.sh:32-75 | yes in code, 0 runs | grep fixer-precheck pattern in fix.sh is live, but evals.ndjson contains zero fixer-precheck rows. Either cron never fires with an eligible row, or cron is paused (see below). |
| A3 | Self-rewrite loader | fix.sh:209-252 | yes | Appends Known Pitfall to <area>.agents.md, commits, pushes. Skips already-fixed / quota-reset-needed. |
| A4a | Recurring-code detect | fix.sh:77-103 | yes | Reads state/log/pid-trends.ndjson, flags IS_RECURRING_CODE=1 when area has recurring-area signal <1h old. |
| A4b/c | Autofix branch push | fix.sh:143-172, 254-303 | yes | Branches autofix/<area>-<ts>, logs to state/log/notify.ndjson. |
| A4 | Re-test gate | regen.sh | ORPHAN (spec only) | NO CALLER. Script at state/bin/autopilot/regen.sh:1-123. grep -rn for regen.sh in repo returns only README + wiki docs + self. auto-regen.sh (the Stop hook) dispatches a rewriter but does not call regen.sh. crontab -l contains 0 references. state/log/regen-deltas.ndjson does not exist. The loop never measures whether the rewrite helped. |
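
To make the missing re-test step concrete, here is a minimal sketch of what "measure whether the rewrite helped" could look like: compare eval scores from before and after a rewrite and emit one row per comparison. The row shape, the `regenDelta` name, and the `epsilon` tolerance are all hypothetical; this is not the schema specced in the regen.sh header, which this audit did not reproduce.

```typescript
// Sketch of a re-test gate: compare eval scores before and after a rewrite.
// Assumptions: scores are numbers in [0, 1]; the RegenDelta shape is
// hypothetical and is NOT the schema described in the regen.sh header.
interface RegenDelta {
  skill: string;
  meanBefore: number;
  meanAfter: number;
  delta: number;
  verdict: "improved" | "regressed" | "unchanged";
}

function mean(xs: number[]): number {
  if (xs.length === 0) throw new Error("mean of empty sample");
  return xs.reduce((a, b) => a + b, 0) / xs.length;
}

function regenDelta(
  skill: string,
  before: number[],
  after: number[],
  epsilon = 0.05, // hypothetical tolerance: smaller moves count as noise
): RegenDelta {
  const meanBefore = mean(before);
  const meanAfter = mean(after);
  const delta = meanAfter - meanBefore;
  const verdict: RegenDelta["verdict"] =
    delta > epsilon ? "improved" : delta < -epsilon ? "regressed" : "unchanged";
  return { skill, meanBefore, meanAfter, delta, verdict };
}

// One JSON line per re-test, in a form that could be appended to a deltas log.
const exampleRow = regenDelta("autopilot", [1, 0, 0, 1], [1, 1, 1, 0]);
console.log(JSON.stringify(exampleRow));
```

Whatever the real schema turns out to be, the point is that each rewrite produces a before/after pair; without that row the loop has no error signal to act on.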

Supporting infra:

state/log/pid-trends.ndjson holds 230 rows on disk, 15 of them fresh this run (recurring-area=5, friction-spike=7, eval-decline=3).


Observed per-verb eval means (autopilot skill)

From jq -s 'group_by(.verb)' <(jq -c 'select(.skill=="autopilot")' state/log/evals.ndjson):

[
  { "verb": "breaker-tick",  "n": 34, "mean": 1.000 },
  { "verb": "fixer-tick",    "n": 28, "mean": 0.964 },
  { "verb": "dispatch-tick", "n":  2, "mean": 0.000 },
  { "verb": "tick",          "n":  1, "mean": 1.000 }
]
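
For readers who would rather recompute these means in code than in jq, the grouping can be sketched as below. The `score` field name is an assumption (the audit does not show which field the mean is taken over), and `meansByVerb` is a hypothetical helper, not something in the repo.

```typescript
// Sketch: per-verb count and mean for one skill, from NDJSON text.
// Assumption: each row has `skill`, `verb`, and a numeric `score`;
// the real evals.ndjson field names may differ.
interface EvalRow { skill: string; verb: string; score: number; }
interface VerbMean { verb: string; n: number; mean: number; }

function parseNdjson<T>(text: string): T[] {
  return text
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line) => JSON.parse(line) as T);
}

function meansByVerb(rows: EvalRow[], skill: string): VerbMean[] {
  const acc: Record<string, { n: number; sum: number }> = {};
  for (const row of rows) {
    if (row.skill !== skill) continue;
    const cur = acc[row.verb] ?? { n: 0, sum: 0 };
    cur.n += 1;
    cur.sum += row.score;
    acc[row.verb] = cur;
  }
  // Most-run verbs first, matching the ordering of the table above.
  return Object.keys(acc)
    .map((verb) => ({ verb, n: acc[verb].n, mean: acc[verb].sum / acc[verb].n }))
    .sort((a, b) => b.n - a.n);
}

const evalSample = [
  '{"skill":"autopilot","verb":"fixer-tick","score":1}',
  '{"skill":"autopilot","verb":"fixer-tick","score":0}',
  '{"skill":"autopilot","verb":"breaker-tick","score":1}',
  '{"skill":"snappy-fix","verb":"pick","score":1}',
].join("\n");

console.log(meansByVerb(parseNdjson<EvalRow>(evalSample), "autopilot"));
```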

Plus snappy-fix.verb=pick rows (not the autopilot cron — operator-invoked): recent mean is 1.0 across 15+ runs; rows repeatedly cite skill-silent-meeting-triage, stale-pull-blocked, gateway-drift-batch — which matches the top recurring-area signals in pid-trends.ndjson:

| Area | Recurring rows |
|---|---|
| meeting-followup | 36 |
| stale-pull-blocked | 34 |
| typecheck-broken | 18 |
| smoke-ubuntu | 18 |
| linkedin-post-scraper | 18 |
| cli-phantom-verbs | 18 |

These are the areas where the fixer is closing rows that keep re-opening — the exact symptom A4 is supposed to catch. A4 is not wired.
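
The recurring-area check that fix.sh performs in A4a (flag an area when it has a recurring-area signal less than one hour old) can be sketched as follows. The row fields `signal`, `area`, and `ts` are assumptions, since no doc describes the pid-trends.ndjson schema, and `isRecurring` is a hypothetical name.

```typescript
// Sketch of the A4a freshness check described in this audit.
// Assumption: trend rows carry `signal`, `area`, and an ISO-8601 `ts`;
// the real pid-trends.ndjson schema is undocumented.
interface TrendRow { signal: string; area: string; ts: string; }

const ONE_HOUR_MS = 60 * 60 * 1000;

function isRecurring(
  rows: TrendRow[],
  area: string,
  now: Date,
  maxAgeMs: number = ONE_HOUR_MS,
): boolean {
  return rows.some((row) => {
    if (row.signal !== "recurring-area" || row.area !== area) return false;
    // Unparseable timestamps yield NaN, which fails both comparisons.
    const age = now.getTime() - Date.parse(row.ts);
    return age >= 0 && age < maxAgeMs;
  });
}

const trendSample: TrendRow[] = [
  { signal: "recurring-area", area: "meeting-followup", ts: "2026-04-18T05:30:00Z" },
  { signal: "recurring-area", area: "stale-pull-blocked", ts: "2026-04-18T03:00:00Z" },
  { signal: "friction-spike", area: "typecheck-broken", ts: "2026-04-18T05:59:00Z" },
];
const checkTime = new Date("2026-04-18T06:00:00Z");

console.log(isRecurring(trendSample, "meeting-followup", checkTime));
console.log(isRecurring(trendSample, "stale-pull-blocked", checkTime));
```

Note that this check only detects recurrence. It says nothing about whether a later rewrite reduced it, which is the part the re-test gate would have to supply.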


Cron reality check

crontab -l | grep autopilot:

# PAUSED 2026-04-18: 0,30 * * * * /Users/robertboulos/projects/snappy-os/state/bin/autopilot/break.sh >> /Users/robertboulos/.claude/logs/snappy-os-breaker.log 2>&1
# PAUSED 2026-04-18: 15,45 * * * * /Users/robertboulos/projects/snappy-os/state/bin/autopilot/fix.sh   >> /Users/robertboulos/.claude/logs/snappy-os-fixer.log 2>&1

Both autopilot crons are commented out as of today. The 34 breaker-tick and 28 fixer-tick eval rows are from before the pause (latest fixer-tick: 2026-04-18T05:16:54Z; latest breaker-tick: 2026-04-18T06:09:52Z). Nothing about autopilot has run autonomously since ~06:09 UTC today.
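
A pause check like the one done by eye above could be automated by parsing the crontab text. The sketch below assumes paused entries follow the "# PAUSED <date>: <entry>" convention seen in this crontab; `parseCrontab` is a hypothetical helper and the sample paths are placeholders, not the real install paths.

```typescript
// Sketch: classify crontab lines as active or paused.
// Assumption: paused entries use the "# PAUSED <date>: <entry>" convention;
// any other comment line is ignored.
interface CronEntry {
  schedule: string;
  script: string;
  paused: boolean;
  pausedOn?: string;
}

const PAUSED_RE = /^#\s*PAUSED\s+(\d{4}-\d{2}-\d{2}):\s*(.*)$/;

function parseCrontab(text: string): CronEntry[] {
  const entries: CronEntry[] = [];
  for (const raw of text.split("\n")) {
    const line = raw.trim();
    if (line.length === 0) continue;
    const m = PAUSED_RE.exec(line);
    if (m === null && line.charAt(0) === "#") continue; // ordinary comment
    const body = m !== null ? m[2] : line;
    const fields = body.split(/\s+/);
    if (fields.length < 6) continue; // five schedule fields plus a command
    entries.push({
      schedule: fields.slice(0, 5).join(" "),
      script: fields[5],
      paused: m !== null,
      pausedOn: m !== null ? m[1] : undefined,
    });
  }
  return entries;
}

// Placeholder paths for illustration only.
const cronSample = [
  "# PAUSED 2026-04-18: 0,30 * * * * /opt/example/break.sh >> /tmp/breaker.log 2>&1",
  "15,45 * * * * /opt/example/fix.sh >> /tmp/fixer.log 2>&1",
  "# an ordinary comment",
].join("\n");

for (const entry of parseCrontab(cronSample)) {
  console.log(entry.script, entry.paused ? "paused since " + entry.pausedOn : "active");
}
```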

No regen.sh crontab entry has ever existed (grep regen in crontab: 0 hits).


Documentation score

End-to-end docs live in four places:

  1. state/wiki/pid-loop.md: quorum / cross-machine PID (phase 7).
  2. state/wiki/autopilot.md: goal-file loop (the older single-cron model).
  3. state/skills/snappy-autopilot-twocron.md: breaker/fixer split (the current design).
  4. state/skills/pid-regen.md: detect/drain/dispatch brief flow.

Grade against Robert's 5 criteria:

| Criterion | Covered? | Where |
|---|---|---|
| (a) Stages | partial | snappy-autopilot-twocron.md covers breaker/fixer; no single page covers regen + re-test. regen.sh has header prose only. |
| (b) Triggers | partial | autopilot.md lists wake_on; pid-regen.md says "Stop hook". Crontab entries live in snappy-autopilot-twocron.md. |
| (c) How to tune thresholds | missing | pid-detect.ts rules (fails >= 3, runs >= 5 && mean <= 0.75, zeros >= 2) are documented in pid-regen.md but no page explains how to change them. They are hardcoded in pid-detect.ts. |
| (d) How to pause | yes | state/engaged.json recipe gate; documented in autopilot.md:100, snappy-autopilot-twocron.md:79. |
| (e) What logs go where | partial | evals.ndjson, breakage-report.ndjson, frictions.ndjson covered. regen-deltas.ndjson is specced in regen.sh header but never produced, and pid-trends.ndjson is never mentioned in any doc prose. |
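
On criterion (c), one way the hardcoded thresholds could become tunable is to lift them into a config object with overridable defaults. The sketch below is an assumption about shape only: the rule labels, the `AreaStats` fields, and `firedRules` are hypothetical, and only the numeric defaults (3, 5, 0.75, 2) come from the documented rules.

```typescript
// Sketch: the three detect rules with thresholds lifted into a config object.
// Assumption: rule labels and the stats shape are hypothetical; only the
// numeric defaults are taken from the rules documented in pid-regen.md.
interface Thresholds {
  minFails: number;
  minRuns: number;
  maxMean: number;
  minZeros: number;
}

interface AreaStats { fails: number; runs: number; mean: number; zeros: number; }

const DEFAULT_THRESHOLDS: Thresholds = { minFails: 3, minRuns: 5, maxMean: 0.75, minZeros: 2 };

function firedRules(stats: AreaStats, overrides: Partial<Thresholds> = {}): string[] {
  const t: Thresholds = { ...DEFAULT_THRESHOLDS, ...overrides };
  const fired: string[] = [];
  if (stats.fails >= t.minFails) fired.push("fails");
  if (stats.runs >= t.minRuns && stats.mean <= t.maxMean) fired.push("low-mean");
  if (stats.zeros >= t.minZeros) fired.push("zeros");
  return fired;
}

const areaStats: AreaStats = { fails: 0, runs: 5, mean: 0.75, zeros: 2 };
console.log(firedRules(areaStats));                   // defaults
console.log(firedRules(areaStats, { maxMean: 0.5 })); // stricter mean cutoff
```

With something like this in place, a "how to tune" page would reduce to documenting where the overrides are read from.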

Options surface — cli.js autopilot

From node bin/cli.js autopilot:

cli.js autopilot [--status|--break|--fix|--tick]  twocron detection/remediation wrapper
| Flag | Action | File |
|---|---|---|
| --status | Prints open row counts by severity from breakage-report.ndjson. Observed: p0=0 p1=7 p2=3 (total 10) today. | bin/cli.js:1008-1017 |
| --break | Runs state/bin/autopilot/break.sh once. | bin/cli.js:54,1013 |
| --fix | Runs state/bin/autopilot/fix.sh once. | bin/cli.js:55,1013 |
| --tick | Runs break then fix sequentially. | bin/cli.js:1022-1028 |
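
As an illustration of what the --status summary involves, the sketch below counts open rows by severity and formats them in the same "p0=… p1=… p2=… (total …)" style. The `severity` and `status` field names are assumptions about breakage-report.ndjson, and `openCounts` / `formatStatus` are hypothetical names, not the functions in bin/cli.js.

```typescript
// Sketch: count open rows by severity, in the style of the --status output.
// Assumption: rows have `severity` ("p0" | "p1" | "p2") and `status` fields;
// the real breakage-report.ndjson schema may differ.
type Severity = "p0" | "p1" | "p2";
interface BreakageRow { severity: string; status: string; }
interface StatusCounts { p0: number; p1: number; p2: number; total: number; }

function isSeverity(s: string): s is Severity {
  return s === "p0" || s === "p1" || s === "p2";
}

function openCounts(ndjson: string): StatusCounts {
  const counts: StatusCounts = { p0: 0, p1: 0, p2: 0, total: 0 };
  for (const raw of ndjson.split("\n")) {
    const line = raw.trim();
    if (line.length === 0) continue;
    const row = JSON.parse(line) as BreakageRow;
    if (row.status !== "open" || !isSeverity(row.severity)) continue;
    counts[row.severity] += 1;
    counts.total += 1;
  }
  return counts;
}

function formatStatus(c: StatusCounts): string {
  return "p0=" + c.p0 + " p1=" + c.p1 + " p2=" + c.p2 + " (total " + c.total + ")";
}

const breakageSample = [
  '{"severity":"p1","status":"open"}',
  '{"severity":"p1","status":"open"}',
  '{"severity":"p2","status":"open"}',
  '{"severity":"p0","status":"closed"}',
].join("\n");

console.log(formatStatus(openCounts(breakageSample)));
```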

Hidden levers (not in --help, but real):

default 900 (15 min).

Gaps in the options surface:

though the script exists and takes a single arg.

exist).


Gaps (concrete list)

  1. A4 re-test gate is orphan. state/bin/autopilot/regen.sh is coded but called by nothing. Neither bin/cli.js, state/bin/auto-regen.sh, state/bin/autopilot/fix.sh, nor crontab -l references it. Consequence: the loop cannot measure whether a rewrite improved a skill.

  2. state/log/regen-deltas.ndjson does not exist. Direct consequence of #1. The PID-loop-closure artifact is unobservable.

  3. Autopilot crons are paused. Breaker + fixer entries in crontab -l are commented with # PAUSED 2026-04-18:. Nothing will run autonomously until those lines are uncommented.

  4. No canonical end-to-end doc. Four pages each cover one slice; no page shows breaker → fixer → detect → drain → dispatch → regen → deltas as one diagram.

  5. Thresholds are hardcoded, not tunable via docs. The pid-detect.ts rules (fails >= 3, runs >= 5 && mean <= 0.75, zeros >= 2) are prose-documented in pid-regen.md but there is no "how to change these" page.

  6. pid-trends.ndjson is undocumented. 230 rows on disk, 15 fresh this run, consumed by fix.sh A4a, but no wiki page explains the schema.


Honesty section

No fixer-precheck eval rows have ever been written (evals.ndjson contains 0). Either the precheck condition never fired before the crons were paused, or the precheck fires silently. Worth a probe when crons resume.

a git-log search for fixer: self-write pitfall commits).

regen.sh, though well-commented, has zero callers. regen-deltas.ndjson does not exist on disk.

One-line verdict: PID loop is three-quarters wired. The part that closes the loop (measure whether the rewrite helped) is spec-only, and the part that runs autonomously (cron) is paused.