PID loop

What this layer does

Phase 7 wires the per-machine PID loop (already verified working in Phase 0) into the cross-machine substrate. Per-machine evals stay local. An anonymized aggregate ships through the Worker. Quorum promotion runs as a Worker scheduled handler — when ≥3 distinct tenants score a staged rev ≥0.85 across ≥5 runs, the rev promotes to public canonical without manual review.

Files involved

<skill>.ready markers and dispatches regen subagents.

fetchEvals(); writes briefs to state/log/regen-queue/.

state/log/evals.ndjson into state/log/evals.aggregate.ndjson per the locked schema. Runs at the end of push --auto.

<skill>.<machine_id>.md to avoid cross-machine collisions.

handler firing every minute.

Aggregate row schema (locked v1)

type AggRow = {
  _v: 1;
  ts: string;          // ISO-8601
  skill: string;       // normalized from `skill` or legacy `verb`
  score: number;       // 0..1
  run_id: string;      // opaque
  tenant: string;      // sha256(SNAPPY_MASTER_KEY).first(12)
  cost_usd_cents?: number;  // rounded to nearest cent; sub-cent rounds up to 1
  ok: boolean;
};

Aggregator drops every other field. state/lint/aggregate-schema.ts fails on unknown keys.
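
The projection and the unknown-key check can be sketched as below. This is a minimal illustration under assumptions, not the actual aggregator or lint code: the function names (toCents, toAggRow, unknownKeys) and the raw field cost_usd are hypothetical, and the tenant id is passed in already derived.

```typescript
type AggRow = {
  _v: 1;
  ts: string;
  skill: string;
  score: number;
  run_id: string;
  tenant: string;
  cost_usd_cents?: number;
  ok: boolean;
};

// Keys allowed by the locked v1 schema.
const AGG_KEYS = ["_v", "ts", "skill", "score", "run_id", "tenant", "cost_usd_cents", "ok"];

// Round to the nearest whole cent; any positive sub-cent amount becomes 1.
function toCents(usd: number): number {
  if (usd <= 0) return 0;
  return Math.max(1, Math.round(usd * 100));
}

// Build an AggRow from a raw eval row, copying only schema fields.
// Assumption: raw rows carry cost as a `cost_usd` number (hypothetical name).
function toAggRow(raw: Record<string, unknown>, tenant: string): AggRow | null {
  const { ts, score, run_id, ok, cost_usd } = raw;
  // Normalize from `skill`, falling back to the legacy `verb` field.
  const skill =
    typeof raw["skill"] === "string" ? (raw["skill"] as string)
    : typeof raw["verb"] === "string" ? (raw["verb"] as string)
    : null;
  if (skill === null) return null;
  if (typeof ts !== "string" || typeof run_id !== "string") return null;
  if (typeof score !== "number" || typeof ok !== "boolean") return null;
  const row: AggRow = { _v: 1, ts, skill, score, run_id, tenant, ok };
  if (typeof cost_usd === "number") row.cost_usd_cents = toCents(cost_usd);
  return row;
}

// Return any keys the schema does not allow; a lint would fail if this is non-empty.
function unknownKeys(row: object): string[] {
  return Object.keys(row).filter((k) => AGG_KEYS.indexOf(k) === -1);
}
```

Because toAggRow builds the row from an explicit field list rather than copying the raw object, any machine-specific field on the raw row is dropped by construction.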

What syncs vs what doesn't

File | Direction | Why
state/log/evals.ndjson | NEVER syncs | per-machine ephemeral; in SYNC_DENY
state/log/evals.aggregate.ndjson | both ways | anonymized PID signal; in SYNC_ALLOW
state/log/regen-queue/ | both ways | briefs flow between tenants
state/skills/<name>.md | both ways | PID-rewritten skill pages flow through standard sync
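
One way to express these rules as a path check is sketched below. The names SYNC_DENY and SYNC_ALLOW come from this document, but their representation as regular-expression lists, the shouldSync helper, and the deny-by-default behavior for unlisted paths are all assumptions made for illustration.

```typescript
// Assumed representation: the real SYNC_DENY / SYNC_ALLOW format is not shown in this doc.
const SYNC_DENY: RegExp[] = [
  /^state\/log\/evals\.ndjson$/, // per-machine raw evals never leave the machine
];

const SYNC_ALLOW: RegExp[] = [
  /^state\/log\/evals\.aggregate\.ndjson$/, // anonymized PID signal
  /^state\/log\/regen-queue\/.+/,           // briefs
  /^state\/skills\/[^/]+\.md$/,             // skill pages
];

// Deny rules win over allow rules; unlisted paths do not sync (assumption).
function shouldSync(path: string): boolean {
  if (SYNC_DENY.some((re) => re.test(path))) return false;
  return SYNC_ALLOW.some((re) => re.test(path));
}
```

Anchoring each pattern with ^ and $ keeps evals.ndjson and evals.aggregate.ndjson from matching each other's rules.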

Quorum promotion

  1. PID rewrite produced on tenant T. T pushes the rev to s3://robert-storage/snappy-os-staging/<skill>/<rev_id>/.

  2. Worker scheduled handler reads state/log/evals.aggregate.ndjson for any staged rev_ids.

  3. When ≥3 distinct tenants have scored <skill>:<rev_id> ≥0.85 across ≥5 total runs → promote. Copy snappy-os-staging/<skill>/<rev_id>/ → snappy-os/skills/<skill>/. Write changelog row. Invalidate KV.

  4. Robert override: any rev tagged manual_review:robert in frontmatter bypasses quorum and stays gated until Robert approves via /snappy-ops.
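
The promotion decision in the steps above can be sketched as a pure function. This is an illustration under stated assumptions, not the Worker's actual handler. The v1 aggregate schema has no rev_id field, so the ScoredRun type below assumes each run has already been associated with a rev by some means this document does not describe. It also reads the rule as "count only runs scoring ≥0.85, then require ≥3 distinct tenants and ≥5 distinct runs among them", which is one plausible interpretation.

```typescript
// Hypothetical shape: an aggregate row already tied to a staged rev.
type ScoredRun = {
  skill: string;
  rev_id: string;
  tenant: string;
  run_id: string;
  score: number;
};

const MIN_TENANTS = 3;
const MIN_SCORE = 0.85;
const MIN_RUNS = 5;

// Decide whether the runs for one <skill>:<rev_id> meet quorum.
// A rev tagged manual_review:robert never promotes automatically.
function quorumReached(runs: ScoredRun[], manualReview: boolean = false): boolean {
  if (manualReview) return false;
  const passing = runs.filter((r) => r.score >= MIN_SCORE);
  const tenants: Record<string, true> = {};
  const runIds: Record<string, true> = {};
  for (const r of passing) {
    tenants[r.tenant] = true;
    runIds[r.run_id] = true; // de-duplicate repeated pushes of the same run
  }
  return (
    Object.keys(tenants).length >= MIN_TENANTS &&
    Object.keys(runIds).length >= MIN_RUNS
  );
}
```

Keeping the check free of I/O means the thresholds can be exercised in isolation before the copy, changelog write, and KV invalidation are wired around it.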

Operational gotchas

pid-aggregate.ts push entirely.

proposing simultaneous rewrites from clobbering each other.

evals.aggregate.ndjson which other tenants continue to push.

tight; never emit fractional cents.

would expose run_ids tied to specific machines.

How to verify it's working

brief; on the next push, evals.aggregate.ndjson gains a row with the new run.

last_quorum_promotion updating when ≥3 tenants converge.

within 60s of the third tenant's push.