
dogfood-loop

Has your assistant test your app like a real user and fix what falls short.
personal 2 files 10 recent evals



How to get it

These run inside the Snappy workspace. Want this working in your business? I set skills like this up with you, in one focused week.

Work with me
For developers how this skill is built, graded, and how it runs

at a glance- the short version

actor: Dogfood-running subagent
auditor: Next agent (or Robert)
eval mode: auto-shape
stages: 1

what's inside - the parts that make up a skill 2/4 present

A skill is just a few plain-text files. Only the main one is required. The rest are optional, added as the work needs them. This is what the skill is made of; how it runs is just below.

The skill
state/skills/dogfood-loop/SKILL.md present
the skill itself, in plain text
The main file. It says what the skill is and lays out the steps in plain English.
Code
state/lib/dogfood-loop.ts not present
code the skill can run
Optional. Many skills are just words and need no code at all.
Scripts
state/bin/dogfood-loop/ not present
helper scripts
Optional. Added when a skill has a few commands to run.
Loader
state/skills/dogfood-loop/AGENTS.md present
what the AI loads on the fly
Loaded automatically the moment this skill is needed. Kept short on purpose.

how it's graded - what counts as a good run 4 criteria · 3 deterministic · 1 judge

Each row is one thing a good run has to get right. deterministic means a quick check decides, pass or fail. judge means the AI reads the result and rates it. Grading each piece on its own (instead of one overall score) shows exactly where a run fell short, so the fix is obvious.

| name | kind | check |
| --- | --- | --- |
| report_present | deterministic | /tmp/dogfood-loop-report.md exists and contains a Summary section. |
| scenarios_min_three | deterministic | Report contains at least three Scenario blocks, each with Pre, Post, Card rendered, Render quality, Action, Notes lines. |
| every_fail_has_action | judge | Every scenario marked 'fail' carries either a commit SHA in Action OR an entry appended to state/log/snappy-chat-backlog.md — no orphan failures. |
| no_external_sideeffects | deterministic | No scenario triggered a real outbound send (no LinkedIn URN, no Gmail draft, no Slack post). The dispatch backend stays text-only or generative-shape only. |

how it runs - the shared frame every skill uses 5/5 present

Every skill runs the same way. One part does the work, a separate part checks it, and a short loader hands the AI exactly what it needs for the job. Anything this skill doesn't use shows a one-line note saying why, on purpose, not by accident.

  • The worker (makes the work, present): Dogfood-running subagent. Does the actual work. Whatever it produces is what gets checked next.
  • The reviewer (checks the work, present): Next agent (or Robert). A separate checker grades the work, so the part that made it can't approve its own work.
  • Self-correction (learns, present): fixes itself, learns from gaps. When a run hits a gap, the skill gets edited on the spot [FIXED] or queued for a bigger rewrite [LOGGED], so it keeps getting better.
  • Background fixes (tidies up, present): queued for rewrite, runs in the background. Bigger fixes that can't be made on the spot get queued and rewritten in the background later.
  • Run history (remembers, present): state/log/evals.ndjson, auto-shape runs. Every run is written down here, so the next time this skill is used it already knows how the last runs went.
Critical rules the things this skill must not get wrong
No must-not-break rules called out for this skill. Anything important lives in the writeup below.

what it has learned - fixes written back in over time sample

When a run hits something this skill didn't handle, the fix gets written back into the skill so it doesn't happen again. FIXED means it was corrected on the spot. LOGGED means it's queued for a bigger rewrite. Either way, the skill gets a little better and never makes the same mistake twice.


how the work flows- who makes it, who checks it

actor Dogfood-running subagent
auditor Next agent (or Robert)

SKILL.md- the skill, written out in plain English

dogfood-loop

The dogfood loop is how snappy-chat gets sharper between releases. A subagent is BOTH a user (drives the app via the chat-drive bridge) AND a developer (commits fixes to snappy-chat when a rendered card falls below world-class). Closing this loop is what unlocks Robert's generative-UI leverage.

The deliverable IS the playbook below. There is no backing lib - the prose is the executable. Producer skills (agent-recap, bundle-snapshot, chat-drive) supply the primitives; this skill stitches them into a cadence.

The primitives stack (as of 2026-04-28)

All routed through process trees with intact TCC grants:

[ subagent in any context ]
        │
        ▼
state/lib/chat-drive.ts  ─ dispatchInChatUI(text)  ──┐
                          resetChatUI()              │
                                                     ▼
state/lib/desktop.ts     ─ captureScreen(path)      [ openclaw bridge :18790 ]
                          captureWindow(app, path)   │
                          listWindows(app?)          ▼
                                                     [ macOS shell with TCC ]
state/lib/bridge.ts      ─ bridgeExec(cmd)
                          bridgeReadFile(path)
                          bridgeWriteFile(path, c)

snappy-chat polls 127.0.0.1:3147/chat-inject-pop every 500ms. When dispatchInChatUI pushes text to /chat-inject-push, the React polling effect injects via the React-controlled-component setter dance and fires submit. resetChatUI pushes a control message that flips setHasThread(false) + setChatKey(k+1), returning the welcome surface for the next scenario.
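
The push/pop handshake above can be modelled in memory. This is a minimal sketch, assuming the queue is first-in-first-out and that a pop removes the item it returns (the loader rules later in this document describe the pop as a destructive shift). `InjectQueue`, `push`, and `pop` are hypothetical names; the real server behind /chat-inject-push and /chat-inject-pop is not shown in this document.

```typescript
// Hypothetical in-memory model of the chat-inject queue; not the real server code.
type InjectItem = { text: string };

class InjectQueue {
  private items: InjectItem[] = [];

  // Models a push to /chat-inject-push: append one intent to the queue.
  push(text: string): void {
    this.items.push({ text });
  }

  // Models a poll of /chat-inject-pop: remove and return the oldest item,
  // or null when the queue is empty.
  pop(): InjectItem | null {
    return this.items.shift() ?? null;
  }
}

const queue = new InjectQueue();
queue.push("show me what the agents did today");

// A stray manual pop (for example a debugging request) consumes the item...
const stolen = queue.pop();
// ...so the next UI poll finds nothing left to inject.
const seenByUi = queue.pop();
```

Under these assumptions, any extra reader of the pop endpoint competes with the React polling effect for the same item, which is why a fixed wait is safer than polling for queue length.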

Steps

The 7-step cadence below is automated by the sidecar at state/bin/dogfood-loop/run.ts. Drive a full pass with:

npx tsx state/bin/dogfood-loop/run.ts \
  --scenarios state/skills/dogfood-loop/scenarios/default.json \
  --out /tmp/dogfood-loop-report.md

The sidecar handles steps 1-4 + 7 (reset, pre-shot, push, post-shot, append to report) for every scenario, then exits. Steps 5 and 6 (visual grading, fix-or-log) stay manual - actor (the sidecar) captures; auditor (the agent that runs it) reads each /tmp/loop-N-post.png against the eval checklist below and decides commit-vs-backlog.

For each scenario:

  1. Reset - await resetChatUI({waitMs: 1500}). Welcome surface should remount.
  2. Pre-shot - await captureScreen('/tmp/loop-N-pre.png'). Confirms reset succeeded.
  3. Push intent - `await dispatchInChatUI(intent, {waitForFirstFrame: 14000})`.
  4. Post-shot - await captureScreen('/tmp/loop-N-post.png').
  5. Read both shots with the Read tool. Evaluate against the eval checklist below.
  6. Fix or log - if anything is below world-class, edit web/src/styles.css or the relevant component, redeploy via bash scripts/build-app.sh --install, re-run the scenario to verify. If the fix needs >25 LOC or spans 3+ files, stop and append to ~/projects/snappy-os/state/log/snappy-chat-backlog.md instead.
  7. Append to report - /tmp/dogfood-loop-report.md per the format at the bottom of this doc.
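
The automated capture steps (1 to 4) can be sketched as one function. This is an illustration under stated assumptions, not the sidecar's code: the `Primitives` interface and `captureScenario` are hypothetical, and the helper signatures are inferred from the calls in the steps above rather than read from state/lib.

```typescript
// Hypothetical dependency bundle. The real helpers live in state/lib/chat-drive.ts
// and state/lib/desktop.ts; their exact signatures are assumed here.
interface Primitives {
  resetChatUI(opts: { waitMs: number }): Promise<void>;
  captureScreen(path: string): Promise<void>;
  dispatchInChatUI(text: string, opts: { waitForFirstFrame: number }): Promise<void>;
}

interface Shots {
  pre: string;
  post: string;
}

// Runs steps 1-4 for scenario n and returns the shot paths the auditor reads in step 5.
async function captureScenario(p: Primitives, n: number, intent: string): Promise<Shots> {
  const pre = `/tmp/loop-${n}-pre.png`;
  const post = `/tmp/loop-${n}-post.png`;
  await p.resetChatUI({ waitMs: 1500 }); // step 1: welcome surface should remount
  await p.captureScreen(pre); // step 2: confirms the reset
  await p.dispatchInChatUI(intent, { waitForFirstFrame: 14000 }); // step 3: push intent
  await p.captureScreen(post); // step 4: capture the rendered result
  return { pre, post };
}
```

Passing the primitives in as a parameter keeps the ordering testable with fakes, without touching the bridge or the app.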

Eval checklist - is this card world-class?

For every rendered scenario, check all of these:

Render integrity

  • [ ] Welcome surface unmounted, thread state visible
  • [ ] No console errors (open DevTools via menubar, or check WKWebView log if available)
  • [ ] User pill right-anchored at the actual right edge of the message column
  • [ ] User pill bg = --surface-elevated, border = subtle, radius ~12px, max-width ≤540px, content fit
  • [ ] Assistant content visible (text or generative-UI card)

Card rendering (when a generative-UI shape fires)

  • [ ] Card chrome present: 1px border, soft shadow, 14px radius, no side-stripe
  • [ ] Channel hairline color present (3-4px gradient strip on top edge)
  • [ ] Header row: verb chip (mono 11px) + channel chip (mono 10.5px) + count if applicable
  • [ ] Body text 14px sans, line-height 1.6, color = --text (not secondary)
  • [ ] Action row only present if the shape uses one (Approve/Edit/Reject for DispatchCard)
  • [ ] No clipped content, no overflow

Layout quality

  • [ ] Composer at bottom with Queue pill (copper-filled, "Queue" label)
  • [ ] Mode chips inline on left of action-bar, model right-pushed
  • [ ] Mic button is a quiet outline (not copper-filled) when idle
  • [ ] Right panel headers in sentence-case sans, NOT uppercase mono
  • [ ] Sidebar agent rows compact, sentence-case section labels

Behavior

  • [ ] Dispatch completed (no "Unknown dispatch error" or stuck spinner)
  • [ ] Streaming actually streamed (not all-at-once dump)
  • [ ] Approve/Edit/Reject buttons on DispatchCard are clickable

If ALL pass → scenario is pass. If any fail → scenario is fail — <reason>. Don't half-pass.
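
The all-or-nothing rule can be sketched as a small function. This is a minimal illustration, not code from the skill: `Check` and `verdict` are hypothetical names, and the dash in the fail string is written as the escape `\u2014` so the output matches the report format.

```typescript
// One checklist row: a label plus whether the auditor ticked it.
interface Check {
  label: string;
  ok: boolean;
}

// Returns "pass" only when every check is ok. Otherwise returns "fail", the
// \u2014 separator, and the failed labels as the reason. There is no partial pass.
// An empty list passes vacuously, so callers should supply the full checklist.
function verdict(checks: Check[]): string {
  const failed = checks.filter((c) => !c.ok).map((c) => c.label);
  return failed.length === 0 ? "pass" : `fail \u2014 ${failed.join("; ")}`;
}
```

Listing every failed label in the reason keeps the report line useful to the auditor, who re-reads the post-shot against exactly those items.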

Standard scenario set (ranked by render-richness)

Run as many as fit in budget. Each tests a different generative-UI shape; each is also a contract test for the producer skill that owns it:

  1. compose a linkedin post about today's snappy-os build session → DispatchCard with linkedin channel (counter /210 fold), Approve/Edit/Reject. Producer: linkedin-post.
  2. show me what the agents did today → ProgressList. Producer: agent-recap.
  3. what is the snappy-os bundle context → ContextPanel with connectors. Producer: bundle-snapshot.
  4. open the file state/skills/desktop/SKILL.md → WorkingFolder with file pill. Producer: desktop.
  5. give me feedback on the welcome screen design → FeedbackForm if plumbed; else text.
  6. are you sure you want to fire ray-update? → ConfirmDialog if plumbed; else text.

DO NOT run anything that side-effects externally - no real LinkedIn posts, no emails, no Slack sends. The default backend is openrouter / gemini so most replies are text-only unless the prompt explicitly invokes a generative shape via the dispatcher.

Failure modes - what to do when

| Symptom | Cause | Action |
| --- | --- | --- |
| dispatchInChatUI succeeds but composer never receives text | Polling stopped or React un-mounted | Restart snappy-chat: `pgrep -af /Applications/SnappyChat.app/Contents/MacOS/SnappyChat \| awk '{print $1}' \| xargs kill; sleep 1; open /Applications/SnappyChat.app` |
| Welcome surface stays after dispatch | React controlled-component setter didn't fire | Confirm injectIntoComposer ran by reading App.tsx; check if textarea is display:none (offsetParent null on both variants) |
| Dispatch returns "Unknown dispatch error" | /dispatch/chat 5xx, often dispatch.json mis-routed | `curl -XPOST 127.0.0.1:3147/dispatch-config -d '{"axis":"chat","backend":"openrouter","model":"gemini"}'` to reset axis |
| Build fails on bash scripts/build-app.sh --install | TS error or stale Resources | Read the error; never bypass with --no-verify |
| Screenshot shows blank/white | peekaboo targeted the wrong window | Use captureScreen (full display) instead of captureWindow |
| Stuck process at /private/tmp/.../SnappyChat | Old build still running | `pkill -f /private/tmp.*SnappyChat; sleep 0.5; open /Applications/SnappyChat.app` |
| Multiple npx tsx processes piling up | Subagent's previous calls leaked | `pkill -f "npx tsx"` is safe - the desktop helpers are stateless |

What goes where

  • Quick fix (≤25 LOC, ≤2 files, in snappy-chat) → commit immediately with chat: prefix and explicit pathspec.
  • Bigger fix (>25 LOC OR ≥3 files OR cross-repo) → append to ~/projects/snappy-os/state/log/snappy-chat-backlog.md as - [ ] <title> — <one-line scope>. Don't commit. Robert routes the larger work.
  • Reference observation (e.g. "the welcome state's spark is rendering blurry at 22px") → also goes to backlog. Cosmetic findings batch better.
  • New skill needed (e.g. discovered we need a notification-toast generative-UI shape) → write a finding in the backlog AND mention it in the loop report's "next" section.
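
The commit-versus-backlog threshold can be written down as a pure function. This is a sketch under stated assumptions: `FixEstimate`, `route`, and `backlogEntry` are hypothetical names, and the caller is assumed to estimate the line and file counts before editing.

```typescript
// Hypothetical estimate of a fix, made before touching any files.
interface FixEstimate {
  loc: number; // lines of code the fix would change
  files: number; // number of files the fix would touch
  touchesOtherRepo: boolean; // true when the fix reaches outside snappy-chat
}

// Commit inline only when the fix is small and stays inside snappy-chat;
// anything larger or cross-repo goes to the backlog.
function route(fix: FixEstimate): "commit" | "backlog" {
  const small = fix.loc <= 25 && fix.files <= 2;
  return small && !fix.touchesOtherRepo ? "commit" : "backlog";
}

// Formats a backlog line as a checkbox, the title, a \u2014 separator, and the scope.
function backlogEntry(title: string, scope: string): string {
  return `- [ ] ${title} \u2014 ${scope}`;
}
```

Reference observations and new-skill findings always go to the backlog, so they need no routing decision and only use the formatter.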

Report format

/tmp/dogfood-loop-report.md:

## Scenario N: <intent>
- Pre: /tmp/loop-N-pre.png
- Post: /tmp/loop-N-post.png
- Card rendered: <name | "text-only" | "none">
- Render quality: pass | fail — <reason>
- Action: <commit SHA> | "no fix needed" | "logged to backlog"
- Notes: <one or two sentences>

## Summary
- Scenarios run: N
- Clean passes: M
- Commits: <SHAs>
- Findings to backlog: <count>
- Next biggest gap: <one sentence>
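
One way to keep every scenario block in this shape is to render it from a typed record. The sketch below is illustrative only: `ScenarioResult` and `scenarioBlock` are hypothetical names, and the shot paths are derived from the scenario number following the /tmp/loop-N-pre.png and /tmp/loop-N-post.png convention used in the steps.

```typescript
// Hypothetical record for one scenario; field meanings follow the template above.
interface ScenarioResult {
  n: number;
  intent: string;
  card: string; // component name, "text-only", or "none"
  quality: string; // "pass", or "fail" followed by the reason
  action: string; // commit SHA, "no fix needed", or "logged to backlog"
  notes: string; // one or two sentences
}

// Renders one "## Scenario N" block with all six required lines.
function scenarioBlock(r: ScenarioResult): string {
  return [
    `## Scenario ${r.n}: ${r.intent}`,
    `- Pre: /tmp/loop-${r.n}-pre.png`,
    `- Post: /tmp/loop-${r.n}-post.png`,
    `- Card rendered: ${r.card}`,
    `- Render quality: ${r.quality}`,
    `- Action: ${r.action}`,
    `- Notes: ${r.notes}`,
  ].join("\n");
}
```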

Hard constraints

  • PRODUCT.md tone is absolute. No emojis. No em dashes. No exclamation points in user-visible copy.
  • DESIGN.md tokens are the only color source. Every styles.css value pulls from existing OKLCH tokens.
  • Don't modify snappy-os skills/libs in a dogfood pass. That's a separate axis.
  • Don't side-effect externally. No real outbound messages.
  • Don't run more than 6 minutes per scenario. If it eats more, log a finding and move on.
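
The tone constraint lends itself to a quick lint over user-visible copy. This is a rough sketch, not part of the skill: `toneViolations` is a hypothetical helper, and the Unicode property `Extended_Pictographic` is used as an approximation of "emoji" (it also matches a few symbols such as the copyright sign).

```typescript
// Returns the names of the tone rules that a piece of user-visible copy breaks.
function toneViolations(copy: string): string[] {
  // Built with the RegExp constructor so the Unicode property escape is
  // resolved at runtime; the "u" flag is required for \p{...}.
  const emoji = new RegExp("\\p{Extended_Pictographic}", "u");
  const found: string[] = [];
  if (emoji.test(copy)) found.push("emoji");
  if (copy.includes("\u2014")) found.push("em dash");
  if (copy.includes("!")) found.push("exclamation point");
  return found;
}
```

A check like this only covers the tone rule; token usage in styles.css and the time budget still need their own checks.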

Reference shots

Claude Desktop "Cowork" reference shots live in refs/cd-{01..23}.png - Robert's design target. See refs/README.md for the per-shot index. cd-01..06 = welcome state, cd-13/14/21 = thread state, cd-16..20 = customize / skills directory, cd-08..12 = projects + scheduled, cd-22..23 = artifacts.

Eval

Actor: the dogfood-running subagent. Captures screenshots, evaluates against the checklist, commits fixes, appends to report. Auditor: the next agent (or Robert) reads /tmp/dogfood-loop-report.md and verifies that scenarios actually passed by re-reading the post-shots. Distinct context = actor ≠ auditor.

| Outcome | Score |
| --- | --- |
| Report present, ≥3 scenarios, every fail has a fix-or-backlog action | 1.0 |
| Report present but a scenario flagged fail with no follow-through | 0.5 |
| No report or report missing scenario shots / actions | 0.0 |
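
The score table can be read as a small decision function. This is a sketch under stated assumptions: `ReportFacts` and `scoreRun` are hypothetical names, and because the table does not say what a clean report with fewer than three scenarios scores, the sketch returns null for that case instead of guessing.

```typescript
// Hypothetical summary of what the auditor found in the report.
interface ReportFacts {
  reportPresent: boolean;
  scenarioCount: number;
  shotsAndActionsPresent: boolean; // every scenario block lists its shots and an Action
  orphanFails: number; // scenarios marked fail with no fix or backlog entry
}

// Maps report facts to the table's score. Returns null for cases the table
// does not cover (for example a clean report with fewer than three scenarios).
function scoreRun(f: ReportFacts): number | null {
  if (!f.reportPresent || !f.shotsAndActionsPresent) return 0.0;
  if (f.orphanFails > 0) return 0.5;
  if (f.scenarioCount >= 3) return 1.0;
  return null;
}
```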

Rubric

criteria:
  - name: report_present
    kind: deterministic
    check: "/tmp/dogfood-loop-report.md exists and contains a Summary section."
  - name: scenarios_min_three
    kind: deterministic
    check: "Report contains at least three Scenario blocks, each with Pre, Post, Card rendered, Render quality, Action, Notes lines."
  - name: every_fail_has_action
    kind: judge
    check: "Every scenario marked 'fail' carries either a commit SHA in Action OR an entry appended to state/log/snappy-chat-backlog.md — no orphan failures."
  - name: no_external_sideeffects
    kind: deterministic
    check: "No scenario triggered a real outbound send (no LinkedIn URN, no Gmail draft, no Slack post). The dispatch backend stays text-only or generative-shape only."
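
The two text-based deterministic criteria can be checked against the report's contents. This is a minimal sketch, assuming the report follows the "## Scenario N" and "## Summary" layout from the Report format section; `hasSummary` and `hasThreeCompleteScenarios` are hypothetical names, and the file-existence half of report_present is left to the caller.

```typescript
// Line labels every scenario block must carry, per scenarios_min_three.
const REQUIRED_LINES = ["Pre", "Post", "Card rendered", "Render quality", "Action", "Notes"];

// Content half of report_present: the text has a "## Summary" heading.
function hasSummary(report: string): boolean {
  return /^## Summary\s*$/m.test(report);
}

// scenarios_min_three: at least three "## Scenario" blocks, each with all required lines.
function hasThreeCompleteScenarios(report: string): boolean {
  // Split in front of every level-2 heading, then keep only scenario blocks.
  const blocks = report.split(/^(?=## )/m).filter((b) => b.startsWith("## Scenario"));
  const complete = blocks.filter((b) =>
    REQUIRED_LINES.every((key) => new RegExp(`^- ${key}:`, "m").test(b)),
  );
  return blocks.length >= 3 && complete.length === blocks.length;
}
```

The judge criterion (every_fail_has_action) is deliberately left out, since it needs the backlog file and a reading of each Action line.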

AGENTS.md- what the AI loads when this skill comes up

dogfood-loop - loader

Per-turn QA rules. Reference: state/skills/dogfood-loop/SKILL.md (7-step cadence, eval checklist, scenarios, report format). Design target: state/skills/dogfood-loop/refs/cd-{01..23}.png (Claude Desktop cowork).

Critical Rules

  1. Dual role: user + developer. Drive app via dispatchInChatUI() and resetChatUI(), read screenshots, decide fix-vs-backlog, commit (≤25 LOC/2 files snappy-chat only) or append to ~/projects/snappy-os/state/log/snappy-chat-backlog.md. No external side-effects.
  2. Primitives stack (TCC-intact). chat-drive.ts (dispatch/reset) + desktop.ts (capture) + bridge.ts (exec/file ops) + producer skills. All four required for QA loop.
  3. 7-step cadence. Sidecar: reset→pre-shot→push→post-shot→append. Agent: read both shots → evaluate → fix or log. Boundary: sidecar = automation, agent = judgment.
  4. Actor ≠ Auditor. Screenshots are ground truth. Returncode lies; visual integrity governs.
  5. Eval checklist (4 categories). (a) Render integrity: no console errors, user pill anchored/fitted, assistant content visible. (b) Card rendering: chrome (border/shadow/radius), channel hairline, header (verb+channel chips), body (14px/1.6 line-height), action row if used. (c) Layout: composer+Queue, mode chips left, right panel, sidebar compact. (d) Behavior: dispatch done, streamed not dumped, buttons clickable. All pass → PASS; any fail → FAIL with reason.
  6. Fix-vs-backlog threshold. Inline fix: ≤25 LOC AND ≤2 files (snappy-chat only). Larger/cross-repo/≥3 files → backlog entry. No "we'll fix it later" excuses; close the gap immediately or log.
  7. Chat-inject queue is destructive SHIFT. Double-reset required: reset → 200ms wait → reset (single resets miss persistence). In-thread mode: /chat-inject-pop curl call STEALS items from React UI (never poll for queue length). Use fixed 3s wait + /chat-inject-flush fallback instead.
  8. Bridge TCC grants per-process. openclaw bridge loses screen-recording after restart/context-switch. Fallback: direct /dispatch/chat curl SSE stream (verifies shape wire without screenshots). When TCC blocks, use Node.js HTTP client, not peekaboo.
  9. Dispatch control format. /chat-inject-control (not /chat-inject-push) for reset/theme. Flat JSON only: {"action":"reset"} or {"action":"theme:light"}. Nested format returns 400. Field name: text for /chat-inject-push, command for bridge exec.
  10. Peekaboo window-id staleness. Call peekaboo list windows --app "Snappy Chat" --json-output (note: app name with space) BEFORE each capture. --window-index unreliable (orders by isMainWindow, not size). Parse JSON → select bounds[1][0] > 1000 (width >1000px) → use --window-id <id>. One-liner: WID=$(peekaboo list windows --app "Snappy Chat" --json-output | jq -r '.data.windows[] | select(.bounds[1][0] > 1000) | .window_id' | head -1) && peekaboo image --app "Snappy Chat" --window-id $WID --path /tmp/out.png. Fallback: --pid if TCC blocks --app.
  11. tsx cache stale crash. evalLeaderboardRegex FATAL after 1-2 dispatches when cache aged. Fix: rm /var/folders/sq/*/T/tsx-*/\*597k\* && pkill -f "head-screen" + npm cache clean --force.
  12. No external side-effects. No real LinkedIn posts, emails, Slack writes. Dispatch backend text-only or generative-shape. Serialize intents; shared queue = lost messages if concurrent.
  13. Standard scenario set (6). (1) "compose a linkedin post..." → DispatchCard+Approve/Edit/Reject. (2) "show me what the agents did today" → ProgressList. (3) "what is the snappy-os bundle context" → ContextPanel. (4) "open the file state/skills/desktop/SKILL.md" → WorkingFolder. (5) "give me feedback on the welcome screen design" → FeedbackForm. (6) "are you sure you want to fire ray-update?" → ConfirmDialog.
  14. NotBuiltYet stubs fatal. Every reachable route MUST show real content. "This page is coming soon" = FAIL → log to backlog. Robert has no patience for dead-ends.
  15. AFTER screenshot mandatory before shipped. Robert's rule: "if i say we need to be taking pics then you need that to be part of my business." Edit without BEFORE + AFTER window-targeted screenshot (both Read by agent) ≠ shipped. Capture with peekaboo (see Rule 10), then Read /tmp/after.png. If app unreachable, STOP and report - no faith-based claims.
  16. Hard constraints. PRODUCT.md tone: no emojis, em-dashes, exclamation points. DESIGN.md tokens only. Don't touch snappy-os skills/libs. No side-effects. 6 min/scenario max.
  17. Report format. /tmp/dogfood-loop-report.md per scenario: Pre/Post paths, Card rendered (name|text-only|none), Render quality (pass|fail - reason), Action (commit SHA|no fix|logged), Notes (1-2 sentences). Summary: run count, clean passes, commits, backlog findings, next gap.

Commands

| Operation | Command |
| --- | --- |
| run scenarios | `npx tsx state/bin/dogfood-loop/run.ts --scenarios state/skills/dogfood-loop/scenarios/default.json --out /tmp/dogfood-loop-report.md` |
| reset welcome (double-reset pattern) | `curl -XPOST 127.0.0.1:3147/chat-inject-control -d '{"action":"reset"}'; sleep 0.2; curl -XPOST 127.0.0.1:3147/chat-inject-control -d '{"action":"reset"}'` |
| dispatch config set | `curl -XPOST 127.0.0.1:3147/dispatch-config -H "Content-Type: application/json" -d '{"axis":"chat","backend":"openrouter","model":"gemini"}'` |
| app check | `pgrep -af "/Applications/SnappyChat.app/Contents/MacOS"` |
| app restart | `pkill -f /Applications/SnappyChat.app; sleep 1; open /Applications/SnappyChat.app` |
| bridge token path | `grep "^OPENCLAW_BRIDGE_TOKEN=" ~/projects/snappy-os/.env.cache \| cut -d= -f2- \| tr -d '"'` |
| start head-screen | `bash ~/projects/snappy-os/state/bin/head-screen/launch.sh` |
| direct dispatch API (shape wire test) | `curl -s -XPOST 127.0.0.1:3147/dispatch/chat -H "Content-Type: application/json" -d '{"intent":"<phrase>","threadId":"qa"}' \| jq '.tool_shapes,.server_shapes'` |
| theme control | `curl -XPOST 127.0.0.1:3147/chat-inject-control -d '{"action":"theme:light"}'` (or theme:dark) |
| chat-inject push | `curl -XPOST 127.0.0.1:3147/chat-inject-push -d '{"text":"<intent>","newThread":true}'` (uses text field, not message; newThread isolates QA probes from Robert's active thread) |
| peekaboo window list | `peekaboo list windows --app "Snappy Chat"` (note: subcommand is "list windows" not "window-list") |
| bridge-backed capture (preferred) | `cd ~/projects/snappy-os && npx tsx state/lib/desktop.ts capture-window "Snappy Chat" /tmp/out.png` (routes through state/lib/bridge.ts so the helper daemon's Screen Recording grant is used) |
| peekaboo capture (correct) | `WID=$(peekaboo list windows --app "Snappy Chat" --json-output \| jq -r '.data.windows[] \| select(.bounds[1][0] > 1000) \| .window_id' \| head -1) && peekaboo image --app "Snappy Chat" --window-id $WID --path /tmp/out.png` (do NOT use --window-index - it is unreliable; pick by width via window_id) |
| clear tsx cache | `rm /var/folders/sq/*/T/tsx-501/*597k 2>/dev/null; pkill -f "head-screen"` |
| build app | `bash ~/projects/snappy-chat/scripts/build-app.sh --install` |
| ui dashboard | state/skills/dogfood-loop/resources/ui.openui |
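
The width-based window selection in the peekaboo capture command can also be done on parsed JSON. This is a sketch under stated assumptions: the JSON shape is inferred only from the jq filter in this document (`.data.windows[]`, `.bounds[1][0]` as width, `.window_id`), and `PeekabooList` and `pickWindowId` are hypothetical names, not part of peekaboo.

```typescript
// Assumed output shape, inferred from the jq filter used in this document:
// { data: { windows: [{ window_id, bounds: [[x, y], [width, height]] }] } }
interface PeekabooWindow {
  window_id: number;
  bounds: [[number, number], [number, number]];
}

interface PeekabooList {
  data: { windows: PeekabooWindow[] };
}

// Returns the id of the first window wider than minWidth, or null when none
// qualifies. Mirrors: select(.bounds[1][0] > 1000) | .window_id, then head -1.
function pickWindowId(list: PeekabooList, minWidth = 1000): number | null {
  const hit = list.data.windows.find((w) => w.bounds[1][0] > minWidth);
  return hit ? hit.window_id : null;
}
```

Returning null instead of throwing lets the caller fall back to the bridge-backed capture when no main window is listed.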

Failure Modes

| Symptom | Cause | Action |
| --- | --- | --- |
| Text dispatched but composer never receives it | Polling stopped or React un-mounted | Restart app: `pkill -f /Applications/SnappyChat.app; sleep 1; open /Applications/SnappyChat.app` |
| Welcome persists after dispatch | React setter didn't fire or injectIntoComposer failed | Verify App.tsx:injectIntoComposer submit click reaches activeElement; check display:none |
| "Unknown dispatch error" | /dispatch/chat 5xx, often misconfigured axis | `curl -XPOST 127.0.0.1:3147/dispatch-config -d '{"axis":"chat","backend":"openrouter","model":"gemini"}'` |
| Build fails | TS error or stale Resources | Read error; never --no-verify |
| Screenshot blank/white | TCC denied or display asleep | Use direct /dispatch/chat API; never retry screencapture |
| peekaboo ... PERMISSION_ERROR_SCREEN_RECORDING in this shell | Raw CLI process lacks Screen Recording even though the bridge daemon has it | Use `npx tsx state/lib/desktop.ts capture-window "Snappy Chat" /tmp/out.png` so the call routes through state/lib/bridge.ts |
| Process stuck at /private/tmp | Old build still running | `pkill -f /private/tmp.*SnappyChat; sleep 0.5; open /Applications/SnappyChat.app` |
| Double-reset doesn't work | First reset enqueues but doesn't fire UI | Add 3s wait between resets; verify welcome unmounts via screenshot |
| chat-inject-pop consumed before React polls | Manual curl GET /chat-inject-pop steals queue items | NEVER use curl to check queue length; use fixed-time wait or /chat-inject-flush |
| Server crash after 1-2 dispatches | tsx cache stale (evalLeaderboardRegex FATAL) | `rm /var/folders/sq/*/T/tsx-*/\*597k\*; pkill -f "head-screen"` |

Self-Test

  • [ ] Can explain subagent as both user AND developer?
  • [ ] Know the four primitives libs (chat-drive, desktop, bridge, producers)?
  • [ ] Know sidecar handles steps 1-4+7, you handle 5-6?
  • [ ] Understand actor ≠ auditor (you capture, screenshots grade)?
  • [ ] Can list 4 eval categories: render integrity, card rendering, layout quality, behavior?
  • [ ] Know ≤25 LOC / ≤2 files fix threshold?
  • [ ] Know 6 standard scenarios in priority order?
  • [ ] Know hard constraints: no emojis, DESIGN.md tokens, no snappy-os mods, no side-effects, 6 min/scenario?
  • [ ] Know report format: Scenario N blocks + Summary section?
  • [ ] Know eval scoring: 1.0 (report+3+actions), 0.5 (orphan fails), 0.0 (no report)?
  • [ ] Can restart the app without losing head-screen?
  • [ ] Know dispatch-config format: {"axis":"chat","backend":"...","model":"..."}?
  • [ ] Know when to commit vs when to backlog?
  • [ ] Know bridge token at repo root .env.cache (not ~/.env.cache)?

<!-- kernel-ok: historical attribution for the PID-loader pattern. -->

Self-correcting loader (PID feedback)

This is the rule that makes loaders learn. Ported from snappy-kernel, where it demonstrably worked: the snappy-dom-cartographer skill mapped Skool admin UI once, every later ship-agent inherited the full map for free.

Before you finish, do two things:

1. Fix gaps (P - proportional). MANDATORY. If this loader didn't cover your case - if you had to read another file, run an undocumented command, work around a wrong selector, or discover a quirk - you MUST attempt an Edit to this AGENTS.md before you log.

  • Default action: edit this file. Surgical. Add the missing line, table row, or code example where it belongs. No restructuring.
  • Only LOGGED is allowed when: the fix needs >10 lines, spans multiple files, or requires a structural rewrite. In that case the state/regen/drain.sh queue picks it up asynchronously.
  • "I didn't have time" / "it's minor" / "the next agent will figure it out" are NOT valid reasons. Minor gaps compound - that's the whole point of this loop.
  • Concrete example of a valid P-fix: discover agent-browser screenshot takes path positionally not as --output <path>. One line in a table. Edit it in. Do not just log it.
  • The goal: the next agent never has to leave the loader.

2. Log the result.

echo "[$(date -u +%FT%TZ)] dogfood-loop: <what was missing or fixed> [FIXED|LOGGED] action_kind=<kind>" >> state/log/loader-feedback.log
  • FIXED = you patched this loader inline (P-fix).
  • LOGGED = too large for inline; the PostToolUse enqueue + Stop-hook drain will rewrite the loader from scratch on next session-end.
  • action_kind: shape-ok | skill-ran | loader-rewritten | pattern-elevated
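
The log line written by the echo command can also be built in code, which keeps the outcome and action_kind values constrained to the allowed sets. This is a minimal sketch: `feedbackLine`, `Outcome`, and `ActionKind` are hypothetical names, and appending the line to state/log/loader-feedback.log is left to the caller.

```typescript
type Outcome = "FIXED" | "LOGGED";
type ActionKind = "shape-ok" | "skill-ran" | "loader-rewritten" | "pattern-elevated";

// Builds one loader-feedback line. The timestamp is UTC with seconds precision,
// the same shape that `date -u +%FT%TZ` prints.
function feedbackLine(now: Date, note: string, outcome: Outcome, kind: ActionKind): string {
  // toISOString() includes milliseconds; strip them to match the shell format.
  const stamp = now.toISOString().replace(/\.\d{3}Z$/, "Z");
  return `[${stamp}] dogfood-loop: ${note} [${outcome}] action_kind=${kind}`;
}
```

Using union types for the two tags means a typo such as "FIXD" is rejected at compile time when the code is type-checked.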

Do not skip this. Every agent run must leave the system better than it found it. The loader is the setpoint; you are the sensor; the gap is the error signal; closing the gap is the correction.

OpenUI Resource

  • Skill-owned OpenUI Lang resource: state/skills/dogfood-loop/resources/ui.openui. Read it before rendering or editing this skill's generated component surface.
  • Treat this resource as a first-class artifact of the skill, not a generic chat response. Improve it when the skill's user-facing output needs to become richer.
  • System resources compose OpenUI primitives and inherit SnappyChat tokens. Use ui_contract: branded in SKILL.md only for deliberate platform or client visuals.

api.ts- the code it can call

⚠ no api.ts - this skill has no typed action surface

scripts- helper scripts it can run

prose-only skill - 4 inline code blocks live in SKILL.md above (no state/bin/ sidecar yet).

how we check it- the checks, plus the last 10 runs

rubric auto-shape no rubric declared
recent mean 6.64 · 10 runs actor/auditor: unverifiable
deps none declared
| timestamp | verb | score | primary_issue | artifact |
| --- | --- | --- | --- | --- |
| 2026-05-02 20:56Z | - | 0.50 | - | - |
| 2026-05-01 05:50Z | - | 60.00 | - | - |
| 2026-05-01 03:22Z | - | 1.00 | - | - |
| 2026-05-01 01:16Z | - | 0.30 | - | - |
| 2026-05-01 01:09Z | - | 0.70 | - | - |
| 2026-04-30 23:08Z | - | 0.70 | - | - |
| 2026-04-30 21:00Z | - | 0.85 | - | - |
| 2026-04-30 20:59Z | - | 0.85 | - | - |
| 2026-04-30 20:58Z | - | 0.75 | - | - |
| 2026-04-30 20:56Z | - | 0.75 | - | - |