dogfood-loop

Has your assistant test your app like a real user and fix what falls short.

personal 2 files 10 recent evals

How to get it

These run inside the Snappy workspace. Want this working in your business? I set skills like this up with you, in one focused week.

For developers - how this skill is built, graded, and how it runs

at a glance - the short version

actor Dogfood-running subagent

auditor Next agent (or Robert)

eval mode auto-shape

stages 1

what's inside - the parts that make up a skill 2/4 present

A skill is just a few plain-text files. Only the main one is required. The rest are optional, added as the work needs them. This is what the skill is made of; how it runs is just below.

The skill

state/skills/dogfood-loop/SKILL.md present

the skill itself, in plain text

The main file. It says what the skill is and lays out the steps in plain English.

Code

state/lib/dogfood-loop.ts not present

code the skill can run

Optional. Many skills are just words and need no code at all.

Scripts

state/bin/dogfood-loop/ not present

helper scripts

Optional. Added when a skill has a few commands to run.

Loader

state/skills/dogfood-loop/AGENTS.md present

what the AI loads on the fly

Loaded automatically the moment this skill is needed. Kept short on purpose.

how it's graded - what counts as a good run 4 criteria · 3 deterministic · 1 judge

Each row is one thing a good run has to get right. deterministic means a quick check decides, pass or fail. judge means the AI reads the result and rates it. Grading each piece on its own (instead of one overall score) shows exactly where a run fell short, so the fix is obvious.

name	kind	check
report_present	deterministic	/tmp/dogfood-loop-report.md exists and contains a Summary section.
scenarios_min_three	deterministic	Report contains at least three Scenario blocks, each with Pre, Post, Card rendered, Render quality, Action, Notes lines.
every_fail_has_action	judge	Every scenario marked 'fail' carries either a commit SHA in Action OR an entry appended to state/log/snappy-chat-backlog.md — no orphan failures.
no_external_sideeffects	deterministic	No scenario triggered a real outbound send (no LinkedIn URN, no Gmail draft, no Slack post). The dispatch backend stays text-only or generative-shape only.

how it runs - the shared frame every skill uses 5/5 present

Every skill runs the same way. One part does the work, a separate part checks it, and a short loader hands the AI exactly what it needs for the job. Anything this skill doesn't use shows a one-line note saying why, on purpose, not by accident.

makes the work The worker

present

Dogfood-running subagent the worker

Does the actual work. Whatever it produces is what gets checked next.

checks the work The reviewer

present

Next agent (or Robert) the checker

A separate checker grades the work, so the part that made it can't approve its own work.

frame

learns Self-correction

present

fixes itself learns from gaps

When a run hits a gap, the skill gets edited on the spot [FIXED] or queued for a bigger rewrite [LOGGED], so it keeps getting better.

tidies up Background fixes

present

queued for rewrite runs in the background

Bigger fixes that can't be made on the spot get queued and rewritten in the background later.

remembers Run history

present

state/log/evals.ndjson auto-shape runs

Every run is written down here, so the next time this skill is used it already knows how the last runs went.

Critical rules the things this skill must not get wrong

No must-not-break rules called out for this skill. Anything important lives in the writeup below.

what it has learned - fixes written back in over time sample

When a run hits something this skill didn't handle, the fix gets written back into the skill so it doesn't happen again. FIXED means it was corrected on the spot. LOGGED means it's queued for a bigger rewrite. Either way, the skill gets a little better and never makes the same mistake twice.

how the work flows - who makes it, who checks it

actor Dogfood-running subagent

auditor Next agent (or Robert)

1 data

npx

npx tsx state/bin/dogfood-loop/run.ts \

SKILL.md - the skill, written out in plain English

dogfood-loop

The dogfood loop is how snappy-chat gets sharper between releases. A subagent is BOTH a user (drives the app via the chat-drive bridge) AND a developer (commits fixes to snappy-chat when a rendered card falls below world-class). Closing this loop is what unlocks Robert's generative-UI leverage.

The deliverable IS the playbook below. There is no backing lib - the prose is the executable. Producer skills (agent-recap, bundle-snapshot, chat-drive) supply the primitives; this skill stitches them into a cadence.

The primitives stack (as of 2026-04-28)

All routed through process trees with intact TCC grants:

[ subagent in any context ]
        │
        ▼
state/lib/chat-drive.ts  ─ dispatchInChatUI(text)  ──┐
                          resetChatUI()              │
                                                     ▼
state/lib/desktop.ts     ─ captureScreen(path)      [ openclaw bridge :18790 ]
                          captureWindow(app, path)   │
                          listWindows(app?)          ▼
                                                     [ macOS shell with TCC ]
state/lib/bridge.ts      ─ bridgeExec(cmd)
                          bridgeReadFile(path)
                          bridgeWriteFile(path, c)

snappy-chat polls 127.0.0.1:3147/chat-inject-pop every 500ms. When dispatchInChatUI pushes text to /chat-inject-push, the React polling effect injects via the React-controlled-component setter dance and fires submit. resetChatUI pushes a control message that flips setHasThread(false) + setChatKey(k+1), returning the welcome surface for the next scenario.
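The push and reset paths described above can be sketched as plain request builders. This is a minimal sketch and not the contents of state/lib/chat-drive.ts: the names `buildPushRequest`, `buildControlRequest`, and `InjectRequest` are hypothetical, while the endpoints and payload fields (`text`, `newThread`, `action`) are the ones listed in the loader's command table.

```typescript
// Hypothetical helpers: they only build the request and never send it.
const BASE = "http://127.0.0.1:3147";

interface InjectRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

function post(path: string, payload: Record<string, unknown>): InjectRequest {
  return {
    url: `${BASE}${path}`,
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(payload),
  };
}

// Intent text goes to /chat-inject-push in the `text` field.
function buildPushRequest(text: string, newThread = true): InjectRequest {
  return post("/chat-inject-push", { text, newThread });
}

// Control messages (reset, theme) use a flat {"action": ...} payload.
function buildControlRequest(
  action: "reset" | "theme:light" | "theme:dark",
): InjectRequest {
  return post("/chat-inject-control", { action });
}

const push = buildPushRequest("show me what the agents did today");
console.log(push.url, push.body);
```

Keeping the builders free of network calls makes the payload shape easy to check before anything touches the shared queue. Sending is left to the caller, for example with `fetch(req.url, { method: req.method, headers: req.headers, body: req.body })` on a runtime that provides a global `fetch`.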

Steps

The 7-step cadence below is automated by the sidecar at state/bin/dogfood-loop/run.ts. Drive a full pass with:

npx tsx state/bin/dogfood-loop/run.ts \
  --scenarios state/skills/dogfood-loop/scenarios/default.json \
  --out /tmp/dogfood-loop-report.md

The sidecar handles steps 1-4 + 7 (reset, pre-shot, push, post-shot, append to report) for every scenario, then exits. Steps 5 and 6 (visual grading, fix-or-log) stay manual - actor (the sidecar) captures; auditor (the agent that runs it) reads each /tmp/loop-N-post.png against the eval checklist below and decides commit-vs-backlog.

For each scenario:

1. Reset - await resetChatUI({waitMs: 1500}). Welcome surface should remount.
2. Pre-shot - await captureScreen('/tmp/loop-N-pre.png'). Confirms reset succeeded.
3. Push intent - await dispatchInChatUI(intent, {waitForFirstFrame: 14000}).
4. Post-shot - await captureScreen('/tmp/loop-N-post.png').
5. Read both shots with the Read tool. Evaluate against the eval checklist below.
6. Fix or log - if anything is below world-class, edit web/src/styles.css or the relevant component, redeploy via bash scripts/build-app.sh --install, re-run the scenario to verify. If the fix needs >25 LOC or spans 3+ files, stop and append to ~/projects/snappy-os/state/log/snappy-chat-backlog.md instead.
7. Append to report - /tmp/dogfood-loop-report.md per the format at the bottom of this doc.
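The cadence above, including which steps belong to the sidecar and which to the agent, can be written down as data. This is an illustrative sketch only: `planScenario`, `Step`, and `Owner` are hypothetical names and do not describe what state/bin/dogfood-loop/run.ts actually contains.

```typescript
type Owner = "sidecar" | "agent";

interface Step {
  n: number;         // position in the 7-step cadence
  name: string;
  owner: Owner;      // who performs it, per the sidecar/agent split above
  artifact?: string; // file the step writes or appends to, if any
}

// Builds the ordered plan for one scenario number.
function planScenario(scenario: number): Step[] {
  const pre = `/tmp/loop-${scenario}-pre.png`;
  const post = `/tmp/loop-${scenario}-post.png`;
  return [
    { n: 1, name: "reset", owner: "sidecar" },
    { n: 2, name: "pre-shot", owner: "sidecar", artifact: pre },
    { n: 3, name: "push intent", owner: "sidecar" },
    { n: 4, name: "post-shot", owner: "sidecar", artifact: post },
    { n: 5, name: "read and evaluate", owner: "agent" },
    { n: 6, name: "fix or log", owner: "agent" },
    { n: 7, name: "append to report", owner: "sidecar", artifact: "/tmp/dogfood-loop-report.md" },
  ];
}

// The steps that still need judgment from the agent.
const manual = planScenario(2)
  .filter((s) => s.owner === "agent")
  .map((s) => s.n);
console.log(manual);
```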

Eval checklist - is this card world-class?

For every rendered scenario, check all of these:

Render integrity

[ ] Welcome surface unmounted, thread state visible
[ ] No console errors (open DevTools via menubar, or check WKWebView log if available)
[ ] User pill right-anchored at the actual right edge of the message column
[ ] User pill bg = --surface-elevated, border = subtle, radius ~12px, max-width ≤540px, content fit
[ ] Assistant content visible (text or generative-UI card)

Card rendering (when a generative-UI shape fires)

[ ] Card chrome present: 1px border, soft shadow, 14px radius, no side-stripe
[ ] Channel hairline color present (3-4px gradient strip on top edge)
[ ] Header row: verb chip (mono 11px) + channel chip (mono 10.5px) + count if applicable
[ ] Body text 14px sans, line-height 1.6, color = --text (not secondary)
[ ] Action row only present if the shape uses one (Approve/Edit/Reject for DispatchCard)
[ ] No clipped content, no overflow

Layout quality

[ ] Composer at bottom with Queue pill (copper-filled, "Queue" label)
[ ] Mode chips inline on left of action-bar, model right-pushed
[ ] Mic button is a quiet outline (not copper-filled) when idle
[ ] Right panel headers in sentence-case sans, NOT uppercase mono
[ ] Sidebar agent rows compact, sentence-case section labels

Behavior

[ ] Dispatch completed (no "Unknown dispatch error" or stuck spinner)
[ ] Streaming actually streamed (not all-at-once dump)
[ ] Approve/Edit/Reject buttons on DispatchCard are clickable

If ALL pass → scenario is pass. If any fail → scenario is fail — <reason>. Don't half-pass.
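The all-or-nothing rule above reduces to a small function. A minimal sketch, assuming each checklist line has been recorded as a label plus a boolean; `CheckResult` and `gradeScenario` are hypothetical names, and treating an empty checklist as a failure is an assumption rather than something the skill states. The output uses the hyphenated `fail - <reason>` form from the loader's report summary.

```typescript
interface CheckResult {
  label: string;   // checklist line, e.g. "No clipped content, no overflow"
  passed: boolean;
}

// Returns "pass" only when every recorded check passed; otherwise names what failed.
function gradeScenario(checks: CheckResult[]): string {
  // Assumption: a scenario with no recorded checks cannot count as a pass.
  if (checks.length === 0) return "fail - no checks recorded";
  const failed = checks.filter((c) => !c.passed).map((c) => c.label);
  return failed.length === 0 ? "pass" : `fail - ${failed.join("; ")}`;
}

console.log(
  gradeScenario([
    { label: "Assistant content visible", passed: true },
    { label: "No clipped content, no overflow", passed: false },
  ]),
);
```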

Standard scenario set (ranked by render-richness)

Run as many as fit in budget. Each tests a different generative-UI shape; each is also a contract test for the producer skill that owns it:

1. compose a linkedin post about today's snappy-os build session → DispatchCard with linkedin channel (counter /210 fold), Approve/Edit/Reject. Producer: linkedin-post.
2. show me what the agents did today → ProgressList. Producer: agent-recap.
3. what is the snappy-os bundle context → ContextPanel with connectors. Producer: bundle-snapshot.
4. open the file state/skills/desktop/SKILL.md → WorkingFolder with file pill. Producer: desktop.
5. give me feedback on the welcome screen design → FeedbackForm if plumbed; else text.
6. are you sure you want to fire ray-update? → ConfirmDialog if plumbed; else text.

DO NOT run anything that side-effects externally - no real LinkedIn posts, no emails, no Slack sends. The default backend is openrouter / gemini so most replies are text-only unless the prompt explicitly invokes a generative shape via the dispatcher.

Failure modes - what to do when

Symptom	Cause	Action
`dispatchInChatUI` succeeds but composer never receives text	Polling stopped or React un-mounted	Restart snappy-chat: `pgrep -af /Applications/SnappyChat.app/Contents/MacOS/SnappyChat | awk '{print $1}' | xargs kill; sleep 1; open /Applications/SnappyChat.app`
Welcome surface stays after dispatch	React controlled-component setter didn't fire	Confirm `injectIntoComposer` ran by reading App.tsx; check if textarea is `display:none` (offsetParent null on both variants)
Dispatch returns "Unknown dispatch error"	`/dispatch/chat` 5xx, often dispatch.json mis-routed	`curl -XPOST 127.0.0.1:3147/dispatch-config -d '{"axis":"chat","backend":"openrouter","model":"gemini"}'` to reset axis
Build fails on `bash scripts/build-app.sh --install`	TS error or stale Resources	Read the error; never bypass with `--no-verify`
Screenshot shows blank/white	peekaboo targeted the wrong window	Use `captureScreen` (full display) instead of `captureWindow`
Stuck process at /private/tmp/.../SnappyChat	Old build still running	`pkill -f /private/tmp.*SnappyChat; sleep 0.5; open /Applications/SnappyChat.app`
Multiple `npx tsx` processes piling up	Subagent's previous calls leaked	`pkill -f "npx tsx"` is safe - the desktop helpers are stateless

What goes where

Quick fix (≤25 LOC, ≤2 files, in snappy-chat) → commit immediately with chat: prefix and explicit pathspec.

Bigger fix (>25 LOC OR ≥3 files OR cross-repo) → append to ~/projects/snappy-os/state/log/snappy-chat-backlog.md as - [ ] <title> — <one-line scope>. Don't commit. Robert routes the larger work.

Reference observation (e.g. "the welcome state's spark is rendering blurry at 22px") → also goes to backlog. Cosmetic findings batch better.

New skill needed (e.g. discovered we need a notification-toast generative-UI shape) → write a finding in the backlog AND mention it in the loop report's "next" section.
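The routing rules above can be expressed as one decision function. This is a sketch under stated assumptions: `Finding`, `Route`, and `routeFinding` are hypothetical names, and sending a fix of unknown size to the backlog is a conservative default chosen here, not a rule from the skill.

```typescript
type Kind = "fix" | "observation" | "new-skill";

interface Finding {
  kind: Kind;
  loc?: number;        // lines of code the fix would touch
  files?: number;      // number of files the fix would touch
  crossRepo?: boolean; // true if anything outside snappy-chat changes
}

interface Route {
  destination: "commit" | "backlog";
  mentionInReport: boolean; // true when the report's "next" section should name it
}

function routeFinding(f: Finding): Route {
  if (f.kind === "new-skill") return { destination: "backlog", mentionInReport: true };
  if (f.kind === "observation") return { destination: "backlog", mentionInReport: false };
  // Assumption: a missing size estimate is treated as "too big to commit inline".
  const small =
    (f.loc ?? Infinity) <= 25 && (f.files ?? Infinity) <= 2 && !f.crossRepo;
  return { destination: small ? "commit" : "backlog", mentionInReport: false };
}

console.log(routeFinding({ kind: "fix", loc: 12, files: 1 }).destination);
```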

Report format

/tmp/dogfood-loop-report.md:

## Scenario N: <intent>
- Pre: /tmp/loop-N-pre.png
- Post: /tmp/loop-N-post.png
- Card rendered: <name | "text-only" | "none">
- Render quality: pass | fail — <reason>
- Action: <commit SHA> | "no fix needed" | "logged to backlog"
- Notes: <one or two sentences>

## Summary
- Scenarios run: N
- Clean passes: M
- Commits: <SHAs>
- Findings to backlog: <count>
- Next biggest gap: <one sentence>
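Rendering the format above from structured results keeps every block complete. A minimal sketch, assuming results are collected as plain objects; `ScenarioResult`, `renderScenario`, and `renderSummary` are hypothetical names, and both the SHA pattern and the "none" placeholder for an empty commit list are assumptions.

```typescript
interface ScenarioResult {
  n: number;
  intent: string;
  card: string;    // shape name, "text-only", or "none"
  quality: string; // "pass" or "fail - <reason>"
  action: string;  // commit SHA, "no fix needed", or "logged to backlog"
  notes: string;
}

function renderScenario(r: ScenarioResult): string {
  return [
    `## Scenario ${r.n}: ${r.intent}`,
    `- Pre: /tmp/loop-${r.n}-pre.png`,
    `- Post: /tmp/loop-${r.n}-post.png`,
    `- Card rendered: ${r.card}`,
    `- Render quality: ${r.quality}`,
    `- Action: ${r.action}`,
    `- Notes: ${r.notes}`,
  ].join("\n");
}

function renderSummary(results: ScenarioResult[], nextGap: string): string {
  // Assumption: an Action that is 7 to 40 hex characters is a commit SHA.
  const shas = results.map((r) => r.action).filter((a) => /^[0-9a-f]{7,40}$/.test(a));
  const backlog = results.filter((r) => r.action === "logged to backlog").length;
  const passes = results.filter((r) => r.quality === "pass").length;
  return [
    "## Summary",
    `- Scenarios run: ${results.length}`,
    `- Clean passes: ${passes}`,
    `- Commits: ${shas.length > 0 ? shas.join(", ") : "none"}`,
    `- Findings to backlog: ${backlog}`,
    `- Next biggest gap: ${nextGap}`,
  ].join("\n");
}
```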

Hard constraints

PRODUCT.md tone is absolute. No emojis. No em dashes. No exclamation points in user-visible copy.

DESIGN.md tokens are the only color source. Every styles.css value pulls from existing OKLCH tokens.

Don't modify snappy-os skills/libs in a dogfood pass. That's a separate axis.

Don't side-effect externally. No real outbound messages.

Don't run more than 6 minutes per scenario. If it eats more, log a finding and move on.

Reference shots

Claude Desktop "Cowork" reference shots live in refs/cd-{01..23}.png - Robert's design target. See refs/README.md for the per-shot index. cd-01..06 = welcome state, cd-13/14/21 = thread state, cd-16..20 = customize / skills directory, cd-08..12 = projects + scheduled, cd-22..23 = artifacts.

Eval

Actor: the dogfood-running subagent. Captures screenshots, evaluates against the checklist, commits fixes, appends to report. Auditor: the next agent (or Robert) reads /tmp/dogfood-loop-report.md and verifies that scenarios actually passed by re-reading the post-shots. Distinct context = actor ≠ auditor.

Outcome	Score
Report present, ≥3 scenarios, every fail has a fix-or-backlog action	1.0
Report present but a scenario flagged fail with no follow-through	0.5
No report or report missing scenario shots / actions	0.0

Rubric

criteria:
  - name: report_present
    kind: deterministic
    check: "/tmp/dogfood-loop-report.md exists and contains a Summary section."
  - name: scenarios_min_three
    kind: deterministic
    check: "Report contains at least three Scenario blocks, each with Pre, Post, Card rendered, Render quality, Action, Notes lines."
  - name: every_fail_has_action
    kind: judge
    check: "Every scenario marked 'fail' carries either a commit SHA in Action OR an entry appended to state/log/snappy-chat-backlog.md — no orphan failures."
  - name: no_external_sideeffects
    kind: deterministic
    check: "No scenario triggered a real outbound send (no LinkedIn URN, no Gmail draft, no Slack post). The dispatch backend stays text-only or generative-shape only."
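The two report-shape criteria in the rubric above are simple enough to check mechanically. The following is a sketch of what such checks might look like, not the grader this skill actually uses; `reportPresent` and `scenariosMinThree` are hypothetical names, and the report text is passed in as a string so the functions stay free of file access.

```typescript
const REQUIRED = ["Pre", "Post", "Card rendered", "Render quality", "Action", "Notes"];

// report_present: content was read and it has a "## Summary" heading.
function reportPresent(report: string | null): boolean {
  return report !== null && /^## Summary\s*$/m.test(report);
}

// scenarios_min_three: at least three Scenario blocks, each with all six lines.
function scenariosMinThree(report: string): boolean {
  // Split on level-2 headings and keep only the Scenario blocks.
  const blocks = report.split(/^## /m).filter((b) => b.startsWith("Scenario "));
  if (blocks.length < 3) return false;
  return blocks.every((b) => REQUIRED.every((k) => b.includes(`\n- ${k}:`)));
}
```

A caller would read /tmp/dogfood-loop-report.md, pass `null` when the file is missing, and run both checks before handing the remaining criteria to the judge.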

AGENTS.md - what the AI loads when this skill comes up

dogfood-loop - loader

Per-turn QA rules. Reference: state/skills/dogfood-loop/SKILL.md (7-step cadence, eval checklist, scenarios, report format). Design target: state/skills/dogfood-loop/refs/cd-{01..23}.png (Claude Desktop cowork).

Critical Rules

Dual role: user + developer. Drive app via dispatchInChatUI() and resetChatUI(), read screenshots, decide fix-vs-backlog, commit (≤25 LOC/2 files snappy-chat only) or append to ~/projects/snappy-os/state/log/snappy-chat-backlog.md. No external side-effects.
Primitives stack (TCC-intact). chat-drive.ts (dispatch/reset) + desktop.ts (capture) + bridge.ts (exec/file ops) + producer skills. All four required for QA loop.
7-step cadence. Sidecar: reset→pre-shot→push→post-shot→append. Agent: read both shots → evaluate → fix or log. Boundary: sidecar = automation, agent = judgment.
Actor ≠ Auditor. Screenshots are ground truth. Returncode lies; visual integrity governs.
Eval checklist (4 categories). (a) Render integrity: no console errors, user pill anchored/fitted, assistant content visible. (b) Card rendering: chrome (border/shadow/radius), channel hairline, header (verb+channel chips), body (14px/1.6 line-height), action row if used. (c) Layout: composer+Queue, mode chips left, right panel, sidebar compact. (d) Behavior: dispatch done, streamed not dumped, buttons clickable. All pass → PASS; any fail → FAIL with reason.
Fix-vs-backlog threshold. Inline fix: ≤25 LOC AND ≤2 files (snappy-chat only). Larger/cross-repo/≥3 files → backlog entry. No "we'll fix it later" excuses; close the gap immediately or log.
Chat-inject queue is destructive SHIFT. Double-reset required: reset → 200ms wait → reset (single resets miss persistence). In-thread mode: /chat-inject-pop curl call STEALS items from React UI (never poll for queue length). Use fixed 3s wait + /chat-inject-flush fallback instead.
Bridge TCC grants per-process. openclaw bridge loses screen-recording after restart/context-switch. Fallback: direct /dispatch/chat curl SSE stream (verifies shape wire without screenshots). When TCC blocks, use Node.js HTTP client, not peekaboo.
Dispatch control format. /chat-inject-control (not /chat-inject-push) for reset/theme. Flat JSON only: {"action":"reset"} or {"action":"theme:light"}. Nested format returns 400. Field name: text for /chat-inject-push, command for bridge exec.
Peekaboo window-id staleness. Call peekaboo list windows --app "Snappy Chat" --json-output (note: app name with space) BEFORE each capture. --window-index unreliable (orders by isMainWindow, not size). Parse JSON → select bounds[1][0] > 1000 (width >1000px) → use --window-id <id>. One-liner: WID=$(peekaboo list windows --app "Snappy Chat" --json-output | jq -r '.data.windows[] | select(.bounds[1][0] > 1000) | .window_id' | head -1) && peekaboo image --app "Snappy Chat" --window-id $WID --path /tmp/out.png. Fallback: --pid if TCC blocks --app.
tsx cache stale crash. evalLeaderboardRegex FATAL after 1-2 dispatches when cache aged. Fix: rm /var/folders/sq/*/T/tsx-*/*597k* && pkill -f "head-screen", then npm cache clean --force.
No external side-effects. No real LinkedIn posts, emails, Slack writes. Dispatch backend text-only or generative-shape. Serialize intents; shared queue = lost messages if concurrent.
Standard scenario set (6). (1) "compose a linkedin post..." → DispatchCard+Approve/Edit/Reject. (2) "show me what the agents did today" → ProgressList. (3) "what is the snappy-os bundle context" → ContextPanel. (4) "open the file state/skills/desktop/SKILL.md" → WorkingFolder. (5) "give me feedback on the welcome screen design" → FeedbackForm. (6) "are you sure you want to fire ray-update?" → ConfirmDialog.
NotBuiltYet stubs fatal. Every reachable route MUST show real content. "This page is coming soon" = FAIL → log to backlog. Robert has no patience for dead-ends.
AFTER screenshot mandatory before shipped. Robert's rule: "if i say we need to be taking pics then you need that to be part of my business." Edit without BEFORE + AFTER window-targeted screenshot (both Read by agent) ≠ shipped. Capture with peekaboo (see the Peekaboo window-id staleness rule above), then Read /tmp/after.png. If app unreachable, STOP and report - no faith-based claims.
Hard constraints. PRODUCT.md tone: no emojis, em-dashes, exclamation points. DESIGN.md tokens only. Don't touch snappy-os skills/libs. No side-effects. 6 min/scenario max.
Report format. /tmp/dogfood-loop-report.md per scenario: Pre/Post paths, Card rendered (name|text-only|none), Render quality (pass|fail - reason), Action (commit SHA|no fix|logged), Notes (1-2 sentences). Summary: run count, clean passes, commits, backlog findings, next gap.
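The width-based window selection from the peekaboo window-id rule above can also be done in TypeScript instead of jq. A minimal sketch: the JSON shape is inferred only from the jq filter in that rule (`.data.windows[]`, `.bounds[1][0]`, `.window_id`), so the `PeekabooWindow` and `PeekabooList` types and the numeric type of `window_id` are assumptions, and `pickWindowId` is a hypothetical helper.

```typescript
interface PeekabooWindow {
  window_id: number;
  // Assumed layout [[x, y], [width, height]], as the jq filter implies.
  bounds: [[number, number], [number, number]];
}

interface PeekabooList {
  data: { windows: PeekabooWindow[] };
}

// Returns the id of the first window wider than minWidth, or null if none qualifies.
function pickWindowId(json: string, minWidth = 1000): number | null {
  const parsed = JSON.parse(json) as PeekabooList;
  const hit = parsed.data.windows.find((w) => w.bounds[1][0] > minWidth);
  return hit ? hit.window_id : null;
}

const sample = JSON.stringify({
  data: {
    windows: [
      { window_id: 11, bounds: [[0, 0], [300, 40]] },
      { window_id: 42, bounds: [[50, 60], [1440, 900]] },
    ],
  },
});
console.log(pickWindowId(sample));
```

Returning `null` rather than throwing lets the caller fall back to the bridge-backed capture when no window is wide enough.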

Commands

Operation	Command
ui dashboard	state/skills/dogfood-loop/resources/ui.openui
run scenarios	`npx tsx state/bin/dogfood-loop/run.ts --scenarios state/skills/dogfood-loop/scenarios/default.json --out /tmp/dogfood-loop-report.md`
reset welcome (double-reset pattern)	`curl -XPOST 127.0.0.1:3147/chat-inject-control -d '{"action":"reset"}'; sleep 0.2; curl -XPOST 127.0.0.1:3147/chat-inject-control -d '{"action":"reset"}'`
dispatch config set	`curl -XPOST 127.0.0.1:3147/dispatch-config -H "Content-Type: application/json" -d '{"axis":"chat","backend":"openrouter","model":"gemini"}'`
app check	`pgrep -af "/Applications/SnappyChat.app/Contents/MacOS"`
app restart	`pkill -f /Applications/SnappyChat.app; sleep 1; open /Applications/SnappyChat.app`
bridge token path	`grep "^OPENCLAW_BRIDGE_TOKEN=" ~/projects/snappy-os/.env.cache | cut -d= -f2- | tr -d '"'`
start head-screen	`bash ~/projects/snappy-os/state/bin/head-screen/launch.sh`
direct dispatch API (shape wire test)	`curl -s -XPOST 127.0.0.1:3147/dispatch/chat -H "Content-Type: application/json" -d '{"intent":"<phrase>","threadId":"qa"}' | jq '.tool_shapes,.server_shapes'`
theme control	`curl -XPOST 127.0.0.1:3147/chat-inject-control -d '{"action":"theme:light"}'` (or theme:dark)
chat-inject push	`curl -XPOST 127.0.0.1:3147/chat-inject-push -d '{"text":"<intent>","newThread":true}'` (uses `text` field, not `message`; `newThread` isolates QA probes from Robert's active thread)
peekaboo window list	`peekaboo list windows --app "Snappy Chat"` (note: subcommand is "list windows" not "window-list")
bridge-backed capture (preferred)	`cd ~/projects/snappy-os && npx tsx state/lib/desktop.ts capture-window "Snappy Chat" /tmp/out.png` (routes through `state/lib/bridge.ts` so the helper daemon's Screen Recording grant is used)
peekaboo capture (correct)	`WID=$(peekaboo list windows --app "Snappy Chat" --json-output | jq -r '.data.windows[] | select(.bounds[1][0] > 1000) | .window_id' | head -1) && peekaboo image --app "Snappy Chat" --window-id $WID --path /tmp/out.png` (do NOT use --window-index - it is unreliable; pick by width via window_id)
clear tsx cache	`rm /var/folders/sq/*/T/tsx-501/*597k* 2>/dev/null; pkill -f "head-screen"`
build app	`bash ~/projects/snappy-chat/scripts/build-app.sh --install`

Failure Modes

Symptom	Cause	Action
Text dispatched but composer never receives it	Polling stopped or React un-mounted	Restart app: `pkill -f /Applications/SnappyChat.app; sleep 1; open /Applications/SnappyChat.app`
Welcome persists after dispatch	React setter didn't fire or injectIntoComposer failed	Verify App.tsx:injectIntoComposer submit click reaches activeElement; check display:none
"Unknown dispatch error"	`/dispatch/chat` 5xx, often misconfigured axis	`curl -XPOST 127.0.0.1:3147/dispatch-config -d '{"axis":"chat","backend":"openrouter","model":"gemini"}'`
Build fails	TS error or stale Resources	Read error; never `--no-verify`
Screenshot blank/white	TCC denied or display asleep	Use direct /dispatch/chat API; never retry screencapture
`peekaboo ... PERMISSION_ERROR_SCREEN_RECORDING` in this shell	Raw CLI process lacks Screen Recording even though the bridge daemon has it	Use `npx tsx state/lib/desktop.ts capture-window "Snappy Chat" /tmp/out.png` so the call routes through `state/lib/bridge.ts`
Process stuck at /private/tmp	Old build still running	`pkill -f /private/tmp.*SnappyChat; sleep 0.5; open /Applications/SnappyChat.app`
Double-reset doesn't work	First reset enqueues but doesn't fire UI	Add 3s wait between resets; verify welcome unmounts via screenshot
chat-inject-pop consumed before React polls	Manual curl GET /chat-inject-pop steals queue items	NEVER use curl to check queue length; use fixed-time wait or /chat-inject-flush
Server crash after 1-2 dispatches	tsx cache stale (evalLeaderboardRegex FATAL)	`rm /var/folders/sq/*/T/tsx-*/*597k*; pkill -f "head-screen"`

Self-Test

[ ] Can explain subagent as both user AND developer?
[ ] Know the four primitives libs (chat-drive, desktop, bridge, producers)?
[ ] Know sidecar handles steps 1-4+7, you handle 5-6?
[ ] Understand actor ≠ auditor (you capture, screenshots grade)?
[ ] Can list 4 eval categories: render integrity, card rendering, layout quality, behavior?
[ ] Know ≤25 LOC / ≤2 files fix threshold?
[ ] Know 6 standard scenarios in priority order?
[ ] Know hard constraints: no emojis, DESIGN.md tokens, no snappy-os mods, no side-effects, 6 min/scenario?
[ ] Know report format: Scenario N blocks + Summary section?
[ ] Know eval scoring: 1.0 (report+3+actions), 0.5 (orphan fails), 0.0 (no report)?
[ ] Can restart the app without losing head-screen?
[ ] Know dispatch-config format: {"axis":"chat","backend":"...","model":"..."}?
[ ] Know when to commit vs when to backlog?
[ ] Know bridge token at repo root .env.cache (not ~/.env.cache)?

Self-correcting loader (PID feedback)

This is the rule that makes loaders learn. Ported from snappy-kernel, where it demonstrably worked: the snappy-dom-cartographer skill mapped Skool admin UI once, every later ship-agent inherited the full map for free.

Before you finish, do two things:

1. Fix gaps (P - proportional). MANDATORY. If this loader didn't cover your case - if you had to read another file, run an undocumented command, work around a wrong selector, or discover a quirk - you MUST attempt an Edit to this AGENTS.md before you log.

Default action: edit this file. Surgical. Add the missing line, table row, or code example where it belongs. No restructuring.

Only LOGGED is allowed when: the fix needs >10 lines, spans multiple files, or requires a structural rewrite. In that case the state/regen/drain.sh queue picks it up asynchronously.

"I didn't have time" / "it's minor" / "the next agent will figure it out" are NOT valid reasons. Minor gaps compound - that's the whole point of this loop.

Concrete example of a valid P-fix: discover agent-browser screenshot takes path positionally not as --output <path>. One line in a table. Edit it in. Do not just log it.

The goal: the next agent never has to leave the loader.

2. Log the result.

echo "[$(date -u +%FT%TZ)] dogfood-loop: <what was missing or fixed> [FIXED|LOGGED] action_kind=<kind>" >> state/log/loader-feedback.log

FIXED = you patched this loader inline (P-fix).
LOGGED = too large for inline; the PostToolUse enqueue + Stop-hook drain will rewrite the loader from scratch on next session-end.

action_kind: shape-ok | skill-ran | loader-rewritten | pattern-elevated

Do not skip this. Every agent run must leave the system better than it found it. The loader is the setpoint; you are the sensor; the gap is the error signal; closing the gap is the correction.
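The feedback line from the echo command above can also be built programmatically, which makes the allowed status and action_kind values explicit. A sketch only: `utcStamp` and `feedbackLine` are hypothetical helpers, and appending the line to state/log/loader-feedback.log is left to the caller.

```typescript
type Status = "FIXED" | "LOGGED";
type ActionKind = "shape-ok" | "skill-ran" | "loader-rewritten" | "pattern-elevated";

// Mirrors `date -u +%FT%TZ`: UTC, second precision, trailing Z.
function utcStamp(d: Date): string {
  return d.toISOString().replace(/\.\d{3}Z$/, "Z");
}

function feedbackLine(d: Date, what: string, status: Status, kind: ActionKind): string {
  return `[${utcStamp(d)}] dogfood-loop: ${what} [${status}] action_kind=${kind}`;
}

console.log(
  feedbackLine(new Date(), "added bridge-backed capture row", "FIXED", "skill-ran"),
);
```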

OpenUI Resource

Skill-owned OpenUI Lang resource: state/skills/dogfood-loop/resources/ui.openui. Read it before rendering or editing this skill's generated component surface.
Treat this resource as a first-class artifact of the skill, not a generic chat response. Improve it when the skill's user-facing output needs to become richer.
System resources compose OpenUI primitives and inherit SnappyChat tokens. Use ui_contract: branded in SKILL.md only for deliberate platform or client visuals.

api.ts - the code it can call

⚠ no api.ts - this skill has no typed action surface

scripts - helper scripts it can run

prose-only skill - 4 inline code blocks live in SKILL.md above (no state/bin/ sidecar yet).

how we check it - the checks, plus the last 10 runs

rubric auto-shape no rubric declared

recent mean 6.64 · 10 runs actor/auditor: unverifiable

deps none declared

timestamp	verb	score	primary_issue	artifact
2026-05-02 20:56Z	-	0.50	-	-
2026-05-01 05:50Z	-	60.00	-	-
2026-05-01 03:22Z	-	1.00	-	-
2026-05-01 01:16Z	-	0.30	-	-
2026-05-01 01:09Z	-	0.70	-	-
2026-04-30 23:08Z	-	0.70	-	-
2026-04-30 21:00Z	-	0.85	-	-
2026-04-30 20:59Z	-	0.85	-	-
2026-04-30 20:58Z	-	0.75	-	-
2026-04-30 20:56Z	-	0.75	-	-
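The recent mean shown above can be reproduced from the ten score values in the table. The short sketch below does the arithmetic and also computes the mean of the rows at or below 1.0; reading that cutoff as the intended scale is an assumption based on the Eval score table, not something the run history states.

```typescript
// Scores copied from the ten rows above, newest first.
const scores = [0.5, 60.0, 1.0, 0.3, 0.7, 0.7, 0.85, 0.85, 0.75, 0.75];

const mean = (xs: number[]): number => xs.reduce((a, b) => a + b, 0) / xs.length;

// Mean over all ten rows, as reported in the header line.
const overall = mean(scores);

// Assumption: scores are meant to sit on a 0 to 1 scale, so keep only those rows.
const withinScale = mean(scores.filter((s) => s <= 1));

console.log(overall.toFixed(2), withinScale.toFixed(2));
```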