chat-drive
What it does for you
Lets you kick off a task by simply typing what you want into chat.
What it produces
A recent result, so you can see the kind of work it returns.
How to get it
These run inside the Snappy workspace. Want this working in your business? I set skills like this up with you, in one focused week.
For developers - how this skill is built, graded, and how it runs
at a glance - the short version
what's inside - the parts that make up a skill (3/4 present)
A skill is just a few plain-text files. Only the main one is required. The rest are optional, added as the work needs them. This is what the skill is made of; how it runs is just below.
| file | status |
|---|---|
| state/skills/chat-drive/SKILL.md | present |
| state/lib/chat-drive.ts | present |
| state/bin/chat-drive/ | not present |
| state/skills/chat-drive/AGENTS.md | present |
how it runs - the shared frame every skill uses (5/5 present)
Every skill runs the same way. One part does the work, a separate part checks it, and a short loader hands the AI exactly what it needs for the job. Anything this skill doesn't use shows a one-line note saying why, so the omission is deliberate rather than accidental.
state/log/evals.ndjson what it has learned - fixes written back in over time sample
When a run hits something this skill didn't handle, the fix gets written back into the skill so it doesn't happen again. FIXED means it was corrected on the spot. LOGGED means it's queued for a bigger rewrite. Either way, the skill gets a little better and never makes the same mistake twice.
how the work flows - who makes it, who checks it
SKILL.md - the skill, written out in plain English
chat-drive
Push text into the snappy-chat composer from any agent, anywhere. The text flows through the same OpenUI submit path the human types into: processMessage → /dispatch/chat → AG-UI stream → generative-UI cards rendered in the live React tree. Same store, same surface, same eyes.
This is the missing primitive for closed-loop snappy-chat dogfood: an agent can now type intent and watch real cards stream in, then audit by screenshot.
What it's for
- Dogfood loops. A subagent pushes a stress-test intent, screenshots the
result, grades the rendered card. The actor (push) and the auditor (read the screenshot) are necessarily distinct - the contract holds for free.
- Automated UX QA. Verify the welcome surface unmounts on first message,
user-pill alignment, dispatch-card variants, etc., end-to-end through the rendered DOM.
- Recursive subagent dispatch. A long-running agent can re-enter the chat
surface mid-task by pushing a follow-up intent. The chat is the agent's outbox.
When NOT to use it
- Anything that doesn't need the rendered UI. If you don't care about the
React tree, call the head-screen server's /dispatch/chat directly, or use the dispatch skill. Running through the chat surface adds streaming latency for no reason.
- As a synthesis transport. The bridge is a queue, not an RPC channel -
there's no callback when streaming finishes. Use /dispatch/chat directly when you need the response programmatically.
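When the response is needed programmatically, the direct call can be sketched as a small request builder. This is a minimal sketch, not the project's code: the body fields (`intent`, `threadId`) follow the direct-dispatch row in the loader's command table, the base URL is the documented HEAD_SCREEN_URL default, and the helper name `buildDirectDispatch` is hypothetical. How the streamed response is parsed is left out because this page does not specify it.

```typescript
// Hypothetical helper: builds the request for a direct POST to /dispatch/chat.
// Field names come from the loader's command table; the rest is an assumption.
interface DirectDispatchRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

function buildDirectDispatch(
  intent: string,
  threadId?: string,
  base = "http://127.0.0.1:3147", // documented default for HEAD_SCREEN_URL
): DirectDispatchRequest {
  if (intent.length === 0) {
    throw new Error("intent (non-empty string) required");
  }
  const body: { intent: string; threadId?: string } = { intent };
  if (threadId) body.threadId = threadId;
  return {
    url: `${base}/dispatch/chat`,
    init: {
      method: "POST",
      headers: { "content-type": "application/json" },
      body: JSON.stringify(body),
    },
  };
}

// Usage (assumes the head-screen server is running):
//   const { url, init } = buildDirectDispatch("what did the agents do today");
//   const res = await fetch(url, init); // then read the stream from res.body
```

Keeping the builder separate from the network call means the request shape can be checked without a live server.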
Steps
- Verify the head-screen server is up. The bridge endpoints live on it.
bash ~/projects/snappy-os/state/bin/head-screen/launch.sh # idempotent
- Verify snappy-chat is running and on screen so the polled push lands somewhere.
pgrep -af "/Applications/SnappyChat.app/Contents/MacOS/SnappyChat"
If it's not, build + install:
cd ~/projects/snappy-chat && bash scripts/build-app.sh --install
- Push the intent.
npx tsx -e "
import { dispatchInChatUI } from './state/lib/chat-drive.ts';
await dispatchInChatUI('what did the agents do today', { waitForFirstFrame: 12000 });
"
The default waitForFirstFrame is 8000ms. Pass a larger value when the target backend is slow (Claude Code: 12-15s; openrouter/gemini: 6-10s).
- Audit by screenshot. The bridge has no completion callback - actor ≠ auditor.
npx tsx -e "
import { captureScreen } from './state/lib/desktop.ts';
const path = await captureScreen('/tmp/chat-drive-verify.png');
console.log(path);
"
Then Read the PNG. Welcome surface unmounted + user pill on the right + assistant card streaming = bridge working.
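The steps above can be tied together in one function. The sketch below is an assumption-laden illustration: `runProbe` and `ProbeDeps` are hypothetical names, and the three operations are passed in as parameters so the flow can be exercised without a live server. In real use they would be chatDriveAvailable, dispatchInChatUI, and captureScreen.

```typescript
// The three operations the steps rely on, injected so the flow is testable.
interface ProbeDeps {
  available: () => Promise<boolean>;
  push: (text: string, opts: { waitForFirstFrame: number }) => Promise<void>;
  capture: (path: string) => Promise<string>;
}

// Hypothetical wrapper: preflight, push, then screenshot.
// It returns the screenshot path; judging the image stays with the auditor.
async function runProbe(
  deps: ProbeDeps,
  intent: string,
  waitForFirstFrame = 8000, // documented default
  shotPath = "/tmp/chat-drive-verify.png",
): Promise<string> {
  if (!(await deps.available())) {
    throw new Error("head-screen server unreachable; run launch.sh first");
  }
  await deps.push(intent, { waitForFirstFrame });
  return deps.capture(shotPath);
}
```

The function deliberately returns only a path. Whether the card rendered correctly is decided by whoever reads the PNG, which keeps actor and auditor separate.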
Library API
state/lib/chat-drive.ts exports three functions. Importable from any TS agent code; also runnable as a CLI smoke.
export async function dispatchInChatUI(
text: string,
opts?: { waitForFirstFrame?: number } // default 8000ms
): Promise<void>;
export async function resetChatUI(
opts?: { waitMs?: number } // default 1500ms
): Promise<void>;
export async function chatDriveAvailable(): Promise<boolean>;
resetChatUI is for multi-scenario dogfood loops: clears the thread and brings the welcome surface back so the next dispatchInChatUI lands in a clean state. Same FIFO as text pushes (single ordering), distinct /chat-inject-control endpoint so wire shape is unambiguous.
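A multi-scenario loop built on that reset could look like the sketch below. `runScenarios` and `LoopDeps` are hypothetical names, the screenshot file naming is an assumption, and the operations are injected (resetChatUI, dispatchInChatUI, and captureScreen in real use) so the ordering can be checked in isolation.

```typescript
interface LoopDeps {
  reset: () => Promise<void>;                 // e.g. resetChatUI
  push: (text: string) => Promise<void>;      // e.g. dispatchInChatUI
  capture: (path: string) => Promise<string>; // e.g. captureScreen
}

// Hypothetical loop: reset, push, screenshot, once per intent, strictly in order.
async function runScenarios(
  deps: LoopDeps,
  intents: string[],
  dir = "/tmp",
): Promise<string[]> {
  const shots: string[] = [];
  for (let i = 0; i < intents.length; i++) {
    await deps.reset();          // clean welcome surface before each intent
    await deps.push(intents[i]); // queue the text
    shots.push(await deps.capture(`${dir}/scenario-${i}.png`));
  }
  return shots;
}
```

Scenarios run sequentially on purpose: the reset and the text push share one FIFO, so awaiting each step preserves their order.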
CLI:
npx tsx state/lib/chat-drive.ts "say hello in three words"
HEAD_SCREEN_URL env var overrides the default http://127.0.0.1:3147.
Architecture (one paragraph)
dispatchInChatUI POSTs to the head-screen server's /chat-inject-push endpoint, which appends the text to a 50-slot in-memory FIFO that evicts the oldest item on overflow. The snappy-chat React app mounts a polling effect that hits GET /chat-inject-pop every 500ms; on a hit, it locates whichever OpenUI composer is currently visible (welcome OR thread variant), writes through React's native value setter to trigger the textarea's onChange, then clicks the submit button. From there, the real processMessage path takes over - same code as a human typing.
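The queue behavior described above (fixed capacity, oldest item dropped on overflow) can be illustrated with a small class. This is a generic sketch of a bounded FIFO, not the server's implementation; the class name `BoundedFifo` and its methods are hypothetical.

```typescript
// Minimal bounded FIFO: when full, the oldest item is dropped to make room.
class BoundedFifo<T> {
  private items: T[] = [];
  private readonly capacity: number;

  constructor(capacity: number) {
    if (capacity < 1) throw new Error("capacity must be >= 1");
    this.capacity = capacity;
  }

  // Append an item; returns the evicted (oldest) item if the queue overflowed.
  push(item: T): T | undefined {
    this.items.push(item);
    return this.items.length > this.capacity ? this.items.shift() : undefined;
  }

  // Remove and return the oldest item, or undefined when the queue is empty.
  pop(): T | undefined {
    return this.items.shift();
  }

  get depth(): number {
    return this.items.length;
  }
}

// Usage: a 50-slot queue, matching the capacity quoted above.
const queue = new BoundedFifo<string>(50);
queue.push("say hello in three words");
```

With this shape, a push never fails for lack of space. The cost is that an undrained queue silently loses its oldest entries.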
For QA probes and dogfood agents, include "newThread": true in the push body. The app resets to a fresh chat before dispatching that item so probe traffic does not land in Robert's active thread. Omit it only when the intent is deliberately a follow-up in the current conversation.
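The lib's dispatchInChatUI, as listed on this page, sends only text and agentId, so a probe that needs newThread would build the push body itself. A minimal sketch, assuming the server accepts the three fields together; the helper name `probePushBody` is hypothetical.

```typescript
// Body for POST /chat-inject-push. `text` and `agentId` appear in the lib;
// `newThread` comes from the paragraph above. The helper is hypothetical.
interface PushBody {
  text: string;
  newThread?: boolean;
  agentId?: string;
}

function probePushBody(text: string, agentId?: string): PushBody {
  if (text.length === 0) throw new Error("text (non-empty string) required");
  const body: PushBody = { text, newThread: true }; // probes start a fresh chat
  if (agentId) body.agentId = agentId;
  return body;
}

// Usage (assumes the head-screen server is running on the default port):
//   await fetch("http://127.0.0.1:3147/chat-inject-push", {
//     method: "POST",
//     headers: { "content-type": "application/json" },
//     body: JSON.stringify(probePushBody("stress-test intent")),
//   });
```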
Eval
Actor: dispatchInChatUI(text) - pushes onto the queue. Auditor: the audit harness re-reads the lib's exported function shape (present, async, two parameters); the user-facing audit is "is the rendered card correct" via screenshot, deliberately outside the lib.
Eval kind: shape. Mechanical: import the lib, assert dispatchInChatUI and chatDriveAvailable exist as functions, type-check passes. Logged as the skill's eval row in state/log/evals.ndjson.
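The mechanical part of that shape check can be sketched as a function that reports which expected exports are missing. This is an illustration, not the audit harness itself; `missingFunctions` is a hypothetical name.

```typescript
// Returns the names that are absent from the module or are not functions.
function missingFunctions(
  mod: Record<string, unknown>,
  names: string[],
): string[] {
  return names.filter((name) => typeof mod[name] !== "function");
}

// Usage (assumes the lib path from this page):
//   const lib = await import("./state/lib/chat-drive.ts");
//   const missing = missingFunctions(lib, ["dispatchInChatUI", "chatDriveAvailable"]);
//   if (missing.length > 0) throw new Error(`missing exports: ${missing.join(", ")}`);
```

One caveat for any parameter-count assertion: JavaScript's Function.length stops counting at the first parameter with a default value, so it reports 1 for a function declared as (text, opts = {}).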
Pitfalls
- The head-screen server must be alive. The bridge IS the head-screen
server. If /healthz doesn't answer, push will throw. chatDriveAvailable() is the cheap precheck.
- No completion callback. waitForFirstFrame is the only synchronization
knob. Tune it per backend, then screenshot.
- Restart drops queued pushes. The queue is in-memory by design - restart
= empty. If a dogfood loop relies on durability across restarts, you're using the wrong primitive.
- The bridge is loopback only. No external network exposure. The CORS
headers are wide so file:// origins (WKWebView) work; the listener is bound to 127.0.0.1.
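Tuning waitForFirstFrame per backend, as the pitfalls suggest, can be reduced to a lookup. A minimal sketch, assuming the upper ends of the ranges quoted in the Steps section (Claude Code 15s, openrouter/gemini 10s) and the 8000ms default otherwise. The backend labels and the function name `firstFrameBudget` are hypothetical, and `pollLagMs` is a parameter so the caller supplies the React poll interval in effect.

```typescript
type Backend = "claude-code" | "openrouter" | "gemini"; // hypothetical labels

// Upper ends of the documented ranges, in milliseconds.
const UPPER_BOUND_MS: Record<Backend, number> = {
  "claude-code": 15000,
  openrouter: 10000,
  gemini: 10000,
};

const DEFAULT_WAIT_MS = 8000; // documented default for waitForFirstFrame

// Wait budget = backend upper bound (or the default) + one poll interval.
function firstFrameBudget(backend: Backend | undefined, pollLagMs: number): number {
  const base = backend ? UPPER_BOUND_MS[backend] : DEFAULT_WAIT_MS;
  return base + pollLagMs;
}

// Usage:
//   await dispatchInChatUI(text, { waitForFirstFrame: firstFrameBudget("claude-code", 1000) });
```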
Files
- state/lib/chat-drive.ts - the API (importable + CLI).
- state/bin/head-screen/server.ts - owns the queue endpoints (POST /chat-inject-push, POST /chat-inject-control, GET /chat-inject-pop).
- ~/projects/snappy-chat/web/src/App.tsx - the React polling effect and composer-injection helper that drains the queue.
AGENTS.md - what the AI loads when this skill comes up
chat-drive - loader
Per-turn rules for chat-drive. Push text via queue to the live React chat. Full reference: state/skills/chat-drive/SKILL.md. Lib: state/lib/chat-drive.ts. Server: state/bin/head-screen/server.ts. Consumer: ~/projects/snappy-chat/web/src/App.tsx (poll 1000ms).
Critical Rules
- Actor (push) ≠ auditor (screenshot). `dispatchInChatUI()` queues text. Lib never reports "did card render" - audit via `captureScreen` + Read. No callback when streaming finishes.
- `waitForFirstFrame` is ONLY sync knob. Default 8000ms. Tune per backend: Claude Code 12-15s, openrouter/gemini 6-10s, plus 1000ms React poll lag. After push, screenshot to audit.
- Head-screen server MUST be alive. Pre-flight: `chatDriveAvailable()` or `bash state/bin/head-screen/launch.sh` (idempotent). Verify: `/healthz` answers.
- snappy-chat must be running and visible. Push → in-memory 30-slot FIFO. No React polling = silent eviction. Confirm: `pgrep -af "/Applications/SnappyChat.app"`.
- Queue is in-memory only, 30s TTL per item, not restart-safe: a server restart wipes it. Never durability-dependent. Drain before testing: `curl -XPOST /chat-inject-flush` returns flushed count.
- React polls every 1000ms (App.tsx:337). Each push incurs up to 1000ms before composer sees it. Factor into `waitForFirstFrame` budget.
- tsx never hot-reloads server.ts. After any server.ts edit, restart the server process. Verify running code: `pgrep -af server.ts` + `git log` match.
- Don't push faster than dispatcher streams. Wait for RUN_FINISHED before next text push. Control actions (`open-cowork`, `stop`, `theme:*`, `view-*`) must stay live while a stream is running; the React poller calls `/chat-inject-pop?busy=1` during active streams so controls pop but text stays queued server-side until RUN_FINISHED. Use `resetChatUI()` (NOT peekaboo) between scenarios.
- resetChatUI() does NOT flush. It sends control items for nav reset (welcome). Pre-flush with `/chat-inject-flush` if stale pushes queued.
- Activate app before screenshotting. SnappyChat on secondary Space = wallpaper screenshot. Run `osascript -e 'tell application "Snappy Chat" to activate'` first (1-2s wait).
- React pre-fetches queue items. App polls `/chat-inject-pop` before dispatch. Use endpoint `/chat-inject-flush`, never manual curl drain (competes with React).
- Queue inspection is non-consuming. GET `/chat-inject?agentId=ui` returns `{items, depth, controlDepth, textDepth}` without shifting the queue. Never use GET `/chat-inject-pop` for inspection; pop consumes and can steal React's next item.
- Concurrent `claude -p` / `claude --continue` compete for queue. Symptom: your pushes hit `queued:N`, then pop returns empty, no matching `intent_chars` in dispatch-chat.ndjson. Mitigation: serialize QA or use direct `/dispatch/chat` POST (isolated SSE stream, independent thread).
- Server crashes wipe in-memory queue. Failure mode (2026-04-30+): `FATAL evalLeaderboardRegex undefined` from server.ts. Verify uptime: `ps -p $(pgrep -f server.ts | head -1) -o etime`. If elapsed time reset between push and audit, queue is gone. Check log: `tail state/log/head-screen.log | grep FATAL`.
- Force fresh `crypto.randomUUID()` for each request messageId. Reusing any id (including OpenUI's optimistic user-message id) causes the assistant message to collide in the store reducer and produces a duplicate-render bug. This is in `App.tsx:processMessage` - do not remove the `crypto.randomUUID()` call.
- Audit with window-targeted screenshot. After dispatch, capture: `peekaboo screenshot --app "Snappy Chat" --window-index 1 --output /tmp/after-dispatch.png`. App name has a space. Window index 1 is the real chat window (index 0 is the helper).
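The flush-then-inspect rules can be combined into one pre-test step. This is a sketch under stated assumptions: the endpoint paths and the response fields (`flushed`, `items`, `depth`, `controlDepth`, `textDepth`) are taken from this loader, while `flushAndInspect`, `MiniFetch`, and `MiniResponse` are hypothetical names. The fetch function is injected so the logic can run without a live server; the global fetch should be assignable to it.

```typescript
// Minimal structural types so the sketch does not depend on DOM typings.
interface MiniResponse {
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
}
type MiniFetch = (url: string, init?: { method?: string }) => Promise<MiniResponse>;

interface QueueSnapshot {
  items: unknown[];
  depth: number;
  controlDepth: number;
  textDepth: number;
}

// Flush stale pushes, then read the queue without consuming anything.
async function flushAndInspect(
  fetchFn: MiniFetch,
  base = "http://127.0.0.1:3147",
  agentId = "ui",
): Promise<{ flushed: number; after: QueueSnapshot }> {
  const flushRes = await fetchFn(`${base}/chat-inject-flush`, { method: "POST" });
  if (!flushRes.ok) throw new Error(`chat-inject-flush ${flushRes.status}`);
  const { flushed } = (await flushRes.json()) as { flushed: number };

  // Non-consuming inspection: GET /chat-inject, never /chat-inject-pop.
  const peekRes = await fetchFn(`${base}/chat-inject?agentId=${encodeURIComponent(agentId)}`);
  if (!peekRes.ok) throw new Error(`chat-inject ${peekRes.status}`);
  const after = (await peekRes.json()) as QueueSnapshot;
  return { flushed, after };
}

// Usage: const { flushed, after } = await flushAndInspect(fetch);
```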
Commands
| operation | command |
|---|---|
| ui dashboard | state/skills/chat-drive/resources/ui.openui |
| push (TS) | `import { dispatchInChatUI, resetChatUI, chatDriveAvailable } from "./state/lib/chat-drive.ts"` |
| push (CLI) | `npx tsx state/lib/chat-drive.ts "<intent>"` (all args = text, no control) |
| preflight | `chatDriveAvailable()` (pings /healthz, returns bool) |
| server | `bash state/bin/head-screen/launch.sh` (idempotent) |
| verify chat running | `pgrep -af "/Applications/SnappyChat.app"` |
| verify server PID | `pgrep -af server.ts` (match git log for running code) |
| activate app | `osascript -e 'tell application "Snappy Chat" to activate'` (1-2s before screenshot) |
| drain queue | `curl -XPOST 127.0.0.1:3147/chat-inject-flush` (returns {flushed:N}) |
| reset to welcome | `curl -XPOST /chat-inject-control -d '{"action":"reset"}'` (does NOT flush) |
| navigate sidebar | `curl -XPOST /chat-inject-control -d '{"action":"view-artifacts\|view-files\|view-chat\|view-scheduled\|view-customize\|view-projects"}'` - Skills tab = view-files (NOT view-skills) |
| switch to thread | `curl -XPOST /chat-inject-control -H "Content-Type: application/json" -d '{"action":"select-thread","threadId":"<uuid>"}'` - full message history replays immediately. Use `curl -s http://127.0.0.1:3147/threads` to list thread IDs + titles. |
| direct dispatch | `curl -XPOST /dispatch/chat -d '{"intent":"<text>","threadId":"<id>"}'` (bypass queue) |
| check competing consumers | `pgrep -af "claude.*continue\|claude.*-p" \| wc -l` (>1 = contention) |
| verify push dispatched | `tail state/log/dispatch-chat.ndjson \| grep intent_chars:<N>` (match intent length) |
| screenshot | `npx tsx state/lib/desktop.ts capture-screen /tmp/path.png` |
| server uptime | `ps -p $(pgrep -f server.ts \| head -1) -o etime` (reset = queue wiped) |
| server log | `tail state/log/head-screen.log \| grep FATAL` (spot crash) |
| env override | HEAD_SCREEN_URL (default http://127.0.0.1:3147) |
| reference | state/skills/chat-drive/SKILL.md |
| eval log | state/log/evals.ndjson (skill: chat-drive) |
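The control rows in the table can be given a typed shape so an invalid action such as view-skills fails at compile time. A minimal sketch covering only the reset, view-*, and select-thread actions shown in the table; the type and function names are hypothetical, and other control actions are left out.

```typescript
type ViewAction =
  | "view-artifacts"
  | "view-files" // the Skills tab, per the table
  | "view-chat"
  | "view-scheduled"
  | "view-customize"
  | "view-projects";

type ControlMessage =
  | { action: "reset" }
  | { action: ViewAction }
  | { action: "select-thread"; threadId: string };

// Serializes a control message for POST /chat-inject-control.
function controlBody(msg: ControlMessage): string {
  return JSON.stringify(msg);
}

// Usage:
//   controlBody({ action: "view-files" });
//   controlBody({ action: "select-thread", threadId: "<uuid>" });
```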
Self-Test
An agent reading this should correctly:
- [ ] Pre-flight `chatDriveAvailable()` before push?
- [ ] Tune `waitForFirstFrame` for backend + 1000ms React poll lag?
- [ ] Audit by screenshot Read, NOT lib return values?
- [ ] Keep both snappy-chat and head-screen alive?
- [ ] Use `resetChatUI()` (not peekaboo) between scenarios?
- [ ] Restart server after any server.ts edit (tsx no hot-reload)?
- [ ] Serialize pushes; wait for RUN_FINISHED before next?
- [ ] Drain with `/chat-inject-flush` before QA tests?
- [ ] Know React pre-fetches (no callback on dispatch complete)?
- [ ] Activate app via osascript before screenshot on secondary Space?
- [ ] Know resetChatUI() does NOT flush (pre-flush if stale items queued)?
- [ ] Check `pgrep claude.*-p` for competing queue consumers?
<!-- kernel-ok: historical attribution for the PID-loader pattern. -->
Self-correcting loader (PID feedback)
This is the rule that makes loaders learn. Ported from snappy-kernel, where it demonstrably worked: the snappy-dom-cartographer skill mapped Skool admin UI once, every later ship-agent inherited the full map for free.
Before you finish, do two things:
1. Fix gaps (P - proportional). MANDATORY. If this loader didn't cover your case, you MUST attempt an Edit to this AGENTS.md before you log.
- Default action: edit this file. Surgical. Add the missing line, table row, or code example.
- Only `LOGGED` is allowed when: the fix needs >10 lines or spans multiple files.
- The goal: the next agent never has to leave the loader.
2. Log the result.
echo "[$(date -u +%FT%TZ)] chat-drive: <what was missing or fixed> [FIXED|LOGGED] action_kind=<kind>" >> state/log/loader-feedback.log
Do not skip this. Every agent run must leave the system better than it found it.
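For agents that log from TypeScript rather than the shell, the same line format can be produced by a small formatter. A sketch that mirrors the echo command above; `formatLoaderFeedback` is a hypothetical name, and the timestamp is trimmed to whole seconds to match `date -u +%FT%TZ`.

```typescript
type Outcome = "FIXED" | "LOGGED";

// Builds one loader-feedback line in the same shape as the echo command.
function formatLoaderFeedback(
  when: Date,
  note: string,
  outcome: Outcome,
  actionKind: string,
): string {
  // toISOString() yields e.g. 2026-04-30T07:51:00.000Z; drop the milliseconds.
  const stamp = when.toISOString().replace(/\.\d{3}Z$/, "Z");
  return `[${stamp}] chat-drive: ${note} [${outcome}] action_kind=${actionKind}`;
}

// Usage (Node): append the line to state/log/loader-feedback.log with
// appendFileSync from "node:fs", adding a trailing newline.
```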
OpenUI Resource
- Skill-owned OpenUI Lang resource: `state/skills/chat-drive/resources/ui.openui`. Read it before rendering or editing this skill's generated component surface.
- Treat this resource as a first-class artifact of the skill, not a generic chat response. Improve it when the skill's user-facing output needs to become richer.
- System resources compose OpenUI primitives and inherit SnappyChat tokens. Use `ui_contract: branded` in SKILL.md only for deliberate platform or client visuals.
api.ts - the code it can call
// snappy-chat-drive/api.ts
//
// Push text into the snappy-chat composer programmatically. The bridge is
// the head-screen server's chat-inject FIFO: this lib POSTs to /chat-inject-push,
// the snappy-chat WKWebView polls /chat-inject-pop on a 500ms interval and
// runs the text through the real OpenUI submit path (processMessage →
// /dispatch/chat). The result: dogfood loops, automated UX QA, and recursive
// subagent dispatch all flow through the actual chat surface — same React
// store, same generative-UI cards — instead of trying to drive WKWebView
// with synthetic clicks (peekaboo's clickAt does not fire React onClick on
// WKWebView).
//
// Sync contract: there is NO callback when the chat finishes streaming. The
// caller is the actor (push); the auditor is whatever reads a screenshot
// afterward. `waitForFirstFrame` is a coarse sleep so the dispatcher has
// time to start streaming before the auditor captures.
const HEAD_SCREEN_BASE = process.env.HEAD_SCREEN_URL ?? "http://127.0.0.1:3147";
const DEFAULT_FIRST_FRAME_MS = 8_000;
export interface DispatchInChatUIOpts {
  /**
   * Sleep duration after the push so the dispatcher has time to start
   * streaming. Default 8000ms. Pass 0 to return immediately.
   */
  waitForFirstFrame?: number;
  /**
   * Per-agent queue isolation key. The server keeps a Map<agentId, queue>
   * so parallel QA subagents don't share a single FIFO. Default "ui" matches
   * the snappy-chat React poll loop, so omitting this routes pushes to the
   * actual cockpit. Pass a stable identifier (e.g. "qa-broad-smoke",
   * "dogfood-loop2") to isolate from the cockpit and from each other.
   */
  agentId?: string;
}
/**
 * Push `text` onto the snappy-chat input bridge. Resolves once the queue
 * has accepted the push and (optionally) `waitForFirstFrame` ms have passed.
 *
 * Throws if the head-screen server is unreachable or the push is rejected.
 */
export async function dispatchInChatUI(
  text: string,
  opts: DispatchInChatUIOpts = {},
): Promise<void> {
  if (typeof text !== "string" || text.length === 0) {
    throw new Error("dispatchInChatUI: text (non-empty string) required");
  }
  const wait = opts.waitForFirstFrame ?? DEFAULT_FIRST_FRAME_MS;
  const body: { text: string; agentId?: string } = { text };
  if (typeof opts.agentId === "string" && opts.agentId.length > 0) {
    body.agentId = opts.agentId;
  }
  const res = await fetch(`${HEAD_SCREEN_BASE}/chat-inject-push`, {
    method: "POST",
    headers: { "content-type": "application/json" },
    body: JSON.stringify(body),
  });
  if (!res.ok) {
    let detail = "";
    try { detail = await res.text(); } catch {}
    throw new Error(
      `chat-inject-push ${res.status}: ${detail.slice(0, 240) || res.statusText}`,
    );
  }
  if (wait > 0) {
    await new Promise(r => setTimeout(r, wait));
  }
}
export interface ResetChatUIOpts {
  /**
   * Sleep duration after the control push so the React app has time to
   * pop the control message, unmount FullScreen, and remount the welcome
   * surface. Default 1500ms: enough for the 500ms poll cadence + a remount.
   */
  waitMs?: number;
  /**
   * Per-agent queue isolation key. See `DispatchInChatUIOpts.agentId`.
   * Default "ui". Parallel QA agents pass their own ID so a reset on one
   * thread doesn't drop the queue another agent is filling.
   */
  agentId?: string;
}
/**
 * Push a control message that resets the snappy-chat UI to the welcome
 * surface. Equivalent to the user clicking "+ New chat" in the sidebar.
 * Use between dogfood scenarios so a single subagent can run multiple
 * intents end-to-end without thread state bleeding between them.
 *
 * Throws if the head-screen server is unreachable or the push is rejected.
 */
export async function resetChatUI(opts: ResetChatUIOpts = {}): Promise<void> {
  const wait = opts.waitMs ?? 1500;
  const body: { action: string; agentId?: string } = { action: "reset" };
  if (typeof opts.agentId === "string" && opts.agentId.length > 0) {
    body.agentId = opts.agentId;
  }
  const res = await fetch(`${HEAD_SCREEN_BASE}/chat-inject-control`, {
    method: "POST",
    headers: { "content-type": "application/json" },
    body: JSON.stringify(body),
  });
  if (!res.ok) {
    let detail = "";
    try { detail = await res.text(); } catch {}
    throw new Error(
      `chat-inject-control ${res.status}: ${detail.slice(0, 240) || res.statusText}`,
    );
  }
  if (wait > 0) {
    await new Promise(r => setTimeout(r, wait));
  }
}
/**
 * Cheap reachability check for the head-screen server. Returns true iff the
 * server answers any 2xx-ish response on `/healthz`. Use to gate dogfood
 * loops so they fail fast when the bridge is down rather than timing out
 * mid-push.
 */
export async function chatDriveAvailable(): Promise<boolean> {
  try {
    const res = await fetch(`${HEAD_SCREEN_BASE}/healthz`, { method: "GET" });
    return res.ok;
  } catch {
    return false;
  }
}
// CLI smoke: `npx tsx state/lib/chat-drive.ts "say hello in three words"`
// Set CHAT_INJECT_AGENT_ID=<id> to isolate from the cockpit's "ui" queue
// (e.g. parallel QA subagents).
if (import.meta.url === `file://${process.argv[1]}`) {
  const text = process.argv.slice(2).join(" ").trim();
  if (!text) {
    console.error('usage: tsx state/lib/chat-drive.ts "<intent>"');
    process.exit(2);
  }
  const agentId = process.env.CHAT_INJECT_AGENT_ID;
  (async () => {
    const up = await chatDriveAvailable();
    if (!up) {
      console.error("head-screen server unreachable at", HEAD_SCREEN_BASE);
      process.exit(1);
    }
    await dispatchInChatUI(text, { waitForFirstFrame: 0, agentId });
    console.log("OK pushed:", text, agentId ? `(agentId=${agentId})` : "");
  })().catch(e => { console.error("FAIL:", e?.message ?? e); process.exit(1); });
}
scripts - helper scripts it can run
prose-only skill - 2 inline code blocks live in SKILL.md above (no state/bin/ sidecar yet).
how we check it - the checks, plus the last 10 runs
| timestamp | verb | score | primary_issue | artifact |
|---|---|---|---|---|
| 2026-04-30 07:51Z | - | 0.80 | - | - |
| 2026-04-30 07:39Z | - | 0.70 | - | - |
| 2026-04-30 07:38Z | - | 0.72 | - | - |
| 2026-04-30 07:33Z | - | 0.70 | - | - |
| 2026-04-30 07:15Z | - | 0.85 | - | - |
| 2026-04-30 03:10Z | - | 0.85 | - | - |
| 2026-04-30 06:55Z | - | 0.67 | - | - |
| 2026-04-29 04:43Z | - | 1.00 | - | - |
| 2026-04-29 04:14Z | - | 1.00 | - | - |
| 2026-04-29 04:01Z | - | 1.00 | - | - |