User-install audit — 2026-04-18
Trace of what a fresh user gets when they run npx snappy-os, vs what Robert (the operator of the control plane) runs locally. Evidence on every claim.
1. Operator plane vs user plane
The distinction IS modeled — tenant IDs, tier grants, invite codes, separate DO Spaces prefixes all exist. But on disk today, only Robert's tenant is active.
| Concern | Operator plane (Robert) | User plane (Joe) |
|---|---|---|
| Tenant ID | sha256(SNAPPY_MASTER_KEY).first(12) — one-of-one today | Same derivation; would get its own 12-char prefix |
| Master key location | .env.cache at ~/projects/snappy-os/.env.cache (13k, full secret bundle) | Single SNAPPY_MASTER_KEY env var from invite URL; no .env.cache required |
| DO Spaces prefix | Writes to snappy-os/ public canonical + snappy-os-tenants/<robert_id>/ | Reads public canonical; writes ONLY to snappy-os-tenants/<joe_id>/ |
| Worker role | Mints invites via POST /invite/mint with Bearer MASTER_KEY | Never authenticates as admin; redeems via GET /invite/<code> |
| Kernel library (~/projects/snappy-kernel) | Was the frozen reference library; not present on THIS machine | Never shipped to Joe machines per bin/install.js:326 comment |
| Skills authoring | Edits state/skills/<name>.md, pushes via cli.js push | Can edit locally; push is tenant-scoped; cross-tenant PID requires quorum |
| Worker deploy | Writes ~/projects/snappy-skills/ + wrangler deploy | Never touches Worker code |
| Auth tier | tier: personal with is_master: true (auth.ts:63-65) | tier: public by default, tier: personal only if their key matches a KV API_KEYS record |
Source evidence: ~/projects/snappy-skills/src/auth.ts:62-65, pull.ts:126-145, state/wiki/multi-tenant.md:34-41, state/bin/sync/tenant-id.sh:4-8.
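The tenant ID derivation in the table above can be sketched in a few lines. This is a minimal illustration, not the contents of tenant-id.sh: it assumes the digest is hex-encoded before truncation (the audit only says "12-char sha256"), and the function names and sample keys are hypothetical.

```typescript
import { createHash } from "node:crypto";

// Assumption: hex encoding of the SHA-256 digest, truncated to 12 characters.
// The audit does not state the encoding used by tenant-id.sh.
function deriveTenantId(masterKey: string): string {
  return createHash("sha256").update(masterKey, "utf8").digest("hex").slice(0, 12);
}

// The only prefix a user-plane tenant is described as writing to.
function tenantWritePrefix(masterKey: string): string {
  return `snappy-os-tenants/${deriveTenantId(masterKey)}/`;
}

// Hypothetical keys, for illustration only.
console.log(tenantWritePrefix("example-operator-key-0001"));
console.log(tenantWritePrefix("example-user-key-0002"));
```

Because the ID is a pure function of the key, two machines holding the same SNAPPY_MASTER_KEY resolve to the same tenant prefix without any registry lookup.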
2. Install timeline — what actually runs on npx snappy-os
Entry point: bin/install.js (Joe never clones from GitHub — canonicals come from the Worker via bin/cli.js pull, comment at install.js:5).
| # | Step | Mechanism | Failure mode | Est. time |
|---|---|---|---|---|
| 0 | Default mode is dry-run | SNAPPY_BOOTSTRAP_APPLY=1 required to mutate | User thinks nothing happened | <1s |
| 1 | Detect platform | os.platform() + /proc/version WSL check | Unsupported OS → warn, continue | <1ms |
| 2 | Detect runtimes + version-gate | which(claude/codex/gemini/openclaw/cursor/windsurf) + --version | refuse-too-old aborts; warn-degraded continues | ~1s (6 which calls) |
| 3 | Resolve SNAPPY_MASTER_KEY | Priority: env var → $BOOTSTRAP_INVITE_CODE → op read → doppler secrets get → interactive prompt | Dry-run skips prompt | <100ms |
| 4 | Pull snappy-os canonical | node bin/cli.js pull --force --scope all --repo os (presigned DO URLs from Worker) | Requires bin/cli.js to already exist; the bootstrap chicken-and-egg is handled by the npm tarball shipping cli.js | varies (manifest-diff) |
| 5 | symlink-runtimes.sh | Fan-out per-skill symlinks into ~/.claude/skills/, ~/.codex/, etc. | Backs up existing dirs to ~/.claude/_backups/<ts>/ before overwriting | 2-5s |
| 6 | wire-hooks.js | Idempotent ~/.claude/settings.json + ~/.codex/config.toml edit; dedupes by command string | Bare {command} rejects whole settings.json silently (user memory note) | <1s |
| 7 | sync-runtimes.ts --codex-expand | 3-file expansion for Codex | | <1s |
| 8 | sync-runtimes.ts | CLAUDE.md → AGENTS.md / GEMINI.md / .cursorrules / .windsurfrules / .github/copilot-instructions.md | | <1s |
| 9 | Cron install | Idempotent crontab - append: `30 */6 ... cli.js doctor --silent \| ... alert doctor-failed` | Windows skipped; crontab absent → fail | <1s |
| 10 | Smoke tests | program.md exists, claude mirror resolves, skills.snappy.ai/_status 200, per-runtime snappy-* visibility | Each reported independently | ~5s (HTTP roundtrip) |
| 11 | Print next steps | Prose | n/a | <1ms |
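The priority order in step 3 amounts to a "first non-empty source wins" chain. The sketch below illustrates that shape only; it is not the body of resolveMasterKey() in install.js. The option names (`opRead`, `dopplerGet`, `prompt`) are hypothetical stand-ins for the external lookups, and passing the invite code through unchanged is an assumption, since the audit does not show how a code becomes a key.

```typescript
type Source = { name: string; read: () => string | undefined };

interface ResolveOptions {
  env: Record<string, string | undefined>;
  dryRun: boolean;
  // Hypothetical hooks standing in for `op read`, `doppler secrets get`,
  // and the interactive prompt; the real calls are not shown in the audit.
  opRead?: () => string | undefined;
  dopplerGet?: () => string | undefined;
  prompt?: () => string | undefined;
}

function resolveKeySketch(opts: ResolveOptions): { source: string; value: string } | undefined {
  const chain: Source[] = [
    { name: "env", read: () => opts.env.SNAPPY_MASTER_KEY },
    // Assumption: the invite code value is used as-is; redemption is not modeled here.
    { name: "invite-code", read: () => opts.env.BOOTSTRAP_INVITE_CODE },
    { name: "op", read: () => opts.opRead?.() },
    { name: "doppler", read: () => opts.dopplerGet?.() },
  ];
  // The table says dry-run skips the prompt, so it only joins the chain in apply mode.
  if (!opts.dryRun) chain.push({ name: "prompt", read: () => opts.prompt?.() });

  for (const source of chain) {
    const value = source.read()?.trim();
    if (value) return { source: source.name, value };
  }
  return undefined;
}

console.log(resolveKeySketch({ env: { SNAPPY_MASTER_KEY: "example-key" }, dryRun: true }));
```

Returning the winning source name alongside the value makes it easy for a bootstrap report to record where the key came from without logging the key itself.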
Verified on THIS machine (Robert's) via .bootstrap-report.json: mode dry-run, all 11 steps ok, all smoke checks PASS, 222 snappy-* skills visible under ~/.claude/skills/, Worker HTTP 200.
Live gateway probe: GET https://skills.snappy.ai/.well-known/skills/snappy-os/hooks/snappy-os-inject.sh returns HTTP 200, 5119 bytes — fresh-machine path verified accessible.
The CLAUDE.md 6-step manual block (from ~/.claude/CLAUDE.md "New machine setup") is LEGACY prose. It tells users to clone from github.com, symlink .env.cache, and curl-install the hook by hand. The live code path (bin/install.js) does NONE of that — it uses npx snappy-os + the Worker-backed cli.js pull. The manual steps are a diverged fallback.
Broken: symlink ~/.claude/skills/snappy-settings/.env.cache referenced in both the user CLAUDE.md and the settings skill loader does not exist on this machine. Only SKILL.md is symlinked. The settings skill loader claims "fix the symlink first when anything that hits the network mysteriously fails" — but the symlink that would need fixing isn't there. Scripts that read .env.cache via the kernel fallback path would silently 404.
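A check like the following could tell "symlink absent" apart from "symlink present but dangling", which is the distinction the broken-symlink finding turns on. It is a minimal sketch using Node's fs module; the function name and the four state labels are hypothetical and are not part of cli.js doctor.

```typescript
import { existsSync, lstatSync, type Stats } from "node:fs";
import { homedir } from "node:os";
import { join } from "node:path";

type LinkState = "missing" | "dangling" | "ok" | "not-a-symlink";

function checkSymlink(path: string): LinkState {
  let stat: Stats;
  try {
    // lstat inspects the link itself and throws if nothing exists at the path.
    stat = lstatSync(path);
  } catch {
    return "missing";
  }
  if (!stat.isSymbolicLink()) return "not-a-symlink";
  // existsSync follows the link, so it is false when the target is gone.
  return existsSync(path) ? "ok" : "dangling";
}

const envCache = join(homedir(), ".claude", "skills", "snappy-settings", ".env.cache");
console.log(`${envCache}: ${checkSymlink(envCache)}`);
```

On the audited machine this would be expected to report the "missing" state, since the finding is that the link was never created rather than that it points nowhere.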
3. Infra dependencies touched on install
| Service | Called from | Purpose | Required for Joe? |
|---|---|---|---|
| Cloudflare Worker (skills.snappy.ai) | cli.js pull/push/alert/_status, hook snappy-os-inject.sh | Authenticated ingress to DO Spaces; KV-cached (60s) skill serve; tenant derivation; invite redemption | YES (hard dependency) |
| DO Spaces (robert-storage) | Via Worker only (Joe never holds DO creds per install.js:5 + worker-architecture.md:14) | Canonical storage: snappy-os/, snappy-kernel/, snappy-os-tenants/<id>/, snappy-os-meta/tenants/<id>.json | INDIRECT (Worker gates it) |
| npm registry | npx snappy-os first invocation | Tarball distribution (replaced old `curl \| bash` per install.ts:2-20, Robert 2026-04-17 directive) | YES (bootstrap entry) |
| GitHub | bin/install.js:5 comment explicitly says "no git clone, no github.com" | NOT used on fresh install | NO |
| Anthropic / OpenAI | Runtime skill invocations (NOT install-time) | Skill execution | NO for install itself |
| 1Password / Doppler | Optional key resolvers in resolveMasterKey() (install.js:171-178) | Secret retrieval convenience | NO (falls back to prompt) |
Worker routes Joe hits (~/projects/snappy-skills/src/index.ts + route table in worker-architecture.md:39-54): GET /_pull, POST /_alert, GET /_status, GET /invite/<code> (HTML landing page, no longer bash). POST /_push only after first action.
4. Tenant model — honest answer: modeled, single-tenant in practice
What exists in code:
- `state/bin/sync/tenant-id.sh`: deterministic 12-char sha256 derivation.
- `bin/cli.js:970-974`: sends `tenant_id` on `POST /_alert`.
- `~/projects/snappy-skills/src/auth.ts`: 4 tiers (public/personal/client/subscriber).
- `~/projects/snappy-skills/src/pull.ts:126-145`: `canReadPath()` tier gate.
- `state/bin/invite-mint.ts`: per-tenant invite minting.
- `state/wiki/multi-tenant.md` + `invite-walkthrough.md`: 236 lines of spec.
- DO prefixes per tier: `snappy-os/` public, `snappy-os-tenants/<id>/` private, `snappy-os-clients/<slug>/`, `snappy-os-subscriber/`.
What's present on disk / in live state today:
- `~/projects/snappy-os/.env.cache`: Robert's only key. No other tenant entries.
- `state/log/invites.ndjson`: would hold mint records; not inspected, but invite-walkthrough.md:44 names it.
- `API_KEYS` KV in the Worker: would hold per-tenant records; empty state means auth.ts:69-71 returns tier: public, name: invalid-key.
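The tier outcomes described in this audit (master key → personal with is_master, a key found in API_KEYS → that record's tier, anything else → public / invalid-key) can be sketched as a single lookup. This is an illustration of the described behavior, not the code in auth.ts: the `Map` stands in for the Worker KV namespace, and the record shape and the `"master"` name are assumptions.

```typescript
type Tier = "public" | "personal" | "client" | "subscriber";

interface AuthResult {
  tier: Tier;
  name: string;
  is_master?: boolean;
}

// Hypothetical record shape; the audit does not show what API_KEYS stores.
interface ApiKeyRecord {
  tier: Tier;
  name: string;
}

function resolveTierSketch(
  presentedKey: string,
  masterKey: string,
  apiKeys: Map<string, ApiKeyRecord>, // stand-in for the API_KEYS KV namespace
): AuthResult {
  // A production check would likely use a timing-safe comparison here.
  if (presentedKey === masterKey) {
    return { tier: "personal", name: "master", is_master: true };
  }
  const record = apiKeys.get(presentedKey);
  if (record) return { tier: record.tier, name: record.name };
  // Empty or non-matching KV state falls through to the public tier.
  return { tier: "public", name: "invalid-key" };
}

console.log(resolveTierSketch("some-user-key", "operator-key", new Map()));
```

With an empty `API_KEYS` store, every non-master key lands on the last branch, which matches the "modeled, single-tenant in practice" state described here.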
Verdict: the tenancy infrastructure is in place and authorization is server-side enforced, but no second tenant has been onboarded. To onboard one, Robert must:
1. Generate a fresh `SNAPPY_MASTER_KEY` for Joe (no tooling found for this; Joe supplies his own, per invite-mint.ts:86-88 which validates length ≥16).
2. Run `invite-mint.ts --tenant=<joes-key> --label="..."` against the Worker.
3. Send Joe the `invite_url`.
4. Add Joe's tenant to `s3://robert-storage/snappy-os-meta/tenants/<id>.json` if non-public tier access is needed.
5. Joe pastes the npx line into his terminal.
Step 4 has no helper script — it's a manual DO Spaces write today.
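As a starting point for the missing step-4 helper, the object key for a tenant's grants file can be computed from the key alone. The sketch below covers only that path computation plus the length check mentioned for invite-mint.ts; it does not write to DO Spaces, and it deliberately leaves the JSON body out because the audit does not show its schema. The function name is hypothetical, and hex encoding of the digest is an assumption.

```typescript
import { createHash } from "node:crypto";

// Mirrors the "length ≥16" validation the audit attributes to invite-mint.ts:86-88.
const MIN_KEY_LENGTH = 16;

function tenantGrantsObjectKey(masterKey: string): string {
  if (masterKey.length < MIN_KEY_LENGTH) {
    throw new Error(`master key must be at least ${MIN_KEY_LENGTH} characters`);
  }
  // Assumption: tenant ID = first 12 hex chars of sha256(masterKey).
  const tenantId = createHash("sha256").update(masterKey, "utf8").digest("hex").slice(0, 12);
  return `snappy-os-meta/tenants/${tenantId}.json`;
}

// Hypothetical key, for illustration only.
console.log(tenantGrantsObjectKey("example-user-key-0002"));
```

A real helper would pair this with an upload through whatever credentials the operator already uses for robert-storage.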
5. Expansion story — how a user adds their own skill
There is a prose skill named `skill` (state/skills/skill.md) with a scaffolder at state/lib/skill.ts. There is no /snappy-new-skill slash command.
The documented flow:
1. Scaffold: Joe creates `state/skills/<name>.md` with required frontmatter (name, description, eval). Start with graduation: prose, eval: manual.
2. Add sidecar loader: `state/skills/<name>.agents.md` with Critical Rules, Commands, Pitfalls, Self-Test per the program.md sidecar spec.
3. Update catalog: Manually append to `state/index.md` (not auto-generated; state/skills/skill.md:50-54 explicitly flags this as recurring drift).
4. Run lint: `npx tsx state/lint/check.ts` validates shape: no `## Eval` section = fail; sidecar present = pass.
5. Optionally graduate: write actor (`state/lib/<name>.ts` or `state/bin/<name>/`) + independent auditor. Flip eval: manual to eval: auto only after actor/auditor split exists.
6. Push: `node bin/cli.js push --auto` writes to DO Spaces under tenant prefix. Public canonical requires quorum promotion (≥3 tenants, ≥0.85 score, ≥5 runs per quorum.ts).
7. Cross-tenant promotion: Automatic via Worker scheduled handler every minute (worker-architecture.md:57-62). No manual review meeting.
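The quorum thresholds in step 6 can be read as a three-part predicate. The sketch below is one plausible reading, not the logic in quorum.ts: it assumes runs are counted in total across tenants and that the score threshold applies to the mean over all runs. The `SkillRun` shape and function name are hypothetical.

```typescript
// Hypothetical record of one evaluated run of a skill by one tenant.
interface SkillRun {
  tenantId: string;
  score: number; // assumed to be in the range 0..1
}

const MIN_TENANTS = 3;
const MIN_SCORE = 0.85;
const MIN_RUNS = 5;

function meetsQuorumSketch(runs: SkillRun[]): boolean {
  // Assumption: the run count is a total across tenants, not per tenant.
  if (runs.length < MIN_RUNS) return false;
  const distinctTenants = new Set(runs.map((run) => run.tenantId));
  if (distinctTenants.size < MIN_TENANTS) return false;
  // Assumption: the score threshold applies to the mean over all runs.
  const mean = runs.reduce((sum, run) => sum + run.score, 0) / runs.length;
  return mean >= MIN_SCORE;
}

const sample: SkillRun[] = [
  { tenantId: "t1", score: 0.9 },
  { tenantId: "t1", score: 0.9 },
  { tenantId: "t2", score: 0.9 },
  { tenantId: "t2", score: 0.9 },
  { tenantId: "t3", score: 0.9 },
];
console.log(meetsQuorumSketch(sample));
```

Under the current single-tenant state described in this audit, the distinct-tenant check alone would keep any new skill from being promoted.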
Gap: Joe's tenant skill lives at snappy-os-tenants/<joe_id>/state/skills/<name>.md but the pull gate (pull.ts:128) allows ALL tenants to read state/skills/ in the public repo. Tenant-private skills need a separate prefix — today any skill Joe writes either stays local or goes into public canonical via quorum. No middle ground ("my team's private skills") is wired.
Honest gaps (3 concrete)
1. Symlink `~/.claude/skills/snappy-settings/.env.cache` does not exist on this machine. Both ~/.claude/CLAUDE.md setup block AND the settings skill loader assume it does. Legacy kernel-path fallback silently 404s.
2. No tooling to onboard a second tenant end-to-end. `invite-mint.ts` mints but the tenant grants file (snappy-os-meta/tenants/<id>.json) has no helper; the operator edits DO Spaces by hand.
3. No private-tenant skill tier. Code distinguishes public canonical from tenant-private state, but skill content is treated as public (pull.ts:128 allows all tenants to read state/skills/). A user who writes a skill either pushes it to quorum promotion or keeps it local. "Personal team skills" is unmodeled.
Verdict
Operator vs user plane is formally modeled, practically single-tenant. The npm package + Worker ingress architecture is real and works (.bootstrap-report.json shows 11/11 green on this machine, gateway HTTP 200). A second user could onboard via invite-mint.ts + npx snappy-os today, but would land in a tenant-of-one shape with no team-private skill space and with the legacy .env.cache symlink advice in CLAUDE.md that doesn't match the install.js flow. The multi-tenant spec is further along than the multi-tenant usage — the substrate is 80% built, the flight path has been walked once (Robert), and the documented onboarding loop would require at least one manual DO Spaces write per new tier grant.