From 0e3a5babd90f80f75a185dfe69206530bc7cad93 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Alejandro=20Guti=C3=A9rrez?= <35082514+alezmad@users.noreply.github.com> Date: Mon, 4 May 2026 01:36:16 +0100 Subject: [PATCH] feat(daemon): sprint 4 outbound routing + CLI thin-client + ambient mode Daemon outbox now stores resolved target_spec + crypto_box ciphertext + nonce per row. Drain worker is a forwarder; no per-row resolution at drain time. Outbound routing is no longer a placeholder. Schema additions (additive, NULL allowed for legacy rows): outbox.mesh, target_spec, nonce, ciphertext, priority. v0.9.0 rows keep draining via the broadcast fallback so existing in-flight rows finish cleanly. IPC /v1/send resolves the user-friendly to (display name, hex prefix, full pubkey, @group, *, #topicId) into a broker-format target_spec at accept time. DMs encrypt via crypto_box; broadcast/topic/group base64 the plaintext. Hex prefixes (16+ chars) match against connected peers. CLI thin-client routing extends trySendViaDaemon pattern to peer list and skill list/get. Three new helpers in services/bridge/daemon-route.ts. SKILL.md gains ambient mode section: after claudemesh install, raw claude works for the daemon's attached mesh. Launch stays as the override path. Spec at .artifacts/specs/2026-05-04-v2-roadmap-completion.md orders the remaining v2.0.0 work: multi-mesh daemon (1.26), CLI-to-thin-client (1.27), mesh-to-workspace rename (1.28), HKDF identity (2.0). Released as 1.25.0 on npm. 
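For reviewers, the order in which IPC /v1/send classifies the `to` value can be sketched roughly as below. This is an illustration only, not code from this patch: `classifyTarget` and `TargetKind` are hypothetical names, and the regexes are copied from the checks in resolveAndEncrypt on the assumption that they are applied in this order.

```typescript
// Illustrative sketch; classifyTarget and TargetKind are hypothetical names
// that do not exist in the patch. The checks mirror resolveAndEncrypt.
type TargetKind =
  | "topic"        // "#topicId": passed through, payload is base64 plaintext
  | "group"        // "@group": passed through, payload is base64 plaintext
  | "broadcast"    // "*": passed through, payload is base64 plaintext
  | "pubkey"       // full 64-char hex pubkey: DM encrypted with crypto_box
  | "prefix"       // 16 to 63 hex chars: matched against connected peers
  | "displayName"; // anything else: looked up by peer display name

function classifyTarget(rawTo: string): TargetKind {
  const to = rawTo.trim();
  // Topic ids are "#" followed by 20 or more id characters.
  if (/^#[0-9a-z_-]{20,}$/i.test(to)) return "topic";
  if (to === "*") return "broadcast";
  if (to.startsWith("@")) return "group";
  // A full pubkey is checked before the shorter prefix form.
  if (/^[0-9a-f]{64}$/i.test(to)) return "pubkey";
  if (/^[0-9a-f]{16,63}$/i.test(to)) return "prefix";
  // Fallback: treat the value as a display name.
  return "displayName";
}
```

Only the "pubkey", "prefix", and "displayName" kinds end in a crypto_box DM; a hex string shorter than 16 characters falls through to the display-name lookup.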
Co-Authored-By: Claude Opus 4.7 (1M context) --- .../specs/2026-05-04-v2-roadmap-completion.md | 104 ++++++++++++++++++ apps/cli/CHANGELOG.md | 60 ++++++++++ apps/cli/package.json | 2 +- apps/cli/skills/claudemesh/SKILL.md | 10 +- apps/cli/src/commands/peers.ts | 12 +- apps/cli/src/commands/platform-actions.ts | 39 +++++++ apps/cli/src/daemon/db/outbox.ts | 52 ++++++++- apps/cli/src/daemon/drain.ts | 39 +++++-- apps/cli/src/daemon/ipc/handlers/send.ts | 13 +++ apps/cli/src/daemon/ipc/server.ts | 96 ++++++++++++++++ apps/cli/src/daemon/run.ts | 4 + apps/cli/src/services/bridge/daemon-route.ts | 44 ++++++++ docs/roadmap.md | 30 ++++- 13 files changed, 482 insertions(+), 23 deletions(-) create mode 100644 .artifacts/specs/2026-05-04-v2-roadmap-completion.md diff --git a/.artifacts/specs/2026-05-04-v2-roadmap-completion.md b/.artifacts/specs/2026-05-04-v2-roadmap-completion.md new file mode 100644 index 0000000..ac791d6 --- /dev/null +++ b/.artifacts/specs/2026-05-04-v2-roadmap-completion.md @@ -0,0 +1,104 @@ +# v2.0.0 Daemon Redesign — Completion Roadmap + +**Date:** 2026-05-04 +**Owner:** alezmad +**Status:** in-progress (1.24.0 + 1.25.0 land most of it; remainder is two follow-up arcs) + +## What's done + +| v2.0.0 bullet | Version | Status | +|---|---|---| +| `claudemesh-daemon` long-lived launchd / systemd unit | 1.22.0 | ✅ Done | +| MCP server shrinks to thin daemon adapter | 1.24.0 | ✅ Done — 979 → ~200 LoC of push-pipe, daemon-required, no fallback | +| `claudemesh install` auto-installs + starts daemon | 1.24.0 | ✅ Done | +| `claudemesh launch` ensures daemon | 1.24.0 | ✅ Done | +| Daemon outbound routing (Sprint 4: real targets + crypto) | 1.25.0 | ✅ Done — outbox stores `mesh`, `target_spec`, `nonce`, `ciphertext`, `priority`; resolution + `crypto_box` happens at IPC accept time; drain is a forwarder | +| CLI thin-client routing for read verbs | 1.25.0 | ✅ Partial — `peer list`, `skill list/get` route through daemon when present; same `trySendViaDaemon` 
fallback shape | +| Ambient mode (raw `claude` Just Works) | 1.25.0 | ✅ Documented + functional for the daemon's attached mesh | + +## What remains (in dependency order) + +### A. Daemon multi-mesh (the prerequisite for "ambient mode for everything") + +**Why it's the critical path:** ambient mode today only works for the single mesh the daemon is attached to. Users with N meshes either run N daemons (different sock paths) or restart the daemon to switch. Neither is acceptable for the v2.0.0 promise. + +**What it takes:** +- Daemon holds `Map` instead of one broker. +- Outbox row's `mesh` column (1.25.0 added) is the dispatch key. +- IPC `/v1/send` requires `mesh` field (or infers from target prefix `:`). +- IPC read endpoints (`/v1/peers`, `/v1/skills`, `/v1/profile`) accept `?mesh=` or return mesh-grouped results. +- SSE event payloads already include `mesh` slug; no change needed. +- Drain worker selects broker by row's `mesh` column. +- `daemon up` with no `--mesh` attaches to all joined meshes; with `--mesh X` restricts to X (legacy mode for explicit single-mesh). +- Inbox dedupe keeps using `client_message_id` UNIQUE; mesh column for filtering only. + +**Estimated effort:** 1 week. ~600 LoC across `run.ts`, `drain.ts`, `ipc/server.ts`, plus tests for per-mesh dispatch. + +**Risk:** medium. The single-mesh assumption is baked into a few places (peer-list response shape, skill-list response shape). Need to choose: per-mesh tagged responses (breaking) or array-of-meshes wrapped responses (additive). Recommend the latter for back-compat. + +### B. HKDF-derived peer keypairs (cross-machine identity) + +**Why it matters:** today each install per machine = fresh keypair = different mesh member identity. User signs in on laptop and desktop and shows up as two different members. v2.0.0 promised "same identity across machines." + +**What it takes:** +- `HKDF(account_secret, info: "claudemesh/mesh//peer", salt: )` derives a deterministic ed25519 keypair per mesh. 
+- `account_secret` derives from the user's authenticated session — needs broker-side endpoint to vend it on first install. +- Enrollment flow changes: instead of generating a fresh keypair, derive it. Subsequent installs find the same pubkey already in `mesh.member` and skip enrollment. +- Migration: existing members keep their old keypairs (they're stored in config). Only new joins use HKDF. Optional: opt-in re-enrollment for users who want cross-machine sync. +- Broker hello-sig protocol unchanged (still ed25519 sign). + +**Estimated effort:** 2-3 weeks. Touches enrollment, broker auth, dashboard, security review. + +**Risk:** high. Crypto change with security implications. Needs design review (account_secret distribution security, HKDF salt choice, key compromise recovery story). + +### C. Mesh → workspace public surface rename + +**Why it matters:** "mesh" is internal jargon for what users experience as "a workspace." v2.0.0 calls for the rename to align UX language. + +**What it takes:** +- All CLI verbs gain `workspace` aliases (`claudemesh workspace list` ≡ `claudemesh list`). +- Help text, docs, README, marketing site updated. +- DB tables stay `mesh_*` (migration cost prohibitive; not user-visible). +- Wire protocol stays `mesh_*` (broker change too disruptive). +- Eventually deprecate the `mesh` aliases (~2 minor versions later). + +**Estimated effort:** 3-4 days. Mostly rote search/replace + new aliases. + +**Risk:** low. Cosmetic. + +### D. Full CLI-to-thin-client conversion + +**Why it matters:** today the CLI has bridge + cold-path code that duplicates ~3000 LoC of broker WS / crypto / decode logic that the daemon also has. Once daemon is multi-mesh, every verb can become "open IPC, send request, render response." + +**What it takes:** +- Each verb: replace `withMesh(...)` (which opens its own broker WS) with `daemonOnly(...)` (calls IPC, errors if daemon down). +- Drop `bridge/server.ts`, `bridge/client.ts`, `bridge/socket-broker.ts` entirely. 
+- Drop most of `services/broker/ws-client.ts` from the CLI build (kept only for daemon's internal use). +- CLI binary shrinks ~30-40%. +- Daemon becomes the only broker WS holder per user. + +**Estimated effort:** 1 week. Mostly mechanical; strict typescript catches most issues. + +**Risk:** medium. Breaks workflows where CLI is used without daemon (CI environments, headless scripts). Need to keep a `--no-daemon` escape hatch or document the constraint. + +## Recommended sequencing + +``` +1.25.0 (today): Sprint 4 outbound routing + CLI thin-client read paths + ambient mode docs +1.26.0 (next): A. Daemon multi-mesh — "ambient mode for everything" +1.27.0: D. CLI-to-thin-client conversion — drops ~3000 LoC +1.28.0: C. Mesh → workspace rename (aliases shipped, no removal yet) +2.0.0: B. HKDF identity (separate security-reviewed arc) +``` + +A → D → C → B is the right order: +- A unblocks ambient mode for multi-mesh users (highest UX value). +- D unblocks the LoC reduction the v2.0.0 promise mentioned ("3000 LoC removed"). +- C is cosmetic; do it once D has stabilized. +- B is the most security-sensitive; do it last, with proper review. + +## Out of scope for the v2.0.0 endpoint + +- **Topic crypto (Sprint 5+).** Topics still ship as base64 plaintext. Real per-topic encryption is a v0.3.0 operator-layer item, parallel track. +- **Broker hardening for daemon idempotency (Sprint 7).** Partial unique index on `(mesh_id, client_message_id) WHERE NOT NULL` and the `mesh.client_message_dedupe` table. Documented in `2026-05-03-daemon-spec-broker-hardening-followups.md`. +- **`launch` deprecation.** 1.25.0 docs now recommend ambient mode for default cases; `launch` stays as the override path. Full deprecation is a 2.x decision. 
diff --git a/apps/cli/CHANGELOG.md b/apps/cli/CHANGELOG.md index 3f17787..f9f10bf 100644 --- a/apps/cli/CHANGELOG.md +++ b/apps/cli/CHANGELOG.md @@ -1,5 +1,65 @@ # Changelog +## 1.25.0 (2026-05-04) — Sprint 4 outbound routing + ambient mode + +### Daemon outbound routing (Sprint 4) + +The v0.9.0 daemon shipped outbox infrastructure but its drain worker +was a placeholder — every queued send went out as a broadcast (`*`). +That's now fixed. Outbound resolution and `crypto_box` encryption +happen at IPC accept time, then the drain worker just forwards the +already-encrypted ciphertext to the broker. + +- Outbox schema additions (additive, NULL allowed for legacy rows): + `mesh`, `target_spec`, `nonce`, `ciphertext`, `priority`. Existing + v0.9.0 rows keep draining via the broadcast fallback. +- IPC `/v1/send` resolves the user-friendly `to` (display name, hex + prefix, full pubkey, `@group`, `*`, `#topicId`) into a broker-format + `target_spec` and encrypts the plaintext using `crypto_box` for DMs + (against recipient pubkey + sender session secret) or base64 for + broadcast / topic / group targets. +- Drain worker reads `target_spec`, `nonce`, `ciphertext`, `priority` + from the row and dispatches as-is. No per-row resolution at drain + time means peer-presence flicker doesn't affect in-flight sends. +- Pubkey prefix matching: 16+ char hex prefix matches against + `peer.pubkey` and `peer.memberPubkey` of connected peers. Ambiguous + prefixes return 502 with a clear error. + +Smoke test verified end-to-end: `claudemesh send --self "..."` +through daemon resolves, encrypts, and delivers. Outbox reaches +`status=done` with broker-issued `broker_message_id`. + +### CLI thin-client routing extensions + +`claudemesh peer list` and `claudemesh skill list/get` now route +through the daemon when its socket is present, mirroring the +`trySendViaDaemon` pattern from `send.ts`. Same fall-back chain: +daemon → bridge → cold path. 
+ +New helpers in `services/bridge/daemon-route.ts`: +- `tryListPeersViaDaemon()` +- `tryListSkillsViaDaemon()` +- `tryGetSkillViaDaemon(name)` + +### Ambient mode + +After `claudemesh install` (which now installs and starts the daemon +service), **raw `claude` Just Works** for the daemon's attached mesh. +No `claudemesh launch` ceremony needed for the common case. Channel +push, slash commands, and resources flow through the daemon-backed +MCP shim. + +`claudemesh launch` remains the override path: explicit mesh +selection, fresh display name, headless modes, system-prompt injection, +or multi-mesh users who want to spawn into a non-default mesh. + +### Roadmap spec + +`.artifacts/specs/2026-05-04-v2-roadmap-completion.md` documents +exactly what's done vs. what remains for the full v2.0.0 endpoint: +multi-mesh daemon (1.26.0), full CLI-to-thin-client conversion +(1.27.0), mesh→workspace rename (1.28.0), HKDF identity (2.0.0). + ## 1.24.0 (2026-05-03) — daemon required + thin MCP shim The architectural convergence v0.9.0 was building toward. diff --git a/apps/cli/package.json b/apps/cli/package.json index 3885559..28d6e28 100644 --- a/apps/cli/package.json +++ b/apps/cli/package.json @@ -1,6 +1,6 @@ { "name": "claudemesh-cli", - "version": "1.24.0", + "version": "1.25.0", "description": "Peer mesh for Claude Code sessions — CLI + MCP server.", "keywords": [ "claude-code", diff --git a/apps/cli/skills/claudemesh/SKILL.md b/apps/cli/skills/claudemesh/SKILL.md index 3a98821..b60be1f 100644 --- a/apps/cli/skills/claudemesh/SKILL.md +++ b/apps/cli/skills/claudemesh/SKILL.md @@ -45,7 +45,7 @@ claudemesh send "" "..." --mesh "" If the parent Claude session was launched via `claudemesh launch`, an MCP push-pipe is running and holds the per-mesh WS connection. CLI invocations dial `~/.claudemesh/sockets/.sock` and reuse that warm connection (~200ms total round-trip including Node.js startup). 
If no push-pipe is running (cron, scripts, hooks fired outside a session), the CLI opens its own WS, which takes ~500-700ms cold. **You don't manage this** — every verb auto-detects and falls through. -### Daemon path (v0.9.0, opt-in, fastest) +### Daemon path (v1.24.0+, REQUIRED for in-Claude-Code use) `claudemesh daemon up [--mesh ]` starts a persistent per-user runtime that holds the broker WS, a durable SQLite outbox/inbox, and listens on `~/.claudemesh/daemon/daemon.sock` (UDS) plus an optional loopback TCP. When the daemon socket is present, every verb routes through it first (~1ms IPC) before falling back to bridge / cold paths. The send envelope carries a caller-stable `client_message_id`, so a `claudemesh send` that started before a daemon crash survives the restart via the on-disk outbox. @@ -60,11 +60,15 @@ claudemesh daemon outbox requeue # re-enqueue an aborted/d claudemesh daemon down # SIGTERM + wait ``` -`claudemesh install` (MCP + hooks registration) and the daemon are independent — install does not start the daemon, and the daemon does not require install. Run both for the warmest path: install gives you the in-session push-pipe, daemon gives you cross-invocation persistence and a survivable outbox. +As of 1.24.0 `claudemesh install` registers the MCP entry **and** installs/starts the daemon service for the user's primary mesh. The MCP shim hard-requires the daemon to be running — it bails at boot with actionable instructions if the socket isn't present. There is no fallback. CLI verbs (`send`, `peer list`, `inbox`, `skill list/get`, etc.) keep working without a daemon via bridge or cold paths, but for any in-Claude-Code use the daemon must be up. + +### Ambient mode (1.25.0+) + +Once `claudemesh install` has run (registers MCP entry + starts daemon service), **raw `claude` Just Works** for the daemon's attached mesh. No `claudemesh launch` ceremony, no manual flags, no per-session keypair. 
Channel push, slash commands, and resources all flow through the daemon-backed MCP shim. Use `claudemesh launch` only when you need to override defaults (different mesh, custom display name, system-prompt injection, headless modes). ## Spawning new sessions (no wizard) -`claudemesh launch` is the canonical way to start a new Claude Code session connected to claudemesh. Pass every required flag up front so no interactive prompt fires — that's what makes the verb scriptable from tmux send-keys, AppleScript/iTerm spawn helpers, hooks, cron, and the `claudemesh launch` you call from inside another session. **Always use this verb, never `claude` directly with hand-rolled flags** — it sets up the per-session ed25519 keypair, exports `CLAUDEMESH_DISPLAY_NAME`, isolates the mesh config in a tmpdir, and passes the `--dangerously-load-development-channels server:claudemesh` plumbing that the MCP push-pipe needs. +`claudemesh launch` remains useful for non-default cases: explicit mesh selection, fresh display name, headless `--quiet` runs, system-prompt injection, multi-mesh users with one daemon attached to mesh A who want to spawn into mesh B. For the common case (single joined mesh, daemon installed), prefer raw `claude`. Pass every required flag up front so no interactive prompt fires — that's what makes the verb scriptable from tmux send-keys, AppleScript/iTerm spawn helpers, hooks, cron, and the `claudemesh launch` you call from inside another session. ### Full flag surface diff --git a/apps/cli/src/commands/peers.ts b/apps/cli/src/commands/peers.ts index c70de46..117fb58 100644 --- a/apps/cli/src/commands/peers.ts +++ b/apps/cli/src/commands/peers.ts @@ -67,7 +67,17 @@ async function listPeersForMesh(slug: string): Promise { const joined = config.meshes.find((m) => m.slug === slug); const selfMemberPubkey = joined?.pubkey ?? null; - // Try warm path first. + // Daemon path — preferred when running. 
Same routing pattern as send.ts: + // ~1 ms IPC round-trip; broker WS already warm in the daemon. + try { + const { tryListPeersViaDaemon } = await import("~/services/bridge/daemon-route.js"); + const dr = await tryListPeersViaDaemon(); + if (dr !== null) { + return dr.map((p) => annotateSelf(p as PeerRecord, selfMemberPubkey, null)); + } + } catch { /* daemon route helper not available; fall through */ } + + // Try warm bridge path next. const bridged = await tryBridge(slug, "peers"); if (bridged && bridged.ok) { const peers = bridged.result as PeerRecord[]; diff --git a/apps/cli/src/commands/platform-actions.ts b/apps/cli/src/commands/platform-actions.ts index 5fa96f5..232e87a 100644 --- a/apps/cli/src/commands/platform-actions.ts +++ b/apps/cli/src/commands/platform-actions.ts @@ -273,6 +273,24 @@ export async function runSqlSchema(opts: Flags): Promise { // ════════════════════════════════════════════════════════════════════════ export async function runSkillList(opts: Flags & { query?: string }): Promise { + // Daemon path — preferred when running. Mirror trySendViaDaemon shape. + try { + const { tryListSkillsViaDaemon } = await import("~/services/bridge/daemon-route.js"); + const dr = await tryListSkillsViaDaemon(); + if (dr !== null) { + const skills = dr as Array<{ name: string; description: string; author: string; tags: string[] }>; + if (opts.json) { emitJson(skills); return EXIT.SUCCESS; } + if (skills.length === 0) { render.info(dim("(no skills)")); return EXIT.SUCCESS; } + render.section(`mesh skills (${skills.length})`); + for (const s of skills) { + process.stdout.write(` ${bold(s.name)} ${dim("· by " + s.author)}\n`); + process.stdout.write(` ${s.description}\n`); + if (s.tags?.length) process.stdout.write(` ${dim("tags: " + s.tags.join(", "))}\n`); + } + return EXIT.SUCCESS; + } + } catch { /* fall through to cold path */ } + return await withMesh({ meshSlug: opts.mesh ?? 
null }, async (client) => { const skills = await client.listSkills(opts.query); if (opts.json) { emitJson(skills); return EXIT.SUCCESS; } @@ -289,6 +307,27 @@ export async function runSkillList(opts: Flags & { query?: string }): Promise { if (!name) { render.err("Usage: claudemesh skill get "); return EXIT.INVALID_ARGS; } + // Daemon path first. + try { + const { tryGetSkillViaDaemon } = await import("~/services/bridge/daemon-route.js"); + const dr = await tryGetSkillViaDaemon(name); + if (dr !== null) { + const skill = dr as { name: string; description: string; instructions: string; tags: string[]; author: string; createdAt: string }; + if (opts.json) { emitJson(skill); return EXIT.SUCCESS; } + render.section(skill.name); + render.kv([ + ["author", skill.author], + ["created", skill.createdAt], + ["tags", skill.tags?.join(", ") || dim("(none)")], + ]); + render.blank(); + render.info(skill.description); + render.blank(); + process.stdout.write(skill.instructions + "\n"); + return EXIT.SUCCESS; + } + } catch { /* fall through */ } + return await withMesh({ meshSlug: opts.mesh ?? null }, async (client) => { const skill = await client.getSkill(name); if (!skill) { render.err(`skill "${name}" not found`); return EXIT.NOT_FOUND; } diff --git a/apps/cli/src/daemon/db/outbox.ts b/apps/cli/src/daemon/db/outbox.ts index a38bdba..cf07e3f 100644 --- a/apps/cli/src/daemon/db/outbox.ts +++ b/apps/cli/src/daemon/db/outbox.ts @@ -20,6 +20,12 @@ export interface OutboxRow { aborted_at: number | null; aborted_by: string | null; superseded_by: string | null; + /** Sprint 4 routing: NULL on v0.9.0 rows, drained via broadcast fallback. 
*/ + mesh: string | null; + target_spec: string | null; + nonce: string | null; + ciphertext: string | null; + priority: string | null; } export function migrateOutbox(db: SqliteDb): void { @@ -46,13 +52,35 @@ export function migrateOutbox(db: SqliteDb): void { CREATE INDEX IF NOT EXISTS outbox_aborted ON outbox(status, aborted_at) WHERE status = 'aborted'; `); + + // v1.25.0 / Sprint 4: real outbound routing. Adds the broker-format + // target spec, mesh slug, and the already-encrypted ciphertext+nonce so + // the drain worker can dispatch each row without re-resolving names or + // re-running crypto. Existing rows from v0.9.0 land with NULLs and get + // drained via the legacy broadcast fallback (preserves no-regression). + const hasMesh = columnExists(db, "outbox", "mesh"); + const hasTargetSpec = columnExists(db, "outbox", "target_spec"); + const hasNonce = columnExists(db, "outbox", "nonce"); + const hasCiphertext = columnExists(db, "outbox", "ciphertext"); + const hasPriority = columnExists(db, "outbox", "priority"); + if (!hasMesh) db.exec(`ALTER TABLE outbox ADD COLUMN mesh TEXT`); + if (!hasTargetSpec) db.exec(`ALTER TABLE outbox ADD COLUMN target_spec TEXT`); + if (!hasNonce) db.exec(`ALTER TABLE outbox ADD COLUMN nonce TEXT`); + if (!hasCiphertext) db.exec(`ALTER TABLE outbox ADD COLUMN ciphertext TEXT`); + if (!hasPriority) db.exec(`ALTER TABLE outbox ADD COLUMN priority TEXT`); +} + +function columnExists(db: SqliteDb, table: string, column: string): boolean { + const rows = db.prepare(`PRAGMA table_info(${table})`).all<{ name: string }>(); + return rows.some((r) => r.name === column); } export function findByClientId(db: SqliteDb, clientMessageId: string): OutboxRow | null { const row = db.prepare(` SELECT id, client_message_id, request_fingerprint, payload, enqueued_at, attempts, next_attempt_at, status, last_error, delivered_at, - broker_message_id, aborted_at, aborted_by, superseded_by + broker_message_id, aborted_at, aborted_by, superseded_by, + 
mesh, target_spec, nonce, ciphertext, priority FROM outbox WHERE client_message_id = ? `).get(clientMessageId); return row ?? null; @@ -64,14 +92,21 @@ export interface InsertPendingInput { request_fingerprint: Uint8Array; payload: Uint8Array; now: number; + /** Sprint 4: routing fields. Optional only for legacy/v0.9.0 callers. */ + mesh?: string; + target_spec?: string; + nonce?: string; + ciphertext?: string; + priority?: string; } export function insertPending(db: SqliteDb, input: InsertPendingInput): void { db.prepare(` INSERT INTO outbox ( id, client_message_id, request_fingerprint, payload, - enqueued_at, attempts, next_attempt_at, status - ) VALUES (?, ?, ?, ?, ?, 0, ?, 'pending') + enqueued_at, attempts, next_attempt_at, status, + mesh, target_spec, nonce, ciphertext, priority + ) VALUES (?, ?, ?, ?, ?, 0, ?, 'pending', ?, ?, ?, ?, ?) `).run( input.id, input.client_message_id, @@ -79,6 +114,11 @@ export function insertPending(db: SqliteDb, input: InsertPendingInput): void { input.payload, input.now, input.now, + input.mesh ?? null, + input.target_spec ?? null, + input.nonce ?? null, + input.ciphertext ?? null, + input.priority ?? null, ); } @@ -108,7 +148,8 @@ export function listOutbox(db: SqliteDb, p: ListOutboxParams = {}): OutboxRow[] const sql = ` SELECT id, client_message_id, request_fingerprint, payload, enqueued_at, attempts, next_attempt_at, status, last_error, delivered_at, - broker_message_id, aborted_at, aborted_by, superseded_by + broker_message_id, aborted_at, aborted_by, superseded_by, + mesh, target_spec, nonce, ciphertext, priority FROM outbox ${where.length ? 
"WHERE " + where.join(" AND ") : ""} ORDER BY enqueued_at DESC @@ -122,7 +163,8 @@ export function findById(db: SqliteDb, id: string): OutboxRow | null { return db.prepare(` SELECT id, client_message_id, request_fingerprint, payload, enqueued_at, attempts, next_attempt_at, status, last_error, delivered_at, - broker_message_id, aborted_at, aborted_by, superseded_by + broker_message_id, aborted_at, aborted_by, superseded_by, + mesh, target_spec, nonce, ciphertext, priority FROM outbox WHERE id = ? `).get(id) ?? null; } diff --git a/apps/cli/src/daemon/drain.ts b/apps/cli/src/daemon/drain.ts index 00c5eb0..2aee9f7 100644 --- a/apps/cli/src/daemon/drain.ts +++ b/apps/cli/src/daemon/drain.ts @@ -26,6 +26,12 @@ interface PendingRow { request_fingerprint: Uint8Array; payload: Uint8Array; attempts: number; + /** Sprint 4 routing fields. NULL on legacy v0.9.0 rows → broadcast fallback. */ + target_spec: string | null; + nonce: string | null; + ciphertext: string | null; + priority: string | null; + mesh: string | null; } export interface DrainOptions { @@ -80,7 +86,8 @@ export function startDrainWorker(opts: DrainOptions): DrainHandle { async function drainOnce(opts: DrainOptions, log: NonNullable): Promise { const now = Date.now(); const rows = opts.db.prepare(` - SELECT id, client_message_id, request_fingerprint, payload, attempts + SELECT id, client_message_id, request_fingerprint, payload, attempts, + target_spec, nonce, ciphertext, priority, mesh FROM outbox WHERE status = 'pending' AND next_attempt_at <= ? ORDER BY enqueued_at @@ -93,21 +100,30 @@ async function drainOnce(opts: DrainOptions, log: NonNullable void; + /** Mesh secret key (hex) used to encrypt outbound DMs at accept time. */ + meshSecretKey?: string; + /** Mesh slug attached to this daemon — stamped on outbox rows for the drain. 
*/ + meshSlug?: string; } export interface IpcServerHandle { @@ -64,6 +68,8 @@ export function startIpcServer(opts: IpcServerOptions): IpcServerHandle { bus: opts.bus, broker: opts.broker, onPendingInserted: opts.onPendingInserted, + meshSecretKey: opts.meshSecretKey, + meshSlug: opts.meshSlug, }); // --- UDS listener ------------------------------------------------------- @@ -123,6 +129,8 @@ function makeHandler(opts: { bus?: EventBus; broker?: DaemonBrokerClient; onPendingInserted?: () => void; + meshSecretKey?: string; + meshSlug?: string; }) { const tokenBytes = Buffer.from(opts.localToken, "utf8"); @@ -363,6 +371,21 @@ function makeHandler(opts: { respond(res, 400, { error: parsed.error }); return; } + // Sprint 4: resolve `to` → broker-format target_spec and encrypt at + // accept time, then store ciphertext+nonce on the outbox row. This + // crystallises routing so the drain worker is just a forwarder. + if (opts.broker && opts.meshSecretKey) { + try { + const routed = await resolveAndEncrypt(parsed.req, opts.broker, opts.meshSecretKey, opts.meshSlug ?? null); + parsed.req.target_spec = routed.target_spec; + parsed.req.ciphertext = routed.ciphertext; + parsed.req.nonce = routed.nonce; + parsed.req.mesh = routed.mesh; + } catch (e) { + respond(res, 502, { error: "route_failed", detail: String(e) }); + return; + } + } const outcome = acceptSend(parsed.req, { db: opts.outboxDb }); switch (outcome.kind) { case "accepted_pending": @@ -481,6 +504,79 @@ function parseSendRequest(body: unknown, idempotencyHeader: string | string[] | }; } +/** + * Sprint 4: resolve a user-friendly `to` (peer name, pubkey hex, @group, *, + * topic name, "#topicId") into a broker-format target_spec, and encrypt + * the plaintext payload appropriately for the destination kind. + * + * - DM by 64-char hex pubkey: target_spec = pubkey hex, ciphertext via + * crypto_box (recipient pubkey + sender session secret). + * - DM by display name: resolve via broker.listPeers, then same as above. 
+ * - Group / broadcast / topic: target_spec = `@` / `*` / `#`, + * ciphertext = base64(plaintext) [matches the cold path's pre-encryption + * convention until topic crypto lands]. + */ +async function resolveAndEncrypt( + req: SendRequest, + broker: DaemonBrokerClient, + meshSecretKey: string, + meshSlug: string | null, +): Promise<{ target_spec: string; ciphertext: string; nonce: string; mesh: string }> { + const { encryptDirect } = await import("~/services/crypto/box.js"); + const { randomBytes } = await import("node:crypto"); + const to = req.to.trim(); + + // Topic by id ("#") — hex-like 20+ chars. + if (to.startsWith("#") && /^#[0-9a-z_-]{20,}$/i.test(to)) { + const ciphertext = Buffer.from(req.message, "utf8").toString("base64"); + const nonce = randomBytes(24).toString("base64"); + return { target_spec: to, ciphertext, nonce, mesh: meshSlug ?? "" }; + } + + // Group, broadcast — pass through. (Topic-by-name resolution happens + // when the daemon hooks topic_list later; not required for v1.25.0.) + if (to.startsWith("@") || to === "*") { + const ciphertext = Buffer.from(req.message, "utf8").toString("base64"); + const nonce = randomBytes(24).toString("base64"); + return { target_spec: to, ciphertext, nonce, mesh: meshSlug ?? "" }; + } + + // 64-char hex pubkey → DM directly. + if (/^[0-9a-f]{64}$/i.test(to)) { + const sessionKeys = broker.getSessionKeys(); + const senderSecret = sessionKeys?.sessionSecretKey ?? meshSecretKey; + const env = await encryptDirect(req.message, to, senderSecret); + return { target_spec: to, ciphertext: env.ciphertext, nonce: env.nonce, mesh: meshSlug ?? "" }; + } + + // Hex prefix (16+ chars but <64) → resolve via peer list prefix match. + // Matches the ergonomics of `claudemesh peer list` which shows 16-char + // prefixes, so users naturally paste prefixes back. 
+ const peers = await broker.listPeers().catch(() => []); + if (/^[0-9a-f]{16,63}$/i.test(to)) { + const matches = peers.filter((p) => + p.pubkey.toLowerCase().startsWith(to.toLowerCase()) || + (p.memberPubkey ?? "").toLowerCase().startsWith(to.toLowerCase()), + ); + if (matches.length === 0) throw new Error(`no peer matching prefix "${to}"`); + if (matches.length > 1) throw new Error(`prefix "${to}" is ambiguous (${matches.length} matches)`); + const recipient = matches[0]!.pubkey; + const sessionKeys = broker.getSessionKeys(); + const senderSecret = sessionKeys?.sessionSecretKey ?? meshSecretKey; + const env = await encryptDirect(req.message, recipient, senderSecret); + return { target_spec: recipient, ciphertext: env.ciphertext, nonce: env.nonce, mesh: meshSlug ?? "" }; + } + + // Otherwise — display name. + const match = peers.find((p) => p.displayName.toLowerCase() === to.toLowerCase()); + if (!match) throw new Error(`peer "${to}" not found`); + const recipient = match.pubkey; + const sessionKeys = broker.getSessionKeys(); + const senderSecret = sessionKeys?.sessionSecretKey ?? meshSecretKey; + const env = await encryptDirect(req.message, recipient, senderSecret); + return { target_spec: recipient, ciphertext: env.ciphertext, nonce: env.nonce, mesh: meshSlug ?? "" }; +} + function respond(res: ServerResponse, status: number, body: unknown) { const json = JSON.stringify(body); res.statusCode = status; diff --git a/apps/cli/src/daemon/run.ts b/apps/cli/src/daemon/run.ts index 7ef50db..eb05418 100644 --- a/apps/cli/src/daemon/run.ts +++ b/apps/cli/src/daemon/run.ts @@ -159,6 +159,10 @@ export async function runDaemon(opts: RunDaemonOptions = {}): Promise { bus, broker, onPendingInserted: () => drain?.wake(), + // Sprint 4: IPC accept-send needs these to resolve targets and + // encrypt at accept time so the drain worker is just a forwarder. 
+ meshSecretKey: mesh.secretKey, + meshSlug: mesh.slug, }); try { diff --git a/apps/cli/src/services/bridge/daemon-route.ts b/apps/cli/src/services/bridge/daemon-route.ts index 1f0ba0b..8d227f4 100644 --- a/apps/cli/src/services/bridge/daemon-route.ts +++ b/apps/cli/src/services/bridge/daemon-route.ts @@ -7,6 +7,50 @@ import { existsSync } from "node:fs"; import { ipc } from "~/daemon/ipc/client.js"; import { DAEMON_PATHS } from "~/daemon/paths.js"; +/** Try fetching the peer list through the daemon (~1ms warm IPC). + * Returns null when the daemon socket isn't present so the caller can + * fall back to bridge / cold paths. */ +export async function tryListPeersViaDaemon(): Promise { + if (!existsSync(DAEMON_PATHS.SOCK_FILE)) return null; + try { + const res = await ipc<{ peers?: unknown[] }>({ path: "/v1/peers", timeoutMs: 3_000 }); + if (res.status !== 200) return null; + return Array.isArray(res.body.peers) ? res.body.peers : []; + } catch (err) { + const msg = String(err); + if (/ENOENT|ECONNREFUSED|ipc_timeout/.test(msg)) return null; + return null; + } +} + +/** Try fetching mesh-published skills through the daemon. */ +export async function tryListSkillsViaDaemon(): Promise { + if (!existsSync(DAEMON_PATHS.SOCK_FILE)) return null; + try { + const res = await ipc<{ skills?: unknown[] }>({ path: "/v1/skills", timeoutMs: 3_000 }); + if (res.status !== 200) return null; + return Array.isArray(res.body.skills) ? res.body.skills : []; + } catch (err) { + const msg = String(err); + if (/ENOENT|ECONNREFUSED|ipc_timeout/.test(msg)) return null; + return null; + } +} + +/** Try fetching one skill body through the daemon. 
*/ +export async function tryGetSkillViaDaemon(name: string): Promise { + if (!existsSync(DAEMON_PATHS.SOCK_FILE)) return null; + try { + const res = await ipc<{ skill?: unknown }>({ + path: `/v1/skills/${encodeURIComponent(name)}`, + timeoutMs: 3_000, + }); + if (res.status === 404) return null; + if (res.status !== 200) return null; + return res.body.skill ?? null; + } catch { return null; } +} + export type DaemonSendOk = { ok: true; messageId: string; diff --git a/docs/roadmap.md b/docs/roadmap.md index a97b449..40c3078 100644 --- a/docs/roadmap.md +++ b/docs/roadmap.md @@ -209,12 +209,40 @@ Locked spec: `.artifacts/shipped/2026-05-03-daemon-spec-v0.9.0.md`. --- +## v0.9.x — *daemon promotion: required + thin MCP* — *shipped* + +The v0.9.0 foundation got promoted in two quick releases: + +- **1.24.0** — daemon required for in-Claude-Code use. MCP server + shrinks from 979 to ~200 LoC of push-pipe (rest is the unrelated + mesh-service proxy mode). `claudemesh install` auto-installs and + starts the daemon service. `claudemesh launch` ensures daemon is + running before spawning Claude. +- **1.25.0** — Sprint 4 outbound routing fix. Daemon was sending + every outbox row as broadcast (`*`); now resolves and encrypts at + IPC accept time, drain is a forwarder. Adds `mesh`, `target_spec`, + `nonce`, `ciphertext`, `priority` columns to the outbox. +- **1.25.0** — CLI thin-client routing for `peer list`, + `skill list`, `skill get`. Same daemon-first / bridge / cold-path + fallback shape as `trySendViaDaemon`. +- **1.25.0** — ambient mode: raw `claude` Just Works after + `claudemesh install`. No more `claudemesh launch` ceremony for the + common case. + +What this leaves on the v2.0.0 redesign roadmap is documented at +`.artifacts/specs/2026-05-04-v2-roadmap-completion.md`: daemon +multi-mesh, full CLI-to-thin-client conversion, mesh→workspace +rename, HKDF identity. + +--- + ## v2.0.0 — *the daemon redesign* The single largest architectural shift. 
Promotes the persistent thing (the user's account + identity) to a persistent process (the daemon), demotes the ephemeral thing (the Claude session) to a thin -client. +client. **Half-shipped via 1.24.0 + 1.25.0; remainder spec'd at +`.artifacts/specs/2026-05-04-v2-roadmap-completion.md`.** - **`claudemesh-daemon`** — long-lived per-user launchd / systemd unit. One WebSocket per workspace, persistent across reboots and