Files
claudemesh/.artifacts/specs/2026-05-02-architecture-north-star.md
Alejandro Gutiérrez b4f457fceb
Some checks failed
CI / Lint (push) Has been cancelled
CI / Typecheck (push) Has been cancelled
CI / Broker tests (Postgres) (push) Has been cancelled
CI / Docker build (linux/amd64) (push) Has been cancelled
feat(cli): 1.5.0 — CLI-first architecture, tool-less MCP, policy engine
CLI becomes the API; MCP becomes a tool-less push-pipe. Bundle -42%
(250 KB → 146 KB) after stripping ~1700 lines of dead tool handlers.

- Tool-less MCP: tools/list returns []. Inbound peer messages still
  arrive as experimental.claude/channel notifications mid-turn.
- Resource-noun-verb CLI: peer list, message send, memory recall, etc.
  Legacy flat verbs (peers, send, remember) remain as aliases.
- Bundled claudemesh skill auto-installed by `claudemesh install` —
  sole CLI-discoverability surface for Claude.
- Unix-socket bridge: CLI invocations dial the push-pipe's warm WS
  (~220 ms warm vs ~600 ms cold).
- --mesh <slug> flag: connect a session to multiple meshes.
- Policy engine: every broker-touching verb runs through a YAML gate
  at ~/.claudemesh/policy.yaml (auto-created). Destructive verbs
  prompt; non-TTY auto-denies. Audit log at ~/.claudemesh/audit.log.
- --approval-mode plan|read-only|write|yolo + --policy <path>.

Spec: .artifacts/specs/2026-05-02-architecture-north-star.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 01:18:19 +01:00

16 KiB
Raw Blame History

title, status, target, author, date, supersedes, references
title status target author date supersedes references
claudemesh North Star — CLI-first with claude/channel push-pipe canonical 2.0.0 Alejandro 2026-05-02 none
2026-05-01-mcp-tool-surface-trim.md (first cut at the trim)
SPEC.md
docs/protocol.md

claudemesh North Star

The commitment, in one sentence

CLI is the canonical surface for every claudemesh operation. MCP exists for one thing: to deliver claude/channel push notifications mid-turn. That's the killer feature, and it's the only reason an MCP server runs at all.

Everything else — sending messages, listing peers, sharing files, deploying mesh-MCPs, running graph queries, scheduling jobs, publishing skills — is invoked from the CLI, by humans, scripts, cron, hooks, or by Claude itself via Bash.

Why this shape

  1. Mid-turn interrupt is the differentiator. When peer A sends to peer B, B's Claude session pauses what it's doing and reads the message immediately. That requires claude/channel notifications routed through an MCP transport — Claude Code only watches MCP server connections for those events. Lose that, and claudemesh becomes another inbox-polling pattern. Every other primitive can degrade to "delivered at next tool boundary"; this one cannot.

  2. CLI is universal. Bash works in scripts, hooks, cron, CI, terminals, automation, and Claude itself (via Bash tool calls). A primitive that exists as both an MCP tool and a CLI verb is double-maintenance with one calling convention nobody actually wants.

  3. JSON-on-stdout is enough structure. Claude reads claudemesh peers --json exactly as well as it reads a typed MCP tool return. The CLI man page is the schema. The "MCP gives structured I/O" advantage was real when we were paying for nothing else, but warm-WS via socket bridge (below) closes the cost gap.

  4. Surface shrinks where it matters. ToolSearch deferred-tool list drops from ~80 entries to ~0 entries (push-pipe registers no tools). Massive context-budget win for every Claude session.

Prior art (this is not novel architecture)

The "live-state daemon + thin scriptable CLI talking via Unix socket" pattern is the canonical shape for CLIs in this category. Reviewers should not treat this as bespoke design:

  • Dockerdockerd daemon, CLI talks via /var/run/docker.sock. DOCKER_HOST env override. docker context for multi-daemon switching.
  • Tailscaletailscaled daemon, tailscale CLI via socket. Per-key ACL identity model. Same peer-mesh-with-keypairs shape as claudemesh.
  • Stripe listen — long-running CLI daemon receives webhook push, forwards to local consumer. Same push-pipe-as-CLI-subcommand shape.
  • Obsidian CLI — talks to a running Obsidian instance via REST. Notable: ships a Claude skill (~/.claude/skills/obsidian-cli/SKILL.md) that documents every verb and flag for Claude consumption — replacing MCP tool introspection entirely.

Claudemesh's CLI-first + push-pipe + socket-bridge architecture is exactly this pattern. We are following the well-trodden path, not inventing a new one.

The six architectural commitments

1. MCP server is a push-pipe, full stop.

The MCP entrypoint (claudemesh mcp [--mesh <slug>]) does exactly three things:

  • Holds a WS connection to the broker for the meshes it's bound to.
  • Decrypts inbound peer messages.
  • Emits them as claude/channel notifications to the parent Claude Code session.

It registers zero tools. It advertises only experimental: { "claude/channel": {} }. Its tools/list returns an empty array. There is no surface to discover, search, or call.

One push-pipe per joined mesh, registered in ~/.claude.json via claudemesh install (or auto-injected by claudemesh launch). The --mesh flag (shipped 1.0.3) makes this trivial.

2. CLI is the canonical surface for every primitive.

Every resource has uniform CLI verbs:

Resource Verbs
peer claudemesh peers [--json] [--mesh X]
group claudemesh group join/leave @<n> [--role X]
message claudemesh send <to> <msg>, claudemesh inbox, claudemesh msg-status <id>
state claudemesh state get/set/list [--json]
memory claudemesh remember/recall/forget
task claudemesh task create/claim/complete/list
file claudemesh file put/get/list/grant/delete
vector claudemesh vector store/search/delete
graph claudemesh graph query/execute/watch
stream claudemesh stream create/publish/subscribe/list
context claudemesh context share/get/list
skill claudemesh skill publish/list/get/remove
schedule claudemesh schedule msg/webhook/tool/list/cancel
webhook claudemesh webhook create/list/delete
watch claudemesh watch create/list/unwatch
mcp claudemesh mesh-mcp deploy/list/call/undeploy/catalog
clock claudemesh clock get/set/pause/resume
sql claudemesh sql query/schema/execute
vault claudemesh vault set/get/list/delete
profile claudemesh profile/summary/visible/status set

Every verb supports --json for structured consumption. Every verb supports --mesh <slug> for targeting (default: pick first or interactive picker). Verbs share one broker-call implementation — no duplication between CLI and MCP.

3. Warm path via Unix socket bridge (load-bearing for 2.0).

A push-pipe holds a live WS connection. CLI invocations should reuse that connection rather than opening their own (which costs ~300-500ms cold-start).

Mechanism:

  • On startup, push-pipe creates ~/.claudemesh/sockets/<mesh-slug>.sock (Unix domain socket, mode 0600).
  • CLI verbs that need broker round-trip first try to dial that socket.
  • If alive: forward request, get response back over socket (~5ms).
  • If absent / stale: open ephemeral WS, do the op, close (~300ms — fine for cron/scripts where there's no parent push-pipe).

Push-pipe owns one WS, all ops through that WS, broker sees ONE session per mesh per host (no duplicate hellos). On crash, socket file is unlinked by unlink on exit handler; stale-socket detection by connect() ECONNREFUSED.

This is mandatory for 2.0 — without it, every CLI op pays cold-start, and CLI-first becomes unusably slow for tight loops.

4. JSON output is the schema, with field selection and streaming.

Every CLI verb has a deterministic --json output shape, documented in docs/cli-schemas.md, validated by zod parsers in tests. Claude reads claudemesh vector search "x" --json and gets a typed-array shape it can reason over identically to a tool return.

Three output modes, mandatory across every read-shaped verb (modeled on gh and gemini):

  • --json — full record, all fields
  • --json <fields> — field-selected projection (e.g. claudemesh peers --json name,pubkey,status)
  • --output-format stream-json — incremental JSONL for long-running ops (mesh-MCP calls fanning across peers, vector search against large indexes, schedule list with many entries). One object per line, Claude consumes incrementally.

Plus convenience output:

  • --jq <expr> — native jq filter pipeline
  • --template '{{.field}}' — Go template formatting

schema_version: "1.0" field on every JSON output — mandatory. Bumps when shape changes. Old code paths can pin with --schema-version=1.0.

5. All features stay. Nothing is removed.

This is not a feature trim. Every primitive in the current 80-tool surface gets a CLI verb. Vectors, graphs, mesh-MCP, files, vault, SQL — all of it. The user-facing pitch is unchanged: "claudemesh gives your Claude session a name, a network, shared memory, shared compute, shared skills, scheduled actions." The change is how you call it.

6. The Claude skill IS the schema. (load-bearing for CLI-first to work)

Stripping MCP tool introspection (tools/list) costs Claude its discoverability. The replacement: a packaged claudemesh skill at ~/.claude/skills/claudemesh/SKILL.md written by claudemesh install, documenting every verb, flag, JSON shape, and gotcha. Claude reads it on demand via the Skill tool — not on every session, not pre-loaded into deferred-tool-list. This is exactly how obsidian-cli works today and it works perfectly.

The skill replaces three things at once:

  • Tool discovery — Claude knows the verb-set after one Skill invocation. No tools/list needed.
  • Output schemas — every JSON shape is documented in the skill, so Claude knows what to expect from --json without parsing TypeScript types at runtime.
  • Behavioral conventions — the skill teaches "preview before delete," "confirm peer match before kick," "use --mesh for cross-mesh ops" — soft guardrails that complement the policy engine's hard rules.

Topic-shards for size: claudemesh (core), claudemesh-platform (vault/vectors/graph/sql/mesh-mcp), claudemesh-schedule (cron/webhooks/watches), claudemesh-admin (kick/ban/grants/install). Each shard is independently loadable.

This is the answer to the "JSON-on-stdout is a worse schema" caveat. It's not — when Claude has a documented skill to load, the CLI surface is more discoverable than 80 deferred MCP tools that bloat ToolSearch silently.

7. Pluggable policy engine, not binary --yes. (answers the Bash-blast-radius caveat)

Modeled on gemini --policy / --admin-policy and codex --sandbox. Replace the current binary -y/--yes with:

  • --approval-mode plan|read-only|write|yolo — four levels (read-only blocks all writes, plan blocks all side effects, write prompts on dangerous verbs, yolo skips all confirmation).
  • --policy <file> — YAML allow/deny rules per resource × verb × peer. Sample:
# ~/.claudemesh/policy.yaml
default: prompt
rules:
  - resource: send
    verb: "*"
    decision: allow
  - resource: sql
    verb: execute
    decision: prompt
  - resource: file
    verb: delete
    decision: deny
  - resource: mesh-mcp
    verb: call
    peers: ["@trusted"]
    decision: allow

Policy decisions log to a tamper-evident audit file. Org admin can ship --admin-policy that overrides user config. This is the real answer to "Bash carries unrestricted blast-radius once allowed" — claudemesh's own policy engine kicks in before the broker call, regardless of what shell permissions are.

What this means for claude/channel

When peer A's CLI runs claudemesh send peer-B "hello":

  1. CLI dials ~/.claudemesh/sockets/<mesh>.sock (warm path) or opens its own WS (cold).
  2. Encrypts message with peer-B's pubkey via crypto_box.
  3. Broker receives send envelope, forwards encrypted blob to peer-B's connected push-pipe.
  4. Peer-B's push-pipe decrypts and emits a claude/channel notification.
  5. Claude Code mid-turn-injects the message as a <channel source="claudemesh" ...> reminder.
  6. Claude responds immediately per the system prompt convention.

Step 5 is the only step that requires MCP. Steps 1-4 are pure CLI + broker. The architecture is "CLI for everything, MCP for the one thing it's irreplaceable for."

Migration path from 1.1.0

Version Ships Behavior
1.2.0 Unix socket bridge. CLI verbs auto-detect push-pipe and use warm path. Field-selectable JSON (--json a,b,c) + --jq + --template adopted. All existing MCP tools still work. Nothing breaks.
1.2.1 Ships ~/.claude/skills/claudemesh/SKILL.md written by claudemesh install. Includes full verb reference + output schemas + gotchas. Topic-shards (-platform, -schedule, -admin). Skill auto-installs on claudemesh install.
1.3.0 Schedule unification (schedule msg/webhook/tool). All remaining missing CLI verbs (file, vector, graph, mesh-mcp, vault, sql, stream, context, skill, watch). --output-format stream-json for long-running ops. All existing MCP tools still work. New verbs additive.
1.4.0 Resource-model rename pass — every CLI verb is <resource> <verb>. Old verbs become aliases. All existing MCP tools still work. Old CLI verbs aliased forever.
1.5.0 Pluggable policy engine (--approval-mode, --policy, --admin-policy). MCP tools/list shrinks to configurable allowlist (default: empty). CLAUDEMESH_MCP_FAT=1 for users who need typed tool surface. Default 1.5 install: MCP exposes zero tools. Push-pipe-only. Policy engine gates all writes.
2.0.0 MCP server hardcoded to push-pipe-only. Strip all tool registrations + handlers. Old MCP tool calls return tool-not-found. Users must update scripts to CLI verbs. Old CLI verbs (1.4 aliases) still work.

What stays exactly the same

  • Crypto: ed25519 sign + x25519 sealing + crypto_box for DMs. No change.
  • Broker protocol: WS frame format, hello flow, audit log. No change.
  • Membership / mesh-scope / capability grants. No change.
  • Web app, dashboard, Telegram bridge, OAuth. No change.
  • The platform vision (vault, vectors, graph, files, skills, mesh-MCPs, scheduled jobs). All shipped, all stay.

What changes for users

  • ~/.claude.json simplifies: "claudemesh": { "command": "claudemesh", "args": ["mcp"] } becomes one entry per joined mesh after claudemesh install. Multi-mesh push works out of the box.
  • ToolSearch loses ~80 deferred entries. Sessions are lighter.
  • Scripts that called mcp__claudemesh__* get a deprecation warning in 1.x, break in 2.0 — replaced by claudemesh <verb> --json + jq.
  • Claude Code system prompt for the MCP server gets shorter (no tool catalog), focused only on "RESPOND IMMEDIATELY to channel events."

Open questions parked for future specs

  • Federation — broker-to-broker encrypted relay so peers on different brokers can talk. Not in 2.0 scope.
  • Offline-with-TTL inbox — persist now priority messages on broker if recipient is offline, with explicit TTL. Reasonable for 2.x.
  • Compute attribution — when peer X invokes a mesh-MCP that peer Y deployed, who pays for broker compute / outbound calls? Pre-empts the eventual billing question. 2.x.
  • Universal hash-chained audit — every state mutation per mesh is hash-chained, replayable, verifiable. Today only some events are; making it universal is its own spec.
  • ACP (Agent Communication Protocol) interop with Gemini CLI. Gemini CLI exposes --acp for agent-to-agent comms — the same problem domain claudemesh occupies. Research question: is ACP a documented standard claudemesh can speak (making claudemesh peers and Gemini peers cross-talk in the same mesh), or is it Google-proprietary? If standard, implementing it is a major platform expansion. File as separate research spec before 2.x.

What this spec is NOT

  • Not a redesign of the broker. The broker stays as-is.
  • Not a redesign of crypto. Crypto stays as-is.
  • Not a feature deprecation. Every feature stays.
  • Not optional. This is the canonical 2.0 architecture; intermediate versions migrate toward it.

Effort estimate to 2.0

Sequential, single dev (revised after caveats survey — original estimate was rosy):

  • 1.2.0 (socket bridge + field-JSON): 1-2 weeks. Socket bridge is real distributed-systems work (stale-cleanup, version negotiation, NFS/Windows edge cases) — not 2-3 days.
  • 1.2.1 (claudemesh skill + topic shards): 2-3 days. Mostly content writing once schemas are documented.
  • 1.3.0 (schedule unification + remaining verbs + stream-json): 1 week. Each of the ~10 missing verbs is small but adds up.
  • 1.4.0 (resource-model rename + alias compat): 2-3 days.
  • 1.5.0 (policy engine + MCP allowlist): 4-5 days. Policy engine is its own subsystem — parser, evaluator, audit log, admin override.
  • 2.0.0 (strip tool handlers + cutover): 2 days.

Total: ~5-6 weeks of focused work spread over 3-4 months calendar. Each release is independently shippable; the policy engine specifically can land later than 1.5 if needed.

Acceptance signals — how we know it worked

  • ToolSearch in a freshly-installed claudemesh session shows zero mcp__claudemesh__* entries by default (vs ~80 today).
  • claudemesh peers --json name,status projects exactly two fields, no extra noise.
  • claudemesh send <peer> "hi" from a Bash call inside a Claude session round-trips in <50ms (warm path via socket bridge) on localhost-broker, <250ms on EU-from-US.
  • Skill: claudemesh loaded once teaches Claude the entire mesh surface; subsequent CLI calls require no further introspection.
  • A policy file with decision: deny for file delete blocks the call before it hits the broker, with a clear stderr explanation.
  • claudemesh status set working from cron opens its own WS (no daemon), succeeds in <1s, no orphan connections on broker.