Add the foundation for deploying and managing MCP servers on the VPS
broker, with per-peer credential vaults and visibility scopes.
Architecture:
- One Docker container per mesh with a Node supervisor
- Each MCP server runs as a child process with its own stdio pipe
- claudemesh launch installs native MCP entries in ~/.claude.json
- Mid-session deploys fall back to svc__* dynamic tools + list_changed
New components:
- DB: mesh.service + mesh.vault_entry tables, mesh.skill extensions
- Broker: 19 wire protocol types, 11 message handlers, service catalog
in hello_ack with scope filtering, service-manager.ts (775 lines)
- CLI: 13 tool definitions, 12 WS client methods, tool call handlers,
startServiceProxy() for native MCP proxy mode
- Launch: catalog fetch, native MCP entry install, stale sweep, cleanup,
MCP_TIMEOUT=30s, MAX_MCP_OUTPUT_TOKENS=50k
Security: path sanitization on service names, column whitelist on
upsertService, returning()-based delete checks, vault E2E encryption.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Persistent MCP servers (opt-in via `persistent: true`) survive host
disconnects — they appear as offline in mcp_list and auto-restore when
the host reconnects. Ephemeral servers (default) still clean up on
disconnect. Offline servers return a clear error on mcp_call with
time-since-disconnect info.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Save groups, profile, visibility, summary, display name, and cumulative
stats to a new mesh.peer_state table on disconnect. On reconnect (same
meshId + memberId), restore them automatically — hello groups take
precedence over stored groups if provided. Broadcast peer_returned
system event with last-seen time and summary to other peers.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Peers report os.hostname() in the hello handshake. list_peers shows
[local] or [remote] tag per peer. MCP instructions teach AI to read
local peers' files directly via filesystem instead of relay.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Broker-driven clock that broadcasts periodic heartbeat ticks to all
peers in a mesh. Speed is configurable from x1 (real-time, 60s ticks)
to x100 (600ms ticks) for load testing simulations. Auto-pauses when
the last peer disconnects.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds share_skill, get_skill, list_skills, and remove_skill across the full
stack (Drizzle schema, broker CRUD + WS handlers, CLI client methods, MCP
tools). Skills are mesh-scoped, unique by name, and searchable via ILIKE
on name/description/tags.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Peers self-report resource usage via set_stats; stats visible in
list_peers responses and the new mesh_stats MCP tool. CLI auto-reports
every 60s and tracks messagesIn/Out, toolCalls, uptime, and errors.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Peers can register MCP servers with the mesh and other peers can invoke
those tools through the existing claudemesh connection without restarting.
Broker: in-memory MCP registry with mcp_register/unregister/list/call
handlers, call forwarding to hosting peer with 30s timeout, and automatic
cleanup on peer disconnect.
CLI: mcpRegister/mcpUnregister/mcpList/mcpCall client methods, inbound
mcp_call_forward handler, and 4 new MCP tools (mesh_mcp_register,
mesh_mcp_list, mesh_tool_call, mesh_mcp_remove).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replace in-memory-only setTimeout scheduling with a DB-backed system
that survives broker restarts. Adds:
- `scheduled_message` table in mesh schema (Drizzle + raw CREATE TABLE
for zero-downtime deploys)
- Minimal 5-field cron parser (no dependencies) with next-fire-time
calculation for recurring entries
- On broker boot, all non-cancelled entries are loaded from PostgreSQL
and timers re-armed automatically
- CLI `schedule_reminder` MCP tool accepts optional `cron` expression
- CLI `remind` command accepts `--cron` flag
- One-shot reminders remain backward compatible — no cron field = same
behavior as before
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Extend the WS hello handshake with optional peerType, channel, and model
fields so peers can advertise what kind of client they are. The broker
stores these in-memory on PeerConn and returns them (along with cwd) in
the peers_list response. CLI peers command and MCP list_peers tool now
display the new metadata.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
When a peer connects or disconnects, the broker now broadcasts a
system push (subtype: "system") to all other peers in the same mesh.
The CLI formats these as [system] channel notifications so AI sessions
can react to topology changes without polling.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replace all 22 resolver Array<fn> patterns with Map<reqId, {resolve, timer}>.
Outgoing messages now include _reqId; on response the broker's echoed _reqId
is used for exact matching, with FIFO fallback for brokers that don't echo it.
Add makeReqId() helper and resolveFromMap() utility. Error propagation block
updated to iterate Maps and pop the oldest entry across all queues.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- claim_task/complete_task: send taskId not id
- graph_result: read msg.records not msg.rows
- message_status: try all mesh clients, not only first
- broker: omit state_result for set_state (fixes get_state cross-contamination)
- error handler: unblock first pending resolver on unmatched broker errors
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Node.js stdout to a pipe is buffered. Without periodic event loop
activity, WS callback → server.notification() → stdout.write() may
not flush until the next I/O event. A 1s setInterval (NOT unref'd)
keeps the event loop ticking so notifications flush immediately.
This is why claude-intercom worked: its 1s HTTP poll kept the event
loop active as a side effect. Claudemesh's passive WS listener let
the event loop settle, causing stdout to buffer indefinitely.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Files: MinIO-backed file sharing built into the broker.
share_file for persistent mesh files, send_message(file:) for
ephemeral attachments. Presigned URLs for download, access
tracking per peer.
Broker infra: MinIO in docker-compose, internal network.
HTTP POST /upload endpoint. WS handlers for get_file,
list_files, file_status, delete_file.
Multi-target: send_message(to:) accepts string or array.
Targets deduplicated before delivery.
Targeted views: MCP instructions teach Claude to send
tailored messages per audience instead of generic broadcasts.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Phase B + C + message delivery status.
State: shared key-value store per mesh. set_state pushes changes
to all peers. get_state/list_state for reads. Peers coordinate
through shared facts instead of messages.
Memory: persistent knowledge with full-text search (tsvector).
remember/recall/forget. New peers recall context from past sessions.
message_status: check delivery status with per-recipient detail
(delivered/held/disconnected).
Multicast fix: broadcast and @group messages now push directly to
all connected peers instead of racing through queue drain.
MCP instructions: dynamic identity injection (name, groups, role),
comprehensive tool reference, group coordination guide.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Phase A of the claudemesh spec. Peers can now join named groups
with roles, and messages route to @group targets.
Broker:
- @group routing in fan-out (matches peer group membership)
- @all alias for broadcast
- join_group/leave_group WS messages + DB persistence
- list_peers returns group metadata
- drainForMember matches @group targetSpecs in SQL
CLI:
- join_group/leave_group MCP tools
- send_message supports @group targets
- list_peers shows group membership
- PeerInfo includes groups array
- Peer name cache for push notifications
Launch:
- --role flag (optional peer role)
- --groups flag (comma-separated, e.g. "frontend:lead,reviewers")
- Interactive wizard for role + groups when flags omitted
- Groups written to session config for broker hello
Spec: SPEC.md added with full v0.2 vision (groups, state, memory)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Write displayName into tmpdir config.json so the MCP server reads
it directly. Env vars from claudemesh launch may not propagate to
MCP child processes spawned by Claude Code. Config file is reliable.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Each WS connection generates its own ed25519 keypair (sessionPubkey)
sent in the hello handshake. The broker stores it on the presence
row and uses it for message routing + list_peers. This gives every
`claudemesh launch` a unique crypto identity without burning invite
uses — member auth stays permanent, session identity is ephemeral.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Remove ALL Payload imports, withPayload wrapper, and (payload)
routes. Blog index + changelog are now static data arrays.
Blog post at /blog/peer-messaging-claude-code is static TSX.
Payload CMS stays as a dev dependency for future local admin
but has zero presence in the production build.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The private setStatus(ConnStatus) conflicted with the public
setStatus("idle"|"working"|"dnd") method, causing TS2393 under strict
typecheck. Rename the private one to setConnStatus and update all
internal call sites.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The push handler previously fell through to base64-decoding the
raw ciphertext whenever decryptDirect() returned null. For direct
(crypto_box) messages that produces garbage binary which surfaces
as garbled bytes in Claude's <channel> reminder. Limit the base64
fallback to legacy broadcast/channel messages (no senderPubkey),
and emit a clearer "⚠ message from <pubkey> failed to decrypt"
warning when direct decryption fails.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
WS handshake is now authenticated end-to-end. The broker proves that
every connected peer actually holds the secret key for the pubkey
they claim as identity — not just that they know the pubkey.
wire format change:
{type:"hello", meshId, memberId, pubkey, sessionId, pid, cwd,
timestamp, signature}
where signature = ed25519_sign(canonical, secretKey)
and canonical = `${meshId}|${memberId}|${pubkey}|${timestamp}`
broker verifies on every hello:
1. timestamp within ±60s of broker clock → else close(1008, timestamp_skew)
2. pubkey is 64 hex chars, signature is 128 hex chars → else malformed
3. crypto_sign_verify_detached(signature, canonical, pubkey) → else bad_signature
4. (existing) mesh.member row exists for (meshId, pubkey) → else unauthorized
All rejection paths close the WS with code 1008 + structured error
message + metrics counter increment (connections_rejected_total by
reason).
new modules:
- apps/broker/src/crypto.ts: canonicalHello, verifyHelloSignature,
HELLO_SKEW_MS constant
- apps/cli/src/crypto/hello-sig.ts: matching signHello helper
clients updated:
- apps/cli/src/ws/client.ts: signs hello before send
- apps/broker/scripts/{peer-a,peer-b}.ts (smoke-test): sign hellos
with seed-provided secret keys
new regression tests — tests/hello-signature.test.ts (7):
- valid signature accepted
- bad signature (signed with wrong key) rejected
- timestamp too old rejected (>60s)
- timestamp too far in future rejected (>60s)
- tampered canonical field (different meshId at verify time) rejected
- malformed hex pubkey rejected
- malformed signature length rejected
verified live:
- apps/broker/scripts/smoke-test.sh: full hello+ack+send+push flow
- apps/cli/scripts/roundtrip.ts: signed hello + encrypted message
- 55/55 tests pass
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Direct messages between peers are now end-to-end encrypted. The
broker only ever sees {nonce, ciphertext} — plaintext lives on the
two endpoints.
apps/cli/src/crypto/envelope.ts:
- encryptDirect(message, recipientPubkeyHex, senderSecretKeyHex)
→ {nonce, ciphertext} via crypto_box_easy, 24-byte fresh nonce
- decryptDirect(envelope, senderPubkeyHex, recipientSecretKeyHex)
→ plaintext or null (null on MAC failure / malformed input)
- ed25519 keys (from Step 17) are converted to X25519 on the fly via
crypto_sign_ed25519_{pk,sk}_to_curve25519 — one signing keypair
covers both signing + encryption roles.
BrokerClient.send():
- if targetSpec is a 64-hex pubkey → encrypt via crypto_box
- else (broadcast "*" or channel "#foo") → base64-wrapped plaintext
(shared-key encryption for channels lands in a later step)
InboundPush now carries:
- plaintext: string | null (decrypted body, null if decryption failed
OR it's a non-direct message)
- kind: "direct" | "broadcast" | "channel" | "unknown"
MCP check_messages formatter reads plaintext directly.
side-fixes pulled in during 18a:
- apps/broker/scripts/seed-test-mesh.ts now generates real ed25519
keypairs (the previous "aaaa…" / "bbbb…" fillers weren't valid
curve points, so crypto_sign_ed25519_pk_to_curve25519 rejected
them). Seed output now includes secretKey for each peer.
- apps/broker/src/broker.ts drainForMember wraps the atomic claim in
a CTE + outer ORDER BY so FIFO ordering is SQL-sourced, not
JS-sorted (Postgres microsecond timestamps collapse to the same
Date.getTime() milliseconds otherwise).
- vitest.config.ts fileParallelism: false — test files share
DB state via cleanupAllTestMeshes afterAll, so running them in
parallel caused one file's cleanup to race another's inserts.
- integration/health.test.ts "returns 200" now uses waitFullyHealthy
(a 200-only waiter) instead of waitHealthyOrAny — prevents a race
with the startup DB ping.
verified live:
- apps/cli/scripts/roundtrip.ts (direct A→B): ciphertext in DB is
opaque bytes (not base64-plaintext), decrypted correctly on arrival
- apps/cli/scripts/join-roundtrip.ts (full join → encrypted send):
PASSED
- 48/48 broker tests green
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
broker-client: full WS client with hello handshake + ack, auto-reconnect
with exponential backoff (1s → 30s capped), in-memory outbound queue
(max 100) during reconnect, 500-entry push buffer for check_messages.
MCP tool integration:
- send_message: "slug:target" prefix or single-mesh fast path
- check_messages: drains push buffers across all clients
- set_status: fans manual override across all connected meshes
- set_summary: stubbed (broker protocol extension needed)
- list_peers: stubbed — lists connected mesh slugs + statuses
manager module holds Map<meshId, BrokerClient>, starts on MCP server
boot for every joined mesh in ~/.claudemesh/config.json.
new CLI command: seed-test-mesh injects a mesh row for dev testing.
also fixes a broker-side hello race: handleHello sent hello_ack before
the caller closure assigned presenceId, so clients sending right after
the ack hit the no_hello check. Fix: return presenceId, caller sets
closure var, THEN sends hello_ack. Queue drain is fire-and-forget now.
round-trip verified: two clients, A→B, push received with correct
senderPubkey + ciphertext. 44/44 broker tests still pass.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>