Fixes the 'live peer looks disconnected' class of bugs. Two layers:
ROOT CAUSE — involuntary mesh context loss:
The session→mesh binding lived only in the daemon's in-memory registry,
so a daemon restart (e.g. `daemon down && up`) wiped it. Every live
session then lost its mesh, and CLI commands fell back to an arbitrary
default mesh — a peer that never moved looked offline.
Fix: persist session bindings to ~/.claudemesh/daemon/sessions.json
(secret-free — keypairs reload from the per-session keypair store). On
boot the daemon rehydrates each binding whose pid is still alive (with a
start-time PID-reuse guard), reloads its keypair, re-signs a parent
attestation, and re-registers it — which reconnects its SessionBroker
WS. Restarts are now transparent; sessions keep their mesh.
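
The rehydration filter described above can be sketched roughly as follows. This is a minimal illustration, not the daemon's code: the `PersistedBinding` shape, the `ProcessProbe` callback, and the one-second tolerance are all assumptions, since the sessions.json schema is not shown here.

```typescript
// Hypothetical shape of one persisted binding (the real sessions.json schema may differ).
interface PersistedBinding {
  token: string;
  mesh: string;
  pid: number;
  pidStartTimeMs: number; // process start time recorded when the session registered
}

// Hypothetical probe: returns the start time of a live pid, or null when the pid is gone.
type ProcessProbe = (pid: number) => number | null;

// Keep only bindings whose process is still alive AND is the same process that registered.
function selectRehydratable(
  bindings: PersistedBinding[],
  probe: ProcessProbe,
  toleranceMs = 1000, // assumed slack for clock granularity
): PersistedBinding[] {
  return bindings.filter((b) => {
    const startedAt = probe(b.pid);
    if (startedAt === null) return false; // process exited while the daemon was down
    // PID-reuse guard: a different start time means the pid now belongs to another process.
    return Math.abs(startedAt - b.pidStartTimeMs) <= toleranceMs;
  });
}
```

Injecting the probe keeps the liveness check testable without spawning real processes.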
DEFENSIVE LAYER — cross-mesh send resolution:
`send` without --mesh returned mesh_required whenever several meshes were joined;
a prefix under --mesh X resolved against the default mesh's roster, not
X's (only the full 64-char pubkey worked). Now a name/prefix is resolved
across all joined meshes (or scoped to --mesh): unique match auto-selects
its mesh, multi-mesh match asks for --mesh, none gives a clear error.
Kills mesh_required for peers on a non-default mesh and fixes P3.
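
A rough sketch of the three outcomes (unique, ambiguous, none). The `Peer` and `Resolution` types and the `resolveTarget` name are hypothetical; treating several matches inside one mesh as ambiguous is also an assumption.

```typescript
interface Peer {
  mesh: string;
  name: string;
  pubkey: string; // full 64-char hex pubkey
}

type Resolution =
  | { kind: "unique"; peer: Peer }
  | { kind: "ambiguous"; meshes: string[] }
  | { kind: "none" };

// Resolve a display name or pubkey prefix across all joined meshes,
// or only inside scopeMesh when --mesh was passed.
function resolveTarget(target: string, rosters: Peer[], scopeMesh?: string): Resolution {
  const candidates = rosters.filter(
    (p) =>
      (scopeMesh === undefined || p.mesh === scopeMesh) &&
      (p.name === target || p.pubkey.startsWith(target)),
  );
  const [first, ...others] = candidates;
  if (first === undefined) return { kind: "none" };
  if (others.length === 0) return { kind: "unique", peer: first };
  // More than one match: report the meshes involved so the caller can ask for --mesh.
  const meshes = Array.from(new Set(candidates.map((p) => p.mesh)));
  return { kind: "ambiguous", meshes };
}
```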
Maps to field-report P1/P2/P3. P4 (shared member) left as-is (by design).
New: 5 persistence unit tests. Full suite 119/119. Daemon boot verified.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Session identity is now anchored on Claude Code's session UUID instead of a
fresh random keypair per launch. The ed25519 session keypair is generated
once per (mesh, session UUID) and persisted under
~/.claudemesh/sessions/<mesh>/<uuid>.json, so relaunching or --resume-ing the
same session reuses the same sessionPubkey.
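
The load-or-create behaviour can be sketched as below. This is an illustration only: the `KeyStore` and `KeypairSketch` types and the injected `generate` function are hypothetical stand-ins for the real file store and ed25519 generator, and the JSON layout is an assumption.

```typescript
interface KeypairSketch {
  publicKeyHex: string;
  secretKeyHex: string;
}

// Hypothetical storage abstraction standing in for the on-disk store.
interface KeyStore {
  read(path: string): string | undefined;
  write(path: string, data: string): void;
}

// One keypair per (mesh, session UUID): reuse it when present, otherwise mint and persist.
function loadOrCreateKeypairSketch(
  store: KeyStore,
  generate: () => KeypairSketch,
  mesh: string,
  uuid: string,
): KeypairSketch {
  const path = `sessions/${mesh}/${uuid}.json`;
  const existing = store.read(path);
  if (existing !== undefined) return JSON.parse(existing) as KeypairSketch;
  const fresh = generate();
  store.write(path, JSON.stringify(fresh));
  return fresh;
}
```

Because the lookup key is the session UUID rather than the launch, a relaunch or resume of the same session lands on the same file and therefore the same pubkey.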
Why: a DM is sealed (crypto_box) to the recipient's sessionPubkey. With
ephemeral per-launch keys, the pubkey rotated on every relaunch, so queued
messages became undecryptable AND the old presence lingered as a same-name
ghost that won queued-DM claim races. Reconnecting could not recover the
peer because it minted yet another key. On --resume the CLI also registered
a throwaway random id unrelated to the resumed session, so the broker never
recognized the returning peer.
CLI (launch.ts):
- resolve the stable UUID for all paths: a fresh launch mints one and forces
it via --session-id; --resume V registers V; --continue resolves the
most-recent session UUID from ~/.claude/projects/<cwd>.
- use loadOrCreateSessionKeypair(mesh, uuid) instead of generateKeypair().
CLI (daemon/run.ts):
- onRegister closes any prior SessionBrokerClient holding the same pubkey
under a different token (the leaked-WS ghost).
Broker (handleSessionHello):
- reattach by sessionPubkey regardless of lease state (online or grace),
closing the stale socket — enforces one live presence per session pubkey,
killing the duplicate and draining queued DMs on return.
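
The one-live-presence rule can be sketched as a table keyed by session pubkey. The `SocketLike` interface, the `PresenceTable` class, and the close reason string are hypothetical; lease state and DM draining are left out.

```typescript
interface SocketLike {
  id: string;
  close(reason: string): void;
}

class PresenceTable {
  private bySessionPubkey = new Map<string, SocketLike>();

  // Attach a socket for this session pubkey, closing whatever socket held it before.
  // Returns the displaced socket, if any.
  attach(sessionPubkey: string, socket: SocketLike): SocketLike | undefined {
    const previous = this.bySessionPubkey.get(sessionPubkey);
    this.bySessionPubkey.set(sessionPubkey, socket);
    if (previous === undefined || previous === socket) return undefined;
    previous.close("replaced_by_reattach"); // hypothetical reason code
    return previous;
  }

  liveCount(): number {
    return this.bySessionPubkey.size;
  }
}
```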
Trade-off: session secret keys now persist on disk (the member key already
does); SPEC.md updated to reflect the stable-identity model. Older CLIs
remain compatible (they keep using ephemeral keys).
New: keypair-store.ts + 7 unit tests. Full CLI suite: 114/114 green.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Seven-ship sequence that took the daemon from "works for one session"
to "internally consistent for N sessions on one daemon." Architecture
invariant after 1.34.13: every shared store / channel scopes by
recipient (SSE demux at bind layer + token forwarding, inbox per-
recipient columns, outbox sender-session routing).
- 1.34.7 inbox flush + delete commands
- 1.34.8 seen_at column + TTL prune + first echo guard
- 1.34.9 broader echo guard + system-event polish + staleness warning
- 1.34.10 per-session SSE demux (SseFilterOptions) + universal daemon
(--mesh / --name deprecated) + daemon_started version stamp
- 1.34.11 inbox per-recipient column (storage half of 1.34.10)
- 1.34.12 daemon up detaches by default (logs to ~/.claudemesh/daemon/
daemon.log; service units explicitly pass --foreground)
- 1.34.13 MCP forwards session token on /v1/events — the actual fix
that activates 1.34.10's demux. Without this header the
daemon's session resolved null, filter was empty, every MCP
received the unfiltered global stream.
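
Why the missing header defeated the demux, as a minimal sketch. `StreamEvent`, `SketchFilter`, and both functions are hypothetical and are not the real SseFilterOptions; the point is only that a null session produces an empty filter, and an empty filter passes everything.

```typescript
interface StreamEvent {
  recipient: string; // session pubkey the event is addressed to
  body: string;
}

// Hypothetical stand-in for the real filter options.
interface SketchFilter {
  recipient?: string;
}

// Build the filter from the session resolved off the forwarded token (null when no token).
function filterForSession(session: { sessionPubkey: string } | null): SketchFilter {
  return session === null ? {} : { recipient: session.sessionPubkey };
}

function demux(events: StreamEvent[], filter: SketchFilter): StreamEvent[] {
  if (filter.recipient === undefined) return events; // empty filter: unfiltered global stream
  return events.filter((e) => e.recipient === filter.recipient);
}
```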
Roadmap entry at docs/roadmap.md captures the timeline + the four
known gaps tracked for follow-ups (launch env-var leak, broker
listPeers mesh-filter, kick on control-plane no-op, session caps as
first-class concept).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Resolves the merge of m1-broker-drain-race-and-presence-role and
m1-cli-lifecycle-and-role-peer-list into main:
* Rename wire-level role classification field `role` → `peerRole`
to avoid collision with 1.31.5's top-level `role` lift of
`profile.role` (user-supplied string consumed by the agent-vibes
claudemesh skill). `peerRole` is the broker presence taxonomy
(control-plane/session/service); top-level `role` keeps its 1.31.5
semantics.
- apps/broker/src/broker.ts (listPeersInMesh return)
- apps/broker/src/index.ts (peers_list response)
- apps/broker/src/types.ts (WSPeersListMessage)
- apps/cli/src/commands/peers.ts (PeerRecord + filter + lift)
* Wire CLI client_ack emission: handleBrokerPush gains
ackClientMessage callback; daemon-WS and session-WS each got a
sendClientAck() method that frames {type:"client_ack",
clientMessageId, brokerMessageId?} and forwards via the lifecycle
helper. run.ts wires the callback into both onPush paths.
Receiver dedupes against existing inbox row first then acks
unconditionally — broker needs the ack regardless of dedupe to
release its claim lease.
- apps/cli/src/daemon/inbound.ts (ackClientMessage in InboundContext)
- apps/cli/src/daemon/broker.ts + session-broker.ts (sendClientAck)
- apps/cli/src/daemon/run.ts (wire-up)
* Version bump 1.32.1 → 1.33.0; CHANGELOG entry replaces "Unreleased"
with full m1 description.
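
The client_ack flow above (dedupe first, ack unconditionally) can be sketched as follows. The frame fields come from this message; the function names, the in-memory `Set` standing in for the inbox, and deduping on clientMessageId are assumptions.

```typescript
interface ClientAckFrame {
  type: "client_ack";
  clientMessageId: string;
  brokerMessageId?: string;
}

function buildClientAck(clientMessageId: string, brokerMessageId?: string): ClientAckFrame {
  return brokerMessageId === undefined
    ? { type: "client_ack", clientMessageId }
    : { type: "client_ack", clientMessageId, brokerMessageId };
}

// Returns true when the message was stored, false when it was a duplicate.
// The ack is sent in both cases so the broker can release its claim lease.
function receivePushSketch(
  inbox: Set<string>,
  msg: { clientMessageId: string; brokerMessageId?: string },
  ack: (frame: ClientAckFrame) => void,
): boolean {
  const isNew = !inbox.has(msg.clientMessageId);
  if (isNew) inbox.add(msg.clientMessageId);
  ack(buildClientAck(msg.clientMessageId, msg.brokerMessageId));
  return isNew;
}
```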
Verification: tsc clean across cli + broker; CLI 83/83 unit tests
pass; broker 50 unit tests pass (5 integration test files require a
live Postgres and were skipped — pre-existing infra gap, not a
regression). CLI bundle rebuilt; version 1.33.0 baked.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Foundational cleanups before agentic-comms architecture work
(.artifacts/specs/2026-05-04-agentic-comms-architecture-v2.md).
All behavior-preserving.
1. Extract `connectWsWithBackoff` into apps/cli/src/daemon/ws-lifecycle.ts.
Both DaemonBrokerClient and SessionBrokerClient now share one
lifecycle implementation (connect, hello-handshake, ack-timeout,
close + backoff reconnect). Each client provides its own buildHello
/ isHelloAck / onMessage hooks and keeps its own RPC bookkeeping
(pendingAcks, peerListResolvers, onPush). Composition over
inheritance per Codex's review; no protocol shape changes.
2. Drop daemon-WS ephemeral session pubkey. DaemonBrokerClient no
longer mints + sends a per-reconnect ephemeral keypair in its
hello. Session-targeted DMs land on SessionBrokerClient since
1.32.1, not the member-keyed daemon-WS, so the field was
vestigial. Send-encrypt path now signs DMs with the stable mesh
member secret. handleBrokerPush invocations from daemon-WS only
pass the member secret — session decryption is the session-WS's
job.
3. Role-aware peer list. `peer list` now hides peers whose
broker-emitted `role` is `'control-plane'`. `--all` opts back in.
JSON output emits `role` at top level. Rows from older brokers that
don't emit role yet are treated as 'session', so legacy peer rows
stay visible even before the broker-side change ships. Replaces
the prior `peerType === 'claudemesh-daemon'` channel-name hack.
Typecheck + tests + build all green.
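
The role-aware filter from item 3 can be sketched like this. `PeerRow` and `visiblePeers` are hypothetical names; only the default of 'session' and the hiding of 'control-plane' rows come from this message.

```typescript
interface PeerRow {
  name: string;
  role?: string; // absent on rows from older brokers
}

// Default a missing role to "session", then hide control-plane rows unless --all was passed.
function visiblePeers(rows: PeerRow[], showAll: boolean): Array<PeerRow & { role: string }> {
  return rows
    .map((row) => ({ ...row, role: row.role ?? "session" }))
    .filter((row) => showAll || row.role !== "control-plane");
}
```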
SessionBrokerClient (daemon-side, since 1.30.0) was constructed
without a push handler and silently dropped every inbound `push` /
`inbound` frame. Header docstring claimed it handled "inbound DM
delivery for messages targeted at the session pubkey" but the
callback was never wired.
Net effect: any DM sent to a peer's session pubkey (everything
`peer list` returns now) was queued, broker-acked, marked
delivered_at on the broker, and thrown away by the recipient
daemon. inbox.db stayed at zero rows; `claudemesh inbox` reported
"no messages" no matter what arrived.
A two-session smoke test surfaced this: sender outbox status=done with
broker_message_id, recipient inbox empty.
Fix: wire SessionBrokerClient to forward push/inbound frames to
the same handleBrokerPush the member-keyed broker already uses.
Pass the per-session secret key as sessionSecretKeyHex so
decryptOrFallback tries it first; member key remains the fallback
for legacy member-targeted traffic.
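
The key order can be sketched as below. This is not decryptOrFallback itself: the `TryDecrypt` callback is a hypothetical stand-in for the real crypto_box open, and the return shape is an assumption.

```typescript
// Hypothetical decrypt primitive: returns plaintext, or null when the key does not open the box.
type TryDecrypt = (ciphertext: string, secretKeyHex: string) => string | null;

interface DecryptKeys {
  sessionSecretKeyHex?: string;
  memberSecretKeyHex: string;
}

// Try the per-session key first; fall back to the member key for legacy member-targeted traffic.
function decryptWithFallback(
  ciphertext: string,
  keys: DecryptKeys,
  tryDecrypt: TryDecrypt,
): { plaintext: string; via: "session" | "member" } | null {
  if (keys.sessionSecretKeyHex !== undefined) {
    const viaSession = tryDecrypt(ciphertext, keys.sessionSecretKeyHex);
    if (viaSession !== null) return { plaintext: viaSession, via: "session" };
  }
  const viaMember = tryDecrypt(ciphertext, keys.memberSecretKeyHex);
  return viaMember === null ? null : { plaintext: viaMember, via: "member" };
}
```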
Verified end-to-end with two registered sessions sending in both
directions — inbox.db row count went 0 → 2.
Files: apps/cli/src/daemon/session-broker.ts,
apps/cli/src/daemon/run.ts. No broker change required.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
per-session presence is small and uncomplicated enough that a rollback
flag isn't load-bearing. backwards compat is already covered at the
protocol layer — older brokers reply unknown_message_type to
session_hello and the SessionBrokerClient marks itself closed for that
mesh, which is the same outcome the flag would have given. removing
the flag, the helper, and the conditional from the registry hook.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
flips CLAUDEMESH_SESSION_PRESENCE default to ON. With the broker side
already shipped (the session_hello handler from earlier in this Sprint A
wave), every claudemesh launch now gets its own long-lived broker
presence row owned by the daemon and identified by a per-launch
ephemeral keypair vouched by the member's stable key. Two sessions in
the same cwd finally see each other in peer list — the symptom users
have been hitting since 1.28.0 dropped the bridge tier.
Bumps roadmap: 1.30.0 = presence (was queued for 1.30/wizard); the
launch-wizard refactor moves to 1.31.0, setup wizard to 1.32.0, the
mesh→workspace rename to 1.33.0. Verification smoke documented in the
1.30.0 changelog entry.
Rollback: CLAUDEMESH_SESSION_PRESENCE=0 (also accepts "false"/"off").
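
A sketch of how the default-ON flag might be read. The function name is hypothetical, and trimming plus case-insensitive matching are assumptions; only the three off values and the ON default come from this message.

```typescript
// Returns true unless the variable is explicitly set to one of the off values.
function sessionPresenceEnabled(raw: string | undefined): boolean {
  if (raw === undefined) return true; // default ON as of this change
  const value = raw.trim().toLowerCase(); // assumption: tolerant matching
  return !(value === "0" || value === "false" || value === "off");
}
```

A caller would pass in the value of the CLAUDEMESH_SESSION_PRESENCE environment variable.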
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
daemon-side half of 1.30.0 per-session broker presence. behind
CLAUDEMESH_SESSION_PRESENCE=1 (default OFF this cycle so the broker
side bakes before the flag flips).
- SessionBrokerClient (apps/cli/src/daemon/session-broker.ts) — slim
WS that opens with session_hello, presence-only, no outbox drain.
- session-hello-sig.ts — signParentAttestation (12h TTL, ≤24h cap) and
signSessionHello, mirroring the broker canonical formats.
- session-registry: optional presence field on SessionInfo;
setRegistryHooks for onRegister/onDeregister callbacks. Hook errors
are caught so a failing hook can never abort a registry mutation.
- IPC POST /v1/sessions/register accepts the presence material under
body.presence (session_pubkey, session_secret_key, parent_attestation).
Older callers without it stay scoped + supported.
- run.ts wires the registry hooks: on register, opens a SessionBrokerClient
for the matching mesh; on deregister (explicit or reaper), closes it.
Shutdown closes any remaining session WSes before the IPC server.
8 new unit tests cover registry lifecycle (replace/throw/presence
roundtrip) and signature canonical-bytes verification against libsodium.
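
The guarded-hook behaviour can be sketched as below. The class, its method names, and the trimmed-down `SessionInfoSketch` are hypothetical; the real registry carries more fields and its hook signatures may differ.

```typescript
interface SessionInfoSketch {
  token: string;
  mesh: string;
}

interface RegistryHooksSketch {
  onRegister?: (info: SessionInfoSketch) => void;
  onDeregister?: (info: SessionInfoSketch) => void;
}

class SessionRegistrySketch {
  private sessions = new Map<string, SessionInfoSketch>();
  private hooks: RegistryHooksSketch = {};

  setHooks(hooks: RegistryHooksSketch): void {
    this.hooks = hooks;
  }

  register(info: SessionInfoSketch): void {
    this.sessions.set(info.token, info); // mutation happens first
    this.safely(() => this.hooks.onRegister?.(info));
  }

  deregister(token: string): boolean {
    const info = this.sessions.get(token);
    if (info === undefined) return false;
    this.sessions.delete(token);
    this.safely(() => this.hooks.onDeregister?.(info));
    return true;
  }

  size(): number {
    return this.sessions.size;
  }

  // A throwing hook is swallowed so the registry state stays consistent.
  private safely(run: () => void): void {
    try {
      run();
    } catch (_err) {
      // A real implementation would presumably log here.
    }
  }
}
```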
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
every claudemesh launch-spawned session now mints a 32-byte random
token, writes it under tmpdir (mode 0600), and registers it with the
daemon. cli invocations from inside that session inherit
CLAUDEMESH_IPC_TOKEN_FILE in env, attach the token via Authorization:
ClaudeMesh-Session <hex>, and the daemon resolves it to a SessionInfo.
server-side: every read route that filters by mesh now uses meshFromCtx —
explicit query/body wins, session default fills in when missing. write
routes follow the same pattern.
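
The header parsing and the explicit-wins rule can be sketched as follows. Both function names and their signatures are hypothetical (this is not the real meshFromCtx); the 64-character check follows from a 32-byte token rendered as hex.

```typescript
const SESSION_SCHEME = "ClaudeMesh-Session";

// Extract the hex token from an Authorization header value, or null when it is absent or malformed.
function parseSessionToken(authorization: string | undefined): string | null {
  if (!authorization) return null;
  const [scheme, token, ...rest] = authorization.trim().split(/\s+/);
  if (scheme !== SESSION_SCHEME || !token || rest.length > 0) return null;
  return /^[0-9a-f]{64}$/i.test(token) ? token : null; // 32 bytes as hex
}

// Explicit query/body mesh wins; the session's mesh fills in only when none was given.
function meshFromCtxSketch(
  explicitMesh: string | undefined,
  session: { mesh: string } | null,
): string | undefined {
  return explicitMesh ?? session?.mesh;
}
```

Returning undefined for tokenless callers with no explicit mesh leaves the pre-existing behaviour to the caller.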
cli-side: peers.ts (and other multi-mesh-iterating verbs in future)
prefers session-token mesh over all joined meshes when the user didn't
pass --mesh explicitly.
backward-compatible in both directions — tokenless callers behave
exactly as before. registry is in-memory; daemon restart loses it but
the 30s reaper handles dead pids and most callers re-register on next
launch.
verified end-to-end: peer list with token returns 4 prueba1 peers,
without token returns 3 meshes' peers (aggregate).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The 1.26.0 step that finally delivers ambient mode for multi-mesh
users. Daemon holds Map<slug, DaemonBrokerClient>; one process, one
PID per user, all your meshes online concurrently.
run.ts: claudemesh daemon up with no --mesh attaches to every joined
mesh from config. --mesh <slug> still scopes to one (legacy mode).
The daemon_started log line reports meshes: [...] instead of mesh.
drain.ts: dispatches each outbox row to the broker keyed by row.mesh
(column added in 1.25.0). Legacy rows with mesh=NULL fall back to the
only broker if there's exactly one, otherwise mark dead with a clear
error.
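
The drain routing rule can be sketched as below. The `Dispatch` type, the function name, and the error strings are hypothetical; marking a row dead when its mesh is not attached is an assumption, since this message only covers the mesh=NULL case.

```typescript
type Dispatch =
  | { kind: "send"; mesh: string }
  | { kind: "dead"; error: string };

// Decide which attached broker an outbox row goes to.
function routeOutboxRow(rowMesh: string | null, attachedMeshes: string[]): Dispatch {
  if (rowMesh !== null) {
    return attachedMeshes.includes(rowMesh)
      ? { kind: "send", mesh: rowMesh }
      : { kind: "dead", error: `mesh "${rowMesh}" is not attached` }; // assumption
  }
  // Legacy row without a mesh: only safe when exactly one broker is attached.
  const [only, ...rest] = attachedMeshes;
  if (only !== undefined && rest.length === 0) return { kind: "send", mesh: only };
  return {
    kind: "dead",
    error: `legacy row has no mesh and ${attachedMeshes.length} meshes are attached`,
  };
}
```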
ipc/server.ts:
- GET /v1/peers aggregates across all attached meshes; each peer
record gains a mesh field. ?mesh=<slug> narrows server-side.
- GET /v1/skills aggregates similarly; /v1/skills/:name walks meshes
and returns first match.
- POST /v1/send requires mesh field on multi-mesh daemons; auto-picks
on single-mesh; returns 400 with attached list if ambiguous.
- POST /v1/profile accepts optional mesh; without it, fans out to all
attached meshes (consistent presence).
CLI: trySendViaDaemon now forwards expectedMesh as the body's mesh
field (was informational, now authoritative). claudemesh send
--mesh A and --mesh B from the same shell both route to the right
broker via the same daemon process.
Verified: aggregated peer list across 3 attached meshes; cross-mesh
sends from CLI reach status=done with correct broker_message_ids.
Released as 1.26.0 on npm.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Daemon outbox now stores resolved target_spec + crypto_box ciphertext
+ nonce per row. Drain worker is a forwarder; no per-row resolution at
drain time. Outbound routing is no longer a placeholder.
Schema additions (additive, NULL allowed for legacy rows): outbox.mesh,
target_spec, nonce, ciphertext, priority. v0.9.0 rows keep draining via
the broadcast fallback so existing in-flight rows finish cleanly.
IPC /v1/send resolves the user-friendly `to` value (display name, hex prefix,
full pubkey, @group, *, #topicId) into a broker-format target_spec at
accept time. DMs encrypt via crypto_box; broadcast/topic/group base64
the plaintext. Hex prefixes (16+ chars) match against connected peers.
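
The accepted target forms can be sketched as a classifier. The `TargetKind` union and the check order are assumptions and this is not the broker's target_spec format; the 16-character minimum prefix is from this message and the 64-character full pubkey length is taken as given.

```typescript
type TargetKind =
  | { kind: "broadcast" }
  | { kind: "group"; name: string }
  | { kind: "topic"; topicId: string }
  | { kind: "pubkey"; pubkey: string }
  | { kind: "prefix"; prefix: string }
  | { kind: "name"; name: string };

// Classify a user-supplied target before it is resolved against connected peers.
function classifyTarget(to: string): TargetKind {
  if (to === "*") return { kind: "broadcast" };
  if (to.startsWith("@")) return { kind: "group", name: to.slice(1) };
  if (to.startsWith("#")) return { kind: "topic", topicId: to.slice(1) };
  if (/^[0-9a-f]{64}$/i.test(to)) return { kind: "pubkey", pubkey: to.toLowerCase() };
  if (/^[0-9a-f]{16,63}$/i.test(to)) return { kind: "prefix", prefix: to.toLowerCase() };
  return { kind: "name", name: to }; // anything else is treated as a display name
}
```

Under this sketch a hex string shorter than 16 characters falls through to the display-name case.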
CLI thin-client routing extends trySendViaDaemon pattern to peer list
and skill list/get. Three new helpers in services/bridge/daemon-route.ts.
SKILL.md gains ambient mode section: after claudemesh install, raw
claude works for the daemon's attached mesh. Launch stays as the
override path.
Spec at .artifacts/specs/2026-05-04-v2-roadmap-completion.md orders
the remaining v2.0.0 work: multi-mesh daemon (1.26), CLI-to-thin-client
(1.27), mesh-to-workspace rename (1.28), HKDF identity (2.0).
Released as 1.25.0 on npm.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>