2 Commits

Author SHA1 Message Date
Alejandro Gutiérrez
96520394ff docs(spec): session capabilities — first-class concept
Spec for the gap #4 follow-up from the 1.34.x triage. Builds on
2026-04-15-per-peer-capabilities.md (member-keyed recipient grants)
by adding a sender-side cap subset on session attestations: parent
member signs {session_pubkey, allowed_caps[], expires_at}, broker
enforces intersection of recipient grants × session caps on every
protected operation.

v2 attestation alongside v1 (different canonical prefix
"claudemesh-session-attest-v2|..." → no collision). Default when
no caps subset is declared = full member caps (today's behavior;
opt-in restriction, not breaking).

CLI surface: claudemesh launch --caps dm,read. Bonus: set_state
gate (state-write cap) ships in the same release — closes the
"any session can clobber shared keys like current-pr" footgun.

Migration: dry-run mode for one release before flipping
enforcement. Mirrors the original per-peer-capabilities rollout.

Estimate: ~1 sprint + 1 week dry-run window.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 21:59:18 +01:00
Alejandro Gutiérrez
a2a53ff355 feat(cli,broker): 1.34.14 + 1.34.15 — env-var fallback, peer list scope, kick refuses control-plane
Three follow-ups from the 1.34.x multi-session correctness train,
all backwards-compatible.

1.34.14 — stale CLAUDEMESH_CONFIG_DIR falls back. The launch flow
exposes CLAUDEMESH_CONFIG_DIR=<tmpdir> to its spawned claude; if a
later claudemesh invocation inherited that env (Bash tool inside
Claude Code, tmux update-environment, exported var), the inherited
path pointed at a tmpdir that no longer existed and readConfig()
silently returned empty. paths.ts now memoizes resolution: env unset
→ default; env points at a real dir → trust it; env set but dir gone
→ TTY-only stderr warning with shell-specific unset hint, fall back
to ~/.claudemesh.

1.34.15 — peer list --mesh actually scopes. peers.ts and launch.ts
were calling tryListPeersViaDaemon() with no argument; the daemon's
?mesh= filter (server-side, since 1.26.0) was already correct, the
CLI just wasn't passing the slug. Forwarding fixed in both sites;
send.ts cross-mesh hex-prefix resolution intentionally untouched.

1.34.15 — kick refuses no-op kicks on control-plane. Pre-1.34.15
kicking a daemon's member-WS just closed the socket and triggered
auto-reconnect — a no-op with a misleading "session ended" message.
Broker now skips peers where peerRole === "control-plane" and
surfaces them in a new additive ack field skipped_control_plane;
the CLI reads it and prints a clearer hint pointing at ban / daemon
down. Soft disconnect verb keeps old behavior. PeerConn gains a
peerRole slot populated at both connections.set sites.

Tests: 4 new for paths-stale-env, 5 for kick-control-plane-skip.
CLI 87/87 green; broker 55/55 unit green (integration tests
pre-existing infra failure on this machine).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 21:59:06 +01:00
10 changed files with 664 additions and 10 deletions


@@ -0,0 +1,288 @@
# Session capabilities — first-class concept
**Status:** spec, queued behind v0.3.0 topic-encryption work.
**Owner:** alezmad
**Author:** Claude (Sprint B follow-up, 2026-05-04)
**Related:** `2026-04-15-per-peer-capabilities.md` (existing per-peer
caps system, member-keyed), `2026-05-04-per-session-presence.md`
(per-launch session presence — what we're now restricting).
## Problem
Per-peer capability grants (`apps/broker/src/index.ts:2178+, 2309+`)
are keyed on the sender's **stable member pubkey**. The grant model
gives the recipient fine-grained control: "alice can DM me",
"bob can read state but not broadcast", etc.
But: as of v1.30.0 (`per-session-presence`), every `claudemesh
launch` mints a per-launch ephemeral keypair with a parent attestation
binding it to the member identity. The launched session inherits **all**
the member's capabilities transitively, because cap enforcement always
falls through to the member key.
Concretely:
- Member `alice` is in mesh `flexicar`, granted `dm + state-read +
state-write` by everyone.
- Alice launches a session with `claudemesh launch` to do an automated
task — say, run a Claude Code agent that iterates over PRs.
- That session has full member privileges. It can DM peers, write
shared state keys (e.g. clobber `current-pr`), grant new caps, ban
members, etc. — none of which the user wanted to delegate.
There is no way to express "this session can DM peers but cannot
deploy services or grant caps." The parent attestation is a binary
existence proof — "this session was vouched by a member" — with no
capability subset.
Plus an adjacent footgun: `set_state` (`apps/broker/src/index.ts:2949`)
has **no cap check at all**. Anyone in the mesh can write any key. The
spec at `2026-04-15-per-peer-capabilities.md` lists `state-write` as a
planned cap but it was never wired into the broker. Shared keys like
`current-pr` are write-anyone today.
## Goal
A launched session can be issued **a capability subset** of its
parent member, signed by the parent at launch time, and the broker
enforces the **intersection** of recipient grants × session caps on
every protected operation.
## Non-goals
- Changing the existing per-peer cap model. Member-keyed grants stay
authoritative for "who is allowed to talk to me."
- Cross-machine session caps (waiting on 2.0.0 HKDF identity).
- Per-tool granularity inside the Claude Code MCP surface — this
spec only covers the broker-enforceable verbs (dm, broadcast,
state-read, state-write, grant, kick, ban, profile-write,
service-deploy).
- Delegation: a session cannot re-vouch a sub-session with its own
cap subset. Only members can attest sessions. (Could be lifted in
a future spec; today's launch flow doesn't need it.)
## Design
### Capability vocabulary
Existing (today, member-level):
| Capability | Effect when GRANTED on a recipient → sender pair |
|---------------|---------------------------------------------------|
| `read` | Sender appears in recipient's `list_peers` |
| `dm` | Sender can DM recipient |
| `broadcast` | Sender's broadcasts reach recipient |
| `state-read` | Sender can read shared state |
| `state-write` | (planned) Sender can write shared state |
| `file-read` | Sender can fetch files recipient shared |
New (session-level — cap subset on the attestation):
These are the **verbs the session is allowed to invoke**, NOT what
peers can do TO it. A session attestation declaring `["dm", "read"]`
means the session can SEND dm/read-list operations; it cannot
broadcast, write state, grant, etc.
| Session cap | Gates which broker operations |
|-------------------|------------------------------------------------|
| `dm` | `send` with single recipient |
| `broadcast` | `send` with `*`, `@group`, `#topic` |
| `state-read` | `get_state`, `list_state` |
| `state-write` | `set_state` |
| `grant` | `grant`, `revoke`, `block` |
| `kick` | `kick`, `disconnect` |
| `ban` | `ban`, `unban` |
| `profile-write` | `set_profile`, `set_summary`, `set_status` |
| `service-deploy` | `mesh_service_register`, `_unregister` |
The default cap set when no subset is declared: the **full member
set** (today's behavior — opt-in restriction, not breaking).
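As a rough illustration of the `dm` / `broadcast` split in the table above, here is a minimal sketch. The helper name `capForSendTarget` and the way a target string is classified are assumptions made for this example; the broker's actual routing code is not shown in this spec.

```typescript
// Hypothetical sketch: the helper name and target parsing are assumptions.
type SendCap = "dm" | "broadcast";

function capForSendTarget(target: string): SendCap {
  // `*`, `@group` and `#topic` fan out to several peers, so they need `broadcast`.
  const fansOut =
    target === "*" || target.startsWith("@") || target.startsWith("#");
  // Anything else is treated as a single recipient, which needs `dm`.
  return fansOut ? "broadcast" : "dm";
}
```

A session launched with `--caps dm` would then pass the check for a single-recipient `send` but fail it for `send` to `*`.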
### Attestation v2
Existing v1 (`apps/cli/src/services/broker/session-hello-sig.ts`):
```
canonical = `claudemesh-session-attest|<parent>|<session>|<expires>`
```
New v2 (additive — broker accepts both):
```
canonical = `claudemesh-session-attest-v2|<parent>|<session>|<expires>|<sorted-caps-csv>`
```
Where `<sorted-caps-csv>` is the lower-cased, comma-joined,
ASCII-sorted cap list. Omitting `allowed_caps` entirely = full member
caps (default, back-compat); an explicitly empty list means no caps
at all (see the test plan).
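The v2 canonical string could be assembled roughly as follows. This is a sketch, not the CLI's actual builder: the function names are hypothetical, and the pubkeys and expiry are treated as an opaque string and number because the spec does not state their encoding.

```typescript
// Hypothetical helpers; field encodings are assumptions.
function sortedCapsCsv(caps: readonly string[]): string {
  // Lower-case first, then sort. The default sort compares UTF-16 code
  // units, which is plain ASCII order for ASCII-only cap names.
  return caps.map((c) => c.toLowerCase()).sort().join(",");
}

function canonicalV2(
  parentPubkey: string,
  sessionPubkey: string,
  expiresAt: number,
  caps: readonly string[],
): string {
  return [
    "claudemesh-session-attest-v2",
    parentPubkey,
    sessionPubkey,
    String(expiresAt),
    sortedCapsCsv(caps),
  ].join("|");
}
```

Sorting before signing means `--caps read,dm` and `--caps dm,read` produce the same canonical string, so signer and verifier cannot disagree over ordering.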
**Wire shape additions on `session_hello`:**
```ts
{
  type: "session_hello",
  ...existing fields...,
  parentAttestation: {
    sessionPubkey,
    parentMemberPubkey,
    expiresAt,
    signature,
    // NEW:
    allowed_caps?: string[], // omitted = full member set
    version?: 2,             // omitted = v1
  },
}
```
The broker version-detects: `version === 2` → verify v2 canonical
including `allowed_caps`. Default behavior is unchanged for clients
that don't pass it.
### Enforcement
Add `allowed_caps: string[] | null` to the in-memory `PeerConn`
shape (`apps/broker/src/index.ts:131`). Populated from
`handleSessionHello` (the v2 attestation supplies it) and from
`handleHello` (control-plane / member connection — set to `null`,
meaning "full member caps").
**Effective cap check** for a sending peer needing `cap`:
```ts
function senderHasCap(conn: PeerConn, cap: string): boolean {
  if (conn.allowed_caps === null) return true; // member-level, no subset
  return conn.allowed_caps.includes(cap);
}
```
Wire this into every broker operation in the table above. The
existing per-peer recipient-cap check at `2178+, 2309+` stays —
session caps gate the **sender side**, recipient grants gate the
**receive side**, and both must allow:
```
allowed = senderHasCap(conn, capNeeded) && recipientGrants[sender][capNeeded]
```
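A self-contained sketch of that intersection, under stated assumptions: `PeerConnLike` is a trimmed stand-in for the broker's `PeerConn`, and the `RecipientGrants` map (sender member pubkey to granted caps) is a hypothetical shape chosen for the example, not the broker's real grant storage.

```typescript
// Trimmed stand-in for the broker's PeerConn (assumption).
interface PeerConnLike {
  memberPubkey: string;
  allowed_caps: string[] | null; // null = member-level, no subset
}

// Hypothetical grant table: sender member pubkey -> caps the recipient granted.
type RecipientGrants = Map<string, Set<string>>;

function senderHasCap(conn: PeerConnLike, cap: string): boolean {
  if (conn.allowed_caps === null) return true;
  return conn.allowed_caps.includes(cap);
}

function isAllowed(
  conn: PeerConnLike,
  grants: RecipientGrants,
  cap: string,
): boolean {
  // Receive side: did the recipient grant this cap to the sender's member key?
  const granted = grants.get(conn.memberPubkey)?.has(cap) ?? false;
  // Send side: does the session's own subset include the cap?
  return senderHasCap(conn, cap) && granted;
}
```

Either side can veto: a session restricted to `["read"]` cannot DM even when the recipient granted `dm` to the parent member, and a session holding `broadcast` still cannot broadcast to a recipient who never granted it.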
### `set_state` gate (bonus, ship together)
Today: no cap check. After this spec: `set_state` requires
`state-write` on the sender side. Migration: existing members
default to having `state-write` in their member caps (no recipient
grant model for state-write — it's a sender-side gate only, mesh-
wide). New attestations can omit it to forbid the session.
The recipient-side analog (per-peer state-write grants) is left for
a future spec — today the value of guarding state-write is
session-level (avoid an automated session clobbering shared keys),
not peer-level.
### CLI surface
```
claudemesh launch --caps dm,read # tight: read-only chat agent
claudemesh launch --caps dm,broadcast # send-only, no state writes
claudemesh launch # default: full member caps
```
`claudemesh launch --caps ?` prints the table above with descriptions.
`claudemesh peer list --json` includes `allowed_caps` per row when
present (`null` = full member). Lets users audit what their running
sessions can actually do.
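One way the `--caps` value might be normalized before it is placed in the attestation is sketched below. `parseCapsFlag` is a hypothetical name, and the de-duplication step is an assumption of this example rather than something the spec requires.

```typescript
// Hypothetical CLI helper; not part of the existing claudemesh code.
function parseCapsFlag(raw: string | undefined): string[] | undefined {
  // Flag absent: leave `allowed_caps` out entirely (full member caps).
  if (raw === undefined) return undefined;
  const caps = raw
    .split(",")
    .map((c) => c.trim().toLowerCase())
    .filter((c) => c.length > 0);
  // De-duplicate and sort so the list matches the canonical ordering.
  return Array.from(new Set(caps)).sort();
}
```

Returning `undefined` rather than `[]` for a missing flag keeps "no restriction" distinct from "explicitly no caps".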
### Migration plan (mirrors `2026-04-15-per-peer-capabilities.md` §"Migration plan")
1. **Broker schema additive** — `PeerConn.allowed_caps` in-memory
only; no DB column. Reload-on-reconnect is fine because the
attestation is re-sent on every WS open (it's the proof of
identity).
2. **CLI ships v2 attestation alongside v1.** New `--caps` flag
defaults to omitted (= v1 attestation, full caps). Older
brokers ignore the new fields entirely.
3. **Broker accepts v2.** When `allowed_caps` arrives, store it.
No enforcement yet — log denied operations as `cap_check_dryrun`
metric counter, still allow them through.
4. **Dry-run release.** Ship one CLI + broker release that emits
the metric but doesn't enforce. Watch for false positives in
real meshes for ≥ 1 week.
5. **Flip enforcement on.** Broker rejects operations failing the
cap check with `forbidden: missing session capability "<cap>"`.
Default ("no caps declared = full member") keeps existing
sessions unaffected.
6. **`set_state` gate** ships in step 5 alongside the rest. Default
member caps include `state-write`, so flipping it on doesn't
break existing flows. Only sessions that explicitly omit
`state-write` from `--caps` lose write access.
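Steps 3 to 5 amount to one check with a mode switch. The sketch below assumes a module-level counter object and a `Mode` type, both hypothetical; only the metric name `cap_check_dryrun` and the error string come from the plan above.

```typescript
type Mode = "dry-run" | "enforce";

interface Decision {
  allowed: boolean;
  error?: string;
}

// Hypothetical in-process metric store for the example.
const counters = { cap_check_dryrun: 0 };

function checkSessionCap(
  allowedCaps: string[] | null,
  cap: string,
  mode: Mode,
): Decision {
  const ok = allowedCaps === null || allowedCaps.includes(cap);
  if (ok) return { allowed: true };
  if (mode === "dry-run") {
    // Count what would have been denied, but let the operation through.
    counters.cap_check_dryrun += 1;
    return { allowed: true };
  }
  return {
    allowed: false,
    error: `forbidden: missing session capability "${cap}"`,
  };
}
```

Flipping enforcement on is then a change of `mode`, with no change to the check itself.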
### Crypto notes
- v2 attestation re-uses `crypto_sign_detached` over the new
canonical string; same parent member secret key, same TTL caps
(≤24 h), same `expiresAt` semantics.
- v1 signatures are NOT v2 signatures — collision is impossible
because the canonical strings have different prefixes
(`claudemesh-session-attest` vs `claudemesh-session-attest-v2`).
Domain separation is intrinsic.
- Like the existing per-peer cap system: caps are server-enforced
metadata, not capability tokens. A malicious broker can ignore
them. This is about UX trust + footgun prevention, not protocol-
level security.
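The domain-separation point can be checked with a small experiment. This sketch uses Node's built-in Ed25519 support in `node:crypto` as a stand-in for libsodium's `crypto_sign_detached`; the key pair and the field values are throwaway placeholders.

```typescript
import { generateKeyPairSync, sign, verify } from "node:crypto";

// Throwaway key pair standing in for the parent member key (assumption).
const { publicKey, privateKey } = generateKeyPairSync("ed25519");

const v1 = "claudemesh-session-attest|parent|session|1700000000";
const v2 = "claudemesh-session-attest-v2|parent|session|1700000000|dm,read";

// Ed25519 takes no separate digest algorithm, hence the `null`.
const sigV2 = sign(null, Buffer.from(v2), privateKey);

// The signature verifies against the v2 string it was made over...
const v2Ok = verify(null, Buffer.from(v2), publicKey, sigV2);
// ...but not against the v1 canonical string for the same fields.
const crossOk = verify(null, Buffer.from(v1), publicKey, sigV2);
```

Because the first `|`-delimited field differs between versions, no v1 canonical string can equal a v2 one, so a signature cannot be replayed across versions.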
## Open questions
1. **Should the session attestation also bind to a fingerprint of
the launched binary / Claude version?** Would let a member say
"this session is constrained to Claude Code v1.34.15" so a
compromised launched-binary doesn't get reused. Probably no — too
much friction for the threat model.
2. **What's the right default for `claudemesh launch` going forward?**
Once enforcement ships, do we change the default `--caps` from
"full member" to "dm + read + state-read"? Tighter but breaks
existing automation that writes state. Probably worth a one-
release deprecation warning ("your session will lose state-write
in v2.0.0 unless you pass --caps state-write") and then flip in
v2.0.0.
3. **Does `--caps` belong in `~/.claudemesh/config.json` per-mesh
defaults too?** A user who always launches read-only agents
wants `caps: ["dm", "read"]` as a personal default. Easy add;
defer until users ask for it.
4. **Per-tool MCP cap surface?** Out of scope here, but: a `claudemesh
launch --tools peer:read,memory:write` would be a finer cut than
broker-verb caps. The broker can't enforce that — it'd live in the
MCP wrapper / Claude Code's allowedTools. Different layer.
## Test plan
- Pure-logic tests on `senderHasCap` (member-level → always true,
empty caps → always false, declared caps → exact match).
- Broker integration: launch a session with `--caps dm`, attempt
`set_state` → expect `forbidden: missing session capability
"state-write"`.
- v1 attestation still accepted, no `allowed_caps` set, all caps
permitted (back-compat).
- v2 attestation with empty `allowed_caps` array → broker treats
as "explicitly empty, no caps allowed" (NOT "full member"). The
full-member default is "field omitted entirely". Test both.
- Dry-run mode: cap fail increments the counter but the operation
proceeds. Smoke-test before flipping enforcement.
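The omitted-versus-empty distinction in the test plan could be pinned down as below. `AttestationLike`, `effectiveAllowedCaps` and `permits` are hypothetical names for this sketch; treating a v1 attestation as "no subset" follows the back-compat bullet above.

```typescript
// Trimmed stand-in for the parentAttestation wire shape (assumption).
interface AttestationLike {
  version?: 2;
  allowed_caps?: string[];
}

function effectiveAllowedCaps(att: AttestationLike): string[] | null {
  // v1, or v2 with the field omitted: no subset, full member caps.
  if (att.version !== 2 || att.allowed_caps === undefined) return null;
  // Field present, even if empty: enforce exactly this list.
  return att.allowed_caps;
}

function permits(att: AttestationLike, cap: string): boolean {
  const caps = effectiveAllowedCaps(att);
  return caps === null || caps.includes(cap);
}
```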
## Estimate
- Spec review + open-question resolution: 1–2 days.
- Broker change (PeerConn field, attestation v2 accept, per-verb
enforcement, dry-run mode): 2–3 days.
- CLI change (`--caps` flag, attestation builder, peer list
surface): 1 day.
- Tests: 1 day.
- Dry-run release window: ≥ 1 week.
Total: ~1 sprint of focused work, plus a dry-run window.


@@ -156,6 +156,11 @@ interface PeerConn {
     bio?: string;
     capabilities?: string[];
   };
+  /** v2 agentic-comms presence taxonomy. Mirrors the value passed to
+   * `recordPresence`. Used by the kick handler to refuse no-op kicks
+   * on long-lived control-plane connections (daemon, dashboard) that
+   * would just auto-reconnect. */
+  peerRole: "control-plane" | "session" | "service";
 }
 const connections = new Map<string, PeerConn>();
@@ -1797,6 +1802,7 @@ async function handleHello(
     groups: initialGroups,
     visible: saved?.visible ?? true,
     profile: saved?.profile ?? {},
+    peerRole: "control-plane",
   });
   incMeshCount(hello.meshId);
   void audit(hello.meshId, "peer_joined", member.id, effectiveDisplayName, {
@@ -2022,6 +2028,7 @@ async function handleSessionHello(
     groups: initialGroups,
     visible: true,
     profile: {},
+    peerRole: "session",
   });
   incMeshCount(hello.meshId);
   void audit(hello.meshId, "peer_joined", member.id, effectiveDisplayName, {
@@ -4645,11 +4652,30 @@ function handleConnection(ws: WebSocket): void {
   }
   const affected: string[] = [];
+  // 1.34.15 (gap #3a): kick was a no-op against long-lived
+  // control-plane connections (daemon, dashboard) — closing
+  // their WS just triggered the auto-reconnect loop, the
+  // kicker's CLI rendered "Their Claude Code session ended"
+  // (which was misleading), and the user-visible state was
+  // unchanged seconds later. We now refuse to close control-
+  // plane WSes and surface the skipped peers in a new
+  // additive ack field. Pre-1.34.15 CLI clients only read
+  // `kicked`/`affected`, so this stays back-compat.
+  //
+  // For `kick`-only: the soft `disconnect` verb still closes
+  // control-plane WSes intentionally — that's what users want
+  // when they're nudging a peer for it to re-authenticate.
+  const skippedControlPlane: string[] = [];
+  const skipControlPlane = isKick;
   const now = Date.now();
   if (km.all) {
     for (const [pid, peer] of connections) {
       if (peer.meshId !== conn.meshId || pid === presenceId) continue;
+      if (skipControlPlane && peer.peerRole === "control-plane") {
+        skippedControlPlane.push(peer.displayName || pid);
+        continue;
+      }
       try { peer.ws.close(closeCode, closeReason); } catch {}
       connections.delete(pid);
       void disconnectPresence(pid);
@@ -4661,6 +4687,10 @@ function handleConnection(ws: WebSocket): void {
       if (peer.meshId !== conn.meshId || pid === presenceId) continue;
       const [pres] = await db.select({ lastPingAt: presence.lastPingAt }).from(presence).where(eq(presence.id, pid)).limit(1);
       if (pres && pres.lastPingAt && pres.lastPingAt.getTime() < cutoff) {
+        if (skipControlPlane && peer.peerRole === "control-plane") {
+          skippedControlPlane.push(peer.displayName || pid);
+          continue;
+        }
         try { peer.ws.close(closeCode, `${closeReason}_stale`); } catch {}
         connections.delete(pid);
         void disconnectPresence(pid);
@@ -4671,6 +4701,10 @@ function handleConnection(ws: WebSocket): void {
     for (const [pid, peer] of connections) {
       if (peer.meshId !== conn.meshId) continue;
       if (peer.displayName === km.target || peer.memberPubkey === km.target || peer.memberPubkey.startsWith(km.target)) {
+        if (skipControlPlane && peer.peerRole === "control-plane") {
+          skippedControlPlane.push(peer.displayName || pid);
+          continue;
+        }
         try { peer.ws.close(closeCode, closeReason); } catch {}
         connections.delete(pid);
         void disconnectPresence(pid);
@@ -4679,8 +4713,20 @@ function handleConnection(ws: WebSocket): void {
     }
   }
-  conn.ws.send(JSON.stringify({ type: ackType, kicked: affected, affected, _reqId: km._reqId }));
-  log.info(`ws ${closeReason}`, { presence_id: presenceId, count: affected.length, target: km.target ?? km.stale ?? "all" });
+  conn.ws.send(JSON.stringify({
+    type: ackType,
+    kicked: affected,
+    affected,
+    // Additive — older CLI clients ignore this field.
+    ...(skippedControlPlane.length > 0 ? { skipped_control_plane: skippedControlPlane } : {}),
+    _reqId: km._reqId,
+  }));
+  log.info(`ws ${closeReason}`, {
+    presence_id: presenceId,
+    count: affected.length,
+    target: km.target ?? km.stale ?? "all",
+    skipped_control_plane: skippedControlPlane.length,
+  });
   break;
 }


@@ -0,0 +1,47 @@
/**
 * Kick control-plane skip: 1.34.15 (gap #3a) refuses to close
 * long-lived control-plane connections (claudemesh daemon, dashboard)
 * via `kick`, because they auto-reconnect within seconds and the verb
 * was effectively a no-op. The soft `disconnect` verb keeps the old
 * behavior so users can still nudge a control-plane peer to
 * re-authenticate.
 *
 * Pure-logic test — mirrors the branch inside handleSend's kick case
 * without spinning up a broker. Same pattern as
 * grants-enforcement.test.ts.
 */
import { describe, expect, test } from "vitest";

type PeerRole = "control-plane" | "session" | "service";

/** Mirrors the predicate inserted into the kick handler. */
function shouldSkipKick(args: {
  verb: "kick" | "disconnect";
  peerRole: PeerRole;
}): boolean {
  const skipControlPlane = args.verb === "kick";
  return skipControlPlane && args.peerRole === "control-plane";
}

describe("kick control-plane skip (gap #3a)", () => {
  test("kick on control-plane → skipped (would auto-reconnect)", () => {
    expect(shouldSkipKick({ verb: "kick", peerRole: "control-plane" })).toBe(true);
  });

  test("kick on session → not skipped (closes user session)", () => {
    expect(shouldSkipKick({ verb: "kick", peerRole: "session" })).toBe(false);
  });

  test("kick on service → not skipped", () => {
    expect(shouldSkipKick({ verb: "kick", peerRole: "service" })).toBe(false);
  });

  test("disconnect on control-plane → not skipped (intentional nudge)", () => {
    expect(shouldSkipKick({ verb: "disconnect", peerRole: "control-plane" })).toBe(false);
  });

  test("disconnect on session → not skipped", () => {
    expect(shouldSkipKick({ verb: "disconnect", peerRole: "session" })).toBe(false);
  });
});


@@ -1,5 +1,110 @@
# Changelog
## 1.34.15 (2026-05-04) — `peer list --mesh` actually scopes + `kick` refuses control-plane
Two follow-ups from the 1.34.x train, both backwards-compatible.
### `peer list --mesh <slug>` no longer aggregates across meshes
`apps/cli/src/commands/peers.ts:140` was calling
`tryListPeersViaDaemon()` with no argument, so a multi-mesh daemon
returned peers from EVERY attached mesh and the renderer printed
"peers on flexicar" with cross-mesh rows mixed in. The daemon's
`/v1/peers?mesh=<slug>` filter (server-side, since 1.26.0) was
already correctly scoping when the slug was passed; the CLI just
wasn't passing it. Fixed.
`apps/cli/src/commands/launch.ts:407` (the `printBrokerWelcome` peer
count in the launch banner) had the same bug. The "N peers online"
line in the welcome now shows the count for the launched mesh only.
`apps/cli/src/commands/send.ts` cross-mesh hex-prefix resolution is
intentionally cross-mesh (the user is targeting by hex without
specifying a mesh) and was deliberately left as-is.
### `claudemesh kick` refuses no-op kicks on control-plane connections
Pre-1.34.15, kicking a daemon's member-WS or a dashboard connection
just closed the socket — the daemon's WS-lifecycle reconnect loop
brought it back within seconds, the kicker's CLI rendered "Their
Claude Code session ended" (which was misleading), and the user-
visible state was unchanged. The verb was effectively a no-op, but
the user had to learn that the hard way.
The broker's kick handler (`apps/broker/src/index.ts:4628+`) now
skips peers where `peerRole === "control-plane"` and surfaces the
skipped peers in a new additive ack field `skipped_control_plane`.
The soft `disconnect` verb keeps the old behavior — useful when
intentionally nudging a control-plane peer to re-authenticate.
The CLI (`apps/cli/src/commands/kick.ts`) reads the new field and
prints a clearer message: refused peers are listed, with the hint
that `claudemesh ban <peer>` is the right tool to remove a member,
or `claudemesh daemon down` to take a daemon offline locally.
`apps/broker/src/index.ts` adds `peerRole` to the in-memory
`PeerConn` shape, populated from both connection paths
(member-keyed `hello` → `"control-plane"`, per-launch
`session_hello` → `"session"`). The DB-side role taxonomy is
unchanged.
### Back-compat
- Older CLI clients ignore the new `skipped_control_plane` ack
field; against a control-plane target they receive an empty
`kicked` list and print "No peers matched."
- Older brokers don't emit the field at all; newer CLI handles
the absence (the new branch is only reached when the field is
present and non-empty).
- The new `peerRole` slot on `PeerConn` is filled at every
`connections.set` callsite, so older code paths never read
`undefined`.
### Tests
- `apps/broker/tests/kick-control-plane-skip.test.ts` — 5 cases
covering the kick/disconnect × control-plane/session/service
truth table.
## 1.34.14 (2026-05-04) — stale `CLAUDEMESH_CONFIG_DIR` falls back
`claudemesh launch` exports `CLAUDEMESH_CONFIG_DIR=<tmpdir>` to its
spawned `claude` so the per-session mesh selection is isolated from
`~/.claudemesh/config.json`. The tmpdir is `rmSync`'d on launch exit
via the `process.on('exit', cleanup)` handler.
Footgun: if a later `claudemesh` invocation INHERITED that env — a
Bash tool call inside Claude Code, a tmux pane that captured the env
via `update-environment`, an exported var the user forgot to clear —
the inherited path pointed at a tmpdir that no longer existed.
Pre-1.34.14 we silently used the dead path, `readConfig()` came back
empty, and the user saw "No meshes joined" from an otherwise-working
install. Fish users hit it harder because fish has no `unset`;
they had to discover `set -e CLAUDEMESH_CONFIG_DIR`.
`apps/cli/src/constants/paths.ts` now resolves `CONFIG_DIR` once via
a memoized `resolveConfigDir()`:
1. No env var → `~/.claudemesh` (default, unchanged).
2. Env points at an existing directory → trust it. The
legitimate per-session-launch case is byte-identical to before.
3. Env set but stale (dir gone) → warn once on stderr (TTY-only —
CI / MCP boot / piped scripts stay quiet) with a shell-specific
unset hint, then fall back to `~/.claudemesh`.
The check is on the directory's existence, not on `config.json`,
because a fresh-launch tmpdir legitimately has no `config.json` until
the first write. The stale signature we catch is the outer launch's
`rmSync(tmpDir, {recursive: true})` cleanup, which removes the
directory entirely.
The "no meshes" check from the original triage was deliberately NOT
adopted: a launched session that legitimately joins one mesh would
hit it.
No back-compat surface affected. No other files changed. `_resetPathsForTest()`
exported for unit tests.
## 1.34.13 (2026-05-04) — MCP forwards session token on /v1/events
The 1.34.10 SSE demux + 1.34.11 inbox per-recipient column were both


@@ -1,6 +1,6 @@
 {
   "name": "claudemesh-cli",
-  "version": "1.34.13",
+  "version": "1.34.15",
   "description": "Peer mesh for Claude Code sessions — CLI + MCP server.",
   "keywords": [
     "claude-code",


@@ -76,12 +76,32 @@ export async function runKick(
   if ("error" in built) { render.err(String(built.error)); return EXIT.INVALID_ARGS; }
   return await withMesh({ meshSlug }, async (client) => {
-    const result = await client.sendAndWait(built as Record<string, unknown>) as { affected?: string[]; kicked?: string[] };
+    const result = await client.sendAndWait(built as Record<string, unknown>) as {
+      affected?: string[];
+      kicked?: string[];
+      // 1.34.15: broker refuses to kick control-plane WSes (they'd
+      // just auto-reconnect). Older brokers don't emit this field.
+      skipped_control_plane?: string[];
+    };
     const peers = result?.affected ?? result?.kicked ?? [];
-    if (peers.length === 0) render.info("No peers matched.");
-    else {
+    const skipped = result?.skipped_control_plane ?? [];
+
+    if (peers.length === 0 && skipped.length === 0) {
+      render.info("No peers matched.");
+    } else if (peers.length === 0 && skipped.length > 0) {
+      render.warn(
+        `${skipped.length} match(es) refused: ${skipped.join(", ")} — control-plane connections (daemon / dashboard) auto-reconnect, so kick is a no-op.`,
+        "To take a daemon offline locally, run `claudemesh daemon down` on that machine. To remove a member from the mesh, use `claudemesh ban <peer>`.",
+      );
+    } else {
       render.ok(`Kicked ${peers.length} peer(s): ${peers.join(", ")}`);
       render.hint("Their Claude Code session ended. They can rejoin anytime by running `claudemesh`.");
+      if (skipped.length > 0) {
+        render.warn(
+          `(also refused ${skipped.length} control-plane connection(s): ${skipped.join(", ")})`,
+          "Daemon / dashboard connections auto-reconnect; kick is a no-op against them. Use `claudemesh ban <peer>` to remove a member entirely.",
+        );
+      }
     }
     return EXIT.SUCCESS;
   });


@@ -400,11 +400,13 @@ async function printBrokerWelcome(meshSlug: string): Promise<void> {
     }
   } catch { /* daemon unreachable — not fatal */ }
-  // Peer count (best-effort).
+  // Peer count (best-effort). 1.34.15: scope to the launched mesh so
+  // multi-mesh daemons don't inflate the welcome banner with peers
+  // from other meshes the user didn't just attach to.
   let peerCount = -1;
   try {
     const { tryListPeersViaDaemon } = await import("~/services/bridge/daemon-route.js");
-    const peers = (await tryListPeersViaDaemon()) ?? [];
+    const peers = (await tryListPeersViaDaemon(meshSlug)) ?? [];
     peerCount = peers.filter((p) =>
       (p as { channel?: string }).channel !== "claudemesh-daemon",
     ).length;


@@ -135,9 +135,17 @@ async function listPeersForMesh(slug: string): Promise<PeerRecord[]> {
   // lifecycle helper inside tryListPeersViaDaemon auto-spawns the
   // daemon if it's down and probes it for liveness — no separate bridge
   // tier is needed any more (1.28.0).
+  //
+  // 1.34.15: forward `slug` to the daemon as `?mesh=<slug>` so the
+  // server-side aggregator narrows to the requested mesh. Pre-1.34.15
+  // we called this with no argument, so a multi-mesh daemon returned
+  // peers from every attached mesh and the renderer printed "peers on
+  // flexicar" with cross-mesh rows mixed in. The daemon's
+  // `meshFromCtx` already does the right scoping when the slug is
+  // passed; the CLI just wasn't passing it.
   try {
     const { tryListPeersViaDaemon } = await import("~/services/bridge/daemon-route.js");
-    const dr = await tryListPeersViaDaemon();
+    const dr = await tryListPeersViaDaemon(slug);
     if (dr !== null) {
       return dr.map((p) => annotateSelf(p as PeerRecord, selfMemberPubkey, selfSessionPubkey));
     }


@@ -1,10 +1,82 @@
import { existsSync } from "node:fs";
import { homedir } from "node:os";
import { join } from "node:path";
const home = homedir();
const DEFAULT_CONFIG_DIR = join(home, ".claudemesh");
/**
* Resolve `CONFIG_DIR` once, with stale-env detection.
*
* `claudemesh launch` exposes `CLAUDEMESH_CONFIG_DIR=<tmpdir>` to its
* spawned `claude` so the per-session mesh selection is isolated from
* `~/.claudemesh/config.json`. The tmpdir is rmSync'd on launch exit.
*
* Footgun: if a `claudemesh` invocation INHERITS that env from an
* already-launched (or previously-launched) session — e.g. a Bash tool
* call inside Claude Code, or a tmux pane that captured the env via
* `update-environment` — the inherited path may point at a tmpdir that
* no longer exists. Pre-1.34.14 we silently used the dead path,
* `readConfig()` came back empty, and the user saw "No meshes joined"
* from an otherwise-working install.
*
* Resolution rules:
* 1. No env var → `~/.claudemesh` (default).
* 2. Env points at an existing dir → trust it, even if
* `config.json` has not been written yet (the legitimate
* per-session-launch case).
* 3. Env set but stale (dir missing) → warn once on stderr
* (TTY-only) and fall back to `~/.claudemesh`.
*
* Memoized: resolves once on first access. Mid-process env mutations
* are intentionally ignored — paths must stay stable across one CLI
* invocation.
*/
let _resolvedConfigDir: string | null = null;
let _warnedStaleEnv = false;
function resolveConfigDir(): string {
if (_resolvedConfigDir !== null) return _resolvedConfigDir;
const envDir = process.env.CLAUDEMESH_CONFIG_DIR;
if (!envDir) {
_resolvedConfigDir = DEFAULT_CONFIG_DIR;
return DEFAULT_CONFIG_DIR;
}
// Trust the env when it resolves to a real directory. We check
// the DIR (not `config.json`) because the legitimate "fresh launch
// before any write" case has the dir but no config.json yet.
// The stale signature we want to catch is `rmSync(tmpDir,
// {recursive: true})` from the outer launch's cleanup — that
// removes the directory entirely, so a missing dir is the
// unambiguous "stale" signal.
if (existsSync(envDir)) {
_resolvedConfigDir = envDir;
return envDir;
}
// Stale: env set but the dir is gone. Most likely the outer
// launch's cleanup ran and we inherited its (now-dead) tmpdir
// path. Fall back to default and warn the user once on stderr —
// only when attached to a TTY, so non-interactive callers (CI,
// MCP boot, scripts piping stdout) stay quiet.
if (!_warnedStaleEnv && process.stderr.isTTY) {
_warnedStaleEnv = true;
const unsetHint =
process.env.SHELL?.endsWith("fish")
? "set -e CLAUDEMESH_CONFIG_DIR CLAUDEMESH_IPC_TOKEN_FILE"
: "unset CLAUDEMESH_CONFIG_DIR CLAUDEMESH_IPC_TOKEN_FILE";
process.stderr.write(
`claudemesh: ignoring stale CLAUDEMESH_CONFIG_DIR=${envDir} (directory does not exist); using ${DEFAULT_CONFIG_DIR}.\n`
+ ` Hint: this is usually a leftover env from a previous \`claudemesh launch\`. Clean it with:\n`
+ ` ${unsetHint}\n`,
);
}
_resolvedConfigDir = DEFAULT_CONFIG_DIR;
return DEFAULT_CONFIG_DIR;
}
export const PATHS = {
get CONFIG_DIR() {
return resolveConfigDir();
},
get CONFIG_FILE() {
return join(this.CONFIG_DIR, "config.json");
},
@@ -20,3 +92,12 @@ export const PATHS = {
CLAUDE_JSON: join(home, ".claude.json"),
CLAUDE_SETTINGS: join(home, ".claude", "settings.json"),
} as const;
/**
* Test-only: reset the memoized resolution. Not exported from the
* package barrel; import it directly from this module in a test file.
*/
export function _resetPathsForTest(): void {
_resolvedConfigDir = null;
_warnedStaleEnv = false;
}
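
The three resolution rules and the memoization can be restated as a small pure sketch. It is not the code from this commit: `resolveOnce`, `memoize`, and the injected `dirExists` predicate are names invented for the example, chosen so the logic can be exercised without touching the filesystem or `process.env`.

```typescript
// Result of one resolution: the directory to use, and whether the
// env var was set but pointed at a missing directory.
type Resolution = { dir: string; stale: boolean };

// Pure restatement of the rules. `dirExists` stands in for existsSync.
function resolveOnce(
  envDir: string | undefined,
  defaultDir: string,
  dirExists: (path: string) => boolean,
): Resolution {
  // Rule 1: env unset (or empty) -> default.
  if (!envDir) return { dir: defaultDir, stale: false };
  // Rule 2: env points at an existing dir -> trust it.
  if (dirExists(envDir)) return { dir: envDir, stale: false };
  // Rule 3: env set but the dir is gone -> fall back and flag as stale.
  return { dir: defaultDir, stale: true };
}

// Compute once on first access and ignore later changes, mirroring the
// "paths must stay stable across one CLI invocation" intent.
function memoize<T>(compute: () => T): () => T {
  let cached: { value: T } | null = null;
  return () => {
    if (cached === null) cached = { value: compute() };
    return cached.value;
  };
}
```

Passing the existence check in as a parameter is only a convenience for the sketch; the commit calls `existsSync` directly and resets its module-level cache through `_resetPathsForTest()`.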


@@ -0,0 +1,57 @@
import { describe, it, expect, beforeEach, afterEach, vi } from "vitest";
import { mkdirSync, rmSync, existsSync } from "node:fs";
import { join } from "node:path";
import { tmpdir, homedir } from "node:os";
/** Each test dynamically imports paths.ts and calls
* `_resetPathsForTest()` so memoization doesn't leak across cases.
* The import alone is not enough: the module is cached after its first load. */
const TEST_DIR = join(tmpdir(), "claudemesh-paths-test-" + Date.now());
describe("paths CONFIG_DIR resolution", () => {
beforeEach(() => {
delete process.env.CLAUDEMESH_CONFIG_DIR;
if (existsSync(TEST_DIR)) rmSync(TEST_DIR, { recursive: true, force: true });
});
afterEach(() => {
delete process.env.CLAUDEMESH_CONFIG_DIR;
if (existsSync(TEST_DIR)) rmSync(TEST_DIR, { recursive: true, force: true });
});
it("falls back to ~/.claudemesh when env var is unset", async () => {
const mod = await import("~/constants/paths.js");
mod._resetPathsForTest();
expect(mod.PATHS.CONFIG_DIR).toBe(join(homedir(), ".claudemesh"));
});
it("honors CLAUDEMESH_CONFIG_DIR when the dir exists, even without config.json", async () => {
mkdirSync(TEST_DIR, { recursive: true });
process.env.CLAUDEMESH_CONFIG_DIR = TEST_DIR;
const mod = await import("~/constants/paths.js");
mod._resetPathsForTest();
expect(mod.PATHS.CONFIG_DIR).toBe(TEST_DIR);
});
it("falls back to default when env points at a missing dir (stale-tmpdir case)", async () => {
process.env.CLAUDEMESH_CONFIG_DIR = "/var/folders/_nonexistent_claudemesh_dir_xyz123";
const mod = await import("~/constants/paths.js");
mod._resetPathsForTest();
// Suppress the stderr warning to keep test output clean
const stderr = vi.spyOn(process.stderr, "write").mockImplementation(() => true);
try {
expect(mod.PATHS.CONFIG_DIR).toBe(join(homedir(), ".claudemesh"));
} finally {
stderr.mockRestore();
}
});
it("memoizes — second access returns the same path even if env changes mid-process", async () => {
mkdirSync(TEST_DIR, { recursive: true });
process.env.CLAUDEMESH_CONFIG_DIR = TEST_DIR;
const mod = await import("~/constants/paths.js");
mod._resetPathsForTest();
const first = mod.PATHS.CONFIG_DIR;
process.env.CLAUDEMESH_CONFIG_DIR = "/something/else";
expect(mod.PATHS.CONFIG_DIR).toBe(first);
});
});
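
One possible variation on the fixture setup, offered as a suggestion and not as part of this commit: instead of naming the directory with a `Date.now()` suffix, a helper could let `mkdtempSync` pick a unique name and clean up afterwards. `withTempDir` is a hypothetical helper name.

```typescript
import { mkdtempSync, rmSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Create a uniquely named temp dir, run the callback, then remove the
// dir even if the callback throws.
function withTempDir<T>(fn: (dir: string) => T): T {
  const dir = mkdtempSync(join(tmpdir(), "claudemesh-paths-test-"));
  try {
    return fn(dir);
  } finally {
    rmSync(dir, { recursive: true, force: true });
  }
}
```

A test body could then set `process.env.CLAUDEMESH_CONFIG_DIR = dir` inside the callback, which also reproduces the stale case naturally once the callback returns and the directory is gone.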