feat(cli,broker): 1.34.14 + 1.34.15 — env-var fallback, peer list scope, kick refuses control-plane

Three follow-ups from the 1.34.x multi-session correctness train,
all backwards-compatible.

1.34.14 — stale CLAUDEMESH_CONFIG_DIR falls back. The launch flow
exposes CLAUDEMESH_CONFIG_DIR=<tmpdir> to its spawned claude; if a
later claudemesh invocation inherited that env (Bash tool inside
Claude Code, tmux update-environment, exported var), the inherited
path pointed at a tmpdir that no longer existed and readConfig()
silently returned empty. paths.ts now memoizes resolution: env unset
→ default; env points at a real dir → trust it; env set but dir gone
→ TTY-only stderr warning with shell-specific unset hint, fall back
to ~/.claudemesh.

1.34.15 — peer list --mesh actually scopes. peers.ts and launch.ts
were calling tryListPeersViaDaemon() with no argument; the daemon's
?mesh= filter (server-side, since 1.26.0) was already correct, the
CLI just wasn't passing the slug. Forwarding fixed in both sites;
send.ts cross-mesh hex-prefix resolution intentionally untouched.

1.34.15 — kick refuses no-op kicks on control-plane. Pre-1.34.15
kicking a daemon's member-WS just closed the socket and triggered
auto-reconnect — a no-op with a misleading "session ended" message.
Broker now skips peers where peerRole === "control-plane" and
surfaces them in a new additive ack field skipped_control_plane;
the CLI reads it and prints a clearer hint pointing at ban / daemon
down. Soft disconnect verb keeps old behavior. PeerConn gains a
peerRole slot populated at both connections.set sites.

Tests: 4 new for paths-stale-env, 5 for kick-control-plane-skip.
CLI 87/87 green; broker 55/55 unit green (integration tests fail on
this machine because of a pre-existing infra issue).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Alejandro Gutiérrez
2026-05-04 21:59:06 +01:00
parent 6780899185
commit a2a53ff355
9 changed files with 376 additions and 10 deletions


@@ -1,5 +1,110 @@
# Changelog
## 1.34.15 (2026-05-04) — `peer list --mesh` actually scopes + `kick` refuses control-plane
Two follow-ups from the 1.34.x train, both backwards-compatible.
### `peer list --mesh <slug>` no longer aggregates across meshes
`apps/cli/src/commands/peers.ts:140` was calling
`tryListPeersViaDaemon()` with no argument, so a multi-mesh daemon
returned peers from EVERY attached mesh and the renderer printed
"peers on flexicar" with cross-mesh rows mixed in. The daemon's
`/v1/peers?mesh=<slug>` filter (server-side, since 1.26.0) was
already correctly scoping when the slug was passed; the CLI just
wasn't passing it. Fixed.
`apps/cli/src/commands/launch.ts:407` (the `printBrokerWelcome` peer
count in the launch banner) had the same bug. The "N peers online"
line in the welcome now shows the count for the launched mesh only.
`apps/cli/src/commands/send.ts` cross-mesh hex-prefix resolution is
intentionally cross-mesh (the user is targeting by hex without
specifying a mesh) and was deliberately left as-is.
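A minimal sketch of the forwarding, assuming the daemon route builds its request path from an optional slug. `peersPath` is a hypothetical helper for illustration; the actual code in `daemon-route.js` is not shown in this entry and may differ.

```typescript
// Hypothetical helper: builds the daemon request path for a peer listing.
// Only the `/v1/peers?mesh=<slug>` shape comes from the changelog text.
function peersPath(meshSlug?: string): string {
  const base = "/v1/peers";
  // No slug: the daemon aggregates peers across every attached mesh.
  if (!meshSlug) return base;
  // Slug present: the daemon's server-side filter narrows to that mesh.
  return `${base}?mesh=${encodeURIComponent(meshSlug)}`;
}
```

Called with no argument, this reproduces the pre-1.34.15 request that returned cross-mesh rows; called with the slug, the existing server-side filter does the scoping.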
### `claudemesh kick` refuses no-op kicks on control-plane connections
Pre-1.34.15, kicking a daemon's member-WS or a dashboard connection
just closed the socket. The daemon's WS-lifecycle reconnect loop
brought it back within seconds, the kicker's CLI rendered the
misleading "Their Claude Code session ended", and the user-visible
state was unchanged. The verb was effectively a no-op, but the user
had to learn that the hard way.
The broker's kick handler (`apps/broker/src/index.ts:4628+`) now
skips peers where `peerRole === "control-plane"` and surfaces the
skipped peers in a new additive ack field `skipped_control_plane`.
The soft `disconnect` verb keeps the old behavior — useful when
intentionally nudging a control-plane peer to re-authenticate.
The CLI (`apps/cli/src/commands/kick.ts`) reads the new field and
prints a clearer message: refused peers are listed, with the hint
that `claudemesh ban <peer>` is the right tool to remove a member,
or `claudemesh daemon down` to take a daemon offline locally.
`apps/broker/src/index.ts` adds `peerRole` to the in-memory
`PeerConn` shape, populated from both connection paths
(member-keyed `hello` → `"control-plane"`, per-launch
`session_hello` → `"session"`). The DB-side role taxonomy is
unchanged.
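The skip rule can be sketched as a partition over the matched peers. This is an illustration only: `applyVerb`, the `name` field, and the reduced `PeerConn` and ack shapes are assumptions, not the broker's actual handler. Only `peerRole`, the two role values, and `skipped_control_plane` come from the text above.

```typescript
type PeerRole = "control-plane" | "session";
type Verb = "kick" | "disconnect";

// Reduced, hypothetical shapes for illustration.
interface PeerConn { name: string; peerRole: PeerRole; }
interface KickAck { kicked: string[]; skipped_control_plane: string[]; }

function applyVerb(verb: Verb, targets: PeerConn[]): KickAck {
  const ack: KickAck = { kicked: [], skipped_control_plane: [] };
  for (const peer of targets) {
    // Only `kick` refuses control-plane peers; `disconnect` keeps the old behavior.
    if (verb === "kick" && peer.peerRole === "control-plane") {
      ack.skipped_control_plane.push(peer.name);
      continue;
    }
    ack.kicked.push(peer.name); // the real handler would close the socket here
  }
  return ack;
}
```

Keeping the refusal in the broker rather than the CLI means every client gets the same behavior, and the additive field lets newer CLIs explain why nothing happened.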
### Back-compat
- Older CLI clients ignore the new `skipped_control_plane` ack
field; against a control-plane-only target their kick sees an
empty peer list and prints "No peers matched."
- Older brokers don't emit the field at all; newer CLI handles
the absence (the new branch is only reached when the field is
present and non-empty).
- The new `peerRole` slot on `PeerConn` is filled at every
`connections.set` callsite, so older code paths never read
`undefined`.
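The CLI side of this contract can be sketched as a pure classifier over the ack. The branch order mirrors the `kick.ts` change in this commit; the `classifyKick` name and the `Outcome` labels are hypothetical.

```typescript
interface KickResult {
  affected?: string[];
  kicked?: string[];
  skipped_control_plane?: string[]; // absent on older brokers
}

type Outcome = "none" | "refused" | "kicked" | "kicked-and-refused";

function classifyKick(result: KickResult | undefined): Outcome {
  const peers = result?.affected ?? result?.kicked ?? [];
  // A missing field is treated as "nothing skipped", so older brokers
  // never reach the refusal branches.
  const skipped = result?.skipped_control_plane ?? [];
  if (peers.length === 0 && skipped.length === 0) return "none";
  if (peers.length === 0) return "refused";
  return skipped.length > 0 ? "kicked-and-refused" : "kicked";
}
```

Defaulting the absent field to an empty array is what keeps a newer CLI compatible with an older broker.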
### Tests
- `apps/broker/tests/kick-control-plane-skip.test.ts` — 5 cases
covering the kick/disconnect × control-plane/session/service
truth table.
## 1.34.14 (2026-05-04) — stale `CLAUDEMESH_CONFIG_DIR` falls back
`claudemesh launch` exports `CLAUDEMESH_CONFIG_DIR=<tmpdir>` to its
spawned `claude` so the per-session mesh selection is isolated from
`~/.claudemesh/config.json`. The tmpdir is `rmSync`'d on launch exit
via the `process.on('exit', cleanup)` handler.
Footgun: if a later `claudemesh` invocation INHERITED that env — a
Bash tool call inside Claude Code, a tmux pane that captured the env
via `update-environment`, an exported var the user forgot to clear —
the inherited path pointed at a tmpdir that no longer existed.
Pre-1.34.14 we silently used the dead path, `readConfig()` came back
empty, and the user saw "No meshes joined" from an otherwise-working
install. Fish users hit it harder because fish has no `unset`;
they had to discover `set -e CLAUDEMESH_CONFIG_DIR`.
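The shell-specific hint can be sketched as a small pure function. The `unsetHint` name and signature are hypothetical; the `endsWith("fish")` test on `$SHELL` follows the `paths.ts` change in this commit.

```typescript
// Hypothetical helper: picks the command that clears env vars in the user's shell.
function unsetHint(shell: string | undefined, vars: string[]): string {
  const names = vars.join(" ");
  // fish erases variables with `set -e`; other shells are assumed to support `unset`.
  return shell?.endsWith("fish") ? `set -e ${names}` : `unset ${names}`;
}
```

For example, `unsetHint(process.env.SHELL, ["CLAUDEMESH_CONFIG_DIR"])` would give a fish user the `set -e` form and everyone else the `unset` form.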
`apps/cli/src/constants/paths.ts` now resolves `CONFIG_DIR` once via
a memoized `resolveConfigDir()`:
1. No env var → `~/.claudemesh` (default, unchanged).
2. Env points at an existing dir → trust it, whether or not it
contains `config.json` yet. The legitimate per-session-launch case
is byte-identical to before.
3. Env set but stale (dir gone) → warn once on stderr (TTY-only —
CI / MCP boot / piped scripts stay quiet) with a shell-specific
unset hint, then fall back to `~/.claudemesh`.
The check is on the directory's existence, not on `config.json`,
because a fresh-launch tmpdir legitimately has no `config.json` until
the first write. The stale signature we catch is the outer launch's
`rmSync(tmpDir, {recursive: true})` cleanup, which removes the
directory entirely.
The "no meshes" check from the original triage was deliberately NOT
adopted: a launched session that legitimately joins one mesh would
hit it.
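The three rules and the memoization can be sketched with injected dependencies so the logic is testable without touching the filesystem. `makeResolver` and `ResolveDeps` are hypothetical names; the real `resolveConfigDir()` reads `process.env` and calls `existsSync` directly.

```typescript
interface ResolveDeps {
  envDir: string | undefined;            // value of CLAUDEMESH_CONFIG_DIR
  defaultDir: string;                    // e.g. ~/.claudemesh
  dirExists: (path: string) => boolean;  // stand-in for existsSync
  isTTY: boolean;                        // stand-in for process.stderr.isTTY
  warn: (message: string) => void;       // stand-in for process.stderr.write
}

function makeResolver(deps: ResolveDeps): () => string {
  let resolved: string | null = null;
  return () => {
    if (resolved !== null) return resolved; // memoized: later calls skip all checks
    let dir: string;
    if (!deps.envDir) {
      dir = deps.defaultDir;                // rule 1: env unset
    } else if (deps.dirExists(deps.envDir)) {
      dir = deps.envDir;                    // rule 2: dir exists, config.json not required
    } else {
      // rule 3: stale env, warn only on a TTY, then fall back
      if (deps.isTTY) deps.warn(`ignoring stale CLAUDEMESH_CONFIG_DIR=${deps.envDir}`);
      dir = deps.defaultDir;
    }
    resolved = dir;
    return dir;
  };
}
```

Because the result is cached on first access, the warning fires at most once per resolver and the path stays stable for the rest of the invocation.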
No back-compat surface affected. No other files changed. `_resetPathsForTest()`
exported for unit tests.
## 1.34.13 (2026-05-04) — MCP forwards session token on /v1/events
The 1.34.10 SSE demux + 1.34.11 inbox per-recipient column were both


@@ -1,6 +1,6 @@
{
"name": "claudemesh-cli",
"version": "1.34.13",
"version": "1.34.15",
"description": "Peer mesh for Claude Code sessions — CLI + MCP server.",
"keywords": [
"claude-code",


@@ -76,12 +76,32 @@ export async function runKick(
if ("error" in built) { render.err(String(built.error)); return EXIT.INVALID_ARGS; }
return await withMesh({ meshSlug }, async (client) => {
const result = await client.sendAndWait(built as Record<string, unknown>) as { affected?: string[]; kicked?: string[] };
const result = await client.sendAndWait(built as Record<string, unknown>) as {
affected?: string[];
kicked?: string[];
// 1.34.15: broker refuses to kick control-plane WSes (they'd
// just auto-reconnect). Older brokers don't emit this field.
skipped_control_plane?: string[];
};
const peers = result?.affected ?? result?.kicked ?? [];
if (peers.length === 0) render.info("No peers matched.");
else {
const skipped = result?.skipped_control_plane ?? [];
if (peers.length === 0 && skipped.length === 0) {
render.info("No peers matched.");
} else if (peers.length === 0 && skipped.length > 0) {
render.warn(
`${skipped.length} match(es) refused: ${skipped.join(", ")} — control-plane connections (daemon / dashboard) auto-reconnect, so kick is a no-op.`,
"To take a daemon offline locally, run `claudemesh daemon down` on that machine. To remove a member from the mesh, use `claudemesh ban <peer>`.",
);
} else {
render.ok(`Kicked ${peers.length} peer(s): ${peers.join(", ")}`);
render.hint("Their Claude Code session ended. They can rejoin anytime by running `claudemesh`.");
if (skipped.length > 0) {
render.warn(
`(also refused ${skipped.length} control-plane connection(s): ${skipped.join(", ")})`,
"Daemon / dashboard connections auto-reconnect; kick is a no-op against them. Use `claudemesh ban <peer>` to remove a member entirely.",
);
}
}
return EXIT.SUCCESS;
});


@@ -400,11 +400,13 @@ async function printBrokerWelcome(meshSlug: string): Promise<void> {
}
} catch { /* daemon unreachable — not fatal */ }
// Peer count (best-effort).
// Peer count (best-effort). 1.34.15: scope to the launched mesh so
// multi-mesh daemons don't inflate the welcome banner with peers
// from other meshes the user didn't just attach to.
let peerCount = -1;
try {
const { tryListPeersViaDaemon } = await import("~/services/bridge/daemon-route.js");
const peers = (await tryListPeersViaDaemon()) ?? [];
const peers = (await tryListPeersViaDaemon(meshSlug)) ?? [];
peerCount = peers.filter((p) =>
(p as { channel?: string }).channel !== "claudemesh-daemon",
).length;


@@ -135,9 +135,17 @@ async function listPeersForMesh(slug: string): Promise<PeerRecord[]> {
// lifecycle helper inside tryListPeersViaDaemon auto-spawns the
// daemon if it's down and probes it for liveness — no separate bridge
// tier is needed any more (1.28.0).
//
// 1.34.15: forward `slug` to the daemon as `?mesh=<slug>` so the
// server-side aggregator narrows to the requested mesh. Pre-1.34.15
// we called this with no argument, so a multi-mesh daemon returned
// peers from every attached mesh and the renderer printed "peers on
// flexicar" with cross-mesh rows mixed in. The daemon's
// `meshFromCtx` already does the right scoping when the slug is
// passed; the CLI just wasn't passing it.
try {
const { tryListPeersViaDaemon } = await import("~/services/bridge/daemon-route.js");
const dr = await tryListPeersViaDaemon();
const dr = await tryListPeersViaDaemon(slug);
if (dr !== null) {
return dr.map((p) => annotateSelf(p as PeerRecord, selfMemberPubkey, selfSessionPubkey));
}


@@ -1,10 +1,82 @@
import { existsSync } from "node:fs";
import { homedir } from "node:os";
import { join } from "node:path";
const home = homedir();
const DEFAULT_CONFIG_DIR = join(home, ".claudemesh");
/**
* Resolve `CONFIG_DIR` once, with stale-env detection.
*
* `claudemesh launch` exposes `CLAUDEMESH_CONFIG_DIR=<tmpdir>` to its
* spawned `claude` so the per-session mesh selection is isolated from
* `~/.claudemesh/config.json`. The tmpdir is rmSync'd on launch exit.
*
* Footgun: if a `claudemesh` invocation INHERITS that env from an
* already-launched (or previously-launched) session — e.g. a Bash tool
* call inside Claude Code, or a tmux pane that captured the env via
* `update-environment` — the inherited path may point at a tmpdir that
* no longer exists. Pre-1.34.14 we silently used the dead path,
* `readConfig()` came back empty, and the user saw "No meshes joined"
* from an otherwise-working install.
*
* Resolution rules:
* 1. No env var → `~/.claudemesh` (default).
* 2. Env points at an existing dir → trust it, even without a
* `config.json` yet (the legitimate per-session-launch case).
* 3. Env set but stale (dir missing) → warn
* once on stderr (TTY-only) and fall back to `~/.claudemesh`.
*
* Memoized: resolves once on first access. Mid-process env mutations
* are intentionally ignored — paths must stay stable across one CLI
* invocation.
*/
let _resolvedConfigDir: string | null = null;
let _warnedStaleEnv = false;
function resolveConfigDir(): string {
if (_resolvedConfigDir !== null) return _resolvedConfigDir;
const envDir = process.env.CLAUDEMESH_CONFIG_DIR;
if (!envDir) {
_resolvedConfigDir = DEFAULT_CONFIG_DIR;
return DEFAULT_CONFIG_DIR;
}
// Trust the env when it resolves to a real directory. We check
// the DIR (not `config.json`) because the legitimate "fresh launch
// before any write" case has the dir but no config.json yet.
// The stale signature we want to catch is `rmSync(tmpDir,
// {recursive: true})` from the outer launch's cleanup — that
// removes the directory entirely, so a missing dir is the
// unambiguous "stale" signal.
if (existsSync(envDir)) {
_resolvedConfigDir = envDir;
return envDir;
}
// Stale: env set but the dir is gone. Most likely the outer
// launch's cleanup ran and we inherited its (now-dead) tmpdir
// path. Fall back to default and warn the user once on stderr —
// only when attached to a TTY, so non-interactive callers (CI,
// MCP boot, scripts piping stdout) stay quiet.
if (!_warnedStaleEnv && process.stderr.isTTY) {
_warnedStaleEnv = true;
const unsetHint =
process.env.SHELL?.endsWith("fish")
? "set -e CLAUDEMESH_CONFIG_DIR CLAUDEMESH_IPC_TOKEN_FILE"
: "unset CLAUDEMESH_CONFIG_DIR CLAUDEMESH_IPC_TOKEN_FILE";
process.stderr.write(
`claudemesh: ignoring stale CLAUDEMESH_CONFIG_DIR=${envDir} (directory does not exist); using ${DEFAULT_CONFIG_DIR}.\n`
+ ` Hint: this is usually a leftover env from a previous \`claudemesh launch\`. Clean it with:\n`
+ ` ${unsetHint}\n`,
);
}
_resolvedConfigDir = DEFAULT_CONFIG_DIR;
return DEFAULT_CONFIG_DIR;
}
export const PATHS = {
CONFIG_DIR: process.env.CLAUDEMESH_CONFIG_DIR || join(home, ".claudemesh"),
get CONFIG_DIR() {
return resolveConfigDir();
},
get CONFIG_FILE() {
return join(this.CONFIG_DIR, "config.json");
},
@@ -20,3 +92,12 @@ export const PATHS = {
CLAUDE_JSON: join(home, ".claude.json"),
CLAUDE_SETTINGS: join(home, ".claude", "settings.json"),
} as const;
/**
* Test-only: reset the memoized resolution. Not exported from the
* package barrel; reach in via the relative path from a test file.
*/
export function _resetPathsForTest(): void {
_resolvedConfigDir = null;
_warnedStaleEnv = false;
}


@@ -0,0 +1,57 @@
import { describe, it, expect, beforeEach, afterEach, vi } from "vitest";
import { mkdirSync, rmSync, existsSync } from "node:fs";
import { join } from "node:path";
import { tmpdir, homedir } from "node:os";
/** Each test imports a fresh copy of paths.ts via dynamic import +
* `_resetPathsForTest()` so memoization doesn't leak across cases. */
const TEST_DIR = join(tmpdir(), "claudemesh-paths-test-" + Date.now());
describe("paths CONFIG_DIR resolution", () => {
beforeEach(() => {
delete process.env.CLAUDEMESH_CONFIG_DIR;
if (existsSync(TEST_DIR)) rmSync(TEST_DIR, { recursive: true, force: true });
});
afterEach(() => {
delete process.env.CLAUDEMESH_CONFIG_DIR;
if (existsSync(TEST_DIR)) rmSync(TEST_DIR, { recursive: true, force: true });
});
it("falls back to ~/.claudemesh when env var is unset", async () => {
const mod = await import("~/constants/paths.js");
mod._resetPathsForTest();
expect(mod.PATHS.CONFIG_DIR).toBe(join(homedir(), ".claudemesh"));
});
it("honors CLAUDEMESH_CONFIG_DIR when the dir exists, even without config.json", async () => {
mkdirSync(TEST_DIR, { recursive: true });
process.env.CLAUDEMESH_CONFIG_DIR = TEST_DIR;
const mod = await import("~/constants/paths.js");
mod._resetPathsForTest();
expect(mod.PATHS.CONFIG_DIR).toBe(TEST_DIR);
});
it("falls back to default when env points at a missing dir (stale-tmpdir case)", async () => {
process.env.CLAUDEMESH_CONFIG_DIR = "/var/folders/_nonexistent_claudemesh_dir_xyz123";
const mod = await import("~/constants/paths.js");
mod._resetPathsForTest();
// Suppress the stderr warning to keep test output clean
const stderr = vi.spyOn(process.stderr, "write").mockImplementation(() => true);
try {
expect(mod.PATHS.CONFIG_DIR).toBe(join(homedir(), ".claudemesh"));
} finally {
stderr.mockRestore();
}
});
it("memoizes — second access returns the same path even if env changes mid-process", async () => {
mkdirSync(TEST_DIR, { recursive: true });
process.env.CLAUDEMESH_CONFIG_DIR = TEST_DIR;
const mod = await import("~/constants/paths.js");
mod._resetPathsForTest();
const first = mod.PATHS.CONFIG_DIR;
process.env.CLAUDEMESH_CONFIG_DIR = "/something/else";
expect(mod.PATHS.CONFIG_DIR).toBe(first);
});
});