Compare commits: 706e681d6e...main (11 commits)

| Author | SHA1 | Date |
|---|---|---|
| | 1b28550f30 | |
| | 9d1b4f3d4c | |
| | ffd0621ccc | |
| | b9ecbe79ad | |
| | 33051b95bf | |
| | 64d9f9f6f9 | |
| | 7f61a711f1 | |
| | 96520394ff | |
| | a2a53ff355 | |
| | 6780899185 | |
| | cba4a938ec | |
288
.artifacts/specs/2026-05-04-session-capabilities.md
Normal file
@@ -0,0 +1,288 @@
# Session capabilities — first-class concept

**Status:** spec, queued behind v0.3.0 topic-encryption work.
**Owner:** alezmad
**Author:** Claude (Sprint B follow-up, 2026-05-04)
**Related:** `2026-04-15-per-peer-capabilities.md` (existing per-peer
caps system, member-keyed), `2026-05-04-per-session-presence.md`
(per-launch session presence — what we're now restricting).

## Problem

Per-peer capability grants (`apps/broker/src/index.ts:2178+, 2309+`)
are keyed on the sender's **stable member pubkey**. The grant model
gives the recipient fine-grained control: "alice can DM me",
"bob can read state but not broadcast", etc.

But: as of v1.30.0 (`per-session-presence`), every `claudemesh
launch` mints a per-launch ephemeral keypair with a parent attestation
binding it to the member identity. The launched session inherits **all**
the member's capabilities transitively, because cap enforcement always
falls through to the member key.

Concretely:

- Member `alice` is in mesh `flexicar`, granted `dm + state-read +
  state-write` by everyone.
- Alice launches a session with `claudemesh launch` to do an automated
  task — say, run a Claude Code agent that iterates over PRs.
- That session has full member privileges. It can DM peers, write
  shared state keys (e.g. clobber `current-pr`), grant new caps, ban
  members, etc. — none of which the user wanted to delegate.

There is no way to express "this session can DM peers but cannot
deploy services or grant caps." The parent attestation is a binary
existence proof — "this session was vouched by a member" — with no
capability subset.

Plus an adjacent footgun: `set_state` (`apps/broker/src/index.ts:2949`)
has **no cap check at all**. Anyone in the mesh can write any key. The
spec at `2026-04-15-per-peer-capabilities.md` lists `state-write` as a
planned cap but it was never wired into the broker. Shared keys like
`current-pr` are write-anyone today.

## Goal

A launched session can be issued **a capability subset** of its
parent member, signed by the parent at launch time, and the broker
enforces the **intersection** of recipient grants × session caps on
every protected operation.

## Non-goals

- Changing the existing per-peer cap model. Member-keyed grants stay
  authoritative for "who is allowed to talk to me."
- Cross-machine session caps (waiting on 2.0.0 HKDF identity).
- Per-tool granularity inside the Claude Code MCP surface — this
  spec only covers the broker-enforceable verbs (dm, broadcast,
  state-read, state-write, grant, kick, ban, profile-write,
  service-deploy).
- Delegation: a session cannot re-vouch a sub-session with its own
  cap subset. Only members can attest sessions. (Could be lifted in
  a future spec; today's launch flow doesn't need it.)

## Design

### Capability vocabulary

Existing (today, member-level):

| Capability    | Effect when GRANTED on a recipient → sender pair |
|---------------|---------------------------------------------------|
| `read`        | Sender appears in recipient's `list_peers`        |
| `dm`          | Sender can DM recipient                           |
| `broadcast`   | Sender's broadcasts reach recipient               |
| `state-read`  | Sender can read shared state                      |
| `state-write` | (planned) Sender can write shared state           |
| `file-read`   | Sender can fetch files recipient shared           |

New (session-level — cap subset on the attestation):

These are the **verbs the session is allowed to invoke**, NOT what
peers can do TO it. A session attestation declaring `["dm", "read"]`
means the session can SEND dm/read-list operations; it cannot
broadcast, write state, grant, etc.

| Session cap       | Gates which broker operations                  |
|-------------------|------------------------------------------------|
| `dm`              | `send` with single recipient                   |
| `broadcast`       | `send` with `*`, `@group`, `#topic`            |
| `state-read`      | `get_state`, `list_state`                      |
| `state-write`     | `set_state`                                    |
| `grant`           | `grant`, `revoke`, `block`                     |
| `kick`            | `kick`, `disconnect`                           |
| `ban`             | `ban`, `unban`                                 |
| `profile-write`   | `set_profile`, `set_summary`, `set_status`     |
| `service-deploy`  | `mesh_service_register`, `_unregister`         |

The default cap set when no subset is declared: the **full member
set** (today's behavior — opt-in restriction, not breaking).

### Attestation v2

Existing v1 (`apps/cli/src/services/broker/session-hello-sig.ts`):

```
canonical = `claudemesh-session-attest|<parent>|<session>|<expires>`
```

New v2 (additive — broker accepts both):

```
canonical = `claudemesh-session-attest-v2|<parent>|<session>|<expires>|<sorted-caps-csv>`
```
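
Where `<sorted-caps-csv>` is the lower-cased, comma-joined,
ASCII-sorted cap list. An omitted `allowed_caps` field means full
member caps (default, back-compat). Note that the test plan treats an
explicitly empty array as "no caps allowed", not as the full member
set, so the two cases must stay distinguishable.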
|
||||
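The canonicalization rule above can be sketched roughly as follows. This is a minimal illustration, not the code in `session-hello-sig.ts`: the helper names `normalizeCaps` and `buildCanonicalV2` are hypothetical, the trimming and de-duplication steps are assumptions the spec does not state, and `expires` is assumed to be a number.

```ts
// Hypothetical helpers; names are illustrative, not taken from the codebase.

/** Lower-case, trim, de-duplicate (both assumptions), and ASCII-sort a cap list. */
function normalizeCaps(caps: string[]): string[] {
  const lowered = caps.map((c) => c.trim().toLowerCase());
  // The default sort compares UTF-16 code units, which is ASCII order
  // for the ASCII-only cap names used in this spec.
  return Array.from(new Set(lowered)).sort();
}

/** Build the v2 canonical string that the parent member key would sign. */
function buildCanonicalV2(
  parent: string,
  session: string,
  expires: number,
  caps: string[],
): string {
  const csv = normalizeCaps(caps).join(",");
  return `claudemesh-session-attest-v2|${parent}|${session}|${expires}|${csv}`;
}
```

Sorting before signing means the CLI and broker derive the same string regardless of the order in which caps were passed on the command line.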
**Wire shape additions on `session_hello`:**

```ts
{
  type: "session_hello",
  // ...existing fields...
  parentAttestation: {
    sessionPubkey,
    parentMemberPubkey,
    expiresAt,
    signature,
    // NEW:
    allowed_caps?: string[], // omitted = full member set
    version?: 2,             // omitted = v1
  },
}
```

The broker version-detects: `version === 2` → verify v2 canonical
including `allowed_caps`. Default behavior is unchanged for clients
that don't pass it.
|
||||
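One way to read the version detection is sketched below. The field names follow the wire shape above, but the helper `resolveAllowedCaps` is hypothetical, `expiresAt` is assumed to be a number, and signature verification is left out entirely. The point is only to show how "omitted" and "explicitly empty" can map to different in-memory values.

```ts
// Field names follow the wire shape above; the helper itself is hypothetical.
interface ParentAttestation {
  sessionPubkey: string;
  parentMemberPubkey: string;
  expiresAt: number; // assumed numeric timestamp
  signature: string;
  allowed_caps?: string[];
  version?: 2;
}

/**
 * Returns null for "full member caps" (v1, or v2 with the field omitted)
 * and a copy of the declared list otherwise. An explicitly empty array is
 * preserved as "no caps", matching the test plan.
 */
function resolveAllowedCaps(att: ParentAttestation): string[] | null {
  if (att.version !== 2) return null; // v1: any allowed_caps field is ignored
  if (att.allowed_caps === undefined) return null; // v2, field omitted
  return [...att.allowed_caps];
}
```

The returned value is what would populate `PeerConn.allowed_caps`, with `null` meaning no subset was declared.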
### Enforcement

Add `allowed_caps: string[] | null` to the in-memory `PeerConn`
shape (`apps/broker/src/index.ts:131`). Populated from
`handleSessionHello` (the v2 attestation supplies it) and from
`handleHello` (control-plane / member connection — set to `null`,
meaning "full member caps").

**Effective cap check** for a sending peer needing `cap`:

```ts
function senderHasCap(conn: PeerConn, cap: string): boolean {
  if (conn.allowed_caps === null) return true; // member-level, no subset
  return conn.allowed_caps.includes(cap);
}
```

Wire this into every broker operation in the table above. The
existing per-peer recipient-cap check at `2178+, 2309+` stays —
session caps gate the **sender side**, recipient grants gate the
**receive side**, and both must allow:

```
allowed = senderHasCap(conn, capNeeded) && recipientGrants[sender][capNeeded]
```
|
||||
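The two-sided check can be illustrated with a small self-contained sketch. `RecipientGrants` and `ConnLike` are hypothetical stand-ins for the broker's real grant storage and `PeerConn`; only the boolean logic is meant to mirror the pseudocode above.

```ts
// Sketch only: `RecipientGrants` is a hypothetical in-memory shape,
// not the broker's actual grant storage.
type RecipientGrants = Map<string, Set<string>>; // sender member pubkey -> granted caps

interface ConnLike {
  memberPubkey: string;
  allowed_caps: string[] | null; // null = full member caps
}

function senderHasCap(conn: ConnLike, cap: string): boolean {
  if (conn.allowed_caps === null) return true; // member-level, no subset
  return conn.allowed_caps.includes(cap);
}

/** Both the session cap subset and the recipient's grant must allow the op. */
function isAllowed(conn: ConnLike, grants: RecipientGrants, cap: string): boolean {
  const granted = grants.get(conn.memberPubkey)?.has(cap) ?? false;
  return senderHasCap(conn, cap) && granted;
}
```

Because grants stay keyed on the member pubkey, a restricted session can never gain a cap its parent member was not granted; the session subset only narrows.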
### `set_state` gate (bonus, ship together)

Today: no cap check. After this spec: `set_state` requires
`state-write` on the sender side. Migration: existing members
default to having `state-write` in their member caps (no recipient
grant model for state-write — it's a sender-side gate only,
mesh-wide). New attestations can omit it to forbid the session.

The recipient-side analog (per-peer state-write grants) is left for
a future spec — today the value of guarding state-write is
session-level (avoid an automated session clobbering shared keys),
not peer-level.

### CLI surface

```
claudemesh launch --caps dm,read       # tight: read-only chat agent
claudemesh launch --caps dm,broadcast  # send-only, no state writes
claudemesh launch                      # default: full member caps
```

`claudemesh launch --caps ?` prints the table above with descriptions.

`claudemesh peer list --json` includes `allowed_caps` per row when
present (`null` = full member). Lets users audit what their running
sessions can actually do.

### Migration plan (mirrors `2026-04-15-per-peer-capabilities.md` §"Migration plan")

1. **Broker schema additive** — `PeerConn.allowed_caps` in-memory
   only; no DB column. Reload-on-reconnect is fine because the
   attestation is re-sent on every WS open (it's the proof of
   identity).

2. **CLI ships v2 attestation alongside v1.** New `--caps` flag
   defaults to omitted (= v1 attestation, full caps). Older
   brokers ignore the new fields entirely.

3. **Broker accepts v2.** When `allowed_caps` arrives, store it.
   No enforcement yet — log denied operations as `cap_check_dryrun`
   metric counter, still allow them through.

4. **Dry-run release.** Ship one CLI + broker release that emits
   the metric but doesn't enforce. Watch for false positives in
   real meshes for ≥ 1 week.

5. **Flip enforcement on.** Broker rejects operations failing the
   cap check with `forbidden: missing session capability "<cap>"`.
   Default ("no caps declared = full member") keeps existing
   sessions unaffected.

6. **`set_state` gate** ships in step 5 alongside the rest. Default
   member caps include `state-write`, so flipping it on doesn't
   break existing flows. Only sessions that explicitly omit
   `state-write` from `--caps` lose write access.
|
||||
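Steps 3 to 5 differ only in what happens when the check fails, which a single function with a mode switch can illustrate. This is a sketch under assumptions: `checkSessionCap`, `CapMode`, and the `metrics` object are hypothetical names, and a real broker would presumably use its existing metrics facility rather than a plain counter.

```ts
type CapMode = "dryrun" | "enforce";

interface CapDecision {
  proceed: boolean;
  error?: string;
}

// Hypothetical counter standing in for the `cap_check_dryrun` metric.
const metrics = { cap_check_dryrun: 0 };

function checkSessionCap(
  allowedCaps: string[] | null, // null = full member caps
  cap: string,
  mode: CapMode,
): CapDecision {
  const ok = allowedCaps === null || allowedCaps.includes(cap);
  if (ok) return { proceed: true };
  if (mode === "dryrun") {
    metrics.cap_check_dryrun += 1; // record the would-be denial, let it through
    return { proceed: true };
  }
  return { proceed: false, error: `forbidden: missing session capability "${cap}"` };
}
```

Keeping the decision in one place makes the step 5 flip a one-line change of `mode`, and lets the dry-run counter and the enforced error share the same predicate.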
### Crypto notes

- v2 attestation re-uses `crypto_sign_detached` over the new
  canonical string; same parent member secret key, same TTL caps
  (≤24 h), same `expiresAt` semantics.
- v1 signatures are NOT v2 signatures — collision is impossible
  because the canonical strings have different prefixes
  (`claudemesh-session-attest` vs `claudemesh-session-attest-v2`).
  Domain separation is intrinsic.
- Like the existing per-peer cap system: caps are server-enforced
  metadata, not capability tokens. A malicious broker can ignore
  them. This is about UX trust + footgun prevention, not
  protocol-level security.

## Open questions

1. **Should the session attestation also bind to a fingerprint of
   the launched binary / Claude version?** Would let a member say
   "this session is constrained to Claude Code v1.34.15" so a
   compromised launched-binary doesn't get reused. Probably no — too
   much friction for the threat model.

2. **What's the right default for `claudemesh launch` going forward?**
   Once enforcement ships, do we change the default `--caps` from
   "full member" to "dm + read + state-read"? Tighter but breaks
   existing automation that writes state. Probably worth a
   one-release deprecation warning ("your session will lose state-write
   in v2.0.0 unless you pass --caps state-write") and then flip in
   v2.0.0.

3. **Does `--caps` belong in `~/.claudemesh/config.json` per-mesh
   defaults too?** A user who always launches read-only agents
   wants `caps: ["dm", "read"]` as a personal default. Easy add;
   defer until users ask for it.

4. **Per-tool MCP cap surface?** Out of scope here, but: a `claudemesh
   launch --tools peer:read,memory:write` would be a finer cut than
   broker-verb caps. The broker can't enforce that — it'd live in the
   MCP wrapper / Claude Code's allowedTools. Different layer.

## Test plan

- Pure-logic tests on `senderHasCap` (member-level → always true,
  empty caps → always false, declared caps → exact match).
- Broker integration: launch a session with `--caps dm`, attempt
  `set_state` → expect `forbidden: missing session capability
  "state-write"`.
- v1 attestation still accepted, no `allowed_caps` set, all caps
  permitted (back-compat).
- v2 attestation with empty `allowed_caps` array → broker treats
  as "explicitly empty, no caps allowed" (NOT "full member"). The
  full-member default is "field omitted entirely". Test both.
- Dry-run mode: cap fail increments the counter but the operation
  proceeds. Smoke-test before flipping enforcement.

## Estimate

- Spec review + open-question resolution: 1–2 days.
- Broker change (PeerConn field, attestation v2 accept, per-verb
  enforcement, dry-run mode): 2–3 days.
- CLI change (`--caps` flag, attestation builder, peer list
  surface): 1 day.
- Tests: 1 day.
- Dry-run release window: ≥ 1 week.

Total: ~1 sprint of focused work, plus a dry-run window.
350
.artifacts/specs/2026-05-05-continuous-presence.md
Normal file
@@ -0,0 +1,350 @@
# Continuous presence — lease model + resume token

**Status:** spec, ready for v0.3.0.
**Owner:** alezmad
**Author:** Claude (2026-05-05, follow-up to user-reported "after hours claudemesh disconnects")
**Related:** `2026-05-04-per-session-presence.md` (per-launch ephemeral keypair), `apps/broker/src/index.ts:5430-5436` (current 30s ping loop), `apps/cli/src/daemon/ws-lifecycle.ts` (current backoff reconnect).

## Problem

Today, presence is fused to a single TCP/WS connection. When the
connection breaks — half-dead NAT entries, ISP route changes, laptop
sleep, broker restart — the broker tears down the presence row, fires
`peer_left`, and waits for the daemon to dial a fresh socket and run
the full attestation hello again. Other peers see the user blink
offline → back online. Messages sent to the session during the gap are
either dropped (if it's a `now`/`next` priority DM with no recipient
match) or held in `message_queue` for `low` only.

Concrete symptom (user-reported): `claudemesh peer list` shows zero
peers despite multiple sessions being "up" — they're stuck on
half-dead TCP connections. Daemon hasn't noticed because no `close`
fired. Hours later, kernel TCP keepalive (default Linux: 7200s idle +
9 × 75s probes ≈ 2h11m) finally RSTs the socket, daemon's existing
backoff reconnects, peers reappear. Until then: zombie session.

Two coupled bugs:

1. **No application-layer staleness detection.** Broker pings every
   30s (line 5431) and updates `lastPingAt` on pong, but never
   `terminate()`s a connection that stops returning pongs. Daemon
   doesn't ping at all. Both sides trust the kernel for liveness,
   which only fires after hours.

2. **Presence == connection.** Even once the staleness IS detected
   and the daemon reconnects, peers see a full `peer_left` /
   `peer_joined` cycle for a network blip that took 1–30 seconds.
   Outbound messages during the gap that target the session by
   pubkey route to nothing.

The user's ask: peers should never see a gap during transient
disconnects. Presence should be continuous as long as the *session
intent* is alive, regardless of how many sockets carried it.

## Goal

Presence is a **lease** keyed off the session's stable identity
(`sessionPubkey`), held in broker memory + DB, with a TTL refreshed
on every keepalive. Sockets come and go beneath the lease. Other peers
see continuous online status across reconnects up to the lease TTL.

Specifically:

- A daemon (or per-session WS) can drop and re-establish the WS
  within a configurable grace window (default 90s) without any peer
  observing `peer_left` / `peer_joined`.
- Messages sent to a session while its socket is mid-flap are queued,
  delivered on the next reattach, ordered.
- Reconnect itself is sub-second on the wire when a `resume_token` is
  presented — broker recognises the session, restores the slot, no
  re-attestation round-trip.
- After the grace window expires, the broker fires `peer_left`
  exactly once; on a later reconnect it fires `peer_joined` exactly
  once. No flapping.

## Non-goals

- **Multi-broker handoff.** Out of scope. If the broker process
  restarts, leases are lost and we fall back to today's behavior
  (clean reconnect, peers see one cycle). A future spec can address
  this with a shared lease store (Redis / Postgres LISTEN).
- **Dual-socket on the daemon.** Useful gold-plating but not required
  for the user-facing problem. Single-socket with watchdog +
  resume-token covers the failure modes actually observed (NAT drops,
  ISP blips, sleep <90s).
- **Manual `claudemesh reconnect` CLI.** Not needed; the lease model
  makes it redundant. Re-evaluate if real support cases surface.
## Design

### Lease model

```
sessionPubkey → { transport: "online" | "offline",
                  leaseUntil: Date,
                  ws: WebSocket | null,
                  ...existing PeerConn fields }
```

Today the `connections` Map IS keyed by `presenceId`, which is a fresh
UUID per WS. We change that key to `sessionPubkey` (member-WS:
`memberPubkey`; session-WS: `sessionPubkey`). The PeerConn struct
gains:

```ts
transport: "online" | "offline";
leaseUntil: Date;             // Date.now() + LEASE_TTL_MS
evictionTimer: NodeJS.Timeout | null;
```

### State transitions

**On WS open + hello accepted (initial):**
- Insert into `connections` with `transport: "online"`,
  `leaseUntil: now + 90s`, `evictionTimer: null`.
- Broadcast `peer_joined` (today's behavior).
- Issue `resume_token` (see below) in the `hello_ack`.

**On WS open + hello carries valid `resume_token`:**
- Look up by `sessionPubkey`, verify token signature + freshness
  (TTL <= LEASE_TTL_MS). If valid AND entry exists with
  `transport: "offline"`:
  - Cancel `evictionTimer`.
  - Swap `ws` reference.
  - Set `transport: "online"`, refresh `leaseUntil`.
  - **Do NOT** broadcast `peer_joined`. The lease never expired.
  - Drain any queued DMs accumulated during offline window.
  - Reply `hello_ack` with new `resume_token`.
- If entry exists with `transport: "online"` (token replay attack or
  rapid reconnect race): close old `ws` with `1000, "session_replaced"`
  before swapping. Same as today's `oldConn.ws.close(1000, ...)`
  pattern at lines 1768/1996.
- If no entry exists or token is stale: treat as a fresh hello,
  broadcast `peer_joined`. Token expired = same as a cold start.

**On WS close (any reason):**
- Look up by `sessionPubkey`. If not found, no-op (already evicted).
- Set `transport: "offline"`, clear `ws` reference.
- Start `evictionTimer = setTimeout(evict, GRACE_MS)`.
- **Do NOT** broadcast `peer_left`. **Do NOT** delete the entry.
- **Do NOT** call `disconnectPresence(presenceId)` yet.

**On `evictionTimer` fire (lease expired without reattach):**
- Delete from `connections`.
- Broadcast `peer_left` (today's behavior at lines 5167-5189).
- `decMeshCount`.
- `disconnectPresence(presenceId)`.
- Clean up URL watches, stream subs, MCP registry — same as today's
  close handler.
- Audit `peer_left`.

**Watchdog (broker):**
- The 30s ping loop (line 5431) gains a staleness check: if any
  conn's `transport === "online"` and `lastPingAt < now - 75s`, call
  `ws.terminate()`. This converts the half-dead socket into a clean
  `close` event, which fires the lease-offline transition above.
- Same logic on the daemon side (see § Daemon changes).
|
||||
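The transitions above can be sketched as a small state machine. This is an illustration under assumptions, not the broker code: `LeaseTable`, `Schedule`, and the `events` array (standing in for `peer_joined` / `peer_left` broadcasts) are hypothetical, the scheduler is injected so the grace timer can be driven by hand, and socket swapping, queue draining, and DB updates are omitted. The spec does not say what should happen when a hello without a valid token arrives while an entry still exists; this sketch treats it as a fresh hello.

```ts
type Cancel = () => void;
type Schedule = (fn: () => void, ms: number) => Cancel;

interface Lease {
  transport: "online" | "offline";
  leaseUntil: number;
  cancelEviction: Cancel | null;
}

class LeaseTable {
  readonly events: string[] = []; // stands in for broadcasts to other peers
  private readonly leases = new Map<string, Lease>();
  private readonly schedule: Schedule;
  private readonly now: () => number;
  private readonly graceMs: number;

  constructor(schedule: Schedule, now: () => number, graceMs = 90_000) {
    this.schedule = schedule;
    this.now = now;
    this.graceMs = graceMs;
  }

  /** WS open with an accepted hello. */
  onOpen(sessionPubkey: string, hasValidResumeToken: boolean): void {
    const lease = this.leases.get(sessionPubkey);
    lease?.cancelEviction?.(); // a new socket always cancels a pending eviction
    if (lease && hasValidResumeToken) {
      lease.cancelEviction = null;
      lease.transport = "online";
      lease.leaseUntil = this.now() + this.graceMs;
      return; // silent reattach: no peer_joined
    }
    this.leases.set(sessionPubkey, {
      transport: "online",
      leaseUntil: this.now() + this.graceMs,
      cancelEviction: null,
    });
    this.events.push(`peer_joined:${sessionPubkey}`);
  }

  /** WS close for any reason: enter grace instead of tearing down. */
  onClose(sessionPubkey: string): void {
    const lease = this.leases.get(sessionPubkey);
    if (!lease || lease.transport === "offline") return;
    lease.transport = "offline";
    lease.leaseUntil = this.now() + this.graceMs;
    lease.cancelEviction = this.schedule(() => this.evict(sessionPubkey), this.graceMs);
  }

  /** Grace expired without a reattach: peer_left fires exactly once. */
  private evict(sessionPubkey: string): void {
    if (!this.leases.delete(sessionPubkey)) return;
    this.events.push(`peer_left:${sessionPubkey}`);
  }
}
```

In a running process the scheduler could wrap `setTimeout`, for example `(fn, ms) => { const t = setTimeout(fn, ms); return () => clearTimeout(t); }`, with `Date.now` as the clock.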
### Resume token

A short opaque string the broker hands the daemon in `hello_ack`.
Format: `mesh-resume.v1.<base64url(JSON-payload)>.<base64url(sig)>`
where `JSON-payload = { sub: <sessionPubkey>, mid: <meshId>, exp:
<unix-ms>, iat: <unix-ms> }` and `sig = ed25519(brokerSigningKey,
JSON-payload)`.

- **Why a token, not just sessionPubkey?** A session needs to prove
  it's the holder of an existing lease without re-running the full
  attestation handshake (which involves member key + parent
  attestation lookup). The token is a server-issued cookie: cheap to
  verify, scoped to a single session, expires with the lease.
- **Storage:** broker keeps the signing key in env (`RESUME_TOKEN_KEY`,
  generated on first boot if missing, persisted to a config row). No
  DB column needed for the tokens themselves — they're verified by
  signature alone.
- **TTL:** equal to LEASE_TTL_MS (90s). After that the daemon must
  re-handshake with full attestation. Refreshed on every successful
  reattach.
- **Daemon storage:** in-memory only. Lost on daemon restart, which
  is correct: a daemon restart is a real reconnect and should run
  the full hello.
|
||||
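A minimal sketch of issuing and verifying a token in the format above, using Node's built-in `node:crypto` Ed25519 support. The function names are hypothetical, the signature here is computed over the serialized JSON bytes (one plausible reading of `ed25519(brokerSigningKey, JSON-payload)`), and an ephemeral key pair stands in for whatever `RESUME_TOKEN_KEY` actually holds.

```ts
import { generateKeyPairSync, sign, verify, KeyObject } from "node:crypto";

interface ResumePayload {
  sub: string; // sessionPubkey
  mid: string; // meshId
  exp: number; // unix ms
  iat: number; // unix ms
}

const PREFIX = "mesh-resume.v1";

function issueResumeToken(
  privateKey: KeyObject,
  sub: string,
  mid: string,
  now: number,
  ttlMs: number,
): string {
  const payload: ResumePayload = { sub, mid, iat: now, exp: now + ttlMs };
  const body = Buffer.from(JSON.stringify(payload), "utf8");
  const sig = sign(null, body, privateKey); // Ed25519 takes no digest algorithm
  return `${PREFIX}.${body.toString("base64url")}.${sig.toString("base64url")}`;
}

/** Returns the payload when the signature checks out and the token is fresh. */
function verifyResumeToken(
  publicKey: KeyObject,
  token: string,
  now: number,
): ResumePayload | null {
  if (!token.startsWith(PREFIX + ".")) return null;
  const parts = token.slice(PREFIX.length + 1).split(".");
  if (parts.length !== 2) return null;
  const body = Buffer.from(parts[0], "base64url");
  const sig = Buffer.from(parts[1], "base64url");
  if (!verify(null, body, publicKey, sig)) return null;
  const payload = JSON.parse(body.toString("utf8")) as ResumePayload;
  return payload.exp > now ? payload : null;
}

// Usage: an ephemeral key pair stands in for the broker's persisted signing key.
const demoKeys = generateKeyPairSync("ed25519");
const demoToken = issueResumeToken(demoKeys.privateKey, "session-pub", "mesh-1", 1_000, 90_000);
```

A caller would still need to compare `sub` and `mid` against the `sessionPubkey` and mesh named in the hello before treating the connection as a reattach.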
### Wire protocol additions

`hello` (member-WS, session-WS, fresh-launch hello — all three):
```diff
 {
   type: "hello",
   memberPubkey: "...",
   sessionPubkey: "...",            // session-WS only
   attestation: "...",              // session-WS only
   signature: "...",
+  resumeToken?: "mesh-resume.v1...", // optional; presence = reattach attempt
   ...
 }
```

`hello_ack`:
```diff
 {
   type: "hello_ack",
   presenceId: "...",
   ...
+  resumeToken: "mesh-resume.v1...", // always issued; replaces prior on reattach
+  leaseTtlMs: 90000,                // informational; daemon may use for ping cadence
 }
```

No new message types. Old daemons that don't send `resumeToken` get
today's full-handshake behavior — fully backward compatible.

### Message queue during grace window

Today: DMs to a presence whose WS is closed → routed to
`message_queue` only for `priority: low`; `now`/`next` either route
to a different connected session of the same member or drop.

Change: when broker would route to a session whose
`transport === "offline"` (lease still valid), enqueue regardless of
priority. On reattach, the existing inbox-drain path
(`maybePushQueuedMessages` at line 967) flushes them in order. The
`message_queue` already has the schema for this; we're just relaxing
the priority gate when the target is in grace.
|
||||
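The relaxed priority gate can be expressed as a pure routing decision. This sketch is illustrative only: `routeDm`, `Route`, and `Target` are hypothetical names, and `"fallback"` stands for today's behavior of trying another connected session of the same member or dropping.

```ts
type Priority = "now" | "next" | "low";
type Route = "deliver" | "enqueue" | "fallback";

interface Target {
  transport: "online" | "offline";
  leaseUntil: number; // unix ms
}

/** Decide what to do with a DM aimed at a specific session. */
function routeDm(target: Target | undefined, priority: Priority, now: number): Route {
  if (target !== undefined && target.transport === "online") return "deliver";
  // Offline but still leased: queue regardless of priority.
  const inGrace = target !== undefined && target.leaseUntil > now;
  if (inGrace) return "enqueue";
  // No lease: unchanged from today, only `low` is held in the queue.
  return priority === "low" ? "enqueue" : "fallback";
}
```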
### Constants

```ts
const LEASE_TTL_MS = 90_000;            // grace window after WS close
const PING_INTERVAL_MS = 30_000;        // unchanged
const STALE_PONG_THRESHOLD_MS = 75_000; // 2.5x ping interval
const RESUME_TOKEN_TTL_MS = LEASE_TTL_MS;
```

`LEASE_TTL_MS` = 90s rationale: long enough to absorb a sleep/resume
cycle, NAT timeout, ISP route flap, mobile→wifi handover. Short
enough that a true crash (daemon killed, machine off) clears the
session within 90s — peers don't see ghost online status forever.
Configurable via env (`LEASE_TTL_MS`) for self-hosted brokers.
|
||||
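To see how these constants interact, the sketch below encodes the staleness predicate and works out at which ping tick a silent connection is first flagged. The names `isStale` and `firstStaleTick` are hypothetical, and the calculation assumes the watchdog only runs on the 30s ping ticks, which is how the broker watchdog is described but may not hold for every implementation.

```ts
const PING_INTERVAL_MS = 30_000;
const STALE_PONG_THRESHOLD_MS = 75_000;

interface WatchedConn {
  transport: "online" | "offline";
  lastPongAt: number; // unix ms of the last pong
}

/** True when an online connection has gone longer than the threshold without a pong. */
function isStale(conn: WatchedConn, now: number): boolean {
  return conn.transport === "online" && now - conn.lastPongAt > STALE_PONG_THRESHOLD_MS;
}

/** First ping tick at which a connection with the given last pong is seen as stale. */
function firstStaleTick(lastPongAt: number, firstTickAt: number): number {
  let tick = firstTickAt;
  while (tick - lastPongAt <= STALE_PONG_THRESHOLD_MS) tick += PING_INTERVAL_MS;
  return tick;
}
```

Under that assumption, a connection whose last pong coincided with a tick is terminated three ticks later, so detection can take up to 90s before the grace window even starts.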
## Daemon changes

### Watchdog

In `ws-lifecycle.ts`, add an `idleWatchdog` parallel to the existing
backoff/reconnect machinery:

```ts
let lastActivity = Date.now(); // bumped on every incoming message + pong
const watchdog = setInterval(() => {
  if (Date.now() - lastActivity > STALE_THRESHOLD_MS) {
    log("warn", "ws_stale_terminate", { url: opts.url });
    sock.terminate(); // fires existing close handler → reconnect path
  } else if (sock.readyState === sock.OPEN) {
    sock.ping(); // matches broker's 30s cadence, gives broker a pong
  }
}, PING_INTERVAL_MS);
sock.on("message", () => { lastActivity = Date.now(); });
sock.on("pong", () => { lastActivity = Date.now(); });
```

Clean up with `clearInterval(watchdog)` in both the close handler and
the explicit `close()` path.

### Resume token in hello

`apps/cli/src/daemon/broker.ts:136` and equivalent in
`session-broker.ts`: persist the `resumeToken` from each successful
`hello_ack` into a private field, include it in the next
`buildHello()` call. On daemon restart the field is empty → cold
start, exactly today's behavior.
|
||||
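The token round-trip on the daemon side can be sketched as a small holder object. `ResumeState` and the trimmed-down `Hello` / `HelloAck` shapes are hypothetical; the real `buildHello()` carries more fields (signature, session attestation, and so on) that are omitted here.

```ts
// Trimmed-down message shapes for illustration; real messages carry more fields.
interface HelloAck {
  type: "hello_ack";
  resumeToken?: string; // absent when talking to an older broker
  leaseTtlMs?: number;
}

interface Hello {
  type: "hello";
  memberPubkey: string;
  resumeToken?: string;
}

class ResumeState {
  // In-memory only: a daemon restart starts with no token, forcing a full hello.
  private token: string | null = null;

  onHelloAck(ack: HelloAck): void {
    this.token = ack.resumeToken ?? null;
  }

  buildHello(memberPubkey: string): Hello {
    const hello: Hello = { type: "hello", memberPubkey };
    if (this.token !== null) hello.resumeToken = this.token;
    return hello;
  }
}
```

Overwriting the field on every ack keeps the daemon on the most recently issued token, which matters because the broker replaces the token on each reattach.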
### No CLI changes

`claudemesh peer list` keeps reading the broker's `connections` Map
which now reflects continuous presence. Users see online sessions as
online during transient blips. No UX surface changes.

## Migration

- New broker is fully backward compatible with old daemons (resume
  token is optional, defaults fall through to today's path).
- New daemons against an old broker: token is sent but ignored, full
  handshake runs each reconnect — same as today.
- DB migration: none. `presence` table semantics unchanged. The
  `disconnectedAt` column is now set only on lease eviction (>90s),
  not on every WS close. This is a behavioral change but not a
  schema change.
- Add ENV var `RESUME_TOKEN_KEY` (broker generates on first boot if
  unset, persists to a singleton config row).

## Test plan

1. **Sleep test:** kill -STOP the daemon for 60s, then kill -CONT.
   Expect: peers never see `peer_left`. Daemon's WS is dead-on-arrival
   when it wakes; watchdog terminates it; reconnect with resume_token
   succeeds within 1-2s; lease was at ~30s of its 90s TTL when the
   daemon resumed.

2. **Hard offline:** kill -STOP for 120s, kill -CONT. Expect: peers
   see exactly one `peer_left` at t=90s, then exactly one
   `peer_joined` after the daemon resumes and reconnects (resume
   token is now stale; full handshake runs).

3. **NAT drop simulation:** `iptables -A OUTPUT -p tcp --dport 443
   -j DROP` for 60s on the daemon host, then remove the rule. Expect:
   broker pings stop landing, broker-side watchdog calls
   `ws.terminate()` at t=75s, lease enters grace, daemon's own
   watchdog fires within ~30s, daemon reconnects with resume_token,
   peers never see a flap.

4. **Message-during-grace:** while a target session is in grace
   (offline, lease valid), send a `priority: now` DM. Expect: queued
   in `message_queue`, delivered exactly once on reattach, no
   `peer_left` visible to sender, ack returns delivered.

5. **Replay attack:** capture a resume_token in flight, replay it
   against a different broker connection while the original session
   is still online. Expect: broker treats it as a reconnect for an
   already-online session → closes old WS with `session_replaced`,
   new WS takes over. Equivalent to today's session-replacement
   semantics; the original session detects the close and either
   reconnects (if it's still alive) or gives up.

6. **Token forgery:** send a `resumeToken` not signed by the broker.
   Expect: signature check fails, broker treats hello as a fresh
   handshake (or rejects if the rest of the hello is invalid).

## Open questions

- **Should `peer list` expose a `transport` field** so callers can
  distinguish "leased but offline" from "online"? Default no — the
  abstraction we're selling is "they're online." But debugging may
  want it; gate it behind `--all` or `--debug`.
- **What about the broker-side `mcpRegistry` cleanup?** Today we
  delete non-persistent MCP entries on WS close (line 5217). With
  leases, we should defer that to lease eviction, not WS close.
  Otherwise an MCP server registered by a session disappears every
  time its WS reconnects.

## Build order

1. **Broker lease model** — change `connections` keying, add
   `transport`/`leaseUntil`/`evictionTimer`, refactor close handler
   to start grace timer instead of immediate teardown, refactor
   eviction path. (~80 lines.)
2. **Resume token** — signing key bootstrap, token issue/verify,
   wire format, hello_ack changes. (~50 lines + 1 config row.)
3. **Daemon watchdog** — `ws-lifecycle.ts` adds `idleWatchdog` and
   stores `resumeToken` from acks. (~25 lines.)
4. **Daemon hello** — pass `resumeToken` in next `buildHello()`.
   (~10 lines across `broker.ts` + `session-broker.ts`.)
5. **Broker watchdog** — extend the 30s ping loop with
   `terminate()`-on-stale logic. (~15 lines.)
6. **Queue-during-grace** — relax priority gate in DM routing.
   (~5 lines.)
7. **Spec docs** — update `docs/protocol.md` with resume_token,
   lease semantics. (~30 lines.)
8. **Tests** — six scenarios above. Likely ~3 new test files.

Estimated total: one focused day. The broker lease model is the
load-bearing change; everything else slots in cleanly once that's done.
@@ -427,6 +427,21 @@ export async function heartbeat(presenceId: string): Promise<void> {
    .where(eq(presence.id, presenceId));
}

/**
 * Restore a presence row to online state on lease reattach: clear
 * `disconnectedAt` and bump `lastPingAt`. Needed because the DB-level
 * stale-presence sweeper may have flipped the row to disconnected
 * during the grace window — the lease is in-memory truth, but other
 * code paths read presence.disconnectedAt directly.
 */
export async function restorePresence(presenceId: string): Promise<void> {
  const now = new Date();
  await db
    .update(presence)
    .set({ disconnectedAt: null, lastPingAt: now })
    .where(eq(presence.id, presenceId));
}

// --- Peer discovery ---

/** Return all active (connected) presences in a mesh, joined with member info. */
@@ -41,6 +41,7 @@ import {
  grantFileKey,
  handleHookSetStatus,
  heartbeat,
  restorePresence,
  insertFileKeys,
  joinGroup,
  joinMesh,
@@ -156,11 +157,53 @@ interface PeerConn {
    bio?: string;
    capabilities?: string[];
  };
  /** v2 agentic-comms presence taxonomy. Mirrors the value passed to
   * `recordPresence`. Used by the kick handler to refuse no-op kicks
   * on long-lived control-plane connections (daemon, dashboard) that
   * would just auto-reconnect. */
  peerRole: "control-plane" | "session" | "service";
  /** Last time this connection's WS replied to a broker ping. Bumped
   * in the `pong` handler. Used by the staleness watchdog to detect
   * half-dead TCP/NAT-dropped connections that the kernel hasn't yet
   * RST'd (Linux default keepalive ≈ 2hrs). */
  lastPongAt: number;
  /** Lease state: "online" while the WS is healthy, "offline" during
   * the GRACE window after a WS close. While offline, the entry stays
   * in `connections` so peer_list / sendToPeer still see it; DMs land
   * in the message_queue (sendToPeer no-ops on dead WS, but the queue
   * row stays with deliveredAt=NULL and drains on reattach). After
   * GRACE_MS without a reattach, evictionTimer fires the full peer_left
   * + cleanup. Reattach (same sessionPubkey hello arriving on a fresh
   * WS) cancels the timer, swaps in the new ws, restores online. */
  leaseState: "online" | "offline";
  /** When the lease will be evicted if no reattach happens. 0 when online. */
  leaseUntil: number;
  /** Timer that fires evictPresenceFully(presenceId) at leaseUntil. null when online. */
  evictionTimer: NodeJS.Timeout | null;
}

const connections = new Map<string, PeerConn>();
const connectionsPerMesh = new Map<string, number>();

/**
 * Lease grace window — how long after a WS close the broker will hold
 * the presence row open before evicting and broadcasting peer_left.
 *
 * 90s: long enough to absorb a sleep/resume cycle, NAT timeout, ISP
 * route flap, mobile→wifi handover, broker restart of the daemon's
 * machine. Short enough that a true crash (machine off, daemon killed)
 * clears the session within 90s — peers don't see ghost online status
 * forever.
 *
 * During grace: lease stays in `connections`, peer_list keeps showing
 * the session as online to other peers, DMs route through message_queue
 * (sendToPeer no-ops on dead WS, drain happens on reattach). On
 * reattach (same sessionPubkey hello on a new WS): silent swap, no
 * peer_joined / peer_left visible to anyone. After grace expires:
 * full eviction (peer_left + cleanup) fires exactly once.
|
||||
*/
|
||||
const GRACE_MS = 90_000;
|
||||
|
||||
// Rate limiter for /tg/token endpoint (IP → count, cleared hourly)
|
||||
const tgTokenRateLimit = new Map<string, number>();
|
||||
setInterval(() => tgTokenRateLimit.clear(), 60 * 60_000).unref();
|
||||
@@ -525,6 +568,97 @@ function sendToPeer(presenceId: string, msg: WSServerMessage): void {
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Run the full presence-cleanup path: broadcast peer_left, decMeshCount,
|
||||
* disconnectPresence in DB, audit, clean up URL watches / streams /
|
||||
* MCP entries / clock. Removes the entry from `connections`.
|
||||
*
|
||||
* Called from two places:
|
||||
* 1. `ws.on("close")` when the closing WS belongs to a connection
|
||||
* with no active lease (no grace) — i.e. the lease had already
|
||||
* been evicted, or the close fires before lease is established.
|
||||
* 2. The grace-window evictionTimer when no reattach happened in
|
||||
* GRACE_MS. This is the "presence is really gone" path.
|
||||
*
|
||||
* Idempotent: re-entering when the connections entry is already gone
|
||||
* is a no-op.
|
||||
*/
|
||||
async function evictPresenceFully(presenceId: string): Promise<void> {
|
||||
const conn = connections.get(presenceId);
|
||||
if (!conn) return; // already evicted
|
||||
if (conn.evictionTimer) {
|
||||
clearTimeout(conn.evictionTimer);
|
||||
conn.evictionTimer = null;
|
||||
}
|
||||
connections.delete(presenceId);
|
||||
decMeshCount(conn.meshId);
|
||||
|
||||
const leaveMsg: WSPushMessage = {
|
||||
type: "push",
|
||||
subtype: "system",
|
||||
event: "peer_left",
|
||||
eventData: {
|
||||
name: conn.displayName,
|
||||
pubkey: conn.sessionPubkey ?? conn.memberPubkey,
|
||||
},
|
||||
messageId: crypto.randomUUID(),
|
||||
meshId: conn.meshId,
|
||||
senderPubkey: "system",
|
||||
priority: "low",
|
||||
nonce: "",
|
||||
ciphertext: "",
|
||||
createdAt: new Date().toISOString(),
|
||||
};
|
||||
for (const [pid, peer] of connections) {
|
||||
if (peer.meshId !== conn.meshId) continue;
|
||||
// Don't tell the user's own other sessions they "left" when one
|
||||
// of their Claude Code instances closes. Same pubkey = same user.
|
||||
if (peer.memberPubkey === conn.memberPubkey) continue;
|
||||
sendToPeer(pid, leaveMsg);
|
||||
}
|
||||
|
||||
await disconnectPresence(presenceId);
|
||||
void audit(conn.meshId, "peer_left", conn.memberId, conn.displayName, {});
|
||||
|
||||
// URL watches owned by this presence — interval would otherwise
|
||||
// happily fetch forever after the peer is gone.
|
||||
for (const [watchId, watch] of urlWatches) {
|
||||
if (watch.presenceId === presenceId) {
|
||||
clearInterval(watch.timer);
|
||||
urlWatches.delete(watchId);
|
||||
}
|
||||
}
|
||||
// Stream subscriptions for this presence.
|
||||
for (const [key, subs] of streamSubscriptions) {
|
||||
subs.delete(presenceId);
|
||||
if (subs.size === 0) streamSubscriptions.delete(key);
|
||||
}
|
||||
// MCP servers registered by this presence.
|
||||
for (const [key, entry] of mcpRegistry) {
|
||||
if (entry.presenceId === presenceId) {
|
||||
if (entry.persistent) {
|
||||
// Keep persistent entries but mark offline
|
||||
entry.online = false;
|
||||
entry.offlineSince = new Date().toISOString();
|
||||
entry.presenceId = "";
|
||||
} else {
|
||||
mcpRegistry.delete(key);
|
||||
}
|
||||
}
|
||||
}
|
||||
// Auto-pause clock when mesh becomes empty.
|
||||
if (!connectionsPerMesh.has(conn.meshId)) {
|
||||
const clock = meshClocks.get(conn.meshId);
|
||||
if (clock && clock.timer) {
|
||||
clearInterval(clock.timer);
|
||||
clock.timer = null;
|
||||
clock.paused = true;
|
||||
log.info("clock auto-paused (mesh empty)", { mesh_id: conn.meshId });
|
||||
}
|
||||
}
|
||||
log.info("ws evict full", { presence_id: presenceId });
|
||||
}
|
||||
|
||||
async function maybePushQueuedMessages(
|
||||
presenceId: string,
|
||||
excludeSenderSessionPubkey?: string,
|
||||
@@ -1661,6 +1795,10 @@ async function handleHello(
|
||||
lastSeenAt?: string;
|
||||
restoredGroups?: Array<{ name: string; role?: string }>;
|
||||
restoredStats?: unknown;
|
||||
/** True when this hello reattached an existing offline lease — caller
|
||||
* must skip the peer_joined broadcast and the services-list ack
|
||||
* augmentation. The session was never visibly absent from peers. */
|
||||
silent?: boolean;
|
||||
} | null> {
|
||||
// Validate sessionPubkey shape — it becomes a routable identity in
|
||||
// listPeers/drainForMember, so arbitrary strings let a client claim
|
||||
@@ -1753,6 +1891,61 @@ async function handleHello(
|
||||
const initialGroups = helloHasGroups
|
||||
? hello.groups!
|
||||
: (saved?.groups?.length ? saved.groups : (member.defaultGroups ?? []));
|
||||
// Reattach check: if an offline lease exists for the same
|
||||
// stable identity (sessionPubkey when present, otherwise sessionId
|
||||
// for member-WS), this hello is a transient reconnect within the
|
||||
// grace window — swap the WS reference, clear the eviction timer,
|
||||
// restore online state. No peer_joined broadcast — peers never saw
|
||||
// this session leave.
|
||||
for (const [pid, oldConn] of connections) {
|
||||
if (oldConn.meshId !== hello.meshId) continue;
|
||||
if (oldConn.leaseState !== "offline") continue;
|
||||
const matchByPubkey =
|
||||
!!hello.sessionPubkey
|
||||
&& oldConn.sessionPubkey === hello.sessionPubkey;
|
||||
const matchBySessionId =
|
||||
!hello.sessionPubkey
|
||||
&& !oldConn.sessionPubkey
|
||||
&& oldConn.sessionId === hello.sessionId
|
||||
&& oldConn.memberPubkey === hello.pubkey;
|
||||
if (!matchByPubkey && !matchBySessionId) continue;
|
||||
|
||||
if (oldConn.evictionTimer) {
|
||||
clearTimeout(oldConn.evictionTimer);
|
||||
oldConn.evictionTimer = null;
|
||||
}
|
||||
oldConn.ws = ws;
|
||||
oldConn.leaseState = "online";
|
||||
oldConn.leaseUntil = 0;
|
||||
oldConn.lastPongAt = Date.now();
|
||||
// Refresh mutable fields from the new hello — the same session may
|
||||
// have moved cwd / changed display name across the blip.
|
||||
oldConn.cwd = hello.cwd;
|
||||
if (hello.displayName) oldConn.displayName = hello.displayName;
|
||||
log.info("ws hello reattach (lease)", {
|
||||
presence_id: pid,
|
||||
session_pubkey: hello.sessionPubkey?.slice(0, 12) ?? "(member-WS)",
|
||||
session_id: hello.sessionId,
|
||||
});
|
||||
// Reset DB row to online: the stale-presence sweeper may have set
|
||||
// disconnectedAt during the grace window. Lease is in-memory truth
|
||||
// but downstream code paths read presence.disconnectedAt directly.
|
||||
void restorePresence(pid);
|
||||
// Drain any queued DMs that landed during the offline window.
|
||||
void maybePushQueuedMessages(pid);
|
||||
return {
|
||||
presenceId: pid,
|
||||
memberDisplayName: oldConn.displayName,
|
||||
memberProfile: {
|
||||
roleTag: member.roleTag,
|
||||
groups: member.defaultGroups ?? [],
|
||||
messageMode: member.messageMode ?? "push",
|
||||
},
|
||||
meshPolicy,
|
||||
silent: true,
|
||||
};
|
||||
}
|
||||
|
||||
// Session-id dedup: if this session_id already has an active presence,
|
||||
// disconnect the ghost. Happens when a client reconnects after a
|
||||
// network blip or broker restart before the 90s stale sweeper runs.
|
||||
@@ -1797,6 +1990,11 @@ async function handleHello(
|
||||
groups: initialGroups,
|
||||
visible: saved?.visible ?? true,
|
||||
profile: saved?.profile ?? {},
|
||||
peerRole: "control-plane",
|
||||
lastPongAt: Date.now(),
|
||||
leaseState: "online",
|
||||
leaseUntil: 0,
|
||||
evictionTimer: null,
|
||||
});
|
||||
incMeshCount(hello.meshId);
|
||||
void audit(hello.meshId, "peer_joined", member.id, effectiveDisplayName, {
|
||||
@@ -1853,6 +2051,10 @@ async function handleSessionHello(
|
||||
memberDisplayName: string;
|
||||
memberProfile?: unknown;
|
||||
meshPolicy?: Record<string, unknown>;
|
||||
/** True when this hello reattached an existing offline lease — caller
|
||||
* must skip the peer_joined broadcast. The session was never visibly
|
||||
* absent from peers. */
|
||||
silent?: boolean;
|
||||
} | null> {
|
||||
// Shape checks. The crypto helpers also enforce these but bailing
|
||||
// early gives a clearer error code on the wire.
|
||||
@@ -1982,6 +2184,42 @@ async function handleSessionHello(
|
||||
|
||||
const initialGroups = hello.groups ?? member.defaultGroups ?? [];
|
||||
|
||||
// Reattach check: an offline-leased connection with the same
|
||||
// sessionPubkey is the same launched session resuming inside the
|
||||
// grace window. Cancel the eviction timer, swap the WS, restore
|
||||
// online state. No peer_joined broadcast — peers never saw the
|
||||
// session leave.
|
||||
for (const [pid, oldConn] of connections) {
|
||||
if (oldConn.meshId !== hello.meshId) continue;
|
||||
if (oldConn.leaseState !== "offline") continue;
|
||||
if (oldConn.sessionPubkey !== hello.sessionPubkey) continue;
|
||||
|
||||
if (oldConn.evictionTimer) {
|
||||
clearTimeout(oldConn.evictionTimer);
|
||||
oldConn.evictionTimer = null;
|
||||
}
|
||||
oldConn.ws = ws;
|
||||
oldConn.leaseState = "online";
|
||||
oldConn.leaseUntil = 0;
|
||||
oldConn.lastPongAt = Date.now();
|
||||
// Refresh mutable fields from the new hello.
|
||||
oldConn.cwd = hello.cwd;
|
||||
if (hello.displayName) oldConn.displayName = hello.displayName;
|
||||
log.info("session_hello reattach (lease)", {
|
||||
presence_id: pid,
|
||||
session_pubkey: hello.sessionPubkey.slice(0, 12),
|
||||
});
|
||||
void restorePresence(pid);
|
||||
void maybePushQueuedMessages(pid);
|
||||
return {
|
||||
presenceId: pid,
|
||||
memberDisplayName: oldConn.displayName,
|
||||
memberProfile: undefined,
|
||||
meshPolicy,
|
||||
silent: true,
|
||||
};
|
||||
}
|
||||
|
||||
// Session-id dedup: if the same session_id is already connected, kick
|
||||
// the ghost. Reconnect after a network blip lands here cleanly.
|
||||
for (const [oldPid, oldConn] of connections) {
|
||||
@@ -2022,6 +2260,11 @@ async function handleSessionHello(
|
||||
groups: initialGroups,
|
||||
visible: true,
|
||||
profile: {},
|
||||
peerRole: "session",
|
||||
lastPongAt: Date.now(),
|
||||
leaseState: "online",
|
||||
leaseUntil: 0,
|
||||
evictionTimer: null,
|
||||
});
|
||||
incMeshCount(hello.meshId);
|
||||
void audit(hello.meshId, "peer_joined", member.id, effectiveDisplayName, {
|
||||
@@ -2420,8 +2663,10 @@ function handleConnection(ws: WebSocket): void {
|
||||
}
|
||||
// Broadcast peer_joined to siblings — same shape as the regular
|
||||
// hello path, so list_peers consumers don't need to special-case.
|
||||
// Skipped on lease reattach: the session was never visibly absent,
|
||||
// so no synthetic join event should fire.
|
||||
const joinedConn = connections.get(presenceId);
|
||||
if (joinedConn) {
|
||||
if (joinedConn && !result.silent) {
|
||||
const joinMsg: WSPushMessage = {
|
||||
type: "push",
|
||||
subtype: "system",
|
||||
@@ -2504,9 +2749,11 @@ function handleConnection(ws: WebSocket): void {
|
||||
} catch {
|
||||
/* ws closed during hello */
|
||||
}
|
||||
// Broadcast peer_joined or peer_returned to all other peers in the same mesh.
|
||||
// Broadcast peer_joined or peer_returned to all other peers in the
|
||||
// same mesh. Skipped on lease reattach: the session never appeared
|
||||
// offline so no synthetic join event should fire.
|
||||
const joinedConn = connections.get(presenceId);
|
||||
if (joinedConn) {
|
||||
if (joinedConn && !result.silent) {
|
||||
const isReturning = !!result.restored;
|
||||
const joinMsg: WSPushMessage = {
|
||||
type: "push",
|
||||
@@ -4645,11 +4892,30 @@ function handleConnection(ws: WebSocket): void {
|
||||
}
|
||||
|
||||
const affected: string[] = [];
|
||||
// 1.34.15 (gap #3a): kick was a no-op against long-lived
|
||||
// control-plane connections (daemon, dashboard) — closing
|
||||
// their WS just triggered the auto-reconnect loop, the
|
||||
// kicker's CLI rendered "Their Claude Code session ended"
|
||||
// (which was misleading), and the user-visible state was
|
||||
// unchanged seconds later. We now refuse to close control-
|
||||
// plane WSes and surface the skipped peers in a new
|
||||
// additive ack field. Pre-1.34.15 CLI clients only read
|
||||
// `kicked`/`affected`, so this stays back-compat.
|
||||
//
|
||||
// For `kick`-only: the soft `disconnect` verb still closes
|
||||
// control-plane WSes intentionally — that's what users want
|
||||
// when they're nudging a peer to re-authenticate.
|
||||
const skippedControlPlane: string[] = [];
|
||||
const skipControlPlane = isKick;
|
||||
const now = Date.now();
|
||||
|
||||
if (km.all) {
|
||||
for (const [pid, peer] of connections) {
|
||||
if (peer.meshId !== conn.meshId || pid === presenceId) continue;
|
||||
if (skipControlPlane && peer.peerRole === "control-plane") {
|
||||
skippedControlPlane.push(peer.displayName || pid);
|
||||
continue;
|
||||
}
|
||||
try { peer.ws.close(closeCode, closeReason); } catch {}
|
||||
connections.delete(pid);
|
||||
void disconnectPresence(pid);
|
||||
@@ -4661,6 +4927,10 @@ function handleConnection(ws: WebSocket): void {
|
||||
if (peer.meshId !== conn.meshId || pid === presenceId) continue;
|
||||
const [pres] = await db.select({ lastPingAt: presence.lastPingAt }).from(presence).where(eq(presence.id, pid)).limit(1);
|
||||
if (pres && pres.lastPingAt && pres.lastPingAt.getTime() < cutoff) {
|
||||
if (skipControlPlane && peer.peerRole === "control-plane") {
|
||||
skippedControlPlane.push(peer.displayName || pid);
|
||||
continue;
|
||||
}
|
||||
try { peer.ws.close(closeCode, `${closeReason}_stale`); } catch {}
|
||||
connections.delete(pid);
|
||||
void disconnectPresence(pid);
|
||||
@@ -4671,6 +4941,10 @@ function handleConnection(ws: WebSocket): void {
|
||||
for (const [pid, peer] of connections) {
|
||||
if (peer.meshId !== conn.meshId) continue;
|
||||
if (peer.displayName === km.target || peer.memberPubkey === km.target || peer.memberPubkey.startsWith(km.target)) {
|
||||
if (skipControlPlane && peer.peerRole === "control-plane") {
|
||||
skippedControlPlane.push(peer.displayName || pid);
|
||||
continue;
|
||||
}
|
||||
try { peer.ws.close(closeCode, closeReason); } catch {}
|
||||
connections.delete(pid);
|
||||
void disconnectPresence(pid);
|
||||
@@ -4679,8 +4953,20 @@ function handleConnection(ws: WebSocket): void {
|
||||
}
|
||||
}
|
||||
|
||||
conn.ws.send(JSON.stringify({ type: ackType, kicked: affected, affected, _reqId: km._reqId }));
|
||||
log.info(`ws ${closeReason}`, { presence_id: presenceId, count: affected.length, target: km.target ?? km.stale ?? "all" });
|
||||
conn.ws.send(JSON.stringify({
|
||||
type: ackType,
|
||||
kicked: affected,
|
||||
affected,
|
||||
// Additive — older CLI clients ignore this field.
|
||||
...(skippedControlPlane.length > 0 ? { skipped_control_plane: skippedControlPlane } : {}),
|
||||
_reqId: km._reqId,
|
||||
}));
|
||||
log.info(`ws ${closeReason}`, {
|
||||
presence_id: presenceId,
|
||||
count: affected.length,
|
||||
target: km.target ?? km.stale ?? "all",
|
||||
skipped_control_plane: skippedControlPlane.length,
|
||||
});
|
||||
break;
|
||||
}
|
||||
|
||||
@@ -5108,88 +5394,52 @@ function handleConnection(ws: WebSocket): void {
|
||||
}
|
||||
});
|
||||
ws.on("close", async () => {
|
||||
if (presenceId) {
|
||||
const conn = connections.get(presenceId);
|
||||
// Persist peer state BEFORE removing from connections.
|
||||
if (conn) {
|
||||
await savePeerState(conn, conn.memberId, conn.meshId);
|
||||
}
|
||||
connections.delete(presenceId);
|
||||
if (conn) {
|
||||
decMeshCount(conn.meshId);
|
||||
// Broadcast peer_left to remaining peers in the same mesh.
|
||||
const leaveMsg: WSPushMessage = {
|
||||
type: "push",
|
||||
subtype: "system",
|
||||
event: "peer_left",
|
||||
eventData: {
|
||||
name: conn.displayName,
|
||||
pubkey: conn.sessionPubkey ?? conn.memberPubkey,
|
||||
},
|
||||
messageId: crypto.randomUUID(),
|
||||
meshId: conn.meshId,
|
||||
senderPubkey: "system",
|
||||
priority: "low",
|
||||
nonce: "",
|
||||
ciphertext: "",
|
||||
createdAt: new Date().toISOString(),
|
||||
};
|
||||
for (const [pid, peer] of connections) {
|
||||
if (peer.meshId !== conn.meshId) continue;
|
||||
// Don't tell the user's own other sessions they "left" when one
|
||||
// of their Claude Code instances closes. Same pubkey = same user.
|
||||
if (peer.memberPubkey === conn.memberPubkey) continue;
|
||||
sendToPeer(pid, leaveMsg);
|
||||
}
|
||||
}
|
||||
await disconnectPresence(presenceId);
|
||||
if (conn) {
|
||||
void audit(conn.meshId, "peer_left", conn.memberId, conn.displayName, {});
|
||||
}
|
||||
// Clean up URL watches owned by this peer — the interval was
|
||||
// happily fetching forever after the peer disconnected.
|
||||
for (const [watchId, watch] of urlWatches) {
|
||||
if (watch.presenceId === presenceId) {
|
||||
clearInterval(watch.timer);
|
||||
urlWatches.delete(watchId);
|
||||
}
|
||||
}
|
||||
// Clean up stream subscriptions for this peer
|
||||
for (const [key, subs] of streamSubscriptions) {
|
||||
subs.delete(presenceId);
|
||||
if (subs.size === 0) streamSubscriptions.delete(key);
|
||||
}
|
||||
// Clean up MCP servers registered by this peer
|
||||
for (const [key, entry] of mcpRegistry) {
|
||||
if (entry.presenceId === presenceId) {
|
||||
if (entry.persistent) {
|
||||
// Keep persistent entries but mark offline
|
||||
entry.online = false;
|
||||
entry.offlineSince = new Date().toISOString();
|
||||
entry.presenceId = "";
|
||||
} else {
|
||||
mcpRegistry.delete(key);
|
||||
}
|
||||
}
|
||||
}
|
||||
// Auto-pause clock when mesh becomes empty
|
||||
if (conn && !connectionsPerMesh.has(conn.meshId)) {
|
||||
const clock = meshClocks.get(conn.meshId);
|
||||
if (clock && clock.timer) {
|
||||
clearInterval(clock.timer);
|
||||
clock.timer = null;
|
||||
clock.paused = true;
|
||||
log.info("clock auto-paused (mesh empty)", { mesh_id: conn.meshId });
|
||||
}
|
||||
}
|
||||
log.info("ws close", { presence_id: presenceId });
|
||||
if (!presenceId) return;
|
||||
const conn = connections.get(presenceId);
|
||||
if (!conn) return; // already evicted
|
||||
|
||||
// If the conn's `ws` is no longer THIS ws, the close belongs to an
|
||||
// older socket that was already replaced by a reattach. Ignore — the
|
||||
// lease is healthy with the new WS, no eviction needed.
|
||||
if (conn.ws !== ws) {
|
||||
log.debug("ws close on replaced socket — ignoring", { presence_id: presenceId });
|
||||
return;
|
||||
}
|
||||
|
||||
await savePeerState(conn, conn.memberId, conn.meshId);
|
||||
|
||||
// If lease is currently online, enter grace. Other peers see the
|
||||
// session as still online; DMs queue (sendToPeer no-ops on dead
|
||||
// WS, drain on reattach). After GRACE_MS without a reattach, the
|
||||
// timer fires evictPresenceFully and cleanup runs as before.
|
||||
const pid = presenceId;
|
||||
if (conn.leaseState === "online") {
|
||||
conn.leaseState = "offline";
|
||||
conn.leaseUntil = Date.now() + GRACE_MS;
|
||||
conn.evictionTimer = setTimeout(() => {
|
||||
log.info("lease grace expired — evicting", { presence_id: pid });
|
||||
void evictPresenceFully(pid);
|
||||
}, GRACE_MS);
|
||||
log.info("ws close — lease grace started", {
|
||||
presence_id: pid,
|
||||
grace_ms: GRACE_MS,
|
||||
});
|
||||
return;
|
||||
}
|
||||
|
||||
// Not online (already in grace from an earlier close, or odd state).
|
||||
// Run full eviction immediately.
|
||||
await evictPresenceFully(pid);
|
||||
});
|
||||
ws.on("error", (err) => {
|
||||
log.warn("ws error", { error: err.message });
|
||||
});
|
||||
ws.on("pong", () => {
|
||||
if (presenceId) void heartbeat(presenceId);
|
||||
if (presenceId) {
|
||||
const conn = connections.get(presenceId);
|
||||
if (conn) conn.lastPongAt = Date.now();
|
||||
void heartbeat(presenceId);
|
||||
}
|
||||
});
|
||||
}
|
||||
|
||||
@@ -5381,10 +5631,29 @@ async function main(): Promise<void> {
|
||||
});
|
||||
});
|
||||
|
||||
// WS heartbeat ping every 30s; clients reply with pong → bumps lastPingAt.
|
||||
// WS heartbeat ping every 30s; clients reply with pong → bumps
|
||||
// lastPongAt. Connections whose pong is older than 75s (2.5x the
|
||||
// ping interval) are considered half-dead — kernel hasn't yet RST'd
|
||||
// the socket but no application traffic is flowing. Force-terminate
|
||||
// them to fire the close handler and free the connection slot.
|
||||
const STALE_PONG_THRESHOLD_MS = 75_000;
|
||||
const pingInterval = setInterval(() => {
|
||||
for (const { ws } of connections.values()) {
|
||||
if (ws.readyState === ws.OPEN) ws.ping();
|
||||
const now = Date.now();
|
||||
for (const [pid, conn] of connections) {
|
||||
// Skip offline-leased entries: their WS is intentionally dead
|
||||
// during grace; the eviction timer handles their lifecycle.
|
||||
if (conn.leaseState === "offline") continue;
|
||||
const { ws } = conn;
|
||||
if (ws.readyState !== ws.OPEN) continue;
|
||||
if (now - conn.lastPongAt > STALE_PONG_THRESHOLD_MS) {
|
||||
log.warn("ws stale terminate", {
|
||||
presence_id: pid,
|
||||
last_pong_ago_ms: now - conn.lastPongAt,
|
||||
});
|
||||
try { ws.terminate(); } catch { /* socket already gone */ }
|
||||
continue;
|
||||
}
|
||||
ws.ping();
|
||||
}
|
||||
}, 30_000);
|
||||
pingInterval.unref();
|
||||
|
||||
47
apps/broker/tests/kick-control-plane-skip.test.ts
Normal file
@@ -0,0 +1,47 @@
/**
 * Kick control-plane skip: 1.34.15 (gap #3a) refuses to close
 * long-lived control-plane connections (claudemesh daemon, dashboard)
 * via `kick`, because they auto-reconnect within seconds and the verb
 * was effectively a no-op. The soft `disconnect` verb keeps the old
 * behavior so users can still nudge a control-plane peer to
 * re-authenticate.
 *
 * Pure-logic test — mirrors the branch inside handleSend's kick case
 * without spinning up a broker. Same pattern as
 * grants-enforcement.test.ts.
 */

import { describe, expect, test } from "vitest";

type PeerRole = "control-plane" | "session" | "service";

/** Mirrors the predicate inserted into the kick handler. */
function shouldSkipKick(args: {
  verb: "kick" | "disconnect";
  peerRole: PeerRole;
}): boolean {
  const skipControlPlane = args.verb === "kick";
  return skipControlPlane && args.peerRole === "control-plane";
}

describe("kick control-plane skip (gap #3a)", () => {
  test("kick on control-plane → skipped (would auto-reconnect)", () => {
    expect(shouldSkipKick({ verb: "kick", peerRole: "control-plane" })).toBe(true);
  });

  test("kick on session → not skipped (closes user session)", () => {
    expect(shouldSkipKick({ verb: "kick", peerRole: "session" })).toBe(false);
  });

  test("kick on service → not skipped", () => {
    expect(shouldSkipKick({ verb: "kick", peerRole: "service" })).toBe(false);
  });

  test("disconnect on control-plane → not skipped (intentional nudge)", () => {
    expect(shouldSkipKick({ verb: "disconnect", peerRole: "control-plane" })).toBe(false);
  });

  test("disconnect on session → not skipped", () => {
    expect(shouldSkipKick({ verb: "disconnect", peerRole: "session" })).toBe(false);
  });
});
@@ -1,5 +1,841 @@
# Changelog

## 1.34.15 (2026-05-04) — `peer list --mesh` actually scopes + `kick` refuses control-plane

Two follow-ups from the 1.34.x train, both backwards-compatible.

### `peer list --mesh <slug>` no longer aggregates across meshes

`apps/cli/src/commands/peers.ts:140` was calling
`tryListPeersViaDaemon()` with no argument, so a multi-mesh daemon
returned peers from EVERY attached mesh and the renderer printed
"peers on flexicar" with cross-mesh rows mixed in. The daemon's
`/v1/peers?mesh=<slug>` filter (server-side, since 1.26.0) was
already correctly scoping when the slug was passed; the CLI just
wasn't passing it. Fixed.

`apps/cli/src/commands/launch.ts:407` (the `printBrokerWelcome` peer
count in the launch banner) had the same bug. The "N peers online"
line in the welcome now shows the count for the launched mesh only.

`apps/cli/src/commands/send.ts` cross-mesh hex-prefix resolution is
intentionally cross-mesh (the user is targeting by hex without
specifying a mesh) and was deliberately left as-is.
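
The scoping fix amounts to threading the mesh slug into the daemon request. A minimal sketch of that idea follows; `peersPath` is a hypothetical helper (not the CLI's actual function), and only the `/v1/peers?mesh=<slug>` route shape comes from the entry above.

```typescript
// Hypothetical helper: builds the daemon request path for a peer listing.
// Omitting the slug reproduces the old aggregate-across-meshes behavior.
function peersPath(meshSlug?: string): string {
  const base = "/v1/peers";
  return meshSlug ? `${base}?mesh=${encodeURIComponent(meshSlug)}` : base;
}
```

Calling `peersPath("flexicar")` yields a path the daemon can filter server-side, while `peersPath()` leaves the listing unscoped.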

### `claudemesh kick` refuses no-op kicks on control-plane connections

Pre-1.34.15, kicking a daemon's member-WS or a dashboard connection
just closed the socket — the daemon's WS-lifecycle reconnect loop
brought it back within seconds, the kicker's CLI rendered "Their
Claude Code session ended" (which was misleading), and the
user-visible state was unchanged. The verb was effectively a no-op, but
the user had to learn that the hard way.

The broker's kick handler (`apps/broker/src/index.ts:4628+`) now
skips peers where `peerRole === "control-plane"` and surfaces the
skipped peers in a new additive ack field `skipped_control_plane`.
The soft `disconnect` verb keeps the old behavior — useful when
intentionally nudging a control-plane peer to re-authenticate.

The CLI (`apps/cli/src/commands/kick.ts`) reads the new field and
prints a clearer message: refused peers are listed, with the hint
that `claudemesh ban <peer>` is the right tool to remove a member,
or `claudemesh daemon down` to take a daemon offline locally.

`apps/broker/src/index.ts` adds `peerRole` to the in-memory
`PeerConn` shape, populated from both connection paths
(member-keyed `hello` → `"control-plane"`, per-launch
`session_hello` → `"session"`). The DB-side role taxonomy is
unchanged.

### Back-compat

- Older CLI clients ignore the new `skipped_control_plane` ack
  field; their kick continues to print "Kicked 0 peer(s)" against
  a control-plane target as before.
- Older brokers don't emit the field at all; newer CLI handles
  the absence (the new branch is only reached when the field is
  present and non-empty).
- The new `peerRole` slot on `PeerConn` is filled at every
  `connections.set` callsite, so older code paths never read
  `undefined`.
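
The additive-field contract described above can be illustrated with a small sketch. It assumes a simplified peer shape; `buildKickAck` and `describeAck` are hypothetical helpers, and the exact wording the real CLI prints for refused peers is not shown in this entry, so the message text here is only a placeholder.

```typescript
type PeerRole = "control-plane" | "session" | "service";

interface Peer {
  displayName: string;
  peerRole: PeerRole;
}

interface KickAck {
  kicked: string[];
  affected: string[];
  // Additive: absent when nothing was skipped, and absent on older brokers.
  skipped_control_plane?: string[];
}

// Broker side (sketch): `kick` skips control-plane peers, `disconnect` does not.
function buildKickAck(peers: Peer[], verb: "kick" | "disconnect"): KickAck {
  const affected: string[] = [];
  const skipped: string[] = [];
  for (const peer of peers) {
    if (verb === "kick" && peer.peerRole === "control-plane") {
      skipped.push(peer.displayName);
      continue;
    }
    affected.push(peer.displayName);
  }
  return {
    kicked: affected,
    affected,
    ...(skipped.length > 0 ? { skipped_control_plane: skipped } : {}),
  };
}

// CLI side (sketch): tolerate a missing field so older brokers still work.
function describeAck(ack: KickAck): string {
  const skipped = ack.skipped_control_plane ?? [];
  const base = `Kicked ${ack.kicked.length} peer(s)`;
  return skipped.length > 0
    ? `${base}; refused control-plane: ${skipped.join(", ")}`
    : base;
}
```

Because the field is only emitted when non-empty and only read through a default, neither an old client against a new broker nor a new client against an old broker needs a version check.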

### Tests

- `apps/broker/tests/kick-control-plane-skip.test.ts` — 5 cases
  covering the kick/disconnect × control-plane/session/service
  truth table.

## 1.34.14 (2026-05-04) — stale `CLAUDEMESH_CONFIG_DIR` falls back

`claudemesh launch` exports `CLAUDEMESH_CONFIG_DIR=<tmpdir>` to its
spawned `claude` so the per-session mesh selection is isolated from
`~/.claudemesh/config.json`. The tmpdir is `rmSync`'d on launch exit
via the `process.on('exit', cleanup)` handler.

Footgun: if a later `claudemesh` invocation INHERITED that env — a
Bash tool call inside Claude Code, a tmux pane that captured the env
via `update-environment`, an exported var the user forgot to clear —
the inherited path pointed at a tmpdir that no longer existed.
Pre-1.34.14 we silently used the dead path, `readConfig()` came back
empty, and the user saw "No meshes joined" from an otherwise-working
install. Fish users hit it harder because fish has no `unset` —
they had to discover `set -e CLAUDEMESH_CONFIG_DIR`.

`apps/cli/src/constants/paths.ts` now resolves `CONFIG_DIR` once via
a memoized `resolveConfigDir()`:

1. No env var → `~/.claudemesh` (default, unchanged).
2. Env points at a dir containing `config.json` → trust it. The
   legitimate per-session-launch case is byte-identical to before.
3. Env set but stale (dir gone) → warn once on stderr (TTY-only —
   CI / MCP boot / piped scripts stay quiet) with a shell-specific
   unset hint, then fall back to `~/.claudemesh`.

The check is on the directory's existence, not on `config.json`,
because a fresh-launch tmpdir legitimately has no `config.json` until
the first write. The stale signature we catch is the outer launch's
`rmSync(tmpDir, {recursive: true})` cleanup, which removes the
directory entirely.

The "no meshes" check from the original triage was deliberately NOT
adopted: a launched session that legitimately joins one mesh would
hit it.

No back-compat surface affected. No other files changed. `_resetPathsForTest()`
is exported for unit tests.
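
The three-way resolution above can be sketched as a memoized function. This is an assumption-laden illustration rather than the contents of `paths.ts`: the filesystem, environment, and stderr are injected through a hypothetical `ResolveDeps` object so the logic stays pure, `makeResolveConfigDir` is an invented name, and the warning text is a placeholder (the shell-specific unset hint is omitted).

```typescript
// Hypothetical dependency bundle; the real module presumably reads
// process.env and the filesystem directly.
interface ResolveDeps {
  env: Record<string, string | undefined>;
  homeDir: string;
  dirExists: (path: string) => boolean;
  warn: (message: string) => void;
  isTTY: boolean;
}

function makeResolveConfigDir(deps: ResolveDeps): () => string {
  let cached: string | undefined; // memoized result, resolved at most once

  return () => {
    if (cached !== undefined) return cached;

    const fallback = `${deps.homeDir}/.claudemesh`;
    const fromEnv = deps.env["CLAUDEMESH_CONFIG_DIR"];

    if (!fromEnv) {
      cached = fallback; // 1. no env var: default
    } else if (deps.dirExists(fromEnv)) {
      cached = fromEnv; // 2. directory exists: trust it
    } else {
      // 3. stale env: warn on a TTY only, then fall back
      if (deps.isTTY) {
        deps.warn(`CLAUDEMESH_CONFIG_DIR is stale (${fromEnv}); using ${fallback}`);
      }
      cached = fallback;
    }
    return cached;
  };
}
```

Keying the check on the directory rather than on `config.json` is what lets a fresh launch tmpdir pass branch 2 before its first write, and memoizing keeps the warning to a single line per process.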
|
||||
|
||||
## 1.34.13 (2026-05-04) — MCP forwards session token on /v1/events
|
||||
|
||||
The 1.34.10 SSE demux + 1.34.11 inbox per-recipient column were both
|
||||
in place but the bug user kept seeing wasn't actually fixed. Cause:
|
||||
the MCP server's SSE subscription didn't forward the session token,
|
||||
so the daemon's `/v1/events` route resolved `session` to null, the
|
||||
SseFilterOptions filter was empty, and every MCP received the
|
||||
unfiltered global stream. Demux at the bind layer was correct;
|
||||
the subscriber just wasn't telling the daemon who it was.
|
||||
|
||||
`apps/cli/src/mcp/server.ts` — `subscribeEvents` now accepts
|
||||
`{ sessionToken }` and forwards it as `Authorization: ClaudeMesh-Session
|
||||
<token>` on the SSE connect, identical to how `daemonGet` and
|
||||
`daemonMarkSeen` already authenticate. The MCP boot already reads the
|
||||
token via `readSessionTokenFromEnv()`; this just threads it one more
|
||||
hop. Without this, A's MCP would render DMs that arrived on B's
|
||||
session-WS, the exact symptom from the 2026-05-04 two-session smoke,
|
||||
even after restarting the daemon to pick up 1.34.11.
|
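As a rough sketch of the header threading described above: the `Authorization: ClaudeMesh-Session <token>` format comes from this entry, while the `buildEventsHeaders` name and the `Accept` header are assumptions made for illustration.

```typescript
// Builds the headers for the SSE connect to the daemon's /v1/events route.
// Without a token the Authorization header is omitted, which per the entry
// above leaves the daemon with a null session and an empty filter.
function buildEventsHeaders(sessionToken?: string): Record<string, string> {
  const headers: Record<string, string> = { Accept: "text/event-stream" };
  if (sessionToken) {
    headers["Authorization"] = `ClaudeMesh-Session ${sessionToken}`;
  }
  return headers;
}
```

A caller would pass the value it already read at boot, so the SSE subscription authenticates the same way the other daemon requests do.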
||||
|
||||
## 1.34.12 (2026-05-04) — `daemon up` detaches by default
|
||||
|
||||
Pre-1.34.12 `claudemesh daemon up` ran in the foreground and streamed
|
||||
JSON logs to the terminal until Ctrl-C. Surprising for users who just
|
||||
want the daemon "up" — they'd run it, see a wall of broker_status /
|
||||
broker_ws_open_attempt logs, and not realize the shell was now blocked.
|
||||
|
||||
`up` now spawns a detached child re-execing `daemon up --foreground`
|
||||
with stdout/stderr redirected to `~/.claudemesh/daemon/daemon.log`,
|
||||
then exits the parent cleanly:
|
||||
|
||||
```
|
||||
$ claudemesh daemon up
|
||||
✔ daemon started (pid 59175)
|
||||
→ log: /Users/agutierrez/.claudemesh/daemon/daemon.log
|
||||
→ stop: claudemesh daemon down
|
||||
```
|
||||
|
||||
Pass `--foreground` for the pre-1.34.12 behavior (debugging, or when
|
||||
something else owns lifecycle).
|
||||
|
||||
The launchd plist + systemd-user unit + `claudemesh launch`'s
|
||||
auto-spawn helper all explicitly pass `--foreground` because their
|
||||
parents (launchd / systemd-user / the launch helper) own process
|
||||
lifecycle and stdio redirection. Without that, the child would
|
||||
double-fork and orphan a grandchild the parent service couldn't track.
|
||||
|
||||
The parent waits up to 3s for the IPC socket to appear before
|
||||
declaring success; if the child crashes during boot (config read
|
||||
failed, port bind failed, etc.), the parent surfaces the log path
|
||||
instead of silently exiting 0.
|
||||
|
||||
### Files
|
||||
|
||||
- `apps/cli/src/commands/daemon.ts` — `--foreground` flag,
|
||||
`spawnDetachedDaemon` helper, updated help text.
|
||||
- `apps/cli/src/cli/argv.ts` — `foreground` / `no-tcp` / `public-health`
|
||||
added to BOOLEAN_FLAGS so the parser doesn't try to consume the
|
||||
next positional as their value.
|
||||
- `apps/cli/src/entrypoints/cli.ts` — threads `foreground` through to
|
||||
runDaemonCommand.
|
||||
- `apps/cli/src/services/daemon/lifecycle.ts` — auto-spawn passes
|
||||
`--foreground` (lifecycle helper IS the detacher).
|
||||
- `apps/cli/src/daemon/service-install.ts` — launchd plist + systemd
|
||||
unit pass `--foreground` (launchd / systemd own lifecycle).
|
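The `BOOLEAN_FLAGS` change is easiest to see in a toy parser. The sketch below is an assumption about the general shape, not the contents of `argv.ts`: only the set name and its three new members come from this entry.

```typescript
// Flags listed here never take a value, so the parser must not swallow the
// token that follows them.
const BOOLEAN_FLAGS = new Set(["foreground", "no-tcp", "public-health"]);

interface ParsedArgs {
  flags: Record<string, string | boolean>;
  positionals: string[];
}

function parseArgs(argv: string[]): ParsedArgs {
  const flags: Record<string, string | boolean> = {};
  const positionals: string[] = [];
  for (let i = 0; i < argv.length; i++) {
    const arg = argv[i];
    if (arg === undefined) continue;
    if (!arg.startsWith("--")) {
      positionals.push(arg);
      continue;
    }
    const name = arg.slice(2);
    const next = argv[i + 1];
    if (BOOLEAN_FLAGS.has(name) || next === undefined || next.startsWith("--")) {
      flags[name] = true; // boolean flag: leave the next token alone
    } else {
      flags[name] = next; // value flag: consume the next token
      i++;
    }
  }
  return { flags, positionals };
}
```

With `foreground` missing from the set, `daemon up --foreground extra` would parse as `foreground = "extra"` and lose the positional.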
||||
|
||||
## 1.34.11 (2026-05-04) — inbox per-recipient column
|
||||
|
||||
Closes the storage half of the per-session scoping story 1.34.10
|
||||
opened. The SSE demux fixed the live event path; this fixes the inbox
|
||||
reads. Same bug shape: every session shared one `inbox.db`, so any
|
||||
session running `claudemesh inbox` (and the MCP welcome calling
|
||||
`/v1/inbox?unread_only=true`) returned the global table — A's launch
|
||||
surfaced B's unread DMs as if they were A's, and `markInboxSeen`
|
||||
flipped seen-state for rows the asking session never owned.
|
||||
|
||||
### Schema
|
||||
|
||||
`apps/cli/src/daemon/db/inbox.ts`:
|
||||
|
||||
- New columns: `recipient_pubkey TEXT`, `recipient_kind TEXT`,
|
||||
indexed by recipient_pubkey. Migration is non-destructive — pre-
|
||||
1.34.11 rows land with NULL and are visible to every session on
|
||||
the same mesh (best-effort back-compat).
|
||||
- `insertIfNew` now writes both fields; `inbound.ts` populates them
|
||||
from the `recipientPubkey` / `recipientKind` already passed for
|
||||
the bus event.
|
||||
- `listInbox` accepts `recipientPubkey` (session) and
|
||||
`recipientMemberPubkey` (member), composes a WHERE clause:
|
||||
`recipient_pubkey IS NULL OR recipient_pubkey IN (session, member)`.
|
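A minimal sketch of how that WHERE clause could be composed with bound parameters. The `buildRecipientWhere` helper and its return shape are hypothetical; the column name, the two option names, and the clause itself are taken from the list above.

```typescript
interface RecipientScope {
  recipientPubkey?: string;        // session pubkey of the asking session
  recipientMemberPubkey?: string;  // member pubkey for the matching mesh
}

// Returns a SQL fragment plus its bound parameters. With no keys at all
// (the diagnostic, token-less caller) the fragment is empty and the query
// stays unscoped.
function buildRecipientWhere(scope: RecipientScope): { sql: string; params: string[] } {
  const keys = [scope.recipientPubkey, scope.recipientMemberPubkey].filter(
    (k): k is string => typeof k === "string" && k.length > 0,
  );
  if (keys.length === 0) return { sql: "", params: [] };
  const placeholders = keys.map(() => "?").join(", ");
  return {
    // NULL rows are legacy (pre-1.34.11) and stay visible to every session.
    sql: `WHERE (recipient_pubkey IS NULL OR recipient_pubkey IN (${placeholders}))`,
    params: keys,
  };
}
```

Binding the pubkeys as parameters rather than interpolating them keeps the clause safe to combine with the existing mesh and unread filters.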
||||
|
||||
### IPC
|
||||
|
||||
`apps/cli/src/daemon/ipc/server.ts` — `/v1/inbox` resolves the
|
||||
session bearer token to a session pubkey + the matching mesh's
|
||||
member pubkey, threads both into `listInbox`. Diagnostic callers
|
||||
without a token (no session header) still get the unscoped global
|
||||
view.
|
||||
|
||||
The response now surfaces `recipient_pubkey` + `recipient_kind` so
|
||||
`--json` consumers can tell session DMs apart from member-keyed
|
||||
broadcasts.
|
||||
|
||||
### Welcome auto-fixes
|
||||
|
||||
The welcome path already calls `/v1/inbox?unread_only=true` with the
|
||||
session token; with this scoping in place it now returns ONLY rows
|
||||
the session is meant to see. No code change needed in
|
||||
`apps/cli/src/mcp/server.ts`.
|
||||
|
||||
### Architecture invariant after 1.34.11
|
||||
|
||||
Every shared store / channel on the daemon now scopes by recipient:
|
||||
|
||||
- EventBus → SSE demux at bind layer (1.34.10)
|
||||
- inbox.db → recipient_pubkey / recipient_kind columns + listInbox
|
||||
scoping (1.34.11)
|
||||
- outbox.db → already scoped via `sender_session_pubkey` for routing
|
||||
(1.34.0)
|
||||
|
||||
Single bus + single tables remain the canonical pattern; demux is
|
||||
isolated to one chokepoint per layer.
|
||||
|
||||
## 1.34.10 (2026-05-04) — per-session SSE demux + universal daemon
|
||||
|
||||
The "echo" the user kept seeing across 1.34.7→1.34.9 wasn't a broker-side
|
||||
echo at all. With two sessions on the same daemon (a + b), the daemon
|
||||
runs ONE event bus shared across every connected MCP. b's session-WS
|
||||
receives a's DM, publishes one `message` event to the bus, and BOTH a's
|
||||
MCP and b's MCP fan that event into a `<channel>` reminder. Result: a
|
||||
sees its own outbound rendered with `from_pubkey = a.session.pubkey`
|
||||
because a's MCP indiscriminately renders every bus event.
|
||||
|
||||
Fix is per-subscriber demux at the SSE bind layer (`apps/cli/src/daemon/
|
||||
events.ts`). The bus stays single-shot — it just publishes once with
|
||||
recipient context attached. Each `/v1/events` subscription scopes via
|
||||
the session token presented by the MCP, and the bind helper drops
|
||||
events whose `recipient_pubkey` doesn't match. System events
|
||||
(peer_join etc.) bypass the recipient check; mesh-scoped events
|
||||
(broker_status with `data.mesh`) get a mesh-slug filter so a session
|
||||
on prueba1 doesn't see flexicar's broker reconnect lines.
|
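The demux rule can be sketched as a single predicate. `SseFilterOptions` and `shouldDeliver` are named in this entry, but the field names, the `BusEvent` shape, the set of system kinds, and the choice to deliver events that carry no recipient are all assumptions for illustration.

```typescript
interface BusEvent {
  kind: string;                 // e.g. "message", "peer_join", "broker_status"
  recipient_pubkey?: string;
  data?: { mesh?: string };
}

interface SseFilterOptions {
  sessionPubkey?: string;
  memberPubkey?: string;
  meshSlug?: string;
}

// Assumed list; the entry only names peer_join explicitly.
const SYSTEM_KINDS = new Set(["peer_join", "peer_leave"]);

function shouldDeliver(event: BusEvent, filter?: SseFilterOptions): boolean {
  // No session resolved: legacy unfiltered stream for diagnostic callers.
  if (!filter || !filter.sessionPubkey) return true;

  // System events bypass the recipient check.
  if (SYSTEM_KINDS.has(event.kind)) return true;

  // Mesh-scoped events are filtered by mesh slug instead of recipient.
  const mesh = event.data?.mesh;
  if (event.kind === "broker_status" && mesh !== undefined) {
    return filter.meshSlug === undefined || mesh === filter.meshSlug;
  }

  // Assumption: events with no recipient attached are delivered to everyone.
  if (event.recipient_pubkey === undefined) return true;

  return (
    event.recipient_pubkey === filter.sessionPubkey ||
    event.recipient_pubkey === filter.memberPubkey
  );
}
```

The bus still publishes each event once; every `/v1/events` subscriber runs this check against its own filter before writing to its stream.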
||||
|
||||
`handleBrokerPush` (`apps/cli/src/daemon/inbound.ts`) gains
|
||||
`recipientPubkey` + `recipientKind` on its context. Run.ts wires the
|
||||
session-WS path with `{ recipientKind: "session", recipientPubkey:
|
||||
session.pubkey }` and the daemon-WS path with `{ recipientKind:
|
||||
"member", recipientPubkey: mesh.pubkey }`. SSE bind uses the session
|
||||
registry to resolve the subscriber's session pubkey + member pubkey
|
||||
+ mesh from its bearer token.
|
||||
|
||||
The 1.34.8/9 "echo guards" (drop pushes whose senderPubkey/Member ===
|
||||
ours) are kept as defense-in-depth; the actual fix lives in the SSE
|
||||
demux. Diagnostic callers without a session token (`claudemesh daemon
|
||||
events`) get the unfiltered legacy stream — backwards compatible.
|
||||
|
||||
### Universal daemon (`--mesh` and `--name` deprecated)
|
||||
|
||||
`claudemesh daemon up` and `daemon install-service` no longer accept
|
||||
mesh / name overrides. The daemon attaches to every mesh in
|
||||
`~/.claudemesh/config.json`, full stop. Single-mesh isolation is
|
||||
handled by joining only one mesh in that environment (containers,
|
||||
etc.). Pinning at start time was the source of "I joined a new mesh
|
||||
but my service still ignores it" — gone.
|
||||
|
||||
`--mesh` and `--name` are still parsed for back-compat with existing
|
||||
launchd plists baked at install time, but ignored with a deprecation
|
||||
warning. New installs no longer write them. Help text updated.
|
||||
|
||||
### Daemon version stamp
|
||||
|
||||
`daemon_started` boot log now includes `"version": "1.34.10"` so users
|
||||
can grep their daemon log to confirm whether the running process
|
||||
picked up a recent ship. Pairs with the existing `claudemesh launch`
|
||||
warning that fires when CLI ≠ daemon.
|
||||
|
||||
### Files
|
||||
|
||||
- `apps/cli/src/daemon/events.ts` — `SseFilterOptions`,
|
||||
`shouldDeliver`, `bindSseStream(res, bus, filter)`.
|
||||
- `apps/cli/src/daemon/inbound.ts` — `recipientPubkey` /
|
||||
`recipientKind` on InboundContext; bus event carries them through.
|
||||
- `apps/cli/src/daemon/run.ts` — both onPush call sites tag with the
|
||||
right kind; daemon_started boot log includes version.
|
||||
- `apps/cli/src/daemon/ipc/server.ts` — `/v1/events` resolves the
|
||||
bearer session into a filter and passes it to bindSseStream.
|
||||
- `apps/cli/src/commands/daemon.ts` — deprecation warnings on `up` /
|
||||
`install-service` for `--mesh` / `--name`; help text trimmed.
|
||||
- `apps/cli/src/entrypoints/cli.ts` — top-level help drops `--mesh
|
||||
<slug>` from the daemon section, adds the universal-daemon note.
|
||||
- `apps/cli/src/commands/launch.ts` — staleness warning copy clean
|
||||
(no misleading `--mesh` example).
|
||||
|
||||
## 1.34.9 (2026-05-04) — broader self-echo guard + system event polish
|
||||
|
||||
Two-session smoke after 1.34.8 surfaced two regressions and one missing
|
||||
piece: echoes still arrived on the daemon-WS path (the 1.34.8 guard was
|
||||
too strict — it required BOTH senderPubkey === ownMember AND
|
||||
senderMemberPubkey === ownMember, but session-attributed echoes carry
|
||||
the session pubkey on `senderPubkey`); peer_join system events
|
||||
duplicated because both the member-WS and the session-WS forwarded
|
||||
them; and the channel reminder collapsed all peer joins to just a
|
||||
display name with no disambiguation.
|
||||
|
||||
### Daemon-WS self-echo guard relaxed
|
||||
|
||||
`apps/cli/src/daemon/run.ts` — drop on `senderMemberPubkey === ownMember`
|
||||
alone. Anything attributed to OUR member is either our own send echoing
|
||||
back via the broker fan-out OR (theoretically) a peer with the same
|
||||
pubkey, which is impossible by construction. Sibling-session DMs fan
|
||||
session-to-session, not via the same member-WS, so they aren't affected.
|
||||
|
||||
### Session-WS skips system events
|
||||
|
||||
`apps/cli/src/daemon/session-broker.ts` — system pushes (`subtype:
|
||||
"system"`) are dropped before `onPush` so they don't re-publish on the
|
||||
bus. The member-WS already handles system events; forwarding through
|
||||
both paths produced two `[system] Peer "X" joined` channel reminders
|
||||
per join, plus another set per sibling session.
|
||||
|
||||
### Self-join filter on member-WS
|
||||
|
||||
`apps/cli/src/daemon/inbound.ts` — new `isOwnPubkey` closure on
|
||||
`InboundContext`. The broker's peer_joined fan-out excludes the
|
||||
JOINING connection but our daemon owns multiple connections per mesh
|
||||
(member-WS + N session-WSs from the same identity), so a session's
|
||||
own peer_joined arrives at the same daemon's member-WS. The filter
|
||||
walks `mesh.pubkey` plus every live entry in `sessionBrokersByPubkey`
|
||||
to recognize "us" and drops the event outright. Wired in run.ts.
|
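A minimal sketch of such a closure, assuming `sessionBrokersByPubkey` is a `Map` keyed by session pubkey and that only live sessions have entries. The `makeIsOwnPubkey` factory name is hypothetical.

```typescript
// Returns a predicate that recognizes any pubkey owned by this daemon on a
// given mesh: the member key plus every currently registered session key.
// The map is read at call time, so sessions added later are picked up.
function makeIsOwnPubkey(
  memberPubkey: string,
  sessionBrokersByPubkey: Map<string, unknown>,
): (pubkey: string) => boolean {
  return (pubkey) => pubkey === memberPubkey || sessionBrokersByPubkey.has(pubkey);
}
```

Because the closure holds a reference to the map rather than a snapshot, a session that connects after the member-WS was wired is still recognized as "us" when its own `peer_joined` comes back.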
||||
|
||||
### Richer peer-join channel content
|
||||
|
||||
`apps/cli/src/mcp/server.ts` — `[system] Peer "name" joined the mesh`
|
||||
becomes `[system] Peer "name" (pubkey-prefix) [groups] joined the
|
||||
mesh — last seen … · "summary"` (last-seen + summary fields only on
|
||||
`peer_returned` events). The meta payload now carries `peer_pubkey`,
|
||||
`peer_groups`, `peer_last_seen_at`, `peer_summary` for downstream
|
||||
consumers. cwd / role aren't surfaced yet — broker-side change
|
||||
required.
|
||||
|
||||
### Daemon staleness warning
|
||||
|
||||
`apps/cli/src/commands/launch.ts` — when `claudemesh launch` finds the
|
||||
daemon already running with a different version than the CLI, it
|
||||
surfaces a one-shot warning + restart instructions. Catches the
|
||||
common "I `npm i -g`d the latest CLI but the launchd service is still
|
||||
running last week's daemon" footgun.
|
||||
|
||||
## 1.34.8 (2026-05-04) — self-echo guard, inbox read-state + TTL prune
|
||||
|
||||
Three closely-related fixes shipped together because they all touch the
|
||||
"what does the user actually see in inbox.db / on the channel" axis.
|
||||
|
||||
### Self-echo guard
|
||||
|
||||
The 1.34.0 sender-attribution fix routed session-originated DMs through
|
||||
the per-session WS so the broker's fan-out attributed each push to the
|
||||
sender's session pubkey. A side effect (visible in the 2026-05-04
|
||||
two-session smoke): some broker fan-out paths mirror the outbound DM
|
||||
back to the originating session-WS, so the sender saw their own
|
||||
message land in inbox.db, publish a `message` bus event, and surface
|
||||
as `← claudemesh: <self>: <text>` in their own Claude Code session
|
||||
immediately after typing `claudemesh send`.
|
||||
|
||||
Fixed at the WS boundary in two places:
|
||||
|
||||
- `apps/cli/src/daemon/session-broker.ts` — drop pushes where
|
||||
`senderPubkey === opts.sessionPubkey` before forwarding to
|
||||
`handleBrokerPush`. Match on session pubkey only — sibling sessions
|
||||
of the same member share `senderMemberPubkey`, so a member-level
|
||||
filter would wrongly drop legit sibling DMs.
|
||||
- `apps/cli/src/daemon/run.ts` — daemon-WS onPush drops pushes where
|
||||
BOTH `senderMemberPubkey === mesh.pubkey` AND `senderPubkey ===
|
||||
mesh.pubkey` (i.e. an actual member-WS self-echo, not a sibling
|
||||
session whose senderPubkey is its session key).
|
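The two 1.34.8 guards reduce to two small predicates. This is an illustrative sketch: the `Push` shape and function names are assumptions, while the comparisons follow the bullets above.

```typescript
interface Push {
  senderPubkey: string;        // session key for session-originated sends
  senderMemberPubkey: string;  // stable member key of the sender
}

// Session-WS guard: match on the session pubkey only, so a sibling session
// of the same member (same senderMemberPubkey) is not dropped.
function isSessionSelfEcho(push: Push, sessionPubkey: string): boolean {
  return push.senderPubkey === sessionPubkey;
}

// Daemon-WS guard as shipped in 1.34.8: both fields must equal the member key.
function isDaemonSelfEcho(push: Push, memberPubkey: string): boolean {
  return push.senderMemberPubkey === memberPubkey && push.senderPubkey === memberPubkey;
}
```

Note that 1.34.9 relaxed the daemon-WS guard to `senderMemberPubkey === ownMember` alone, because session-attributed echoes carry the session pubkey on `senderPubkey` and slipped past the stricter check.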
||||
|
||||
### Inbox read-state (`seen_at`)
|
||||
|
||||
Replaces the welcome's "last 24h" window with a proper read-state
|
||||
filter. New `seen_at INTEGER` column on `inbox`, plus `markInboxSeen`
|
||||
and `pruneInboxBefore` helpers in `apps/cli/src/daemon/db/inbox.ts`.
|
||||
|
||||
Read-state flips on three paths:
|
||||
|
||||
1. Interactive listing — `/v1/inbox` GET auto-stamps every returned
|
||||
row that was previously NULL. Pass `?mark_seen=false` to peek
|
||||
without flipping (used by the welcome — it stamps explicitly only
|
||||
AFTER the channel notification succeeds, so a Zod-rejected welcome
|
||||
doesn't silently lose unread state).
|
||||
2. MCP welcome — `/v1/inbox?unread_only=true&mark_seen=false&limit=50`
|
||||
surfaces only rows the user hasn't seen, then `POST /v1/inbox/seen`
|
||||
stamps the ids the welcome actually rendered.
|
||||
3. MCP live channel emit — after a successful
|
||||
`notifications/claude/channel` for a single inbox row, the MCP
|
||||
server calls `/v1/inbox/seen` for that id so the next launch's
|
||||
welcome doesn't re-surface it.
|
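The peek-then-stamp flow can be modeled in memory. This sketch is not the SQLite implementation: it assumes rows are plain objects, that listing returns a snapshot taken before stamping, and that `markInboxSeen` returns the number of rows it changed.

```typescript
interface InboxRow {
  id: string;
  body: string;
  seen_at: number | null; // null means unread
}

interface ListOptions {
  unreadOnly?: boolean;
  markSeen?: boolean; // defaults to true, mirroring the interactive listing
  now?: number;
}

function listInbox(rows: InboxRow[], opts: ListOptions = {}): InboxRow[] {
  const { unreadOnly = false, markSeen = true, now = Date.now() } = opts;
  const selected = rows.filter((r) => !unreadOnly || r.seen_at === null);
  const snapshot = selected.map((r) => ({ ...r }));
  if (markSeen) {
    for (const r of selected) {
      if (r.seen_at === null) r.seen_at = now;
    }
  }
  return snapshot;
}

function markInboxSeen(rows: InboxRow[], ids: string[], now: number = Date.now()): number {
  const wanted = new Set(ids);
  let changed = 0;
  for (const r of rows) {
    if (wanted.has(r.id) && r.seen_at === null) {
      r.seen_at = now;
      changed++;
    }
  }
  return changed;
}
```

The welcome path corresponds to `listInbox(rows, { unreadOnly: true, markSeen: false })` followed by `markInboxSeen` for only the ids it rendered, so a failed notification leaves the rows unread.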
||||
|
||||
CLI surface:
|
||||
|
||||
```sh
|
||||
claudemesh inbox --unread # only seen_at IS NULL rows
|
||||
claudemesh inbox --json # row now includes seen_at
|
||||
```
|
||||
|
||||
### Inbox TTL prune
|
||||
|
||||
`apps/cli/src/daemon/inbox-pruner.ts` runs `pruneInboxBefore(db,
|
||||
Date.now() - 30d)` once at daemon startup and hourly thereafter. Logs
|
||||
`inbox_prune_completed` whenever rows were removed. No CLI knob — it's
|
||||
a built-in retention policy that prevents inbox.db from growing
|
||||
unbounded. Manual override remains `claudemesh inbox flush --before
|
||||
<iso>`.
|
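For clarity on the `30d` shorthand above, here is a small in-memory sketch of the retention cutoff. The real `pruneInboxBefore` takes a database handle; the array-based signature, the `received_at` field name, and the `retentionCutoff` helper here are assumptions.

```typescript
const THIRTY_DAYS_MS = 30 * 24 * 60 * 60 * 1000;

interface StoredRow {
  id: string;
  received_at: number; // epoch milliseconds (hypothetical column name)
}

function retentionCutoff(now: number): number {
  return now - THIRTY_DAYS_MS;
}

// Keeps rows at or after the cutoff and reports how many were dropped.
function pruneInboxBefore(
  rows: StoredRow[],
  cutoffMs: number,
): { kept: StoredRow[]; removed: number } {
  const kept = rows.filter((r) => r.received_at >= cutoffMs);
  return { kept, removed: rows.length - kept.length };
}
```

A scheduler would call this once at startup and then on an hourly interval, logging only when `removed` is greater than zero.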
||||
|
||||
### Files
|
||||
|
||||
- `apps/cli/src/daemon/db/inbox.ts` — `seen_at` column + migration,
|
||||
`unreadOnly` filter, `markInboxSeen`, `pruneInboxBefore`.
|
||||
- `apps/cli/src/daemon/inbox-pruner.ts` — new file, hourly TTL sweep.
|
||||
- `apps/cli/src/daemon/run.ts` — wires the pruner into startup +
|
||||
shutdown; daemon-WS self-echo guard.
|
||||
- `apps/cli/src/daemon/session-broker.ts` — session-WS self-echo
|
||||
guard.
|
||||
- `apps/cli/src/daemon/ipc/server.ts` — `unread_only` + `mark_seen`
|
||||
query params; new `POST /v1/inbox/seen` route.
|
||||
- `apps/cli/src/mcp/server.ts` — `daemonMarkSeen` helper; welcome
|
||||
switched to `unread_only=true`; mark-seen after channel emit.
|
||||
- `apps/cli/src/services/bridge/daemon-route.ts` —
|
||||
`tryListInboxViaDaemon` accepts `{ unreadOnly, markSeen }`;
|
||||
`InboxItem.seen_at` exposed.
|
||||
- `apps/cli/src/commands/inbox.ts` + `apps/cli/src/entrypoints/cli.ts`
|
||||
+ `apps/cli/src/cli/argv.ts` — `--unread` flag.
|
||||
- `apps/cli/skills/claudemesh/SKILL.md` — documents seen_at semantics,
|
||||
self-echo guard, TTL prune.
|
||||
|
||||
## 1.34.7 (2026-05-04) — inbox flush + delete commands
|
||||
|
||||
The CLI had no first-class way to clean the persisted inbox; the only
|
||||
recourse was `sqlite3 ~/.claudemesh/daemon/inbox.db "DELETE FROM
|
||||
inbox"`, which bypasses IPC and is invisible to anyone who doesn't
|
||||
know the schema. Two new verbs:
|
||||
|
||||
```sh
|
||||
claudemesh inbox flush --mesh prueba1
|
||||
claudemesh inbox flush --before 2026-05-04T18:00:00Z
|
||||
claudemesh inbox flush --all # required guard with no other filter
|
||||
claudemesh inbox delete <message-id> # alias: rm
|
||||
claudemesh inbox flush --json # → { ok: true, removed: N }
|
||||
```
|
||||
|
||||
`flush` without filters refuses with an `--all` confirmation hint —
|
||||
prevents an accidental "wipe every row on every mesh" from a
|
||||
fat-fingered command.
|
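The `--all` guard amounts to a small validation step before any delete is issued. The sketch below is hypothetical in its names (`planFlush`, `FlushPlan`) and error text; only the rule itself comes from this entry.

```typescript
interface FlushArgs {
  mesh?: string;
  before?: string; // ISO timestamp
  all?: boolean;
}

type FlushPlan =
  | { ok: true; filter: { mesh: string | undefined; before: string | undefined } }
  | { ok: false; error: string };

// Refuses an unfiltered flush unless --all was passed explicitly.
function planFlush(args: FlushArgs): FlushPlan {
  const hasFilter = args.mesh !== undefined || args.before !== undefined;
  if (!hasFilter && !args.all) {
    return { ok: false, error: "refusing to flush every row; pass --all to confirm" };
  }
  return { ok: true, filter: { mesh: args.mesh, before: args.before } };
}
```

Validating in the CLI before calling the daemon keeps the destructive `DELETE /v1/inbox` request from being sent at all when the guard fails.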
||||
|
||||
### Mechanics
|
||||
|
||||
- `apps/cli/src/daemon/db/inbox.ts` gains `deleteInboxRow(id)` and
|
||||
`flushInbox({ mesh?, before? })` (returns `changes`).
|
||||
- IPC: `DELETE /v1/inbox?mesh=…&before=…` + `DELETE /v1/inbox/<id>`.
|
||||
Mesh filter honors session-default scoping (same as listing).
|
||||
- Daemon-route helpers `tryFlushInboxViaDaemon` and
|
||||
`tryDeleteInboxRowViaDaemon` mirror the existing
|
||||
`tryListInboxViaDaemon` shape.
|
||||
- New CLI command file `apps/cli/src/commands/inbox-actions.ts`.
|
||||
- Help and SKILL.md document the verbs.
|
||||
|
||||
## 1.34.6 (2026-05-04) — welcome: stringify meta values to pass Zod schema
|
||||
|
||||
The 1.34.2 → 1.34.5 timing-race theory was wrong. Reading Claude Code
|
||||
v2.1.126's binary at the `notifications/claude/channel` schema shows:
|
||||
|
||||
```js
|
||||
IJ_ = y.object({
|
||||
method: y.literal("notifications/claude/channel"),
|
||||
params: y.object({
|
||||
content: y.string(),
|
||||
meta: y.record(y.string(), y.string()).optional(),
|
||||
}),
|
||||
})
|
||||
```
|
||||
|
||||
`meta` is a `record(string, string)` — every value MUST be a string.
|
||||
Pre-1.34.6 the welcome shipped:
|
||||
|
||||
- `peer_count: number` → Zod reject
|
||||
- `peer_names: string[]` → Zod reject
|
||||
- `unread_count: number` → Zod reject
|
||||
- `latest_message_ids: string[]` → Zod reject
|
||||
|
||||
The whole notification was dropped at the schema-validation step
|
||||
BEFORE the channel handler ever ran. No log, no error, no UI surface —
|
||||
exactly the symptoms 1.34.2 → 1.34.5 chased.
|
||||
|
||||
Live peer DMs always worked because every meta value already went
|
||||
through `String(...)`. The welcome was the only notification shape
|
||||
with non-string meta, uniquely affected.
|
||||
|
||||
### Fix
|
||||
|
||||
`emitMeshWelcome` now coerces every meta value to string. Counts
|
||||
become digit strings (`"3"`, `"16"`); arrays serialize as JSON
|
||||
(`'["b","c"]'`, parseable on the receiving side). Schema validation
|
||||
passes, notification reaches the handler, channel reminder surfaces.
|
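A minimal sketch of that coercion, assuming a standalone helper (the `stringifyMeta` name is hypothetical and this is not the body of `emitMeshWelcome`). Skipping `null` and `undefined` values is an additional assumption.

```typescript
// Coerces every meta value to a string so the payload satisfies a
// record(string, string) schema. Strings pass through, arrays and objects
// are JSON-encoded, and other primitives go through String().
function stringifyMeta(meta: Record<string, unknown>): Record<string, string> {
  const out: Record<string, string> = {};
  for (const [key, value] of Object.entries(meta)) {
    if (value === undefined || value === null) continue; // assumption: omit empties
    if (typeof value === "string") {
      out[key] = value;
    } else if (typeof value === "object") {
      out[key] = JSON.stringify(value);
    } else {
      out[key] = String(value);
    }
  }
  return out;
}
```

A consumer that needs the original array back can call `JSON.parse` on fields such as `peer_names`.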
||||
|
||||
The 1.34.5 dual-lane retry is removed — single emit at 3s grace
|
||||
after `oninitialized` is enough now that the schema is right.
|
||||
|
||||
### What changed in `~/.claudemesh/daemon/mcp-<pid>.log`
|
||||
|
||||
`welcome_attempt` rows are gone (no more lanes). You'll see
|
||||
`mcp_started` → `server_initialized` → `welcome_peers_resolved` →
|
||||
`welcome_emitted` per launch — the same shape as 1.34.4 minus the
|
||||
`fast`/`slow` lane field.
|
||||
|
||||
## 1.34.5 (2026-05-04) — dual-lane welcome retry to defeat handler-registration race
|
||||
|
||||
1.34.4 hooked `server.oninitialized` + 2s grace. The MCP-side log
|
||||
confirmed `welcome_emitted` ran at +2.4s, but the user still saw
|
||||
nothing in Claude Code. Claude Code's React effect that calls
|
||||
`setNotificationHandler("notifications/claude/channel", ...)` runs
|
||||
multiple ticks AFTER its UI state flips to "connected", which happens
|
||||
after `server.oninitialized` fires. 2s was still inside the dead zone.
|
||||
|
||||
We can't directly observe handler-registration timing from the MCP
|
||||
side (the SDK has no hook for it), so this version emits the welcome
|
||||
TWICE: 5s post-init (`lane: "fast"`) and 15s post-init (`lane: "slow"`).
|
||||
Whichever one lands surfaces; the duplicate is acceptable for an
|
||||
informational welcome. Both attempts log to `mcp-<pid>.log` so we can
|
||||
see which lane wins in production. If observation shows the slow
|
||||
path always wins, future versions can drop the fast attempt.
|
||||
|
||||
## 1.34.4 (2026-05-04) — welcome triggers on `oninitialized`, peer count fix
|
||||
|
||||
### Welcome trigger: post-initialization, not fixed timer
|
||||
|
||||
1.34.3's welcome fired on a fixed 5s timer after `server.connect`.
|
||||
Diagnostic logging confirmed the emit ran (`welcome_emitted` in
|
||||
`mcp-<pid>.log`) but the user never saw the channel reminder. Cause:
|
||||
Claude Code only registers its `notifications/claude/channel`
|
||||
notification handler AFTER the MCP init handshake completes
|
||||
(initialize request → initialized notification from client →
|
||||
`server.oninitialized` fires). The 5s timer commonly expired before that
|
||||
sequence finished, so the welcome notification arrived at a client
|
||||
that hadn't wired up a handler yet — silently dropped.
|
||||
|
||||
Live peer DMs worked because they arrive seconds-to-minutes later,
|
||||
well past the window. The welcome is the only event with a
|
||||
deterministic close-to-zero delay, so it was uniquely affected.
|
||||
|
||||
The fix gates the welcome on `Server.oninitialized`, then adds 2s of
|
||||
grace for any pending list_tools / list_prompts round-trips to settle
|
||||
before emitting. Matches the registration timing exactly — by the
|
||||
time `oninitialized` fires, Claude Code has already accepted the
|
||||
server and registered the channel handler.
|
||||
|
||||
### Peer count filter mirrors the launch banner
|
||||
|
||||
The 1.34.3 welcome used `peerRole !== "control-plane"` to filter the
|
||||
peer list — that's the new taxonomy from broker M1, but older brokers
|
||||
still emit only `channel: "claudemesh-daemon"` for control-plane rows.
|
||||
Result: `peer_count: 0` even when the launch banner showed "2 peers
|
||||
online". The welcome filter now matches the launch banner exactly
|
||||
(`channel !== "claudemesh-daemon"`) and additionally honors
|
||||
`peerRole !== "control-plane"` when present.
|
||||
|
||||
Self-exclusion is now opt-in: only filtered when `self_session_pubkey`
|
||||
is known (from the `/v1/sessions/me` lookup). This prevents over-
|
||||
filtering when the token route fails and we fall back to the
|
||||
unauthenticated peer list.
|
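The combined filter might look like the following. The two comparisons and the opt-in self-exclusion are from this entry; the `PeerRow` shape and the `filterWelcomePeers` name are assumptions.

```typescript
interface PeerRow {
  pubkey: string;
  displayName: string;
  channel?: string;   // older brokers mark control-plane rows only here
  peerRole?: string;  // newer brokers also set this
}

function filterWelcomePeers(peers: PeerRow[], selfSessionPubkey?: string): PeerRow[] {
  return peers.filter((p) => {
    // Same rule as the launch banner.
    if (p.channel === "claudemesh-daemon") return false;
    // Newer taxonomy, honored only when the field is present.
    if (p.peerRole !== undefined && p.peerRole === "control-plane") return false;
    // Self-exclusion is opt-in: applied only when our own key is known.
    if (selfSessionPubkey !== undefined && p.pubkey === selfSessionPubkey) return false;
    return true;
  });
}
```

Calling it without `selfSessionPubkey` models the fallback path where the session lookup failed, in which case no row is dropped for being "self".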
||||
|
||||
`mcp-<pid>.log` now records `server_initialized`,
|
||||
`welcome_peers_resolved` (with total / real counts), and
|
||||
`welcome_peers_status` so a missing welcome can be traced through the
|
||||
init handshake → peer query → notification chain.
|
||||
|
||||
## 1.34.3 (2026-05-04) — welcome always fires + skill / help refresh
|
||||
|
||||
### Welcome always emits, regardless of inbox state
|
||||
|
||||
The 1.34.2 welcome only fired when there were unread messages, so a
|
||||
freshly-launched session with an empty inbox saw nothing — no
|
||||
confirmation that the mesh pipe was live. Now it always emits, and
|
||||
carries useful launch context:
|
||||
|
||||
- **identity** — display name, session pubkey prefix, role
|
||||
- **mesh** — active mesh slug
|
||||
- **peers** — live peer count + up to 5 names (control-plane filtered out)
|
||||
- **inbox** — recent count + up to 3 previews (or "Inbox is empty (last 24h)")
|
||||
- **CLI hints** — `peer list` · `send` · `inbox`
|
||||
- **skill pointer** — `📚 Read the claudemesh skill (SKILL.md)…` so the
|
||||
model loads the canonical reference if it isn't already in context
|
||||
|
||||
Composes from up to three best-effort daemon queries
|
||||
(`/v1/sessions/me`, `/v1/peers?mesh=…`, `/v1/inbox?mesh=…&since=…`),
|
||||
each degrading silently. The welcome ALWAYS goes out unless the IPC
|
||||
socket is unreachable. Meta carries `kind: "welcome"`,
|
||||
`self_display_name`, `self_session_pubkey`, `self_role`, `mesh_slug`,
|
||||
`peer_count`, `peer_names`, `unread_count`, and
|
||||
`latest_message_ids` for downstream consumers.
|
||||
|
||||
### `daemonGet` now forwards the session token
|
||||
|
||||
The MCP's IPC client gained an optional `sessionToken` field. The
|
||||
welcome path uses it for `/v1/sessions/me` (which 401s without auth)
|
||||
and for default-mesh scoping on `/v1/peers` and `/v1/inbox`. Token
|
||||
read from `CLAUDEMESH_IPC_TOKEN_FILE` set by `claudemesh launch`.
|
||||
|
||||
### Skill (`apps/cli/skills/claudemesh/SKILL.md`) refresh
|
||||
|
||||
- New section: "Launch welcome (`kind: "welcome"`)" — describes the
|
||||
5-second handshake, its meta fields, and that it should NOT be
|
||||
replied to like a DM.
|
||||
- Channel attributes table: clarified that `from_pubkey` is the
|
||||
ephemeral session pubkey of the originator (post-1.34.0 attribution
|
||||
fix), separated `from_session_pubkey` and `from_member_pubkey`,
|
||||
added `client_message_id` and `kind` rows.
|
||||
- Inbox section: documented `--mesh <slug>`, `--limit N`, and that
|
||||
the command reads `~/.claudemesh/daemon/inbox.db` via daemon IPC
|
||||
(NOT a fresh broker-WS buffer drain — that path was removed in
|
||||
1.34.0).
|
||||
- Reply behavior: explicit "do NOT reply when `meta.kind` is
|
||||
`"welcome"` or `"system"`".
|
||||
|
||||
### `claudemesh --help` refresh
|
||||
|
||||
`message inbox` line was still labeled "drain pending" from the
|
||||
pre-1.34.0 cold-path implementation. Updated to "read persisted
|
||||
inbox" with the new flags (`--mesh`, `--limit`, `--json`) and a
|
||||
note that it reads from `~/.claudemesh/daemon/inbox.db` via the
|
||||
daemon.
|
||||
|
||||
## 1.34.2 (2026-05-04) — launch welcome push summarizing recent inbox
|
||||
|
||||
When a Claude Code session launches via `claudemesh launch`, the user
|
||||
lands cold — they don't know whether peers messaged them while they
|
||||
were offline. Real-time pushes only cover messages that arrive AFTER
|
||||
the SSE subscription is alive, so anything queued at the broker that
|
||||
drains during the hello-handshake window can land in `inbox.db`
|
||||
before the MCP subscribes. Without a welcome, the user has to remember
|
||||
to run `claudemesh inbox` to discover the gap.
|
||||
|
||||
The MCP server now fires a one-shot welcome 5s after the transport is
|
||||
up:
|
||||
|
||||
- queries `/v1/inbox?since=<24h-ago>&limit=20` for the recent window;
|
||||
- skips silently when there are no rows;
|
||||
- emits a single `notifications/claude/channel` with header
|
||||
(`📥 [welcome] N messages from last 24h (mesh-a, mesh-b)`),
|
||||
up to three preview lines (sender, mesh, time, 60-char body),
|
||||
a remainder count, and the `claudemesh inbox` CLI hint;
|
||||
- meta carries `kind: "welcome"`, `unread_count`, mesh list, and the
|
||||
first 10 message ids so a downstream agent can `claudemesh message
|
||||
status <id>` if it wants to inspect.
|
||||
|
||||
Why a 5s delay: gives the daemon's session-WS time to reconnect,
|
||||
re-claim leased rows, drain pending broker queue, and finish writing
|
||||
to inbox.db before we summarize. Earlier and the welcome would
|
||||
under-report; later and it stops feeling like a launch handshake.
|
||||
|
||||
Why a 24h window: narrow enough to feel relevant on a freshly-launched
|
||||
session, wide enough to surface overnight messages without dumping
|
||||
the entire history into the channel.
|
||||
|
||||
The welcome flow is fully diagnostic — `welcome_skip` (with reason),
|
||||
`welcome_emitted`, or `welcome_emit_failed` lands in
|
||||
`~/.claudemesh/daemon/mcp-<pid>.log` for every launch.
|
||||
|
||||
## 1.34.1 (2026-05-04) — declare `claude/channel` capability so Claude Code surfaces pushes
|
||||
|
||||
The 1.34.0 ship fixed the daemon-side push pipeline (correct sender
|
||||
attribution, persistent inbox readable from CLI). Bus events fire,
|
||||
SSE delivers them to the MCP, and the MCP calls
|
||||
`server.notification("notifications/claude/channel", ...)` — but
|
||||
Claude Code v2.1.x stopped surfacing them as `<channel>` reminders.
|
||||
Real two-session smoke confirmed the silent drop: messages landed
|
||||
in `inbox.db`, the daemon SSE stream emitted the `message` events,
|
||||
yet neither Claude Code session got a real-time push.
|
||||
|
||||
### Root cause
|
||||
|
||||
Claude Code v2.1.x added a capability gate on the channel handler.
|
||||
Reading the `claude` binary at the `notifications/claude/channel`
|
||||
offsets:
|
||||
|
||||
```js
|
||||
function xJ_(serverName, capabilities, pluginSource) {
|
||||
if (!capabilities?.experimental?.["claude/channel"])
|
||||
return { action: "skip", kind: "capability",
|
||||
reason: "server did not declare claude/channel capability" };
|
||||
...
|
||||
}
|
||||
```
|
||||
|
||||
`xJ_` is called when the MCP server connects. When it returns
|
||||
`{action: "skip"}`, Claude Code never calls
|
||||
`client.setNotificationHandler(IJ_(), ...)` for that server — so
|
||||
every `notifications/claude/channel` emit falls into the void. The
|
||||
`--dangerously-load-development-channels server:claudemesh` flag
|
||||
gets you past the allowlist check that runs LATER in `xJ_`, but the
|
||||
capability gate runs FIRST and is independent.
|
||||
|
||||
Pre-2.1.x clients didn't gate on this key, which is why the same
|
||||
MCP wire shape "worked" before. There's no error / log / warning
|
||||
on either side; the notifications just disappear.
|
||||
|
||||
### Fix
|
||||
|
||||
`apps/cli/src/mcp/server.ts` declares the capability:
|
||||
|
||||
```ts
|
||||
new Server({ name: "claudemesh", version: VERSION }, {
|
||||
capabilities: {
|
||||
tools: {}, prompts: {}, resources: {},
|
||||
experimental: { "claude/channel": {} },
|
||||
},
|
||||
});
|
||||
```
|
||||
|
||||
The empty object is enough — Claude Code only checks for presence,
|
||||
not contents.
|
||||
|
||||
### Diagnostic logging
|
||||
|
||||
The MCP server now writes a per-pid log to
|
||||
`~/.claudemesh/daemon/mcp-<pid>.log` whenever:
|
||||
|
||||
- the SSE event arrives (`sse_event_received`),
|
||||
- a channel notification is emitted (`channel_emitted`), or
|
||||
- the emit throws (`channel_emit_failed`).
|
||||
|
||||
`tail -f ~/.claudemesh/daemon/mcp-*.log` lets users verify the
|
||||
push pipeline end-to-end without strings-dumping the Claude Code
|
||||
binary. (MCP stderr is captured by Claude Code and not visible to
|
||||
the user, so an on-disk log is the only way to surface this
|
||||
state in the future.)
|
||||
|
||||
### Upgrade
|
||||
|
||||
```sh
|
||||
npm i -g claudemesh-cli@latest
|
||||
# Restart Claude Code so the MCP picks up the capability change.
|
||||
```
|
||||
|
||||
After this version: peer messages surface as `<channel>` reminders
|
||||
mid-turn the way they did pre-2.1.x.
|
||||
|
||||
## 1.34.0 (2026-05-04) — Sender attribution via session-WS + inbox CLI fix
|
||||
|
||||
Two regressions surfaced in real two-session smokes that landed
|
||||
together; both are rooted in the same architectural seam (sender identity
|
||||
across the daemon ↔ broker ↔ recipient hop).
|
||||
|
||||
### Sender attribution: outbox routes via session-WS
|
||||
|
||||
Pre-1.34.0, every outbox row drained through the daemon's
member-keyed `DaemonBrokerClient`, regardless of which session typed
`claudemesh send`. The broker's fan-out builds the push envelope from
`conn.sessionPubkey ?? conn.memberPubkey`, which for a member-WS is
always the member pubkey. Result: a real two-session smoke
(`a → b: "123"`, `b → a: "456"`) landed messages in `inbox.db` with
`sender_pubkey = <daemon's member pubkey>` instead of the actual
session sender's ephemeral pubkey, so every DM carried the wrong "from".
|
||||
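The fallback expression quoted above explains the misattribution. A minimal sketch, assuming a connection record with the two fields named in the text (the `BrokerConnIdentity` type and `envelopeSenderPubkey` helper are hypothetical):

```ts
// Hypothetical view of a broker-side connection; only the two fields
// named in the changelog are modeled.
interface BrokerConnIdentity {
  memberPubkey: string;
  sessionPubkey?: string; // absent on a member-keyed (daemon) WS
}

// Mirrors `conn.sessionPubkey ?? conn.memberPubkey`.
function envelopeSenderPubkey(conn: BrokerConnIdentity): string {
  return conn.sessionPubkey ?? conn.memberPubkey;
}
```

Any send that arrives over the member-keyed connection therefore resolves to the member pubkey, regardless of which session issued it.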
|
||||
The fix routes session-originated sends through the matching
`SessionBrokerClient`, so the broker naturally sees
`conn.sessionPubkey = <sender session pubkey>` and no broker-side
change is needed. Mechanics:
|
||||
|
||||
- New `outbox.sender_session_pubkey` column. The IPC `/v1/send`
  handler fills it whenever the request authenticates as a launched
  session (`Authorization: ClaudeMesh-Session …`).
- IPC `/v1/send` now encrypts with the **session secret** (was: mesh
  member secret) when a session token is present. Recipient's
  `inbound.ts` decrypts with `senderSessionPub × recipientSessionSec`
  → matches what the sender wrote.
- `SessionBrokerClient` gains a `send()` method mirroring
  `DaemonBrokerClient.send` (pendingAcks tracking, 15s ack-timeout,
  queue-while-reconnecting via the `opens` array). Composition kept
  intact: both clients share `connectWsWithBackoff`; the
  request/ack bookkeeping is duplicated rather than subclassed.
- Drain worker reads `sender_session_pubkey` and looks up an open
  session-WS via a new `getSessionBrokerByPubkey` accessor on
  `DrainOptions`. Session-attributed rows REQUIRE an open session-WS;
  no fallback to daemon-WS, because the row is encrypted with the
  session secret and silent fallback would break decryption on the
  recipient side. Closed/reconnecting → backoff + retry.
- `apps/cli/src/daemon/run.ts` maintains a parallel
  `sessionBrokersByPubkey` index alongside the existing token-keyed
  map, kept in sync on register/deregister.
|
||||
|
||||
Cold-path sends (no session token in IPC headers) keep the legacy
|
||||
member-key flow unchanged. Pre-1.34.0 outbox rows (NULL session
|
||||
pubkey) drain via the daemon-WS as before — no migration of in-flight
|
||||
rows is required.
|
||||
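The drain routing rules above can be sketched as a single decision function. This is an illustrative reduction, not the shipped drain worker: `OutboxRowLike`, `DrainRoute`, and `chooseDrainRoute` are hypothetical names, and the `isSessionWsOpen` callback stands in for the `getSessionBrokerByPubkey` lookup.

```ts
// Only the column relevant to routing is modeled.
interface OutboxRowLike {
  id: string;
  sender_session_pubkey: string | null; // NULL for cold-path and pre-1.34.0 rows
}

type DrainRoute =
  | { via: "daemon-ws" }
  | { via: "session-ws"; pubkey: string }
  | { via: "retry-later"; reason: "session_ws_not_open" };

function chooseDrainRoute(
  row: OutboxRowLike,
  isSessionWsOpen: (pubkey: string) => boolean,
): DrainRoute {
  // Legacy and cold-path rows keep the member-keyed flow.
  if (row.sender_session_pubkey === null) return { via: "daemon-ws" };
  // Session-attributed rows must go out on the matching session-WS.
  if (isSessionWsOpen(row.sender_session_pubkey)) {
    return { via: "session-ws", pubkey: row.sender_session_pubkey };
  }
  // Never fall back to the daemon-WS: the row is encrypted with the
  // session secret, so the recipient could not decrypt it.
  return { via: "retry-later", reason: "session_ws_not_open" };
}
```

Modeling "retry later" as an explicit outcome rather than an error keeps the no-fallback rule visible at the call site.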
|
||||
### `claudemesh inbox` reads `inbox.db` (was: stale broker buffer)
|
||||
|
||||
The pre-1.34.0 implementation opened a fresh `BrokerClient`, waited
1s, then drained an in-memory push buffer that could only contain
pushes received during that 1s window, which is completely disjoint
from the daemon's persisted `~/.claudemesh/daemon/inbox.db`. So, on
top of the attribution bug above, a real smoke that DID land messages
in the daemon's `inbox.db` reported "No messages on mesh prueba1"
because the CLI was looking at the wrong source.
|
||||
|
||||
Fixed:
|
||||
|
||||
- New `tryListInboxViaDaemon(mesh, limit)` daemon-route helper hits
  `/v1/inbox`.
- `listInbox` (DB layer) and the `/v1/inbox` IPC endpoint accept a
  `mesh` filter so the server scopes server-side instead of returning
  all rows and filtering in-process.
- `runInbox` rewritten to call the daemon-route helper. JSON mode
  returns the raw daemon shape; the human renderer formats sender
  name + pubkey prefix + body + receipt time per row.
|
||||
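Server-side scoping of this kind usually amounts to pushing the filter into the query. A minimal sketch, assuming an `inbox` table with `mesh` and `received_at` columns (the names appear elsewhere in this diff; the ordering and the `buildInboxQuery` helper are assumptions, not the actual `listInbox` code):

```ts
interface InboxQuery {
  sql: string;
  params: (string | number)[];
}

// Builds a parameterized query; the mesh filter is applied in SQL
// rather than after fetching every row.
function buildInboxQuery(mesh: string | undefined, limit: number): InboxQuery {
  const where = mesh === undefined ? "" : " WHERE mesh = ?";
  const params: (string | number)[] = mesh === undefined ? [limit] : [mesh, limit];
  return {
    sql: `SELECT * FROM inbox${where} ORDER BY received_at DESC LIMIT ?`,
    params,
  };
}
```

Because the limit is applied after the filter, `--mesh <slug> --limit 20` yields up to 20 rows from that mesh rather than whatever subset of the newest 20 rows overall happens to match.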
|
||||
The cold-path "drain a fresh-broker buffer" flow was always vestigial
and has been removed entirely.
|
||||
|
||||
### Verifying
|
||||
|
||||
`/tmp/cm-bus-trace.mjs` (workshop scratch, not shipped) opens an SSE
listener against `/v1/events`, registers two test sessions, sends
in both directions, and asserts that the broker `message` events
surface correctly. It was used to confirm that the daemon's
`bus.publish` path was already fine; the regression sat upstream in
the daemon's outbound attribution.
|
||||
|
||||
After this version: real two-session smokes show
|
||||
`sender_pubkey = <session pubkey>` (not member pubkey),
|
||||
`claudemesh inbox --mesh <slug>` lists what the daemon actually
|
||||
received, and existing MCP `notifications/claude/channel` events
|
||||
carry the correct sender attribution to Claude Code.
|
||||
|
||||
## 1.33.0 (2026-05-04) — Milestone 1: lifecycle cleanups + at-least-once with ack
|
||||
|
||||
First milestone of the agentic-comms architecture work
|
||||
|
||||
@@ -1,6 +1,6 @@
|
||||
{
|
||||
"name": "claudemesh-cli",
|
||||
"version": "1.33.0",
|
||||
"version": "1.34.16",
|
||||
"description": "Peer mesh for Claude Code sessions — CLI + MCP server.",
|
||||
"keywords": [
|
||||
"claude-code",
|
||||
|
||||
@@ -9,9 +9,13 @@ description: Use when the user asks to send a message to a peer Claude session,
|
||||
|
||||
**You invoke claudemesh exclusively through the `claudemesh` CLI via Bash.** There are no MCP tools to call — `tools/list` returns empty for the claudemesh MCP server. The MCP server exists only to deliver inbound peer messages as `<channel source="claudemesh">` interrupts mid-turn. Everything else is CLI.
|
||||
|
||||
## Launch welcome (`kind: "welcome"`) — 1.34.2+
|
||||
|
||||
5 seconds after Claude Code attaches to claudemesh via `claudemesh launch`, the MCP server emits ONE `<channel source="claudemesh">` push with `meta.kind: "welcome"`. It carries identity (`self_display_name`, `self_session_pubkey`, `self_role`), the active `mesh_slug`, live `peer_count` + `peer_names`, recent `unread_count` + `latest_message_ids`, and a CLI hint line. Treat it as the "mesh is connected" handshake — read it once, internalize identity + peers + inbox state, and use it to decide whether to act on unread items right away. Do NOT reply to a welcome push the way you reply to a DM; it has no sender.
|
||||
|
||||
## When you receive a `<channel source="claudemesh">` message
|
||||
|
||||
Respond IMMEDIATELY. Pause your current task, reply via `claudemesh send`, then resume. Read `from_name`, `mesh_slug`, and `priority` from the channel attributes. Reply by setting `<to>` to the sender's `from_name`. Do not ignore low-priority messages — acknowledge them briefly even if you defer action. If the channel meta contains `subtype: reminder`, this is a scheduled reminder you set yourself — act on it.
|
||||
Respond IMMEDIATELY (unless `meta.kind` is `"welcome"` or `"system"` — those are informational, no reply needed). Pause your current task, reply via `claudemesh send`, then resume. Read `from_name`, `mesh_slug`, and `priority` from the channel attributes. Reply by setting `<to>` to the sender's `from_name`. Do not ignore low-priority messages — acknowledge them briefly even if you defer action. If the channel meta contains `subtype: reminder`, this is a scheduled reminder you set yourself — act on it.
|
||||
|
||||
### Channel attributes (everything you need to reply is in the push)
|
||||
|
||||
@@ -19,14 +23,17 @@ The `<channel>` interrupt carries these attributes — no lookup needed:
|
||||
|
||||
| Attribute | What it is |
|
||||
|---|---|
|
||||
| `from_name` | Sender's display name. **Use as `to` in your reply** for DMs. |
|
||||
| `from_pubkey` | Sender's session pubkey (hex). Stable per-session. |
|
||||
| `from_member_id` | Sender's stable mesh.member id. Survives display-name changes — the canonical id. |
|
||||
| `from_name` | Sender's display name. **Use as `to` in your reply** for DMs. Empty/absent on `kind: "welcome"` and `kind: "system"`. |
|
||||
| `from_pubkey` | Sender's **session pubkey** (hex, ephemeral per-launch). Since 1.34.0 this is the session pubkey of the launched session that originated the send, NOT the daemon's stable member pubkey — sibling sessions of the same human are correctly disambiguated. |
|
||||
| `from_session_pubkey` | Same as `from_pubkey` for session-originated DMs. Kept as a separate key so the model never confuses session vs member identity when a control-plane source is involved. |
|
||||
| `from_member_id` / `from_member_pubkey` | Sender's stable mesh.member id / pubkey. Survives display-name and session rotation. Use to recognize "the same human across multiple Claude Code windows". |
|
||||
| `mesh_slug` | Mesh the message arrived on. Pass via `--mesh <slug>` if the parent isn't on the same mesh. |
|
||||
| `priority` | `now` / `next` / `low`. |
|
||||
| `message_id` | Server-side id of THIS message. **Pass to `--reply-to <id>` to thread your reply** in topic posts. |
|
||||
| `client_message_id` | Sender-stable idempotency id (UUID). Survives broker restarts; safe to log. |
|
||||
| `topic` | Set when the source is a topic post. Reply via `topic post <topic> --reply-to <message_id>`. |
|
||||
| `reply_to_id` | Set when the message itself is a reply to a previous one — render thread context. |
|
||||
| `kind` (welcome/system meta only) | `"welcome"` for the launch handshake, `"system"` for peer_join/peer_leave/etc. — neither needs a reply. |
|
||||
|
||||
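The reply rules implied by these attributes can be summarized in a small decision function. This is an illustrative sketch only; `ChannelAttrs`, `ReplyDecision`, and `decideReply` are hypothetical names, and the attribute set is a subset of the table above.

```ts
// Subset of the channel attributes; all optional because welcome/system
// pushes omit the sender fields.
interface ChannelAttrs {
  kind?: string;
  from_name?: string;
  mesh_slug?: string;
  message_id?: string;
  topic?: string;
}

type ReplyDecision =
  | { reply: false }
  | { reply: true; to: string; mesh: string | undefined; replyToId: string | undefined };

function decideReply(attrs: ChannelAttrs): ReplyDecision {
  // Welcome and system pushes are informational: no sender, no reply.
  if (attrs.kind === "welcome" || attrs.kind === "system") return { reply: false };
  if (!attrs.from_name) return { reply: false };
  return {
    reply: true,
    to: attrs.from_name,
    mesh: attrs.mesh_slug,
    // Threading via --reply-to is described for topic posts only.
    replyToId: attrs.topic !== undefined ? attrs.message_id : undefined,
  };
}
```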
**Reply patterns:**
|
||||
|
||||
@@ -370,15 +377,33 @@ claudemesh message send <p> "..." --priority now # bypass busy gates
|
||||
claudemesh message send <p> "..." --priority next # default
|
||||
claudemesh message send <p> "..." --priority low # pull-only
|
||||
|
||||
# inbox (alias: claudemesh inbox)
|
||||
claudemesh message inbox
|
||||
claudemesh message inbox --json
|
||||
# inbox (alias: claudemesh inbox) — 1.34.0+ reads from inbox.db via daemon IPC
|
||||
claudemesh inbox # all attached meshes, last 100
|
||||
claudemesh inbox --mesh <slug> # scoped to one mesh
|
||||
claudemesh inbox --mesh <slug> --limit 20 # custom cap
|
||||
claudemesh inbox --json # full row (sender_pubkey, mesh, body, received_at, seen_at, …)
|
||||
claudemesh inbox --unread # 1.34.8+ only rows whose seen_at IS NULL
|
||||
|
||||
# inbox flush + delete — 1.34.7+
|
||||
claudemesh inbox flush --mesh <slug> # delete all rows on one mesh
|
||||
claudemesh inbox flush --before <iso-timestamp> # delete rows older than timestamp
|
||||
claudemesh inbox flush --all # delete every row on every mesh (required guard)
|
||||
claudemesh inbox delete <id> # delete one inbox row by id (alias: rm)
|
||||
claudemesh inbox flush --mesh <slug> --json # JSON: { ok: true, removed: N }
|
||||
|
||||
# delivery status (alias: claudemesh msg-status <id>)
|
||||
claudemesh message status <message-id>
|
||||
claudemesh message status <message-id> --json
|
||||
```
|
||||
|
||||
**Inbox source (1.34.0+):** `claudemesh inbox` queries the daemon's persistent `~/.claudemesh/daemon/inbox.db` over IPC — it is NOT a fresh broker-WS buffer drain. Rows survive daemon restarts. Sender attribution is the actual session pubkey of the launched session that originated the send (NOT the stable member pubkey of the sender's daemon), so two sibling sessions of the same human appear as distinct rows.
|
||||
|
||||
**Read-state (1.34.8+):** every inbox row carries a `seen_at` timestamp. `null` = never surfaced; an ISO string = first surfaced at that moment. The flag flips automatically when (a) the row is returned by an interactive `claudemesh inbox` listing, or (b) the MCP server emits a live `<channel>` reminder for it. The launch welcome push uses `unread_only=true` to surface only rows the user hasn't seen — so a session relaunched a day later sees what it actually missed, not the same 24h batch every time. Use `claudemesh inbox --unread` to get the same filter from the CLI.
|
||||
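The read-state semantics can be modeled in a few lines. This is a sketch under stated assumptions: an in-memory array stands in for `inbox.db`, and `listInboxRows` with its `unreadOnly` / `markSeen` options is a hypothetical helper, not the daemon's actual code.

```ts
interface InboxRowState {
  id: string;
  seen_at: string | null; // null = never surfaced
}

function listInboxRows(
  store: InboxRowState[],
  opts: { unreadOnly: boolean; markSeen: boolean },
  now: Date,
): InboxRowState[] {
  const selected = store.filter((row) => !opts.unreadOnly || row.seen_at === null);
  // Snapshot before stamping so the caller still sees which rows were new.
  const snapshot = selected.map((row) => ({ ...row }));
  if (opts.markSeen) {
    for (const row of selected) {
      // Only the first surfacing is recorded; later listings leave it alone.
      if (row.seen_at === null) row.seen_at = now.toISOString();
    }
  }
  return snapshot;
}
```

Listing with `unreadOnly: true` twice in a row returns the rows the first time and nothing the second, which is the behavior the welcome push relies on.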
|
||||
**Self-echo guard (1.34.8+):** broker fan-out paths sometimes mirror an outbound DM back to the originating session-WS. The daemon now drops those at the WS boundary (matching on `senderPubkey === own.session_pubkey`), so the sender no longer sees their own `claudemesh send` arrive as a `← claudemesh: <self>: ...` channel push immediately after dispatching it.
|
||||
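The guard described above is essentially an equality filter at the WS boundary. A minimal sketch, with a hypothetical `InboundDm` shape and `dropSelfEchoes` helper:

```ts
interface InboundDm {
  senderPubkey: string;
  body: string;
}

// Drops pushes whose sender is this session itself, so an outbound DM
// mirrored back by the broker never surfaces as an incoming message.
function dropSelfEchoes(pushes: InboundDm[], ownSessionPubkey: string): InboundDm[] {
  return pushes.filter((push) => push.senderPubkey !== ownSessionPubkey);
}
```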
|
||||
**Inbox TTL (1.34.8+):** the daemon runs an hourly prune that deletes rows older than 30 days. Without this the inbox grew unbounded; now it self-trims while preserving "I went on holiday and want to see what I missed" recovery for a generous window. No CLI knob — it's a built-in retention policy. To override, manually `claudemesh inbox flush --before <iso>`.
|
||||
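The retention cutoff is simple date arithmetic. The sketch below assumes the 30-day window stated above; `INBOX_TTL_DAYS` and `pruneCutoffIso` are hypothetical names.

```ts
const INBOX_TTL_DAYS = 30;
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Returns the ISO timestamp before which rows are eligible for pruning.
function pruneCutoffIso(now: Date, ttlDays: number = INBOX_TTL_DAYS): string {
  return new Date(now.getTime() - ttlDays * MS_PER_DAY).toISOString();
}
```

A shorter window can be applied by hand by computing a cutoff with a smaller `ttlDays` and passing the result to `claudemesh inbox flush --before <iso>`.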
|
||||
`send` JSON output: `{"ok": true, "messageId": "...", "target": "..."}`. Errors: `{"ok": false, "error": "..."}`.
|
||||
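For scripts that consume this output, the two shapes map naturally onto a discriminated union. A minimal sketch, where the field names come from the line above and the `SendResult` type and `parseSendOutput` helper are assumptions:

```ts
type SendResult =
  | { ok: true; messageId: string; target: string }
  | { ok: false; error: string };

// Parses one line of `send` JSON output and validates its shape.
function parseSendOutput(stdout: string): SendResult {
  const parsed: unknown = JSON.parse(stdout);
  if (typeof parsed !== "object" || parsed === null) {
    throw new Error("unexpected send output");
  }
  const fields = parsed as Record<string, unknown>;
  const messageId = fields["messageId"];
  const target = fields["target"];
  const error = fields["error"];
  if (fields["ok"] === true && typeof messageId === "string" && typeof target === "string") {
    return { ok: true, messageId, target };
  }
  if (fields["ok"] === false && typeof error === "string") {
    return { ok: false, error };
  }
  throw new Error("unexpected send output");
}
```

Branching on `ok` then gives type-safe access to either `messageId` or `error` without further checks.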
|
||||
### `state` — shared per-mesh key-value store
|
||||
|
||||
@@ -29,6 +29,14 @@ const BOOLEAN_FLAGS = new Set([
|
||||
"dry-run",
|
||||
"verbose",
|
||||
"skip-service",
|
||||
// 1.34.8: `--unread` filters `claudemesh inbox` to rows whose
|
||||
// seen_at is NULL. No value — pure switch.
|
||||
"unread",
|
||||
// 1.34.12: `--foreground` keeps `claudemesh daemon up` attached
|
||||
// to the terminal (pre-1.34.12 behavior). Default is detached now.
|
||||
"foreground",
|
||||
"no-tcp",
|
||||
"public-health",
|
||||
]);
|
||||
|
||||
export function parseArgv(argv: string[]): ParsedArgs {
|
||||
|
||||
@@ -1,3 +1,7 @@
|
||||
import { spawn } from "node:child_process";
|
||||
import { existsSync, openSync, mkdirSync } from "node:fs";
|
||||
import { join } from "node:path";
|
||||
|
||||
import { runDaemon } from "~/daemon/run.js";
|
||||
import { ipc, IpcError } from "~/daemon/ipc/client.js";
|
||||
import { readRunningPid } from "~/daemon/lock.js";
|
||||
@@ -9,6 +13,15 @@ export interface DaemonOptions {
|
||||
publicHealth?: boolean;
|
||||
mesh?: string;
|
||||
displayName?: string;
|
||||
/** 1.34.12: keep the daemon attached to the current shell instead
|
||||
* of double-forking. Default behavior changed in 1.34.12 — `up`
|
||||
* now detaches by default and writes JSON logs to
|
||||
* ~/.claudemesh/daemon/daemon.log. Pass `--foreground` to get the
|
||||
* pre-1.34.12 behavior (logs streaming to stdout, blocks the
|
||||
* terminal until Ctrl-C). install-service and `claudemesh launch`'s
|
||||
* auto-spawn path always pass --foreground because their parents
|
||||
* (launchd / the launch helper) own the lifecycle. */
|
||||
foreground?: boolean;
|
||||
/** outbox-list status filter, set from boolean flags --failed/--pending/etc. */
|
||||
outboxStatus?: "pending" | "inflight" | "done" | "dead" | "aborted";
|
||||
/** outbox requeue: optional id to mint a fresh client_message_id with. */
|
||||
@@ -26,11 +39,40 @@ export async function runDaemonCommand(
|
||||
|
||||
case "up":
|
||||
case "start":
|
||||
// 1.34.10: `--mesh` and `--name` deprecated.
|
||||
// --mesh: daemon attaches to every joined mesh automatically;
|
||||
// pinning at start time blocks new meshes from being picked up.
|
||||
// --name: overrides the daemon-WS display name GLOBALLY across
|
||||
// every mesh, but each mesh has its own per-mesh display name
|
||||
// in config.json (set at `claudemesh join` time). Passing one
|
||||
// name flattens that out. Sessions advertise their own
|
||||
// CLAUDEMESH_DISPLAY_NAME at `claudemesh launch` time anyway,
|
||||
// and the daemon-WS presence is hidden from peer lists since
|
||||
// 1.32, so the daemon's display name isn't user-visible.
|
||||
if (opts.mesh) {
|
||||
process.stderr.write(
|
||||
`[claudemesh] --mesh on \`daemon up\` is deprecated; the daemon attaches to every joined mesh automatically. ` +
|
||||
`Ignoring --mesh ${opts.mesh}.\n`,
|
||||
);
|
||||
}
|
||||
if (opts.displayName) {
|
||||
process.stderr.write(
|
||||
`[claudemesh] --name on \`daemon up\` is deprecated; per-mesh display names live in config.json (set at join time), ` +
|
||||
`and session display names come from \`claudemesh launch --name\`. Ignoring --name ${opts.displayName}.\n`,
|
||||
);
|
||||
}
|
||||
// 1.34.12: detach by default. The pre-1.34.12 behavior streamed
|
||||
// JSON logs to the controlling terminal and blocked the shell —
|
||||
// fine for debugging, surprising for users who just want the
|
||||
// daemon "up." `--foreground` opts back into the old behavior;
|
||||
// launchd / systemd-user units always pass it because the unit
|
||||
// manager owns lifecycle and stdio redirection.
|
||||
if (!opts.foreground) {
|
||||
return spawnDetachedDaemon(opts);
|
||||
}
|
||||
return runDaemon({
|
||||
tcpEnabled: !opts.noTcp,
|
||||
publicHealthCheck: opts.publicHealth,
|
||||
mesh: opts.mesh,
|
||||
displayName: opts.displayName,
|
||||
});
|
||||
|
||||
case "help":
|
||||
@@ -74,19 +116,18 @@ USAGE
|
||||
claudemesh daemon <command> [options]
|
||||
|
||||
COMMANDS
|
||||
up | start start the daemon in the foreground
|
||||
up | start start the daemon (detached by default)
|
||||
status show running pid + IPC health
|
||||
version ipc + schema version of the running daemon
|
||||
down | stop stop the running daemon (SIGTERM, then wait)
|
||||
accept-host pin the current host fingerprint
|
||||
outbox list list local outbox rows (newest first)
|
||||
outbox requeue <id> re-enqueue an aborted / dead outbox row
|
||||
install-service --mesh <s> write launchd (macOS) / systemd-user (Linux) unit
|
||||
install-service write launchd (macOS) / systemd-user (Linux) unit
|
||||
uninstall-service remove the platform service unit
|
||||
|
||||
OPTIONS
|
||||
--mesh <slug> attach to / target this mesh
|
||||
--name <displayName> override CLAUDEMESH_DISPLAY_NAME
|
||||
--foreground keep daemon attached to terminal, JSON logs to stdout (1.34.12+)
|
||||
--no-tcp disable the loopback TCP listener (UDS only)
|
||||
--public-health expose /v1/health unauthenticated on TCP
|
||||
--json machine-readable output where supported
|
||||
@@ -192,9 +233,12 @@ async function runInstallService(opts: DaemonOptions): Promise<number> {
|
||||
}
|
||||
// Resolve the binary path. Prefer the running argv[0] when it's an
|
||||
// installed claudemesh binary; fall back to whichever `claudemesh` is
|
||||
// first on PATH. --mesh is now optional: omit it to attach to every
|
||||
// joined mesh (the 1.26.0 multi-mesh default); pass it to lock the
|
||||
// unit to a single mesh for testing or single-mesh hosts.
|
||||
// first on PATH.
|
||||
// 1.34.10: install-service no longer bakes --mesh into the unit. The
|
||||
// daemon attaches to every joined mesh by default, and pinning the
|
||||
// unit to one slug at install time was the source of the "joined a
|
||||
// new mesh but my service ignores it" footgun. If the user passes
|
||||
// --mesh anyway, we warn + ignore.
|
||||
let binary = process.argv[1] ?? "";
|
||||
if (!binary || /\.ts$/.test(binary) || /node_modules|src\/entrypoints/.test(binary)) {
|
||||
try {
|
||||
@@ -205,11 +249,19 @@ async function runInstallService(opts: DaemonOptions): Promise<number> {
|
||||
return 1;
|
||||
}
|
||||
}
|
||||
if (opts.mesh) {
|
||||
process.stderr.write(
|
||||
`[claudemesh] --mesh on \`daemon install-service\` is deprecated and ignored; the daemon attaches to every joined mesh.\n`,
|
||||
);
|
||||
}
|
||||
if (opts.displayName) {
|
||||
process.stderr.write(
|
||||
`[claudemesh] --name on \`daemon install-service\` is deprecated and ignored; per-mesh names live in config.json, session names come from \`claudemesh launch --name\`.\n`,
|
||||
);
|
||||
}
|
||||
try {
|
||||
const r = installService({
|
||||
binaryPath: binary,
|
||||
...(opts.mesh ? { meshSlug: opts.mesh } : {}),
|
||||
...(opts.displayName ? { displayName: opts.displayName } : {}),
|
||||
});
|
||||
if (opts.json) {
|
||||
process.stdout.write(JSON.stringify({ ok: true, ...r }) + "\n");
|
||||
@@ -309,3 +361,71 @@ async function runStop(opts: DaemonOptions): Promise<number> {
|
||||
else process.stdout.write(`daemon: signaled but did not exit within 5s (pid ${pid})\n`);
|
||||
return 1;
|
||||
}
|
||||
|
||||
/**
|
||||
* 1.34.12: spawn the daemon as a detached background process. Re-execs
|
||||
* the same `claudemesh` binary with `daemon up --foreground` (so the
|
||||
* child runs the long-lived loop), redirects stdout/stderr to
|
||||
* ~/.claudemesh/daemon/daemon.log, and `unref()`s so the parent shell
|
||||
* can exit cleanly.
|
||||
*
|
||||
* The parent waits up to ~3s for the UDS socket to appear before
|
||||
* declaring success — that's the same liveness check `claudemesh launch`
|
||||
* uses, and it catches the "child crashed during boot" case (config
|
||||
* read failed, port bind failed, etc.) with an actionable error
|
||||
* pointing at the log file rather than silent loss.
|
||||
*/
|
||||
async function spawnDetachedDaemon(opts: DaemonOptions): Promise<number> {
|
||||
// Ensure the log directory exists before opening the FDs.
|
||||
mkdirSync(DAEMON_PATHS.DAEMON_DIR, { recursive: true, mode: 0o700 });
|
||||
const logPath = join(DAEMON_PATHS.DAEMON_DIR, "daemon.log");
|
||||
|
||||
// The CLI binary path. process.argv[1] is the entrypoint script the
|
||||
// node runtime is currently executing — for an installed CLI that's
|
||||
// .../bin/claudemesh, for `bun run` dev that's the local dist file.
|
||||
// Either way it's the right thing to re-exec.
|
||||
const binary = process.argv[1] ?? "claudemesh";
|
||||
const args = ["daemon", "up", "--foreground"];
|
||||
if (opts.noTcp) args.push("--no-tcp");
|
||||
if (opts.publicHealth) args.push("--public-health");
|
||||
|
||||
const out = openSync(logPath, "a");
|
||||
const err = openSync(logPath, "a");
|
||||
const child = spawn(process.execPath, [binary, ...args], {
|
||||
detached: true,
|
||||
stdio: ["ignore", out, err],
|
||||
env: process.env,
|
||||
});
|
||||
// `detached: true` above puts the child in its own session/process
// group so closing the shell doesn't SIGHUP the daemon; `unref()`
// lets the parent's event loop exit without waiting on the child.
|
||||
child.unref();
|
||||
|
||||
// Wait for the socket to appear — the daemon's IPC listener binds
|
||||
// ~immediately after the broker WS handshake starts, so socket
|
||||
// existence is a reliable "the daemon is alive enough to accept
|
||||
// requests" signal.
|
||||
const sockPath = DAEMON_PATHS.SOCK_FILE;
|
||||
const startedAt = Date.now();
|
||||
while (Date.now() - startedAt < 3_000) {
|
||||
if (existsSync(sockPath)) {
|
||||
if (opts.json) {
|
||||
process.stdout.write(JSON.stringify({ ok: true, detached: true, pid: child.pid, log: logPath }) + "\n");
|
||||
} else {
|
||||
process.stdout.write(` ✔ daemon started (pid ${child.pid})\n`);
|
||||
process.stdout.write(` → log: ${logPath}\n`);
|
||||
process.stdout.write(` → stop: claudemesh daemon down\n`);
|
||||
}
|
||||
return 0;
|
||||
}
|
||||
await new Promise<void>((r) => setTimeout(r, 100));
|
||||
}
|
||||
|
||||
if (opts.json) {
|
||||
process.stdout.write(JSON.stringify({ ok: false, detached: true, pid: child.pid, reason: "socket_not_appeared", log: logPath }) + "\n");
|
||||
} else {
|
||||
process.stderr.write(` ✘ daemon spawn timeout: socket did not appear within 3s\n`);
|
||||
process.stderr.write(` → check log: ${logPath}\n`);
|
||||
process.stderr.write(` → run foreground for live output: claudemesh daemon up --foreground\n`);
|
||||
}
|
||||
return 1;
|
||||
}
|
||||
|
||||
91
apps/cli/src/commands/inbox-actions.ts
Normal file
@@ -0,0 +1,91 @@
|
||||
/**
|
||||
* `claudemesh inbox flush` and `claudemesh inbox delete <id>` —
|
||||
* mutate the daemon's persistent inbox store
|
||||
* (`~/.claudemesh/daemon/inbox.db`) over IPC.
|
||||
*
|
||||
* 1.34.7: until this version, the only way to clean the inbox was a
|
||||
* raw `sqlite3 inbox.db "DELETE FROM inbox"` against the daemon's
|
||||
* private DB. That works but bypasses the IPC layer (and any future
|
||||
* lifecycle hooks on row removal), and is invisible to a user who
|
||||
* doesn't know the schema. These two verbs make the operation visible
|
||||
* + safe + scriptable.
|
||||
*/
|
||||
|
||||
import {
|
||||
tryFlushInboxViaDaemon,
|
||||
tryDeleteInboxRowViaDaemon,
|
||||
} from "~/services/bridge/daemon-route.js";
|
||||
import { render } from "~/ui/render.js";
|
||||
import { dim } from "~/ui/styles.js";
|
||||
|
||||
export interface InboxFlushFlags {
|
||||
mesh?: string;
|
||||
/** ISO-8601 timestamp; deletes rows received_at < before. */
|
||||
before?: string;
|
||||
/** Required when neither --mesh nor --before is set, to prevent an
|
||||
* accidental "delete every row on every mesh". */
|
||||
all?: boolean;
|
||||
json?: boolean;
|
||||
}
|
||||
|
||||
export async function runInboxFlush(flags: InboxFlushFlags): Promise<void> {
|
||||
const hasFilter = !!(flags.mesh || flags.before);
|
||||
if (!hasFilter && !flags.all) {
|
||||
if (flags.json) { process.stdout.write(JSON.stringify({ ok: false, error: "missing_filter" }) + "\n"); return; }
|
||||
render.info(dim(
|
||||
"Refusing to flush every row on every mesh.\n" +
|
||||
" Re-run with --mesh <slug>, --before <iso-timestamp>, or --all to confirm.",
|
||||
));
|
||||
process.exit(1);
|
||||
}
|
||||
|
||||
const removed = await tryFlushInboxViaDaemon({
|
||||
...(flags.mesh ? { mesh: flags.mesh } : {}),
|
||||
...(flags.before ? { beforeIso: flags.before } : {}),
|
||||
});
|
||||
|
||||
if (removed === null) {
|
||||
if (flags.json) { process.stdout.write(JSON.stringify({ ok: false, error: "daemon_unreachable" }) + "\n"); return; }
|
||||
render.info(dim("Daemon not reachable. Run `claudemesh daemon up` and retry."));
|
||||
process.exit(1);
|
||||
}
|
||||
|
||||
if (flags.json) {
|
||||
process.stdout.write(JSON.stringify({ ok: true, removed }) + "\n");
|
||||
return;
|
||||
}
|
||||
const scope = flags.mesh
|
||||
? `mesh "${flags.mesh}"`
|
||||
: flags.before
|
||||
? `older than ${flags.before}`
|
||||
: "all meshes";
|
||||
render.info(`✔ Flushed ${removed} message${removed === 1 ? "" : "s"} from ${scope}.`);
|
||||
}
|
||||
|
||||
export interface InboxDeleteFlags {
|
||||
json?: boolean;
|
||||
}
|
||||
|
||||
export async function runInboxDelete(id: string, flags: InboxDeleteFlags): Promise<void> {
|
||||
if (!id) {
|
||||
if (flags.json) { process.stdout.write(JSON.stringify({ ok: false, error: "missing_id" }) + "\n"); return; }
|
||||
render.info(dim("Usage: claudemesh inbox delete <message-id>"));
|
||||
process.exit(1);
|
||||
}
|
||||
const ok = await tryDeleteInboxRowViaDaemon(id);
|
||||
if (ok === null) {
|
||||
if (flags.json) { process.stdout.write(JSON.stringify({ ok: false, error: "daemon_unreachable" }) + "\n"); return; }
|
||||
render.info(dim("Daemon not reachable. Run `claudemesh daemon up` and retry."));
|
||||
process.exit(1);
|
||||
}
|
||||
if (!ok) {
|
||||
if (flags.json) { process.stdout.write(JSON.stringify({ ok: false, error: "not_found", id }) + "\n"); return; }
|
||||
render.info(dim(`No inbox row with id "${id}".`));
|
||||
process.exit(1);
|
||||
}
|
||||
if (flags.json) {
|
||||
process.stdout.write(JSON.stringify({ ok: true, id }) + "\n");
|
||||
return;
|
||||
}
|
||||
render.info(`✔ Deleted inbox row ${id}.`);
|
||||
}
|
||||
@@ -1,49 +1,101 @@
|
||||
/**
|
||||
* `claudemesh inbox` — read pending peer messages.
|
||||
* `claudemesh inbox` — read pending peer messages from the daemon's
|
||||
* persisted inbox (`~/.claudemesh/daemon/inbox.db`).
|
||||
*
|
||||
* Connects, waits briefly for push delivery, drains the buffer, prints.
|
||||
* Works best when message-mode is "inbox" or "off" (messages held at broker).
|
||||
* 1.34.0: switched from the legacy cold-path "open fresh broker WS,
|
||||
* drain in-memory buffer" flow to a daemon IPC read against `/v1/inbox`.
|
||||
* The cold path was structurally broken — the persistent inbox lives in
|
||||
* the daemon, and pushes land on its session-WS, not on a freshly-opened
|
||||
* standalone WS. The daemon-route `tryListInboxViaDaemon` returns rows
|
||||
* persisted across daemon restarts and surfaces them with the correct
|
||||
* mesh scoping (server-side mesh filter added in 1.34.0).
|
||||
*
|
||||
* Cold-path fallback removed: when the daemon isn't reachable, the
|
||||
* prior implementation returned an empty list anyway (no broker state
|
||||
* = no buffered pushes), so removing that path doesn't lose any
|
||||
* functionality. Strict mode emits a clear error via daemon-route.
|
||||
*/
|
||||
|
||||
import { withMesh } from "./connect.js";
|
||||
import type { InboundPush } from "~/services/broker/facade.js";
|
||||
import { tryListInboxViaDaemon } from "~/services/bridge/daemon-route.js";
|
||||
import { render } from "~/ui/render.js";
|
||||
import { bold, dim } from "~/ui/styles.js";
|
||||
|
||||
export interface InboxFlags {
|
||||
mesh?: string;
|
||||
json?: boolean;
|
||||
wait?: number;
|
||||
/** Cap the number of rows returned by the daemon. Default 100. */
|
||||
limit?: number;
|
||||
/** 1.34.8: only show rows whose seen_at is NULL (i.e., never
|
||||
* surfaced via an interactive listing or live channel reminder).
|
||||
* When omitted, every row is returned and an interactive listing
|
||||
* stamps them seen as a side effect. */
|
||||
unread?: boolean;
|
||||
}
|
||||
|
||||
function formatMessage(msg: InboundPush): string {
|
||||
const text = msg.plaintext ?? `[encrypted: ${msg.ciphertext.slice(0, 32)}…]`;
|
||||
const from = msg.senderPubkey.slice(0, 8);
|
||||
const time = new Date(msg.createdAt).toLocaleTimeString();
|
||||
const kindTag = msg.kind === "direct" ? "→ direct" : msg.kind;
|
||||
return ` ${bold(from)} ${dim(`[${kindTag}] ${time}`)}\n ${text}`;
|
||||
interface FormattedItem {
|
||||
sender_pubkey: string;
|
||||
sender_name: string;
|
||||
body: string | null;
|
||||
topic: string | null;
|
||||
received_at: string;
|
||||
mesh: string;
|
||||
}
|
||||
|
||||
function formatMessage(msg: FormattedItem, includeMesh: boolean): string {
|
||||
const text = msg.body ?? "[encrypted]";
|
||||
const from = msg.sender_name && msg.sender_name !== msg.sender_pubkey.slice(0, 8)
|
||||
? `${msg.sender_name} (${msg.sender_pubkey.slice(0, 8)})`
|
||||
: msg.sender_pubkey.slice(0, 8);
|
||||
const time = new Date(msg.received_at).toLocaleTimeString();
|
||||
const topicTag = msg.topic ? ` (#${msg.topic})` : "";
|
||||
const meshTag = includeMesh ? ` [${msg.mesh}]` : "";
|
||||
return ` ${bold(from)} ${dim(`${meshTag}${topicTag} ${time}`)}\n ${text}`;
|
||||
}
|
||||
|
||||
export async function runInbox(flags: InboxFlags): Promise<void> {
|
||||
const waitMs = (flags.wait ?? 1) * 1000;
|
||||
// Mesh resolution is owned by the daemon (it knows which meshes are
|
||||
// attached) — the CLI just forwards the user's --mesh flag through.
|
||||
// When omitted, the daemon's `/v1/inbox` honors the session-default
|
||||
// mesh on auth-token requests; out-of-session callers see rows from
|
||||
// every attached mesh. We don't pre-validate the mesh slug here so
|
||||
// the command works even from a launch tmpdir whose local
|
||||
// `config.json` only knows about the launch's mesh.
|
||||
const meshSlug = flags.mesh;
|
||||
|
||||
await withMesh({ meshSlug: flags.mesh ?? null }, async (client, mesh) => {
|
||||
await new Promise<void>((resolve) => setTimeout(resolve, waitMs));
|
||||
const messages = client.drainPushBuffer();
|
||||
|
||||
if (flags.json) {
|
||||
process.stdout.write(JSON.stringify(messages, null, 2) + "\n");
|
||||
return;
|
||||
}
|
||||
|
||||
if (messages.length === 0) {
|
||||
render.info(dim(`No messages on mesh "${mesh.slug}".`));
|
||||
return;
|
||||
}
|
||||
|
||||
render.section(`inbox — ${mesh.slug} (${messages.length} message${messages.length === 1 ? "" : "s"})`);
|
||||
for (const msg of messages) {
|
||||
process.stdout.write(formatMessage(msg) + "\n\n");
|
||||
}
|
||||
const items = await tryListInboxViaDaemon(meshSlug, flags.limit ?? 100, {
|
||||
unreadOnly: flags.unread === true,
|
||||
// CLI is the canonical "I'm reading my inbox" path — let the daemon
|
||||
// auto-stamp seen_at on the rows we just rendered. The MCP welcome
|
||||
// path passes mark_seen=false instead and stamps explicitly after
|
||||
// the channel notification succeeds.
|
||||
markSeen: true,
|
||||
});
|
||||
if (items === null) {
|
||||
if (flags.json) { process.stdout.write("[]\n"); return; }
|
||||
render.info(dim("Daemon not reachable. Run `claudemesh daemon up` and retry."));
|
||||
return;
|
||||
}
|
||||
|
||||
if (flags.json) {
|
||||
process.stdout.write(JSON.stringify(items, null, 2) + "\n");
|
||||
return;
|
||||
}
|
||||
|
||||
if (items.length === 0) {
|
||||
const scope = meshSlug ? `mesh "${meshSlug}"` : "any mesh";
|
||||
const filter = flags.unread ? "unread " : "";
|
||||
render.info(dim(`No ${filter}messages on ${scope}.`));
|
||||
return;
|
||||
}
|
||||
|
||||
const filterTag = flags.unread ? " unread" : "";
|
||||
const heading = meshSlug
|
||||
? `inbox — ${meshSlug} (${items.length}${filterTag} message${items.length === 1 ? "" : "s"})`
|
||||
: `inbox (${items.length}${filterTag} message${items.length === 1 ? "" : "s"})`;
|
||||
render.section(heading);
|
||||
// When the user didn't filter by mesh, surface the mesh slug per row
|
||||
// so they can tell apart rows from different meshes at a glance.
|
||||
for (const msg of items) {
|
||||
process.stdout.write(formatMessage(msg, !meshSlug) + "\n\n");
|
||||
}
|
||||
}
|
||||
|
||||
@@ -76,12 +76,32 @@ export async function runKick(
|
||||
if ("error" in built) { render.err(String(built.error)); return EXIT.INVALID_ARGS; }
|
||||
|
||||
return await withMesh({ meshSlug }, async (client) => {
|
||||
const result = await client.sendAndWait(built as Record<string, unknown>) as { affected?: string[]; kicked?: string[] };
|
||||
const result = await client.sendAndWait(built as Record<string, unknown>) as {
|
||||
affected?: string[];
|
||||
kicked?: string[];
|
||||
// 1.34.15: broker refuses to kick control-plane WSes (they'd
|
||||
// just auto-reconnect). Older brokers don't emit this field.
|
||||
skipped_control_plane?: string[];
|
||||
};
|
||||
const peers = result?.affected ?? result?.kicked ?? [];
|
||||
if (peers.length === 0) render.info("No peers matched.");
|
||||
else {
|
||||
const skipped = result?.skipped_control_plane ?? [];
|
||||
|
||||
if (peers.length === 0 && skipped.length === 0) {
|
||||
render.info("No peers matched.");
|
||||
} else if (peers.length === 0 && skipped.length > 0) {
|
||||
render.warn(
|
||||
`${skipped.length} match(es) refused: ${skipped.join(", ")} — control-plane connections (daemon / dashboard) auto-reconnect, so kick is a no-op.`,
|
||||
"To take a daemon offline locally, run `claudemesh daemon down` on that machine. To remove a member from the mesh, use `claudemesh ban <peer>`.",
|
||||
);
|
||||
} else {
|
||||
render.ok(`Kicked ${peers.length} peer(s): ${peers.join(", ")}`);
|
||||
render.hint("Their Claude Code session ended. They can rejoin anytime by running `claudemesh`.");
|
||||
if (skipped.length > 0) {
|
||||
render.warn(
|
||||
`(also refused ${skipped.length} control-plane connection(s): ${skipped.join(", ")})`,
|
||||
"Daemon / dashboard connections auto-reconnect; kick is a no-op against them. Use `claudemesh ban <peer>` to remove a member entirely.",
|
||||
);
|
||||
}
|
||||
}
|
||||
return EXIT.SUCCESS;
|
||||
});
|
||||
|
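Reviewer note: the three-way branch in `runKick` above (nothing matched, everything refused, some kicked) can be read as a pure classification step followed by rendering. The sketch below is illustrative only; `KickReply`, `KickOutcome` and `classifyKick` are hypothetical names that do not appear in this diff, and the field names are taken from the cast in the hunk.

```typescript
// Hypothetical helper, not part of the diff: separates "what happened"
// from "what to print" for a kick reply.
interface KickReply {
  affected?: string[];
  kicked?: string[];
  skipped_control_plane?: string[]; // absent on older brokers
}

type KickOutcome =
  | { kind: "no-match" }
  | { kind: "all-refused"; skipped: string[] }
  | { kind: "kicked"; peers: string[]; skipped: string[] };

function classifyKick(reply: KickReply | undefined): KickOutcome {
  // Same precedence as the hunk: `affected` wins over `kicked`.
  const peers = reply?.affected ?? reply?.kicked ?? [];
  const skipped = reply?.skipped_control_plane ?? [];
  if (peers.length === 0 && skipped.length === 0) return { kind: "no-match" };
  if (peers.length === 0) return { kind: "all-refused", skipped };
  return { kind: "kicked", peers, skipped };
}
```

With this shape, the `render.info` / `render.warn` / `render.ok` calls would switch on `kind`, and the classification could be unit-tested without a broker.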
||||
@@ -63,6 +63,7 @@ async function ensureDaemonRunning(meshSlug: string, quiet: boolean): Promise<vo
|
||||
const res = await ensureDaemonReady({ budgetMs: 10_000, mesh: meshSlug });
|
||||
if (res.state === "up") {
|
||||
if (!quiet) render.ok("daemon already running");
|
||||
await warnIfDaemonStale(quiet);
|
||||
return;
|
||||
}
|
||||
if (res.state === "started") {
|
||||
@@ -71,10 +72,34 @@ async function ensureDaemonRunning(meshSlug: string, quiet: boolean): Promise<vo
|
||||
}
|
||||
render.warn(
|
||||
`daemon ${res.state}${res.reason ? `: ${res.reason}` : ""}`,
|
||||
"Run `claudemesh daemon up --mesh " + meshSlug + "` manually, then re-launch.",
|
||||
"Run `claudemesh daemon up` manually, then re-launch.",
|
||||
);
|
||||
}
|
||||
|
||||
/** 1.34.9: warn when the running daemon's version doesn't match the CLI
|
||||
* that's about to launch a session. `npm i -g claudemesh-cli` upgrades
|
||||
* the binaries on disk but doesn't restart a launchd / systemd-user
|
||||
* service or a foreground `claudemesh daemon up`, so users routinely
|
||||
* ship a fix to the CLI side and never see it because the WS lifecycle,
|
||||
* echo guards, and self-join filters all live in the long-running
|
||||
* daemon process. We probe `/v1/version` and emit a one-shot stderr
|
||||
* warning when CLI ≠ daemon. Best-effort; failures are silent. */
|
||||
async function warnIfDaemonStale(quiet: boolean): Promise<void> {
|
||||
if (quiet) return;
|
||||
try {
|
||||
const { ipc } = await import("~/daemon/ipc/client.js");
|
||||
const { VERSION } = await import("~/constants/urls.js");
|
||||
const res = await ipc<{ daemon_version?: string }>({ path: "/v1/version", timeoutMs: 1_500 });
|
||||
if (res.status !== 200) return;
|
||||
const daemonVersion = res.body.daemon_version ?? "";
|
||||
if (!daemonVersion || daemonVersion === VERSION) return;
|
||||
render.warn(
|
||||
`daemon is ${daemonVersion}, CLI is ${VERSION} — restart to pick up new fixes.`,
|
||||
"Run: `claudemesh daemon down && claudemesh daemon up` (no --mesh — daemon attaches to every joined mesh; restart the launchd / systemd-user unit if you installed one).",
|
||||
);
|
||||
} catch { /* swallow — version probe is best-effort */ }
|
||||
}
|
||||
|
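Reviewer note: the decision inside `warnIfDaemonStale` reduces to one predicate. A minimal sketch, assuming exact string equality is the intended comparison (as in the hunk); `isDaemonStale` is a hypothetical name.

```typescript
// Hypothetical predicate mirroring the check in warnIfDaemonStale:
// a missing or empty daemon version is treated as "unknown", not stale.
function isDaemonStale(daemonVersion: string | undefined, cliVersion: string): boolean {
  const v = daemonVersion ?? "";
  return v !== "" && v !== cliVersion;
}
```

Exact equality means a daemon that is newer than the CLI also triggers the warning, which matches the "CLI ≠ daemon" wording in the doc comment.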
||||
async function pickMesh(meshes: JoinedMesh[]): Promise<JoinedMesh> {
|
||||
if (meshes.length === 1) return meshes[0]!;
|
||||
|
||||
@@ -375,11 +400,13 @@ async function printBrokerWelcome(meshSlug: string): Promise<void> {
|
||||
}
|
||||
} catch { /* daemon unreachable — not fatal */ }
|
||||
|
||||
// Peer count (best-effort).
|
||||
// Peer count (best-effort). 1.34.15: scope to the launched mesh so
|
||||
// multi-mesh daemons don't inflate the welcome banner with peers
|
||||
// from other meshes the user didn't just attach to.
|
||||
let peerCount = -1;
|
||||
try {
|
||||
const { tryListPeersViaDaemon } = await import("~/services/bridge/daemon-route.js");
|
||||
const peers = (await tryListPeersViaDaemon()) ?? [];
|
||||
const peers = (await tryListPeersViaDaemon(meshSlug)) ?? [];
|
||||
peerCount = peers.filter((p) =>
|
||||
(p as { channel?: string }).channel !== "claudemesh-daemon",
|
||||
).length;
|
||||
|
||||
@@ -135,9 +135,17 @@ async function listPeersForMesh(slug: string): Promise<PeerRecord[]> {
|
||||
// lifecycle helper inside tryListPeersViaDaemon auto-spawns the
|
||||
// daemon if it's down and probes it for liveness — no separate bridge
|
||||
// tier is needed any more (1.28.0).
|
||||
//
|
||||
// 1.34.15: forward `slug` to the daemon as `?mesh=<slug>` so the
|
||||
// server-side aggregator narrows to the requested mesh. Pre-1.34.15
|
||||
// we called this with no argument, so a multi-mesh daemon returned
|
||||
// peers from every attached mesh and the renderer printed "peers on
|
||||
// flexicar" with cross-mesh rows mixed in. The daemon's
|
||||
// `meshFromCtx` already does the right scoping when the slug is
|
||||
// passed; the CLI just wasn't passing it.
|
||||
try {
|
||||
const { tryListPeersViaDaemon } = await import("~/services/bridge/daemon-route.js");
|
||||
const dr = await tryListPeersViaDaemon();
|
||||
const dr = await tryListPeersViaDaemon(slug);
|
||||
if (dr !== null) {
|
||||
return dr.map((p) => annotateSelf(p as PeerRecord, selfMemberPubkey, selfSessionPubkey));
|
||||
}
|
||||
|
||||
@@ -1,10 +1,82 @@
|
||||
import { existsSync } from "node:fs";
|
||||
import { homedir } from "node:os";
|
||||
import { join } from "node:path";
|
||||
|
||||
const home = homedir();
|
||||
const DEFAULT_CONFIG_DIR = join(home, ".claudemesh");
|
||||
|
||||
/**
|
||||
* Resolve `CONFIG_DIR` once, with stale-env detection.
|
||||
*
|
||||
* `claudemesh launch` exposes `CLAUDEMESH_CONFIG_DIR=<tmpdir>` to its
|
||||
* spawned `claude` so the per-session mesh selection is isolated from
|
||||
* `~/.claudemesh/config.json`. The tmpdir is rmSync'd on launch exit.
|
||||
*
|
||||
* Footgun: if a `claudemesh` invocation INHERITS that env from an
|
||||
* already-launched (or previously-launched) session — e.g. a Bash tool
|
||||
* call inside Claude Code, or a tmux pane that captured the env via
|
||||
* `update-environment` — the inherited path may point at a tmpdir that
|
||||
* no longer exists. Pre-1.34.14 we silently used the dead path,
|
||||
* `readConfig()` came back empty, and the user saw "No meshes joined"
|
||||
* from an otherwise-working install.
|
||||
*
|
||||
* Resolution rules:
|
||||
* 1. No env var → `~/.claudemesh` (default).
|
||||
* 2. Env points at an existing directory → trust it
|
||||
* (the legitimate per-session-launch case, even before `config.json` is written).
|
||||
* 3. Env set but stale (directory missing) → warn
|
||||
* once on stderr (TTY-only) and fall back to `~/.claudemesh`.
|
||||
*
|
||||
* Memoized: resolves once on first access. Mid-process env mutations
|
||||
* are intentionally ignored — paths must stay stable across one CLI
|
||||
* invocation.
|
||||
*/
|
||||
let _resolvedConfigDir: string | null = null;
|
||||
let _warnedStaleEnv = false;
|
||||
|
||||
function resolveConfigDir(): string {
|
||||
if (_resolvedConfigDir !== null) return _resolvedConfigDir;
|
||||
const envDir = process.env.CLAUDEMESH_CONFIG_DIR;
|
||||
if (!envDir) {
|
||||
_resolvedConfigDir = DEFAULT_CONFIG_DIR;
|
||||
return DEFAULT_CONFIG_DIR;
|
||||
}
|
||||
// Trust the env when it resolves to a real directory. We check
|
||||
// the DIR (not `config.json`) because the legitimate "fresh launch
|
||||
// before any write" case has the dir but no config.json yet.
|
||||
// The stale signature we want to catch is `rmSync(tmpDir,
|
||||
// {recursive: true})` from the outer launch's cleanup — that
|
||||
// removes the directory entirely, so a missing dir is the
|
||||
// unambiguous "stale" signal.
|
||||
if (existsSync(envDir)) {
|
||||
_resolvedConfigDir = envDir;
|
||||
return envDir;
|
||||
}
|
||||
// Stale: env set but the dir is gone. Most likely the outer
|
||||
// launch's cleanup ran and we inherited its (now-dead) tmpdir
|
||||
// path. Fall back to default and warn the user once on stderr —
|
||||
// only when attached to a TTY, so non-interactive callers (CI,
|
||||
// MCP boot, scripts piping stdout) stay quiet.
|
||||
if (!_warnedStaleEnv && process.stderr.isTTY) {
|
||||
_warnedStaleEnv = true;
|
||||
const unsetHint =
|
||||
process.env.SHELL?.endsWith("fish")
|
||||
? "set -e CLAUDEMESH_CONFIG_DIR CLAUDEMESH_IPC_TOKEN_FILE"
|
||||
: "unset CLAUDEMESH_CONFIG_DIR CLAUDEMESH_IPC_TOKEN_FILE";
|
||||
process.stderr.write(
|
||||
`claudemesh: ignoring stale CLAUDEMESH_CONFIG_DIR=${envDir} (directory does not exist); using ${DEFAULT_CONFIG_DIR}.\n`
|
||||
+ ` Hint: this is usually a leftover env from a previous \`claudemesh launch\`. Clean it with:\n`
|
||||
+ ` ${unsetHint}\n`,
|
||||
);
|
||||
}
|
||||
_resolvedConfigDir = DEFAULT_CONFIG_DIR;
|
||||
return DEFAULT_CONFIG_DIR;
|
||||
}
|
||||
|
||||
export const PATHS = {
|
||||
CONFIG_DIR: process.env.CLAUDEMESH_CONFIG_DIR || join(home, ".claudemesh"),
|
||||
get CONFIG_DIR() {
|
||||
return resolveConfigDir();
|
||||
},
|
||||
get CONFIG_FILE() {
|
||||
return join(this.CONFIG_DIR, "config.json");
|
||||
},
|
||||
@@ -20,3 +92,12 @@ export const PATHS = {
|
||||
CLAUDE_JSON: join(home, ".claude.json"),
|
||||
CLAUDE_SETTINGS: join(home, ".claude", "settings.json"),
|
||||
} as const;
|
||||
|
||||
/**
|
||||
* Test-only: reset the memoized resolution. Not exported from the
|
||||
* package barrel; reach in via the relative path from a test file.
|
||||
*/
|
||||
export function _resetPathsForTest(): void {
|
||||
_resolvedConfigDir = null;
|
||||
_warnedStaleEnv = false;
|
||||
}
|
||||
|
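Reviewer note: the memoized resolution in `resolveConfigDir` is easier to test when the environment and the filesystem probe are injected. The sketch below is one way to do that, not the code in this diff; `makeConfigDirResolver`, `ExistsFn` and the `warn` callback are hypothetical, and the TTY check from the hunk is left to the caller.

```typescript
// Hypothetical factory: same three rules as resolveConfigDir, with the
// env object and the existence probe passed in instead of read globally.
type ExistsFn = (path: string) => boolean;

function makeConfigDirResolver(
  env: Record<string, string | undefined>,
  exists: ExistsFn,
  defaultDir: string,
  warn: (msg: string) => void = () => {},
): () => string {
  let resolved: string | null = null;
  return () => {
    // Memoized: later env mutations are ignored on purpose.
    if (resolved !== null) return resolved;
    const envDir = env.CLAUDEMESH_CONFIG_DIR;
    let dir: string;
    if (!envDir) {
      dir = defaultDir; // rule 1: no env var
    } else if (exists(envDir)) {
      dir = envDir; // rule 2: directory is present
    } else {
      warn(`ignoring stale CLAUDEMESH_CONFIG_DIR=${envDir}; using ${defaultDir}`);
      dir = defaultDir; // rule 3: stale env
    }
    resolved = dir;
    return dir;
  };
}
```

A test can then pass `() => false` as the probe to simulate a removed tmpdir, with no need for a reset hook such as `_resetPathsForTest`.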
||||
@@ -15,6 +15,25 @@ export interface InboxRow {
|
||||
meta: string | null;
|
||||
received_at: number;
|
||||
reply_to_id: string | null;
|
||||
/** 1.34.8: Unix ms of when this row was first surfaced to the user
|
||||
* (returned by an interactive `inbox` listing or pushed via channel
|
||||
* reminder). NULL = never seen. Welcome filters on `seen_at IS NULL`
|
||||
* so freshly-launched sessions only see what they actually missed. */
|
||||
seen_at: number | null;
|
||||
/** 1.34.11: pubkey of the WS that received this push. Either the
|
||||
* daemon's member pubkey for member-keyed broadcasts, or one of
|
||||
* our session pubkeys for session-targeted DMs. Without this, two
|
||||
* sessions on the same daemon shared one inbox table and each saw
|
||||
* every other session's messages — same bug shape the 1.34.10 SSE
|
||||
* demux fixed for the live event path, just at the storage layer.
|
||||
* Pre-1.34.11 rows have NULL here and are visible to every session
|
||||
* on the same mesh (best-effort back-compat for already-stored
|
||||
* history). */
|
||||
recipient_pubkey: string | null;
|
||||
/** 1.34.11: matches `recipient_kind` on the bus event. "session" =
|
||||
* scoped to one session pubkey; "member" = visible to every
|
||||
* session of that member on the mesh. NULL on legacy rows. */
|
||||
recipient_kind: string | null;
|
||||
}
|
||||
|
||||
export function migrateInbox(db: SqliteDb): void {
|
||||
@@ -36,6 +55,24 @@ export function migrateInbox(db: SqliteDb): void {
|
||||
CREATE INDEX IF NOT EXISTS inbox_topic ON inbox(topic);
|
||||
CREATE INDEX IF NOT EXISTS inbox_sender ON inbox(sender_pubkey);
|
||||
`);
|
||||
// 1.34.8: read-state tracking. Pre-1.34.8 rows land with seen_at=NULL
|
||||
// (treated as unread); welcome surfaces them once and the listing
|
||||
// marks them seen. Indexed because welcome queries WHERE seen_at IS
|
||||
// NULL on every launch.
|
||||
const cols = db.prepare(`PRAGMA table_info(inbox)`).all<{ name: string }>();
|
||||
if (!cols.some((c) => c.name === "seen_at")) {
|
||||
db.exec(`ALTER TABLE inbox ADD COLUMN seen_at INTEGER`);
|
||||
db.exec(`CREATE INDEX IF NOT EXISTS inbox_seen_at ON inbox(seen_at)`);
|
||||
}
|
||||
// 1.34.11: per-recipient scoping. Two sessions on the same daemon
|
||||
// share one inbox table; without this column, listInbox returns
|
||||
// every row regardless of which session is asking. Indexed
|
||||
// because every interactive listing + welcome path filters by it.
|
||||
if (!cols.some((c) => c.name === "recipient_pubkey")) {
|
||||
db.exec(`ALTER TABLE inbox ADD COLUMN recipient_pubkey TEXT`);
|
||||
db.exec(`ALTER TABLE inbox ADD COLUMN recipient_kind TEXT`);
|
||||
db.exec(`CREATE INDEX IF NOT EXISTS inbox_recipient ON inbox(recipient_pubkey)`);
|
||||
}
|
||||
}
|
||||
|
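Reviewer note: `migrateInbox` adds columns by checking `PRAGMA table_info` first, which keeps the migration idempotent. The planning step can be sketched as a pure function; `ColumnSpec` and `pendingMigrations` are hypothetical names, and unlike the hunk (which adds `recipient_kind` only together with `recipient_pubkey`) this sketch checks each column on its own.

```typescript
// Hypothetical planner: given the columns that already exist, return the
// statements still needed. Running it twice yields an empty second plan.
interface ColumnSpec {
  name: string;
  type: string;
  index?: string; // optional index name to create alongside the column
}

function pendingMigrations(
  table: string,
  existing: readonly string[],
  wanted: readonly ColumnSpec[],
): string[] {
  const have = new Set(existing);
  const stmts: string[] = [];
  for (const col of wanted) {
    if (have.has(col.name)) continue;
    stmts.push(`ALTER TABLE ${table} ADD COLUMN ${col.name} ${col.type}`);
    if (col.index) {
      stmts.push(`CREATE INDEX IF NOT EXISTS ${col.index} ON ${table}(${col.name})`);
    }
  }
  return stmts;
}
```

The caller would feed `existing` from the `PRAGMA table_info(inbox)` query shown in the hunk and pass each returned statement to `db.exec`.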
||||
/**
|
||||
@@ -45,7 +82,14 @@ export function migrateInbox(db: SqliteDb): void {
|
||||
* Returns the new row id when this was a fresh insert, or null when the
|
||||
* message id was already known (idempotent receive).
|
||||
*/
|
||||
export function insertIfNew(db: SqliteDb, row: Omit<InboxRow, "id"> & { id: string }): string | null {
|
||||
export function insertIfNew(
|
||||
db: SqliteDb,
|
||||
// 1.34.8: callers don't pass `seen_at` — it's always NULL on insert
|
||||
// (a freshly-received row is by definition unread). Stripping the
|
||||
// field from the input type keeps inbound.ts callers from having to
|
||||
// construct it.
|
||||
row: Omit<InboxRow, "id" | "seen_at"> & { id: string },
|
||||
): string | null {
|
||||
// node:sqlite does support RETURNING. bun:sqlite does too. We branch on
|
||||
// the row count instead so it works on both.
|
||||
const before = db.prepare(`SELECT id FROM inbox WHERE client_message_id = ?`).get<{ id: string }>(row.client_message_id);
|
||||
@@ -53,12 +97,14 @@ export function insertIfNew(db: SqliteDb, row: Omit<InboxRow, "id"> & { id: stri
|
||||
db.prepare(`
|
||||
INSERT INTO inbox (
|
||||
id, client_message_id, broker_message_id, mesh, topic,
|
||||
sender_pubkey, sender_name, body, meta, received_at, reply_to_id
|
||||
) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
|
||||
sender_pubkey, sender_name, body, meta, received_at, reply_to_id,
|
||||
recipient_pubkey, recipient_kind
|
||||
) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
|
||||
ON CONFLICT(client_message_id) DO NOTHING
|
||||
`).run(
|
||||
row.id, row.client_message_id, row.broker_message_id, row.mesh, row.topic,
|
||||
row.sender_pubkey, row.sender_name, row.body, row.meta, row.received_at, row.reply_to_id,
|
||||
row.recipient_pubkey, row.recipient_kind,
|
||||
);
|
||||
// Confirm the insert landed (handles the conflict-noop race).
|
||||
const after = db.prepare(`SELECT id FROM inbox WHERE client_message_id = ?`).get<{ id: string }>(row.client_message_id);
|
||||
@@ -69,6 +115,21 @@ export interface ListInboxParams {
|
||||
since?: number; // received_at >= since
|
||||
topic?: string;
|
||||
fromPubkey?: string;
|
||||
/** 1.34.0: filter by mesh slug. Omit to return rows across all meshes. */
|
||||
mesh?: string;
|
||||
/** 1.34.8: only rows with `seen_at IS NULL`. Used by the welcome
|
||||
* push so a freshly-launched session surfaces what it actually
|
||||
* missed instead of every row from the last 24h. */
|
||||
unreadOnly?: boolean;
|
||||
/** 1.34.11: scope to rows whose recipient is this session pubkey,
|
||||
* PLUS member-keyed rows for the same member, PLUS legacy rows
|
||||
* with a NULL recipient (best-effort back-compat with pre-1.34.11
|
||||
* history). Set by the IPC `/v1/inbox` route from the bearer
|
||||
* session token; without it the listing returns everything.
|
||||
* `recipientMemberPubkey` widens the match to include broadcasts
|
||||
* / member DMs that should reach every session of this member. */
|
||||
recipientPubkey?: string;
|
||||
recipientMemberPubkey?: string;
|
||||
limit?: number;
|
||||
}
|
||||
|
||||
@@ -78,9 +139,28 @@ export function listInbox(db: SqliteDb, p: ListInboxParams): InboxRow[] {
|
||||
if (p.since !== undefined) { where.push("received_at >= ?"); args.push(p.since); }
|
||||
if (p.topic !== undefined) { where.push("topic = ?"); args.push(p.topic); }
|
||||
if (p.fromPubkey !== undefined){ where.push("sender_pubkey = ?"); args.push(p.fromPubkey); }
|
||||
if (p.mesh !== undefined) { where.push("mesh = ?"); args.push(p.mesh); }
|
||||
if (p.unreadOnly === true) { where.push("seen_at IS NULL"); }
|
||||
// 1.34.11: recipient scoping. A session sees:
|
||||
// - rows whose recipient_pubkey === its session pubkey (its DMs),
|
||||
// - rows whose recipient_pubkey === the daemon's member pubkey
|
||||
// (broadcasts / member-keyed DMs to anyone in this member's
|
||||
// identity — every sibling session sees them),
|
||||
// - legacy rows where recipient_pubkey IS NULL (pre-1.34.11
|
||||
// history; we can't tell who they were for, so surface to all).
|
||||
if (p.recipientPubkey) {
|
||||
const ors: string[] = ["recipient_pubkey IS NULL", "recipient_pubkey = ?"];
|
||||
args.push(p.recipientPubkey);
|
||||
if (p.recipientMemberPubkey) {
|
||||
ors.push("recipient_pubkey = ?");
|
||||
args.push(p.recipientMemberPubkey);
|
||||
}
|
||||
where.push(`(${ors.join(" OR ")})`);
|
||||
}
|
||||
const sql = `
|
||||
SELECT id, client_message_id, broker_message_id, mesh, topic,
|
||||
sender_pubkey, sender_name, body, meta, received_at, reply_to_id
|
||||
sender_pubkey, sender_name, body, meta, received_at, reply_to_id, seen_at,
|
||||
recipient_pubkey, recipient_kind
|
||||
FROM inbox
|
||||
${where.length ? "WHERE " + where.join(" AND ") : ""}
|
||||
ORDER BY received_at DESC
|
||||
@@ -89,3 +169,57 @@ export function listInbox(db: SqliteDb, p: ListInboxParams): InboxRow[] {
|
||||
args.push(Math.min(Math.max(p.limit ?? 100, 1), 1000));
|
||||
return db.prepare(sql).all<InboxRow>(...args);
|
||||
}
|
||||
|
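Reviewer note: the recipient scoping added to `listInbox` builds one parenthesized OR group. Extracted as a standalone function it looks roughly like this; `RecipientScope` and `recipientClause` are hypothetical names, and the SQL fragments are copied from the hunk.

```typescript
// Hypothetical extraction of the 1.34.11 recipient filter. Returns null
// when no session pubkey is given, i.e. the listing is unscoped.
interface RecipientScope {
  recipientPubkey?: string;
  recipientMemberPubkey?: string;
}

function recipientClause(p: RecipientScope): { sql: string; args: string[] } | null {
  if (!p.recipientPubkey) return null;
  // Legacy rows (NULL recipient) stay visible to every session.
  const ors = ["recipient_pubkey IS NULL", "recipient_pubkey = ?"];
  const args = [p.recipientPubkey];
  if (p.recipientMemberPubkey) {
    // Member-keyed rows reach every sibling session of the member.
    ors.push("recipient_pubkey = ?");
    args.push(p.recipientMemberPubkey);
  }
  return { sql: `(${ors.join(" OR ")})`, args };
}
```

Note that `recipientMemberPubkey` has no effect unless `recipientPubkey` is also set, which is the same behavior as the hunk.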
||||
/** 1.34.8: stamp `seen_at = now` on every row whose id is in `ids`,
|
||||
* but only when `seen_at IS NULL` so re-marking doesn't bump the
|
||||
* timestamp on a row the user already knew about. Returns the number
|
||||
* of rows that flipped from unread → seen. Used by:
|
||||
* - the IPC `/v1/inbox` route when called by an interactive
|
||||
* listing (the daemon stamps after returning rows so the human
|
||||
* who just looked at their inbox doesn't see the same rows
|
||||
* flagged "unread" on next launch);
|
||||
* - the MCP server when the SSE message event surfaces a live
|
||||
* `<channel>` reminder (Claude Code already saw the row inline,
|
||||
* no need to surface it again on welcome). */
|
||||
export function markInboxSeen(db: SqliteDb, ids: readonly string[], now = Date.now()): number {
|
||||
if (ids.length === 0) return 0;
|
||||
const placeholders = ids.map(() => "?").join(",");
|
||||
const r = db.prepare(
|
||||
`UPDATE inbox SET seen_at = ? WHERE seen_at IS NULL AND id IN (${placeholders})`,
|
||||
).run(now, ...ids);
|
||||
return Number(r.changes);
|
||||
}
|
||||
|
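Reviewer note: the `seen_at IS NULL` guard in `markInboxSeen` means a second call never moves an existing timestamp. An in-memory model of that semantics, for illustration only (`SeenRow` and `markSeenInMemory` are hypothetical, and no SQLite is involved):

```typescript
// Hypothetical in-memory model of markInboxSeen: stamps only rows that
// are still unread and returns how many rows flipped.
interface SeenRow {
  id: string;
  seen_at: number | null;
}

function markSeenInMemory(rows: SeenRow[], ids: readonly string[], now: number): number {
  const wanted = new Set(ids);
  let changed = 0;
  for (const row of rows) {
    if (wanted.has(row.id) && row.seen_at === null) {
      row.seen_at = now;
      changed++;
    }
  }
  return changed;
}
```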
||||
/** 1.34.8: TTL prune. Removes inbox rows older than `cutoffMs`
|
||||
* (received_at < cutoffMs). Daemon schedules this hourly with a 30-day
|
||||
* default retention (see startInboxPruner). Returns the number of
|
||||
* rows removed so the caller can log the volume. */
|
||||
export function pruneInboxBefore(db: SqliteDb, cutoffMs: number): number {
|
||||
const r = db.prepare(`DELETE FROM inbox WHERE received_at < ?`).run(cutoffMs);
|
||||
return Number(r.changes);
|
||||
}
|
||||
|
||||
/** 1.34.7: delete a single inbox row by id. Returns true iff a row was
|
||||
* removed. The CLI exposes this as `claudemesh inbox delete <id>`. */
|
||||
export function deleteInboxRow(db: SqliteDb, id: string): boolean {
|
||||
const r = db.prepare(`DELETE FROM inbox WHERE id = ?`).run(id);
|
||||
return Number(r.changes) > 0;
|
||||
}
|
||||
|
||||
/** 1.34.7: bulk delete with mesh / age filters. Returns the number of
|
||||
* rows removed. With no filter, deletes ALL rows on ALL meshes —
|
||||
* caller is expected to gate this behind a `--all` confirmation. */
|
||||
export interface FlushInboxParams {
|
||||
mesh?: string;
|
||||
/** Unix ms — delete rows received_at < before. */
|
||||
before?: number;
|
||||
}
|
||||
export function flushInbox(db: SqliteDb, p: FlushInboxParams): number {
|
||||
const where: string[] = [];
|
||||
const args: unknown[] = [];
|
||||
if (p.mesh !== undefined) { where.push("mesh = ?"); args.push(p.mesh); }
|
||||
if (p.before !== undefined) { where.push("received_at < ?"); args.push(p.before); }
|
||||
const sql = `DELETE FROM inbox ${where.length ? "WHERE " + where.join(" AND ") : ""}`;
|
||||
const r = db.prepare(sql).run(...args);
|
||||
return Number(r.changes);
|
||||
}
|
||||
|
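Reviewer note: `flushInbox` deletes every row when called with no filter and relies on the caller to confirm. One possible way to make that gate explicit is to put it in the statement builder; this is an assumption about a design option, not what the diff does, and `buildFlush` and its `allowAll` parameter are hypothetical.

```typescript
// Hypothetical builder: same WHERE logic as flushInbox, but an unfiltered
// delete must be requested explicitly via allowAll.
interface FlushParams {
  mesh?: string;
  before?: number; // Unix ms; delete rows with received_at < before
}

function buildFlush(
  p: FlushParams,
  allowAll = false,
): { sql: string; args: Array<string | number> } {
  const where: string[] = [];
  const args: Array<string | number> = [];
  if (p.mesh !== undefined) { where.push("mesh = ?"); args.push(p.mesh); }
  if (p.before !== undefined) { where.push("received_at < ?"); args.push(p.before); }
  if (where.length === 0 && !allowAll) {
    throw new Error("refusing unfiltered flush without explicit confirmation");
  }
  const sql = where.length
    ? `DELETE FROM inbox WHERE ${where.join(" AND ")}`
    : "DELETE FROM inbox";
  return { sql, args };
}
```

A CLI `--all` flag would map to `allowAll = true`; any other unfiltered call fails loudly instead of wiping the table.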
||||
@@ -26,6 +26,15 @@ export interface OutboxRow {
|
||||
nonce: string | null;
|
||||
ciphertext: string | null;
|
||||
priority: string | null;
|
||||
/**
|
||||
* 1.34.0: hex pubkey of the launched session that originated this row.
|
||||
* NULL when the send came from outside a registered session
|
||||
* (cold-path CLI, system-issued sends, etc.) — drain falls through to
|
||||
* the daemon-WS in that case. When set, drain prefers the matching
|
||||
* SessionBrokerClient so the broker fan-out attributes the push to
|
||||
* the session pubkey instead of the daemon's stable member pubkey.
|
||||
*/
|
||||
sender_session_pubkey: string | null;
|
||||
}
|
||||
|
||||
export function migrateOutbox(db: SqliteDb): void {
|
||||
@@ -68,6 +77,14 @@ export function migrateOutbox(db: SqliteDb): void {
|
||||
if (!hasNonce) db.exec(`ALTER TABLE outbox ADD COLUMN nonce TEXT`);
|
||||
if (!hasCiphertext) db.exec(`ALTER TABLE outbox ADD COLUMN ciphertext TEXT`);
|
||||
if (!hasPriority) db.exec(`ALTER TABLE outbox ADD COLUMN priority TEXT`);
|
||||
|
||||
// 1.34.0: per-row sender session pubkey, used by the drain worker to
|
||||
// route via the originating session's WS so broker fan-out attributes
|
||||
// the push to the session pubkey, not the daemon's member pubkey.
|
||||
// Pre-1.34.0 rows land with NULL — drain falls back to the daemon-WS
|
||||
// path (legacy attribution).
|
||||
const hasSenderSessionPk = columnExists(db, "outbox", "sender_session_pubkey");
|
||||
if (!hasSenderSessionPk) db.exec(`ALTER TABLE outbox ADD COLUMN sender_session_pubkey TEXT`);
|
||||
}
|
||||
|
||||
function columnExists(db: SqliteDb, table: string, column: string): boolean {
|
||||
@@ -80,7 +97,8 @@ export function findByClientId(db: SqliteDb, clientMessageId: string): OutboxRow
|
||||
SELECT id, client_message_id, request_fingerprint, payload, enqueued_at,
|
||||
attempts, next_attempt_at, status, last_error, delivered_at,
|
||||
broker_message_id, aborted_at, aborted_by, superseded_by,
|
||||
mesh, target_spec, nonce, ciphertext, priority
|
||||
mesh, target_spec, nonce, ciphertext, priority,
|
||||
sender_session_pubkey
|
||||
FROM outbox WHERE client_message_id = ?
|
||||
`).get<OutboxRow>(clientMessageId);
|
||||
return row ?? null;
|
||||
@@ -98,6 +116,9 @@ export interface InsertPendingInput {
|
||||
nonce?: string;
|
||||
ciphertext?: string;
|
||||
priority?: string;
|
||||
/** 1.34.0: hex pubkey of the originating session (omit for cold-path
|
||||
* CLI sends — drain will use the daemon-WS). */
|
||||
sender_session_pubkey?: string;
|
||||
}
|
||||
|
||||
export function insertPending(db: SqliteDb, input: InsertPendingInput): void {
|
||||
@@ -105,8 +126,9 @@ export function insertPending(db: SqliteDb, input: InsertPendingInput): void {
|
||||
INSERT INTO outbox (
|
||||
id, client_message_id, request_fingerprint, payload,
|
||||
enqueued_at, attempts, next_attempt_at, status,
|
||||
mesh, target_spec, nonce, ciphertext, priority
|
||||
) VALUES (?, ?, ?, ?, ?, 0, ?, 'pending', ?, ?, ?, ?, ?)
|
||||
mesh, target_spec, nonce, ciphertext, priority,
|
||||
sender_session_pubkey
|
||||
) VALUES (?, ?, ?, ?, ?, 0, ?, 'pending', ?, ?, ?, ?, ?, ?)
|
||||
`).run(
|
||||
input.id,
|
||||
input.client_message_id,
|
||||
@@ -114,11 +136,12 @@ export function insertPending(db: SqliteDb, input: InsertPendingInput): void {
|
||||
input.payload,
|
||||
input.now,
|
||||
input.now,
|
||||
input.mesh ?? null,
|
||||
input.target_spec ?? null,
|
||||
input.nonce ?? null,
|
||||
input.ciphertext ?? null,
|
||||
input.priority ?? null,
|
||||
input.mesh ?? null,
|
||||
input.target_spec ?? null,
|
||||
input.nonce ?? null,
|
||||
input.ciphertext ?? null,
|
||||
input.priority ?? null,
|
||||
input.sender_session_pubkey ?? null,
|
||||
);
|
||||
}
|
||||
|
||||
@@ -149,7 +172,8 @@ export function listOutbox(db: SqliteDb, p: ListOutboxParams = {}): OutboxRow[]
|
||||
SELECT id, client_message_id, request_fingerprint, payload, enqueued_at,
|
||||
attempts, next_attempt_at, status, last_error, delivered_at,
|
||||
broker_message_id, aborted_at, aborted_by, superseded_by,
|
||||
mesh, target_spec, nonce, ciphertext, priority
|
||||
mesh, target_spec, nonce, ciphertext, priority,
|
||||
sender_session_pubkey
|
||||
FROM outbox
|
||||
${where.length ? "WHERE " + where.join(" AND ") : ""}
|
||||
ORDER BY enqueued_at DESC
|
||||
@@ -164,7 +188,8 @@ export function findById(db: SqliteDb, id: string): OutboxRow | null {
|
||||
SELECT id, client_message_id, request_fingerprint, payload, enqueued_at,
|
||||
attempts, next_attempt_at, status, last_error, delivered_at,
|
||||
broker_message_id, aborted_at, aborted_by, superseded_by,
|
||||
mesh, target_spec, nonce, ciphertext, priority
|
||||
mesh, target_spec, nonce, ciphertext, priority,
|
||||
sender_session_pubkey
|
||||
FROM outbox WHERE id = ?
|
||||
`).get<OutboxRow>(id) ?? null;
|
||||
}
|
||||
|
||||
@@ -13,6 +13,7 @@
|
||||
|
||||
import type { SqliteDb } from "./db/sqlite.js";
|
||||
import type { DaemonBrokerClient } from "./broker.js";
|
||||
import type { SessionBrokerClient } from "./session-broker.js";
|
||||
import type { OutboxStatus } from "./db/outbox.js";
|
||||
|
||||
const POLL_INTERVAL_MS = 500;
|
||||
@@ -32,6 +33,10 @@ interface PendingRow {
|
||||
ciphertext: string | null;
|
||||
priority: string | null;
|
||||
mesh: string | null;
|
||||
/** 1.34.0: hex pubkey of the originating session — drain prefers
|
||||
* routing via that session's WS so broker fan-out attributes the
|
||||
* push to the session pubkey. NULL on cold-path / pre-1.34.0 rows. */
|
||||
sender_session_pubkey: string | null;
|
||||
}
|
||||
|
||||
export interface DrainOptions {
|
||||
@@ -40,6 +45,20 @@ export interface DrainOptions {
|
||||
* broker keyed by its `mesh` column. Single-mesh daemons pass a
|
||||
* Map of size 1; multi-mesh daemons pass one entry per joined mesh. */
|
||||
brokers: Map<string, DaemonBrokerClient>;
|
||||
/**
|
||||
* 1.34.0: lookup for the per-session WS keyed by hex session pubkey.
|
||||
* When an outbox row has `sender_session_pubkey` set and this lookup
|
||||
* returns an open client, the drain routes via the session-WS so the
|
||||
* broker fan-out attributes the push to the session pubkey instead
|
||||
* of the daemon's stable member pubkey.
|
||||
*
|
||||
* Returning `undefined` (or an unopened client) signals "no session
|
||||
* WS available" — the drain backs off and retries; it does NOT fall
|
||||
* back to the daemon-WS, because the row was encrypted with the
|
||||
* session secret and would fail to decrypt on the recipient side
|
||||
* if attribution silently changed mid-flight.
|
||||
*/
|
||||
getSessionBrokerByPubkey?: (sessionPubkey: string) => SessionBrokerClient | undefined;
|
||||
log?: (level: "info" | "warn" | "error", msg: string, meta?: Record<string, unknown>) => void;
|
||||
}
|
||||
|
||||
@@ -88,7 +107,8 @@ async function drainOnce(opts: DrainOptions, log: NonNullable<DrainOptions["log"
|
||||
const now = Date.now();
|
||||
const rows = opts.db.prepare(`
|
||||
SELECT id, client_message_id, request_fingerprint, payload, attempts,
|
||||
target_spec, nonce, ciphertext, priority, mesh
|
||||
target_spec, nonce, ciphertext, priority, mesh,
|
||||
sender_session_pubkey
|
||||
FROM outbox
|
||||
WHERE status = 'pending' AND next_attempt_at <= ?
|
||||
ORDER BY enqueued_at
|
||||
@@ -101,21 +121,34 @@ async function drainOnce(opts: DrainOptions, log: NonNullable<DrainOptions["log"
|
||||
if (markInflight(opts.db, row.id, now) === 0) continue; // raced with another drainer
|
||||
const fpHex = bufferToHex(row.request_fingerprint);
|
||||
|
||||
// v1.26.0: pick the broker keyed by the row's mesh. Legacy rows
|
||||
// (mesh=NULL) fall back to the only broker if there's exactly one;
|
||||
// otherwise mark dead because we don't know where to send them.
|
||||
let broker: DaemonBrokerClient | undefined;
|
||||
// v1.26.0: pick the daemon-WS broker keyed by the row's mesh.
|
||||
// Legacy rows (mesh=NULL) fall back to the only broker if there's
|
||||
// exactly one; otherwise mark dead because we don't know where to
|
||||
// send them.
|
||||
let daemonBroker: DaemonBrokerClient | undefined;
|
||||
if (row.mesh) {
|
||||
broker = opts.brokers.get(row.mesh);
|
||||
daemonBroker = opts.brokers.get(row.mesh);
|
||||
} else if (opts.brokers.size === 1) {
|
||||
broker = opts.brokers.values().next().value;
|
||||
daemonBroker = opts.brokers.values().next().value;
|
||||
}
|
||||
if (!broker) {
|
||||
if (!daemonBroker) {
|
||||
log("warn", "drain_no_broker_for_mesh", { id: row.id, mesh: row.mesh ?? "(null)" });
|
||||
markDead(opts.db, row.id, `no_broker_for_mesh:${row.mesh ?? "null"}`);
|
||||
continue;
|
||||
}
|
||||
|
||||
// 1.34.0: when the row was written by an authenticated session,
|
||||
// dispatch via the matching SessionBrokerClient so broker fan-out
|
||||
// attributes the push to the session pubkey. Encryption is
|
||||
// session-secret based on those rows, so we MUST NOT silently fall
|
||||
// back to the daemon-WS — the recipient's decrypt would fail. If
|
||||
// the session-WS is closed (reconnecting / session terminated), we
|
||||
// back off and retry.
|
||||
let sessionBroker: SessionBrokerClient | undefined;
|
||||
if (row.sender_session_pubkey && opts.getSessionBrokerByPubkey) {
|
||||
sessionBroker = opts.getSessionBrokerByPubkey(row.sender_session_pubkey);
|
||||
}
|
||||
|
||||
// Sprint 4: use the row's resolved target/ciphertext if present.
|
||||
// Legacy v0.9.0 rows (NULL on these columns) fall back to the
|
||||
// broadcast smoke-test shape so existing in-flight rows still drain.
|
||||
@@ -135,16 +168,31 @@ async function drainOnce(opts: DrainOptions, log: NonNullable<DrainOptions["log"
|
||||
priority = "next";
|
||||
}
|
||||
|
||||
const sendArgs = {
|
||||
targetSpec,
|
||||
priority,
|
||||
nonce,
|
||||
ciphertext,
|
||||
client_message_id: row.client_message_id,
|
||||
request_fingerprint_hex: fpHex,
|
||||
};
|
||||
|
||||
let res;
|
||||
try {
|
||||
res = await broker.send({
|
||||
targetSpec,
|
||||
priority,
|
||||
nonce,
|
||||
ciphertext,
|
||||
client_message_id: row.client_message_id,
|
||||
request_fingerprint_hex: fpHex,
|
||||
});
|
||||
if (row.sender_session_pubkey) {
|
||||
// Session-attributed row. Require an open session-WS — see comment
|
||||
// above on why we don't fall back to the daemon-WS.
|
||||
if (!sessionBroker || !sessionBroker.isOpen()) {
|
||||
log("info", "drain_session_ws_not_ready", {
|
||||
id: row.id, session_pubkey: row.sender_session_pubkey.slice(0, 12),
|
||||
});
|
||||
backoffPending(opts.db, row.id, row.attempts + 1, "session_ws_not_open", "session_ws_not_open");
|
||||
continue;
|
||||
}
|
||||
res = await sessionBroker.send(sendArgs);
|
||||
} else {
|
||||
res = await daemonBroker.send(sendArgs);
|
||||
}
|
||||
} catch (e) {
|
||||
log("warn", "drain_send_threw", { id: row.id, err: String(e) });
|
||||
backoffPending(opts.db, row.id, row.attempts + 1, "exception", String(e));
|
||||
|
||||
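The routing rule in the drain hunk above can be condensed into one small decision function. This is a sketch only: `pickRoute`, `Route`, and `SenderLike` are hypothetical names introduced for illustration. The actual drain worker calls `send` on the broker clients directly and calls `backoffPending` instead of returning a tag.

```typescript
// Hypothetical stand-in for a broker client; only the part the routing
// decision needs.
interface SenderLike {
  isOpen(): boolean;
}

type Route = "session" | "daemon" | "backoff";

// Session-attributed rows were encrypted with the session secret, so they
// must go out on the session-WS. If that socket is missing or closed the
// row is retried later; it never falls back to the daemon-WS.
function pickRoute(
  senderSessionPubkey: string | null,
  sessionBroker: SenderLike | undefined,
): Route {
  if (!senderSessionPubkey) return "daemon";
  if (!sessionBroker || !sessionBroker.isOpen()) return "backoff";
  return "session";
}

const openSocket: SenderLike = { isOpen: () => true };
const closedSocket: SenderLike = { isOpen: () => false };

console.log(pickRoute(null, undefined));      // legacy / member-keyed row
console.log(pickRoute("ab12", openSocket));   // session row, socket ready
console.log(pickRoute("ab12", closedSocket)); // session row, socket closed
console.log(pickRoute("ab12", undefined));    // session row, no client found
```

Returning a tag rather than sending inline keeps the "never fall back for session rows" rule testable without a live socket.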
@@ -41,8 +41,68 @@ export function writeSse(res: ServerResponse, e: DaemonEvent, idCounter: number)
|
||||
res.write(`data: ${JSON.stringify({ ts: e.ts, ...e.data })}\n\n`);
|
||||
}
|
||||
|
||||
/** Open an SSE stream on the response and route bus events to it. */
|
||||
export function bindSseStream(res: ServerResponse, bus: EventBus): () => void {
|
||||
/** 1.34.10: per-subscriber demux options. The MCP server passes its
 *  own session pubkey + member pubkey when binding so the bus only
 *  sends events meant for that session. Without this, every MCP on a
 *  multi-session daemon receives every inbox row and emits a
 *  duplicate channel notification; this manifests as session A seeing
 *  its own outbound DM to B because B's session-WS published the row
 *  to the shared bus. */
export interface SseFilterOptions {
  /** Session pubkey the subscribing MCP serves. Events tagged
   *  `recipient_kind: "session"` only flow when their
   *  `recipient_pubkey` matches this. */
  sessionPubkey?: string;
  /** Daemon's member pubkey for this mesh. Events tagged
   *  `recipient_kind: "member"` flow when their `recipient_pubkey`
   *  matches; those are member-keyed broadcasts / DMs that should
   *  reach every session of this member, but not OTHER members. */
  memberPubkey?: string;
  /** Mesh slug the subscriber is bound to (from session registry).
   *  When set, system events (peer_join etc.) are filtered to this
   *  mesh; without it every system event surfaces. */
  meshSlug?: string;
}

function shouldDeliver(e: DaemonEvent, f: SseFilterOptions): boolean {
  // No filter set → legacy behavior: deliver everything (used by
  // diagnostic tooling like `claudemesh daemon events`).
  if (!f.sessionPubkey && !f.memberPubkey && !f.meshSlug) return true;

  // Mesh scoping for events that carry a mesh slug. peer_join /
  // peer_leave / broker_status all carry `data.mesh`; if the
  // subscriber is bound to a specific mesh, drop events from other
  // meshes.
  if (f.meshSlug) {
    const eventMesh = typeof e.data.mesh === "string" ? e.data.mesh : null;
    if (eventMesh && eventMesh !== f.meshSlug) return false;
  }

  // System events (peer_join etc.) flow to every session on the same
  // mesh; they're informational, not addressed.
  if (e.kind !== "message") return true;

  const recipientKind = typeof e.data.recipient_kind === "string" ? e.data.recipient_kind : null;
  const recipientPubkey = typeof e.data.recipient_pubkey === "string" ? e.data.recipient_pubkey.toLowerCase() : null;

  // Legacy publish without recipient context → everyone gets it. Keeps
  // backward compatibility with older daemon code paths until they're
  // migrated. Also covers test paths that don't thread context.
  if (!recipientKind || !recipientPubkey) return true;

  if (recipientKind === "session") {
    return !!f.sessionPubkey && f.sessionPubkey.toLowerCase() === recipientPubkey;
  }
  if (recipientKind === "member") {
    return !!f.memberPubkey && f.memberPubkey.toLowerCase() === recipientPubkey;
  }
  return true;
}
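To see how the demux behaves for a subscriber, the filter can be exercised against a few sample events. The block below is a condensed restatement of `shouldDeliver` for illustration, not the file's code: `EventLike`, `FilterLike`, `deliver`, the pubkeys, and the mesh slugs are made up, and the event shape is assumed to expose only `kind` and `data`.

```typescript
// Assumed minimal shapes; the real DaemonEvent may carry more fields.
interface EventLike { kind: string; data: Record<string, unknown>; }
interface FilterLike { sessionPubkey?: string; memberPubkey?: string; meshSlug?: string; }

function deliver(e: EventLike, f: FilterLike): boolean {
  // Empty filter: unfiltered diagnostic stream.
  if (!f.sessionPubkey && !f.memberPubkey && !f.meshSlug) return true;
  const mesh = e.data.mesh;
  if (f.meshSlug && typeof mesh === "string" && mesh !== f.meshSlug) return false;
  // Non-message (system) events are informational, not addressed.
  if (e.kind !== "message") return true;
  const kind = e.data.recipient_kind;
  const rawPk = e.data.recipient_pubkey;
  const pk = typeof rawPk === "string" ? rawPk.toLowerCase() : null;
  // Legacy events without recipient context reach everyone.
  if (typeof kind !== "string" || !pk) return true;
  if (kind === "session") return !!f.sessionPubkey && f.sessionPubkey.toLowerCase() === pk;
  if (kind === "member") return !!f.memberPubkey && f.memberPubkey.toLowerCase() === pk;
  return true;
}

// Hypothetical subscriber: session A on mesh "mesh-one".
const sessionA: FilterLike = { sessionPubkey: "AA11", memberPubkey: "mm00", meshSlug: "mesh-one" };

const dmToA: EventLike = { kind: "message", data: { mesh: "mesh-one", recipient_kind: "session", recipient_pubkey: "aa11" } };
const dmToB: EventLike = { kind: "message", data: { mesh: "mesh-one", recipient_kind: "session", recipient_pubkey: "bb22" } };
const memberCast: EventLike = { kind: "message", data: { mesh: "mesh-one", recipient_kind: "member", recipient_pubkey: "mm00" } };
const otherMeshJoin: EventLike = { kind: "peer_join", data: { mesh: "mesh-two" } };

console.log(deliver(dmToA, sessionA));         // addressed to A (case-insensitive match)
console.log(deliver(dmToB, sessionA));         // addressed to a sibling session
console.log(deliver(memberCast, sessionA));    // member-keyed, reaches every session of the member
console.log(deliver(otherMeshJoin, sessionA)); // system event from another mesh
console.log(deliver(dmToB, {}));               // no filter: legacy unfiltered stream
```

The sibling-session case (`dmToB`) is the one the 1.34.10 change targets: before the filter, that event would have reached session A's subscriber as well.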
|
||||
|
||||
/** Open an SSE stream on the response and route bus events to it.
|
||||
* 1.34.10: optional `filter` scopes the stream to one session/member;
|
||||
* see SseFilterOptions. */
|
||||
export function bindSseStream(res: ServerResponse, bus: EventBus, filter: SseFilterOptions = {}): () => void {
|
||||
res.statusCode = 200;
|
||||
res.setHeader("Content-Type", "text/event-stream");
|
||||
res.setHeader("Cache-Control", "no-cache, no-transform");
|
||||
@@ -51,7 +111,10 @@ export function bindSseStream(res: ServerResponse, bus: EventBus): () => void {
|
||||
res.write(": connected\n\n");
|
||||
|
||||
let counter = 0;
|
||||
const unsubscribe = bus.subscribe((e) => writeSse(res, e, ++counter));
|
||||
const unsubscribe = bus.subscribe((e) => {
|
||||
if (!shouldDeliver(e, filter)) return;
|
||||
writeSse(res, e, ++counter);
|
||||
});
|
||||
|
||||
const heartbeat = setInterval(() => {
|
||||
try { res.write(": keepalive\n\n"); }
|
||||
|
||||
@@ -18,6 +18,20 @@ export interface InboundContext {
|
||||
/** Daemon's session secret key hex (rotates per connect). When the
|
||||
* sender encrypted to our session pubkey, decrypt with this instead. */
|
||||
sessionSecretKeyHex?: string;
|
||||
/** 1.34.10: recipient pubkey of the WS that received this push.
|
||||
* Either the daemon's member pubkey (member-WS) or one of our
|
||||
* session pubkeys (session-WS). Threaded through to the bus event
|
||||
* so each MCP subscriber can filter to events meant for its own
|
||||
* session — without it, every MCP on the same daemon renders every
|
||||
* inbox row, which manifests as session A seeing its own outbound
|
||||
* to B (because A's MCP also picks up the bus event B's WS just
|
||||
* published). */
|
||||
recipientPubkey?: string;
|
||||
/** 1.34.10: kind of WS this push arrived on. "session" pushes only
|
||||
* surface to the matching session's MCP; "member" pushes surface to
|
||||
* every session on the same mesh (member-keyed broadcasts, member
|
||||
* DMs that don't have a session). */
|
||||
recipientKind?: "session" | "member";
|
||||
/** v2 agentic-comms (M1): emit `client_ack` back to the broker after
|
||||
* the message lands in inbox.db. Broker uses the ack to set
|
||||
* `delivered_at` (atomic at-least-once). Without it, the broker's
|
||||
@@ -25,6 +39,16 @@ export interface InboundContext {
|
||||
* client owns this callback because it's the one that owns the
|
||||
* socket; inbound.ts just signals "I accepted this id." */
|
||||
ackClientMessage?: (clientMessageId: string, brokerMessageId: string | null) => void;
|
||||
/** 1.34.9: drops system events (peer_joined / peer_left /
|
||||
* peer_returned) whose eventData.pubkey is one of our own. The broker
|
||||
* fans peer_joined to every OTHER connection in the mesh — but our
|
||||
* daemon's member-WS counts as "other" relative to our session-WS,
|
||||
* so without this filter the user sees `[system] Peer "<self>"
|
||||
* joined the mesh` every time their own session reconnects.
|
||||
* Implementation passes a closure that walks the live broker map
|
||||
* rather than a static set, so newly-spawned sessions are visible
|
||||
* immediately. */
|
||||
isOwnPubkey?: (pubkey: string) => boolean;
|
||||
log?: (level: "info" | "warn" | "error", msg: string, meta?: Record<string, unknown>) => void;
|
||||
}
|
||||
|
||||
@@ -38,10 +62,21 @@ export interface InboundContext {
|
||||
export async function handleBrokerPush(msg: Record<string, unknown>, ctx: InboundContext): Promise<void> {
|
||||
// System/topology pushes (peer_join, tick, …) — emit verbatim.
|
||||
if (msg.subtype === "system" && typeof msg.event === "string") {
|
||||
const eventData = (msg.eventData as Record<string, unknown> | undefined) ?? {};
|
||||
// 1.34.9: drop self-joins. The broker excludes the JOINING
|
||||
// connection from the fan-out, but our daemon owns multiple
|
||||
// connections per mesh (member-WS + N session-WSs), and each is a
|
||||
// distinct "other" from the broker's view — so a session's own
|
||||
// peer_joined arrives at the same daemon's member-WS and used to
|
||||
// surface as `[system] Peer "<self>" joined`. The session-WS path
|
||||
// already skips system events entirely (see session-broker.ts
|
||||
// 1.34.9), and this filter handles the member-WS path.
|
||||
const eventPubkey = typeof eventData.pubkey === "string" ? eventData.pubkey : "";
|
||||
if (eventPubkey && ctx.isOwnPubkey?.(eventPubkey)) return;
|
||||
ctx.bus.publish(mapSystemEventKind(msg.event), {
|
||||
mesh: ctx.meshSlug,
|
||||
event: msg.event,
|
||||
...(msg.eventData as Record<string, unknown> | undefined ?? {}),
|
||||
...eventData,
|
||||
});
|
||||
return;
|
||||
}
|
||||
@@ -78,6 +113,12 @@ export async function handleBrokerPush(msg: Record<string, unknown>, ctx: Inboun
|
||||
meta: createdAt ? JSON.stringify({ created_at: createdAt }) : null,
|
||||
received_at: Date.now(),
|
||||
reply_to_id: replyToId,
|
||||
// 1.34.11: persist the recipient context so /v1/inbox can scope
|
||||
// queries to the asking session. Mirrors the same fields on the
|
||||
// bus event added in 1.34.10. Falls back to NULL when the caller
|
||||
// didn't pass them (legacy paths, tests).
|
||||
recipient_pubkey: ctx.recipientPubkey ?? null,
|
||||
recipient_kind: ctx.recipientKind ?? null,
|
||||
});
|
||||
|
||||
// Whether the row was newly inserted or already existed (dedupe), the
|
||||
@@ -102,6 +143,14 @@ export async function handleBrokerPush(msg: Record<string, unknown>, ctx: Inboun
|
||||
...(subtype ? { subtype } : {}),
|
||||
body,
|
||||
created_at: createdAt,
|
||||
// 1.34.10: per-recipient routing context. SSE subscribers (the
|
||||
// MCP servers that translate bus events into channel notifications)
|
||||
// use this to filter to events meant for their own session. Without
|
||||
// it, every MCP on the same daemon emits a channel push for every
|
||||
// inbox row, which means session A sees its own outbound to B
|
||||
// because B's session-WS published the inbox row to the shared bus.
|
||||
...(ctx.recipientPubkey ? { recipient_pubkey: ctx.recipientPubkey } : {}),
|
||||
...(ctx.recipientKind ? { recipient_kind: ctx.recipientKind } : {}),
|
||||
});
|
||||
}
|
||||
|
||||
|
||||
73
apps/cli/src/daemon/inbox-pruner.ts
Normal file
@@ -0,0 +1,73 @@
|
||||
// 1.34.8: TTL prune for inbox.db.
//
// The inbox grows monotonically: every received DM lands as a row and
// nothing removes it except an explicit `claudemesh inbox flush`. For
// chatty meshes that's tens of thousands of rows over a few weeks.
// SQLite handles that volume fine, but the rows are sitting there
// forever and `claudemesh inbox` queries get slower as the table grows.
//
// The pruner runs hourly inside the daemon process and deletes rows
// whose received_at is older than `retentionMs`. Default is 30 days,
// which is generous for the "I went on holiday and want to see what I
// missed" case but won't carry old rows into next year.
//
// Best-effort: a failure logs a warning and the pruner keeps trying on
// the next interval. There's no shared state to corrupt: pruneInboxBefore
// is a single DELETE statement.

import { pruneInboxBefore } from "./db/inbox.js";
import type { SqliteDb } from "./db/sqlite.js";

export interface InboxPrunerOptions {
  db: SqliteDb;
  /** Retention window in ms. Rows with received_at < (now - retentionMs)
   *  are deleted. Default: 30 days. */
  retentionMs?: number;
  /** How often to run the prune. Default: 1 hour. */
  intervalMs?: number;
  log?: (level: "info" | "warn" | "error", msg: string, meta?: Record<string, unknown>) => void;
}

export interface InboxPrunerHandle {
  stop: () => void;
}

const DEFAULT_RETENTION_MS = 30 * 24 * 60 * 60 * 1000;
const DEFAULT_INTERVAL_MS = 60 * 60 * 1000;

export function startInboxPruner(opts: InboxPrunerOptions): InboxPrunerHandle {
  const retentionMs = opts.retentionMs ?? DEFAULT_RETENTION_MS;
  const intervalMs = opts.intervalMs ?? DEFAULT_INTERVAL_MS;
  const log = opts.log ?? defaultLog;

  const tick = (): void => {
    try {
      const cutoff = Date.now() - retentionMs;
      const removed = pruneInboxBefore(opts.db, cutoff);
      if (removed > 0) {
        log("info", "inbox_prune_completed", {
          removed,
          retention_days: Math.round(retentionMs / (24 * 60 * 60 * 1000)),
        });
      }
    } catch (e) {
      log("warn", "inbox_prune_failed", { err: String(e) });
    }
  };

  // Run once at startup so a daemon that's been down for weeks reaps
  // immediately rather than waiting an hour.
  tick();

  const handle = setInterval(tick, intervalMs);
  // Don't let the pruner block daemon shutdown.
  if (typeof handle.unref === "function") handle.unref();

  return { stop: () => clearInterval(handle) };
}

function defaultLog(level: "info" | "warn" | "error", msg: string, meta?: Record<string, unknown>) {
  const line = JSON.stringify({ level, msg, ...meta, ts: new Date().toISOString() });
  if (level === "info") process.stdout.write(line + "\n");
  else process.stderr.write(line + "\n");
}
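The cutoff arithmetic used by the pruner can be checked in isolation. The sketch below swaps the SQLite table for an in-memory array, so `RowLike` and `pruneBefore` are hypothetical stand-ins and not the real `pruneInboxBefore`. It assumes the strict `received_at < cutoff` comparison stated in the `retentionMs` doc comment.

```typescript
// Hypothetical row shape; the real inbox row has many more columns.
interface RowLike { id: string; received_at: number; }

const DAY_MS = 24 * 60 * 60 * 1000;

// In-memory stand-in for the single DELETE: removes rows strictly older
// than the cutoff, mutates the array in place, returns the removed count.
function pruneBefore(rows: RowLike[], cutoff: number): number {
  const before = rows.length;
  const kept = rows.filter((r) => r.received_at >= cutoff);
  rows.length = 0;
  rows.push(...kept);
  return before - kept.length;
}

// Fixed "now" so the example is deterministic.
const now = Date.UTC(2026, 0, 31);
const rows: RowLike[] = [
  { id: "old", received_at: now - 45 * DAY_MS },
  { id: "edge", received_at: now - 30 * DAY_MS }, // exactly at the cutoff
  { id: "fresh", received_at: now - 1 * DAY_MS },
];

const removed = pruneBefore(rows, now - 30 * DAY_MS);
console.log(removed, rows.map((r) => r.id));
```

A row that sits exactly on the cutoff survives under the strict comparison; only rows older than the full retention window are removed.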
|
||||
@@ -39,6 +39,12 @@ export interface SendRequest {
|
||||
nonce?: string;
|
||||
/** Sprint 4: which mesh this send is for (single-mesh daemon today; multi-mesh later). */
|
||||
mesh?: string;
|
||||
/** 1.34.0: when the IPC request authenticated as a launched session,
|
||||
* the IPC layer fills this with the session's hex pubkey. The drain
|
||||
* worker uses it to route via the matching SessionBrokerClient so
|
||||
* broker fan-out attributes the push to the session pubkey instead
|
||||
* of the daemon's member pubkey. */
|
||||
sender_session_pubkey?: string;
|
||||
}
|
||||
|
||||
export type AcceptOutcome =
|
||||
@@ -93,6 +99,7 @@ export function acceptSend(req: SendRequest, deps: AcceptDeps): AcceptOutcome {
|
||||
nonce: req.nonce,
|
||||
ciphertext: req.ciphertext,
|
||||
priority: req.priority,
|
||||
sender_session_pubkey: req.sender_session_pubkey,
|
||||
});
|
||||
return { kind: "accepted_pending", status: 202, client_message_id: clientId };
|
||||
}
|
||||
|
||||
@@ -5,7 +5,7 @@ import { timingSafeEqual } from "node:crypto";
|
||||
import { DAEMON_PATHS, DAEMON_TCP_HOST, DAEMON_TCP_DEFAULT_PORT } from "../paths.js";
|
||||
import type { SqliteDb } from "../db/sqlite.js";
|
||||
import { acceptSend, type SendRequest } from "./handlers/send.js";
|
||||
import { listInbox } from "../db/inbox.js";
|
||||
import { listInbox, deleteInboxRow, flushInbox, markInboxSeen } from "../db/inbox.js";
|
||||
import { listOutbox, requeueDeadOrPending, type OutboxStatus } from "../db/outbox.js";
|
||||
import { randomUUID } from "node:crypto";
|
||||
import { bindSseStream, type EventBus } from "../events.js";
|
||||
@@ -319,7 +319,21 @@ function makeHandler(opts: {
|
||||
respond(res, 503, { error: "event bus not initialised" });
|
||||
return;
|
||||
}
|
||||
bindSseStream(res, opts.bus);
|
||||
// 1.34.10: per-session SSE demux. When the subscriber presented
|
||||
// a ClaudeMesh-Session token (the MCP server always does post-
|
||||
// 1.34.10), scope the stream to that session's pubkey + the
|
||||
// matching mesh's member pubkey. Diagnostic callers without a
|
||||
// session token (`claudemesh daemon events`) get the unfiltered
|
||||
// legacy stream. The bus itself stays single-shot; demux lives
|
||||
// entirely at the SSE bind layer (events.ts shouldDeliver).
|
||||
const filter: Record<string, string> = {};
|
||||
if (session?.presence?.sessionPubkey) filter.sessionPubkey = session.presence.sessionPubkey;
|
||||
if (session?.mesh) {
|
||||
filter.meshSlug = session.mesh;
|
||||
const meshCfg = opts.meshConfigs?.get(session.mesh);
|
||||
if (meshCfg?.pubkey) filter.memberPubkey = meshCfg.pubkey;
|
||||
}
|
||||
bindSseStream(res, opts.bus, filter);
|
||||
return;
|
||||
}
|
||||
|
||||
@@ -579,12 +593,46 @@ function makeHandler(opts: {
|
||||
const fromPubkey = url.searchParams.get("from") ?? undefined;
|
||||
const limitRaw = url.searchParams.get("limit");
|
||||
const limit = limitRaw ? Number.parseInt(limitRaw, 10) : undefined;
|
||||
// 1.34.0: mesh filter. Falls back to session-default if header set.
|
||||
const meshFilter = meshFromCtx(url.searchParams.get("mesh")) ?? undefined;
|
||||
// 1.34.8: read-state filter. ?unread_only=true narrows to rows
|
||||
// whose seen_at is NULL — used by the welcome push so a freshly
|
||||
// launched session surfaces only what it actually missed.
|
||||
const unreadOnly = url.searchParams.get("unread_only") === "true";
|
||||
// 1.34.8: ?mark_seen=false opts out of the auto-stamp behavior. By
|
||||
// default an interactive listing flips seen_at on the rows it just
|
||||
// returned (the user "saw" them), which is what we want for the
|
||||
// CLI but not for diagnostic tooling that wants to peek without
|
||||
// affecting state. The MCP server uses mark_seen=false on the
|
||||
// welcome path; it stamps explicitly via /v1/inbox/seen instead.
|
||||
const markSeen = url.searchParams.get("mark_seen") !== "false";
|
||||
// 1.34.11: scope by recipient when the caller is an authenticated
|
||||
// session. The daemon receives every inbox row for every session
|
||||
// it hosts, so a query without scoping returns the global table —
|
||||
// session A would see B's DMs (the bug 1.34.10 fixed for the
|
||||
// live event path; this is the storage half). Scope = session
|
||||
// pubkey (DMs) + member pubkey (broadcasts/member DMs the whole
|
||||
// member should see) + NULL (legacy rows we can't attribute).
|
||||
const recipientPubkey = session?.presence?.sessionPubkey;
|
||||
const meshCfgForRecipient = session?.mesh ? opts.meshConfigs?.get(session.mesh) : undefined;
|
||||
const recipientMemberPubkey = meshCfgForRecipient?.pubkey;
|
||||
const rows = listInbox(opts.inboxDb, {
|
||||
since: Number.isFinite(since) ? since : undefined,
|
||||
topic,
|
||||
fromPubkey,
|
||||
...(meshFilter ? { mesh: meshFilter } : {}),
|
||||
unreadOnly,
|
||||
...(recipientPubkey ? { recipientPubkey } : {}),
|
||||
...(recipientMemberPubkey ? { recipientMemberPubkey } : {}),
|
||||
limit: Number.isFinite(limit ?? NaN) ? limit : undefined,
|
||||
});
|
||||
let flippedCount = 0;
|
||||
if (markSeen) {
|
||||
const unreadIds = rows.filter((r) => r.seen_at == null).map((r) => r.id);
|
||||
if (unreadIds.length > 0) {
|
||||
flippedCount = markInboxSeen(opts.inboxDb, unreadIds);
|
||||
}
|
||||
}
|
||||
respond(res, 200, {
|
||||
items: rows.map((r) => ({
|
||||
id: r.id,
|
||||
@@ -597,11 +645,72 @@ function makeHandler(opts: {
|
||||
body: r.body,
|
||||
received_at: new Date(r.received_at).toISOString(),
|
||||
reply_to_id: r.reply_to_id,
|
||||
// 1.34.8: surface read-state. `null` = never seen (welcome
|
||||
// candidate). Note that if mark_seen=true (default), we just
|
||||
// stamped these rows — but the snapshot reflects the value
|
||||
// BEFORE the stamp so callers can still tell which rows were
|
||||
// unread when they asked.
|
||||
seen_at: r.seen_at ? new Date(r.seen_at).toISOString() : null,
|
||||
// 1.34.11: recipient context. Lets `--json` consumers tell
|
||||
// a session DM apart from a member-keyed broadcast, and
|
||||
// distinguishes pre-1.34.11 legacy rows (NULL) from
|
||||
// properly-scoped ones.
|
||||
recipient_pubkey: r.recipient_pubkey,
|
||||
recipient_kind: r.recipient_kind,
|
||||
})),
|
||||
// 1.34.8: how many rows just flipped from unread → seen. Useful
|
||||
// for telemetry and lets the CLI render "marked N as read".
|
||||
marked_seen: flippedCount,
|
||||
});
|
||||
return;
|
||||
}
|
||||
|
||||
// 1.34.8: explicit mark-seen endpoint. Used by the MCP server after
|
||||
// it surfaces a live `<channel>` reminder for an inbox row — Claude
|
||||
// Code already saw the row inline, so welcome shouldn't re-surface
|
||||
// it on the next launch. Body: { ids: string[] }. Returns the
|
||||
// number of rows that flipped from unread → seen.
|
||||
if (req.method === "POST" && url.pathname === "/v1/inbox/seen") {
|
||||
if (!opts.inboxDb) { respond(res, 503, { error: "inbox not initialised" }); return; }
|
||||
try {
|
||||
const body = await readJsonBody(req, 64 * 1024) as Record<string, unknown> | null;
|
||||
const ids = Array.isArray(body?.ids)
|
||||
? (body!.ids as unknown[]).filter((x): x is string => typeof x === "string")
|
||||
: [];
|
||||
if (ids.length === 0) { respond(res, 400, { error: "missing 'ids' (string[])" }); return; }
|
||||
const flipped = markInboxSeen(opts.inboxDb, ids);
|
||||
respond(res, 200, { marked_seen: flipped });
|
||||
} catch (e) {
|
||||
respond(res, 400, { error: String(e) });
|
||||
}
|
||||
return;
|
||||
}
|
||||
|
||||
// 1.34.7: inbox flush + per-row delete. The inbox is the daemon's
|
||||
// local persisted SQLite store — there's no broker-side state to
|
||||
// coordinate, so these are simple local writes.
|
||||
if (req.method === "DELETE" && url.pathname === "/v1/inbox") {
|
||||
if (!opts.inboxDb) { respond(res, 503, { error: "inbox not initialised" }); return; }
|
||||
const meshFilter = meshFromCtx(url.searchParams.get("mesh")) ?? undefined;
|
||||
const beforeRaw = url.searchParams.get("before");
|
||||
const before = beforeRaw ? Date.parse(beforeRaw) : undefined;
|
||||
const removed = flushInbox(opts.inboxDb, {
|
||||
...(meshFilter ? { mesh: meshFilter } : {}),
|
||||
...(Number.isFinite(before) ? { before } : {}),
|
||||
});
|
||||
respond(res, 200, { removed });
|
||||
return;
|
||||
}
|
||||
if (req.method === "DELETE" && url.pathname.startsWith("/v1/inbox/")) {
|
||||
if (!opts.inboxDb) { respond(res, 503, { error: "inbox not initialised" }); return; }
|
||||
const id = url.pathname.slice("/v1/inbox/".length);
|
||||
if (!id) { respond(res, 400, { error: "missing id" }); return; }
|
||||
const ok = deleteInboxRow(opts.inboxDb, id);
|
||||
if (!ok) { respond(res, 404, { error: "not found", id }); return; }
|
||||
respond(res, 200, { removed: 1, id });
|
||||
return;
|
||||
}
|
||||
|
||||
if (req.method === "GET" && url.pathname === "/v1/outbox") {
|
||||
if (!opts.outboxDb) { respond(res, 503, { error: "outbox not initialised" }); return; }
|
||||
const statusParam = url.searchParams.get("status") ?? undefined;
|
||||
@@ -701,12 +810,23 @@ function makeHandler(opts: {
|
||||
respond(res, 404, { error: "mesh_not_attached", mesh: chosenSlug });
|
||||
return;
|
||||
}
|
||||
// 1.34.0: authenticated session sends encrypt with the session
|
||||
// secret key + carry the session pubkey through to the outbox
|
||||
// row, so the drain worker can route via SessionBrokerClient
|
||||
// and the broker fan-out attributes the push to the session
|
||||
// pubkey instead of the daemon's member pubkey. Cold-path
|
||||
// sends (no session token) keep the legacy member-key flow.
|
||||
const senderSessionPubkey = session?.presence?.sessionPubkey;
|
||||
const senderSecretKey = session?.presence?.sessionSecretKey ?? meshCfg.secretKey;
|
||||
try {
|
||||
const routed = await resolveAndEncrypt(parsed.req, broker, meshCfg.secretKey, chosenSlug);
|
||||
const routed = await resolveAndEncrypt(parsed.req, broker, senderSecretKey, chosenSlug);
|
||||
parsed.req.target_spec = routed.target_spec;
|
||||
parsed.req.ciphertext = routed.ciphertext;
|
||||
parsed.req.nonce = routed.nonce;
|
||||
parsed.req.mesh = routed.mesh;
|
||||
if (senderSessionPubkey) {
|
||||
parsed.req.sender_session_pubkey = senderSessionPubkey;
|
||||
}
|
||||
} catch (e) {
|
||||
respond(res, 502, { error: "route_failed", detail: String(e) });
|
||||
return;
|
||||
|
||||
@@ -11,19 +11,17 @@ import { migrateInbox } from "./db/inbox.js";
|
||||
import { DaemonBrokerClient } from "./broker.js";
|
||||
import { SessionBrokerClient } from "./session-broker.js";
|
||||
import { startDrainWorker, type DrainHandle } from "./drain.js";
|
||||
import { startInboxPruner, type InboxPrunerHandle } from "./inbox-pruner.js";
|
||||
import { handleBrokerPush } from "./inbound.js";
|
||||
import { EventBus } from "./events.js";
|
||||
import { checkFingerprint, type ClonePolicy } from "./identity.js";
|
||||
import { readConfig } from "~/services/config/facade.js";
|
||||
import { VERSION } from "~/constants/urls.js";
|
||||
|
||||
export interface RunDaemonOptions {
|
||||
/** Disable TCP loopback (UDS-only). Defaults true in container envs. */
|
||||
tcpEnabled?: boolean;
|
||||
publicHealthCheck?: boolean;
|
||||
/** Mesh slug to attach to. Required when the user has joined multiple meshes. */
|
||||
mesh?: string;
|
||||
/** Daemon's display name on the mesh. */
|
||||
displayName?: string;
|
||||
/** Behavior on host_fingerprint mismatch. Defaults 'refuse'. */
|
||||
clonePolicy?: ClonePolicy;
|
||||
}
|
||||
@@ -95,30 +93,27 @@ export async function runDaemon(opts: RunDaemonOptions = {}): Promise<number> {
|
||||
|
||||
const bus = new EventBus();
|
||||
|
||||
// 1.26.0 — multi-mesh by default. With --mesh <slug>, the daemon
|
||||
// scopes to one mesh (legacy mode). Without it, attaches to every
|
||||
// joined mesh simultaneously so ambient mode (raw `claude`) works
|
||||
// for all meshes with one daemon process.
|
||||
// 1.34.10: the daemon is universal — attaches to every mesh listed
|
||||
// in config.json. Single-mesh isolation is handled by simply joining
|
||||
// only one mesh in that environment (containers, etc.). No --mesh
|
||||
// flag, no per-mesh service unit; one daemon, every mesh.
|
||||
const cfg = readConfig();
|
||||
let meshes: Array<typeof cfg.meshes[number]>;
|
||||
if (opts.mesh) {
|
||||
const found = cfg.meshes.find((m) => m.slug === opts.mesh);
|
||||
if (!found) {
|
||||
process.stderr.write(`mesh not found: ${opts.mesh}\n`);
|
||||
process.stderr.write(`joined meshes: ${cfg.meshes.map((m) => m.slug).join(", ") || "(none)"}\n`);
|
||||
releaseSingletonLock();
|
||||
try { outboxDb.close(); } catch { /* ignore */ }
|
||||
return 2;
|
||||
}
|
||||
meshes = [found];
|
||||
} else if (cfg.meshes.length === 0) {
|
||||
if (cfg.meshes.length === 0) {
|
||||
process.stderr.write(`no mesh joined; run \`claudemesh join <invite-url>\` first\n`);
|
||||
releaseSingletonLock();
|
||||
try { outboxDb.close(); } catch { /* ignore */ }
|
||||
return 2;
|
||||
} else {
|
||||
meshes = cfg.meshes;
|
||||
}
|
||||
const meshes = cfg.meshes;
|
||||
|
||||
// 1.34.9 — declared upfront so the daemon-WS onPush closure can
|
||||
// reach into the per-session map for the isOwnPubkey filter (drops
|
||||
// peer_joined / peer_left events for our own session pubkeys before
|
||||
// they surface as `[system] Peer "<self>" joined`). Populated below
|
||||
// by setRegistryHooks; empty until the first session registers, but
|
||||
// that's fine — the closure walks it lazily.
|
||||
const sessionBrokers = new Map<string, SessionBrokerClient>();
|
||||
const sessionBrokersByPubkey = new Map<string, SessionBrokerClient>();
|
||||
|
||||
// Spin up one broker per mesh. Connection failures are non-fatal:
|
||||
// the outbox keeps queuing per-mesh and reconnect logic in
|
||||
@@ -127,8 +122,11 @@ export async function runDaemon(opts: RunDaemonOptions = {}): Promise<number> {
|
||||
const meshConfigs = new Map<string, typeof cfg.meshes[number]>();
|
||||
for (const mesh of meshes) {
|
||||
meshConfigs.set(mesh.slug, mesh);
|
||||
// 1.34.10: no global displayName override anymore. Each mesh's
|
||||
// hello uses its own per-mesh display name from config.json (set
|
||||
// at `claudemesh join` time). Sessions advertise their own name
|
||||
// via `claudemesh launch --name`.
|
||||
const broker: DaemonBrokerClient = new DaemonBrokerClient(mesh, {
|
||||
displayName: opts.displayName,
|
||||
onStatusChange: (s) => {
|
||||
process.stdout.write(JSON.stringify({
|
||||
msg: "broker_status", status: s, mesh: mesh.slug, ts: new Date().toISOString(),
|
||||
@@ -141,6 +139,22 @@ export async function runDaemon(opts: RunDaemonOptions = {}): Promise<number> {
|
||||
// 1.32.1 and decrypt with the session secret there. Anything that
|
||||
// arrives here can only be member-keyed (broadcasts, member DMs,
|
||||
// system events) — pass member secret only.
|
||||
// 1.34.9: drop self-echoes — broker fan-out paths mirror an
|
||||
// outbound back to the SAME daemon's member-WS even when the
|
||||
// send originated on a session-WS (because both connections
|
||||
// belong to the same member from the broker's view). Filter on
|
||||
// senderMemberPubkey alone: anything attributed to OUR member is
|
||||
// either our own send echoing back or, theoretically, a peer
|
||||
// send from a different connection that happens to share our
|
||||
// pubkey — but two-different-clients-same-pubkey is impossible
|
||||
// by construction (member pubkeys are stable + unique per
|
||||
// identity). Sibling-session DMs don't fan to our member-WS;
|
||||
// they fan session-to-session. So this is safe.
|
||||
const senderMemberPk = String((m as Record<string, unknown>).senderMemberPubkey ?? "").toLowerCase();
|
||||
const ownMember = mesh.pubkey.toLowerCase();
|
||||
if (senderMemberPk && senderMemberPk === ownMember) {
|
||||
return;
|
||||
}
|
||||
void handleBrokerPush(m, {
|
||||
db: inboxDb,
|
||||
bus,
|
||||
@@ -149,6 +163,18 @@ export async function runDaemon(opts: RunDaemonOptions = {}): Promise<number> {
|
||||
// v2 agentic-comms (M1): client_ack closes the at-least-once
|
||||
// loop. Broker holds the row claimed (not delivered) until ack.
|
||||
ackClientMessage: (cmid, bmid) => broker.sendClientAck(cmid, bmid),
|
||||
// 1.34.9: drop self-join system events. Member pubkey + every
|
||||
// live session pubkey on this daemon all count as "us".
|
||||
isOwnPubkey: (pubkey) => {
|
||||
const lower = pubkey.toLowerCase();
|
||||
if (lower === ownMember) return true;
|
||||
return sessionBrokersByPubkey.has(lower);
|
||||
},
|
||||
// 1.34.10: tag the bus event with our member pubkey so the
|
||||
// SSE demux only fans this row to MCPs whose subscriber
|
||||
// matches (member-keyed broadcasts / DMs).
|
||||
recipientPubkey: mesh.pubkey,
|
||||
recipientKind: "member",
|
||||
});
|
||||
},
|
||||
});
|
||||
@@ -156,16 +182,33 @@ export async function runDaemon(opts: RunDaemonOptions = {}): Promise<number> {
    brokers.set(mesh.slug, broker);
  }

  // Start the drain worker. With multi-mesh, drain dispatches each
  // outbox row to its mesh's broker via the `mesh` column.
  let drain: DrainHandle | null = null;
  drain = startDrainWorker({ db: outboxDb, brokers });

  // 1.30.0 — per-session broker presence. Always on. Older CLIs that
  // don't include `presence` material in the register body just won't
  // get a session WS; the daemon's own member-keyed broker still
  // covers them.
  const sessionBrokers = new Map<string, SessionBrokerClient>();
  //
  // The two index maps (sessionBrokers by token, sessionBrokersByPubkey
  // by session pubkey) are declared earlier in this function so the
  // daemon-WS onPush closure can reference them for the isOwnPubkey
  // self-join filter.

  // Start the drain worker. With multi-mesh, drain dispatches each
  // outbox row to its mesh's broker via the `mesh` column.
  // 1.34.0: drain also accepts a session-pubkey lookup so rows
  // written by authenticated sessions route via the matching session-WS
  // (broker fan-out then attributes the push to the session pubkey).
  let drain: DrainHandle | null = null;
  drain = startDrainWorker({
    db: outboxDb,
    brokers,
    getSessionBrokerByPubkey: (pubkey) => sessionBrokersByPubkey.get(pubkey),
  });

  // 1.34.8 — TTL prune for inbox.db. Runs hourly with a 30-day default
  // retention. Without this the inbox grows unbounded; even on a moderate
  // mesh that's tens of thousands of rows over a few weeks. Prune is a
  // single DELETE; failures are non-fatal and the next interval retries.
  const inboxPruner: InboxPrunerHandle = startInboxPruner({ db: inboxDb });
  setRegistryHooks({
    onRegister: (info) => {
      if (!info.presence) return;
@@ -181,6 +224,10 @@ export async function runDaemon(opts: RunDaemonOptions = {}): Promise<number> {
      const prior = sessionBrokers.get(info.token);
      if (prior) {
        sessionBrokers.delete(info.token);
        // 1.34.0: keep both indices in sync.
        if (sessionBrokersByPubkey.get(prior.sessionPubkey) === prior) {
          sessionBrokersByPubkey.delete(prior.sessionPubkey);
        }
        prior.close().catch(() => { /* ignore */ });
      }
      // 1.32.1 — wire push delivery. Messages targeted at the launched
@@ -190,6 +237,10 @@ export async function runDaemon(opts: RunDaemonOptions = {}): Promise<number> {
      // session secret key; member key remains the fallback for legacy
      // member-targeted traffic that happens to fan out here.
      const sessionSecretKeyHex = info.presence.sessionSecretKey;
      // Capture the pubkey for the onPush closure below — TS can't
      // narrow `info.presence` inside the async arrow even though we
      // guard `if (!info.presence) return` earlier.
      const sessionPubkeyHex = info.presence.sessionPubkey;
      const client: SessionBrokerClient = new SessionBrokerClient({
        mesh: meshConfig,
        sessionPubkey: info.presence.sessionPubkey,
@@ -209,10 +260,18 @@ export async function runDaemon(opts: RunDaemonOptions = {}): Promise<number> {
            sessionSecretKeyHex,
            // v2 agentic-comms (M1): close the at-least-once loop.
            ackClientMessage: (cmid, bmid) => client.sendClientAck(cmid, bmid),
            // 1.34.10: tag the bus event with this session's pubkey so
            // the SSE demux only delivers to the MCP serving THIS
            // session — not its siblings on the same daemon. Without
            // this, A's MCP also rendered DMs intended for B because
            // the bus was a single shared stream.
            recipientPubkey: sessionPubkeyHex,
            recipientKind: "session",
          });
        },
      });
      sessionBrokers.set(info.token, client);
      sessionBrokersByPubkey.set(info.presence.sessionPubkey, client);
      client.connect().catch((err) =>
        process.stderr.write(JSON.stringify({
          level: "warn", msg: "session_broker_connect_failed",
@@ -224,6 +283,11 @@ export async function runDaemon(opts: RunDaemonOptions = {}): Promise<number> {
      const client = sessionBrokers.get(info.token);
      if (!client) return;
      sessionBrokers.delete(info.token);
      // 1.34.0: drop the pubkey index iff this client still owns it
      // (a re-register may have already swapped the entry).
      if (sessionBrokersByPubkey.get(client.sessionPubkey) === client) {
        sessionBrokersByPubkey.delete(client.sessionPubkey);
      }
      client.close().catch(() => { /* ignore */ });
    },
  });
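The register and deregister hooks above keep two maps in step and only drop a pubkey entry when the departing client still owns it. A minimal sketch of that invariant follows; `SessionIndex` and `IndexedClient` are hypothetical names, and the real daemon uses two plain `Map`s rather than a wrapper class.

```typescript
// Hypothetical wrapper that keeps a token index and a pubkey index in sync.
interface IndexedClient { sessionPubkey: string }

class SessionIndex<C extends IndexedClient> {
  private byToken = new Map<string, C>();
  private byPubkey = new Map<string, C>();

  /** Registers a client; returns the client previously held by this token, if any. */
  register(token: string, client: C): C | undefined {
    const prior = this.byToken.get(token);
    if (prior) this.dropPubkeyIfOwner(prior);
    this.byToken.set(token, client);
    this.byPubkey.set(client.sessionPubkey, client);
    return prior;
  }

  /** Removes the token entry; returns the removed client so the caller can close it. */
  deregister(token: string): C | undefined {
    const client = this.byToken.get(token);
    if (!client) return undefined;
    this.byToken.delete(token);
    this.dropPubkeyIfOwner(client);
    return client;
  }

  getByPubkey(pubkey: string): C | undefined {
    return this.byPubkey.get(pubkey);
  }

  // Identity check: a newer client may already own this pubkey slot.
  private dropPubkeyIfOwner(client: C): void {
    if (this.byPubkey.get(client.sessionPubkey) === client) {
      this.byPubkey.delete(client.sessionPubkey);
    }
  }
}

// Two clients share one pubkey under different tokens; removing the older
// one must not evict the newer one from the pubkey index.
const index = new SessionIndex<IndexedClient>();
const older: IndexedClient = { sessionPubkey: "p1" };
const newer: IndexedClient = { sessionPubkey: "p1" };
index.register("t1", older);
index.register("t2", newer);
index.deregister("t1");
```

Without the identity comparison, the `deregister("t1")` call would delete the entry that `newer` had already taken over, and the drain worker's pubkey lookup would start missing a live session.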
@@ -252,6 +316,10 @@ export async function runDaemon(opts: RunDaemonOptions = {}): Promise<number> {

  process.stdout.write(JSON.stringify({
    msg: "daemon_started",
    // 1.34.10: stamp the version so users can tell whether the
    // running daemon picked up a recent CLI ship. Read off the same
    // VERSION constant the IPC `/v1/version` endpoint serves.
    version: VERSION,
    pid: process.pid,
    sock: DAEMON_PATHS.SOCK_FILE,
    tcp: tcpEnabled ? `127.0.0.1:47823` : null,
@@ -264,6 +332,7 @@ export async function runDaemon(opts: RunDaemonOptions = {}): Promise<number> {
    if (shuttingDown) return;
    shuttingDown = true;
    process.stdout.write(JSON.stringify({ msg: "daemon_shutdown", signal: sig, ts: new Date().toISOString() }) + "\n");
    inboxPruner.stop();
    if (drain) await drain.close();
    for (const b of brokers.values()) {
      try { await b.close(); } catch { /* ignore */ }

@@ -98,10 +98,16 @@ function installDarwin(args: InstallArgs): InstallResult {
  // one that installed claudemesh-cli. Pinning process.execPath here means
  // the daemon always runs under the same Node that ran `claudemesh install`.
  const nodeBin = process.execPath;
  // 1.34.12: --foreground because launchd manages lifecycle + stdio.
  // Without it, the daemon would re-spawn itself detached (the new
  // default) and launchd would lose track of the actual long-lived
  // process — KeepAlive wouldn't work and stdout redirect would
  // capture only the parent's brief boot.
  const meshArgs = [
    `<string>${escapeXml(args.binaryPath)}</string>`,
    "<string>daemon</string>",
    "<string>up</string>",
    "<string>--foreground</string>",
    ...(args.meshSlug
      ? ["<string>--mesh</string>", `<string>${escapeXml(args.meshSlug)}</string>`]
      : []),
@@ -180,8 +186,11 @@ function installLinux(args: InstallArgs): InstallResult {
  // Same node-pinning rationale as macOS — systemd's User= environment is
  // similarly minimal; resolve node by absolute path.
  const nodeBin = process.execPath;
  // 1.34.12: --foreground because systemd-user owns process lifecycle
  // and stdio capture; we don't want the child to double-fork into a
  // detached grandchild systemd can't track.
  const execArgs = [
    "daemon", "up",
    "daemon", "up", "--foreground",
    ...(args.meshSlug ? ["--mesh", args.meshSlug] : []),
    ...(args.displayName ? ["--name", args.displayName] : []),
  ].map(shellQuote).join(" ");

@@ -11,14 +11,22 @@
 * Differences from `DaemonBrokerClient`:
 *   - Uses session_hello (1.30.0+ broker), with a parent-vouched
 *     attestation provided at construction time.
 *   - Does NOT drain the outbox — that stays the parent member-keyed
 *     DaemonBrokerClient's job. Keeps the responsibility split clean
 *     and avoids two clients fighting over the same outbox row.
 *   - Does NOT carry list_peers / state / memory RPCs. This client is
 *     presence-only PLUS inbound DM delivery for messages targeted at
 *     the session pubkey — pushes are forwarded via the `onPush`
 *     callback to the daemon's shared handleBrokerPush, decrypted with
 *     this session's secret key.
 *     presence + inbound DM delivery + (1.34.0) outbound send for
 *     messages that originate from this session. Routing those through
 *     here is what makes the broker fan-out attribute the push to the
 *     session pubkey instead of the daemon's stable member pubkey.
 *
 * Outbox routing (1.34.0): the drain worker now consults
 * `outbox.sender_session_pubkey`. If a row was written by an
 * authenticated session and the matching session-WS is `open`, the
 * drain dispatches via `SessionBrokerClient.send()` — this
 * connection's `conn.sessionPubkey` server-side is the session pubkey,
 * so the broker's existing fan-out attribution
 * (`senderPubkey: conn.sessionPubkey ?? conn.memberPubkey`) just works.
 * Pre-1.34.0 every drain went through DaemonBrokerClient (member-WS),
 * so every push showed up as "from <daemon-member-pubkey>" regardless
 * of which session typed `claudemesh send`.
 *
 * Old brokers reply with `unknown_message_type` on session_hello — we
 * surface that as a one-shot `error` event and the daemon decides
@@ -37,9 +45,27 @@ import { hostname as osHostname } from "node:os";
import type { JoinedMesh } from "~/services/config/facade.js";
import { signSessionHello } from "~/services/broker/session-hello-sig.js";
import { connectWsWithBackoff, type WsLifecycle, type WsStatus } from "./ws-lifecycle.js";
import type { BrokerSendArgs, BrokerSendResult } from "./broker.js";

export type SessionBrokerStatus = WsStatus;

/** Ack-tracking shape, mirrors DaemonBrokerClient.PendingAck. Kept
 *  internal — callers see only the resolved BrokerSendResult. */
interface PendingAck {
  resolve: (r: BrokerSendResult) => void;
  timer: NodeJS.Timeout;
}

const SEND_ACK_TIMEOUT_MS = 15_000;

/** Heuristic: which broker-reported send errors are permanent enough
 *  that the drain worker should give up rather than retry. Mirrors the
 *  daemon-WS classifier so behavior is identical regardless of which
 *  socket the row went out on. */
function classifyPermanent(error: string): boolean {
  return /unknown|invalid|forbidden|not_authorized|target_not_found/i.test(error);
}

export interface ParentAttestation {
  sessionPubkey: string;
  parentMemberPubkey: string;
@@ -86,6 +112,14 @@ export class SessionBrokerClient {
  /** Set when the broker rejects session_hello with `unknown_message_type` —
   *  older brokers without the 1.30.0 surface. We stop retrying. */
  private brokerUnsupported = false;
  /** 1.34.0: outbound send tracking. Keyed by client_message_id. The
   *  drain worker registers an entry on dispatch; the WS message
   *  handler resolves it on broker `ack`. Times out after 15s. */
  private pendingAcks = new Map<string, PendingAck>();
  /** 1.34.0: dispatchers queued while the WS is reconnecting — flushed
   *  in onStatusChange when status flips to `open`. Mirrors the
   *  daemon-WS `opens` array. */
  private opens: Array<() => void> = [];

  constructor(private opts: SessionBrokerOptions) {}

@@ -151,10 +185,50 @@ export class SessionBrokerClient {
          return;
        }

        // 1.34.0: outbox `send` ack arriving on the session-WS. Resolves
        // the Promise the drain worker is awaiting. Mirrors the
        // daemon-WS handler exactly.
        if (msg.type === "ack") {
          const id = String(msg.id ?? "");
          const ack = this.pendingAcks.get(id);
          if (ack) {
            this.pendingAcks.delete(id);
            clearTimeout(ack.timer);
            if (typeof msg.error === "string" && msg.error.length > 0) {
              ack.resolve({ ok: false, error: msg.error, permanent: classifyPermanent(msg.error) });
            } else {
              ack.resolve({ ok: true, messageId: String(msg.messageId ?? id) });
            }
          }
          return;
        }

        // 1.32.1 — DMs targeted at the launched session's pubkey arrive
        // here, NOT on the daemon's member-keyed WS. Forward to the
        // daemon-level push handler so they land in inbox.db.
        if (msg.type === "push" || msg.type === "inbound") {
          // 1.34.9: skip system events on the session-WS — the daemon-WS
          // already receives the same broker broadcast and publishes it
          // to the bus, so forwarding here just produces duplicate
          // `[system] Peer "X" joined the mesh` channel pushes (one per
          // connection: 1 member-WS + 1 session-WS = 2 messages, +
          // another set per sibling session). Caught in the 2026-05-04
          // peer-rejoin smoke.
          if ((msg as Record<string, unknown>).subtype === "system") return;
          // 1.34.8: drop self-echoes. Some broker fan-out paths mirror an
          // outbound DM back to the originating session-WS; without this
          // guard the sender's own message lands in inbox.db, publishes a
          // `message` bus event, and Claude Code surfaces it as
          // `← claudemesh: <self>: <text>` immediately after the user
          // typed `claudemesh send`. Caught in the 2026-05-04 two-session
          // smoke. Match on session pubkey only — sibling sessions of the
          // same member share `senderMemberPubkey`, so a member-level
          // filter would wrongly drop legit sibling DMs.
          const senderPubkey = String((msg as Record<string, unknown>).senderPubkey ?? "").toLowerCase();
          if (senderPubkey && senderPubkey === this.opts.sessionPubkey.toLowerCase()) {
            this.log("info", "self_echo_dropped", { sender: senderPubkey.slice(0, 12) });
            return;
          }
          this.opts.onPush?.(msg);
          return;
        }
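The two guards in the push branch above (skip system events, drop self-echoes by session pubkey) reduce to one predicate. This is a sketch under the assumption that a push carries `type`, `subtype`, and `senderPubkey` fields as the hunk reads them; `shouldForwardPush` and `PushLike` are hypothetical names.

```typescript
// Hypothetical shape: only the fields the hunk above inspects.
interface PushLike {
  type: string;
  subtype?: string;
  senderPubkey?: string;
}

/** Decides whether a session-WS message should reach the daemon's push handler. */
function shouldForwardPush(msg: PushLike, ownSessionPubkey: string): boolean {
  if (msg.type !== "push" && msg.type !== "inbound") return false;
  // System events already arrive on the member-keyed socket.
  if (msg.subtype === "system") return false;
  // Compare against the session key only: siblings share the member key,
  // so a member-level comparison would also drop their DMs.
  const sender = (msg.senderPubkey ?? "").toLowerCase();
  if (sender && sender === ownSessionPubkey.toLowerCase()) return false;
  return true;
}

const ownSession = "ABCD01";
const fromSelf: PushLike = { type: "push", senderPubkey: "abcd01" };
const fromSibling: PushLike = { type: "push", senderPubkey: "abcd02" };
const systemJoin: PushLike = { type: "push", subtype: "system", senderPubkey: "ffff" };
```

Keeping the decision in a pure function like this would also make the sibling-session case easy to cover in a unit test, which is the case the 1.34.8 comment calls out.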
@@ -162,6 +236,21 @@ export class SessionBrokerClient {
      onStatusChange: (s) => {
        this._status = s;
        this.opts.onStatusChange?.(s);
        if (s === "open") {
          // 1.34.0: flush queued send dispatchers so any outbox row that
          // tried to dispatch while we were reconnecting goes out now.
          const queued = this.opens.slice();
          this.opens.length = 0;
          for (const fn of queued) {
            try { fn(); } catch (e) { this.log("warn", "session_open_handler_failed", { err: String(e) }); }
          }
        } else if (s === "closed" || s === "reconnecting") {
          // Fail any in-flight acks so the drain worker can retry/backoff
          // instead of hanging on a dead promise. The daemon-WS does the
          // same thing via onBeforeReconnect; we centralize it here
          // because session-broker uses status transitions directly.
          this.failPendingAcks(`session_ws_${s}`);
        }
      },
      log: (level, msg, meta) => this.log(level, `session_broker_${msg}`, meta),
    });
@@ -181,6 +270,72 @@ export class SessionBrokerClient {
    } catch { /* drop; lease re-delivers */ }
  }

  /** True when underlying socket is OPEN-ready for direct sends. */
  isOpen(): boolean {
    const sock = this.lifecycle?.ws;
    return !!sock && sock.readyState === sock.OPEN;
  }

  /**
   * 1.34.0 — Send one outbox row over the session-WS. Same wire format
   * as DaemonBrokerClient.send, but routed via this connection so the
   * broker's fan-out attributes the push to the session pubkey.
   *
   * Used by the drain worker for rows whose `sender_session_pubkey`
   * matches this client's session pubkey. When the WS is reconnecting
   * the dispatcher is queued via `opens` and flushed on the next
   * status flip.
   */
  send(req: BrokerSendArgs): Promise<BrokerSendResult> {
    return new Promise<BrokerSendResult>((resolve) => {
      const dispatch = () => {
        if (!this.isOpen() || !this.lifecycle) {
          resolve({ ok: false, error: "session_ws_not_open", permanent: false });
          return;
        }
        const id = req.client_message_id;
        const timer = setTimeout(() => {
          if (this.pendingAcks.delete(id)) {
            resolve({ ok: false, error: "ack_timeout", permanent: false });
          }
        }, SEND_ACK_TIMEOUT_MS);
        this.pendingAcks.set(id, { resolve, timer });
        try {
          this.lifecycle.send({
            type: "send",
            id,
            client_message_id: id,
            request_fingerprint: req.request_fingerprint_hex,
            targetSpec: req.targetSpec,
            priority: req.priority,
            nonce: req.nonce,
            ciphertext: req.ciphertext,
          });
        } catch (e) {
          this.pendingAcks.delete(id);
          clearTimeout(timer);
          resolve({ ok: false, error: `ws_write_failed: ${String(e)}`, permanent: false });
        }
      };

      if (this._status === "open") dispatch();
      else this.opens.push(dispatch);
    });
  }

  /** Resolve every in-flight ack with a synthetic failure. Called on
   *  WS close so the drain worker stops waiting and either retries or
   *  reroutes via the daemon-WS. */
  private failPendingAcks(reason: string): void {
    if (this.pendingAcks.size === 0) return;
    const entries = [...this.pendingAcks.entries()];
    this.pendingAcks.clear();
    for (const [, ack] of entries) {
      clearTimeout(ack.timer);
      ack.resolve({ ok: false, error: reason, permanent: false });
    }
  }

  async close(): Promise<void> {
    this.closed = true;
    if (this.lifecycle) {

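The ack bookkeeping in `send()` and `failPendingAcks()` follows a common pattern: one pending entry per message id, settled exactly once by an ack, a timeout, or a connection drop. The sketch below isolates that pattern; `AckTracker` and `SendResult` are hypothetical names, and the result shape is only assumed to resemble `BrokerSendResult`.

```typescript
// Assumed result shape, modelled on the fields used in the hunk above.
type SendResult =
  | { ok: true; messageId: string }
  | { ok: false; error: string; permanent: boolean };

interface Pending {
  resolve: (r: SendResult) => void;
  timer: ReturnType<typeof setTimeout>;
}

class AckTracker {
  private pending = new Map<string, Pending>();

  constructor(private readonly timeoutMs: number) {}

  /** Starts waiting for an ack; the promise never rejects, it resolves with a result. */
  track(id: string): Promise<SendResult> {
    return new Promise<SendResult>((resolve) => {
      const timer = setTimeout(() => {
        // `delete` returns false if an ack or failAll already settled this id.
        if (this.pending.delete(id)) {
          resolve({ ok: false, error: "ack_timeout", permanent: false });
        }
      }, this.timeoutMs);
      this.pending.set(id, { resolve, timer });
    });
  }

  /** Settles one entry; returns false when the id is unknown or already settled. */
  settle(id: string, result: SendResult): boolean {
    const entry = this.pending.get(id);
    if (!entry) return false;
    this.pending.delete(id);
    clearTimeout(entry.timer);
    entry.resolve(result);
    return true;
  }

  /** Fails every in-flight entry as retryable, e.g. when the socket closes. */
  failAll(reason: string): number {
    const entries = Array.from(this.pending.values());
    this.pending.clear();
    for (const entry of entries) {
      clearTimeout(entry.timer);
      entry.resolve({ ok: false, error: reason, permanent: false });
    }
    return entries.length;
  }

  get size(): number {
    return this.pending.size;
  }
}
```

Resolving with a result object instead of rejecting keeps the caller's retry logic in one code path, which appears to be the design the drain worker relies on.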
@@ -139,33 +139,58 @@ export function connectWsWithBackoff(opts: WsLifecycleOptions): Promise<WsLifecy
   * but ignores the rejection — by then the close handler has already
   * scheduled its own reconnect).
   */
  // Liveness watchdog: same cadence (30s) as the broker's outbound
  // ping. Two jobs per tick:
  //   1. If we haven't heard from the broker in >75s (2.5x the ping
  //      cadence — covers one missed ping plus some slack), terminate
  //      the socket. Fires the close handler → backoff reconnect runs
  //      its normal path. This is what catches NAT-dropped half-dead
  //      connections that the kernel won't RST for ~2 hours.
  //   2. Otherwise, send our own ping. The broker's `ws` library
  //      auto-replies with a pong, which bumps lastActivity. This
  //      keeps the broker's stale-pong watchdog seeing us as alive.
  //
  // Bare `ping` and `pong` events both bump lastActivity, as does
  // any inbound application message — any sign of life resets the
  // dead-man's-switch.
  const PING_INTERVAL_MS = 30_000;
  const STALE_THRESHOLD_MS = 75_000;
  let lastActivity = Date.now();
  let watchdogTimer: NodeJS.Timeout | null = null;

  const openOnce = (): Promise<void> => {
    if (closed) return Promise.reject(new Error("client_closed"));
    setStatus("connecting");

    log("info", "ws_open_attempt", { url: opts.url });
    const sock = new WebSocket(opts.url);
    ws = sock;
    lastActivity = Date.now();

    return new Promise<void>((resolve, reject) => {
      sock.on("open", () => {
        log("info", "ws_open_ok", { url: opts.url });
        // Build and send the hello inside a microtask so any sync
        // throws from buildHello() reject this connect attempt cleanly.
        (async () => {
          try {
            const hello = await opts.buildHello();
            sock.send(JSON.stringify(hello));
            log("info", "ws_hello_sent", { url: opts.url });
            helloTimer = setTimeout(() => {
              log("warn", "hello_ack_timeout", { url: opts.url });
              try { sock.close(); } catch { /* ignore */ }
              reject(new Error("hello_ack_timeout"));
            }, helloAckTimeoutMs);
          } catch (e) {
            log("warn", "ws_build_hello_threw", { err: String(e) });
            reject(e instanceof Error ? e : new Error(String(e)));
          }
        })();
      });

      sock.on("message", (raw) => {
        lastActivity = Date.now();
        let msg: Record<string, unknown>;
        try { msg = JSON.parse(raw.toString()) as Record<string, unknown>; }
        catch { return; }
@@ -174,18 +199,34 @@ export function connectWsWithBackoff(opts: WsLifecycleOptions): Promise<WsLifecy
          if (helloTimer) { clearTimeout(helloTimer); helloTimer = null; }
          setStatus("open");
          reconnectAttempt = 0;
          log("info", "ws_hello_acked", { url: opts.url });
          // Start liveness watchdog only after a successful handshake.
          if (watchdogTimer) clearInterval(watchdogTimer);
          watchdogTimer = setInterval(() => {
            if (sock.readyState !== sock.OPEN) return;
            const idle = Date.now() - lastActivity;
            if (idle > STALE_THRESHOLD_MS) {
              log("warn", "ws_stale_terminate", { url: opts.url, idle_ms: idle });
              try { sock.terminate(); } catch { /* socket already gone */ }
              return;
            }
            try { sock.ping(); } catch { /* ignore */ }
          }, PING_INTERVAL_MS);
          resolve();
          // Don't forward hello_ack to onMessage — both pre-refactor
          // clients consumed it inline and never delegated.
          return;
        }

        opts.onMessage(msg);
      });

      sock.on("ping", () => { lastActivity = Date.now(); });
      sock.on("pong", () => { lastActivity = Date.now(); });

      sock.on("close", (code, reason) => {
        if (helloTimer) { clearTimeout(helloTimer); helloTimer = null; }
        if (watchdogTimer) { clearInterval(watchdogTimer); watchdogTimer = null; }
        const reasonStr = reason.toString("utf8");
        log("warn", "ws_closed", { url: opts.url, code, reason: reasonStr, status });
        opts.onBeforeReconnect?.(code, reasonStr);

        if (closed) {
@@ -200,8 +241,6 @@ export function connectWsWithBackoff(opts: WsLifecycleOptions): Promise<WsLifecy
          () => openOnce().catch((err) => log("warn", "ws_reconnect_failed", { url: opts.url, err: String(err) })),
          wait,
        );
        // First attempt failure (still in connecting) also rejects the
        // initial connect promise so callers can surface it.
        if (status === "connecting" || status === "reconnecting") {
          reject(new Error(`closed_before_hello_${code}`));
        }
@@ -225,6 +264,7 @@ export function connectWsWithBackoff(opts: WsLifecycleOptions): Promise<WsLifecy
      closed = true;
      if (reconnectTimer) { clearTimeout(reconnectTimer); reconnectTimer = null; }
      if (helloTimer) { clearTimeout(helloTimer); helloTimer = null; }
      if (watchdogTimer) { clearInterval(watchdogTimer); watchdogTimer = null; }
      try { ws?.close(); } catch { /* ignore */ }
      setStatus("closed");
    },

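The watchdog's per-tick decision in the ws-lifecycle hunks can be separated from the timer and socket so the timing is easy to reason about. This is a minimal sketch using the two constants from the diff; `watchdogTick` and `firstTerminateTick` are hypothetical helpers, not functions in the codebase.

```typescript
const PING_INTERVAL_MS = 30_000;
const STALE_THRESHOLD_MS = 75_000;

type WatchdogAction = "terminate" | "ping";

/** One tick of the liveness check: terminate when idle strictly exceeds the threshold. */
function watchdogTick(
  now: number,
  lastActivity: number,
  staleMs: number = STALE_THRESHOLD_MS,
): WatchdogAction {
  return now - lastActivity > staleMs ? "terminate" : "ping";
}

/**
 * Given a peer that goes silent at `lastActivity`, and ticks aligned to that
 * instant, returns the time of the first tick that would terminate the socket.
 */
function firstTerminateTick(lastActivity: number): number {
  let t = lastActivity;
  do {
    t += PING_INTERVAL_MS;
  } while (watchdogTick(t, lastActivity) === "ping");
  return t;
}
```

Because the check only runs on 30s ticks, a fully silent peer is terminated at the first tick past 75s of idle time, which in this aligned model is the 90s tick rather than the 75s mark itself.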
@@ -97,7 +97,15 @@ Message (resource form)
      [--self]   (allow targeting your own member/session pubkey;
                  fans out to every sibling session of your member)
      [--json]   (machine-readable result)
  claudemesh message inbox          drain pending (alias: inbox)
  claudemesh message inbox          read persisted inbox (alias: inbox)
      flags: [--mesh <slug>] [--limit N] [--unread] [--json]
      reads ~/.claudemesh/daemon/inbox.db via daemon
      --unread → only rows never surfaced before (seen_at IS NULL);
                 listing stamps returned rows seen as a side effect
  claudemesh inbox flush            bulk-delete inbox rows
      flags: [--mesh <slug>] [--before <iso-timestamp>] [--all]
      --all required when neither --mesh nor --before is set
  claudemesh inbox delete <id>      delete one inbox row by id (alias: rm)
  claudemesh message status <id>    delivery status (alias: msg-status)

Memory (resource form)
@@ -190,16 +198,18 @@ Security
  claudemesh backup [file]          encrypt config → portable recovery file
  claudemesh restore <file>         restore config from a backup file

Daemon (long-lived peer mesh runtime, v0.9.0)
  claudemesh daemon up              start daemon (alias: start) [--mesh <slug>] [--no-tcp]
Daemon (long-lived peer mesh runtime — universal across every joined mesh)
  claudemesh daemon up              start daemon (alias: start) [--no-tcp]
  claudemesh daemon status          show running pid + IPC health [--json]
  claudemesh daemon down            stop daemon (alias: stop)
  claudemesh daemon version         ipc + schema version of running daemon
  claudemesh daemon outbox list     list local outbox rows [--failed|--pending|--inflight|--done]
  claudemesh daemon outbox requeue <id>  re-enqueue an aborted/dead row [--new-client-id <id>]
  claudemesh daemon accept-host     pin current host fingerprint
  claudemesh daemon install-service --mesh <slug>  write launchd / systemd-user unit
  claudemesh daemon uninstall-service  remove the unit
  claudemesh daemon install-service    write launchd / systemd-user unit
  claudemesh daemon uninstall-service  remove the unit
  Note: the daemon attaches to every mesh in ~/.claudemesh/config.json
  automatically; --mesh on up / install-service is deprecated and ignored.

Setup
  claudemesh install                register MCP server + hooks
@@ -394,7 +404,30 @@ async function main(): Promise<void> {
    // Messaging
    case "peers": { const { runPeers } = await import("~/commands/peers.js"); await runPeers({ mesh: flags.mesh as string, json: flags.json as boolean | string | undefined, all: !!flags.all }); break; }
    case "send": { const { runSend } = await import("~/commands/send.js"); await runSend({ mesh: flags.mesh as string, priority: flags.priority as string, json: !!flags.json, self: !!flags.self }, positionals[0] ?? "", positionals.slice(1).join(" ")); break; }
    case "inbox": { const { runInbox } = await import("~/commands/inbox.js"); await runInbox({ json: !!flags.json }); break; }
    case "inbox": {
      const sub = positionals[0];
      if (sub === "flush") {
        const { runInboxFlush } = await import("~/commands/inbox-actions.js");
        await runInboxFlush({
          mesh: flags.mesh as string | undefined,
          before: flags.before as string | undefined,
          all: !!flags.all,
          json: !!flags.json,
        });
      } else if (sub === "delete" || sub === "rm") {
        const { runInboxDelete } = await import("~/commands/inbox-actions.js");
        await runInboxDelete(positionals[1] ?? "", { json: !!flags.json });
      } else {
        const { runInbox } = await import("~/commands/inbox.js");
        await runInbox({
          mesh: flags.mesh as string | undefined,
          json: !!flags.json,
          limit: typeof flags.limit === "number" ? flags.limit : (typeof flags.limit === "string" ? Number.parseInt(flags.limit, 10) : undefined),
          unread: !!flags.unread,
        });
      }
      break;
    }
    case "state": {
      const sub = positionals[0];
      if (sub === "set") { const { runStateSet } = await import("~/commands/state.js"); await runStateSet({}, positionals[1] ?? "", positionals[2] ?? ""); }
@@ -466,6 +499,11 @@ async function main(): Promise<void> {
        publicHealth: !!flags["public-health"],
        mesh: flags.mesh as string | undefined,
        displayName: flags.name as string | undefined,
        // 1.34.12: --foreground opts out of the new "detach by default"
        // behavior. install-service and `claudemesh launch`'s auto-spawn
        // path always run with --foreground so their parents (launchd /
        // the launch helper) own lifecycle and stdio redirection.
        foreground: !!flags.foreground,
        outboxStatus,
        newClientId: flags["new-client-id"] as string | undefined,
      }, rest);
@@ -530,7 +568,29 @@ async function main(): Promise<void> {
    case "message": {
      const sub = positionals[0];
      if (sub === "send") { const { runSend } = await import("~/commands/send.js"); await runSend({ mesh: flags.mesh as string, priority: flags.priority as string, json: !!flags.json, self: !!flags.self }, positionals[1] ?? "", positionals.slice(2).join(" ")); }
      else if (sub === "inbox") { const { runInbox } = await import("~/commands/inbox.js"); await runInbox({ json: !!flags.json }); }
      else if (sub === "inbox") {
        const sub2 = positionals[1];
        if (sub2 === "flush") {
          const { runInboxFlush } = await import("~/commands/inbox-actions.js");
          await runInboxFlush({
            mesh: flags.mesh as string | undefined,
            before: flags.before as string | undefined,
            all: !!flags.all,
            json: !!flags.json,
          });
        } else if (sub2 === "delete" || sub2 === "rm") {
          const { runInboxDelete } = await import("~/commands/inbox-actions.js");
          await runInboxDelete(positionals[2] ?? "", { json: !!flags.json });
        } else {
          const { runInbox } = await import("~/commands/inbox.js");
          await runInbox({
            mesh: flags.mesh as string | undefined,
            json: !!flags.json,
            limit: typeof flags.limit === "number" ? flags.limit : (typeof flags.limit === "string" ? Number.parseInt(flags.limit, 10) : undefined),
            unread: !!flags.unread,
          });
        }
      }
      else if (sub === "status") { const { runMsgStatus } = await import("~/commands/broker-actions.js"); process.exit(await runMsgStatus(positionals[1], { mesh: flags.mesh as string, json: !!flags.json })); }
      else { console.error("Usage: claudemesh message <send|inbox|status>"); process.exit(EXIT.INVALID_ARGS); }
      break;

@@ -30,8 +30,9 @@ import {
  ListResourcesRequestSchema,
  ReadResourceRequestSchema,
} from "@modelcontextprotocol/sdk/types.js";
import { existsSync } from "node:fs";
import { existsSync, appendFileSync } from "node:fs";
import { request as httpRequest, type IncomingMessage } from "node:http";
import { join } from "node:path";

import { DAEMON_PATHS } from "~/daemon/paths.js";
import { VERSION } from "~/constants/urls.js";
@@ -69,10 +70,15 @@ function bailNoDaemon(): never {

interface DaemonGetResult { status: number; body: any }

function daemonGet(path: string): Promise<DaemonGetResult> {
function daemonGet(path: string, opts: { sessionToken?: string | null } = {}): Promise<DaemonGetResult> {
  return new Promise((resolve, reject) => {
    const headers: Record<string, string> = {};
    // 1.34.2+: when the launched process gave us a session token, forward
    // it on every IPC. Routes like `/v1/sessions/me` 401 without it, and
    // routes like `/v1/peers` use it for default-mesh scoping.
    if (opts.sessionToken) headers.Authorization = `ClaudeMesh-Session ${opts.sessionToken}`;
    const req = httpRequest(
      { socketPath: DAEMON_PATHS.SOCK_FILE, path, method: "GET", timeout: 5_000 },
      { socketPath: DAEMON_PATHS.SOCK_FILE, path, method: "GET", timeout: 5_000, headers },
      (res: IncomingMessage) => {
        const chunks: Buffer[] = [];
        res.on("data", (c) => chunks.push(c as Buffer));
@@ -90,21 +96,54 @@ function daemonGet(path: string): Promise<DaemonGetResult> {
  });
}

/** 1.34.8: best-effort POST /v1/inbox/seen so the MCP can stamp rows it
 *  just surfaced via a `<channel>` reminder. Failures are swallowed —
 *  read-state is a UX optimization, not a correctness gate. */
function daemonMarkSeen(ids: string[], sessionToken?: string | null): Promise<void> {
  return new Promise((resolve) => {
    if (ids.length === 0) { resolve(); return; }
    const body = JSON.stringify({ ids });
    const headers: Record<string, string> = {
      "Content-Type": "application/json",
      "Content-Length": String(Buffer.byteLength(body)),
    };
    if (sessionToken) headers.Authorization = `ClaudeMesh-Session ${sessionToken}`;
    const req = httpRequest(
      { socketPath: DAEMON_PATHS.SOCK_FILE, path: "/v1/inbox/seen", method: "POST", timeout: 3_000, headers },
      (res: IncomingMessage) => { res.on("data", () => { /* drain */ }); res.on("end", () => resolve()); },
    );
    req.on("error", () => resolve());
    req.on("timeout", () => { req.destroy(); resolve(); });
    req.write(body);
    req.end();
  });
}

// ── daemon SSE subscription ────────────────────────────────────────────

interface DaemonEvent { kind: string; ts: string; data: Record<string, any> }

function subscribeEvents(onEvent: (e: DaemonEvent) => void): { close: () => void } {
function subscribeEvents(onEvent: (e: DaemonEvent) => void, opts: { sessionToken?: string | null } = {}): { close: () => void } {
  let active = true;
  let req: ReturnType<typeof httpRequest> | null = null;

  const connect = (): void => {
    if (!active) return;
    // 1.34.13: forward the session token on the SSE subscription so the
    // daemon's `/v1/events` route can scope the stream to this session
    // via the SseFilterOptions demux added in 1.34.10. Without this
    // header, `session` resolves to null in the IPC handler, the filter
    // is empty, and every MCP receives every event — manifests as
    // session A rendering DMs that arrived on B's session-WS. The
    // launch helper sets CLAUDEMESH_IPC_TOKEN_FILE in the child env;
    // readSessionTokenFromEnv() picks it up at MCP boot time.
    const headers: Record<string, string> = { Accept: "text/event-stream" };
    if (opts.sessionToken) headers.Authorization = `ClaudeMesh-Session ${opts.sessionToken}`;
    req = httpRequest({
      socketPath: DAEMON_PATHS.SOCK_FILE,
      path: "/v1/events",
      method: "GET",
      headers: { Accept: "text/event-stream" },
      headers,
    });
    let buffer = "";
    req.on("response", (res: IncomingMessage) => {
@@ -166,7 +205,26 @@ export async function startMcpServer(): Promise<void> {
|
||||
|
||||
const server = new Server(
|
||||
{ name: "claudemesh", version: VERSION },
|
||||
{ capabilities: { tools: {}, prompts: {}, resources: {} } },
|
||||
{
|
||||
capabilities: {
|
||||
tools: {},
|
||||
prompts: {},
|
||||
resources: {},
|
||||
// 1.34.1 — declare the experimental `claude/channel` capability.
|
||||
// Claude Code v2.1.x gates `notifications/claude/channel` on this
|
||||
// exact key: its `xJ_(serverName, capabilities, pluginSource)` check
|
||||
// returns {action:"skip", kind:"capability"} when
|
||||
// `capabilities.experimental?.["claude/channel"]` is missing, and
|
||||
// the notification handler is never registered → every channel
|
||||
// emit lands on the floor, regardless of the
|
||||
// `--dangerously-load-development-channels server:claudemesh` flag.
|
||||
// This was the silent regression: pre-2.1.x clients didn't gate on
|
||||
// this key, so the same MCP wire shape "worked" until Claude Code
|
||||
// tightened the check. Verified by reading the binary at the
|
||||
// offsets near `notifications/claude/channel` in the strings dump.
|
||||
experimental: { "claude/channel": {} },
|
||||
},
|
||||
},
|
||||
);
|
||||
|
||||
// Tools: empty. The CLI is the API; the model invokes it via Bash.
|
||||
@@ -264,8 +322,33 @@ export async function startMcpServer(): Promise<void> {
|
||||
return { contents: [{ uri, mimeType: "text/markdown", text: fm.join("\n") + skill.instructions }] };
|
||||
});
|
||||
|
||||
// 1.34.1: every channel emit (and SSE event arrival) writes to a
|
||||
// per-pid log file under ~/.claudemesh/daemon/. Stderr from a Claude
|
||||
// Code-spawned MCP server isn't surfaced anywhere visible to the
|
||||
// user; without an on-disk trace we can't tell whether the SSE
|
||||
// delivered the event, whether the bus reached the MCP, or whether
|
||||
// server.notification rejected. The file name embeds the MCP pid, so
|
||||
// it stays fixed for the life of the process and users can `tail -f` it.
|
||||
const mcpLogPath = join(DAEMON_PATHS.DAEMON_DIR, `mcp-${process.pid}.log`);
|
||||
const mcpLog = (msg: string, meta?: Record<string, unknown>): void => {
|
||||
const line = JSON.stringify({ ts: new Date().toISOString(), pid: process.pid, msg, ...meta }) + "\n";
|
||||
try { appendFileSync(mcpLogPath, line); } catch { /* logging must never crash */ }
|
||||
};
|
||||
mcpLog("mcp_started", { version: VERSION });
|
||||
|
||||
// 1.34.8: forward session token on /v1/inbox/seen so the daemon can
|
||||
// resolve mesh scoping if it ever needs to. We read it once here and
|
||||
// capture it in the closure since the MCP runs for the lifetime of
|
||||
// the session; the env var doesn't rotate mid-process.
|
||||
const { readSessionTokenFromEnv } = await import("~/services/session/token.js");
|
||||
const sessionTokenForSeen = readSessionTokenFromEnv();
|
||||
|
||||
// Subscribe to daemon events; translate to channel notifications.
|
||||
// 1.34.13: pass the session token so the daemon scopes the SSE
|
||||
// stream via SseFilterOptions. Re-uses the same token already read
|
||||
// for /v1/inbox/seen above.
|
||||
const sub = subscribeEvents(async (ev) => {
|
||||
mcpLog("sse_event_received", { kind: ev.kind });
|
||||
if (ev.kind === "message") {
|
||||
const d = ev.data;
|
||||
const fromName = String(d.sender_name ?? "unknown");
|
||||
@@ -295,17 +378,51 @@ export async function startMcpServer(): Promise<void> {
|
||||
},
|
||||
},
|
||||
});
|
||||
mcpLog("channel_emitted", { content_preview: content.slice(0, 80), mesh: String(d.mesh ?? "") });
|
||||
// 1.34.8: this row was just surfaced inline as a channel
|
||||
// reminder; mark it seen so the next launch's welcome doesn't
|
||||
// re-surface it as "unread." Best-effort: a failure here just
|
||||
// means the welcome will list one extra row, not data loss.
|
||||
const inboxRowId = String(d.id ?? "");
|
||||
if (inboxRowId) {
|
||||
void daemonMarkSeen([inboxRowId], sessionTokenForSeen).catch(() => { /* swallow */ });
|
||||
}
|
||||
} catch (err) {
|
||||
mcpLog("channel_emit_failed", { err: String(err) });
|
||||
process.stderr.write(`[claudemesh-mcp] channel emit failed: ${err}\n`);
|
||||
}
|
||||
} else if (ev.kind === "peer_join" || ev.kind === "peer_leave" || ev.kind === "system") {
|
||||
const d = ev.data;
|
||||
const eventName = String(d.event ?? ev.kind);
|
||||
// 1.34.9: enrich peer_join/leave with the context the broker
|
||||
// already ships (name, pubkey prefix, groups, returning summary).
|
||||
// Pre-1.34.9 we surfaced just the displayName, which is ambiguous
|
||||
// when two sessions share a name (e.g. two `agutierrez` peers in
|
||||
// different cwds). Pubkey prefix disambiguates; groups hint at
|
||||
// role (e.g. "[ops, devs]"). cwd / role aren't in the broker
|
||||
// event yet, so they're skipped — adding them broker-side is a
|
||||
// separate ship.
|
||||
const renderPeerLine = (verb: string): string => {
|
||||
const name = String(d.name ?? "unknown");
|
||||
const pubkey = String(d.pubkey ?? "");
|
||||
const pubkeyTag = pubkey ? ` (${pubkey.slice(0, 8)})` : "";
|
||||
const groups = Array.isArray(d.groups) ? d.groups : [];
|
||||
const groupNames = groups
|
||||
.map((g) => (typeof g === "object" && g !== null && "name" in g ? String((g as { name: unknown }).name) : typeof g === "string" ? g : ""))
|
||||
.filter(Boolean);
|
||||
const groupsTag = groupNames.length > 0 ? ` [${groupNames.join(", ")}]` : "";
|
||||
const lastSeen = typeof d.lastSeenAt === "string" ? d.lastSeenAt : null;
|
||||
const summary = typeof d.summary === "string" && d.summary.trim() ? d.summary.trim() : null;
|
||||
const returningTail = lastSeen
|
||||
? ` — last seen ${new Date(lastSeen).toLocaleTimeString()}${summary ? ` · "${summary.slice(0, 80)}"` : ""}`
|
||||
: "";
|
||||
return `[system] Peer "${name}"${pubkeyTag}${groupsTag} ${verb} the mesh${returningTail}`;
|
||||
};
|
||||
let content: string;
|
||||
if (ev.kind === "peer_join") {
|
||||
content = `[system] Peer "${String(d.name ?? "unknown")}" joined the mesh`;
|
||||
content = renderPeerLine(eventName === "peer_returned" ? "returned to" : "joined");
|
||||
} else if (ev.kind === "peer_leave") {
|
||||
content = `[system] Peer "${String(d.name ?? "unknown")}" left the mesh`;
|
||||
content = renderPeerLine("left");
|
||||
} else {
|
||||
content = `[system] ${eventName}: ${JSON.stringify(d).slice(0, 240)}`;
|
||||
}
|
||||
@@ -318,12 +435,55 @@ export async function startMcpServer(): Promise<void> {
|
||||
kind: "system",
|
||||
event: eventName,
|
||||
mesh_slug: String(d.mesh ?? ""),
|
||||
...(typeof d.name === "string" ? { peer_name: d.name } : {}),
|
||||
...(typeof d.pubkey === "string" ? { peer_pubkey: d.pubkey } : {}),
|
||||
...(Array.isArray(d.groups) ? { peer_groups: JSON.stringify(d.groups) } : {}),
|
||||
...(typeof d.lastSeenAt === "string" ? { peer_last_seen_at: d.lastSeenAt } : {}),
|
||||
...(typeof d.summary === "string" ? { peer_summary: d.summary } : {}),
|
||||
},
|
||||
},
|
||||
});
|
||||
} catch { /* best effort */ }
|
||||
}
|
||||
});
|
||||
}, { sessionToken: sessionTokenForSeen });
|
||||
|
||||
// 1.34.6 — Welcome: single emit on oninitialized + 3s grace.
|
||||
//
|
||||
// The earlier "timing race" theory was wrong. Reading Claude Code's
|
||||
// binary at the `notifications/claude/channel` Zod schema:
|
||||
//
|
||||
// IJ_ = y.object({
|
||||
// method: y.literal("notifications/claude/channel"),
|
||||
// params: y.object({
|
||||
// content: y.string(),
|
||||
// meta: y.record(y.string(), y.string()).optional()
|
||||
// })
|
||||
// })
|
||||
//
|
||||
// `meta` MUST be a record of string-to-string. Pre-1.34.6 the
|
||||
// welcome shipped numbers (`peer_count`, `unread_count`) and arrays
|
||||
// (`peer_names`, `latest_message_ids`) — Zod rejected the entire
|
||||
// notification before it ever reached the channel handler.
|
||||
//
|
||||
// Live peer DMs always survived because their meta values all went
|
||||
// through `String(...)`. The welcome was the only notification
|
||||
// shape with non-string meta — uniquely affected, schema-rejected,
|
||||
// silently dropped.
|
||||
//
|
||||
// 1.34.6 fixes the meta values (see `emitMeshWelcome`) so the
|
||||
// notification passes validation; the dual-lane retry from 1.34.5
|
||||
// is no longer necessary and would now surface a duplicate. Back to
|
||||
// a single emit, with a 3s grace after `oninitialized` — enough for
|
||||
// the React effect that registers the channel handler to run, but
|
||||
// tight enough to feel like a launch handshake.
|
||||
const WELCOME_GRACE_MS = 3_000;
|
||||
let welcomeSent = false;
|
||||
server.oninitialized = () => {
|
||||
mcpLog("server_initialized");
|
||||
if (welcomeSent) return;
|
||||
welcomeSent = true;
|
||||
setTimeout(() => { void emitMeshWelcome(server, mcpLog); }, WELCOME_GRACE_MS);
|
||||
};
|
||||
|
||||
const transport = new StdioServerTransport();
|
||||
await server.connect(transport);
|
||||
@@ -341,6 +501,193 @@ export async function startMcpServer(): Promise<void> {
|
||||
process.on("SIGINT", shutdown);
|
||||
}
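The 1.34.6 comment above explains that `meta` must be a string-to-string record, and that numbers or arrays cause the whole notification to be rejected. A minimal sketch of a coercion step that would enforce this, assuming a helper named `toStringMeta` (hypothetical; the diff coerces each field by hand instead):

```typescript
// Hypothetical helper, not in the diff: coerce arbitrary values into a
// Record<string, string> so the notification passes a string-only schema.
function toStringMeta(input: Record<string, unknown>): Record<string, string> {
  const out: Record<string, string> = {};
  for (const [key, value] of Object.entries(input)) {
    if (value === undefined || value === null) {
      out[key] = ""; // absent fields become empty strings
    } else if (typeof value === "string") {
      out[key] = value;
    } else if (typeof value === "number" || typeof value === "boolean") {
      out[key] = String(value); // counts become digit strings
    } else {
      // Arrays and objects are JSON-encoded so a consumer can re-parse them.
      out[key] = JSON.stringify(value) ?? "";
    }
  }
  return out;
}
```

A single choke point like this would make it harder for a later field to reintroduce a non-string value, at the cost of hiding the per-field encoding from the call site.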
|
||||
|
||||
/**
|
||||
* Mesh-connected welcome. Runs once, 3s after `oninitialized` fires,
|
||||
* regardless of inbox state. The point isn't just to summarize unread —
|
||||
* an empty welcome still confirms to the user that the mesh pipe is
|
||||
* live, names the session, says how many peers are visible, and lists
|
||||
* the canonical CLI commands so the model can use them mid-turn.
|
||||
*
|
||||
* Composes from up to three best-effort daemon queries:
|
||||
* - `/v1/sessions/me` → display name + session pubkey + mesh
|
||||
* (requires session token; absent on bare `claudemesh mcp`)
|
||||
* - `/v1/peers?mesh=…` → live peer count, filtered to non-control-plane
|
||||
* - `/v1/inbox?…` → recent message count + up to 3 previews
|
||||
*
|
||||
* Each query degrades silently — a missing field becomes "unknown" or
|
||||
* is omitted. The welcome ALWAYS emits unless the IPC socket is
|
||||
* unreachable; that's the design contract: "you launched into the
|
||||
* mesh, here's what you've got."
|
||||
*/
|
||||
async function emitMeshWelcome(
|
||||
server: import("@modelcontextprotocol/sdk/server/index.js").Server,
|
||||
mcpLog: (msg: string, meta?: Record<string, unknown>) => void,
|
||||
): Promise<void> {
|
||||
const { readSessionTokenFromEnv } = await import("~/services/session/token.js");
|
||||
const sessionToken = readSessionTokenFromEnv();
|
||||
|
||||
// 1) Self identity. Token-less path (bare `claudemesh mcp` outside a
|
||||
// launch) just leaves these undefined; the welcome still goes out.
|
||||
let selfDisplayName: string | undefined;
|
||||
let selfSessionPubkey: string | undefined;
|
||||
let selfMeshSlug: string | undefined;
|
||||
let selfRole: string | undefined;
|
||||
if (sessionToken) {
|
||||
try {
|
||||
const { status, body } = await daemonGet("/v1/sessions/me", { sessionToken });
|
||||
if (status === 200 && body?.session) {
|
||||
selfDisplayName = body.session.displayName;
|
||||
selfMeshSlug = body.session.mesh;
|
||||
selfRole = body.session.role;
|
||||
selfSessionPubkey = body.session.presence?.sessionPubkey;
|
||||
}
|
||||
} catch (e) { mcpLog("welcome_self_lookup_failed", { err: String(e) }); }
|
||||
}
|
||||
|
||||
// 2) Live peer count. Match the same filter the launch banner uses
|
||||
// (`channel !== "claudemesh-daemon"`) so the welcome's number agrees
|
||||
// with the "N peers online" line that just printed in the terminal.
|
||||
// We also fall back to `peerRole !== "control-plane"` for newer
|
||||
// brokers that emit the role taxonomy. Self is excluded by session
|
||||
// pubkey only; display names are not unique across sessions, so
|
||||
// name-only matching would fail.
|
||||
let peerCount = -1;
|
||||
let peerNames: string[] = [];
|
||||
try {
|
||||
const path = selfMeshSlug ? `/v1/peers?mesh=${encodeURIComponent(selfMeshSlug)}` : "/v1/peers";
|
||||
const { status, body } = await daemonGet(path, { sessionToken });
|
||||
if (status === 200 && Array.isArray(body?.peers)) {
|
||||
const peers = body.peers as Array<Record<string, unknown>>;
|
||||
const real = peers.filter((p) => {
|
||||
const channel = String(p.channel ?? "");
|
||||
const peerRole = String(p.peerRole ?? "");
|
||||
const isInfra = channel === "claudemesh-daemon" || peerRole === "control-plane";
|
||||
if (isInfra) return false;
|
||||
if (selfSessionPubkey && p.pubkey === selfSessionPubkey) return false;
|
||||
return true;
|
||||
});
|
||||
peerCount = real.length;
|
||||
peerNames = real
|
||||
.map((p) => String(p.displayName ?? "unknown"))
|
||||
.filter((n, i, arr) => arr.indexOf(n) === i)
|
||||
.slice(0, 5);
|
||||
mcpLog("welcome_peers_resolved", { total: peers.length, real: real.length });
|
||||
} else {
|
||||
mcpLog("welcome_peers_status", { status });
|
||||
}
|
||||
} catch (e) { mcpLog("welcome_peers_lookup_failed", { err: String(e) }); }
|
||||
|
||||
// 3) Unread inbox. 1.34.8 replaced the "last 24h" window with the
|
||||
// proper read-state filter — `?unread_only=true` returns rows whose
|
||||
// `seen_at` is NULL. The list call uses `mark_seen=false` so the
|
||||
// welcome doesn't auto-stamp; we stamp explicitly via /v1/inbox/seen
|
||||
// *after* we know the channel notification went out (otherwise a
|
||||
// schema rejection would silently mark rows seen that the user
|
||||
// never actually saw — the original 1.34.6 bug shape).
|
||||
const inboxPath = selfMeshSlug
|
||||
? `/v1/inbox?mesh=${encodeURIComponent(selfMeshSlug)}&unread_only=true&mark_seen=false&limit=50`
|
||||
: `/v1/inbox?unread_only=true&mark_seen=false&limit=50`;
|
||||
let inboxItems: Array<Record<string, unknown>> = [];
|
||||
try {
|
||||
const { status, body } = await daemonGet(inboxPath, { sessionToken });
|
||||
if (status === 200 && Array.isArray(body?.items)) {
|
||||
inboxItems = body.items as Array<Record<string, unknown>>;
|
||||
}
|
||||
} catch (e) { mcpLog("welcome_inbox_lookup_failed", { err: String(e) }); }
|
||||
|
||||
// Compose the body. Markdown-friendly so it renders cleanly in the
|
||||
// Claude Code channel reminder block.
|
||||
const lines: string[] = [];
|
||||
const idTag = selfDisplayName
|
||||
? `${selfDisplayName}${selfSessionPubkey ? ` (${selfSessionPubkey.slice(0, 8)})` : ""}${selfRole ? ` [${selfRole}]` : ""}`
|
||||
: "session";
|
||||
const meshTag = selfMeshSlug ? ` on mesh \`${selfMeshSlug}\`` : "";
|
||||
lines.push(`🌐 [welcome] claudemesh connected — you are **${idTag}**${meshTag}.`);
|
||||
|
||||
if (peerCount === 0) {
|
||||
lines.push(`👥 No other peers online right now.`);
|
||||
} else if (peerCount > 0) {
|
||||
const namesPreview = peerNames.join(", ");
|
||||
const more = peerCount > peerNames.length ? ` …and ${peerCount - peerNames.length} more` : "";
|
||||
lines.push(`👥 ${peerCount} peer${peerCount === 1 ? "" : "s"} online: ${namesPreview}${more}`);
|
||||
} else {
|
||||
lines.push(`👥 Peer list unavailable (daemon query failed).`);
|
||||
}
|
||||
|
||||
if (inboxItems.length === 0) {
|
||||
lines.push(`📥 No unread messages.`);
|
||||
} else {
|
||||
lines.push(`📥 ${inboxItems.length} unread message${inboxItems.length === 1 ? "" : "s"}:`);
|
||||
for (const it of inboxItems.slice(0, 3)) {
|
||||
const sender = String(it.sender_name ?? "unknown");
|
||||
const senderPub = String(it.sender_pubkey ?? "").slice(0, 8);
|
||||
const tag = sender !== senderPub ? `${sender} (${senderPub})` : senderPub;
|
||||
const bodyText = (typeof it.body === "string" ? it.body : "(encrypted)").slice(0, 60);
|
||||
const time = it.received_at ? new Date(String(it.received_at)).toLocaleTimeString() : "";
|
||||
lines.push(` ${tag} ${time}: ${bodyText}`);
|
||||
}
|
||||
if (inboxItems.length > 3) lines.push(` …and ${inboxItems.length - 3} more`);
|
||||
}
|
||||
|
||||
// CLI hints — what the model should call when the user asks. Listed
|
||||
// here as a one-liner so the welcome stays compact.
|
||||
lines.push(`💡 Use: \`claudemesh peer list\` · \`claudemesh send <peer> <msg>\` · \`claudemesh inbox\``);
|
||||
// Skill pointer — the `claudemesh` skill in the user's Claude install
|
||||
// documents every CLI verb, JSON shapes, channel attributes, and
|
||||
// common patterns. If the model isn't already loaded with it, this is
|
||||
// the cue to read it once before acting on the mesh.
|
||||
lines.push(`📚 Read the \`claudemesh\` skill (SKILL.md) for full CLI / channel / inbox reference if not yet in context.`);
|
||||
|
||||
const content = lines.join("\n");
|
||||
try {
|
||||
// Claude Code's `notifications/claude/channel` schema is
|
||||
// `meta: y.record(y.string(), y.string())` — string values only.
|
||||
// Pre-1.34.6 we sent numbers / arrays in `peer_count`, `unread_count`,
|
||||
// `peer_names`, `latest_message_ids`; Zod silently rejected the
|
||||
// whole notification before it reached the channel handler. Live
|
||||
// peer DMs survived because their meta values all went through
|
||||
// `String(...)`. Coerce everything here too — arrays stringify as
|
||||
// JSON so downstream consumers can re-parse if they want, and the
|
||||
// counts become digit strings (parseable on the receiving side).
|
||||
await server.notification({
|
||||
method: "notifications/claude/channel",
|
||||
params: {
|
||||
content,
|
||||
meta: {
|
||||
kind: "welcome",
|
||||
self_display_name: selfDisplayName ?? "",
|
||||
self_session_pubkey: selfSessionPubkey ?? "",
|
||||
self_role: selfRole ?? "",
|
||||
mesh_slug: selfMeshSlug ?? "",
|
||||
peer_count: peerCount >= 0 ? String(peerCount) : "",
|
||||
peer_names: JSON.stringify(peerNames),
|
||||
unread_count: String(inboxItems.length),
|
||||
latest_message_ids: JSON.stringify(
|
||||
inboxItems.slice(0, 10).map((it) => String(it.id ?? "")),
|
||||
),
|
||||
},
|
||||
},
|
||||
});
|
||||
mcpLog("welcome_emitted", {
|
||||
mesh: selfMeshSlug ?? "",
|
||||
peer_count: peerCount,
|
||||
unread_count: inboxItems.length,
|
||||
});
|
||||
// 1.34.8: stamp the rows we just surfaced. Done AFTER the
|
||||
// notification succeeds so a Zod-rejected welcome (the 1.34.6 bug
|
||||
// shape) doesn't silently mark rows seen that the user never
|
||||
// actually saw. Best-effort.
|
||||
if (inboxItems.length > 0) {
|
||||
const ids = inboxItems.map((it) => String(it.id ?? "")).filter(Boolean);
|
||||
if (ids.length > 0) {
|
||||
void daemonMarkSeen(ids, sessionToken).catch(() => { /* swallow */ });
|
||||
}
|
||||
}
|
||||
} catch (err) {
|
||||
mcpLog("welcome_emit_failed", { err: String(err) });
|
||||
}
|
||||
}
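The peer-count step in `emitMeshWelcome` filters out infrastructure peers and the session itself, then de-duplicates display names. A sketch of the same logic as a standalone pure function, which may be easier to unit test; the name `summarizePeers` and the `maxNames` parameter are assumptions for illustration:

```typescript
type Peer = Record<string, unknown>;

// Hypothetical extraction of the filter shown in the diff: drop daemon and
// control-plane peers, drop self by session pubkey, then keep up to
// `maxNames` unique display names.
function summarizePeers(
  peers: Peer[],
  selfSessionPubkey?: string,
  maxNames = 5,
): { count: number; names: string[] } {
  const real = peers.filter((p) => {
    const channel = String(p.channel ?? "");
    const peerRole = String(p.peerRole ?? "");
    if (channel === "claudemesh-daemon" || peerRole === "control-plane") return false;
    if (selfSessionPubkey && p.pubkey === selfSessionPubkey) return false;
    return true;
  });
  const names = real
    .map((p) => String(p.displayName ?? "unknown"))
    .filter((n, i, arr) => arr.indexOf(n) === i) // first occurrence wins
    .slice(0, maxNames);
  return { count: real.length, names };
}
```

Note that `count` is taken before de-duplication, so two sessions that share a display name count as two peers but contribute one name.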
|
||||
|
||||
// ── mesh-service proxy mode (unchanged from prior versions) ────────────
|
||||
|
||||
/**
|
||||
|
||||
@@ -52,6 +52,105 @@ export async function tryListPeersViaDaemon(mesh?: string): Promise<unknown[] |
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* 1.34.0 — Try fetching the persisted inbox from the daemon.
|
||||
*
|
||||
* Reads from `~/.claudemesh/daemon/inbox.db` via `/v1/inbox`. This is
|
||||
* the authoritative source of received messages — pushes from the
|
||||
* broker land here through the daemon's session-WS / member-WS push
|
||||
* handler. The pre-1.34.0 cold-path inbox command opened a fresh
|
||||
* BrokerClient and drained an empty in-memory buffer, which never
|
||||
* matched what the daemon was actually receiving.
|
||||
*/
|
||||
export interface InboxItem {
|
||||
id: string;
|
||||
client_message_id: string;
|
||||
broker_message_id: string | null;
|
||||
mesh: string;
|
||||
topic: string | null;
|
||||
sender_pubkey: string;
|
||||
sender_name: string;
|
||||
body: string | null;
|
||||
received_at: string;
|
||||
reply_to_id: string | null;
|
||||
/** 1.34.8: ISO timestamp of when the row was first surfaced to the
|
||||
* user (interactive listing or live channel reminder). `null` =
|
||||
* never seen. */
|
||||
seen_at?: string | null;
|
||||
}
|
||||
|
||||
export async function tryListInboxViaDaemon(
|
||||
mesh?: string,
|
||||
limit = 100,
|
||||
opts: { unreadOnly?: boolean; markSeen?: boolean } = {},
|
||||
): Promise<InboxItem[] | null> {
|
||||
if (!(await daemonReachable())) return null;
|
||||
try {
|
||||
const params: string[] = [`limit=${limit}`];
|
||||
if (mesh) params.push(`mesh=${encodeURIComponent(mesh)}`);
|
||||
// 1.34.8: read-state filters. `unread_only=true` narrows to seen_at
|
||||
// IS NULL; `mark_seen=false` lets the caller peek without flipping
|
||||
// the seen flag (used by the welcome push on the MCP side, not the
|
||||
// CLI). Default behavior matches pre-1.34.8 — return everything
|
||||
// and stamp it seen — so existing callers keep working.
|
||||
if (opts.unreadOnly) params.push("unread_only=true");
|
||||
if (opts.markSeen === false) params.push("mark_seen=false");
|
||||
const path = `/v1/inbox?${params.join("&")}`;
|
||||
const res = await ipc<{ items?: InboxItem[] }>({ path, timeoutMs: 3_000 });
|
||||
if (res.status !== 200) return null;
|
||||
return Array.isArray(res.body.items) ? res.body.items : [];
|
||||
} catch (err) {
|
||||
const msg = String(err);
|
||||
if (/ENOENT|ECONNREFUSED|ipc_timeout/.test(msg)) return null;
|
||||
return null;
|
||||
}
|
||||
}
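The query-string assembly in `tryListInboxViaDaemon` can be illustrated in isolation. The sketch below mirrors the parameter rules described in the 1.34.8 comment; the function name `buildInboxPath` and the `InboxQuery` type are assumptions for illustration, not part of the diff:

```typescript
interface InboxQuery {
  mesh?: string;
  limit?: number;
  unreadOnly?: boolean;
  markSeen?: boolean;
}

// Hypothetical helper: builds the /v1/inbox path the same way the diff does.
// `mark_seen=false` is only sent when explicitly requested, so the default
// (return everything and stamp it seen) is unchanged for existing callers.
function buildInboxPath(q: InboxQuery = {}): string {
  const params: string[] = [`limit=${q.limit ?? 100}`];
  if (q.mesh) params.push(`mesh=${encodeURIComponent(q.mesh)}`);
  if (q.unreadOnly) params.push("unread_only=true");
  if (q.markSeen === false) params.push("mark_seen=false");
  return `/v1/inbox?${params.join("&")}`;
}

// Usage: a peek at unread rows without flipping the seen flag.
const peekPath = buildInboxPath({ mesh: "flexicar", unreadOnly: true, markSeen: false });
```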
|
||||
|
||||
/**
|
||||
* 1.34.7: bulk-delete inbox rows. `mesh` scopes to one mesh (omit =
|
||||
* across every attached mesh); `beforeIso` filters by `received_at <
|
||||
* Date.parse(beforeIso)`. Returns the number of rows removed, or null
|
||||
* when the daemon couldn't be reached.
|
||||
*/
|
||||
export async function tryFlushInboxViaDaemon(
|
||||
args: { mesh?: string; beforeIso?: string } = {},
|
||||
): Promise<number | null> {
|
||||
if (!(await daemonReachable())) return null;
|
||||
try {
|
||||
const params: string[] = [];
|
||||
if (args.mesh) params.push(`mesh=${encodeURIComponent(args.mesh)}`);
|
||||
if (args.beforeIso) params.push(`before=${encodeURIComponent(args.beforeIso)}`);
|
||||
const path = `/v1/inbox${params.length ? `?${params.join("&")}` : ""}`;
|
||||
const res = await ipc<{ removed?: number }>({ path, method: "DELETE", timeoutMs: 3_000 });
|
||||
if (res.status !== 200) return null;
|
||||
return typeof res.body.removed === "number" ? res.body.removed : null;
|
||||
} catch (err) {
|
||||
const msg = String(err);
|
||||
if (/ENOENT|ECONNREFUSED|ipc_timeout/.test(msg)) return null;
|
||||
return null;
|
||||
}
|
||||
}
|
||||
|
||||
/** 1.34.7: delete one inbox row by id. Returns true iff the row was
|
||||
* removed; false on 404; null on transport failure. */
|
||||
export async function tryDeleteInboxRowViaDaemon(id: string): Promise<boolean | null> {
|
||||
if (!(await daemonReachable())) return null;
|
||||
try {
|
||||
const res = await ipc<{ removed?: number }>({
|
||||
path: `/v1/inbox/${encodeURIComponent(id)}`,
|
||||
method: "DELETE",
|
||||
timeoutMs: 3_000,
|
||||
});
|
||||
if (res.status === 404) return false;
|
||||
if (res.status !== 200) return null;
|
||||
return (res.body.removed ?? 0) > 0;
|
||||
} catch (err) {
|
||||
const msg = String(err);
|
||||
if (/ENOENT|ECONNREFUSED|ipc_timeout/.test(msg)) return null;
|
||||
return null;
|
||||
}
|
||||
}
|
||||
|
||||
/** Try fetching mesh-published skills through the daemon. */
|
||||
export async function tryListSkillsViaDaemon(mesh?: string): Promise<unknown[] | null> {
|
||||
if (!(await daemonReachable())) return null;
|
||||
|
||||
@@ -220,8 +220,12 @@ async function spawnDaemon(opts: EnsureDaemonOpts): Promise<SpawnResult> {
|
||||
try {
|
||||
const { spawn } = await import("node:child_process");
|
||||
const binary = await resolveCliBinary();
|
||||
const args = ["daemon", "up"];
|
||||
if (opts.mesh) args.push("--mesh", opts.mesh);
|
||||
// 1.34.12: pass --foreground because the lifecycle helper IS the
|
||||
// detacher in this path — it spawns with detached:true + stdio:
|
||||
// ignore. If we let the child re-detach (the new default), we'd
|
||||
// double-fork and orphan the grandchild. --mesh is dropped (1.34.10
|
||||
// deprecation; daemon attaches to every joined mesh).
|
||||
const args = ["daemon", "up", "--foreground"];
|
||||
|
||||
const child = spawn(binary, args, {
|
||||
detached: true,
|
||||
|
||||
57
apps/cli/tests/unit/paths-stale-env.test.ts
Normal file
@@ -0,0 +1,57 @@
|
||||
import { describe, it, expect, beforeEach, afterEach, vi } from "vitest";
|
||||
import { mkdirSync, rmSync, existsSync } from "node:fs";
|
||||
import { join } from "node:path";
|
||||
import { tmpdir, homedir } from "node:os";
|
||||
|
||||
/** Each test imports a fresh copy of paths.ts via dynamic import +
|
||||
* `_resetPathsForTest()` so memoization doesn't leak across cases. */
|
||||
|
||||
const TEST_DIR = join(tmpdir(), "claudemesh-paths-test-" + Date.now());
|
||||
|
||||
describe("paths CONFIG_DIR resolution", () => {
|
||||
beforeEach(() => {
|
||||
delete process.env.CLAUDEMESH_CONFIG_DIR;
|
||||
if (existsSync(TEST_DIR)) rmSync(TEST_DIR, { recursive: true, force: true });
|
||||
});
|
||||
afterEach(() => {
|
||||
delete process.env.CLAUDEMESH_CONFIG_DIR;
|
||||
if (existsSync(TEST_DIR)) rmSync(TEST_DIR, { recursive: true, force: true });
|
||||
});
|
||||
|
||||
it("falls back to ~/.claudemesh when env var is unset", async () => {
|
||||
const mod = await import("~/constants/paths.js");
|
||||
mod._resetPathsForTest();
|
||||
expect(mod.PATHS.CONFIG_DIR).toBe(join(homedir(), ".claudemesh"));
|
||||
});
|
||||
|
||||
it("honors CLAUDEMESH_CONFIG_DIR when the dir exists, even without config.json", async () => {
|
||||
mkdirSync(TEST_DIR, { recursive: true });
|
||||
process.env.CLAUDEMESH_CONFIG_DIR = TEST_DIR;
|
||||
const mod = await import("~/constants/paths.js");
|
||||
mod._resetPathsForTest();
|
||||
expect(mod.PATHS.CONFIG_DIR).toBe(TEST_DIR);
|
||||
});
|
||||
|
||||
it("falls back to default when env points at a missing dir (stale-tmpdir case)", async () => {
|
||||
process.env.CLAUDEMESH_CONFIG_DIR = "/var/folders/_nonexistent_claudemesh_dir_xyz123";
|
||||
const mod = await import("~/constants/paths.js");
|
||||
mod._resetPathsForTest();
|
||||
// Suppress the stderr warning to keep test output clean
|
||||
const stderr = vi.spyOn(process.stderr, "write").mockImplementation(() => true);
|
||||
try {
|
||||
expect(mod.PATHS.CONFIG_DIR).toBe(join(homedir(), ".claudemesh"));
|
||||
} finally {
|
||||
stderr.mockRestore();
|
||||
}
|
||||
});
|
||||
|
||||
it("memoizes — second access returns the same path even if env changes mid-process", async () => {
|
||||
mkdirSync(TEST_DIR, { recursive: true });
|
||||
process.env.CLAUDEMESH_CONFIG_DIR = TEST_DIR;
|
||||
const mod = await import("~/constants/paths.js");
|
||||
mod._resetPathsForTest();
|
||||
const first = mod.PATHS.CONFIG_DIR;
|
||||
process.env.CLAUDEMESH_CONFIG_DIR = "/something/else";
|
||||
expect(mod.PATHS.CONFIG_DIR).toBe(first);
|
||||
});
|
||||
});
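The tests above pin down three behaviours for `CONFIG_DIR`: fall back to `~/.claudemesh` when the env var is unset or points at a missing directory, honor it when the directory exists, and memoize the first answer. The real `paths.ts` is not shown in this diff, so the following is only a sketch of one resolver that would satisfy those cases; `resolveConfigDir` and `resetConfigDirForTest` are hypothetical names:

```typescript
import { existsSync } from "node:fs";
import { homedir } from "node:os";
import { join } from "node:path";

// Memoized result; null means "not resolved yet".
let cachedConfigDir: string | null = null;

// Hypothetical resolver: trusts CLAUDEMESH_CONFIG_DIR only when the
// directory exists, otherwise warns and falls back to ~/.claudemesh.
function resolveConfigDir(
  env: Record<string, string | undefined> = process.env,
): string {
  if (cachedConfigDir !== null) return cachedConfigDir;
  const fallback = join(homedir(), ".claudemesh");
  const fromEnv = env.CLAUDEMESH_CONFIG_DIR;
  if (fromEnv && existsSync(fromEnv)) {
    cachedConfigDir = fromEnv;
  } else {
    if (fromEnv) {
      // Stale-tmpdir case: the env var survived but the directory did not.
      process.stderr.write(`[claudemesh] ignoring missing CLAUDEMESH_CONFIG_DIR=${fromEnv}\n`);
    }
    cachedConfigDir = fallback;
  }
  return cachedConfigDir;
}

// Test hook that clears the memoized value.
function resetConfigDirForTest(): void {
  cachedConfigDir = null;
}
```

Passing `env` as a parameter is a choice made for this sketch so the function can be exercised without mutating `process.env`; the tests in the diff mutate the environment directly.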
|
||||
@@ -1,55 +1,161 @@
|
||||
import Link from "next/link";
|
||||
|
||||
import {
|
||||
CHANGELOG_ENTRIES,
|
||||
CHANGELOG_TYPE_COLOR,
|
||||
CHANGELOG_TYPE_LABELS,
|
||||
} from "~/modules/marketing/home/changelog-data";
|
||||
|
||||
export const metadata = {
|
||||
title: "Changelog — claudemesh",
|
||||
description: "Release history for claudemesh-cli.",
|
||||
description:
|
||||
"Release history for claudemesh-cli — every shipped version, with the why behind it.",
|
||||
};
|
||||
|
||||
const ENTRIES = [
|
||||
{ version: "0.1.4", date: "2026-04-06", type: "feat", summary: "Stateful welcome screen, PROTOCOL.md, THREAT_MODEL.md, Windows CI matrix" },
|
||||
{ version: "0.1.3", date: "2026-04-05", type: "feat", summary: "claudemesh --version, status, doctor commands" },
|
||||
{ version: "0.1.2", date: "2026-04-05", type: "feat", summary: "claudemesh launch command, transparency banner, decrypt fix, Windows support" },
|
||||
];
|
||||
|
||||
const TYPE_LABELS: Record<string, string> = { feat: "Feature", fix: "Fix", docs: "Docs" };
|
||||
const TYPE_COLORS: Record<string, string> = { feat: "bg-[var(--cm-clay)]", fix: "bg-[var(--cm-cactus)]", docs: "bg-[var(--cm-oat)]" };
|
||||
|
||||
export default function ChangelogPage() {
|
||||
return (
|
||||
<section className="mx-auto max-w-3xl px-6 py-24 md:py-32">
|
||||
<h1
|
||||
className="text-[clamp(2rem,4.5vw,3rem)] font-medium leading-[1.1] text-[var(--cm-fg)]"
|
||||
style={{ fontFamily: "var(--cm-font-serif)" }}
|
||||
>
|
||||
Changelog
|
||||
</h1>
|
||||
<p
|
||||
className="mt-4 text-[15px] text-[var(--cm-fg-secondary)]"
|
||||
style={{ fontFamily: "var(--cm-font-sans)" }}
|
||||
>
|
||||
Every shipped version of claudemesh-cli.
|
||||
</p>
|
||||
<div className="mt-12 space-y-8">
|
||||
{ENTRIES.map((entry) => (
|
||||
<article key={entry.version} className="border-b border-[var(--cm-border)] pb-6">
|
||||
<div className="flex items-center gap-3">
|
||||
<span
|
||||
className={`rounded-[4px] px-2 py-0.5 text-[10px] font-medium uppercase tracking-wider text-[var(--cm-bg)] ${TYPE_COLORS[entry.type] || "bg-[var(--cm-fg-tertiary)]"}`}
|
||||
style={{ fontFamily: "var(--cm-font-mono)" }}
|
||||
>
|
||||
{TYPE_LABELS[entry.type] || entry.type}
|
||||
</span>
|
||||
<span className="text-[18px] font-medium text-[var(--cm-fg)]" style={{ fontFamily: "var(--cm-font-serif)" }}>
|
||||
v{entry.version}
|
||||
</span>
|
||||
<time dateTime={entry.date} className="text-[11px] text-[var(--cm-fg-tertiary)]" style={{ fontFamily: "var(--cm-font-mono)" }}>
|
||||
{new Date(entry.date).toLocaleDateString("en-US", { year: "numeric", month: "short", day: "numeric" })}
|
||||
</time>
|
||||
</div>
|
||||
<p className="mt-2 text-[14px] leading-[1.6] text-[var(--cm-fg-secondary)]" style={{ fontFamily: "var(--cm-font-sans)" }}>
|
||||
{entry.summary}
|
||||
</p>
|
||||
</article>
|
||||
))}
|
||||
<div className="mb-12">
|
||||
<p
|
||||
className="text-[11px] uppercase tracking-[0.2em] text-[var(--cm-fg-tertiary)]"
|
||||
style={{ fontFamily: "var(--cm-font-mono)" }}
|
||||
>
|
||||
claudemesh-cli · release log
|
||||
</p>
|
||||
<h1
|
||||
className="mt-3 text-[clamp(2rem,4.5vw,3rem)] font-medium leading-[1.1] text-[var(--cm-fg)]"
|
||||
style={{ fontFamily: "var(--cm-font-serif)" }}
|
||||
>
|
||||
Changelog
|
||||
</h1>
|
||||
<p
|
||||
className="mt-4 max-w-xl text-[15px] leading-[1.65] text-[var(--cm-fg-secondary)]"
|
||||
style={{ fontFamily: "var(--cm-font-sans)" }}
|
||||
>
|
||||
Hand-picked, load-bearing ships from{" "}
|
||||
<span className="text-[var(--cm-fg)]">v0.1.0</span> through{" "}
|
||||
<span className="text-[var(--cm-clay)]">v1.34.15</span>. For the
|
||||
byte-level diff, the canonical{" "}
|
||||
<Link
|
||||
href="https://github.com/alezmad/claudemesh/blob/main/apps/cli/CHANGELOG.md"
|
||||
className="underline decoration-[var(--cm-fg-tertiary)] underline-offset-4 transition-colors hover:text-[var(--cm-fg)] hover:decoration-[var(--cm-clay)]"
|
||||
>
|
||||
CHANGELOG.md
|
||||
</Link>{" "}
|
||||
lives in the repo.
|
||||
</p>
|
||||
</div>
|
||||
|
||||
{/* Vertical timeline rail */}
|
||||
<div className="relative">
|
||||
<div
|
||||
className="absolute left-[7px] top-2 hidden h-full w-px md:block"
|
||||
style={{
|
||||
background:
|
||||
"linear-gradient(to bottom, var(--cm-clay) 0%, var(--cm-fig) 30%, var(--cm-cactus) 60%, transparent 100%)",
|
||||
}}
|
||||
/>
|
||||
|
||||
<div className="space-y-10">
|
||||
{CHANGELOG_ENTRIES.map((entry, idx) => (
|
||||
<article
|
||||
key={entry.version + entry.date}
|
||||
className="relative md:pl-10"
|
||||
>
|
||||
{/* Dot on rail */}
|
||||
<div
|
||||
className="absolute left-0 top-[10px] hidden h-[15px] w-[15px] rounded-full border-2 md:block"
|
||||
style={{
|
||||
borderColor: CHANGELOG_TYPE_COLOR[entry.type],
|
||||
backgroundColor: "var(--cm-bg)",
|
||||
}}
|
||||
>
|
||||
<div
|
||||
className="absolute inset-[3px] rounded-full"
|
||||
style={{
|
||||
backgroundColor: CHANGELOG_TYPE_COLOR[entry.type],
|
||||
opacity: idx === 0 ? 1 : 0.5,
|
||||
}}
|
||||
/>
|
||||
</div>
|
||||
|
||||
<header className="mb-3 flex flex-wrap items-baseline gap-x-3 gap-y-1">
|
||||
<span
|
||||
className="rounded-[3px] px-1.5 py-0.5 text-[10px] font-medium uppercase tracking-wider"
|
||||
style={{
|
||||
fontFamily: "var(--cm-font-mono)",
|
||||
backgroundColor: CHANGELOG_TYPE_COLOR[entry.type],
|
||||
color: "var(--cm-gray-900)",
|
||||
}}
|
||||
>
|
||||
{CHANGELOG_TYPE_LABELS[entry.type]}
|
||||
</span>
|
||||
<span
|
||||
className="text-[18px] font-medium text-[var(--cm-fg)]"
|
||||
style={{ fontFamily: "var(--cm-font-serif)" }}
|
||||
>
|
||||
v{entry.version}
|
||||
</span>
|
||||
<time
|
||||
dateTime={entry.date}
|
||||
className="text-[11px] text-[var(--cm-fg-tertiary)]"
|
||||
style={{ fontFamily: "var(--cm-font-mono)" }}
|
||||
>
|
||||
{new Date(entry.date).toLocaleDateString("en-US", {
|
||||
year: "numeric",
|
||||
month: "short",
|
||||
day: "numeric",
|
||||
})}
|
||||
</time>
|
||||
</header>
|
||||
|
||||
<h2
|
||||
className="text-[15px] font-medium text-[var(--cm-fg)]"
|
||||
style={{ fontFamily: "var(--cm-font-sans)" }}
|
||||
>
|
||||
{entry.title}
|
||||
</h2>
|
||||
|
||||
<p
|
||||
className="mt-2 text-[14px] leading-[1.7] text-[var(--cm-fg-secondary)]"
|
||||
style={{ fontFamily: "var(--cm-font-sans)" }}
|
||||
>
|
||||
{entry.summary}
|
||||
</p>
|
||||
</article>
|
||||
))}
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<footer className="mt-20 border-t border-[var(--cm-border)] pt-8">
|
||||
<p
|
||||
className="text-[13px] text-[var(--cm-fg-tertiary)]"
|
||||
style={{ fontFamily: "var(--cm-font-sans)" }}
|
||||
>
|
||||
Tracked at{" "}
|
||||
<Link
|
||||
href="https://github.com/alezmad/claudemesh/blob/main/docs/roadmap.md"
|
||||
className="underline decoration-[var(--cm-fg-tertiary)] underline-offset-4 transition-colors hover:text-[var(--cm-fg)] hover:decoration-[var(--cm-clay)]"
|
||||
>
|
||||
docs/roadmap.md
|
||||
</Link>
|
||||
. Specs at{" "}
|
||||
<Link
|
||||
href="https://github.com/alezmad/claudemesh/tree/main/.artifacts/specs"
|
||||
className="underline decoration-[var(--cm-fg-tertiary)] underline-offset-4 transition-colors hover:text-[var(--cm-fg)] hover:decoration-[var(--cm-clay)]"
|
||||
>
|
||||
.artifacts/specs/
|
||||
</Link>
|
||||
. Tagged binaries on{" "}
|
||||
<Link
|
||||
href="https://github.com/alezmad/claudemesh/releases"
|
||||
className="underline decoration-[var(--cm-fg-tertiary)] underline-offset-4 transition-colors hover:text-[var(--cm-fg)] hover:decoration-[var(--cm-clay)]"
|
||||
>
|
||||
GitHub Releases
|
||||
</Link>
|
||||
.
|
||||
</p>
|
||||
</footer>
|
||||
</section>
|
||||
);
|
||||
}
|
||||
|
||||
@@ -3,6 +3,7 @@ import { Features } from "~/modules/marketing/home/features";
|
||||
import { WhereMeshFits } from "~/modules/marketing/home/where-mesh-fits";
|
||||
import { WhatIsClaudemesh } from "~/modules/marketing/home/what-is-claudemesh";
|
||||
import { Timeline } from "~/modules/marketing/home/timeline";
|
||||
import { LatestReleases } from "~/modules/marketing/home/latest-releases";
|
||||
import { Pricing } from "~/modules/marketing/home/pricing";
|
||||
import { FAQ } from "~/modules/marketing/home/faq";
|
||||
import { CallToAction } from "~/modules/marketing/home/cta";
|
||||
@@ -22,6 +23,7 @@ const HomePage = () => {
|
||||
<WhereMeshFits />
|
||||
<WhatIsClaudemesh />
|
||||
<Timeline />
|
||||
<LatestReleases count={5} />
|
||||
<Pricing />
|
||||
<FAQ />
|
||||
<CallToAction />
|
||||
|
||||
168
apps/web/src/modules/marketing/home/changelog-data.ts
Normal file
@@ -0,0 +1,168 @@
|
||||
/**
|
||||
* Single source of truth for the curated release log surfaced on:
|
||||
* - /changelog (full timeline)
|
||||
* - / (Latest Releases compact strip)
|
||||
*
|
||||
* Lives outside `app/.../page.tsx` because Next.js's app-router type generator
|
||||
* rejects non-conforming exports from route files (only `default`, `metadata`,
|
||||
* `dynamic`, etc. are allowed). Importing data from a plain module sidesteps
|
||||
* the constraint without changing route semantics.
|
||||
*
|
||||
* Hand-picked load-bearing ships, newest first. For the byte-level history
|
||||
* see `apps/cli/CHANGELOG.md` in the repo.
|
||||
*/
|
||||
|
||||
export type ChangelogEntry = {
|
||||
version: string;
|
||||
date: string;
|
||||
type: "feat" | "fix" | "docs" | "perf" | "infra";
|
||||
title: string;
|
||||
summary: string;
|
||||
};
|
||||
|
||||
export const CHANGELOG_ENTRIES: ChangelogEntry[] = [
|
||||
{
|
||||
version: "1.34.15",
|
||||
date: "2026-05-04",
|
||||
type: "fix",
|
||||
title: "peer list --mesh scopes; kick refuses control-plane",
|
||||
summary:
|
||||
"Two follow-ups from the multi-session correctness train. peer list --mesh now forwards the slug to the daemon (was aggregating across all attached meshes). The broker refuses no-op kicks against control-plane connections (daemon, dashboard) — they auto-reconnected within seconds — and surfaces them in a new additive ack field. Soft `disconnect` keeps old behavior.",
|
||||
},
|
||||
{
|
||||
version: "1.34.14",
|
||||
date: "2026-05-04",
|
||||
type: "fix",
|
||||
title: "stale CLAUDEMESH_CONFIG_DIR falls back",
|
||||
summary:
|
||||
"When the launched-session env leaked into a later CLI invocation and pointed at a tmpdir that no longer existed, the resolver silently used the dead path and showed “No meshes joined”. Now memoized: env unset → default; env points at a real dir → trust; env set but dir gone → TTY-only stderr warning + fallback to ~/.claudemesh.",
|
||||
},
|
||||
{
|
||||
version: "1.34.7 → 1.34.13",
|
||||
date: "2026-05-04",
|
||||
type: "fix",
|
||||
title: "multi-session correctness train",
|
||||
summary:
|
||||
"Seven releases over a few hours that took claudemesh from “works for one session” to “internally consistent for N sessions on one daemon.” Per-session SSE demux at the bind layer, inbox per-recipient column, daemon detached by default, MCP forwards session token on /v1/events. Architecture invariant: every shared store / channel scopes by recipient.",
|
||||
},
|
||||
{
|
||||
version: "1.32.0",
|
||||
date: "2026-05-04",
|
||||
type: "feat",
|
||||
title: "multi-session UX bundle",
|
||||
summary:
|
||||
"Self-identity via session pubkey, `--self` fan-out for member-pubkey targeting, broker welcome on launch (broker state + peer count + unread inbox). Resolves hex prefixes to full pubkeys before send.",
|
||||
},
|
||||
{
|
||||
version: "1.30.0",
|
||||
date: "2026-05-04",
|
||||
type: "feat",
|
||||
title: "per-session broker presence",
|
||||
summary:
|
||||
"Two `claudemesh launch` sessions in the same cwd finally see each other in `peer list`. Each session has a long-lived broker presence row owned by the daemon, identified by a per-launch ephemeral keypair vouched by the member's stable key. Broker `session_hello` handler with parent-attestation TTL and session-signature checks.",
|
||||
},
|
||||
{
|
||||
version: "1.26.0 → 1.29.0",
|
||||
date: "2026-05-04",
|
||||
type: "feat",
|
||||
title: "multi-mesh daemon · per-session IPC tokens",
|
||||
summary:
|
||||
"One daemon process attaches to every joined mesh simultaneously. Aggregate read routes (/v1/peers, /v1/skills) tag each record with its mesh; explicit ?mesh=<slug> narrows server-side. Per-session IPC tokens scoped to tmpdir mode-0600 so CLI invocations from inside a launched session auto-attribute to its workspace. Self-healing daemon lifecycle (auto-spawn under file-lock, version probe).",
|
||||
},
|
||||
{
|
||||
version: "1.24.0",
|
||||
date: "2026-05-03",
|
||||
type: "feat",
|
||||
title: "daemon required + thin MCP",
|
||||
summary:
|
||||
"MCP server shrinks from 979 LoC to ~200 LoC of push-pipe. The daemon owns the broker WS and feeds the MCP push channel over IPC SSE. `claudemesh install` auto-installs and starts the daemon service. `claudemesh launch` ensures daemon is running before spawning Claude.",
|
||||
},
|
||||
{
|
||||
version: "0.9.0 (1.22.0)",
|
||||
date: "2026-05-03",
|
||||
type: "feat",
|
||||
title: "daemon foundation",
|
||||
summary:
|
||||
"Long-lived process holding one broker WS per attached mesh, durable outbox/inbox in SQLite, IPC over UDS (+ optional loopback TCP w/ bearer), SSE event stream. Caller-stable idempotency on every send. Service install (launchd / systemd-user). Outbox CLI with atomic abort+insert on requeue. Host-fingerprint pin on first run.",
|
||||
},
|
||||
{
|
||||
version: "0.7.0 (1.21.0)",
|
||||
date: "2026-05-03",
|
||||
type: "infra",
|
||||
title: "slug = identifier",
|
||||
summary:
|
||||
"Pre-launch correction of generic SaaS scaffolding. mesh.name and mesh.slug collapse — slug IS the identifier. `claudemesh rename <old-slug> <new-slug>` is the entire rename surface. CLI picker drops the (parens). Server PATCH /api/cli/meshes/:slug body becomes `{ slug }`.",
|
||||
},
|
||||
{
|
||||
version: "0.4.0 → 0.5.2 (1.10.0–1.18.0)",
|
||||
date: "2026-05-03",
|
||||
type: "feat",
|
||||
title: "me/* cross-mesh aggregation",
|
||||
summary:
|
||||
"First cross-mesh read-aggregating verbs. /v1/me/workspace, /v1/me/topics, /v1/me/notifications, /v1/me/activity, /v1/me/search — every aggregating read verb has CLI + web parity. Default-aggregation for `topic list`, `notification list`, `task list`, `state list`, `memory recall` when no --mesh is passed. file share / get with same-host fast path.",
|
||||
},
|
||||
{
|
||||
version: "0.3.0 (1.8.0)",
|
||||
date: "2026-05-02",
|
||||
type: "feat",
|
||||
title: "per-topic encryption (CLI + web)",
|
||||
summary:
|
||||
"Topics generate a 32-byte symmetric key on creation; broker seals via crypto_box for the creator. Pending-seals endpoint, seal POST, claudemesh topic post for encrypted REST sends, decrypt-on-render in topic tail, 30s background re-seal loop. Web side: browser-side persistent ed25519 identity in IndexedDB + encrypt-on-send / decrypt-on-render.",
|
||||
},
|
||||
{
|
||||
version: "1.7.0",
|
||||
date: "2026-05-02",
|
||||
type: "feat",
|
||||
title: "demo cut: topic tail, member list, notifications",
|
||||
summary:
|
||||
"Member sidebar in chat panel with names, online dots, presence summaries. Topic search + member-mention autocomplete. Notification feed at /dashboard listing every @<your-name> reference across all meshes (last 7 days). CLI parity: `claudemesh topic tail` (live SSE consumer), `claudemesh member list`, `claudemesh notification list`.",
|
||||
},
|
||||
{
|
||||
version: "0.2.0 (1.6.0)",
|
||||
date: "2026-05-02",
|
||||
type: "feat",
|
||||
title: "topics + REST gateway + bridge peers",
|
||||
summary:
|
||||
"Topics (channel pub/sub) with mesh = trust boundary, group = identity tag, topic = conversation scope — three orthogonal axes. API keys for non-WebSocket clients. REST /api/v1/* with bearer-token auth (messages, topics, peers, history). Bridge peers belonging to two meshes forwarding a topic between them. Humans-as-peers — peer_type: human plumbed end-to-end.",
|
||||
},
|
||||
{
|
||||
version: "1.5.0",
|
||||
date: "2026-05-02",
|
||||
type: "feat",
|
||||
title: "CLI-first architecture lock-in",
|
||||
summary:
|
||||
"Tool-less MCP — tools/list returns []. Inbound peer messages still arrive as experimental.claude/channel notifications mid-turn. Bundle size −42%. Resource-noun-verb CLI (peer list, message send, memory recall). Bundled claudemesh skill installed to ~/.claude/skills/. Unix-socket bridge for warm WS reuse (~220 ms warm vs ~600 ms cold). Policy engine + audit log.",
|
||||
},
|
||||
{
|
||||
version: "1.0.0-alpha",
|
||||
date: "2026-04-15",
|
||||
type: "feat",
|
||||
title: "single-binary distribution + per-peer caps",
|
||||
summary:
|
||||
"curl -fsSL claudemesh.com/install | sh downloads the right binary (darwin/linux/windows × x64/arm64). claudemesh:// URL scheme makes invite emails one-click. Per-peer capability grants: claudemesh grant/revoke/block/grants enforced server-side. Encrypted backup / restore with Argon2id + XChaCha20-Poly1305. Safety numbers (`claudemesh verify <peer>`).",
|
||||
},
|
||||
{
|
||||
version: "0.1.0",
|
||||
date: "2026-04-04",
|
||||
type: "feat",
|
||||
title: "public launch",
|
||||
summary:
|
||||
"Direct peer-to-peer messaging through a hosted broker, ready for real teams. End-to-end encryption — crypto_box direct, crypto_secretbox group. Signed ed25519 identities + signed invite links (ic://join/...). Hello-sig handshake auth. Hosted broker at wss://ic.claudemesh.com/ws. Claude Code MCP tools: list_peers, send_message, check_messages, set_summary, set_status.",
|
||||
},
|
||||
];
|
||||
|
||||
export const CHANGELOG_TYPE_LABELS: Record<ChangelogEntry["type"], string> = {
|
||||
feat: "Feature",
|
||||
fix: "Fix",
|
||||
docs: "Docs",
|
||||
perf: "Perf",
|
||||
infra: "Infra",
|
||||
};
|
||||
|
||||
export const CHANGELOG_TYPE_COLOR: Record<ChangelogEntry["type"], string> = {
|
||||
feat: "var(--cm-clay)",
|
||||
fix: "var(--cm-cactus)",
|
||||
docs: "var(--cm-oat)",
|
||||
perf: "var(--cm-fig)",
|
||||
infra: "var(--cm-fg-tertiary)",
|
||||
};
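
The 1.34.14 entry above describes a three-way resolution for `CLAUDEMESH_CONFIG_DIR`. A minimal sketch of that decision table follows, assuming a pure function with an injected filesystem probe. The names `resolveConfigDir`, `ResolveInput`, and `ConfigDirResult` are hypothetical and not taken from the claudemesh source, and treating an empty string like an unset variable is an assumption.

```ts
// Hypothetical sketch: names and signature are illustrative, not claudemesh's real resolver.
type ConfigDirResult = { dir: string; warning?: string };

type ResolveInput = {
  envValue: string | undefined; // value of CLAUDEMESH_CONFIG_DIR, if any
  homeDir: string; // the user's home directory
  dirExists: (path: string) => boolean; // injected filesystem probe
  isTTY: boolean; // whether stderr is an interactive terminal
};

function resolveConfigDir(input: ResolveInput): ConfigDirResult {
  const fallback = `${input.homeDir}/.claudemesh`;

  // Case 1: env unset (assumption: empty counts as unset) -> default location.
  if (!input.envValue) return { dir: fallback };

  // Case 2: env points at a real directory -> trust it.
  if (input.dirExists(input.envValue)) return { dir: input.envValue };

  // Case 3: env set but directory gone -> fall back, warn only on a TTY.
  const warning = `CLAUDEMESH_CONFIG_DIR=${input.envValue} does not exist; using ${fallback}`;
  return input.isTTY ? { dir: fallback, warning } : { dir: fallback };
}

// Memoize so the probe (and any warning) happens at most once per process.
let cached: ConfigDirResult | undefined;
function resolveConfigDirOnce(input: ResolveInput): ConfigDirResult {
  if (cached === undefined) cached = resolveConfigDir(input);
  return cached;
}
```

Returning the warning instead of writing it keeps the sketch testable; a caller would presumably print it to stderr once.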
|
||||
@@ -32,9 +32,9 @@ export const CallToAction = () => {
|
||||
className="mx-auto mt-8 max-w-2xl text-lg leading-[1.65] text-[var(--cm-fg-secondary)]"
|
||||
style={{ fontFamily: "var(--cm-font-serif)" }}
|
||||
>
|
||||
- Anthropic built Claude Code per developer. The next unlock is
- between developers. Hosted on claudemesh.com or self-hosted in
- your VPC — same CLI, same features, same encryption.
+ Anthropic Agent Teams stops at the edge of one laptop. claudemesh
+ starts there — across machines, users, and organizations. Hosted
+ on claudemesh.com or self-hosted in your VPC, same CLI either way.
|
||||
</p>
|
||||
</Reveal>
|
||||
<Reveal delay={3}>
|
||||
|
||||
@@ -5,7 +5,7 @@ import { Reveal } from "./_reveal";
|
||||
const ITEMS = [
|
||||
{
|
||||
q: "Is claudemesh free?",
|
||||
- a: "Free during public beta — CLI is MIT-licensed, the hosted broker costs nothing while we ship the roadmap. Paid tiers launch when the dashboard ships. Beta users keep the free plan for life.",
+ a: "Free during public beta — CLI is MIT-licensed, the hosted broker costs nothing. Paid tiers launch when we exit beta and add team-scale features (SSO, audit retention, dedicated brokers). Beta users keep the free plan for life.",
|
||||
},
|
||||
{
|
||||
q: "How do I get started?",
|
||||
@@ -33,7 +33,11 @@ const ITEMS = [
|
||||
},
|
||||
{
|
||||
q: "How is this different from MCP?",
|
||||
- a: "MCP connects one Claude to tools and services. claudemesh connects many Claudes to each other. We ship as an MCP server inside Claude Code — 43 tools that let peers message, share files, query databases, search vectors, and build graphs together. From the agent's view, other peers look like callable tools. It composes on top of MCP; it doesn't replace it.",
+ a: "MCP connects one Claude to tools and services. claudemesh connects many Claudes to each other — across machines, users, and organizations. As of v1.5.0 the MCP shim is intentionally thin: tools/list returns []. Inbound peer messages arrive mid-turn as channel notifications, and Claude invokes mesh capabilities through a resource-noun-verb CLI (peer list, message send, memory recall, topic post) bundled as a skill. claudemesh composes on top of MCP; it doesn't replace it.",
|
||||
},
|
||||
{
|
||||
q: "How is this different from Anthropic's Agent Teams?",
|
||||
a: "Anthropic's experimental Agent Teams (shipped Feb 2026, Claude Code v2.1.32+) coordinates multiple Claude Code sessions inside ONE Unix user's ~/.claude/ directory on ONE machine. Mailbox lives in process. Task list is a markdown file. Lead is fixed for the team's lifetime. Cleanup wipes the state. claudemesh runs across machines, users, and organizations. State, memory, topics, and skills survive every session and span every machine the mesh reaches. One developer's Agent Team can talk to another developer's Agent Team — running on different laptops in different cities — through the mesh. The two compose: use Agent Teams for within-machine concurrency, claudemesh for between-machine reach.",
|
||||
},
|
||||
{
|
||||
q: "What persistence backends does the mesh include?",
|
||||
@@ -53,7 +57,7 @@ const ITEMS = [
|
||||
},
|
||||
{
|
||||
q: "Can a peer be in multiple meshes?",
|
||||
- a: "Yes. Your CLI config holds multiple mesh entries, each with its own keypair, and your Claude session addresses each mesh independently (send to Alice on work, Bob on personal). Cross-mesh bridge peers that auto-forward tagged messages are v0.2; cross-broker federation (your self-host ↔ claudemesh.com) is v0.3.",
+ a: "Yes. Your CLI config holds multiple mesh entries, each with its own keypair. As of v1.26.0, the daemon attaches to every joined mesh simultaneously — `claudemesh peer list` aggregates across all of them, `--mesh <slug>` narrows to one. Cross-mesh bridge peers that auto-forward tagged topics shipped in v0.2.0 (v1.6.0). Cross-broker federation (your self-host ↔ claudemesh.com) is the next major direction.",
|
||||
},
|
||||
];
|
||||
|
||||
|
||||
@@ -67,9 +67,10 @@ export const HeroWithMesh = () => {
|
||||
textShadow: "0 2px 20px rgba(0,0,0,0.8)",
|
||||
}}
|
||||
>
|
||||
- Share context, files, skills, and MCPs across every Claude Code
- session — end-to-end encrypted. Hosted on claudemesh.com or
- self-hosted in your VPC. Same CLI, same wire, your choice.
+ The encrypted backbone where Claude Code sessions, autonomous
+ agents, and humans coordinate — across machines, across users,
+ across organizations. Hosted on claudemesh.com or self-hosted in
+ your VPC. Same CLI, same wire, your choice.
|
||||
</p>
|
||||
</Reveal>
|
||||
|
||||
|
||||
141
apps/web/src/modules/marketing/home/latest-releases.tsx
Normal file
@@ -0,0 +1,141 @@
|
||||
import Link from "next/link";
|
||||
|
||||
import {
|
||||
CHANGELOG_ENTRIES,
|
||||
CHANGELOG_TYPE_COLOR,
|
||||
CHANGELOG_TYPE_LABELS,
|
||||
} from "./changelog-data";
|
||||
import { Reveal, SectionIcon } from "./_reveal";
|
||||
|
||||
/**
|
||||
* Compact recent-releases strip for the home page. Pulls the top N entries
|
||||
* from the same data source as the full /changelog page so they never
|
||||
* disagree.
|
||||
*/
|
||||
export const LatestReleases = ({ count = 5 }: { count?: number }) => {
|
||||
const recent = CHANGELOG_ENTRIES.slice(0, count);
|
||||
|
||||
return (
|
||||
<section className="border-b border-[var(--cm-border)] bg-[var(--cm-bg-elevated)] px-6 py-24 md:px-12 md:py-28">
|
||||
<div className="mx-auto max-w-[var(--cm-max-w)]">
|
||||
<Reveal className="mb-6 flex justify-center">
|
||||
<SectionIcon glyph="grid" />
|
||||
</Reveal>
|
||||
|
||||
<Reveal delay={1}>
|
||||
<p
|
||||
className="text-center text-[11px] uppercase tracking-[0.2em] text-[var(--cm-fg-tertiary)]"
|
||||
style={{ fontFamily: "var(--cm-font-mono)" }}
|
||||
>
|
||||
release log · last {count} ships
|
||||
</p>
|
||||
</Reveal>
|
||||
|
||||
<Reveal delay={2}>
|
||||
<h2
|
||||
className="mt-3 text-center text-[clamp(1.75rem,3.5vw,2.5rem)] font-medium leading-[1.15] text-[var(--cm-fg)]"
|
||||
style={{ fontFamily: "var(--cm-font-serif)" }}
|
||||
>
|
||||
What shipped this week
|
||||
</h2>
|
||||
</Reveal>
|
||||
|
||||
<Reveal delay={3}>
|
||||
<p
|
||||
className="mx-auto mt-3 max-w-xl text-center text-[14px] leading-[1.65] text-[var(--cm-fg-secondary)]"
|
||||
style={{ fontFamily: "var(--cm-font-sans)" }}
|
||||
>
|
||||
Every release is in production on{" "}
|
||||
<span
|
||||
className="text-[var(--cm-fg)]"
|
||||
style={{ fontFamily: "var(--cm-font-mono)" }}
|
||||
>
|
||||
wss://ic.claudemesh.com
|
||||
</span>{" "}
|
||||
within minutes. The CLI publishes to npm; the broker auto-deploys.
|
||||
</p>
|
||||
</Reveal>
|
||||
|
||||
<Reveal delay={4}>
|
||||
<ol className="mx-auto mt-12 max-w-3xl space-y-4">
|
||||
{recent.map((entry, idx) => (
|
||||
<li key={entry.version + entry.date}>
|
||||
<Link
|
||||
href="/changelog"
|
||||
className="group block rounded-[var(--cm-radius-md)] border border-[var(--cm-border)] bg-[var(--cm-bg)] p-5 transition-colors hover:border-[var(--cm-clay)]/40"
|
||||
>
|
||||
<div className="flex flex-wrap items-baseline gap-x-3 gap-y-1">
|
||||
<span
|
||||
className="rounded-[3px] px-1.5 py-0.5 text-[10px] font-medium uppercase tracking-wider"
|
||||
style={{
|
||||
fontFamily: "var(--cm-font-mono)",
|
||||
backgroundColor: CHANGELOG_TYPE_COLOR[entry.type],
|
||||
color: "var(--cm-gray-900)",
|
||||
}}
|
||||
>
|
||||
{CHANGELOG_TYPE_LABELS[entry.type]}
|
||||
</span>
|
||||
<span
|
||||
className="text-[16px] font-medium text-[var(--cm-fg)]"
|
||||
style={{ fontFamily: "var(--cm-font-serif)" }}
|
||||
>
|
||||
v{entry.version}
|
||||
</span>
|
||||
<time
|
||||
dateTime={entry.date}
|
||||
className="text-[11px] text-[var(--cm-fg-tertiary)]"
|
||||
style={{ fontFamily: "var(--cm-font-mono)" }}
|
||||
>
|
||||
{new Date(entry.date).toLocaleDateString("en-US", {
|
||||
year: "numeric",
|
||||
month: "short",
|
||||
day: "numeric",
|
||||
})}
|
||||
</time>
|
||||
{idx === 0 && (
|
||||
<span
|
||||
className="rounded-full bg-[var(--cm-clay)]/15 px-2 py-0.5 text-[10px] font-medium uppercase tracking-wider text-[var(--cm-clay)]"
|
||||
style={{ fontFamily: "var(--cm-font-mono)" }}
|
||||
>
|
||||
latest
|
||||
</span>
|
||||
)}
|
||||
</div>
|
||||
<h3
|
||||
className="mt-2.5 text-[15px] font-medium text-[var(--cm-fg)] transition-colors group-hover:text-[var(--cm-clay)]"
|
||||
style={{ fontFamily: "var(--cm-font-sans)" }}
|
||||
>
|
||||
{entry.title}
|
||||
</h3>
|
||||
<p
|
||||
className="mt-2 line-clamp-2 text-[13px] leading-[1.6] text-[var(--cm-fg-secondary)]"
|
||||
style={{ fontFamily: "var(--cm-font-sans)" }}
|
||||
>
|
||||
{entry.summary}
|
||||
</p>
|
||||
</Link>
|
||||
</li>
|
||||
))}
|
||||
</ol>
|
||||
</Reveal>
|
||||
|
||||
<Reveal delay={5}>
|
||||
<div className="mt-10 flex justify-center">
|
||||
<Link
|
||||
href="/changelog"
|
||||
className="group inline-flex items-center gap-2 text-[13px] font-medium text-[var(--cm-fg-secondary)] transition-colors hover:text-[var(--cm-clay)]"
|
||||
style={{ fontFamily: "var(--cm-font-sans)" }}
|
||||
>
|
||||
<span className="border-b border-dashed border-[var(--cm-fg-tertiary)] pb-0.5 transition-colors group-hover:border-[var(--cm-clay)]">
|
||||
Read the full changelog
|
||||
</span>
|
||||
<span className="transition-transform duration-300 group-hover:translate-x-1">
|
||||
→
|
||||
</span>
|
||||
</Link>
|
||||
</div>
|
||||
</Reveal>
|
||||
</div>
|
||||
</section>
|
||||
);
|
||||
};
|
||||
@@ -111,8 +111,9 @@ export const Pricing = () => {
|
||||
className="mb-4 text-[12px] leading-[1.5] text-[var(--cm-fg-tertiary)]"
|
||||
style={{ fontFamily: "var(--cm-font-sans)" }}
|
||||
>
|
||||
- Paid tiers launch when the dashboard ships. Beta users keep
- the free plan for life.
+ Paid tiers launch when we exit beta and add team-scale
+ features (SSO, audit retention, dedicated brokers). Beta
+ users keep the free plan for life.
|
||||
</p>
|
||||
<Link
|
||||
href="/auth/register"
|
||||
|
||||
@@ -85,6 +85,23 @@ const MILESTONES = [
|
||||
],
|
||||
stat: "43 MCP tools total",
|
||||
},
|
||||
{
|
||||
version: "v0.9 → 1.34",
|
||||
phase: "Daemon · multi-mesh · multi-session",
|
||||
color: "var(--cm-cactus)",
|
||||
items: [
|
||||
"Persistent daemon — long-lived broker WS, durable outbox/inbox",
|
||||
"Universal multi-mesh daemon — one process, every joined mesh",
|
||||
"Per-session IPC tokens — auto-scope to the launched session",
|
||||
"Per-session broker presence — sibling sessions see each other",
|
||||
"Self-healing daemon lifecycle (auto-spawn, version probe)",
|
||||
"Multi-session correctness train — per-recipient SSE demux + inbox scoping",
|
||||
"Refuse-to-kick on control-plane (no more no-op kicks)",
|
||||
"Caller-stable idempotency on every send",
|
||||
"Stale CLAUDEMESH_CONFIG_DIR fallback",
|
||||
],
|
||||
stat: "1.34.15 shipped",
|
||||
},
|
||||
];
|
||||
|
||||
export const Timeline = () => {
|
||||
@@ -94,7 +111,7 @@ export const Timeline = () => {
|
||||
<section className="border-b border-[var(--cm-border)] bg-[var(--cm-bg)] px-6 py-24 md:px-12 md:py-32">
|
||||
<div className="mx-auto max-w-[var(--cm-max-w)]">
|
||||
<Reveal className="mb-6 flex justify-center">
|
||||
- <SectionIcon glyph="layers" />
+ <SectionIcon glyph="grid" />
|
||||
</Reveal>
|
||||
<Reveal delay={1}>
|
||||
<h2
|
||||
@@ -109,7 +126,8 @@ export const Timeline = () => {
|
||||
className="mx-auto mt-4 max-w-xl text-center text-[15px] leading-[1.6] text-[var(--cm-fg-secondary)]"
|
||||
style={{ fontFamily: "var(--cm-font-sans)" }}
|
||||
>
|
||||
- 66 npm releases. Every feature below is in production today.
+ 120+ npm releases through v1.34.15. Every feature below is in
+ production today.
|
||||
</p>
|
||||
</Reveal>
|
||||
|
||||
@@ -210,8 +228,8 @@ export const Timeline = () => {
|
||||
className="text-[14px] text-[var(--cm-fg-tertiary)]"
|
||||
style={{ fontFamily: "var(--cm-font-serif)" }}
|
||||
>
|
||||
- Daemon redesign · per-topic encryption · self-host
- packaging · federation
+ HKDF cross-machine identity · session capabilities · A2A
+ interop · self-host packaging · federation
|
||||
</span>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
@@ -4,28 +4,28 @@ import Link from "next/link";
|
||||
|
||||
const NEWS = [
|
||||
{
|
||||
- tag: "New",
- title: "claudemesh launch (v0.1.4)",
- body: "Real-time peer messages pushed into Claude Code mid-turn. One command. Source open at github.com/alezmad/claudemesh-cli.",
- href: "https://github.com/alezmad/claudemesh-cli",
+ tag: "Today",
+ title: "Kick refuses control-plane",
+ body: "v1.34.15 — broker now skips control-plane peers on kick and acks the skip. Use ban for hard removal, or take the daemon down for transient cases.",
+ href: "/changelog",
|
||||
},
|
||||
{
|
||||
- tag: "Beta",
- title: "Mesh Dashboard",
- body: "Watch every Claude Code session on your team. Routes, presence, priority — all live.",
- href: "#",
+ tag: "This week",
+ title: "Multi-session correctness",
+ body: "1.34.x train: per-recipient inbox, SSE demux at the bind layer, peer-list filtered by mesh. Multiple sessions on one machine no longer cross-talk.",
+ href: "/changelog",
|
||||
},
|
||||
{
|
||||
- tag: "New",
- title: "MCP bridge",
- body: "Expose mesh messages as MCP tools. Your agent can message peers without leaving its context.",
- href: "#",
+ tag: "Shipped",
+ title: "Per-session presence",
+ body: "v1.30.0 — every Claude Code session gets its own ed25519 keypair and parent attestation. The broker tracks sessions, not machines.",
+ href: "/changelog",
|
||||
},
|
||||
{
|
||||
- tag: "Launch",
- title: "Self-hosted broker",
- body: "One binary. SQLite-backed. Runs on a Pi. Your mesh, never the cloud's.",
- href: "#",
+ tag: "Shipped",
+ title: "Multi-mesh daemon",
+ body: "v1.26.0 — one daemon, every mesh you've joined. Switch context with a flag. Self-host the broker in your VPC; same CLI, your URL.",
+ href: "/changelog",
|
||||
},
|
||||
];
|
||||
|
||||
|
||||
@@ -25,6 +25,14 @@ const CARDS: Card[] = [
|
||||
weDo: "claudemesh connects full, independent Claude Code sessions across machines, across developers, across continents. Each peer keeps its own repo, its own perspective, its own scrollback.",
|
||||
tone: "compare",
|
||||
},
|
||||
{
|
||||
label: "vs. Agent Teams",
|
||||
title: "Multi-agent within one machine",
|
||||
theyDo:
|
||||
"Anthropic's experimental Agent Teams (Feb 2026, Claude Code v2.1.32+) coordinates multiple Claude Code sessions inside ONE Unix user's ~/.claude/ directory on ONE machine. Mailbox in process. Task list in a markdown file. Lead is fixed. Cleanup wipes the state.",
|
||||
weDo: "claudemesh runs across machines, users, and organizations. State, memory, topics, and skills survive every session. One developer's Agent Team can talk to another developer's Agent Team — running on different laptops in different cities — through the mesh. Use Agent Teams for within-machine concurrency, claudemesh for between-machine reach.",
|
||||
tone: "compare",
|
||||
},
|
||||
{
|
||||
label: "vs. OpenClaw",
|
||||
title: "Autonomous agents that run while you sleep",
|
||||
@@ -35,10 +43,10 @@ const CARDS: Card[] = [
|
||||
},
|
||||
{
|
||||
label: "What claudemesh is",
|
||||
- title: "The wire between Claude Code sessions",
+ title: "The wire across machines, users, and orgs",
|
||||
theyDo:
|
||||
- "Every Claude Code session today is an island. Context dies with the terminal. Skills and MCPs are per-developer. Teammates relay insights through Slack.",
- weDo: "claudemesh is one thing: a peer network for Claude Code. Share context, files, skills, MCPs, and slash commands across sessions — end-to-end encrypted. Host the broker on claudemesh.com or run it in your VPC. Same CLI either way.",
+ "Every Claude Code session is an island unless you wrap it. Anthropic's Agent Teams now ties them together within one Unix user, one machine. Beyond that — across laptops, across team members, across companies — the gap is still wide.",
+ weDo: "claudemesh is one thing: an end-to-end encrypted backbone where Claude Code sessions, autonomous agents, and humans coordinate across every boundary your existing tools stop at. Persistent state, topics, memory, and skills span every machine the mesh reaches. Host the broker on claudemesh.com or run it in your VPC. Same CLI either way.",
|
||||
tone: "claim",
|
||||
},
|
||||
];
|
||||
|
||||
189
docs/roadmap.md
@@ -292,6 +292,195 @@ What's left for true v2.0.0 (next sessions):

---

## v1.31.0 → v1.32.0 — *multi-session UX bundle* — *shipped*

The Sprint B push that made multiple Claude Code sessions on the
same daemon actually pleasant — self-identity via session pubkey,
`--self` fan-out, broker welcome.

- **1.31.x** — peer list shows `profile.role` and groups; resolves
  hex prefixes to full pubkeys before send; clean rebuild path with
  correct VERSION baked in.
- **1.32.0** — multi-session UX bundle (self-identity, `--self`
  fan-out, broker welcome). *Shipped 2026-05-04 in CLI v1.32.0.*

---

## v1.34.x — *multi-session correctness train* — *shipped*

The 2026-05-04 ship train — seven releases over a few hours that
took claudemesh from "works for one session" to "internally
consistent for N sessions on one daemon." Every layer that was
shared between sessions either grew per-recipient scoping or
demuxed at its boundary.

The throughline: any time the daemon held shared state — bus,
inbox, broker fan-out — two sessions belonging to the same member
silently saw each other's traffic. Each release fixed one layer
and exposed the next gap.
|
||||
|
||||
- **1.34.7 — inbox flush + delete commands.** First-class CLI
  cleanup for the persisted inbox; previously you had to drop into
  raw `sqlite3`. `claudemesh inbox flush --mesh|--before|--all`
  with `--all` confirmation guard, plus
  `claudemesh inbox delete <id>`. *Shipped 2026-05-04.*
- **1.34.8 — read-state + TTL prune + first echo guard.** New
  `seen_at` column on `inbox`; live channel emits + interactive
  listings flip it; welcome filters on `seen_at IS NULL` instead
  of an arbitrary 24h window. Hourly prune deletes rows older than
  30 days. First attempt at a self-echo guard at the WS boundary
  (later proven incomplete in 1.34.13). *Shipped 2026-05-04.*
- **1.34.9 — broader echo guard + system event polish.** Daemon-WS
  guard relaxed (1.34.8 required both axes; session-attributed
  echoes carry session pubkey on `senderPubkey` so the strict
  filter never triggered). Session-WS skips system events to dedupe
  peer_join broadcasts. Richer peer-join channel render
  (pubkey prefix + groups + last-seen for `peer_returned`).
  Daemon-staleness warning when CLI ≠ running daemon version.
  *Shipped 2026-05-04.*
- **1.34.10 — per-session SSE demux + universal daemon.** The
  bus stays single-shot; demux happens at the SSE bind layer
  via `SseFilterOptions`. Each subscriber's session token resolves
  server-side to a session pubkey + member pubkey, and
  `shouldDeliver` filters on `recipient_pubkey` + `recipient_kind`.
  Also: `daemon up` and `install-service` deprecate `--mesh` /
  `--name` (universal daemon attaches to every joined mesh
  automatically); `daemon_started` boot log stamps the version.
  *Shipped 2026-05-04.*
- **1.34.11 — inbox per-recipient column.** Storage half of
  1.34.10. New `recipient_pubkey` + `recipient_kind` columns on
  `inbox` (indexed, non-destructive migration; legacy rows land
  NULL and stay visible to everyone). `listInbox` accepts
  `recipientPubkey` + `recipientMemberPubkey`; `/v1/inbox`
  resolves them from the session token. Welcome auto-fixes —
  it already passed the token. *Shipped 2026-05-04.*
- **1.34.12 — `daemon up` detaches by default.** Pre-1.34.12
  ran in foreground and streamed JSON logs to the terminal until
  Ctrl-C. Now spawns a detached child re-execing
  `daemon up --foreground` with stdout/stderr →
  `~/.claudemesh/daemon/daemon.log`; parent exits cleanly with
  pid + log path.
  Service units (launchd plist, systemd-user) explicitly pass
  `--foreground` so the service manager owns lifecycle.
  *Shipped 2026-05-04.*
- **1.34.13 — MCP forwards session token on `/v1/events`.** The
  actual fix that activated 1.34.10's demux. The MCP server's
  SSE subscription wasn't sending the session token, so the
  daemon's `/v1/events` resolved `session` to null and the demux
  filter was empty — every MCP received the unfiltered global
  stream. `subscribeEvents` now passes
  `Authorization: ClaudeMesh-Session <token>`. *Shipped 2026-05-04.*
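
The per-session demux described in 1.34.10 and 1.34.13 can be pictured as a single predicate applied to each bus event per subscriber. The sketch below is illustrative only: `shouldDeliver`, `recipient_pubkey`, and `recipient_kind` are names from the entries above, while the event shape, the `ResolvedSession` type, and the set of `recipient_kind` values are assumptions.

```typescript
// Hypothetical shapes. Only `shouldDeliver`, `recipient_pubkey`, and
// `recipient_kind` come from the roadmap; everything else is assumed.
type RecipientKind = "session" | "member" | "broadcast";

interface BusEvent {
  recipient_pubkey: string | null;
  recipient_kind: RecipientKind | null;
}

// What a session token is assumed to resolve to on the daemon side.
interface ResolvedSession {
  sessionPubkey: string;
  memberPubkey: string;
}

function shouldDeliver(ev: BusEvent, session: ResolvedSession | null): boolean {
  // No resolved session means no filter: the subscriber receives the
  // global stream. This is the failure mode 1.34.13 closed by making
  // the MCP server forward its token.
  if (session === null) return true;
  // Unaddressed events (assumed: broadcasts, system events) reach everyone.
  if (ev.recipient_pubkey === null || ev.recipient_kind === "broadcast") return true;
  if (ev.recipient_kind === "session") return ev.recipient_pubkey === session.sessionPubkey;
  if (ev.recipient_kind === "member") return ev.recipient_pubkey === session.memberPubkey;
  return false; // addressed but unknown kind: drop rather than leak
}
```

Under these assumptions, two sessions of the same member stop seeing each other's session-addressed traffic, while member-addressed events still reach both.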

### Architecture invariant after 1.34.13

Every shared store / channel on the daemon now scopes by recipient.
Single bus + single tables remain canonical; demux is isolated to
one chokepoint per layer.

| Layer | Scoping mechanism | Shipped |
|---|---|---|
| EventBus | SSE demux at bind layer + token forwarding | 1.34.10 + 1.34.13 |
| inbox.db | `recipient_pubkey` / `recipient_kind` columns | 1.34.11 |
| outbox.db | `sender_session_pubkey` for routing | 1.34.0 |
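
For the `inbox.db` row, the visibility rule from 1.34.11 (legacy rows with a NULL recipient stay visible to everyone) could look roughly like the in-memory filter below. This is a sketch, not the actual `listInbox` query: the row type and the `visibleRows` helper are hypothetical, and treating member-addressed rows as visible to every session of that member is an assumption.

```typescript
// Hypothetical row shape; the column names mirror the roadmap.
interface InboxRow {
  id: number;
  recipient_pubkey: string | null; // NULL for rows written before 1.34.11
  recipient_kind: string | null;
}

// Hypothetical stand-in for the recipient predicate inside `listInbox`.
function visibleRows(
  rows: InboxRow[],
  recipientPubkey: string,       // the caller's session pubkey
  recipientMemberPubkey: string, // the caller's member pubkey
): InboxRow[] {
  return rows.filter(
    (r) =>
      r.recipient_pubkey === null ||              // legacy row: visible to all
      r.recipient_pubkey === recipientPubkey ||   // addressed to this session
      r.recipient_pubkey === recipientMemberPubkey, // addressed to the member
  );
}
```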

### Known gaps — status after the 2026-05-04 follow-up sprint

Three of the four 1.34.x triage gaps shipped in 1.34.14 + 1.34.15
(2026-05-04). Gap #4 is spec'd and queued.

- ✅ **Stale `CLAUDEMESH_CONFIG_DIR` falls back** *(1.34.14)*. The
  env var no longer silently breaks subsequent CLI calls. When the
  inherited path points at a tmpdir that no longer exists,
  `paths.ts` warns once on stderr (TTY-only) with a shell-specific
  unset hint and falls back to `~/.claudemesh`. The dir-existence
  check (not `config.json`) keeps fresh-launch first-write working.
- ✅ **`peer list --mesh <slug>` actually scopes** *(1.34.15)*.
  Diagnosis from the original triage was wrong — broker has been
  scoping correctly since 1.26.0 via `conn.meshId`. Bug was
  CLI-side: `tryListPeersViaDaemon()` was called with no argument in
  `commands/peers.ts:140` and `commands/launch.ts:407`. Both now
  forward the slug as `?mesh=<slug>`. `send.ts` cross-mesh
  hex-prefix resolution intentionally untouched.
- ✅ **`kick` refuses no-op kicks on control-plane** *(1.34.15)*.
  Broker now skips peers where `peerRole === "control-plane"` and
  surfaces them in a new additive ack field
  `skipped_control_plane`; CLI reads it and points the user at
  `ban` (remove member) or `daemon down` (take a daemon offline
  locally). Soft `disconnect` keeps old behavior — useful when
  intentionally nudging a control-plane peer to re-authenticate.
  `PeerConn` gains a `peerRole` slot populated at both
  `connections.set` sites. The richer `presence pause [--mesh X]`
  verb (option (b) from the triage) deferred as its own feature.
- 📋 **Session capabilities — spec only**. Launched sessions still
  inherit all member grants transitively. Spec at
  `.artifacts/specs/2026-05-04-session-capabilities.md` covers a v2
  parent attestation alongside v1 with an `allowed_caps[]` subset,
  broker enforcement as
  `intersection(member.peerGrants, session.allowed_caps)`, and a
  bonus `state-write` cap to close the "any session can clobber
  shared keys like `current-pr`" footgun.
  Default when no caps subset is declared = full member set
  (today's behavior; opt-in restriction). Ships behind a 1-week
  dry-run window before flipping enforcement, mirroring the
  original per-peer-capabilities rollout. ~1 sprint of focused
  work; queued behind v0.3.0 topic-encryption.
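
The session-capabilities rule in the last item reduces to a set intersection with a permissive default. A minimal sketch, assuming capabilities are plain strings such as `dm`, `state-read`, and `state-write`; the `effectiveCaps` and `checkCap` helpers and the `mode` flag are hypothetical and are not taken from the spec.

```typescript
type Cap = string;

// intersection(member.peerGrants, session.allowed_caps), with the
// stated default: no declared subset means the full member set.
function effectiveCaps(memberGrants: readonly Cap[], allowedCaps?: readonly Cap[]): Cap[] {
  if (allowedCaps === undefined) return [...memberGrants];
  return allowedCaps.filter((c) => memberGrants.includes(c));
}

// Hypothetical check covering the dry-run window: in "dry-run" the
// decision stays member-level (today's behavior) and `wouldDeny` only
// flags what enforcement would later reject.
function checkCap(
  cap: Cap,
  memberGrants: readonly Cap[],
  allowedCaps: readonly Cap[] | undefined,
  mode: "dry-run" | "enforce",
): { allowed: boolean; wouldDeny: boolean } {
  const inMember = memberGrants.includes(cap);
  const inSession = effectiveCaps(memberGrants, allowedCaps).includes(cap);
  return {
    allowed: mode === "enforce" ? inSession : inMember,
    wouldDeny: inMember && !inSession,
  };
}
```

Because the result is an intersection, a session can never gain a capability the member lacks: listing an extra cap in `allowed_caps` has no effect.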

---

## v1.34.16 + broker — *continuous presence* — *shipped*

User report on 2026-05-05: `claudemesh peer list` returned zero
peers despite running sessions. Diagnosis: half-dead WS connections
that NAT/CGNAT silently dropped, with no application-layer staleness
detection on either side. Linux TCP keepalive default ≈ 2hrs idle +
11min probes — sessions stayed zombie for hours before the kernel
RST'd the socket and the daemon's existing close-handler reconnect
fired.
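
The keepalive figures quoted above follow from the stock Linux sysctl defaults (`tcp_keepalive_time` 7200 s, `tcp_keepalive_intvl` 75 s, `tcp_keepalive_probes` 9). A short worked calculation, assuming those defaults are unchanged on the host; the constant names are illustrative:

```typescript
// Stock Linux defaults for net.ipv4.tcp_keepalive_time / _intvl / _probes.
const keepaliveIdleSec = 7200;
const keepaliveIntervalSec = 75;
const keepaliveProbes = 9;

// Time spent probing after the idle period expires.
const probePhaseSec = keepaliveIntervalSec * keepaliveProbes; // 675 s, about 11 min

// Worst-case time before the kernel declares the peer dead.
const kernelDetectSec = keepaliveIdleSec + probePhaseSec; // 7875 s, about 2 h 11 min

// Application-layer threshold used by the watchdogs in this release.
const watchdogDetectSec = 75;
const speedup = kernelDetectSec / watchdogDetectSec; // 105
```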

Two layers shipped together:

- **Liveness watchdogs** *(broker + CLI 1.34.16)*. Both sides now
  detect stalled WS in 75s instead of waiting for the kernel.
  - Broker: `PeerConn.lastPongAt` bumped on every `pong`. The 30s
    ping loop also calls `ws.terminate()` on conns whose pong is >75s
    stale, firing the close handler → existing peer_left cleanup.
  - Daemon: `ws-lifecycle.ts` adds an idle watchdog at 30s cadence,
    started after hello-ack. Bumps `lastActivity` on incoming
    message + ping + pong frames. Sends its own `sock.ping()` if
    activity is recent, `sock.terminate()` if idle >75s. Watchdog
    cleared on close + explicit close().
  - 100x improvement on detection time (2hrs → 75s).
- **Lease model** *(broker only, no protocol change)*. Peers no
  longer see `peer_left`/`peer_joined` for transient reconnects.
  - `PeerConn` gains `leaseState` ("online"|"offline"), `leaseUntil`,
    `evictionTimer`. On WS close, the conn enters **offline-leased**
    state for 90s instead of immediate cleanup.
  - `handleHello` and `handleSessionHello` check for an
    offline-leased entry matching the stable identity before running
    session-id dedup. On match: clear `evictionTimer`, swap `ws`,
    restore online state, drain queued DMs, return `silent: true`.
    The hello dispatcher skips the peer_joined broadcast.
  - `evictPresenceFully` extracted from the close handler — runs
    the peer_left broadcast + cleanup (URL watches, streams, MCP
    registry, clock auto-pause). Called by `evictionTimer` after 90s
    grace, or directly when no lease was online (defensive).
  - `broker.ts` exports `restorePresence(presenceId)` — clears
    `disconnectedAt` + bumps `lastPingAt`, called on reattach to
    undo the DB-level stale-presence sweeper if it fired during
    grace.
  - DMs sent during grace fall through to the existing message_queue
    path (sendToPeer no-ops on dead WS, queue row stays with
    deliveredAt=NULL, drained on reattach). Backward compatible
    with old daemons.
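
The daemon-side watchdog decision from the first item can be sketched as a pure function of the last-activity timestamp, which makes the timing easy to reason about. The function and constant names here are hypothetical; only the 30s cadence and the 75s threshold come from the text.

```typescript
const TICK_MS = 30 * 1000;        // watchdog cadence
const STALE_AFTER_MS = 75 * 1000; // idle threshold

type WatchdogAction = "ping" | "terminate";

// Decide what one watchdog tick should do, given when the socket last
// saw a message, ping, or pong frame.
function watchdogAction(lastActivityMs: number, nowMs: number): WatchdogAction {
  return nowMs - lastActivityMs > STALE_AFTER_MS ? "terminate" : "ping";
}

// Count ticks until the watchdog gives up on a socket that went silent
// at `lastActivityMs` (ticks assumed aligned to that instant).
function ticksUntilTerminate(lastActivityMs: number): number {
  let now = lastActivityMs;
  let ticks = 0;
  do {
    now += TICK_MS;
    ticks += 1;
  } while (watchdogAction(lastActivityMs, now) === "ping");
  return ticks;
}
```

With a 30s cadence, termination fires on the first tick after the 75s threshold, so under this model a dead socket is dropped somewhere between 75s and 105s after its last activity.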

Spec at `.artifacts/specs/2026-05-05-continuous-presence.md`.
Layer 3 (resume token to skip full attestation on reconnect) deferred
— pure optimization, not needed for the user-visible "no
invisibility moment" goal.
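
The lease model shipped in this release can be modeled as a small state machine over explicit timestamps. This is a simplified sketch rather than the broker code: the `Lease` shape reuses the `leaseState` and `leaseUntil` field names, but `queuedDms`, `onClose`, `onHello`, and `shouldEvict` are hypothetical, and real timers are replaced by a `nowMs` argument.

```typescript
const GRACE_MS = 90 * 1000; // offline-leased grace period

interface Lease {
  leaseState: "online" | "offline";
  leaseUntil: number | null; // epoch ms; null while online
  queuedDms: string[];       // stand-in for message_queue rows
}

// WS closed: keep the entry, start the grace window.
function onClose(lease: Lease, nowMs: number): Lease {
  return { ...lease, leaseState: "offline", leaseUntil: nowMs + GRACE_MS };
}

interface HelloResult {
  lease: Lease;
  silent: boolean;   // true means: skip the peer_joined broadcast
  drained: string[]; // DMs delivered on reattach
}

// Hello from the same stable identity: reattach if still within grace.
function onHello(existing: Lease | undefined, nowMs: number): HelloResult {
  if (
    existing !== undefined &&
    existing.leaseState === "offline" &&
    existing.leaseUntil !== null &&
    nowMs <= existing.leaseUntil
  ) {
    return {
      lease: { leaseState: "online", leaseUntil: null, queuedDms: [] },
      silent: true,
      drained: existing.queuedDms,
    };
  }
  // No lease, or grace expired: treat as a fresh join.
  return {
    lease: { leaseState: "online", leaseUntil: null, queuedDms: [] },
    silent: false,
    drained: [],
  };
}

// What the eviction timer would check once the grace period is over.
function shouldEvict(lease: Lease, nowMs: number): boolean {
  return lease.leaseState === "offline" && lease.leaseUntil !== null && nowMs > lease.leaseUntil;
}
```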

*Shipped 2026-05-05.*

---

## v2.0.0 — *HKDF cross-machine identity*

The remaining v2 promise after Sprint A: the user's account secret
