Compare commits
17 Commits
cb90f1ca60...cef246a34a

| Author | SHA1 | Date | |
|---|---|---|---|
| | cef246a34a | | |
| | f013436541 | | |
| | 6d981976c0 | | |
| | f7d7d391c9 | | |
| | ff2aa8bf7c | | |
| | 4d42185b0f | | |
| | d62b3f45d2 | | |
| | e688f66791 | | |
| | 033a2d37e1 | | |
| | 364178d95b | | |
| | f91871c71d | | |
| | 92cac16c91 | | |
| | 81f0e4f7ac | | |
| | 2b6cf2c14b | | |
| | 8a5469a5df | | |
| | e128a6ae5f | | |
| | 3753a6e137 | | |
282
.artifacts/specs/2026-05-04-per-session-presence.md
Normal file
@@ -0,0 +1,282 @@
# Per-session broker presence (daemon-multiplexed)

**Status:** spec, queued for 1.30.0 (alongside launch-wizard refactor).
**Owner:** alezmad
**Author:** Claude (Sprint A planning, 2026-05-04)
**Related:** `2026-05-04-v2-roadmap-completion.md` (Sprint A overview),
1.29.0 session-registry CHANGELOG entry.

## Problem

After 1.28.0 dropped the bridge tier, **launched `claude` sessions have
no persistent broker presence**. Only the daemon does.

Concretely: two `claudemesh launch` sessions in the same cwd, querying
`peer list` 2 s apart, **never see each other**. Each `claudemesh peer
list` opens a short-lived cold-path WS that creates a `presence` row
for the duration of the query and tears it down. The "this session"
row everyone sees in their own snapshot is created by the snapshot
itself; sibling sessions' queries miss it because their WS-lifetimes
don't overlap.

Confirmed empirically (2026-05-04, same-cwd ECIJA-Intranet test):

| Snapshot | timestamp | self pubkey | self `connectedAt` |
|---|---|---|---|
| Session A | 11:42:37Z | `61d96106cb499208` | 11:42:38Z (= query time) |
| Session B | 11:42:39Z | `ce77188aba02827d` | 11:42:38Z (= query time) |

Each saw 5 long-lived peers (the daemon and unrelated other sessions)
plus its own ephemeral row. Neither saw the other.
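
The missed sightings come down to interval overlap. The sketch below models a presence row as a time window and shows why two cold-path queries 2 s apart never intersect, while a long-lived row is visible to every later query. The names (`PresenceWindow`, `overlaps`, `coldPathQuery`) and the 500 ms query duration are assumptions chosen for illustration; they do not come from the broker code.

```typescript
// Hypothetical model: a presence row is visible only while its WS is open.
interface PresenceWindow {
  connectedAt: number;    // ms, relative to an arbitrary origin
  disconnectedAt: number; // ms, relative to the same origin
}

/** True when both rows exist at the same instant for some non-empty span. */
function overlaps(a: PresenceWindow, b: PresenceWindow): boolean {
  return a.connectedAt < b.disconnectedAt && b.connectedAt < a.disconnectedAt;
}

/** Cold-path query: the row lives only as long as the query itself. */
function coldPathQuery(startMs: number, durationMs: number): PresenceWindow {
  return { connectedAt: startMs, disconnectedAt: startMs + durationMs };
}

// Assumed 500 ms query duration; the real figure is not stated in the spec.
const queryA = coldPathQuery(0, 500);     // session A queries at t = 0
const queryB = coldPathQuery(2_000, 500); // session B queries 2 s later
const siblingsVisible = overlaps(queryA, queryB);

// A long-lived row overlaps every later query.
const longLived: PresenceWindow = {
  connectedAt: 0,
  disconnectedAt: Number.POSITIVE_INFINITY,
};
const visibleToB = overlaps(longLived, queryB);
```

Under these assumptions `siblingsVisible` is `false` and `visibleToB` is `true`, which matches the observed behaviour: only rows that outlive a single query show up in a sibling's snapshot.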

## Goal

Every launched `claude` session has a long-lived broker presence row
**owned by the daemon**, identified by the session's per-launch
keypair. Siblings see each other in `peer list` immediately and
continuously, not as snapshot artifacts.

## Non-goals

- Cross-machine session sync (waiting on 2.0.0 HKDF identity).
- Replacing the daemon's own presence row: the daemon stays as a
  separate row for "the user on this machine, no specific session."
- Persistence of the session-presence link across daemon restarts:
  a daemon restart may require launched sessions to
  re-register (same compromise as the in-memory session registry from
  1.29.0).

## Design

### State machine

The 1.29.0 session registry already tracks `Map<token, SessionInfo>`
inside the daemon. Extend it to own a per-session broker connection.

```
session lifecycle:
  POST /v1/sessions/register
    → registry.set(token, info)
    → daemon.openSessionWs(info)          ← NEW
    → broker creates presence row owned by session.pubkey

  DELETE /v1/sessions/:token
    → registry.delete(token)
    → daemon.closeSessionWs(token)        ← NEW
    → broker marks presence.disconnectedAt = now()

  reaper (30 s tick): pid dead?
    → registry.delete(token)
    → daemon.closeSessionWs(token)
```
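
The lifecycle above can be read as a small registry that owns one socket per token. The following is a minimal TypeScript sketch under stated assumptions: `SessionRegistry`, `SessionWs`, the reduced `SessionInfo` shape, and the injected `openSessionWs` / `isPidAlive` callbacks are hypothetical names for illustration, not the daemon's actual API. It closes the socket before dropping the registry entry, which is one of the two orderings this spec mentions.

```typescript
// Hypothetical shapes; the real SessionInfo and broker client are not shown in this spec.
interface SessionInfo {
  token: string;
  pid: number;
  sessionPubkey: string;
}

interface SessionWs {
  close(): void;
}

class SessionRegistry {
  private readonly sessions = new Map<string, SessionInfo>();
  private readonly sockets = new Map<string, SessionWs>();
  private readonly openSessionWs: (info: SessionInfo) => SessionWs;
  private readonly isPidAlive: (pid: number) => boolean;

  constructor(
    openSessionWs: (info: SessionInfo) => SessionWs,
    isPidAlive: (pid: number) => boolean,
  ) {
    this.openSessionWs = openSessionWs;
    this.isPidAlive = isPidAlive;
  }

  /** POST /v1/sessions/register: store the entry and open its WS. */
  register(info: SessionInfo): void {
    this.remove(info.token); // re-registering replaces any previous socket
    this.sessions.set(info.token, info);
    this.sockets.set(info.token, this.openSessionWs(info));
  }

  /** DELETE /v1/sessions/:token: close the WS, then drop the entry. */
  remove(token: string): boolean {
    const ws = this.sockets.get(token);
    if (ws) ws.close();
    this.sockets.delete(token);
    return this.sessions.delete(token);
  }

  /** Reaper tick: drop every session whose pid is gone; returns reaped tokens. */
  reap(): string[] {
    const reaped: string[] = [];
    for (const [token, info] of this.sessions) {
      if (!this.isPidAlive(info.pid)) reaped.push(token);
    }
    for (const token of reaped) this.remove(token);
    return reaped;
  }

  has(token: string): boolean {
    return this.sessions.has(token);
  }
}
```

Injecting the two callbacks keeps the sketch testable without a broker or real processes; a daemon would presumably call `reap()` from a 30 s interval timer.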

### Daemon-side: per-session `BrokerClient`

Today the daemon holds `Map<meshSlug, DaemonBrokerClient>` (one WS per
attached mesh). Add a parallel `Map<token, SessionBrokerClient>` for
the per-launch ephemeral connections.

`SessionBrokerClient` is the existing `BrokerClient` reused, configured
with the session's per-launch keypair instead of the member's stable
keypair. It registers presence (`presence_join`) and stays connected
until `closeSessionWs(token)` fires. It does **not** drain the outbox;
that's the member-keypair `DaemonBrokerClient`'s job. It only carries
presence and receives DMs targeted at the session pubkey.

### Broker-side: parent-vouched presence auth

Today's broker accepts hello-sig auth where:
- Caller signs the broker's nonce with their `mesh_member` keypair.
- Broker looks up `mesh_member.peer_pubkey == sig.pubkey`.

For per-session keypairs, the session pubkey is **not** in `mesh_member`:
it's freshly generated by `claudemesh launch`. We need a new
attestation flow:

```
hello {
  type: "session_hello",
  session_pubkey: <fresh keypair>,
  parent_member_pubkey: <member keypair from config>,
  display_name, cwd, role, groups,
  parent_signature: ed25519_sign(member_priv,
    "claudemesh-session/" || session_pubkey || "/" || nonce),
  nonce_challenge: <broker nonce>,
}
```

Broker validates:
1. `parent_member_pubkey` exists in `mesh.member` for the target mesh.
2. `parent_signature` validates against `parent_member_pubkey` over the
   canonical message above.
3. Broker inserts a presence row keyed on `session_pubkey` but
   `member_id` pointing at the parent member's `mesh.member.id`.

This is the OAuth-style refresh-vs-access pattern: the parent member
key vouches "this ephemeral session pubkey belongs to me." The broker
binds the row to the parent member but uses the session pubkey for
routing (so DMs targeted at the session pubkey land at this WS).

### CLI-side: launch.ts produces the parent signature

`claudemesh launch` already mints the session keypair and writes the
session-token file. Extend it to also produce a `parent_signature`
that the daemon can present when opening the session WS:

```ts
const sessionPubkey = sessionKeypair.publicKey;
const parentSig = ed25519_sign(
  mesh.secretKey,
  Buffer.concat([
    Buffer.from("claudemesh-session/"),
    sessionPubkey,
    Buffer.from("/"),
    /* nonce comes from broker; handled at WS-connect time */
  ]),
);
```

Actually, the nonce is broker-issued at hello time, so the signature
needs to be produced fresh per WS-connect. Simpler approach: the
`POST /v1/sessions/register` body carries the *member secret key* (or
a derived signing capability) so the daemon can sign nonces on behalf
of the session.

That's a key-leak risk. Better: register carries a **pre-signed
attestation** good for a TTL window:

```
register body adds:
  parent_attestation: {
    session_pubkey: hex,
    parent_member_pubkey: hex,
    expires_at: ISO,
    signature: ed25519_sign(member_priv,
      "claudemesh-session-attest/" ||
      session_pubkey || "/" ||
      expires_at),
  }
```

Daemon presents this attestation in `session_hello`; broker validates
expiry and signature, then issues a nonce challenge that the daemon
can satisfy with the session keypair (which IS held by the daemon
for the lifetime of the registration). Two-stage: parent vouches the
session; session signs the nonce.
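
One way to picture the two-stage proof is with Node's built-in `node:crypto` ed25519 support. This is a sketch, not the shipped implementation: the function names (`mintAttestation`, `checkAttestation`, `signNonce`, `checkNonce`) are hypothetical, the canonical string follows the layout written in this section, and the broker is handed the member's public key directly instead of looking it up in the database.

```typescript
import { generateKeyPairSync, sign, verify, KeyObject } from "node:crypto";

const MAX_TTL_MS = 24 * 60 * 60 * 1000; // 24h ceiling on attestation lifetime

/** Raw 32-byte ed25519 public key as hex (the tail of the SPKI DER encoding). */
function pubkeyHex(key: KeyObject): string {
  const der = key.export({ format: "der", type: "spki" });
  return der.subarray(der.length - 32).toString("hex");
}

/** Canonical bytes: "claudemesh-session-attest/" || session_pubkey || "/" || expires_at. */
function attestationBytes(sessionPubkey: string, expiresAt: string): Uint8Array {
  return new TextEncoder().encode(`claudemesh-session-attest/${sessionPubkey}/${expiresAt}`);
}

interface ParentAttestation {
  sessionPubkey: string;      // hex
  parentMemberPubkey: string; // hex
  expiresAt: string;          // ISO 8601
  signature: string;          // hex
}

// Stage 1 (CLI side): the member key vouches for the session pubkey.
function mintAttestation(
  memberPriv: KeyObject,
  memberPub: KeyObject,
  sessionPub: KeyObject,
  ttlMs: number,
  now: number,
): ParentAttestation {
  const sessionPubkey = pubkeyHex(sessionPub);
  const expiresAt = new Date(now + ttlMs).toISOString();
  const signature = sign(null, attestationBytes(sessionPubkey, expiresAt), memberPriv).toString("hex");
  return { sessionPubkey, parentMemberPubkey: pubkeyHex(memberPub), expiresAt, signature };
}

// Broker side: bound the lifetime, then verify the parent's signature.
function checkAttestation(att: ParentAttestation, memberPub: KeyObject, now: number): boolean {
  const expires = Date.parse(att.expiresAt);
  if (!Number.isFinite(expires) || expires <= now || expires > now + MAX_TTL_MS) return false;
  if (att.parentMemberPubkey !== pubkeyHex(memberPub)) return false;
  return verify(
    null,
    attestationBytes(att.sessionPubkey, att.expiresAt),
    memberPub,
    Buffer.from(att.signature, "hex"),
  );
}

// Stage 2 (daemon side): the session key signs the broker-issued nonce.
function signNonce(sessionPriv: KeyObject, nonce: string): string {
  return sign(null, new TextEncoder().encode(nonce), sessionPriv).toString("hex");
}

function checkNonce(sessionPub: KeyObject, nonce: string, signatureHex: string): boolean {
  return verify(null, new TextEncoder().encode(nonce), sessionPub, Buffer.from(signatureHex, "hex"));
}
```

A captured attestation alone is not enough to connect, because stage 2 needs the session secret key. That is why only the attestation, and not the member key, has to reach the daemon.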

### Registry persistence

For now, in-memory only (matching 1.29.0). Daemon restart drops all
session WSes; launched `claude` processes are responsible for
re-registering on next CLI invocation. Acceptable v1 behaviour;
revisit when sqlite persistence lands for the registry.

## Wire changes

### Broker

- New `session_hello` message type (additive; existing `hello` for
  member auth unchanged).
- `presence` row schema unchanged: `member_id` is still required, but
  `session_pubkey` differs from the member's stable pubkey.
- Validate `parent_attestation.expires_at <= now() + 24h` to bound
  attestation reuse.

### Daemon

- New `SessionBrokerClient` factory that wraps `BrokerClient` with
  session-mode hello.
- `Map<token, SessionBrokerClient>` alongside the existing
  `Map<slug, DaemonBrokerClient>`.
- IPC routes:
  - `POST /v1/sessions/register`: extend body schema with
    `parent_attestation`.
  - `DELETE /v1/sessions/:token`: close the session WS first, then
    drop registry entry.

### CLI (`claudemesh launch`)

- Mint session keypair (today only writes the session token; need to
  add ed25519 keypair generation per launch and write the privkey
  alongside the token).
- Sign `parent_attestation` with the member key from the joined-mesh
  config.
- POST register with both the new keypair and the attestation.

## LoC estimate

- Daemon `SessionBrokerClient` + registry hook: ~120 LoC.
- IPC route schema extension + validation: ~40 LoC.
- Broker `session_hello` handler + tests: ~140 LoC.
- CLI `claudemesh launch` keypair + attestation: ~60 LoC.
- Tests + smoke: ~80 LoC.

Total: **~440 LoC** across CLI + daemon + broker.

## Risks

| Risk | Mitigation |
|---|---|
| Member private key never leaves the user's machine, but the **attestation** (signed token) can be replayed within its TTL. | TTL bound 24h; refresh on launch; revocation path = drop the parent member's mesh enrollment (nuclear, but works). |
| Cascading WS connections: N launches = N+1 broker WSes per user. | Acceptable up to 10-20 concurrent sessions; if it ever becomes a problem, multiplex per-session at the protocol level (one WS, multiple presence rows). Out of scope for v1. |
| Daemon restart kills all session WSes: `peer list` from inside a launched session sees the remaining 5 peers but not its own siblings until they re-register. | Same as 1.29.0 registry. The registry could persist to sqlite later; for v1, accepted. |
| Broker schema cost: every new presence row has a different `session_pubkey`, growing the table faster. | Already accepted: broker prunes disconnected rows on a 30-day window. Per-session keys triple the row count at peak but stay within the prune budget. |

## Compatibility

- **Older brokers** can't validate `session_hello`. Sessions will
  attempt the new hello, get back `unknown_message_type`, and fall
  back to the existing member-keyed hello (no per-session presence,
  but everything still works as 1.28.0). Add the broker change first,
  let it deploy, then ship the CLI side.
- **Older CLIs** continue to work unchanged: they don't open
  per-session WSes. They appear as ephemeral cold-path rows just like
  today, and lose the symmetric-visibility property between siblings.
- **Backward visibility:** users on 1.30.0+ on the same mesh as users on
  ≤1.29.x will see the older users as one row (their daemon) instead
  of one row per session. Acceptable: users opt in to the new visibility
  by upgrading.
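
The older-broker fallback in the first bullet can be sketched as a pure decision function. The names `HelloKind`, `HelloOutcome`, `NextStep`, and `planNextHello` are hypothetical; only the `session_hello` / `hello` message types and the `unknown_message_type` error code come from this spec.

```typescript
type HelloKind = "session_hello" | "hello";

type HelloOutcome = { ok: true } | { ok: false; error: string };

type NextStep =
  | { action: "connected"; via: HelloKind }
  | { action: "retry"; next: HelloKind }
  | { action: "fail"; error: string };

/**
 * Decide what to do after one hello attempt.
 * Only an `unknown_message_type` reply to `session_hello` triggers the
 * downgrade; any other rejection is treated as a real failure so that an
 * auth error is never masked by a silent fallback.
 */
function planNextHello(attempted: HelloKind, outcome: HelloOutcome): NextStep {
  if (outcome.ok) return { action: "connected", via: attempted };
  if (attempted === "session_hello" && outcome.error === "unknown_message_type") {
    return { action: "retry", next: "hello" };
  }
  return { action: "fail", error: outcome.error };
}
```

Keeping the decision separate from the socket code makes the downgrade path easy to test against both old and new brokers.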

## Sequencing

1. **Broker change ships first.** Add `session_hello` handler, deploy,
   bake for ~24h. No CLI behaviour change yet.
2. **Daemon `SessionBrokerClient` ships next** behind a feature flag
   (`CLAUDEMESH_SESSION_PRESENCE=1`). Manually test with two launched
   sessions in the same cwd; verify both see each other.
3. **CLI keypair-mint + attestation in `launch.ts` ships last**, behind
   the same flag.
4. Flip the flag default in 1.30.0 release; document rollback via env.
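
The flag handling in steps 2 to 4 could look like the sketch below. The helper name `sessionPresenceEnabled` is hypothetical, and treating any value other than `1` (for example `0`) as "off" is an assumption; this spec only names `CLAUDEMESH_SESSION_PRESENCE=1`.

```typescript
/**
 * Resolve the session-presence feature flag.
 * - unset or empty: use the release default (off before 1.30.0, on after the flip)
 * - "1": enabled
 * - anything else: disabled (assumed rollback path)
 */
function sessionPresenceEnabled(
  env: Record<string, string | undefined>,
  defaultOn: boolean,
): boolean {
  const raw = env["CLAUDEMESH_SESSION_PRESENCE"];
  if (raw === undefined || raw === "") return defaultOn;
  return raw === "1";
}
```

A caller would pass `process.env` plus the release default, so flipping the default in step 4 is a one-argument change and an explicit env value still overrides it.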

## Verification

End-to-end smoke (paste into 1.30.0's CHANGELOG):

```
$ # In two different shells, both cd ~/Desktop/foo:
$ claudemesh launch --name SessionA -y    # shell 1
$ claudemesh launch --name SessionB -y    # shell 2
$
$ # In a third shell:
$ claudemesh peer list --json --mesh foo | jq '.[] | {n: .displayName, c: .cwd}'
{ "n": "SessionA", "c": "/.../foo" }   ← persistent, not query-induced
{ "n": "SessionB", "c": "/.../foo" }
$
$ # In SessionA's shell:
$ claudemesh peer list --mesh foo
  should include SessionB.
$
$ # Kill SessionB (Ctrl-C in shell 2). Wait up to 30 s for the reaper tick.
$ claudemesh peer list --mesh foo
  should NOT include SessionB (reaper closed its WS).
```

## Open questions

- Should the per-session WS also drain *its own* outbox subset, or stay
  presence-only? Recommend presence-only for v1: it keeps the state machines
  simple, and the daemon's member-keyed WS handles all sends. Can be revisited
  when per-session policy DSL ships.
- Should the parent attestation be revocable mid-session? Could add an
  IPC route on the daemon. Out of scope for v1; revoke = drop the
  whole member enrollment.

@@ -1013,7 +1013,7 @@ export async function topicHistory(args: {
     ORDER BY tm.created_at DESC, tm.id DESC
     LIMIT ${limit}
   `);
-  const rows = (result.rows ?? result) as Array<{
+  const rows = ((result as unknown as { rows?: unknown[] }).rows ?? (result as unknown as unknown[])) as Array<{
     id: string;
     sender_member_id: string;
     sender_pubkey: string;
@@ -1442,7 +1442,7 @@ export async function recallMemory(
     ORDER BY ts_rank(search_vector, plainto_tsquery('english', ${query})) DESC
     LIMIT 20
   `);
-  const rows = (result.rows ?? result) as Array<{
+  const rows = ((result as unknown as { rows?: unknown[] }).rows ?? (result as unknown as unknown[])) as Array<{
     id: string;
     content: string;
     tags: string[];
@@ -2010,7 +2010,7 @@ export async function getContext(
     ORDER BY updated_at DESC
     LIMIT 20
   `);
-  const rows = (result.rows ?? result) as Array<{
+  const rows = ((result as unknown as { rows?: unknown[] }).rows ?? (result as unknown as unknown[])) as Array<{
     peer_name: string | null;
     summary: string;
     files_read: string[] | null;
@@ -2419,7 +2419,7 @@ export async function drainForMember(
     SELECT * FROM claimed ORDER BY created_at ASC, id ASC
   `);

-  const rows = (result.rows ?? result) as Array<{
+  const rows = ((result as unknown as { rows?: unknown[] }).rows ?? (result as unknown as unknown[])) as Array<{
     id: string;
     priority: string;
     nonce: string;
@@ -2665,7 +2665,11 @@ export async function findMemberByPubkey(
       ),
     )
     .limit(1);
-  return row ?? null;
+  if (!row) return null;
+  return {
+    ...row,
+    defaultGroups: row.defaultGroups ?? [],
+  };
 }

 // --- Mesh databases (per-mesh PostgreSQL schemas) ---
@@ -2719,7 +2723,7 @@ export async function meshQuery(
       sql.raw(`SET LOCAL search_path TO "${schema}"`)
     );
     const result = await tx.execute(sql.raw(query));
-    const rows = (result.rows ?? []) as Array<Record<string, unknown>>;
+    const rows = ((result as unknown as { rows?: unknown[] }).rows ?? (result as unknown as unknown[])) as Array<Record<string, unknown>>;
     const columns = rows.length > 0 ? Object.keys(rows[0]!) : [];
     return { columns, rows, rowCount: rows.length };
   });
@@ -2762,7 +2766,7 @@ export async function meshSchema(
     WHERE table_schema = ${schema}
     ORDER BY table_name, ordinal_position
   `);
-  const rows = (result.rows ?? result) as Array<{
+  const rows = ((result as unknown as { rows?: unknown[] }).rows ?? (result as unknown as unknown[])) as Array<{
     table_name: string;
     column_name: string;
     data_type: string;

@@ -138,6 +138,128 @@ export async function sealRootKeyToRecipient(params: {

 export const HELLO_SKEW_MS = 60_000;

+/** Maximum lifetime of a parent attestation (24h). */
+export const SESSION_ATTESTATION_MAX_TTL_MS = 24 * 60 * 60 * 1000;
+
+/**
+ * Canonical bytes for a parent-vouches-session attestation.
+ *
+ * The parent member signs this with their stable ed25519 secret key when
+ * minting an attestation in `claudemesh launch`. The broker recomputes
+ * the same string at session_hello time and verifies the signature
+ * against `parent_member_pubkey`.
+ *
+ * Format: `claudemesh-session-attest|<parent_pubkey>|<session_pubkey>|<expires_at_ms>`
+ */
+export function canonicalSessionAttestation(
+  parentMemberPubkey: string,
+  sessionPubkey: string,
+  expiresAt: number,
+): string {
+  return `claudemesh-session-attest|${parentMemberPubkey}|${sessionPubkey}|${expiresAt}`;
+}
+
+/**
+ * Canonical bytes for the session_hello signature.
+ *
+ * The session keypair (held by the daemon for the lifetime of the
+ * registration) signs this fresh on every WS connect, proving liveness +
+ * possession of the session secret key. Without this stage, an attacker
+ * who captured an attestation could replay it from any machine.
+ *
+ * Format: `claudemesh-session-hello|<mesh_id>|<parent_pubkey>|<session_pubkey>|<timestamp_ms>`
+ */
+export function canonicalSessionHello(
+  meshId: string,
+  parentMemberPubkey: string,
+  sessionPubkey: string,
+  timestamp: number,
+): string {
+  return `claudemesh-session-hello|${meshId}|${parentMemberPubkey}|${sessionPubkey}|${timestamp}`;
+}
+
+/**
+ * Validate a parent-vouches-session attestation: lifetime bound + signature.
+ * Returns `{ ok: true }` on success or `{ ok: false, reason }` on failure.
+ *
+ * The TTL ceiling (24h) bounds replay damage if an attestation leaks; the
+ * lower bound (already in the past) blocks reuse of expired ones.
+ */
+export async function verifySessionAttestation(args: {
+  parentMemberPubkey: string;
+  sessionPubkey: string;
+  expiresAt: number;
+  signature: string;
+  now?: number;
+}): Promise<
+  | { ok: true }
+  | { ok: false; reason: "expired" | "ttl_too_long" | "bad_signature" | "malformed" }
+> {
+  const now = args.now ?? Date.now();
+  if (!Number.isFinite(args.expiresAt)) {
+    return { ok: false, reason: "malformed" };
+  }
+  if (args.expiresAt <= now) {
+    return { ok: false, reason: "expired" };
+  }
+  if (args.expiresAt > now + SESSION_ATTESTATION_MAX_TTL_MS) {
+    return { ok: false, reason: "ttl_too_long" };
+  }
+  if (
+    !/^[0-9a-f]{64}$/i.test(args.parentMemberPubkey) ||
+    !/^[0-9a-f]{64}$/i.test(args.sessionPubkey) ||
+    !/^[0-9a-f]{128}$/i.test(args.signature)
+  ) {
+    return { ok: false, reason: "malformed" };
+  }
+  const canonical = canonicalSessionAttestation(
+    args.parentMemberPubkey,
+    args.sessionPubkey,
+    args.expiresAt,
+  );
+  const ok = await verifyEd25519(canonical, args.signature, args.parentMemberPubkey);
+  return ok ? { ok: true } : { ok: false, reason: "bad_signature" };
+}
+
+/**
+ * Validate the session-side hello signature: timestamp skew + signature
+ * by the session keypair over canonical session-hello bytes.
+ */
+export async function verifySessionHelloSignature(args: {
+  meshId: string;
+  parentMemberPubkey: string;
+  sessionPubkey: string;
+  timestamp: number;
+  signature: string;
+  now?: number;
+}): Promise<
+  | { ok: true }
+  | { ok: false; reason: "timestamp_skew" | "bad_signature" | "malformed" }
+> {
+  const now = args.now ?? Date.now();
+  if (
+    !Number.isFinite(args.timestamp) ||
+    Math.abs(now - args.timestamp) > HELLO_SKEW_MS
+  ) {
+    return { ok: false, reason: "timestamp_skew" };
+  }
+  if (
+    !/^[0-9a-f]{64}$/i.test(args.parentMemberPubkey) ||
+    !/^[0-9a-f]{64}$/i.test(args.sessionPubkey) ||
+    !/^[0-9a-f]{128}$/i.test(args.signature)
+  ) {
+    return { ok: false, reason: "malformed" };
+  }
+  const canonical = canonicalSessionHello(
+    args.meshId,
+    args.parentMemberPubkey,
+    args.sessionPubkey,
+    args.timestamp,
+  );
+  const ok = await verifyEd25519(canonical, args.signature, args.sessionPubkey);
+  return ok ? { ok: true } : { ok: false, reason: "bad_signature" };
+}
+
 /**
  * Verify a hello's ed25519 signature + timestamp skew.
  * Returns { ok: true } on success, or { ok: false, reason } describing

@@ -23,7 +23,7 @@ const envSchema = z.object({
   MINIO_ENDPOINT: z.string().default("minio:9000"),
   MINIO_ACCESS_KEY: z.string().default("claudemesh"),
   MINIO_SECRET_KEY: z.string().default("changeme"),
-  MINIO_USE_SSL: z.enum(["true", "false", ""]).transform(v => v === "true").default("false"),
+  MINIO_USE_SSL: z.enum(["true", "false", ""]).default("false").transform(v => v === "true"),
   QDRANT_URL: z.string().default("http://qdrant:6333"),
   NEO4J_URL: z.string().default("bolt://neo4j:7687"),
   NEO4J_USER: z.string().default("neo4j"),

@@ -22,7 +22,7 @@ import { invite as inviteTable, mesh, meshMember, messageQueue, presence, schedu
 import { user } from "@turbostarter/db/schema/auth";
 import { handleCliSync, type CliSyncRequest } from "./cli-sync";
 import { generateId } from "@turbostarter/shared/utils";
-import { updateMemberProfile, listMeshMembers, updateMeshSettings } from "./member-api";
+import { updateMemberProfile, listMeshMembers, updateMeshSettings, type MemberUpdateRequest, type SelfEditablePolicy } from "./member-api";
 import {
   claimTask,
   completeTask,
@@ -115,7 +115,7 @@ import { metrics, metricsToText } from "./metrics";
 import { TokenBucket } from "./rate-limit";
 import { isDbHealthy, startDbHealth, stopDbHealth } from "./db-health";
 import { buildInfo } from "./build-info";
-import { canonicalInvite, canonicalInviteV2, claimInviteV2Core as _claimInviteV2Core, sealRootKeyToRecipient, verifyHelloSignature, verifyInviteV2 } from "./crypto";
+import { canonicalInvite, canonicalInviteV2, claimInviteV2Core as _claimInviteV2Core, sealRootKeyToRecipient, verifyHelloSignature, verifyInviteV2, verifySessionAttestation, verifySessionHelloSignature } from "./crypto";
 // Alias for in-module callers; the public re-export below surfaces the
 // same symbol without colliding with tests that import from index.ts.
 const claimInviteV2Core = _claimInviteV2Core;
@@ -831,7 +831,12 @@ function handleHttpRequest(req: IncomingMessage, res: ServerResponse): void {
     req.on("data", (c: Buffer) => chunks.push(c));
     req.on("end", () => {
       try {
-        const body = JSON.parse(Buffer.concat(chunks).toString());
+        const body = JSON.parse(Buffer.concat(chunks).toString()) as {
+          meshId?: string;
+          memberId?: string;
+          pubkey?: string;
+          secretKey?: string;
+        };
         const { meshId: tgMeshId, memberId: tgMemberId, pubkey: tgPubkey, secretKey: tgSecretKey } = body;
         if (!tgMeshId || !tgMemberId || !tgPubkey || !tgSecretKey) {
           writeJson(res, 400, { error: "meshId, memberId, pubkey, secretKey required" });
@@ -1099,7 +1104,7 @@ function handleInviteClaimV2Post(
     const raw = Buffer.concat(chunks).toString();
     let payload: { recipient_x25519_pubkey?: string; display_name?: string };
     try {
-      payload = JSON.parse(raw);
+      payload = JSON.parse(raw) as { recipient_x25519_pubkey?: string; display_name?: string };
     } catch {
       writeJson(res, 400, { error: "malformed" });
       return;
@@ -1197,7 +1202,7 @@ async function handleUploadPost(
   let tags: string[] = [];
   if (tagsRaw) {
     try {
-      tags = JSON.parse(tagsRaw);
+      tags = JSON.parse(tagsRaw) as string[];
     } catch {
       tags = [];
     }
@@ -1259,7 +1264,7 @@ async function handleUploadPost(
   let fileKeys: Array<{ peerPubkey: string; sealedKey: string }> = [];
   if (encrypted && fileKeysRaw) {
     try {
-      fileKeys = JSON.parse(fileKeysRaw);
+      fileKeys = JSON.parse(fileKeysRaw) as Array<{ peerPubkey: string; sealedKey: string }>;
     } catch { /* ignore */ }
   }

@@ -1364,7 +1369,7 @@ function handleMemberPatchPost(req: IncomingMessage, res: ServerResponse, meshId
   req.on("end", async () => {
     if (aborted) return;
     try {
-      const body = JSON.parse(Buffer.concat(chunks).toString());
+      const body = JSON.parse(Buffer.concat(chunks).toString()) as MemberUpdateRequest;
       // Auth: callerMemberId from X-Member-Id header (dashboard or CLI provides this)
       const callerMemberId = req.headers["x-member-id"] as string | undefined;
       if (!callerMemberId) { writeJson(res, 401, { ok: false, error: "X-Member-Id header required" }); return; }
@@ -1407,7 +1412,7 @@ function handleMeshSettingsPatch(req: IncomingMessage, res: ServerResponse, mesh
   req.on("end", async () => {
     if (aborted) return;
     try {
-      const body = JSON.parse(Buffer.concat(chunks).toString());
+      const body = JSON.parse(Buffer.concat(chunks).toString()) as { selfEditable?: SelfEditablePolicy };
       const callerMemberId = req.headers["x-member-id"] as string | undefined;
       if (!callerMemberId) { writeJson(res, 401, { ok: false, error: "X-Member-Id header required" }); return; }
       const result = await updateMeshSettings(meshId, callerMemberId, body);
@@ -1821,6 +1826,220 @@ async function handleHello(
   };
 }

+/**
+ * Authenticate + presence-register a per-launch session WebSocket.
+ *
+ * Two-stage proof: parent member's pre-signed attestation vouches the
+ * session pubkey, and the session keypair signs the hello timestamp to
+ * prove possession. The presence row is keyed on `sessionPubkey` but
+ * `member_id` points at the parent member, so member-targeted operations
+ * (revocation, send-by-member-pubkey) keep working unchanged.
+ *
+ * Spec: .artifacts/specs/2026-05-04-per-session-presence.md.
+ */
+async function handleSessionHello(
+  ws: WebSocket,
+  hello: Extract<WSClientMessage, { type: "session_hello" }>,
+): Promise<{
+  presenceId: string;
+  memberDisplayName: string;
+  memberProfile?: unknown;
+  meshPolicy?: Record<string, unknown>;
+} | null> {
+  // Shape checks. The crypto helpers also enforce these but bailing
+  // early gives a clearer error code on the wire.
+  if (!/^[0-9a-f]{64}$/.test(hello.sessionPubkey ?? "")) {
+    metrics.connectionsRejected.inc({ reason: "bad_session_pubkey" });
+    sendError(ws, "bad_session_pubkey", "sessionPubkey must be 64 lowercase hex chars");
+    ws.close(1008, "bad_session_pubkey");
+    return null;
+  }
+  if (!/^[0-9a-f]{64}$/.test(hello.parentMemberPubkey ?? "")) {
+    metrics.connectionsRejected.inc({ reason: "bad_parent_pubkey" });
+    sendError(ws, "bad_parent_pubkey", "parentMemberPubkey must be 64 lowercase hex chars");
+    ws.close(1008, "bad_parent_pubkey");
+    return null;
+  }
+  const att = hello.parentAttestation;
+  if (
+    !att ||
+    typeof att !== "object" ||
+    att.sessionPubkey !== hello.sessionPubkey ||
+    att.parentMemberPubkey !== hello.parentMemberPubkey
+  ) {
+    metrics.connectionsRejected.inc({ reason: "attestation_mismatch" });
+    sendError(ws, "attestation_mismatch", "parentAttestation does not bind the claimed session+parent pubkeys");
+    ws.close(1008, "attestation_mismatch");
+    return null;
+  }
+
+  // Capacity check BEFORE touching DB.
+  const existing = connectionsPerMesh.get(hello.meshId) ?? 0;
+  if (existing >= env.MAX_CONNECTIONS_PER_MESH) {
+    metrics.connectionsRejected.inc({ reason: "capacity" });
+    log.warn("mesh at capacity (session_hello)", {
+      mesh_id: hello.meshId,
+      existing,
+      cap: env.MAX_CONNECTIONS_PER_MESH,
+    });
+    sendError(ws, "capacity", "mesh at connection capacity");
+    ws.close(1008, "capacity");
+    return null;
+  }
+
+  // 1. Parent attestation: TTL bounds + signature against parent pubkey.
+  const attCheck = await verifySessionAttestation({
+    parentMemberPubkey: hello.parentMemberPubkey,
+    sessionPubkey: hello.sessionPubkey,
+    expiresAt: att.expiresAt,
+    signature: att.signature,
+  });
+  if (!attCheck.ok) {
+    metrics.connectionsRejected.inc({ reason: `attestation_${attCheck.reason}` });
+    log.warn("session_hello attestation rejected", {
+      reason: attCheck.reason,
+      mesh_id: hello.meshId,
+      parent_pubkey: hello.parentMemberPubkey.slice(0, 12),
+      session_pubkey: hello.sessionPubkey.slice(0, 12),
+    });
+    sendError(ws, attCheck.reason, `attestation rejected: ${attCheck.reason}`);
+    ws.close(1008, attCheck.reason);
+    return null;
+  }
+
+  // 2. Session signature: timestamp skew + ed25519 against sessionPubkey.
+  const sigCheck = await verifySessionHelloSignature({
+    meshId: hello.meshId,
+    parentMemberPubkey: hello.parentMemberPubkey,
+    sessionPubkey: hello.sessionPubkey,
+    timestamp: hello.timestamp,
+    signature: hello.signature,
+  });
+  if (!sigCheck.ok) {
+    metrics.connectionsRejected.inc({ reason: `session_${sigCheck.reason}` });
+    log.warn("session_hello sig rejected", {
+      reason: sigCheck.reason,
+      mesh_id: hello.meshId,
+      session_pubkey: hello.sessionPubkey.slice(0, 12),
+    });
+    sendError(ws, sigCheck.reason, `session_hello rejected: ${sigCheck.reason}`);
+    ws.close(1008, sigCheck.reason);
+    return null;
+  }
+
+  // 3. Parent member must exist + be active in the claimed mesh.
+  const member = await findMemberByPubkey(hello.meshId, hello.parentMemberPubkey);
+  if (!member) {
+    const [revokedRow] = await db
+      .select({ displayName: meshMember.displayName, revokedAt: meshMember.revokedAt })
+      .from(meshMember)
+      .where(and(eq(meshMember.meshId, hello.meshId), eq(meshMember.peerPubkey, hello.parentMemberPubkey)))
+      .limit(1);
+    if (revokedRow?.revokedAt) {
+      metrics.connectionsRejected.inc({ reason: "revoked" });
+      const [m] = await db.select({ slug: mesh.slug, name: mesh.name }).from(mesh).where(eq(mesh.id, hello.meshId)).limit(1);
+      const meshLabel = m?.name || m?.slug || hello.meshId;
+      sendError(
+        ws,
+        "revoked",
+        `You've been removed from "${meshLabel}". Contact the mesh owner to rejoin.`,
+      );
+      ws.close(4002, "banned");
+      log.info("session_hello rejected: revoked parent", { mesh_id: hello.meshId, display_name: revokedRow.displayName });
+      return null;
+    }
+    metrics.connectionsRejected.inc({ reason: "unauthorized" });
+    sendError(ws, "unauthorized", "parent pubkey not found in mesh");
+    ws.close(1008, "unauthorized");
+    return null;
+  }
+  // The parentMemberId in the hello must match the member we resolved by
+  // pubkey — otherwise the daemon claims membership it doesn't have.
+  if (hello.parentMemberId && hello.parentMemberId !== member.id) {
+    metrics.connectionsRejected.inc({ reason: "parent_member_id_mismatch" });
+    sendError(ws, "parent_member_id_mismatch", "parentMemberId does not match parentMemberPubkey");
+    ws.close(1008, "parent_member_id_mismatch");
+    return null;
+  }
+
+  // Load mesh policy (best-effort; non-fatal).
+  let meshPolicy: Record<string, unknown> | undefined;
+  try {
+    const [m] = await db
+      .select({ selfEditable: mesh.selfEditable })
+      .from(mesh)
+      .where(eq(mesh.id, hello.meshId));
+    if (m?.selfEditable) meshPolicy = { selfEditable: m.selfEditable };
+  } catch { /* non-fatal */ }
+
+  const initialGroups = hello.groups ?? member.defaultGroups ?? [];
+
+  // Session-id dedup: if the same session_id is already connected, kick
+  // the ghost. Reconnect after a network blip lands here cleanly.
+  for (const [oldPid, oldConn] of connections) {
+    if (oldConn.meshId === hello.meshId && oldConn.sessionId === hello.sessionId) {
+      log.info("session_hello dedup", { old_presence: oldPid, session_id: hello.sessionId });
+      try { oldConn.ws.close(1000, "session_replaced"); } catch { /* already dead */ }
+      connections.delete(oldPid);
+      void disconnectPresence(oldPid);
+    }
+  }
+
+  const presenceId = await connectPresence({
+    memberId: member.id,
+    sessionId: hello.sessionId,
+    sessionPubkey: hello.sessionPubkey,
+    displayName: hello.displayName,
+    pid: hello.pid,
+    cwd: hello.cwd,
+    groups: initialGroups,
+  });
+  const effectiveDisplayName = hello.displayName || member.displayName;
+  connections.set(presenceId, {
+    ws,
+    meshId: hello.meshId,
+    memberId: member.id,
+    memberPubkey: hello.parentMemberPubkey,
+    sessionId: hello.sessionId,
+    sessionPubkey: hello.sessionPubkey,
+    displayName: effectiveDisplayName,
|
||||
cwd: hello.cwd,
|
||||
hostname: hello.hostname,
|
||||
peerType: hello.peerType,
|
||||
channel: hello.channel,
|
||||
model: hello.model,
|
||||
groups: initialGroups,
|
||||
visible: true,
|
||||
profile: {},
|
||||
});
|
||||
incMeshCount(hello.meshId);
|
||||
void audit(hello.meshId, "peer_joined", member.id, effectiveDisplayName, {
|
||||
pubkey: hello.parentMemberPubkey,
|
||||
session_pubkey: hello.sessionPubkey,
|
||||
groups: initialGroups,
|
||||
via: "session_hello",
|
||||
});
|
||||
log.info("ws session_hello", {
|
||||
mesh_id: hello.meshId,
|
||||
member: effectiveDisplayName,
|
||||
presence_id: presenceId,
|
||||
session_id: hello.sessionId,
|
||||
session_pubkey: hello.sessionPubkey.slice(0, 12),
|
||||
});
|
||||
// Drain any DMs queued for this session pubkey (or the parent member).
|
||||
void maybePushQueuedMessages(presenceId);
|
||||
return {
|
||||
presenceId,
|
||||
memberDisplayName: effectiveDisplayName,
|
||||
memberProfile: {
|
||||
roleTag: member.roleTag,
|
||||
groups: member.defaultGroups ?? [],
|
||||
messageMode: member.messageMode ?? "push",
|
||||
},
|
||||
meshPolicy,
|
||||
};
|
||||
}
|
||||
|
||||
async function handleSend(
|
||||
conn: PeerConn,
|
||||
msg: Extract<WSClientMessage, { type: "send" }>,
|
||||
@@ -2171,6 +2390,53 @@ function handleConnection(ws: WebSocket): void {
try {
  const msg = JSON.parse(raw.toString()) as WSClientMessage;
  const _reqId = (msg as any)._reqId as string | undefined;
  if (msg.type === "session_hello") {
    const result = await handleSessionHello(ws, msg);
    if (!result) return;
    presenceId = result.presenceId;
    try {
      const ackPayload: Record<string, unknown> = {
        type: "hello_ack",
        presenceId: result.presenceId,
        memberDisplayName: result.memberDisplayName,
        memberProfile: result.memberProfile,
        ...(result.meshPolicy ? { meshPolicy: result.meshPolicy } : {}),
      };
      ws.send(JSON.stringify(ackPayload));
    } catch {
      /* ws closed during hello */
    }
    // Broadcast peer_joined to siblings — same shape as the regular
    // hello path, so list_peers consumers don't need to special-case.
    const joinedConn = connections.get(presenceId);
    if (joinedConn) {
      const joinMsg: WSPushMessage = {
        type: "push",
        subtype: "system",
        event: "peer_joined",
        eventData: {
          name: result.memberDisplayName,
          pubkey: joinedConn.sessionPubkey ?? joinedConn.memberPubkey,
          groups: joinedConn.groups,
        },
        messageId: crypto.randomUUID(),
        meshId: joinedConn.meshId,
        senderPubkey: "system",
        priority: "low",
        nonce: "",
        ciphertext: "",
        createdAt: new Date().toISOString(),
      };
      for (const [pid, peer] of connections) {
        if (pid === presenceId) continue;
        if (peer.meshId !== joinedConn.meshId) continue;
        // Same-member sibling sessions get the join — a per-launch
        // session is meant to be visible to the user's other launches.
        sendToPeer(pid, joinMsg);
      }
    }
    return;
  }
  if (msg.type === "hello") {
    const result = await handleHello(ws, msg);
    if (!result) return;
@@ -3492,7 +3758,7 @@ function handleConnection(ws: WebSocket): void {
const gqRecords = gqResult.records.map((r) => {
  const obj: Record<string, unknown> = {};
  for (const key of r.keys) {
    obj[key] = r.get(key);
    obj[String(key)] = r.get(key);
  }
  return obj;
});
@@ -3527,7 +3793,7 @@ function handleConnection(ws: WebSocket): void {
const geRecords = geResult.records.map((r) => {
  const obj: Record<string, unknown> = {};
  for (const key of r.keys) {
    obj[key] = r.get(key);
    obj[String(key)] = r.get(key);
  }
  return obj;
});
@@ -3616,10 +3882,10 @@ function handleConnection(ws: WebSocket): void {
const [peers, stateEntries, memCount, fileCount, taskCounts, streams, tables] = await Promise.all([
  listPeersInMesh(conn.meshId),
  listState(conn.meshId),
  db.execute(sql`SELECT COUNT(*) as n FROM mesh.memory WHERE mesh_id = ${conn.meshId} AND forgotten_at IS NULL`).then(r => Number(((r.rows ?? r) as any[])[0]?.n ?? 0)),
  db.execute(sql`SELECT COUNT(*) as n FROM mesh.file WHERE mesh_id = ${conn.meshId} AND deleted_at IS NULL`).then(r => Number(((r.rows ?? r) as any[])[0]?.n ?? 0)),
  db.execute(sql`SELECT COUNT(*) as n FROM mesh.memory WHERE mesh_id = ${conn.meshId} AND forgotten_at IS NULL`).then(r => Number((((r as unknown as { rows?: unknown[] }).rows ?? (r as unknown as unknown[])) as any[])[0]?.n ?? 0)),
  db.execute(sql`SELECT COUNT(*) as n FROM mesh.file WHERE mesh_id = ${conn.meshId} AND deleted_at IS NULL`).then(r => Number((((r as unknown as { rows?: unknown[] }).rows ?? (r as unknown as unknown[])) as any[])[0]?.n ?? 0)),
  db.execute(sql`SELECT status, COUNT(*) as n FROM mesh.task WHERE mesh_id = ${conn.meshId} GROUP BY status`).then(r => {
    const rows = (r.rows ?? r) as Array<{ status: string; n: string }>;
    const rows = (((r as unknown as { rows?: unknown[] }).rows ?? (r as unknown as unknown[]))) as Array<{ status: string; n: string }>;
    const counts = { open: 0, claimed: 0, done: 0 };
    for (const row of rows) counts[row.status as keyof typeof counts] = Number(row.n);
    return counts;
@@ -86,7 +86,7 @@ export async function verifySyncToken(
  }

  // Decode header — must be HS256
  const header = JSON.parse(new TextDecoder().decode(base64UrlDecode(headerB64)));
  const header = JSON.parse(new TextDecoder().decode(base64UrlDecode(headerB64))) as { alg?: string };
  if (header.alg !== "HS256") {
    return { ok: false, error: `unsupported algorithm: ${header.alg}` };
  }
@@ -31,7 +31,7 @@ export interface MemberPermissionUpdate {

export type MemberUpdateRequest = MemberProfileUpdate & MemberPermissionUpdate;

interface SelfEditablePolicy {
export interface SelfEditablePolicy {
  displayName: boolean;
  roleTag: boolean;
  groups: boolean;
@@ -115,11 +115,11 @@ function lastAssistantHasToolUse(filePath: string): boolean {
if (!line) continue;
if (!line.includes('"assistant"')) continue;
try {
  const d = JSON.parse(line);
  const d = JSON.parse(line) as { type?: string; message?: { content?: unknown } };
  if (d.type !== "assistant") continue;
  const content = d.message?.content;
  if (!Array.isArray(content)) continue;
  return content.some((c: { type?: string }) => c.type === "tool_use");
  return (content as Array<{ type?: string }>).some((c) => c.type === "tool_use");
} catch {
  /* malformed line, skip */
}
@@ -169,7 +169,7 @@ function detectEntry(
try {
  const pkg = JSON.parse(
    readFileSync(join(sourcePath, "package.json"), "utf-8"),
  );
  ) as { main?: string; bin?: string | Record<string, string> };
  if (pkg.main) return { command: cmd, args: [pkg.main] };
  if (pkg.bin) {
    const bin =
@@ -372,7 +372,7 @@ function spawnService(svc: ManagedService): void {
const rl = createInterface({ input: child.stdout! });
rl.on("line", (line) => {
  try {
    const msg = JSON.parse(line);
    const msg = JSON.parse(line) as { id?: string | number; error?: { message?: string }; result?: unknown };
    if (msg.id && svc.pendingCalls.has(String(msg.id))) {
      const pending = svc.pendingCalls.get(String(msg.id))!;
      clearTimeout(pending.timer);
@@ -13,6 +13,7 @@ import { Bot, InputFile } from "grammy";
import WebSocket from "ws";
import sodium from "libsodium-wrappers";
import { validateTelegramConnectToken } from "./telegram-token";
import { log } from "./logger";

// ---------------------------------------------------------------------------
// Types
@@ -22,11 +23,12 @@ export interface BridgeRow {
  chatId: number;
  meshId: string;
  meshSlug?: string;
  memberId: string;
  /** memberId can be null until the bridge claims a mesh.member row. */
  memberId: string | null;
  pubkey: string;
  secretKey: string;
  displayName: string;
  chatType: string;
  displayName: string | null;
  chatType: string | null;
  chatTitle: string | null;
}

@@ -228,7 +230,7 @@ class MeshConnection {

ws.on("message", async (raw) => {
  try {
    const msg = JSON.parse(raw.toString());
    const msg = JSON.parse(raw.toString()) as Record<string, any>;

    if (msg.type === "hello_ack") {
      clearTimeout(helloTimeout);
@@ -674,8 +676,8 @@ function createPushHandler(bot: Bot) {
  for (const chatId of chatIds) {
    bot.api
      .sendMessage(chatId, formatted)
      .catch((e) => {
        console.error(`[tg-bridge] send to chat ${chatId} failed:`, e.message);
      .catch((e: unknown) => {
        console.error(`[tg-bridge] send to chat ${chatId} failed:`, e instanceof Error ? e.message : String(e));
      });
  }
};
@@ -1729,11 +1731,12 @@ async function executeAiToolCall(
for (const meshId of meshIds) {
  const services = await listDbMeshServices(meshId);
  for (const s of services) {
    const sx = s as Record<string, unknown>;
    allServices.push({
      name: s.name,
      type: s.type ?? "mcp",
      tools: s.tool_count ?? 0,
      status: s.status ?? "running",
      name: String(sx.name ?? ""),
      type: String(sx.type ?? "mcp"),
      tools: Number(sx.tool_count ?? 0),
      status: String(sx.status ?? "running"),
    });
  }
}
@@ -1841,6 +1844,9 @@ export async function bootTelegramBridge(
for (const [meshId, meshRows] of byMesh) {
  const first = meshRows[0]!;
  try {
    // memberId/displayName come back from DB nullable; bridge only
    // works once both are populated, so skip rows missing either.
    if (!first.memberId || !first.displayName) continue;
    await ensureMeshConnection(
      {
        meshId,
@@ -102,11 +102,11 @@ export function validateTelegramConnectToken(
if (!timingSafeEqual(a, b)) return null;

// Verify header algorithm
const header = JSON.parse(base64urlDecode(headerB64));
const header = JSON.parse(base64urlDecode(headerB64)) as { alg?: string };
if (header.alg !== "HS256") return null;

// Decode and validate claims
const claims: JwtClaims = JSON.parse(base64urlDecode(payloadB64));
const claims = JSON.parse(base64urlDecode(payloadB64)) as JwtClaims;

// Check subject
if (claims.sub !== "telegram-connect") return null;
@@ -90,6 +90,66 @@ export interface WSHelloMessage {
  signature: string;
}

/**
 * Client → broker: per-launch session hello, vouched by the parent member.
 *
 * Used by the daemon's per-session WebSocket connections (1.30.0+) so that
 * each `claudemesh launch`-spawned session has its own long-lived presence
 * row owned by an ephemeral session keypair. The parent member key vouches
 * (out-of-band) that the session pubkey is theirs; the session keypair
 * proves liveness on every connect.
 *
 * Two-stage proof:
 *   1. `parentAttestation.signature` — ed25519 over
 *      `claudemesh-session-attest|<parent_pubkey>|<session_pubkey>|<expires_at_ms>`
 *      signed by the parent member's stable secret key. TTL ≤ 24h.
 *   2. `signature` — ed25519 over
 *      `claudemesh-session-hello|<mesh_id>|<parent_pubkey>|<session_pubkey>|<timestamp>`
 *      signed by the session secret key (held by the daemon for the
 *      lifetime of the session registration).
 *
 * Older brokers don't recognize this message type and reply with
 * `unknown_message_type`; clients fall back to the legacy `hello` flow.
 */
export interface WSSessionHelloMessage {
  type: "session_hello";
  /** Highest WS protocol version the client understands. */
  protocolVersion?: number;
  /** Optional feature strings the client supports. */
  capabilities?: string[];
  meshId: string;
  /** Parent member's id (mesh.member.id) — used for revocation lookup. */
  parentMemberId: string;
  /** Parent member's stable ed25519 pubkey (hex), as found in mesh.member. */
  parentMemberPubkey: string;
  /** Per-launch ephemeral ed25519 pubkey (hex). Routes presence + DMs. */
  sessionPubkey: string;
  /** Pre-signed attestation by the parent member, presented per session. */
  parentAttestation: {
    sessionPubkey: string;
    parentMemberPubkey: string;
    /** Unix ms; broker rejects past or > now+24h. */
    expiresAt: number;
    signature: string;
  };
  /** Display name override for this session (optional, falls back to member). */
  displayName?: string;
  sessionId: string;
  pid: number;
  cwd: string;
  hostname?: string;
  peerType?: "ai" | "human" | "connector";
  channel?: string;
  model?: string;
  groups?: Array<{ name: string; role?: string }>;
  /** Initial role tag for the session. */
  role?: string;
  /** ms epoch; broker rejects if outside ±60s of its own clock. */
  timestamp: number;
  /** ed25519 signature (hex) by the SESSION secret key over canonical bytes. */
  signature: string;
}

/** Client → broker: send an E2E-encrypted envelope to a target. */
export interface WSSendMessage {
  type: "send";
@@ -110,6 +170,10 @@ export interface WSSendMessage {
   * Server validates same-topic membership; FK is set null if parent
   * later disappears. Ignored for non-topic targets. */
  replyToId?: string;
  /** Optional ciphertext-format version. 1 = v1 plaintext base64;
   * 2 = v0.3.0 phase 3 per-topic encrypted body. Server passes this
   * through verbatim into topic_message.body_version. */
  bodyVersion?: number;
}

/** Broker → client: an envelope addressed to this peer. */
@@ -1330,6 +1394,16 @@ export interface WSVaultGetMessage { type: "vault_get"; keys: string[]; _reqId?:
export interface WSWatchMessage { type: "watch"; url: string; mode?: "hash" | "json" | "status"; extract?: string; interval?: number; notify_on?: string; headers?: Record<string, string>; label?: string; _reqId?: string; }
/** Client → broker: stop watching. */
export interface WSUnwatchMessage { type: "unwatch"; watchId: string; _reqId?: string; }
/** Client → broker: soft-disconnect a peer (1000; CLI auto-reconnects). */
export interface WSDisconnectMessage { type: "disconnect"; target?: string; stale?: number; all?: boolean; _reqId?: string; }
/** Client → broker: hard-kick a peer (4001; CLI exits). */
export interface WSKickMessage { type: "kick"; target?: string; stale?: number; all?: boolean; _reqId?: string; }
/** Client → broker: ban a member by pubkey or display name. */
export interface WSBanMessage { type: "ban"; target: string; reason?: string; _reqId?: string; }
/** Client → broker: lift a ban. */
export interface WSUnbanMessage { type: "unban"; target: string; _reqId?: string; }
/** Client → broker: list active bans on the caller's mesh. */
export interface WSListBansMessage { type: "list_bans"; _reqId?: string; }
/** Client → broker: list active watches. */
export interface WSWatchListMessage { type: "watch_list"; _reqId?: string; }
/** Broker → client: watch created acknowledgement. */
@@ -1341,6 +1415,7 @@ export interface WSWatchTriggeredMessage { type: "watch_triggered"; watchId: str

export type WSClientMessage =
  | WSHelloMessage
  | WSSessionHelloMessage
  | WSSendMessage
  | WSSetStatusMessage
  | WSListPeersMessage
@@ -1433,7 +1508,12 @@ export type WSClientMessage =
  | WSVaultGetMessage
  | WSWatchMessage
  | WSUnwatchMessage
  | WSWatchListMessage;
  | WSWatchListMessage
  | WSDisconnectMessage
  | WSKickMessage
  | WSBanMessage
  | WSUnbanMessage
  | WSListBansMessage;

// --- Skill messages ---

@@ -1485,6 +1565,8 @@ export interface WSSkillDataMessage {
    instructions: string;
    tags: string[];
    author: string;
    /** Optional opaque metadata stored alongside the skill body. */
    manifest?: unknown;
    createdAt: string;
  } | null;
  _reqId?: string;
218
apps/broker/tests/session-hello-signature.test.ts
Normal file
@@ -0,0 +1,218 @@
/**
 * Session-hello signature + parent-attestation verification.
 *
 * Two-stage proof:
 *   1. Parent member signs `canonicalSessionAttestation` (long-lived, ≤24h
 *      TTL) — vouches that the session pubkey belongs to them.
 *   2. Session keypair signs `canonicalSessionHello` per WS-connect — proves
 *      liveness + possession.
 *
 * The broker rejects on any: expired/over-TTL attestation, bad signature,
 * timestamp skew, malformed hex, or a session signature made with the
 * wrong key (covers the "attestation leaked, attacker tries to use it
 * without the session secret key" case).
 */

import { beforeAll, describe, expect, test } from "vitest";
import sodium from "libsodium-wrappers";
import {
  canonicalSessionAttestation,
  canonicalSessionHello,
  verifySessionAttestation,
  verifySessionHelloSignature,
  SESSION_ATTESTATION_MAX_TTL_MS,
  HELLO_SKEW_MS,
} from "../src/crypto";

interface Keypair {
  publicKey: string;
  secretKey: string;
}

async function makeKeypair(): Promise<Keypair> {
  await sodium.ready;
  const kp = sodium.crypto_sign_keypair();
  return {
    publicKey: sodium.to_hex(kp.publicKey),
    secretKey: sodium.to_hex(kp.privateKey),
  };
}

function sign(canonical: string, secretKeyHex: string): string {
  return sodium.to_hex(
    sodium.crypto_sign_detached(
      sodium.from_string(canonical),
      sodium.from_hex(secretKeyHex),
    ),
  );
}

describe("verifySessionAttestation", () => {
  let parent: Keypair;
  let session: Keypair;

  beforeAll(async () => {
    parent = await makeKeypair();
    session = await makeKeypair();
  });

  test("valid attestation accepted", async () => {
    const expiresAt = Date.now() + 60 * 60 * 1000;
    const canonical = canonicalSessionAttestation(parent.publicKey, session.publicKey, expiresAt);
    const signature = sign(canonical, parent.secretKey);
    const result = await verifySessionAttestation({
      parentMemberPubkey: parent.publicKey,
      sessionPubkey: session.publicKey,
      expiresAt,
      signature,
    });
    expect(result.ok).toBe(true);
  });

  test("expired attestation rejected", async () => {
    const expiresAt = Date.now() - 1_000;
    const canonical = canonicalSessionAttestation(parent.publicKey, session.publicKey, expiresAt);
    const signature = sign(canonical, parent.secretKey);
    const result = await verifySessionAttestation({
      parentMemberPubkey: parent.publicKey,
      sessionPubkey: session.publicKey,
      expiresAt,
      signature,
    });
    expect(result.ok).toBe(false);
    if (!result.ok) expect(result.reason).toBe("expired");
  });

  test("over-24h TTL rejected", async () => {
    const expiresAt = Date.now() + SESSION_ATTESTATION_MAX_TTL_MS + 60_000;
    const canonical = canonicalSessionAttestation(parent.publicKey, session.publicKey, expiresAt);
    const signature = sign(canonical, parent.secretKey);
    const result = await verifySessionAttestation({
      parentMemberPubkey: parent.publicKey,
      sessionPubkey: session.publicKey,
      expiresAt,
      signature,
    });
    expect(result.ok).toBe(false);
    if (!result.ok) expect(result.reason).toBe("ttl_too_long");
  });

  test("attestation signed by wrong key rejected", async () => {
    const other = await makeKeypair();
    const expiresAt = Date.now() + 60 * 60 * 1000;
    const canonical = canonicalSessionAttestation(parent.publicKey, session.publicKey, expiresAt);
    // Sign with a different parent — verifier still checks against
    // claimed parentMemberPubkey, so it should fail.
    const signature = sign(canonical, other.secretKey);
    const result = await verifySessionAttestation({
      parentMemberPubkey: parent.publicKey,
      sessionPubkey: session.publicKey,
      expiresAt,
      signature,
    });
    expect(result.ok).toBe(false);
    if (!result.ok) expect(result.reason).toBe("bad_signature");
  });

  test("tampered session_pubkey fails (canonical mismatch)", async () => {
    const expiresAt = Date.now() + 60 * 60 * 1000;
    const canonical = canonicalSessionAttestation(parent.publicKey, session.publicKey, expiresAt);
    const signature = sign(canonical, parent.secretKey);
    const evil = await makeKeypair();
    const result = await verifySessionAttestation({
      parentMemberPubkey: parent.publicKey,
      sessionPubkey: evil.publicKey, // claim a different session pubkey
      expiresAt,
      signature,
    });
    expect(result.ok).toBe(false);
    if (!result.ok) expect(result.reason).toBe("bad_signature");
  });

  test("malformed hex rejected", async () => {
    const expiresAt = Date.now() + 60 * 60 * 1000;
    const result = await verifySessionAttestation({
      parentMemberPubkey: "not-hex",
      sessionPubkey: session.publicKey,
      expiresAt,
      signature: "a".repeat(128),
    });
    expect(result.ok).toBe(false);
    if (!result.ok) expect(result.reason).toBe("malformed");
  });
});

describe("verifySessionHelloSignature", () => {
  let parent: Keypair;
  let session: Keypair;

  beforeAll(async () => {
    parent = await makeKeypair();
    session = await makeKeypair();
  });

  test("valid session-hello signature accepted", async () => {
    const meshId = "mesh-x";
    const timestamp = Date.now();
    const canonical = canonicalSessionHello(meshId, parent.publicKey, session.publicKey, timestamp);
    const signature = sign(canonical, session.secretKey);
    const result = await verifySessionHelloSignature({
      meshId,
      parentMemberPubkey: parent.publicKey,
      sessionPubkey: session.publicKey,
      timestamp,
      signature,
    });
    expect(result.ok).toBe(true);
  });

  test("attacker without session secret key cannot forge session-hello", async () => {
    // The hostile case: attacker captured a valid attestation but doesn't
    // hold the session secret key. They try to sign session_hello with the
    // parent's key — broker checks the signature against sessionPubkey,
    // which fails because the parent didn't sign with the session key.
    const meshId = "mesh-x";
    const timestamp = Date.now();
    const canonical = canonicalSessionHello(meshId, parent.publicKey, session.publicKey, timestamp);
    const signature = sign(canonical, parent.secretKey); // wrong secret key
    const result = await verifySessionHelloSignature({
      meshId,
      parentMemberPubkey: parent.publicKey,
      sessionPubkey: session.publicKey,
      timestamp,
      signature,
    });
    expect(result.ok).toBe(false);
    if (!result.ok) expect(result.reason).toBe("bad_signature");
  });

  test("timestamp skew rejected", async () => {
    const timestamp = Date.now() - HELLO_SKEW_MS - 1_000;
    const canonical = canonicalSessionHello("mesh-x", parent.publicKey, session.publicKey, timestamp);
    const signature = sign(canonical, session.secretKey);
    const result = await verifySessionHelloSignature({
      meshId: "mesh-x",
      parentMemberPubkey: parent.publicKey,
      sessionPubkey: session.publicKey,
      timestamp,
      signature,
    });
    expect(result.ok).toBe(false);
    if (!result.ok) expect(result.reason).toBe("timestamp_skew");
  });

  test("tampered meshId fails verification", async () => {
    const timestamp = Date.now();
    const canonical = canonicalSessionHello("mesh-A", parent.publicKey, session.publicKey, timestamp);
    const signature = sign(canonical, session.secretKey);
    const result = await verifySessionHelloSignature({
      meshId: "mesh-B", // claim a different mesh
      parentMemberPubkey: parent.publicKey,
      sessionPubkey: session.publicKey,
      timestamp,
      signature,
    });
    expect(result.ok).toBe(false);
    if (!result.ok) expect(result.reason).toBe("bad_signature");
  });
});
@@ -1,5 +1,332 @@
# Changelog

## 1.30.0 (2026-05-04) — per-session broker presence

Sprint A Phase 3. Two `claudemesh launch` sessions in the same cwd now
see each other in `peer list`. Each launched session has a long-lived
broker presence row owned by the daemon, identified by a per-launch
ephemeral keypair vouched by the member's stable key
(OAuth-refresh-vs-access shape).

### What landed

- **broker `session_hello`** — new WS message type. Validates a
  parent-vouched `parent_attestation` (≤24h TTL, ed25519 signature by
  the parent member) plus a session-keyed signature on the hello
  itself. Inserts a presence row keyed on `sessionPubkey` but
  `member_id` from the parent, so member-targeted operations stay
  unchanged. Older brokers reply `unknown_message_type` — newer clients
  drop back to the previous behavior.
- **daemon `SessionBrokerClient`** — slim WS variant of
  `DaemonBrokerClient`. Presence-only, no outbox drain. Lifetime tied
  to a registry hook: register opens it, deregister/reaper closes it.
  Reconnect with exponential backoff up to 30 s.
- **session-registry hooks** — `setRegistryHooks({ onRegister,
  onDeregister })` in `apps/cli/src/daemon/session-registry.ts`. Hook
  errors are caught so they never throttle the registry. SessionInfo
  gains an optional `presence` field carrying the per-launch
  keypair + attestation.
- **IPC `POST /v1/sessions/register`** — accepts an optional
  `presence` block on the body (`session_pubkey`, `session_secret_key`,
  `parent_attestation`). Older payloads continue to work.
- **`claudemesh launch`** — generates an ed25519 session keypair and a
  12 h parent attestation per launch (mesh secret key signs it),
  forwards both to the daemon under `body.presence`. Per-session
  presence is always on; older brokers that don't recognize
  `session_hello` reply `unknown_message_type` and the daemon quietly
  drops the per-session WS for that mesh — the regular member-keyed
  WS still covers all functionality, the only loss is sibling-session
  visibility on that mesh.
- **latent 1.29.0 bug fix** — `claudemesh launch` referenced
  `claudeSessionId` before its `const` declaration further down the
  file, hitting the temporal dead zone → `ReferenceError` silently
  swallowed by the surrounding catch. Net: the IPC session-token
  registration has been failing every launch since 1.29.0, falling
  every session back to user-level scope. Hoisted the declaration up
  so the registration actually runs.
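
The two-stage proof that `session_hello` validates can be illustrated with a small sketch. It is not the broker's implementation: it swaps in Node's built-in `node:crypto` ed25519 support for the libsodium calls the broker tests use, and the helper names (`canonicalAttestation`, `canonicalHello`, `rawPubkeyHex`) are hypothetical. Only the canonical string layouts are taken from the `WSSessionHelloMessage` doc comment.

```typescript
import { generateKeyPairSync, sign, verify, KeyObject } from "node:crypto";

// Canonical strings, following the layouts in the WSSessionHelloMessage doc comment.
function canonicalAttestation(parentPubkey: string, sessionPubkey: string, expiresAt: number): string {
  return `claudemesh-session-attest|${parentPubkey}|${sessionPubkey}|${expiresAt}`;
}

function canonicalHello(meshId: string, parentPubkey: string, sessionPubkey: string, timestamp: number): string {
  return `claudemesh-session-hello|${meshId}|${parentPubkey}|${sessionPubkey}|${timestamp}`;
}

// Hypothetical helper: the SPKI DER encoding of an ed25519 public key ends
// with the 32 raw key bytes, so the hex pubkey is the tail of that buffer.
function rawPubkeyHex(key: KeyObject): string {
  const der = key.export({ type: "spki", format: "der" });
  return der.subarray(der.length - 32).toString("hex");
}

const parentKeys = generateKeyPairSync("ed25519");  // member's stable key
const sessionKeys = generateKeyPairSync("ed25519"); // per-launch ephemeral key
const parentPubkey = rawPubkeyHex(parentKeys.publicKey);
const sessionPubkey = rawPubkeyHex(sessionKeys.publicKey);

// Stage 1: the parent vouches for the session pubkey (12 h TTL, as in `claudemesh launch`).
const expiresAt = Date.now() + 12 * 60 * 60 * 1000;
const attestation = canonicalAttestation(parentPubkey, sessionPubkey, expiresAt);
const attestationSig = sign(null, Buffer.from(attestation), parentKeys.privateKey);

// Stage 2: the session key signs the hello on every connect.
const helloCanonical = canonicalHello("mesh-x", parentPubkey, sessionPubkey, Date.now());
const helloSig = sign(null, Buffer.from(helloCanonical), sessionKeys.privateKey);

// Broker side: each signature is checked against the key that should have made it.
const attestationOk = verify(null, Buffer.from(attestation), parentKeys.publicKey, attestationSig);
const helloOk = verify(null, Buffer.from(helloCanonical), sessionKeys.publicKey, helloSig);

// A leaked attestation alone is not enough: a hello signed with any other key
// (here the parent's) fails verification against the session pubkey.
const forgedSig = sign(null, Buffer.from(helloCanonical), parentKeys.privateKey);
const forgedOk = verify(null, Buffer.from(helloCanonical), sessionKeys.publicKey, forgedSig);

console.log({ attestationOk, helloOk, forgedOk });
```

Splitting the proof this way means the long-lived parent secret is only needed once per launch, while every reconnect is authenticated by the short-lived session key.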

### Sequencing

The broker side ships first and bakes for ~24 h. Older CLIs continue
working unchanged (no per-session WS), and the protocol is purely
additive on the wire.
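
An additive rollout like this relies on the per-session WS degrading quietly when the broker does not cooperate. The sketch below shows one way the daemon could decide what to do after a per-session connection closes. It is an assumption-laden illustration, not the `SessionBrokerClient` code: `backoffDelay`, `onSessionWsClosed`, the action names, and the 1 s base delay are hypothetical. Only the 30 s cap and the `unknown_message_type` and `expired` error codes come from this changelog.

```typescript
// Hypothetical action type for the per-session WS after a close or error.
type SessionWsAction =
  | { kind: "disable" }                     // older broker: stop per-session WS for this mesh
  | { kind: "quiet" }                       // attestation expired: wait for a relaunch
  | { kind: "reconnect"; delayMs: number }; // transient failure: retry with backoff

const MAX_BACKOFF_MS = 30_000; // cap stated in the changelog
const BASE_BACKOFF_MS = 1_000; // assumed starting delay

// Exponential backoff: base * 2^attempt, capped at MAX_BACKOFF_MS.
function backoffDelay(attempt: number): number {
  return Math.min(MAX_BACKOFF_MS, BASE_BACKOFF_MS * 2 ** attempt);
}

function onSessionWsClosed(errorCode: string | undefined, attempt: number): SessionWsAction {
  // The broker predates session_hello; the member-keyed WS still covers everything else.
  if (errorCode === "unknown_message_type") return { kind: "disable" };
  // Retrying with the same expired attestation cannot succeed.
  if (errorCode === "expired") return { kind: "quiet" };
  // Anything else (for example a network blip) is treated as transient.
  return { kind: "reconnect", delayMs: backoffDelay(attempt) };
}

console.log(onSessionWsClosed(undefined, 0));
console.log(onSessionWsClosed("unknown_message_type", 0));
```

Treating `unknown_message_type` and `expired` as terminal keeps the daemon from hammering a broker with a hello that can never be accepted.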

### Verification (smoke)

In two shells, both `cd ~/Desktop/foo`:

```
$ claudemesh launch --name SessionA -y   # shell 1
$ claudemesh launch --name SessionB -y   # shell 2
```

In a third shell:

```
$ claudemesh peer list --json --mesh foo \
    | jq '.[] | {n: .displayName, c: .cwd}'
{ "n": "SessionA", "c": "/.../foo" }   ← persistent, not query-induced
{ "n": "SessionB", "c": "/.../foo" }
```

Inside SessionA, `peer list --mesh foo` now lists SessionB. Kill
SessionB; within ≤30 s the reaper drops it from `peer list`.

### Out of scope (deferred)

- **Attestation auto-refresh** — current 12 h TTL is comfortably
  longer than typical sessions; if a session lives past the TTL and
  the WS reconnects after expiry, the broker rejects with `expired`
  and the SessionBrokerClient quiets. Workaround: `claudemesh launch`
  again. Auto-refresh queued for 1.31.0+ alongside HKDF identity.
- **Per-session policy DSL** — the per-launch WS could carry
  per-session capabilities later. Out of scope here.
- **Cross-machine session sync** — waits on 2.0.0 HKDF identity.
- **Launch-wizard refactor** — bumped to 1.31.0 to keep this release
  scoped to presence.
|
||||
|
||||
## 1.29.0 (2026-05-04) — per-session IPC tokens + auto-scoping
|
||||
|
||||
Sprint A Phase 2. Every `claudemesh launch`-spawned session gets a
|
||||
unique 32-byte cryptographic token that the daemon resolves on every
|
||||
IPC call to identify which session is talking to it. CLI invocations
|
||||
from inside that session auto-scope to its workspace instead of
|
||||
aggregating across every joined mesh.
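
To make the token flow concrete, here is a minimal sketch of minting and reading such a token. It is not the real `services/session/token.ts`: `mintToken` and `readToken` are hypothetical names, hex encoding is assumed from the `<hex>` in the auth header format, and only the `session-token` file name, the `0o600` mode, and the `CLAUDEMESH_IPC_TOKEN_FILE` variable come from this changelog.

```typescript
import { randomBytes } from "node:crypto";
import { mkdtempSync, readFileSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical stand-in for the minting step.
function mintToken(dir: string): { token: string; filePath: string } {
  const token = randomBytes(32).toString("hex"); // 32 bytes -> 64 hex chars
  const filePath = join(dir, "session-token");
  writeFileSync(filePath, token, { mode: 0o600 }); // owner read/write only
  return { token, filePath };
}

// Reader side: the environment carries the path, never the secret itself.
function readToken(env: Record<string, string | undefined>): string | null {
  const path = env["CLAUDEMESH_IPC_TOKEN_FILE"];
  return path ? readFileSync(path, "utf8").trim() : null;
}

const demoDir = mkdtempSync(join(tmpdir(), "claudemesh-demo-"));
const mintedDemo = mintToken(demoDir);
const roundTripped = readToken({ CLAUDEMESH_IPC_TOKEN_FILE: mintedDemo.filePath });
console.log(roundTripped === mintedDemo.token, mintedDemo.token.length);
```

Passing the file path rather than the value keeps the secret out of process listings, which is the design choice the release notes call out.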
|
||||
|
||||
### What landed
|
||||
|
||||
- **`services/session/token.ts`** — mint random 32-byte token, write
|
||||
to `<tmpdir>/session-token` (mode 0o600). Reader pulls from
|
||||
`CLAUDEMESH_IPC_TOKEN_FILE` env (path, not value, to keep the secret
|
||||
off `ps eww`). Optional `CLAUDEMESH_IPC_TOKEN` direct-value escape
|
||||
hatch for tests.
|
||||
- **`daemon/session-registry.ts`** — in-memory `Map<token,
|
||||
SessionInfo>` keyed by token, secondary index by sessionId. 30 s
|
||||
reaper drops entries whose pid is dead; a 24 h hard TTL ceiling guards against
|
||||
forgotten sessions.
|
||||
- **IPC routes** — `POST /v1/sessions/register`, `DELETE
|
||||
/v1/sessions/:token`, `GET /v1/sessions/me`, `GET /v1/sessions`.
|
||||
- **IPC auth middleware** — parses `Authorization: ClaudeMesh-Session
|
||||
<hex>` and attaches the resolved `SessionInfo` to request context.
|
||||
Layered on top of the existing local-token auth (used for TCP
|
||||
loopback). Backward-compatible: tokenless callers behave exactly
|
||||
as before.
|
||||
- **`services/session/resolve.ts`** — CLI-side helper that asks the
|
||||
daemon `GET /v1/sessions/me` once per process and caches the result.
|
||||
Used by verbs that iterate meshes client-side.
|
||||
- **`launch.ts`** — mints a token, registers it with the daemon, sets
|
||||
`CLAUDEMESH_IPC_TOKEN_FILE` on the spawned `claude` env. Token file
|
||||
lives in the same tmpdir as the session config; gets shredded on
|
||||
cleanup. The daemon's reaper handles dead sessions.
|
||||
- **`peers.ts`** — selection precedence is now `--mesh` flag → session
|
||||
token's mesh → all joined meshes.
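
The registry described in the list above can be sketched as follows. This is an illustrative reduction, not the real `daemon/session-registry.ts`: the `SessionInfo` fields, `SessionRegistry` methods, and `pidAlive` helper are assumptions; only the token-keyed `Map`, the secondary index by session id, the dead-pid reaping, and the 24 h ceiling come from the changelog.

```typescript
interface SessionInfo {
  token: string;
  sessionId: string;
  mesh: string;
  pid: number;
  registeredAt: number; // epoch ms
}

const HARD_TTL_MS = 24 * 60 * 60 * 1000;

// Signal 0 checks that a pid exists without delivering a signal;
// EPERM means the process exists but belongs to someone else.
function pidAlive(pid: number): boolean {
  try {
    process.kill(pid, 0);
    return true;
  } catch (err) {
    return (err as { code?: string }).code === "EPERM";
  }
}

class SessionRegistry {
  private byToken = new Map<string, SessionInfo>();
  private bySessionId = new Map<string, string>(); // sessionId -> token

  register(info: SessionInfo): void {
    this.byToken.set(info.token, info);
    this.bySessionId.set(info.sessionId, info.token);
  }

  resolve(token: string): SessionInfo | undefined {
    return this.byToken.get(token);
  }

  // One reaper pass; the changelog describes running this every 30 s.
  reap(now: number, isAlive: (pid: number) => boolean = pidAlive): number {
    let dropped = 0;
    for (const [token, info] of this.byToken) {
      const expired = now - info.registeredAt > HARD_TTL_MS;
      if (expired || !isAlive(info.pid)) {
        this.byToken.delete(token);
        this.bySessionId.delete(info.sessionId);
        dropped++;
      }
    }
    return dropped;
  }
}

const registry = new SessionRegistry();
registry.register({ token: "t-live", sessionId: "s1", mesh: "foo", pid: process.pid, registeredAt: Date.now() });
registry.register({ token: "t-old", sessionId: "s2", mesh: "foo", pid: process.pid, registeredAt: Date.now() - HARD_TTL_MS - 1 });
const droppedCount = registry.reap(Date.now());
console.log(droppedCount, registry.resolve("t-live") !== undefined);
```

Injecting `isAlive` keeps the reaper testable without spawning real processes.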
|
||||
|
||||
### Server-side scoping
|
||||
|
||||
Every read route that takes `?mesh=<slug>` (peers, state, memory,
|
||||
skills) now uses a `meshFromCtx()` helper: explicit query/body wins,
|
||||
session default fills in when missing. Write routes (set state,
|
||||
remember, deregister, profile-update) follow the same pattern. Pass
|
||||
`--mesh` to override.
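
The precedence rule above fits in a few lines. A minimal sketch, assuming a request context with `query`, `body`, and `session` fields; the `RequestCtx` shape and the query-before-body ordering are assumptions, and only the helper name `meshFromCtx` and the "explicit wins, session default fills in" rule come from the changelog.

```typescript
interface RequestCtx {
  query: Record<string, string | undefined>;
  body?: { mesh?: string };
  // Set by the auth middleware when a session token resolved.
  session?: { mesh: string };
}

// Explicit query/body value wins; the session default fills in when it is
// missing; null means "no scope", i.e. the tokenless aggregate behaviour.
function meshFromCtx(ctx: RequestCtx): string | null {
  return ctx.query["mesh"] ?? ctx.body?.mesh ?? ctx.session?.mesh ?? null;
}

const explicitScope = meshFromCtx({ query: { mesh: "alpha" }, session: { mesh: "prueba1" } });
const sessionScope = meshFromCtx({ query: {}, session: { mesh: "prueba1" } });
const noScope = meshFromCtx({ query: {} });
console.log(explicitScope, sessionScope, noScope);
```

Returning `null` rather than throwing is what keeps tokenless callers on the pre-1.29.0 aggregate path.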
|
||||
|
||||
### Verified end-to-end
|
||||
|
||||
| Setup | `peer list` returns |
|---|---|
| no token | 3 meshes' peers (aggregate, unchanged) |
| token registered for prueba1 | 4 peers, all `mesh: prueba1` |
|
||||
|
||||
### Out of scope (deferred)
|
||||
|
||||
- SQLite persistence for the registry — restart loses it; the reaper
|
||||
(or callers re-registering) covers most cases.
|
||||
- `SO_PEERCRED`-strict pid binding — needs a tiny native binding.
|
||||
- Per-session policy DSL.
|
||||
- Cross-machine session sync (waiting on 2.0.0 HKDF identity).
|
||||
|
||||
## 1.28.0 (2026-05-04) — bridge tier deletion + daemon-policy flags
|
||||
|
||||
First Sprint A drop on the way to v2 thin-client. Two structural changes:
|
||||
|
||||
### Bridge tier deletion
|
||||
|
||||
- `services/bridge/{client,server,protocol}.ts` removed (~600 LoC).
|
||||
These were the per-mesh push-pipe sockets that the legacy MCP shim
|
||||
used to hold open; the 1.24.0 shim rewrite stopped opening them but
|
||||
the orphaned client kept being called as a "warm path" tier between
|
||||
daemon and cold. `tryBridge()` always returned `null` in production
|
||||
for the last seven releases — pure dead code.
|
||||
- Each verb now has two paths only: **daemon (with auto-spawn)** →
|
||||
**cold WS**. Same pattern shipped in 1.27.3, simpler to follow.
|
||||
- `commands/{peers,send,broker-actions}.ts` — bridge-tier blocks
|
||||
removed; orphaned `unambiguousMesh` helper removed from
|
||||
broker-actions.
|
||||
|
||||
### `--no-daemon` and `--strict` flags
|
||||
|
||||
New per-process daemon policy:
|
||||
|
||||
| Flag | Behavior |
|---|---|
| (default) | probe → auto-spawn → retry → cold fallback |
| `--strict` | probe → auto-spawn → retry → **error** if all fail. No cold fallback. |
| `--no-daemon` | skip daemon entirely → straight to cold path. For sandboxed CI / scripts that don't want a daemon. |
|
||||
|
||||
Env equivalents: `CLAUDEMESH_STRICT_DAEMON=1`, `CLAUDEMESH_NO_DAEMON=1`.
|
||||
Flag wins over env. `--no-daemon` and `--strict` are mutually
|
||||
exclusive (`--no-daemon` wins if both are passed).
|
||||
|
||||
Strict-mode enforcement lives at `withMesh` (the cold-path entry
|
||||
point) so a single chokepoint covers every verb. Under `--strict`,
|
||||
the lifecycle's misleading "using cold path" warning is suppressed
|
||||
so the user sees one clean error instead of a confusing two-step.
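
The flag and env precedence can be sketched as a pure function. This is an illustration under stated assumptions, not the real `services/daemon/policy.ts`: `resolveDaemonMode`, `PolicyFlags`, and the `"default"` / `"no-daemon"` mode names are hypothetical (only `"strict"` appears in the diff), and letting `CLAUDEMESH_NO_DAEMON` beat `CLAUDEMESH_STRICT_DAEMON` when only env vars are set is a guess that mirrors the flag rule.

```typescript
type DaemonMode = "default" | "strict" | "no-daemon";

interface PolicyFlags {
  strict?: boolean;
  noDaemon?: boolean;
}

function resolveDaemonMode(
  flags: PolicyFlags,
  env: Record<string, string | undefined>,
): DaemonMode {
  // Flags take precedence over the environment.
  if (flags.noDaemon) return "no-daemon"; // wins when both flags are passed
  if (flags.strict) return "strict";
  // No flag given: fall back to the env equivalents.
  // Assumption: NO_DAEMON is checked first, mirroring the flag rule.
  if (env["CLAUDEMESH_NO_DAEMON"] === "1") return "no-daemon";
  if (env["CLAUDEMESH_STRICT_DAEMON"] === "1") return "strict";
  return "default";
}

const bothFlags = resolveDaemonMode({ strict: true, noDaemon: true }, {});
const flagOverEnv = resolveDaemonMode({ strict: true }, { CLAUDEMESH_NO_DAEMON: "1" });
const envOnly = resolveDaemonMode({}, { CLAUDEMESH_STRICT_DAEMON: "1" });
console.log(bothFlags, flagOverEnv, envOnly);
```

Resolving the mode once and checking it at a single chokepoint is what lets `withMesh` enforce `--strict` for every verb.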
|
||||
|
||||
### What's not in this release (planned for the rest of Sprint A)
|
||||
|
||||
- 1.29.0: per-session IPC tokens + auto-scoping
|
||||
- 1.30.0: launch wizard refactor
|
||||
- 1.31.0: setup wizard refactor
|
||||
- 1.32.0: full mesh→workspace public-surface rename
|
||||
- 2.0.0 (separate sprint): HKDF cross-machine identity (security-reviewed)
|
||||
|
||||
## 1.27.3 (2026-05-04) — self-healing daemon lifecycle
|
||||
|
||||
The CLI now auto-recovers from a dead daemon on every invocation
|
||||
instead of silently mis-routing through a stale socket.
|
||||
|
||||
### What changed
|
||||
|
||||
- New `services/daemon/lifecycle.ts` — single helper that probes the
|
||||
IPC socket via `/v1/version` (instead of trusting `existsSync`),
|
||||
cleans up stale `daemon.sock` / `daemon.pid` files, and auto-spawns
|
||||
a detached `claudemesh daemon up` under a file-lock when the daemon
|
||||
is missing.
|
||||
- Polls for socket liveness up to a budget (3 s for ad-hoc verbs,
|
||||
10 s for `claudemesh launch`) before falling through.
|
||||
- Recently-failed marker (`~/.claudemesh/daemon/.spawn-failure`,
|
||||
30 s TTL) prevents thundering-herd retries when the daemon
|
||||
crash-loops at startup.
|
||||
- Spawn-lock (`~/.claudemesh/daemon/.spawn.lock`) ensures concurrent
|
||||
CLI invocations share one spawn attempt instead of racing.
|
||||
- Per-process result cache — a script doing 50 sends pays the spawn
|
||||
cost at most once, not 50 times.
|
||||
- Recursion guard via `CLAUDEMESH_INTERNAL_NO_AUTOSPAWN=1` env (set
|
||||
on the spawned daemon's env) so nested CLI calls inside the daemon
|
||||
process don't re-trigger spawn.
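
The recently-failed marker from the list above can be sketched with a file timestamp check. This is a minimal illustration, not the real `lifecycle.ts`: `recentlyFailed` is a hypothetical helper, using the file's mtime as the failure time is an assumption, and the demo writes to a temp directory instead of `~/.claudemesh/daemon/`.

```typescript
import { mkdtempSync, statSync, utimesSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

const MARKER_TTL_MS = 30_000;

// True when a failure marker exists and is younger than the TTL.
function recentlyFailed(markerPath: string, now: number = Date.now()): boolean {
  try {
    return now - statSync(markerPath).mtimeMs < MARKER_TTL_MS;
  } catch {
    return false; // no marker: nothing failed recently
  }
}

const markerDir = mkdtempSync(join(tmpdir(), "claudemesh-marker-"));
const markerPath = join(markerDir, ".spawn-failure");

const beforeFailure = recentlyFailed(markerPath);
writeFileSync(markerPath, ""); // a spawn attempt just failed
const freshFailure = recentlyFailed(markerPath);

// Backdate the marker by 60 s to simulate an expired one.
const backdated = new Date(Date.now() - 60_000);
utimesSync(markerPath, backdated, backdated);
const staleFailure = recentlyFailed(markerPath);

console.log(beforeFailure, freshFailure, staleFailure);
```

While the marker is fresh, callers skip auto-spawn and go straight to the cold path, which is how a crash-looping daemon avoids being respawned by every concurrent CLI call.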
|
||||
|
||||
### User-visible behavior
|
||||
|
||||
- `peer list`, `send`, `state get`, etc. now restart the daemon
|
||||
automatically when invoked while the daemon is down.
|
||||
- One-line stderr info on auto-restart:
|
||||
`[claudemesh] info daemon restarted automatically (took 615ms)`.
|
||||
- Cold-path fallback fires only when auto-spawn fails or is
|
||||
suppressed by the recently-failed marker; in those cases a `warn`
|
||||
line points at the daemon log.
|
||||
|
||||
### Bug fixed
|
||||
|
||||
`claudemesh launch`'s `ensureDaemonRunning` previously checked only
|
||||
`existsSync(SOCK_FILE)` and returned early on a stale socket left by
|
||||
a crashed daemon — silently breaking new sessions. Now delegates to
|
||||
the lifecycle helper which probes the socket and recovers.
|
||||
|
||||
### What's not in this patch
|
||||
|
||||
- `--strict` and `--no-daemon` flags (deferred to D in 1.28.0).
|
||||
- Lazy-loading of cold-path code (deferred to 1.28.0).
|
||||
- Per-session IPC tokens (deferred to 1.28.0 alongside D's
|
||||
thin-client conversion).
|
||||
|
||||
## 1.27.2 (2026-05-04) — skill: full-flag launch templates
|
||||
|
||||
Documentation-only ship. `skills/claudemesh/SKILL.md` gains a canonical
|
||||
"fully-populated spawn" recipe under "Wizard-free spawn templates" —
|
||||
every flag set explicitly, with a per-position annotation table — so
|
||||
agents and humans copy-paste a known-good kitchen-sink command instead
|
||||
of stitching one together from the flag table.
|
||||
|
||||
Also corrects two pre-existing inaccuracies:
|
||||
- `--system-prompt` was documented as forwarding to
|
||||
`claude --append-system-prompt`. It actually forwards to
|
||||
`claude --system-prompt` (overrides the default; pass a string, not a
|
||||
path).
|
||||
- `-q` was listed as a synonym for `--quiet`. The argv parser treats
|
||||
short flags (`-X`) and long flags (`--xyz`) as separate keys; only
|
||||
`--quiet` is wired. `-q` is currently a no-op.
|
||||
|
||||
Carries a note that all twelve launch flags are end-to-end wired only as
|
||||
of `claudemesh-cli@1.27.1`.
|
||||
|
||||
## 1.27.1 (2026-05-04) — wire missing launch flags
|
||||
|
||||
Fixes a wiring bug in `apps/cli/src/entrypoints/cli.ts` where six flags
|
||||
declared on `LaunchFlags` were silently dropped on the way to
|
||||
`runLaunch`. They were honored *inside* `runLaunch` if they ever arrived,
|
||||
but the four `runLaunch({...})` call sites in the CLI entrypoint each
|
||||
forwarded a hardcoded 5-key subset (`mesh, name, join, yes, resume`).
|
||||
|
||||
Now forwarded at every entry point (bare command, bare invite URL,
|
||||
`launch`/`connect`, `workspace launch`):
|
||||
|
||||
- `--role <r>` — sets session role; previously only settable via wizard.
|
||||
- `--groups "frontend:lead,reviewers"` — comma-separated groups string.
|
||||
- `--message-mode push|inbox|off` — message delivery mode.
|
||||
- `--system-prompt <text>` — passes through to `claude`.
|
||||
- `--continue` — passes through to `claude` to continue last session.
|
||||
- `--quiet` — actually suppresses the wizard and banner now. Previously
|
||||
it was a complete no-op flag at the CLI layer.
|
||||
|
||||
No internal logic changed; the launch internals already read these.
|
||||
This is a pure plumbing fix.
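
For reference, the `--groups` string format listed above can be parsed along these lines. A hedged sketch: `parseGroups`, `formatGroup`, and `GroupEntry` are hypothetical names rather than the CLI's real helpers; the `name` / `role` fields and the `@name:role` rendering follow how `launch.ts` formats `parsedGroups` in its registration body.

```typescript
interface GroupEntry {
  name: string;
  role?: string;
}

// Splits "frontend:lead,reviewers" into structured entries.
function parseGroups(spec: string): GroupEntry[] {
  return spec
    .split(",")
    .map((part) => part.trim())
    .filter((part) => part.length > 0)
    .map((part) => {
      const [name = "", role] = part.split(":");
      return role ? { name, role } : { name };
    });
}

// Renders an entry as `@name` or `@name:role`.
function formatGroup(group: GroupEntry): string {
  return `@${group.name}${group.role ? `:${group.role}` : ""}`;
}

const parsedDemo = parseGroups("frontend:lead,reviewers");
console.log(parsedDemo.map(formatGroup).join(" "));
```

Entries without a role stay role-less here; applying the `--role` default to them would be a separate step.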
|
||||
|
||||
## 1.27.0 (2026-05-04) — state + memory through the daemon, workspace alias
|
||||
|
||||
Two more verb families now route through the local daemon's IPC for the
|
||||
warm path: `state get/set/list` and `remember/recall/forget`. Same
|
||||
pattern as 1.25.0 for peers/skills — try the socket first (~1 ms warm),
|
||||
fall back to the cold WS path when the daemon isn't running.
|
||||
|
||||
### What changed
|
||||
|
||||
- `claudemesh state get|set|list` route through `/v1/state` when the
|
||||
daemon socket is present. `--mesh <slug>` forwards as a query/body
|
||||
field. Single-mesh daemons auto-pick; multi-mesh daemons require
|
||||
`--mesh` for `state set`.
|
||||
- `claudemesh remember`, `claudemesh recall`, `claudemesh forget`
|
||||
(and `claudemesh memory <sub>`) route through `/v1/memory`.
|
||||
Aggregates across attached meshes for `recall`; requires `--mesh`
|
||||
for `remember`/`forget` when ambiguous.
|
||||
- New `claudemesh workspace <verb>` alias surface — early teaser for
|
||||
the 1.28.0 mesh→workspace public rename. Mirrors `list`, `info`,
|
||||
`create`, `join`, `delete`, `rename`, `share`, `launch`, `overview`.
|
||||
No-arg `claudemesh workspace` falls through to `launch` (same as
|
||||
bare `claudemesh`).
|
||||
|
||||
### IPC surface
|
||||
|
||||
- `GET /v1/state` — list (`?mesh=<slug>` filter) or single key lookup
|
||||
(`?key=<k>&mesh=<slug>`). Returns 404 with `{ error: "state_not_found" }`
|
||||
when missing.
|
||||
- `POST /v1/state` — `{ key, value, mesh? }`. 400 + attached list when
|
||||
multi-mesh and no `mesh` field.
|
||||
- `GET /v1/memory?q=<query>&mesh=<slug>` — recall. Aggregates across
|
||||
meshes, each match tagged with its `mesh` field.
|
||||
- `POST /v1/memory` — `{ content, tags?, mesh? }`. Returns
|
||||
`{ id, mesh }`.
|
||||
- `DELETE /v1/memory/:id?mesh=<slug>` — forget.
|
||||
- `ipc_features` gains `state` and `memory` keys.
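
A client for the routes listed above mostly needs to build the right paths. The helpers below are a sketch, not the CLI's real IPC client: `statePath` and `memoryDeletePath` are hypothetical names, and only the route shapes and the `key`, `mesh`, and `:id` parameters come from the list.

```typescript
// Builds the GET /v1/state path for a list or a single-key lookup.
function statePath(opts: { key?: string; mesh?: string }): string {
  const params = new URLSearchParams();
  if (opts.key) params.set("key", opts.key);
  if (opts.mesh) params.set("mesh", opts.mesh);
  const query = params.toString();
  return query ? `/v1/state?${query}` : "/v1/state";
}

// Builds the DELETE /v1/memory/:id path, optionally scoped to a mesh.
function memoryDeletePath(id: string, mesh?: string): string {
  const base = `/v1/memory/${encodeURIComponent(id)}`;
  return mesh ? `${base}?mesh=${encodeURIComponent(mesh)}` : base;
}

const listAll = statePath({});
const singleKey = statePath({ key: "build", mesh: "foo" });
const forgetOne = memoryDeletePath("abc123", "foo");
console.log(listAll, singleKey, forgetOne);
```

Omitting `mesh` leaves the choice to the daemon, which auto-picks on single-mesh setups and rejects ambiguous writes on multi-mesh ones.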
|
||||
|
||||
### Why this matters
|
||||
|
||||
State and memory were the last verbs that opened a fresh broker WS on
|
||||
every invocation. Now they reuse the daemon's existing connection — the
|
||||
warm-path latency cliff (~150 ms cold WS handshake → ~1 ms IPC) extends
|
||||
to two more flows agents poll heavily.
|
||||
|
||||
The `workspace` alias is cosmetic but lays the groundwork for 1.28.0's
|
||||
documented rename without breaking anyone's muscle memory.
|
||||
|
||||
## 1.26.0 (2026-05-04) — multi-mesh daemon
|
||||
|
||||
The daemon now attaches to **all joined meshes simultaneously** by
|
||||
|
||||
@@ -1,6 +1,6 @@
|
||||
{
|
||||
"name": "claudemesh-cli",
|
||||
"version": "1.26.0",
|
||||
"version": "1.30.0",
|
||||
"description": "Peer mesh for Claude Code sessions — CLI + MCP server.",
|
||||
"keywords": [
|
||||
"claude-code",
|
||||
|
||||
@@ -80,15 +80,56 @@ Once `claudemesh install` has run (registers MCP entry + starts daemon service),
|
||||
| `--groups "name:role,name2:role2,all"` | the group selection prompt | comma-separated `<groupname>:<role>` entries; the literal `all` joins `@all` |
|
||||
| `--role <lead\|member\|observer>` | the role prompt | applied to all groups in `--groups` that didn't specify their own |
|
||||
| `--message-mode <push\|inbox>` | the message-mode prompt | `push` (default) emits `<channel>` notifications mid-turn; `inbox` only buffers — quieter for headless agents |
|
||||
| `--system-prompt <path>` | nothing — pure pass-through | forwarded to `claude --append-system-prompt` |
|
||||
| `--system-prompt <text>` | nothing — pure pass-through | forwarded to `claude --system-prompt` (overrides default; pass a string, not a path) |
|
||||
| `--resume <session-id>` | nothing — pure pass-through | forwarded to `claude --resume` to continue a prior Claude Code session |
|
||||
| `--continue` | nothing — pure pass-through | forwarded to `claude --continue` |
|
||||
| `--continue` | nothing — pure pass-through | forwarded to `claude --continue` (resumes the last session in this cwd) |
|
||||
| `-y` / `--yes` | every confirmation prompt | including the "you'll skip ALL permission prompts" gate. **Use for autonomous agents; omit for shared/multi-person meshes.** |
|
||||
| `-q` / `--quiet` | the welcome banner | useful when the spawning script wants clean stdout |
|
||||
| `--quiet` | the wizard + welcome banner | suppresses the launch wizard and banner. Combine with `-y` for true headless: `--quiet` alone won't bypass Claude's permission prompts, so a script using only `--quiet` will hang on the first tool call. |
|
||||
| `--` | (separator) | everything after `--` is forwarded verbatim to `claude`. Example: `claudemesh launch --name X -y -- --resume abc123 --model opus` |
|
||||
|
||||
> **All twelve flags are end-to-end wired as of `claudemesh-cli@1.27.1`.** Earlier builds silently dropped `--role`, `--groups`, `--message-mode`, `--system-prompt`, `--continue`, and `--quiet` at the CLI entrypoint — they were declared but never reached `runLaunch`. If a script targets older versions, those flags are no-ops.
|
||||
|
||||
### Wizard-free spawn templates
|
||||
|
||||
#### Canonical fully-populated spawn (every flag set explicitly)
|
||||
|
||||
The kitchen-sink form — copy, set every value, and the session boots without a single interactive prompt or banner. Use as a base when scripting from cron, hooks, CI, or another agent:
|
||||
|
||||
```bash
claudemesh launch \
  --name "ci-bot" \
  --mesh openclaw \
  --role member \
  --groups "frontend:lead,reviewers:observer,all" \
  --message-mode inbox \
  --system-prompt "$(cat ~/agents/ci-bot.md)" \
  --quiet \
  -y \
  -- \
  --model opus \
  --resume "$LAST_SESSION_ID"
```
|
||||
|
||||
Annotated:
|
||||
|
||||
| Position | Value | Effect |
|---|---|---|
| `--name "ci-bot"` | identity | what peers see in `peer list` and `<channel from_name>` — pin so peers always see the same name across machines |
| `--mesh openclaw` | workspace | required when you have ≥2 joined meshes; safe to include even with 1 (becomes a no-op assertion) |
| `--role member` | session label | free-form tag used by group conventions; common values: `lead`, `member`, `observer`, `bot`, `oncall` |
| `--groups "frontend:lead,..."` | group memberships | comma-separated `<group>:<role>` pairs; bare `all` joins `@all` with no role |
| `--message-mode inbox` | delivery | `push` interrupts mid-turn (default); `inbox` buffers silently; `off` disables messages but keeps tool calls |
| `--system-prompt "..."` | claude system prompt | overrides Claude's default. Pass a string, not a path — wrap with `$(cat …)` if you keep prompts in files |
| `--quiet` | output | suppress the wizard and banner — clean stdout for the spawning script |
| `-y` | consent | skips every permission prompt (claudemesh's policy gate **and** Claude's `--dangerously-skip-permissions`). Required for true headless |
| `--` | separator | everything after is passed verbatim to `claude` |
| `--model opus` | claude flag | example claude-side override |
| `--resume "$LAST_SESSION_ID"` | claude flag | resume a prior Claude session inside this mesh identity |
|
||||
|
||||
**Rule of thumb:** for any unattended spawn, the minimum is `--name + --mesh + -y + --quiet`. Add `--system-prompt` to seed task context, `--message-mode inbox` to keep the bot quiet, and `--role` + `--groups` so peers know how to address it. Drop `--quiet` when a human is watching the script's stdout.
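
When the spawning script is itself TypeScript, the rule of thumb can be encoded as an argv builder. A minimal sketch: `buildLaunchArgs` and `UnattendedSpawn` are hypothetical names for illustration, and only the flags themselves come from the table above.

```typescript
interface UnattendedSpawn {
  name: string;
  mesh: string;
  systemPrompt?: string;
  messageMode?: "push" | "inbox" | "off";
  quiet?: boolean; // defaults to true for unattended runs
}

// Produces the argv for `claudemesh`, always including the unattended
// minimum: --name, --mesh, -y, and (unless disabled) --quiet.
function buildLaunchArgs(spawn: UnattendedSpawn): string[] {
  const args = ["launch", "--name", spawn.name, "--mesh", spawn.mesh];
  if (spawn.messageMode) args.push("--message-mode", spawn.messageMode);
  if (spawn.systemPrompt) args.push("--system-prompt", spawn.systemPrompt);
  if (spawn.quiet ?? true) args.push("--quiet");
  args.push("-y");
  return args;
}

const ciBotArgs = buildLaunchArgs({ name: "ci-bot", mesh: "openclaw", messageMode: "inbox" });
console.log(ciBotArgs.join(" "));
```

Passing the result as an argument array to a process spawner avoids shell quoting issues with multi-line system prompts.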
|
||||
|
||||
#### Trimmed templates
|
||||
|
||||
```bash
|
||||
# Minimal — single joined mesh, fresh agent, autonomous:
|
||||
claudemesh launch --name "Lug Nut" -y
|
||||
@@ -109,9 +150,9 @@ claudemesh launch --name "Mou" --mesh openclaw -y -- --resume abc123-...
|
||||
|
||||
# Quiet, headless, system-prompt loaded — for cron / hooks:
|
||||
claudemesh launch --name "ci-bot" --mesh openclaw \
|
||||
--system-prompt /path/to/ci-bot.md \
|
||||
--system-prompt "$(cat ~/agents/ci-bot.md)" \
|
||||
--message-mode inbox \
|
||||
-q -y
|
||||
--quiet -y
|
||||
```
|
||||
|
||||
If any required flag is missing AND stdin is a TTY, `launch` falls back to its prompt for that single field. **In a non-TTY context (Bash tool, cron, AppleScript pipe), missing flags cause the verb to fail closed; it never silently uses a default that affects identity.**
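
The prompt-or-fail rule can be expressed as a small decision function. This is a sketch of the described behaviour, not the CLI's implementation: `resolveRequiredField` and the `FieldResolution` union are hypothetical, and the error wording is invented for the example.

```typescript
type FieldResolution =
  | { kind: "value"; value: string }
  | { kind: "prompt" }
  | { kind: "fail"; reason: string };

// Decides what to do about one required launch field.
function resolveRequiredField(
  field: string,
  value: string | undefined,
  stdinIsTty: boolean,
): FieldResolution {
  if (value !== undefined && value !== "") return { kind: "value", value };
  if (stdinIsTty) return { kind: "prompt" }; // a human can answer
  // Non-interactive: refuse instead of guessing an identity-affecting default.
  return { kind: "fail", reason: `missing --${field} in a non-interactive context` };
}

const withFlag = resolveRequiredField("name", "ci-bot", false);
const interactive = resolveRequiredField("name", undefined, true);
const headless = resolveRequiredField("name", undefined, false);
console.log(withFlag.kind, interactive.kind, headless.kind);
```

In a real caller the third argument would come from `Boolean(process.stdin.isTTY)`.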
|
||||
|
||||
@@ -15,8 +15,7 @@
|
||||
*/
|
||||
|
||||
import { withMesh } from "./connect.js";
|
||||
import { readConfig } from "~/services/config/facade.js";
|
||||
import { tryBridge } from "~/services/bridge/client.js";
|
||||
import { tryForgetViaDaemon } from "~/services/bridge/daemon-route.js";
|
||||
import { render } from "~/ui/render.js";
|
||||
import { bold, clay, dim } from "~/ui/styles.js";
|
||||
import { EXIT } from "~/constants/exit-codes.js";
|
||||
@@ -25,14 +24,6 @@ import { validateMessageId, renderValidationError } from "~/cli/validators.js";
|
||||
type StateFlags = { mesh?: string; json?: boolean };
|
||||
type PeerStatus = "idle" | "working" | "dnd";
|
||||
|
||||
/** Resolve unambiguous mesh slug for warm-path bridging. Returns null if
|
||||
* the user has multiple joined meshes and didn't pick one. */
|
||||
function unambiguousMesh(opts: StateFlags): string | null {
|
||||
if (opts.mesh) return opts.mesh;
|
||||
const config = readConfig();
|
||||
return config.meshes.length === 1 ? config.meshes[0]!.slug : null;
|
||||
}
|
||||
|
||||
// --- status ---
|
||||
|
||||
export async function runStatusSet(state: string, opts: StateFlags): Promise<number> {
|
||||
@@ -42,21 +33,9 @@ export async function runStatusSet(state: string, opts: StateFlags): Promise<num
|
||||
return EXIT.INVALID_ARGS;
|
||||
}
|
||||
|
||||
// Warm path
|
||||
const meshSlug = unambiguousMesh(opts);
|
||||
if (meshSlug) {
|
||||
const bridged = await tryBridge(meshSlug, "status_set", { status: state });
|
||||
if (bridged !== null) {
|
||||
if (bridged.ok) {
|
||||
if (opts.json) console.log(JSON.stringify({ status: state }));
|
||||
else render.ok(`status set to ${bold(state)}`);
|
||||
return EXIT.SUCCESS;
|
||||
}
|
||||
render.err(bridged.error);
|
||||
return EXIT.INTERNAL_ERROR;
|
||||
}
|
||||
}
|
||||
|
||||
// Bridge tier deleted in 1.28.0 (dead code; the orphaned warm-path
|
||||
// socket was never opened by anyone). Daemon route would belong here;
|
||||
// adding it for status/summary/visible is queued for 1.29.0.
|
||||
await withMesh({ meshSlug: opts.mesh ?? null }, async (client) => {
|
||||
await client.setStatus(state as PeerStatus);
|
||||
});
|
||||
@@ -73,21 +52,6 @@ export async function runSummary(text: string, opts: StateFlags): Promise<number
|
||||
return EXIT.INVALID_ARGS;
|
||||
}
|
||||
|
||||
// Warm path
|
||||
const meshSlug = unambiguousMesh(opts);
|
||||
if (meshSlug) {
|
||||
const bridged = await tryBridge(meshSlug, "summary", { summary: text });
|
||||
if (bridged !== null) {
|
||||
if (bridged.ok) {
|
||||
if (opts.json) console.log(JSON.stringify({ summary: text }));
|
||||
else render.ok("summary set", dim(text));
|
||||
return EXIT.SUCCESS;
|
||||
}
|
||||
render.err(bridged.error);
|
||||
return EXIT.INTERNAL_ERROR;
|
||||
}
|
||||
}
|
||||
|
||||
await withMesh({ meshSlug: opts.mesh ?? null }, async (client) => {
|
||||
await client.setSummary(text);
|
||||
});
|
||||
@@ -107,21 +71,6 @@ export async function runVisible(value: string | undefined, opts: StateFlags): P
|
||||
return EXIT.INVALID_ARGS;
|
||||
}
|
||||
|
||||
// Warm path
|
||||
const meshSlug = unambiguousMesh(opts);
|
||||
if (meshSlug) {
|
||||
const bridged = await tryBridge(meshSlug, "visible", { visible });
|
||||
if (bridged !== null) {
|
||||
if (bridged.ok) {
|
||||
if (opts.json) console.log(JSON.stringify({ visible }));
|
||||
else render.ok(visible ? "you are now visible to peers" : "you are now hidden", visible ? undefined : "direct messages still reach you");
|
||||
return EXIT.SUCCESS;
|
||||
}
|
||||
render.err(bridged.error);
|
||||
return EXIT.INTERNAL_ERROR;
|
||||
}
|
||||
}
|
||||
|
||||
await withMesh({ meshSlug: opts.mesh ?? null }, async (client) => {
|
||||
await client.setVisible(visible);
|
||||
});
|
||||
@@ -173,6 +122,14 @@ export async function runForget(id: string | undefined, opts: StateFlags): Promi
|
||||
render.err("Usage: claudemesh forget <memory-id>");
|
||||
return EXIT.INVALID_ARGS;
|
||||
}
|
||||
|
||||
// Daemon path first.
|
||||
if (await tryForgetViaDaemon(id, opts.mesh)) {
|
||||
if (opts.json) { console.log(JSON.stringify({ id, forgotten: true })); return EXIT.SUCCESS; }
|
||||
render.ok(`forgot ${dim(id.slice(0, 8))}`);
|
||||
return EXIT.SUCCESS;
|
||||
}
|
||||
|
||||
await withMesh({ meshSlug: opts.mesh ?? null }, async (client) => {
|
||||
await client.forget(id);
|
||||
});
|
||||
@@ -237,7 +194,7 @@ export async function runMsgStatus(id: string | undefined, opts: StateFlags): Pr
|
||||
console.log(JSON.stringify(result, null, 2));
|
||||
return EXIT.SUCCESS;
|
||||
}
|
||||
render.section(`message ${id.slice(0, 12)}…`);
|
||||
render.section(`message ${lookupId.slice(0, 12)}…`);
|
||||
render.kv([
|
||||
["target", result.targetSpec],
|
||||
["delivered", result.delivered ? "yes" : "no"],
|
||||
|
||||
@@ -10,6 +10,7 @@ import { createInterface } from "node:readline";
|
||||
import { BrokerClient } from "~/services/broker/facade.js";
|
||||
import { readConfig } from "~/services/config/facade.js";
|
||||
import type { JoinedMesh } from "~/services/config/facade.js";
|
||||
import { getDaemonPolicy } from "~/services/daemon/policy.js";
|
||||
|
||||
export interface ConnectOpts {
|
||||
/** Mesh slug to connect to. Auto-selects if only one mesh joined. */
|
||||
@@ -46,6 +47,18 @@ export async function withMesh<T>(
|
||||
opts: ConnectOpts,
|
||||
fn: (client: BrokerClient, mesh: JoinedMesh) => Promise<T>,
|
||||
): Promise<T> {
|
||||
// --strict gate: every cold-path verb funnels through here, so a single
|
||||
// policy check covers the whole CLI surface. The daemon-routing helpers
|
||||
// already returned null (auto-spawn failed); under --strict we refuse
|
||||
// the cold-path fallback and exit loudly instead.
|
||||
if (getDaemonPolicy().mode === "strict") {
|
||||
console.error(
|
||||
"\n ✘ daemon not reachable — --strict refuses cold-path fallback.\n" +
|
||||
" run `claudemesh daemon up` (or `claudemesh doctor`) and retry.\n",
|
||||
);
|
||||
process.exit(1);
|
||||
}
|
||||
|
||||
const config = readConfig();
|
||||
if (config.meshes.length === 0) {
|
||||
console.error("No meshes joined. Run `claudemesh join <url>` first.");
|
||||
|
||||
@@ -49,46 +49,28 @@ export interface LaunchFlags {
|
||||
*
|
||||
* As of 1.24.0 the daemon owns the broker WS and feeds the MCP push-pipe
|
||||
* over IPC SSE. If the socket is absent when Claude boots its MCP shim,
|
||||
* the shim bails (no fallback). So we probe for the socket here and, if
|
||||
* missing, spawn `claudemesh daemon up --mesh <slug>` in the background,
|
||||
* waiting briefly for the socket to appear.
|
||||
*
|
||||
* Best-effort: if the daemon spawn fails, we surface the error and let
|
||||
* the launch proceed — Claude Code will print the same "daemon not
|
||||
* running" message and the user can fix it manually.
|
||||
* the shim bails (no fallback). Delegates to the shared lifecycle helper
|
||||
* (services/daemon/lifecycle.ts) which probes the socket properly
|
||||
* (avoiding the stale-socket bug where existsSync was a false positive
|
||||
* after a daemon crash), spawns under a file-lock, and polls for liveness.
|
||||
*/
|
||||
async function ensureDaemonRunning(meshSlug: string, quiet: boolean): Promise<void> {
|
||||
const { DAEMON_PATHS } = await import("~/daemon/paths.js");
|
||||
if (existsSync(DAEMON_PATHS.SOCK_FILE)) return;
|
||||
|
||||
if (!quiet) render.info("starting claudemesh daemon…");
|
||||
const { spawn } = await import("node:child_process");
|
||||
const argv0 = process.argv[1] ?? "claudemesh";
|
||||
let binary = argv0;
|
||||
if (/\.ts$/.test(binary) || /node_modules|src\/entrypoints/.test(binary)) {
|
||||
try {
|
||||
const { execSync } = await import("node:child_process");
|
||||
binary = execSync("which claudemesh", { encoding: "utf8" }).trim();
|
||||
} catch { binary = "claudemesh"; }
|
||||
const { ensureDaemonReady } = await import("~/services/daemon/lifecycle.js");
|
||||
if (!quiet) render.info("ensuring claudemesh daemon is running…");
|
||||
// Larger budget for `launch` — it's a one-shot flow where the user
|
||||
// is actively waiting; cold node start + broker hello can take
|
||||
// longer than the default 3s budget for ad-hoc verbs.
|
||||
const res = await ensureDaemonReady({ budgetMs: 10_000, mesh: meshSlug });
|
||||
if (res.state === "up") {
|
||||
if (!quiet) render.ok("daemon already running");
|
||||
return;
|
||||
}
|
||||
const child = spawn(binary, ["daemon", "up", "--mesh", meshSlug], {
|
||||
detached: true,
|
||||
stdio: "ignore",
|
||||
});
|
||||
child.unref();
|
||||
|
||||
// Wait for the socket to appear. 10 s budget — covers cold node start +
|
||||
// broker hello round-trip on slow links.
|
||||
const start = Date.now();
|
||||
while (Date.now() - start < 10_000) {
|
||||
if (existsSync(DAEMON_PATHS.SOCK_FILE)) {
|
||||
if (!quiet) render.ok("daemon ready");
|
||||
return;
|
||||
}
|
||||
await new Promise((r) => setTimeout(r, 200));
|
||||
if (res.state === "started") {
|
||||
if (!quiet) render.ok(`daemon ready (${res.durationMs}ms)`);
|
||||
return;
|
||||
}
|
||||
render.warn(
|
||||
"daemon failed to start within 10s",
|
||||
`daemon ${res.state}${res.reason ? `: ${res.reason}` : ""}`,
|
||||
"Run `claudemesh daemon up --mesh " + meshSlug + "` manually, then re-launch.",
|
||||
);
|
||||
}
|
||||
@@ -671,6 +653,102 @@ export async function runLaunch(flags: LaunchFlags, rawArgs: string[]): Promise<
|
||||
"utf-8",
|
||||
);
|
||||
|
||||
// 4b. Mint a per-session IPC token, persist it under tmpDir, and
|
||||
// register it with the daemon. The token's path is exposed to
|
||||
// the spawned claude (and all its descendants) via env so
|
||||
// CLI invocations from inside the session auto-attribute to it.
|
||||
//
|
||||
// 1.30.0: also mint an ephemeral ed25519 session keypair and a
|
||||
// parent-vouched attestation. The daemon uses these to open a
|
||||
// long-lived broker WebSocket per session (presence row keyed on
|
||||
// the session pubkey, member_id from the parent), so sibling
|
||||
// sessions in the same mesh see each other in `peer list`.
|
||||
//
|
||||
// Session-id resolution: 1.29.0 referenced `claudeSessionId`
|
||||
// before its `const` declaration further down the file, hitting
|
||||
// the TDZ → ReferenceError swallowed by the surrounding catch.
|
||||
// The IPC registration has been silently failing every launch
|
||||
// since 1.29.0. Hoist the declaration up so it actually runs.
|
||||
const isResume = args.resume !== null || args.continueSession;
|
||||
const claudeSessionId = isResume ? undefined : randomUUID();
|
||||
let sessionTokenFilePath: string | null = null;
|
||||
let sessionTokenForCleanup: string | null = null;
|
||||
try {
|
||||
const { mintSessionToken, TOKEN_FILE_ENV } = await import("~/services/session/token.js");
|
||||
const minted = mintSessionToken(tmpDir);
|
||||
sessionTokenFilePath = minted.filePath;
|
||||
sessionTokenForCleanup = minted.token;
|
||||
|
||||
// Per-session ephemeral keypair + parent attestation (1.30.0+).
|
||||
// Older daemons ignore unknown body fields, so sending presence
|
||||
// material always is forward-compatible.
|
||||
let presencePayload: {
|
||||
session_pubkey: string;
|
||||
session_secret_key: string;
|
||||
parent_attestation: {
|
||||
session_pubkey: string;
|
||||
parent_member_pubkey: string;
|
||||
expires_at: number;
|
||||
signature: string;
|
||||
};
|
||||
} | undefined;
|
||||
try {
|
||||
const { generateKeypair } = await import("~/services/crypto/facade.js");
|
||||
const { signParentAttestation } = await import("~/services/broker/session-hello-sig.js");
|
||||
const sessionKp = await generateKeypair();
|
||||
const att = await signParentAttestation({
|
||||
parentMemberPubkey: mesh.pubkey,
|
||||
parentSecretKey: mesh.secretKey,
|
||||
sessionPubkey: sessionKp.publicKey,
|
||||
});
|
||||
presencePayload = {
|
||||
session_pubkey: sessionKp.publicKey,
|
||||
session_secret_key: sessionKp.secretKey,
|
||||
parent_attestation: {
|
||||
session_pubkey: att.sessionPubkey,
|
||||
parent_member_pubkey: att.parentMemberPubkey,
|
||||
expires_at: att.expiresAt,
|
||||
signature: att.signature,
|
||||
},
|
||||
};
|
||||
} catch {
|
||||
// Keypair / attestation failure — proceed without per-session
|
||||
// presence. The session still registers; only the broker-side
|
||||
// presence row is skipped.
|
||||
}
|
||||
|
||||
// Register with the daemon. Best-effort: a daemon failure here
|
||||
// means the session falls back to user-level scope, which is fine.
|
||||
const { ipc } = await import("~/daemon/ipc/client.js");
|
||||
const sessionIdForRegister = claudeSessionId ?? randomUUID();
|
||||
await ipc({
|
||||
method: "POST",
|
||||
path: "/v1/sessions/register",
|
||||
timeoutMs: 3_000,
|
||||
body: {
|
||||
token: minted.token,
|
||||
session_id: sessionIdForRegister,
|
||||
mesh: mesh.slug,
|
||||
display_name: displayName,
|
||||
pid: process.pid,
|
||||
cwd: process.cwd(),
|
||||
...(role ? { role } : {}),
|
||||
...(parsedGroups.length > 0 ? { groups: parsedGroups.map((g) => `@${g.name}${g.role ? `:${g.role}` : ""}`) } : {}),
|
||||
...(presencePayload ? { presence: presencePayload } : {}),
|
||||
},
|
||||
}).catch(() => null);
|
||||
|
||||
// Pin the env name on a global so the spawn block below can pick it up.
|
||||
(process as unknown as { _claudemeshTokenEnv?: { name: string; value: string } })._claudemeshTokenEnv = {
|
||||
name: TOKEN_FILE_ENV,
|
||||
value: minted.filePath,
|
||||
};
|
||||
} catch {
|
||||
// Token mint or registration failed — proceed without per-session
|
||||
// attribution. CLI invocations from the session will still work,
|
||||
// they'll just default to user-level scope.
|
||||
}
|
||||
|
||||
// 5. Print summary banner (wizard already handled all interactive config).
|
||||
if (!args.quiet) {
|
||||
printBanner(displayName, mesh.slug, role, parsedGroups, messageMode);
|
||||
@@ -744,10 +822,8 @@ export async function runLaunch(flags: LaunchFlags, rawArgs: string[]): Promise<
|
||||
// passes -y / --yes. Without it, claudemesh tools still work because
|
||||
// `claudemesh install` pre-approves them via allowedTools in settings.json.
|
||||
// This keeps permissions tight for multi-person meshes.
|
||||
// Session identity: --resume reuses existing session, otherwise generate new.
|
||||
// When resuming, Claude Code reuses the session ID so the mesh peer identity persists.
|
||||
const isResume = args.resume !== null || args.continueSession;
|
||||
const claudeSessionId = isResume ? undefined : randomUUID();
|
||||
// Session identity: claudeSessionId was generated above (4b) so the
|
||||
// session-token registration could include it. Reuse here.
|
||||
|
||||
const claudeArgs = [
|
||||
"--dangerously-load-development-channels",
|
||||
@@ -792,7 +868,14 @@ export async function runLaunch(flags: LaunchFlags, rawArgs: string[]): Promise<
|
||||
writeFileSync(claudeConfigPath, JSON.stringify(claudeConfig, null, 2) + "\n", "utf-8");
|
||||
} catch { /* best effort */ }
|
||||
}
|
||||
// Ephemeral config dir
|
||||
// The token's session-token file lives inside tmpDir; rmSync below
|
||||
// shreds the secret. The daemon's session reaper notices the
|
||||
// launched session's pid is gone within 30s and drops the registry
|
||||
// entry. Explicit DELETE on /v1/sessions is feasible only from an
|
||||
// async exit hook, which adds complexity for ~30s of memory the
|
||||
// reaper will reclaim anyway. Leaving as-is; revisit if the
|
||||
// registry ever grows persistence.
|
||||
// Ephemeral config dir (also drops the session-token file)
|
||||
try {
|
||||
rmSync(tmpDir, { recursive: true, force: true });
|
||||
} catch { /* best effort */ }
|
||||
@@ -854,6 +937,7 @@ export async function runLaunch(flags: LaunchFlags, rawArgs: string[]): Promise<
|
||||
CLAUDEMESH_CONFIG_DIR: tmpDir,
|
||||
CLAUDEMESH_DISPLAY_NAME: displayName,
|
||||
...(claudeSessionId ? { CLAUDEMESH_SESSION_ID: claudeSessionId } : {}),
|
||||
...(sessionTokenFilePath ? { CLAUDEMESH_IPC_TOKEN_FILE: sessionTokenFilePath } : {}),
|
||||
MCP_TIMEOUT: process.env.MCP_TIMEOUT ?? "30000",
|
||||
MAX_MCP_OUTPUT_TOKENS: process.env.MAX_MCP_OUTPUT_TOKENS ?? "50000",
|
||||
...(role ? { CLAUDEMESH_ROLE: role } : {}),
|
||||
|
||||
@@ -14,7 +14,6 @@
|
||||
|
||||
import { withMesh } from "./connect.js";
|
||||
import { readConfig } from "~/services/config/facade.js";
|
||||
import { tryBridge } from "~/services/bridge/client.js";
|
||||
import { render } from "~/ui/render.js";
|
||||
import { bold, dim, green, yellow } from "~/ui/styles.js";
|
||||
|
||||
@@ -68,7 +67,10 @@ async function listPeersForMesh(slug: string): Promise<PeerRecord[]> {
|
||||
const selfMemberPubkey = joined?.pubkey ?? null;
|
||||
|
||||
// Daemon path — preferred when running. Same routing pattern as send.ts:
|
||||
// ~1 ms IPC round-trip; broker WS already warm in the daemon.
|
||||
// ~1 ms IPC round-trip; broker WS already warm in the daemon. The
|
||||
// lifecycle helper inside tryListPeersViaDaemon auto-spawns the
|
||||
// daemon if it's down and probes it for liveness — no separate bridge
|
||||
// tier is needed any more (1.28.0).
|
||||
try {
|
||||
const { tryListPeersViaDaemon } = await import("~/services/bridge/daemon-route.js");
|
||||
const dr = await tryListPeersViaDaemon();
|
||||
@@ -77,13 +79,8 @@ async function listPeersForMesh(slug: string): Promise<PeerRecord[]> {
|
||||
}
|
||||
} catch { /* daemon route helper not available; fall through */ }
|
||||
|
||||
// Try warm bridge path next.
|
||||
const bridged = await tryBridge(slug, "peers");
|
||||
if (bridged && bridged.ok) {
|
||||
const peers = bridged.result as PeerRecord[];
|
||||
return peers.map((p) => annotateSelf(p, selfMemberPubkey, null));
|
||||
}
|
||||
// Cold path — open our own WS.
|
||||
// Cold path — open our own WS. Reached only when the lifecycle helper
|
||||
// could not bring the daemon up.
|
||||
let result: PeerRecord[] = [];
|
||||
await withMesh({ meshSlug: slug }, async (client) => {
|
||||
const all = (await client.listPeers()) as unknown as PeerRecord[];
|
||||
@@ -122,7 +119,19 @@ function annotateSelf(
|
||||
|
||||
export async function runPeers(flags: PeersFlags): Promise<void> {
|
||||
const config = readConfig();
|
||||
const slugs = flags.mesh ? [flags.mesh] : config.meshes.map((m) => m.slug);
|
||||
|
||||
// Mesh selection precedence:
|
||||
// 1. explicit --mesh <slug> (always wins)
|
||||
// 2. session-token mesh (when invoked from inside a launched session)
|
||||
// 3. all joined meshes (default for bare shells)
|
||||
let slugs: string[];
|
||||
if (flags.mesh) {
|
||||
slugs = [flags.mesh];
|
||||
} else {
|
||||
const { getSessionInfo } = await import("~/services/session/resolve.js");
|
||||
const sess = await getSessionInfo();
|
||||
slugs = sess ? [sess.mesh] : config.meshes.map((m) => m.slug);
|
||||
}
|
||||
|
||||
if (slugs.length === 0) {
|
||||
render.err("No meshes joined.");
|
||||
|
||||
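Review note: the mesh-selection precedence added to `runPeers` above can be read as a small pure function. The sketch below is illustrative only; `selectMeshSlugs` and `SessionLike` are hypothetical names that do not appear in the diff, which inlines this logic and obtains the session via `getSessionInfo()`.

```typescript
// Hypothetical shape: only the `mesh` field of the session is used here.
interface SessionLike {
  mesh: string;
}

// Hypothetical helper mirroring the three-step precedence in runPeers.
function selectMeshSlugs(
  explicitMesh: string | undefined,
  session: SessionLike | null,
  joinedSlugs: string[],
): string[] {
  // 1. An explicit --mesh <slug> always wins.
  if (explicitMesh) return [explicitMesh];
  // 2. Inside a launched session, scope to the session's mesh.
  if (session) return [session.mesh];
  // 3. Bare shell: fall back to every joined mesh.
  return [...joinedSlugs];
}
```

An empty result corresponds to the "No meshes joined." branch that follows in the diff.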
@@ -1,4 +1,5 @@
|
||||
import { withMesh } from "./connect.js";
|
||||
import { tryRecallViaDaemon } from "~/services/bridge/daemon-route.js";
|
||||
import { render } from "~/ui/render.js";
|
||||
import { bold, clay, dim } from "~/ui/styles.js";
|
||||
import { EXIT } from "~/constants/exit-codes.js";
|
||||
@@ -11,6 +12,22 @@ export async function recall(
|
||||
render.err("Usage: claudemesh recall <query>");
|
||||
return EXIT.INVALID_ARGS;
|
||||
}
|
||||
|
||||
// Daemon path first.
|
||||
const daemonMatches = await tryRecallViaDaemon(query, opts.mesh);
|
||||
if (daemonMatches !== null) {
|
||||
if (opts.json) { console.log(JSON.stringify(daemonMatches, null, 2)); return EXIT.SUCCESS; }
|
||||
if (daemonMatches.length === 0) { render.info(dim("no memories found.")); return EXIT.SUCCESS; }
|
||||
render.section(`memories (${daemonMatches.length})`);
|
||||
for (const m of daemonMatches) {
|
||||
const tags = m.tags.length ? dim(` [${m.tags.map((t) => clay(t)).join(dim(", "))}]`) : "";
|
||||
process.stdout.write(` ${bold(m.id.slice(0, 8))}${tags}\n`);
|
||||
process.stdout.write(` ${m.content}\n`);
|
||||
process.stdout.write(` ${dim(m.rememberedBy + " · " + new Date(m.rememberedAt).toLocaleString())}\n\n`);
|
||||
}
|
||||
return EXIT.SUCCESS;
|
||||
}
|
||||
|
||||
return await withMesh({ meshSlug: opts.mesh ?? null }, async (client) => {
|
||||
const memories = await client.recall(query);
|
||||
|
||||
|
||||
@@ -1,4 +1,5 @@
|
||||
import { withMesh } from "./connect.js";
|
||||
import { tryRememberViaDaemon } from "~/services/bridge/daemon-route.js";
|
||||
import { render } from "~/ui/render.js";
|
||||
import { dim } from "~/ui/styles.js";
|
||||
import { EXIT } from "~/constants/exit-codes.js";
|
||||
@@ -12,6 +13,18 @@ export async function remember(
|
||||
return EXIT.INVALID_ARGS;
|
||||
}
|
||||
const tags = opts.tags?.split(",").map((t) => t.trim()).filter(Boolean);
|
||||
|
||||
// Daemon path first.
|
||||
const daemonRes = await tryRememberViaDaemon(content, tags, opts.mesh);
|
||||
if (daemonRes) {
|
||||
if (opts.json) {
|
||||
console.log(JSON.stringify({ id: daemonRes.id, content, tags, mesh: daemonRes.mesh }));
|
||||
return EXIT.SUCCESS;
|
||||
}
|
||||
render.ok("remembered", dim(daemonRes.id.slice(0, 8)));
|
||||
return EXIT.SUCCESS;
|
||||
}
|
||||
|
||||
return await withMesh({ meshSlug: opts.mesh ?? null }, async (client) => {
|
||||
const id = await client.remember(content, tags);
|
||||
|
||||
|
||||
@@ -13,7 +13,6 @@
|
||||
|
||||
import { withMesh } from "./connect.js";
|
||||
import { readConfig } from "~/services/config/facade.js";
|
||||
import { tryBridge } from "~/services/bridge/client.js";
|
||||
import { trySendViaDaemon } from "~/services/bridge/daemon-route.js";
|
||||
import type { Priority } from "~/services/broker/facade.js";
|
||||
import { render } from "~/ui/render.js";
|
||||
@@ -82,34 +81,12 @@ export async function runSend(flags: SendFlags, to: string, message: string): Pr
|
||||
else render.err(`send failed (daemon): ${dr.error}`);
|
||||
process.exit(1);
|
||||
}
|
||||
// dr === null → daemon not running; fall through to bridge.
|
||||
// dr === null → daemon not running and lifecycle couldn't auto-
|
||||
// spawn it; fall through to cold path. The orphaned bridge tier
|
||||
// was removed in 1.28.0.
|
||||
}
|
||||
|
||||
// Warm path — only when mesh is unambiguous.
|
||||
if (meshSlug) {
|
||||
const bridged = await tryBridge(meshSlug, "send", { to, message, priority });
|
||||
if (bridged !== null) {
|
||||
if (bridged.ok) {
|
||||
const r = bridged.result as { messageId?: string };
|
||||
if (flags.json) {
|
||||
console.log(JSON.stringify({ ok: true, messageId: r.messageId, target: to }));
|
||||
} else {
|
||||
render.ok(`sent to ${to}`, r.messageId ? dim(r.messageId.slice(0, 8)) : undefined);
|
||||
}
|
||||
return;
|
||||
}
|
||||
// Bridge reachable but op failed — surface error, don't fall through.
|
||||
if (flags.json) {
|
||||
console.log(JSON.stringify({ ok: false, error: bridged.error }));
|
||||
} else {
|
||||
render.err(`send failed: ${bridged.error}`);
|
||||
}
|
||||
process.exit(1);
|
||||
}
|
||||
// bridged === null → bridge unreachable, fall through to cold path
|
||||
}
|
||||
|
||||
// Cold path
|
||||
// Cold path — open our own WS, encrypt locally, fire envelope.
|
||||
await withMesh({ meshSlug: flags.mesh ?? null }, async (client) => {
|
||||
let targetSpec = to;
|
||||
if (to.startsWith("#") && !/^#[0-9a-z_-]{20,}$/i.test(to)) {
|
||||
|
||||
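Review note: after the bridge tier was removed, `runSend` has a three-way outcome after the daemon attempt: success, a surfaced daemon error, or a fall-through to the cold path when the result is `null`. A minimal sketch of that branching follows. The result shape `DaemonSendResult` and the helper `nextStepAfterDaemon` are assumptions for illustration; the diff only shows `dr.error` and the `dr === null` case.

```typescript
// Assumed result shape: null means the daemon was unreachable and could not be spawned.
type DaemonSendResult = { ok: true } | { ok: false; error: string } | null;

type NextStep =
  | { kind: "done" }
  | { kind: "fail"; message: string }
  | { kind: "cold-path" };

// Hypothetical helper: decides what the caller does after the daemon attempt.
function nextStepAfterDaemon(dr: DaemonSendResult): NextStep {
  // Daemon not reachable: open our own WS instead.
  if (dr === null) return { kind: "cold-path" };
  // Daemon handled the send.
  if (dr.ok) return { kind: "done" };
  // Daemon reachable but the operation failed: surface it, do not fall through.
  return { kind: "fail", message: `send failed (daemon): ${dr.error}` };
}
```

The point of the design is that a reachable daemon reporting an error never triggers a second delivery attempt over the cold path.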
@@ -5,6 +5,7 @@
|
||||
*/
|
||||
|
||||
import { withMesh } from "./connect.js";
|
||||
import { tryGetStateViaDaemon, tryListStateViaDaemon, trySetStateViaDaemon } from "~/services/bridge/daemon-route.js";
|
||||
import { render } from "~/ui/render.js";
|
||||
import { bold, dim } from "~/ui/styles.js";
|
||||
|
||||
@@ -14,6 +15,16 @@ export interface StateFlags {
|
||||
}
|
||||
|
||||
export async function runStateGet(flags: StateFlags, key: string): Promise<void> {
|
||||
// Daemon path first.
|
||||
const daemonEntry = await tryGetStateViaDaemon(key, flags.mesh);
|
||||
if (daemonEntry !== null) {
|
||||
if (!daemonEntry) { render.info(dim("(not set)")); return; }
|
||||
if (flags.json) { console.log(JSON.stringify(daemonEntry, null, 2)); return; }
|
||||
const val = typeof daemonEntry.value === "string" ? daemonEntry.value : JSON.stringify(daemonEntry.value);
|
||||
render.info(val);
|
||||
render.info(dim(` set by ${daemonEntry.updatedBy} at ${new Date(daemonEntry.updatedAt).toLocaleString()}`));
|
||||
return;
|
||||
}
|
||||
await withMesh({ meshSlug: flags.mesh ?? null }, async (client) => {
|
||||
const entry = await client.getState(key);
|
||||
if (!entry) {
|
||||
@@ -38,6 +49,12 @@ export async function runStateSet(flags: StateFlags, key: string, value: string)
|
||||
parsed = value;
|
||||
}
|
||||
|
||||
// Daemon path first.
|
||||
const daemonOk = await trySetStateViaDaemon(key, parsed, flags.mesh);
|
||||
if (daemonOk) {
|
||||
render.ok(`${bold(key)} = ${JSON.stringify(parsed)}`);
|
||||
return;
|
||||
}
|
||||
await withMesh({ meshSlug: flags.mesh ?? null }, async (client) => {
|
||||
await client.setState(key, parsed);
|
||||
render.ok(`${bold(key)} = ${JSON.stringify(parsed)}`);
|
||||
@@ -45,6 +62,19 @@ export async function runStateSet(flags: StateFlags, key: string, value: string)
|
||||
}
|
||||
|
||||
export async function runStateList(flags: StateFlags): Promise<void> {
|
||||
// Daemon path first.
|
||||
const daemonRows = await tryListStateViaDaemon(flags.mesh);
|
||||
if (daemonRows !== null) {
|
||||
if (flags.json) { console.log(JSON.stringify(daemonRows, null, 2)); return; }
|
||||
if (daemonRows.length === 0) { render.info(dim("(no state)")); return; }
|
||||
render.section(`state (${daemonRows.length})`);
|
||||
for (const e of daemonRows) {
|
||||
const val = typeof e.value === "string" ? e.value : JSON.stringify(e.value);
|
||||
process.stdout.write(` ${bold(e.key)}: ${val}\n`);
|
||||
process.stdout.write(` ${dim(e.updatedBy + " · " + new Date(e.updatedAt).toLocaleString())}\n`);
|
||||
}
|
||||
return;
|
||||
}
|
||||
await withMesh({ meshSlug: flags.mesh ?? null }, async (client, mesh) => {
|
||||
const entries = await client.listState();
|
||||
|
||||
|
||||
@@ -9,6 +9,7 @@ export const EXIT = {
|
||||
PERMISSION_DENIED: 7,
|
||||
INTERNAL_ERROR: 8,
|
||||
CLAUDE_MISSING: 9,
|
||||
IO_ERROR: 10,
|
||||
} as const;
|
||||
|
||||
export type ExitCode = (typeof EXIT)[keyof typeof EXIT];
|
||||
|
||||
@@ -69,6 +69,21 @@ export interface SkillFull extends SkillSummary {
|
||||
manifest?: unknown;
|
||||
}
|
||||
|
||||
export interface StateRow {
|
||||
key: string;
|
||||
value: unknown;
|
||||
updatedBy: string;
|
||||
updatedAt: string;
|
||||
}
|
||||
|
||||
export interface MemoryRow {
|
||||
id: string;
|
||||
content: string;
|
||||
tags: string[];
|
||||
rememberedBy: string;
|
||||
rememberedAt: string;
|
||||
}
|
||||
|
||||
const HELLO_ACK_TIMEOUT_MS = 5_000;
|
||||
const SEND_ACK_TIMEOUT_MS = 15_000;
|
||||
const BACKOFF_CAPS_MS = [1_000, 2_000, 4_000, 8_000, 16_000, 30_000];
|
||||
@@ -91,6 +106,10 @@ export class DaemonBrokerClient {
|
||||
private peerListResolvers = new Map<string, PendingPeerList>();
|
||||
private skillListResolvers = new Map<string, { resolve: (rows: SkillSummary[]) => void; timer: NodeJS.Timeout }>();
|
||||
private skillDataResolvers = new Map<string, { resolve: (row: SkillFull | null) => void; timer: NodeJS.Timeout }>();
|
||||
private stateGetResolvers = new Map<string, { resolve: (row: StateRow | null) => void; timer: NodeJS.Timeout }>();
|
||||
private stateListResolvers = new Map<string, { resolve: (rows: StateRow[]) => void; timer: NodeJS.Timeout }>();
|
||||
private memoryStoreResolvers = new Map<string, { resolve: (id: string | null) => void; timer: NodeJS.Timeout }>();
|
||||
private memoryRecallResolvers = new Map<string, { resolve: (rows: MemoryRow[]) => void; timer: NodeJS.Timeout }>();
|
||||
private sessionPubkey: string | null = null;
|
||||
private sessionSecretKey: string | null = null;
|
||||
private opens: Array<() => void> = [];
|
||||
@@ -226,6 +245,50 @@ export class DaemonBrokerClient {
|
||||
return;
|
||||
}
|
||||
|
||||
if (msg.type === "state_value" || msg.type === "state_data") {
|
||||
const reqId = String(msg._reqId ?? "");
|
||||
const pending = this.stateGetResolvers.get(reqId);
|
||||
if (pending) {
|
||||
this.stateGetResolvers.delete(reqId);
|
||||
clearTimeout(pending.timer);
|
||||
pending.resolve((msg.state ?? msg.row ?? null) as StateRow | null);
|
||||
}
|
||||
return;
|
||||
}
|
||||
|
||||
if (msg.type === "state_list") {
|
||||
const reqId = String(msg._reqId ?? "");
|
||||
const pending = this.stateListResolvers.get(reqId);
|
||||
if (pending) {
|
||||
this.stateListResolvers.delete(reqId);
|
||||
clearTimeout(pending.timer);
|
||||
pending.resolve(Array.isArray(msg.entries) ? (msg.entries as StateRow[]) : []);
|
||||
}
|
||||
return;
|
||||
}
|
||||
|
||||
if (msg.type === "memory_stored") {
|
||||
const reqId = String(msg._reqId ?? "");
|
||||
const pending = this.memoryStoreResolvers.get(reqId);
|
||||
if (pending) {
|
||||
this.memoryStoreResolvers.delete(reqId);
|
||||
clearTimeout(pending.timer);
|
||||
pending.resolve(typeof msg.memoryId === "string" ? msg.memoryId : null);
|
||||
}
|
||||
return;
|
||||
}
|
||||
|
||||
if (msg.type === "memory_recall_result") {
|
||||
const reqId = String(msg._reqId ?? "");
|
||||
const pending = this.memoryRecallResolvers.get(reqId);
|
||||
if (pending) {
|
||||
this.memoryRecallResolvers.delete(reqId);
|
||||
clearTimeout(pending.timer);
|
||||
pending.resolve(Array.isArray(msg.matches) ? (msg.matches as MemoryRow[]) : []);
|
||||
}
|
||||
return;
|
||||
}
|
||||
|
||||
if (msg.type === "push" || msg.type === "inbound") {
|
||||
this.opts.onPush?.(msg);
|
||||
return;
|
||||
@@ -329,6 +392,76 @@ export class DaemonBrokerClient {
|
||||
});
|
||||
}
|
||||
|
||||
/** Read a single shared state row. Null on disconnect / timeout / not-found. */
|
||||
async getState(key: string, timeoutMs = 5_000): Promise<StateRow | null> {
|
||||
if (this._status !== "open" || !this.ws) return null;
|
||||
return new Promise<StateRow | null>((resolve) => {
|
||||
const reqId = `sg-${++this.reqCounter}`;
|
||||
const timer = setTimeout(() => {
|
||||
if (this.stateGetResolvers.delete(reqId)) resolve(null);
|
||||
}, timeoutMs);
|
||||
this.stateGetResolvers.set(reqId, { resolve, timer });
|
||||
try { this.ws!.send(JSON.stringify({ type: "get_state", key, _reqId: reqId })); }
|
||||
catch { this.stateGetResolvers.delete(reqId); clearTimeout(timer); resolve(null); }
|
||||
});
|
||||
}
|
||||
|
||||
/** List all shared state rows in the mesh. */
|
||||
async listState(timeoutMs = 5_000): Promise<StateRow[]> {
|
||||
if (this._status !== "open" || !this.ws) return [];
|
||||
return new Promise<StateRow[]>((resolve) => {
|
||||
const reqId = `sl-${++this.reqCounter}`;
|
||||
const timer = setTimeout(() => {
|
||||
if (this.stateListResolvers.delete(reqId)) resolve([]);
|
||||
}, timeoutMs);
|
||||
this.stateListResolvers.set(reqId, { resolve, timer });
|
||||
try { this.ws!.send(JSON.stringify({ type: "list_state", _reqId: reqId })); }
|
||||
catch { this.stateListResolvers.delete(reqId); clearTimeout(timer); resolve([]); }
|
||||
});
|
||||
}
|
||||
|
||||
/** Set a shared state value. Fire-and-forget. */
|
||||
setState(key: string, value: unknown): void {
|
||||
if (this._status !== "open" || !this.ws) return;
|
||||
try { this.ws.send(JSON.stringify({ type: "set_state", key, value })); }
|
||||
catch { /* ignore */ }
|
||||
}
|
||||
|
||||
/** Store a memory in the mesh. Returns the assigned id, or null on timeout. */
|
||||
async remember(content: string, tags?: string[], timeoutMs = 5_000): Promise<string | null> {
|
||||
if (this._status !== "open" || !this.ws) return null;
|
||||
return new Promise<string | null>((resolve) => {
|
||||
const reqId = `mr-${++this.reqCounter}`;
|
||||
const timer = setTimeout(() => {
|
||||
if (this.memoryStoreResolvers.delete(reqId)) resolve(null);
|
||||
}, timeoutMs);
|
||||
this.memoryStoreResolvers.set(reqId, { resolve, timer });
|
||||
try { this.ws!.send(JSON.stringify({ type: "remember", content, tags, _reqId: reqId })); }
|
||||
catch { this.memoryStoreResolvers.delete(reqId); clearTimeout(timer); resolve(null); }
|
||||
});
|
||||
}
|
||||
|
||||
/** Search memories by relevance. */
|
||||
async recall(query: string, timeoutMs = 5_000): Promise<MemoryRow[]> {
|
||||
if (this._status !== "open" || !this.ws) return [];
|
||||
return new Promise<MemoryRow[]>((resolve) => {
|
||||
const reqId = `mc-${++this.reqCounter}`;
|
||||
const timer = setTimeout(() => {
|
||||
if (this.memoryRecallResolvers.delete(reqId)) resolve([]);
|
||||
}, timeoutMs);
|
||||
this.memoryRecallResolvers.set(reqId, { resolve, timer });
|
||||
try { this.ws!.send(JSON.stringify({ type: "recall", query, _reqId: reqId })); }
|
||||
catch { this.memoryRecallResolvers.delete(reqId); clearTimeout(timer); resolve([]); }
|
||||
});
|
||||
}
|
||||
|
||||
/** Forget a memory by id. Fire-and-forget. */
|
||||
forget(memoryId: string): void {
|
||||
if (this._status !== "open" || !this.ws) return;
|
||||
try { this.ws.send(JSON.stringify({ type: "forget", memoryId })); }
|
||||
catch { /* ignore */ }
|
||||
}
|
||||
|
||||
/** Set the daemon's profile (avatar/title/bio/capabilities). Fire-and-forget. */
|
||||
setProfile(profile: { avatar?: string; title?: string; bio?: string; capabilities?: string[] }): void {
|
||||
if (this._status !== "open" || !this.ws) return;
|
||||
|
||||
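Review note: `getState`, `listState`, `remember` and `recall` above all repeat the same request/response correlation: generate a `_reqId`, park a resolver with a timeout, and settle it from the message handler. The sketch below factors that pattern into one generic class. `PendingRequests` is a hypothetical name, not something the diff introduces, and the class takes a `send` callback rather than touching a WebSocket directly.

```typescript
// Hypothetical generic version of the per-type resolver maps in DaemonBrokerClient.
class PendingRequests<T> {
  private readonly pending = new Map<
    string,
    { resolve: (value: T) => void; timer: ReturnType<typeof setTimeout> }
  >();
  private counter = 0;
  private readonly prefix: string;
  private readonly fallback: T;

  constructor(prefix: string, fallback: T) {
    this.prefix = prefix;
    this.fallback = fallback;
  }

  /** Register a request; `send` receives the generated id and may throw. */
  request(send: (reqId: string) => void, timeoutMs = 5_000): Promise<T> {
    return new Promise<T>((resolve) => {
      const reqId = `${this.prefix}-${++this.counter}`;
      const timer = setTimeout(() => {
        // Resolve with the fallback only if the request was still pending.
        if (this.pending.delete(reqId)) resolve(this.fallback);
      }, timeoutMs);
      this.pending.set(reqId, { resolve, timer });
      try {
        send(reqId);
      } catch {
        // Send failed: clean up and resolve immediately with the fallback.
        this.pending.delete(reqId);
        clearTimeout(timer);
        resolve(this.fallback);
      }
    });
  }

  /** Called from the message handler; returns false for unknown or late ids. */
  settle(reqId: string, value: T): boolean {
    const entry = this.pending.get(reqId);
    if (!entry) return false;
    this.pending.delete(reqId);
    clearTimeout(entry.timer);
    entry.resolve(value);
    return true;
  }

  get size(): number {
    return this.pending.size;
  }
}
```

With this shape, a call such as `getState` would reduce to one `request(...)` with a `null` fallback, and the matching `state_value` handler to one `settle(...)`. Whether that refactor is worth doing in this PR is a judgment call.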
@@ -2,6 +2,7 @@ import { request as httpRequest } from "node:http";
|
||||
|
||||
import { DAEMON_PATHS, DAEMON_TCP_HOST, DAEMON_TCP_DEFAULT_PORT } from "../paths.js";
|
||||
import { readLocalToken } from "../local-token.js";
|
||||
import { readSessionTokenFromEnv } from "~/services/session/token.js";
|
||||
|
||||
export interface IpcRequestOptions {
|
||||
method?: "GET" | "POST" | "PATCH" | "DELETE";
|
||||
@@ -44,6 +45,19 @@ export async function ipc<T = unknown>(opts: IpcRequestOptions): Promise<IpcResp
|
||||
headers.authorization = `Bearer ${tok}`;
|
||||
}
|
||||
|
||||
// Per-session token attribution. When the calling process has
|
||||
// CLAUDEMESH_IPC_TOKEN_FILE set (a launched session and its
|
||||
// descendants), attach the session token. The daemon's auth
|
||||
// middleware resolves it to a SessionInfo and uses it for default-
|
||||
// mesh scoping. Sending a second Authorization header is not
|
||||
// possible per HTTP semantics, so we layer: when both UDS and a
|
||||
// session token exist, send the session token; the bearer remains
|
||||
// only for TCP loopback callers.
|
||||
if (!useTcp) {
|
||||
const sessionTok = readSessionTokenFromEnv();
|
||||
if (sessionTok) headers.authorization = `ClaudeMesh-Session ${sessionTok}`;
|
||||
}
|
||||
|
||||
return new Promise<IpcResponse<T>>((resolve, reject) => {
|
||||
const req = httpRequest(
|
||||
useTcp
|
||||
|
||||
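Review note: the comment in this hunk describes a layering rule for the single `Authorization` header. The sketch below states that rule as a pure function. It assumes, based only on the comment, that the Bearer token is sent for TCP loopback callers and the session token for UDS callers; the part of `ipc()` that sets the Bearer header is outside the hunk, so this is not a copy of the real logic. `authorizationFor` is a hypothetical name.

```typescript
// Hypothetical helper: picks the one Authorization value a request can carry.
function authorizationFor(
  useTcp: boolean,
  localToken: string | null,
  sessionToken: string | null,
): string | undefined {
  if (useTcp) {
    // TCP loopback: machine-level local token (assumed from the comment).
    return localToken ? `Bearer ${localToken}` : undefined;
  }
  // UDS: attach the per-session token when the process has one.
  return sessionToken ? `ClaudeMesh-Session ${sessionToken}` : undefined;
}
```

One consequence worth confirming in review: under this rule a launched session that talks to the daemon over TCP loses session attribution, because the Bearer token takes the only header slot.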
@@ -10,6 +10,10 @@ import { listOutbox, requeueDeadOrPending, type OutboxStatus } from "../db/outbo
|
||||
import { randomUUID } from "node:crypto";
|
||||
import { bindSseStream, type EventBus } from "../events.js";
|
||||
import type { DaemonBrokerClient } from "../broker.js";
|
||||
import {
|
||||
registerSession, deregisterByToken, resolveToken, listSessions, startReaper,
|
||||
type SessionInfo,
|
||||
} from "../session-registry.js";
|
||||
import { VERSION } from "~/constants/urls.js";
|
||||
|
||||
/**
|
||||
@@ -172,12 +176,28 @@ function makeHandler(opts: {
|
||||
}
|
||||
}
|
||||
|
||||
// Per-session token resolution. Layers on top of the machine-level
|
||||
// local-token auth above: callers from inside a `claudemesh launch`-
|
||||
// spawned session pass `Authorization: ClaudeMesh-Session <hex>`
|
||||
// (instead of, or in addition to, Bearer over TCP) and we resolve
|
||||
// it to a SessionInfo that downstream routes use for default-mesh
|
||||
// scoping and attribution.
|
||||
let session: SessionInfo | null = null;
|
||||
{
|
||||
const authz = req.headers.authorization ?? "";
|
||||
const sm = /^ClaudeMesh-Session\s+([0-9a-f]{64})$/i.exec(authz.trim());
|
||||
if (sm && sm[1]) session = resolveToken(sm[1].toLowerCase());
|
||||
}
|
||||
/** Pick mesh from explicit body/query first, then session default. */
|
||||
const meshFromCtx = (explicit?: string | null): string | null =>
|
||||
(explicit && explicit.trim()) ? explicit : (session?.mesh ?? null);
|
||||
|
||||
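Review note: the session-token resolution and `meshFromCtx` added above can be exercised in isolation. The sketch below reuses the same regular expression and fallback rule. `extractSessionToken` is a hypothetical name, `SessionInfo` is reduced to the two fields used here, and the registry lookup (`resolveToken`) is left out.

```typescript
// Reduced stand-in for the daemon's SessionInfo (assumption: only these fields matter here).
interface SessionInfo {
  token: string;
  mesh: string;
}

const SESSION_AUTHZ = /^ClaudeMesh-Session\s+([0-9a-f]{64})$/i;

// Hypothetical helper: returns the lowercased 64-hex token, or null if the header does not match.
function extractSessionToken(authorization: string | undefined): string | null {
  const match = SESSION_AUTHZ.exec((authorization ?? "").trim());
  return match && match[1] ? match[1].toLowerCase() : null;
}

// Same rule as the diff's meshFromCtx, with the session passed in explicitly.
function meshFromCtx(
  explicit: string | null | undefined,
  session: SessionInfo | null,
): string | null {
  // An explicit, non-blank mesh wins; otherwise use the session default.
  return explicit && explicit.trim() ? explicit : (session?.mesh ?? null);
}
```

Note that a `Bearer` header simply fails the match, so TCP callers resolve to no session and fall back to explicit or unscoped behaviour.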
// Routing.
|
||||
if (req.method === "GET" && url.pathname === "/v1/version") {
|
||||
respond(res, 200, {
|
||||
daemon_version: VERSION,
|
||||
ipc_api: "v1",
|
||||
ipc_features: ["version", "health", "send", "inbox", "events", "peers", "profile", "skills"],
|
||||
ipc_features: ["version", "health", "send", "inbox", "events", "peers", "profile", "skills", "state", "memory", "sessions"],
|
||||
schema_version: 1,
|
||||
});
|
||||
return;
|
||||
@@ -188,6 +208,102 @@ function makeHandler(opts: {
|
||||
return;
|
||||
}
|
||||
|
||||
// Session registry routes (1.29.0)
|
||||
if (req.method === "POST" && url.pathname === "/v1/sessions/register") {
|
||||
try {
|
||||
const body = await readJsonBody(req, 64 * 1024) as Record<string, unknown> | null;
|
||||
if (!body) { respond(res, 400, { error: "missing body" }); return; }
|
||||
const token = typeof body.token === "string" ? body.token : "";
|
||||
if (!/^[0-9a-f]{64}$/i.test(token)) { respond(res, 400, { error: "token must be 64 hex chars" }); return; }
|
||||
const sessionId = typeof body.session_id === "string" ? body.session_id : "";
|
||||
const mesh = typeof body.mesh === "string" ? body.mesh : "";
|
||||
const displayName = typeof body.display_name === "string" ? body.display_name : "";
|
||||
const pid = typeof body.pid === "number" ? body.pid : 0;
|
||||
if (!sessionId || !mesh || !displayName || !pid) {
|
||||
respond(res, 400, { error: "session_id, mesh, display_name, pid all required" });
|
||||
return;
|
||||
}
|
||||
const cwd = typeof body.cwd === "string" ? body.cwd : undefined;
|
||||
const role = typeof body.role === "string" ? body.role : undefined;
|
||||
const groups = Array.isArray(body.groups)
|
||||
? body.groups.filter((g): g is string => typeof g === "string")
|
||||
: undefined;
|
||||
|
||||
// 1.30.0 — optional per-session presence material. Older CLIs
|
||||
// omit this; the daemon's session-broker subsystem just won't
|
||||
// open a per-session WS for those.
|
||||
let presence: SessionInfo["presence"] | undefined;
|
||||
const rawPresence = body.presence;
|
||||
if (rawPresence && typeof rawPresence === "object") {
|
||||
const p = rawPresence as Record<string, unknown>;
|
||||
const sessionPubkey = typeof p.session_pubkey === "string" ? p.session_pubkey.toLowerCase() : "";
|
||||
const sessionSecretKey = typeof p.session_secret_key === "string" ? p.session_secret_key.toLowerCase() : "";
|
||||
const att = p.parent_attestation as Record<string, unknown> | undefined;
|
||||
if (
|
||||
/^[0-9a-f]{64}$/.test(sessionPubkey) &&
|
||||
/^[0-9a-f]{128}$/.test(sessionSecretKey) &&
|
||||
att && typeof att === "object" &&
|
||||
typeof att.session_pubkey === "string" &&
|
||||
typeof att.parent_member_pubkey === "string" &&
|
||||
typeof att.expires_at === "number" &&
|
||||
typeof att.signature === "string"
|
||||
) {
|
||||
presence = {
|
||||
sessionPubkey,
|
||||
sessionSecretKey,
|
||||
parentAttestation: {
|
||||
sessionPubkey: (att.session_pubkey as string).toLowerCase(),
|
||||
parentMemberPubkey: (att.parent_member_pubkey as string).toLowerCase(),
|
||||
expiresAt: att.expires_at as number,
|
||||
signature: (att.signature as string).toLowerCase(),
|
||||
},
|
||||
};
|
||||
} else {
|
||||
opts.log("warn", "session_register_presence_malformed", { mesh });
|
||||
}
|
||||
}
|
||||
|
||||
const stored = registerSession({
|
||||
token: token.toLowerCase(),
|
||||
sessionId, mesh, displayName, pid, cwd, role, groups,
|
||||
...(presence ? { presence } : {}),
|
||||
});
|
||||
opts.log("info", "session_registered", {
|
||||
sessionId, mesh, pid,
|
||||
presence: presence ? "yes" : "no",
|
||||
});
|
||||
respond(res, 200, {
|
||||
ok: true,
|
||||
registered_at: stored.registeredAt,
|
||||
presence_accepted: !!presence,
|
||||
});
|
||||
} catch (e) {
|
||||
respond(res, 400, { error: String(e) });
|
||||
}
|
||||
return;
|
||||
}
|
||||
|
||||
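Review note: the presence validation inside `POST /v1/sessions/register` is the densest part of this route, so a standalone sketch may help when reviewing it. It applies the same checks as the diff: 64-hex public key, 128-hex secret key, and type-only checks on the attestation fields. `parsePresence`, `SessionPresence` and `ParentAttestation` are hypothetical names; the diff inlines this logic and types the result as `SessionInfo["presence"]`.

```typescript
interface ParentAttestation {
  sessionPubkey: string;
  parentMemberPubkey: string;
  expiresAt: number;
  signature: string;
}

interface SessionPresence {
  sessionPubkey: string;
  sessionSecretKey: string;
  parentAttestation: ParentAttestation;
}

const HEX_64 = /^[0-9a-f]{64}$/;
const HEX_128 = /^[0-9a-f]{128}$/;

const lowerString = (v: unknown): string =>
  typeof v === "string" ? v.toLowerCase() : "";

// Hypothetical helper: returns normalised presence material, or null if malformed.
function parsePresence(raw: unknown): SessionPresence | null {
  if (!raw || typeof raw !== "object") return null;
  const p = raw as Record<string, unknown>;
  const sessionPubkey = lowerString(p["session_pubkey"]);
  const sessionSecretKey = lowerString(p["session_secret_key"]);
  if (!HEX_64.test(sessionPubkey) || !HEX_128.test(sessionSecretKey)) return null;

  const att = p["parent_attestation"];
  if (!att || typeof att !== "object") return null;
  const a = att as Record<string, unknown>;
  const attSession = a["session_pubkey"];
  const attParent = a["parent_member_pubkey"];
  const expiresAt = a["expires_at"];
  const signature = a["signature"];
  // As in the diff, attestation fields are checked by type only, not by hex format.
  if (
    typeof attSession !== "string" ||
    typeof attParent !== "string" ||
    typeof expiresAt !== "number" ||
    typeof signature !== "string"
  ) {
    return null;
  }
  return {
    sessionPubkey,
    sessionSecretKey,
    parentAttestation: {
      sessionPubkey: attSession.toLowerCase(),
      parentMemberPubkey: attParent.toLowerCase(),
      expiresAt,
      signature: signature.toLowerCase(),
    },
  };
}
```

A `null` result corresponds to the `session_register_presence_malformed` warning path, where registration still succeeds with `presence_accepted: false`.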
if (req.method === "DELETE" && url.pathname.startsWith("/v1/sessions/")) {
|
||||
const tail = url.pathname.slice("/v1/sessions/".length);
|
||||
if (!/^[0-9a-f]{64}$/i.test(tail)) { respond(res, 400, { error: "invalid token" }); return; }
|
||||
const ok = deregisterByToken(tail.toLowerCase());
|
||||
respond(res, ok ? 200 : 404, { ok, token_prefix: tail.slice(0, 8) });
|
||||
return;
|
||||
}
|
||||
|
||||
if (req.method === "GET" && url.pathname === "/v1/sessions/me") {
|
||||
if (!session) { respond(res, 401, { error: "no session token" }); return; }
|
||||
const { token, ...redacted } = session;
|
||||
respond(res, 200, { session: { ...redacted, token_prefix: token.slice(0, 8) } });
|
||||
return;
|
||||
}
|
||||
|
||||
if (req.method === "GET" && url.pathname === "/v1/sessions") {
|
||||
const all = listSessions().map(({ token, ...rest }) => ({ ...rest, token_prefix: token.slice(0, 8) }));
|
||||
respond(res, 200, { sessions: all });
|
||||
return;
|
||||
}
|
||||
|
||||
if (req.method === "GET" && url.pathname === "/v1/events") {
|
||||
if (!opts.bus) {
|
||||
respond(res, 503, { error: "event bus not initialised" });
|
||||
@@ -202,7 +318,7 @@ function makeHandler(opts: {
|
||||
respond(res, 503, { error: "broker not initialised" });
|
||||
return;
|
||||
}
|
||||
const filterMesh = url.searchParams.get("mesh") ?? undefined;
|
||||
const filterMesh = meshFromCtx(url.searchParams.get("mesh")) ?? undefined;
|
||||
try {
|
||||
// Aggregate across all attached meshes; each peer record gets a
|
||||
// `mesh` field so the caller can scope client-side. A single
|
||||
@@ -212,7 +328,7 @@ function makeHandler(opts: {
|
||||
if (filterMesh && filterMesh !== slug) continue;
|
||||
try {
|
||||
const peers = await b.listPeers();
|
||||
for (const p of peers) all.push({ ...(p as Record<string, unknown>), mesh: slug });
|
||||
for (const p of peers) all.push({ ...(p as unknown as Record<string, unknown>), mesh: slug });
|
||||
} catch (e) {
|
||||
opts.log("warn", "ipc_peers_broker_failed", { mesh: slug, err: String(e) });
|
||||
}
|
||||
@@ -224,20 +340,153 @@ function makeHandler(opts: {
|
||||
return;
|
||||
}
|
||||
|
||||
if (req.method === "GET" && url.pathname === "/v1/state") {
|
||||
if (!opts.brokers || opts.brokers.size === 0) {
|
||||
respond(res, 503, { error: "broker not initialised" });
|
||||
return;
|
||||
}
|
||||
const filterMesh = meshFromCtx(url.searchParams.get("mesh")) ?? undefined;
|
||||
const key = url.searchParams.get("key");
|
||||
try {
|
||||
if (key) {
|
||||
// Single key lookup. Walk attached meshes; first match wins
|
||||
// (or ?mesh=<slug> scopes the search).
|
||||
for (const [slug, b] of opts.brokers.entries()) {
|
||||
if (filterMesh && filterMesh !== slug) continue;
|
||||
const row = await b.getState(key).catch(() => null);
|
||||
if (row) { respond(res, 200, { state: { ...row, mesh: slug } }); return; }
|
||||
}
|
||||
respond(res, 404, { error: "state_not_found", key });
|
||||
return;
|
||||
}
|
||||
// No key — list all entries across attached meshes.
|
||||
const all: Array<Record<string, unknown> & { mesh: string }> = [];
|
||||
for (const [slug, b] of opts.brokers.entries()) {
|
||||
if (filterMesh && filterMesh !== slug) continue;
|
||||
const rows = await b.listState().catch(() => []);
|
||||
for (const r of rows) all.push({ ...(r as unknown as Record<string, unknown>), mesh: slug });
|
||||
}
|
||||
respond(res, 200, { entries: all });
|
||||
} catch (e) {
|
||||
respond(res, 502, { error: "broker_unreachable", detail: String(e) });
|
||||
}
|
||||
return;
|
||||
}
|
||||
|
||||
if (req.method === "POST" && url.pathname === "/v1/state") {
|
||||
if (!opts.brokers || opts.brokers.size === 0) {
|
||||
respond(res, 503, { error: "broker not initialised" });
|
||||
return;
|
||||
}
|
||||
try {
|
||||
const body = await readJsonBody(req, 256 * 1024) as Record<string, unknown> | null;
|
||||
if (!body || typeof body.key !== "string") {
|
||||
respond(res, 400, { error: "missing 'key' (string)" });
|
||||
return;
|
||||
}
|
||||
const requested = meshFromCtx(typeof body.mesh === "string" ? body.mesh : null);
|
||||
let chosen = requested;
|
||||
if (!chosen && opts.brokers.size === 1) chosen = opts.brokers.keys().next().value as string;
|
||||
if (!chosen) {
|
||||
respond(res, 400, { error: "mesh_required", attached: [...opts.brokers.keys()] });
|
||||
return;
|
||||
}
|
||||
const broker = opts.brokers.get(chosen);
|
||||
if (!broker) { respond(res, 404, { error: "mesh_not_attached", mesh: chosen }); return; }
|
||||
broker.setState(body.key, body.value);
|
||||
respond(res, 200, { ok: true, key: body.key, mesh: chosen });
|
||||
} catch (e) {
|
||||
respond(res, 400, { error: String(e) });
|
||||
}
|
||||
return;
|
||||
}
|
||||
|
||||
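Review note: the write routes (`POST /v1/state` here, and `POST /v1/memory` further down) resolve their target mesh the same way: explicit body value, then session default, then the sole attached mesh, otherwise a 400. The sketch below captures that as one function. `chooseMesh` and `MeshChoice` are hypothetical names; the diff repeats the logic inline in each route.

```typescript
type MeshChoice =
  | { ok: true; mesh: string }
  | { ok: false; status: 400 | 404; error: "mesh_required" | "mesh_not_attached" };

// Hypothetical helper mirroring the mesh resolution in the POST routes.
function chooseMesh(
  explicit: string | null,
  sessionMesh: string | null,
  attached: string[],
): MeshChoice {
  // Explicit, non-blank mesh first; then the session's default mesh.
  let chosen: string | null = explicit && explicit.trim() ? explicit : sessionMesh;
  // With exactly one attached mesh there is no ambiguity.
  if (!chosen && attached.length === 1) chosen = attached[0] ?? null;
  if (!chosen) return { ok: false, status: 400, error: "mesh_required" };
  if (!attached.includes(chosen)) return { ok: false, status: 404, error: "mesh_not_attached" };
  return { ok: true, mesh: chosen };
}
```

Extracting something like this would keep the two POST routes from drifting apart, though that is a suggestion rather than a requirement for this change.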
if (req.method === "GET" && url.pathname === "/v1/memory") {
|
||||
if (!opts.brokers || opts.brokers.size === 0) {
|
||||
respond(res, 503, { error: "broker not initialised" });
|
||||
return;
|
||||
}
|
||||
const query = url.searchParams.get("q") ?? "";
|
||||
const filterMesh = meshFromCtx(url.searchParams.get("mesh")) ?? undefined;
|
||||
try {
|
||||
const all: Array<Record<string, unknown> & { mesh: string }> = [];
|
||||
for (const [slug, b] of opts.brokers.entries()) {
|
||||
if (filterMesh && filterMesh !== slug) continue;
|
||||
const rows = await b.recall(query).catch(() => []);
|
||||
for (const r of rows) all.push({ ...(r as unknown as Record<string, unknown>), mesh: slug });
|
||||
}
|
||||
respond(res, 200, { matches: all });
|
||||
} catch (e) {
|
||||
respond(res, 502, { error: "broker_unreachable", detail: String(e) });
|
||||
}
|
||||
return;
|
||||
}
|
||||
|
||||
if (req.method === "POST" && url.pathname === "/v1/memory") {
|
||||
if (!opts.brokers || opts.brokers.size === 0) {
|
||||
respond(res, 503, { error: "broker not initialised" });
|
||||
return;
|
||||
}
|
||||
try {
|
||||
const body = await readJsonBody(req, 256 * 1024) as Record<string, unknown> | null;
|
||||
if (!body || typeof body.content !== "string") {
|
||||
respond(res, 400, { error: "missing 'content' (string)" });
|
||||
return;
|
||||
}
|
||||
const requested = meshFromCtx(typeof body.mesh === "string" ? body.mesh : null);
|
||||
let chosen = requested;
|
||||
if (!chosen && opts.brokers.size === 1) chosen = opts.brokers.keys().next().value as string;
|
||||
if (!chosen) {
|
||||
respond(res, 400, { error: "mesh_required", attached: [...opts.brokers.keys()] });
|
||||
return;
|
||||
}
|
||||
const broker = opts.brokers.get(chosen);
|
||||
if (!broker) { respond(res, 404, { error: "mesh_not_attached", mesh: chosen }); return; }
|
||||
const tags = Array.isArray(body.tags) ? body.tags.filter((t) => typeof t === "string") as string[] : undefined;
|
||||
const id = await broker.remember(body.content, tags);
|
||||
if (!id) { respond(res, 502, { error: "remember_timeout" }); return; }
|
||||
respond(res, 200, { id, mesh: chosen });
|
||||
} catch (e) {
|
||||
respond(res, 400, { error: String(e) });
|
||||
}
|
||||
return;
|
||||
}
|
||||
|
||||
if (req.method === "DELETE" && url.pathname.startsWith("/v1/memory/")) {
|
||||
if (!opts.brokers || opts.brokers.size === 0) {
|
||||
respond(res, 503, { error: "broker not initialised" });
|
||||
return;
|
||||
}
|
||||
const id = decodeURIComponent(url.pathname.slice("/v1/memory/".length));
|
||||
if (!id) { respond(res, 400, { error: "missing memory id" }); return; }
|
||||
const requested = url.searchParams.get("mesh");
|
||||
let chosen = requested;
|
||||
if (!chosen && opts.brokers.size === 1) chosen = opts.brokers.keys().next().value as string;
|
||||
if (!chosen) {
|
||||
respond(res, 400, { error: "mesh_required", attached: [...opts.brokers.keys()] });
|
||||
return;
|
||||
}
|
||||
const broker = opts.brokers.get(chosen);
|
||||
if (!broker) { respond(res, 404, { error: "mesh_not_attached", mesh: chosen }); return; }
|
||||
broker.forget(id);
|
||||
respond(res, 200, { ok: true, id, mesh: chosen });
|
||||
return;
|
||||
}
|
||||
|
||||
if (req.method === "GET" && url.pathname === "/v1/skills") {
|
||||
if (!opts.brokers || opts.brokers.size === 0) {
|
||||
respond(res, 503, { error: "broker not initialised" });
|
||||
return;
|
||||
}
|
||||
const query = url.searchParams.get("query") ?? undefined;
|
||||
- const filterMesh = url.searchParams.get("mesh") ?? undefined;
+ const filterMesh = meshFromCtx(url.searchParams.get("mesh")) ?? undefined;
|
||||
try {
|
||||
const all: Array<Record<string, unknown> & { mesh: string }> = [];
|
||||
for (const [slug, b] of opts.brokers.entries()) {
|
||||
if (filterMesh && filterMesh !== slug) continue;
|
||||
try {
|
||||
const skills = await b.listSkills(query);
|
||||
- for (const s of skills) all.push({ ...(s as Record<string, unknown>), mesh: slug });
+ for (const s of skills) all.push({ ...(s as unknown as Record<string, unknown>), mesh: slug });
|
||||
} catch (e) {
|
||||
opts.log("warn", "ipc_skills_broker_failed", { mesh: slug, err: String(e) });
|
||||
}
|
||||
@@ -256,7 +505,7 @@ function makeHandler(opts: {
|
||||
}
|
||||
const name = decodeURIComponent(url.pathname.slice("/v1/skills/".length));
|
||||
if (!name) { respond(res, 400, { error: "missing skill name" }); return; }
|
||||
- const filterMesh = url.searchParams.get("mesh") ?? undefined;
+ const filterMesh = meshFromCtx(url.searchParams.get("mesh")) ?? undefined;
|
||||
try {
|
||||
// First mesh that has the skill wins. With ?mesh=<slug>, only that
|
||||
// mesh is queried.
|
||||
@@ -284,7 +533,7 @@ function makeHandler(opts: {
|
||||
// present in the body or query, otherwise broadcast to all attached
|
||||
// meshes (presence is per-mesh, but most users want consistent
|
||||
// presence across all of theirs).
|
||||
- const requested = (typeof body.mesh === "string" ? body.mesh : url.searchParams.get("mesh")) || null;
+ const requested = meshFromCtx(typeof body.mesh === "string" ? body.mesh : url.searchParams.get("mesh"));
|
||||
const targets = requested
|
||||
? [opts.brokers.get(requested)].filter(Boolean) as DaemonBrokerClient[]
|
||||
: [...opts.brokers.values()];
|
||||
|
||||
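The write handlers in this file (POST /v1/state, POST /v1/memory, DELETE /v1/memory/<id>) each repeat the same mesh-resolution rule: use the requested mesh if given, default to the only attached mesh when exactly one exists, answer 400 `mesh_required` when the choice is ambiguous, and answer 404 `mesh_not_attached` when the chosen mesh has no broker. A minimal sketch of that rule as one function is shown below. `pickMesh` and `MeshChoice` are hypothetical names introduced for illustration; they do not appear in the diff, and the real handlers inline this logic.

```typescript
// Hypothetical helper: `pickMesh` / `MeshChoice` are illustrative names only.
type MeshChoice =
  | { ok: true; mesh: string }
  | { ok: false; status: 400 | 404; body: Record<string, unknown> };

function pickMesh<B>(brokers: Map<string, B>, requested: string | null): MeshChoice {
  let chosen = requested;
  // With exactly one attached mesh, an omitted `mesh` is unambiguous.
  if (!chosen && brokers.size === 1) chosen = brokers.keys().next().value as string;
  if (!chosen) {
    // Several meshes attached and none named: the caller has to pick one.
    return {
      ok: false,
      status: 400,
      body: { error: "mesh_required", attached: Array.from(brokers.keys()) },
    };
  }
  if (!brokers.has(chosen)) {
    return { ok: false, status: 404, body: { error: "mesh_not_attached", mesh: chosen } };
  }
  return { ok: true, mesh: chosen };
}
```

A handler could then branch once on `ok` and pass `status` and `body` straight to its response helper, which keeps the three endpoints from drifting apart.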
@@ -4,10 +4,12 @@ import { DAEMON_PATHS } from "./paths.js";
|
||||
import { acquireSingletonLock, releaseSingletonLock } from "./lock.js";
|
||||
import { ensureLocalToken } from "./local-token.js";
|
||||
import { startIpcServer } from "./ipc/server.js";
|
||||
import { setRegistryHooks, startReaper, type SessionInfo } from "./session-registry.js";
|
||||
import { openSqlite, type SqliteDb } from "./db/sqlite.js";
|
||||
import { migrateOutbox } from "./db/outbox.js";
|
||||
import { migrateInbox } from "./db/inbox.js";
|
||||
import { DaemonBrokerClient } from "./broker.js";
|
||||
import { SessionBrokerClient } from "./session-broker.js";
|
||||
import { startDrainWorker, type DrainHandle } from "./drain.js";
|
||||
import { handleBrokerPush } from "./inbound.js";
|
||||
import { EventBus } from "./events.js";
|
||||
@@ -153,6 +155,57 @@ export async function runDaemon(opts: RunDaemonOptions = {}): Promise<number> {
|
||||
let drain: DrainHandle | null = null;
|
||||
drain = startDrainWorker({ db: outboxDb, brokers });
|
||||
|
||||
  // 1.30.0: per-session broker presence. Always on. Older CLIs that
  // don't include `presence` material in the register body just won't
  // get a session WS; the daemon's own member-keyed broker still
  // covers them.
  const sessionBrokers = new Map<string, SessionBrokerClient>();
  setRegistryHooks({
    onRegister: (info) => {
      if (!info.presence) return;
      const meshConfig = meshConfigs.get(info.mesh);
      if (!meshConfig) {
        process.stderr.write(JSON.stringify({
          level: "warn", msg: "session_broker_no_mesh_config", mesh: info.mesh,
          ts: new Date().toISOString(),
        }) + "\n");
        return;
      }
      // Drop any pre-existing session WS under this token (re-register).
      const prior = sessionBrokers.get(info.token);
      if (prior) {
        sessionBrokers.delete(info.token);
        prior.close().catch(() => { /* ignore */ });
      }
      const client = new SessionBrokerClient({
        mesh: meshConfig,
        sessionPubkey: info.presence.sessionPubkey,
        sessionSecretKey: info.presence.sessionSecretKey,
        parentAttestation: info.presence.parentAttestation,
        sessionId: info.sessionId,
        displayName: info.displayName,
        ...(info.role ? { role: info.role } : {}),
        ...(info.cwd ? { cwd: info.cwd } : {}),
        pid: info.pid,
      });
      sessionBrokers.set(info.token, client);
      client.connect().catch((err) =>
        process.stderr.write(JSON.stringify({
          level: "warn", msg: "session_broker_connect_failed",
          mesh: info.mesh, err: String(err), ts: new Date().toISOString(),
        }) + "\n"),
      );
    },
    onDeregister: (info: SessionInfo) => {
      const client = sessionBrokers.get(info.token);
      if (!client) return;
      sessionBrokers.delete(info.token);
      client.close().catch(() => { /* ignore */ });
    },
  });
|
||||
|
||||
startReaper();
|
||||
|
||||
const ipc = startIpcServer({
|
||||
localToken,
|
||||
tcpEnabled,
|
||||
@@ -191,6 +244,10 @@ export async function runDaemon(opts: RunDaemonOptions = {}): Promise<number> {
|
||||
for (const b of brokers.values()) {
|
||||
try { await b.close(); } catch { /* ignore */ }
|
||||
}
|
||||
for (const b of sessionBrokers.values()) {
|
||||
try { await b.close(); } catch { /* ignore */ }
|
||||
}
|
||||
sessionBrokers.clear();
|
||||
await ipc.close();
|
||||
try { outboxDb.close(); } catch { /* ignore */ }
|
||||
try { inboxDb.close(); } catch { /* ignore */ }
|
||||
|
||||
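The daemon changes above tie one broker client to each session token: registering under an existing token closes and replaces the earlier client, deregistering closes and forgets it, and shutdown closes everything and clears the map. The sketch below isolates that lifecycle in a small generic container. `Closable` and `ClientMap` are hypothetical names used only for illustration; the diff keeps a plain `Map<string, SessionBrokerClient>` and does the same work inline in the registry hooks.

```typescript
// Illustrative only: `Closable` and `ClientMap` are hypothetical, not part of the diff.
interface Closable {
  close(): Promise<void>;
}

class ClientMap<C extends Closable> {
  private readonly clients = new Map<string, C>();

  get size(): number {
    return this.clients.size;
  }

  /** Store `client` under `token`, closing whatever was there before (re-register). */
  put(token: string, client: C): void {
    const prior = this.clients.get(token);
    if (prior) {
      this.clients.delete(token);
      prior.close().catch(() => { /* ignore */ });
    }
    this.clients.set(token, client);
  }

  /** Close and forget the client for `token`; returns false if none was stored. */
  remove(token: string): boolean {
    const client = this.clients.get(token);
    if (!client) return false;
    this.clients.delete(token);
    client.close().catch(() => { /* ignore */ });
    return true;
  }

  /** Shutdown path: close every client, swallowing errors, then clear the map. */
  async closeAll(): Promise<void> {
    for (const client of Array.from(this.clients.values())) {
      try { await client.close(); } catch { /* ignore */ }
    }
    this.clients.clear();
  }
}
```

Removing the entry from the map before calling `close()` mirrors the order used in the hooks, so a late close callback can never find a stale client still registered under the token.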
205
apps/cli/src/daemon/session-broker.ts
Normal file
@@ -0,0 +1,205 @@
|
||||
/**
 * Per-launch session broker WebSocket.
 *
 * Owned by the daemon, one per registered session. Holds a long-lived
 * presence row on the broker keyed on the session's ephemeral pubkey
 * (rather than the parent member's stable pubkey). Sibling sessions
 * (two `claudemesh launch` runs in the same cwd) finally see each other
 * in `peer list` because their presence rows coexist instead of fighting
 * over the same memberPubkey snapshot.
 *
 * Differences from `DaemonBrokerClient`:
 *   - Uses session_hello (1.30.0+ broker), with a parent-vouched
 *     attestation provided at construction time.
 *   - Does NOT drain the outbox; that stays the parent member-keyed
 *     DaemonBrokerClient's job. Keeps the responsibility split clean
 *     and avoids two clients fighting over the same outbox row.
 *   - Does NOT carry list_peers / state / memory RPCs. This client is
 *     presence-only (and inbound DM delivery for messages targeted at
 *     the session pubkey).
 *
 * Old brokers reply with `unknown_message_type` on session_hello; we
 * surface that as a one-shot `error` event and the daemon decides
 * whether to fall back. For 1.30.0 we just log + retry; the broker is
 * expected to be deployed first.
 *
 * Spec: .artifacts/specs/2026-05-04-per-session-presence.md.
 */
|
||||
|
||||
import { hostname as osHostname } from "node:os";
|
||||
import WebSocket from "ws";
|
||||
|
||||
import type { JoinedMesh } from "~/services/config/facade.js";
|
||||
import { signSessionHello } from "~/services/broker/session-hello-sig.js";
|
||||
|
||||
export type SessionBrokerStatus = "connecting" | "open" | "closed" | "reconnecting";
|
||||
|
||||
export interface ParentAttestation {
|
||||
sessionPubkey: string;
|
||||
parentMemberPubkey: string;
|
||||
/** Unix ms. Broker rejects > now+24h or already past. */
|
||||
expiresAt: number;
|
||||
signature: string;
|
||||
}
|
||||
|
||||
export interface SessionBrokerOptions {
|
||||
mesh: JoinedMesh;
|
||||
/** Per-launch ephemeral keypair. */
|
||||
sessionPubkey: string;
|
||||
sessionSecretKey: string;
|
||||
/** Parent-vouched attestation, signed by mesh.secretKey at launch time. */
|
||||
parentAttestation: ParentAttestation;
|
||||
/** Stable session_id from the launch (used for dedup on the broker). */
|
||||
sessionId: string;
|
||||
/** Display name override for this session. */
|
||||
displayName?: string;
|
||||
/** Initial groups. Format mirrors the regular hello. */
|
||||
groups?: Array<{ name: string; role?: string }>;
|
||||
/** Role tag (informational, not auth-bearing). */
|
||||
role?: string;
|
||||
/** Working directory (informational, surfaced in peer list). */
|
||||
cwd?: string;
|
||||
/** Pid of the launched session (NOT the daemon). */
|
||||
pid: number;
|
||||
onStatusChange?: (s: SessionBrokerStatus) => void;
|
||||
log?: (level: "info" | "warn" | "error", msg: string, meta?: Record<string, unknown>) => void;
|
||||
}
|
||||
|
||||
const HELLO_ACK_TIMEOUT_MS = 5_000;
|
||||
const BACKOFF_CAPS_MS = [1_000, 2_000, 4_000, 8_000, 16_000, 30_000];
|
||||
|
||||
export class SessionBrokerClient {
|
||||
private ws: WebSocket | null = null;
|
||||
private _status: SessionBrokerStatus = "closed";
|
||||
private closed = false;
|
||||
private reconnectAttempt = 0;
|
||||
private reconnectTimer: NodeJS.Timeout | null = null;
|
||||
private helloTimer: NodeJS.Timeout | null = null;
|
||||
|
||||
constructor(private opts: SessionBrokerOptions) {}
|
||||
|
||||
get status(): SessionBrokerStatus { return this._status; }
|
||||
get meshSlug(): string { return this.opts.mesh.slug; }
|
||||
get sessionPubkey(): string { return this.opts.sessionPubkey; }
|
||||
|
||||
private log = (level: "info" | "warn" | "error", msg: string, meta?: Record<string, unknown>) => {
|
||||
(this.opts.log ?? defaultLog)(level, msg, {
|
||||
mesh: this.opts.mesh.slug,
|
||||
session_pubkey: this.opts.sessionPubkey.slice(0, 12),
|
||||
...meta,
|
||||
});
|
||||
};
|
||||
|
||||
private setStatus(s: SessionBrokerStatus) {
|
||||
if (this._status === s) return;
|
||||
this._status = s;
|
||||
this.opts.onStatusChange?.(s);
|
||||
}
|
||||
|
||||
/** Open the WS, run session_hello, resolve once the broker accepts. */
|
||||
async connect(): Promise<void> {
|
||||
if (this.closed) throw new Error("client_closed");
|
||||
if (this._status === "connecting" || this._status === "open") return;
|
||||
this.setStatus("connecting");
|
||||
|
||||
const ws = new WebSocket(this.opts.mesh.brokerUrl);
|
||||
this.ws = ws;
|
||||
|
||||
return new Promise<void>((resolve, reject) => {
|
||||
ws.on("open", async () => {
|
||||
try {
|
||||
const { timestamp, signature } = await signSessionHello({
|
||||
meshId: this.opts.mesh.meshId,
|
||||
parentMemberPubkey: this.opts.mesh.pubkey,
|
||||
sessionPubkey: this.opts.sessionPubkey,
|
||||
sessionSecretKey: this.opts.sessionSecretKey,
|
||||
});
|
||||
ws.send(JSON.stringify({
|
||||
type: "session_hello",
|
||||
meshId: this.opts.mesh.meshId,
|
||||
parentMemberId: this.opts.mesh.memberId,
|
||||
parentMemberPubkey: this.opts.mesh.pubkey,
|
||||
sessionPubkey: this.opts.sessionPubkey,
|
||||
parentAttestation: this.opts.parentAttestation,
|
||||
displayName: this.opts.displayName,
|
||||
sessionId: this.opts.sessionId,
|
||||
pid: this.opts.pid,
|
||||
cwd: this.opts.cwd ?? process.cwd(),
|
||||
hostname: osHostname(),
|
||||
peerType: "ai" as const,
|
||||
channel: "claudemesh-session",
|
||||
...(this.opts.groups && this.opts.groups.length > 0 ? { groups: this.opts.groups } : {}),
|
||||
...(this.opts.role ? { role: this.opts.role } : {}),
|
||||
timestamp,
|
||||
signature,
|
||||
}));
|
||||
this.helloTimer = setTimeout(() => {
|
||||
this.log("warn", "session_hello_ack_timeout");
|
||||
try { ws.close(); } catch { /* ignore */ }
|
||||
reject(new Error("session_hello_ack_timeout"));
|
||||
}, HELLO_ACK_TIMEOUT_MS);
|
||||
} catch (e) {
|
||||
reject(e instanceof Error ? e : new Error(String(e)));
|
||||
}
|
||||
});
|
||||
|
||||
ws.on("message", (raw) => {
|
||||
let msg: Record<string, unknown>;
|
||||
try { msg = JSON.parse(raw.toString()) as Record<string, unknown>; }
|
||||
catch { return; }
|
||||
|
||||
if (msg.type === "hello_ack") {
|
||||
if (this.helloTimer) clearTimeout(this.helloTimer);
|
||||
this.helloTimer = null;
|
||||
this.setStatus("open");
|
||||
this.reconnectAttempt = 0;
|
||||
resolve();
|
||||
return;
|
||||
}
|
||||
|
||||
if (msg.type === "error") {
|
||||
// Older brokers respond with `unknown_message_type` to session_hello;
|
||||
// surface that so the daemon can decide to skip per-session presence
|
||||
// rather than churn through reconnects.
|
||||
this.log("warn", "broker_error", { code: msg.code, message: msg.message });
|
||||
if (msg.code === "unknown_message_type") {
|
||||
this.closed = true;
|
||||
}
|
||||
return;
|
||||
}
|
||||
// push / inbound — presence-only client ignores them; the daemon's
|
||||
// member-keyed client handles all DM decryption.
|
||||
});
|
||||
|
||||
      ws.on("close", (code, reason) => {
        if (this.helloTimer) { clearTimeout(this.helloTimer); this.helloTimer = null; }
        if (this.closed) { this.setStatus("closed"); return; }
        // Capture this before setStatus() below overwrites it; checking
        // `_status` afterwards would always see "reconnecting", so the
        // pending connect() promise would never be rejected.
        const closedBeforeHello = this._status === "connecting";
        this.setStatus("reconnecting");
        const wait = BACKOFF_CAPS_MS[Math.min(this.reconnectAttempt, BACKOFF_CAPS_MS.length - 1)] ?? 30_000;
        this.reconnectAttempt++;
        this.log("info", "session_broker_reconnect_scheduled", { wait_ms: wait, code, reason: reason.toString("utf8") });
        this.reconnectTimer = setTimeout(
          () => this.connect().catch((err) => this.log("warn", "session_broker_reconnect_failed", { err: String(err) })),
          wait,
        );
        if (closedBeforeHello) reject(new Error(`closed_before_hello_${code}`));
      });
|
||||
|
||||
ws.on("error", (err) => this.log("warn", "session_broker_ws_error", { err: err.message }));
|
||||
});
|
||||
}
|
||||
|
||||
async close(): Promise<void> {
|
||||
this.closed = true;
|
||||
if (this.reconnectTimer) { clearTimeout(this.reconnectTimer); this.reconnectTimer = null; }
|
||||
if (this.helloTimer) { clearTimeout(this.helloTimer); this.helloTimer = null; }
|
||||
try { this.ws?.close(); } catch { /* ignore */ }
|
||||
this.setStatus("closed");
|
||||
}
|
||||
}
|
||||
|
||||
function defaultLog(level: "info" | "warn" | "error", msg: string, meta?: Record<string, unknown>) {
|
||||
const line = JSON.stringify({ level, msg, ...meta, ts: new Date().toISOString() });
|
||||
if (level === "info") process.stdout.write(line + "\n");
|
||||
else process.stderr.write(line + "\n");
|
||||
}
|
||||
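The reconnect logic in `SessionBrokerClient` indexes a fixed table of delays by attempt number and clamps at the last entry, so the wait grows from 1 s to 30 s and then stays there; a successful `hello_ack` resets the attempt counter to 0. The sketch below reproduces that lookup in isolation. The table values are copied from the diff, while `backoffDelayMs` is a hypothetical helper name (the client computes the delay inline in its `close` handler).

```typescript
// Same schedule as the diff's BACKOFF_CAPS_MS; `backoffDelayMs` is a hypothetical helper.
const BACKOFF_CAPS_MS: readonly number[] = [1_000, 2_000, 4_000, 8_000, 16_000, 30_000];

/** Delay before reconnect attempt number `attempt` (0-based), clamped to the last cap. */
function backoffDelayMs(attempt: number): number {
  const idx = Math.min(attempt, BACKOFF_CAPS_MS.length - 1);
  return BACKOFF_CAPS_MS[idx] ?? 30_000;
}

// Delays for the first eight consecutive failures.
const schedule = Array.from({ length: 8 }, (_, attempt) => backoffDelayMs(attempt));
console.log(schedule.join(", "));
// prints "1000, 2000, 4000, 8000, 16000, 30000, 30000, 30000"
```

The table carries no jitter, so many sessions that lose the same broker at once will retry in lockstep; whether that matters depends on how many sessions a single daemon is expected to hold.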
146
apps/cli/src/daemon/session-registry.ts
Normal file
@@ -0,0 +1,146 @@
|
||||
/**
 * In-memory per-token session registry kept by the daemon.
 *
 * `claudemesh launch` POSTs `/v1/sessions/register` with the token it
 * minted plus session metadata (sessionId, mesh, displayName, pid,
 * cwd, role, groups). Subsequent CLI invocations from inside that
 * session present the token via `Authorization: ClaudeMesh-Session
 * <hex>` and the daemon's IPC auth middleware resolves it here in O(1).
 *
 * Lifecycle:
 *   - register replaces any prior entry under the same `sessionId`
 *     (handles re-launch and `--resume` flows cleanly).
 *   - reaper polls every 30 s and drops entries whose pid is dead.
 *   - hard ttl ceiling of 24 h is a leak guard for forgotten sessions.
 *
 * Persistence: in-memory only for v1. A daemon restart clears the
 * registry, so every launched session needs to re-register. That's fine
 * for now because launch.ts re-registers on `ensureDaemonRunning`'s
 * success path, and most ad-hoc CLI invocations from outside a launched
 * session have no token to begin with.
 */

/**
 * Optional per-launch presence material. Carried opaquely through the
 * registry; the daemon's session-broker subsystem (1.30.0+) reads it to
 * open a long-lived broker WebSocket per session. Absent on older CLIs;
 * register accepts payloads without it for backward compat.
 */
|
||||
export interface SessionPresence {
|
||||
/** Hex ed25519 pubkey, 64 chars. */
|
||||
sessionPubkey: string;
|
||||
/** Hex ed25519 secret key (held in-memory only; never disk). */
|
||||
sessionSecretKey: string;
|
||||
/** Parent-member-signed attestation; see signParentAttestation. */
|
||||
parentAttestation: {
|
||||
sessionPubkey: string;
|
||||
parentMemberPubkey: string;
|
||||
expiresAt: number;
|
||||
signature: string;
|
||||
};
|
||||
}
|
||||
|
||||
export interface SessionInfo {
|
||||
token: string;
|
||||
sessionId: string;
|
||||
mesh: string;
|
||||
displayName: string;
|
||||
pid: number;
|
||||
cwd?: string;
|
||||
role?: string;
|
||||
groups?: string[];
|
||||
/** 1.30.0+: per-launch presence material. */
|
||||
presence?: SessionPresence;
|
||||
registeredAt: number;
|
||||
}
|
||||
|
||||
/** Lifecycle callbacks invoked synchronously after registry mutation. */
|
||||
export interface RegistryHooks {
|
||||
onRegister?: (info: SessionInfo) => void;
|
||||
onDeregister?: (info: SessionInfo) => void;
|
||||
}
|
||||
|
||||
const TTL_MS = 24 * 60 * 60 * 1000;
|
||||
const REAPER_INTERVAL_MS = 30 * 1000;
|
||||
|
||||
const byToken = new Map<string, SessionInfo>();
|
||||
const bySessionId = new Map<string, string>();
|
||||
const hooks: RegistryHooks = {};
|
||||
|
||||
let reaperHandle: NodeJS.Timeout | null = null;
|
||||
|
||||
export function startReaper(): void {
|
||||
if (reaperHandle) return;
|
||||
reaperHandle = setInterval(reapDead, REAPER_INTERVAL_MS).unref?.() ?? reaperHandle;
|
||||
}
|
||||
|
||||
export function stopReaper(): void {
|
||||
if (reaperHandle) { clearInterval(reaperHandle); reaperHandle = null; }
|
||||
}
|
||||
|
||||
/**
|
||||
* Wire daemon-level lifecycle hooks. Called once at daemon boot — passing
|
||||
* `{}` clears them. Idempotent across calls so tests can re-bind.
|
||||
*/
|
||||
export function setRegistryHooks(next: RegistryHooks): void {
|
||||
hooks.onRegister = next.onRegister;
|
||||
hooks.onDeregister = next.onDeregister;
|
||||
}
|
||||
|
||||
export function registerSession(info: Omit<SessionInfo, "registeredAt">): SessionInfo {
|
||||
// Replace any prior entry under the same sessionId.
|
||||
const priorToken = bySessionId.get(info.sessionId);
|
||||
if (priorToken && priorToken !== info.token) {
|
||||
const prior = byToken.get(priorToken);
|
||||
if (prior) {
|
||||
byToken.delete(priorToken);
|
||||
try { hooks.onDeregister?.(prior); } catch { /* hook errors must never break the registry */ }
|
||||
}
|
||||
}
|
||||
|
||||
const stored: SessionInfo = { ...info, registeredAt: Date.now() };
|
||||
byToken.set(info.token, stored);
|
||||
bySessionId.set(info.sessionId, info.token);
|
||||
try { hooks.onRegister?.(stored); } catch { /* see above */ }
|
||||
return stored;
|
||||
}
|
||||
|
||||
export function deregisterByToken(token: string): boolean {
|
||||
const entry = byToken.get(token);
|
||||
if (!entry) return false;
|
||||
byToken.delete(token);
|
||||
if (bySessionId.get(entry.sessionId) === token) bySessionId.delete(entry.sessionId);
|
||||
try { hooks.onDeregister?.(entry); } catch { /* see above */ }
|
||||
return true;
|
||||
}
|
||||
|
||||
export function resolveToken(token: string): SessionInfo | null {
|
||||
const entry = byToken.get(token);
|
||||
if (!entry) return null;
|
||||
if (Date.now() - entry.registeredAt > TTL_MS) {
|
||||
deregisterByToken(token);
|
||||
return null;
|
||||
}
|
||||
return entry;
|
||||
}
|
||||
|
||||
export function listSessions(): SessionInfo[] {
|
||||
return [...byToken.values()];
|
||||
}
|
||||
|
||||
function reapDead(): void {
|
||||
const dead: string[] = [];
|
||||
for (const [token, info] of byToken.entries()) {
|
||||
if (Date.now() - info.registeredAt > TTL_MS) { dead.push(token); continue; }
|
||||
try { process.kill(info.pid, 0); } catch { dead.push(token); }
|
||||
}
|
||||
for (const t of dead) deregisterByToken(t);
|
||||
}
|
||||
|
||||
/** Test helper. */
|
||||
export function _resetRegistry(): void {
|
||||
byToken.clear();
|
||||
bySessionId.clear();
|
||||
hooks.onRegister = undefined;
|
||||
hooks.onDeregister = undefined;
|
||||
}
|
||||
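The registry above keeps two indexes (token to entry, sessionId to token), replaces the older token when the same `sessionId` registers again, and expires entries either through the 24 h TTL or through the reaper's pid check. The sketch below is a reduced model of that lifecycle. `MiniRegistry`, `Entry` and the injected `isAlive` probe are hypothetical and exist only for illustration; the real module uses module-level maps, fires hooks on every mutation, and probes liveness with `process.kill(pid, 0)`.

```typescript
// Illustrative sketch: `MiniRegistry`, `Entry` and `isAlive` are hypothetical names.
interface Entry {
  token: string;
  sessionId: string;
  pid: number;
  registeredAt: number;
}

const TTL_MS = 24 * 60 * 60 * 1000;

class MiniRegistry {
  private readonly byToken = new Map<string, Entry>();
  private readonly bySessionId = new Map<string, string>();

  register(token: string, sessionId: string, pid: number, now: number): Entry {
    // A re-launch under the same sessionId invalidates the older token.
    const priorToken = this.bySessionId.get(sessionId);
    if (priorToken !== undefined && priorToken !== token) this.byToken.delete(priorToken);
    const entry: Entry = { token, sessionId, pid, registeredAt: now };
    this.byToken.set(token, entry);
    this.bySessionId.set(sessionId, token);
    return entry;
  }

  deregister(token: string): boolean {
    const entry = this.byToken.get(token);
    if (!entry) return false;
    this.byToken.delete(token);
    // Only drop the sessionId index if it still points at this token.
    if (this.bySessionId.get(entry.sessionId) === token) this.bySessionId.delete(entry.sessionId);
    return true;
  }

  resolve(token: string, now: number): Entry | null {
    const entry = this.byToken.get(token);
    if (!entry) return null;
    if (now - entry.registeredAt > TTL_MS) { this.deregister(token); return null; }
    return entry;
  }

  /** Collect first, delete afterwards, so the map is not mutated while iterating. */
  reap(now: number, isAlive: (pid: number) => boolean): string[] {
    const dead: string[] = [];
    for (const [token, entry] of Array.from(this.byToken.entries())) {
      if (now - entry.registeredAt > TTL_MS || !isAlive(entry.pid)) dead.push(token);
    }
    for (const token of dead) this.deregister(token);
    return dead;
  }
}
```

Passing the clock and the liveness probe as arguments is a choice made for this sketch so the behaviour can be exercised without real processes or timers; it is not how the module in the diff is structured.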
@@ -9,6 +9,7 @@ import { renderVersion } from "~/cli/output/version.js";
|
||||
import { isInviteUrl, normaliseInviteUrl } from "~/utils/url.js";
|
||||
import { classifyInvocation } from "~/cli/policy-classify.js";
|
||||
import { gate, type ApprovalMode } from "~/services/policy/index.js";
|
||||
import { setDaemonPolicy, policyFromFlags } from "~/services/daemon/policy.js";
|
||||
import { bold, clay, cyan, dim, orange } from "~/ui/styles.js";
|
||||
|
||||
installSignalHandlers();
|
||||
@@ -16,6 +17,11 @@ installErrorHandlers();
|
||||
|
||||
const { command, positionals, flags } = parseArgv(process.argv);
|
||||
|
||||
// Resolve daemon policy once at boot — daemon-routing helpers read this
|
||||
// instead of inspecting flags themselves. --no-daemon and --strict are
|
||||
// mutually exclusive (--no-daemon wins if both are passed).
|
||||
setDaemonPolicy(policyFromFlags(flags));
|
||||
|
||||
/**
|
||||
* Resolve the coarse approval mode from CLI flags + env.
|
||||
* --approval-mode <plan|read-only|write|yolo> explicit
|
||||
@@ -67,7 +73,7 @@ USAGE
|
||||
claudemesh <invite-url> join a mesh, then launch
|
||||
claudemesh launch --name <n> --join <url> join + launch in one step
|
||||
|
||||
- Mesh
+ Mesh (alias: "workspace" — claudemesh workspace <verb> mirrors each)
|
||||
claudemesh create <name> create a new mesh
|
||||
claudemesh join <url> join a mesh (accepts short /i/ or long /join/ link)
|
||||
claudemesh launch [slug] launch Claude Code on a mesh (alias: connect)
|
||||
@@ -210,6 +216,8 @@ Flags
|
||||
--policy <path> override policy file
|
||||
-y, --yes skip confirmations (= --approval-mode yolo)
|
||||
-q, --quiet suppress non-essential output
|
||||
--strict require daemon for broker-touching verbs (no cold-path fallback)
|
||||
--no-daemon skip daemon entirely; open broker WS directly (CI / sandboxed scripts)
|
||||
`;
|
||||
|
||||
/**
|
||||
@@ -283,6 +291,12 @@ async function main(): Promise<void> {
|
||||
join: normaliseInviteUrl(command),
|
||||
yes: !!flags.y || !!flags.yes,
|
||||
resume: flags.resume as string | undefined,
|
||||
role: flags.role as string | undefined,
|
||||
groups: flags.groups as string | undefined,
|
||||
"message-mode": flags["message-mode"] as string | undefined,
|
||||
"system-prompt": flags["system-prompt"] as string | undefined,
|
||||
continue: !!flags.continue,
|
||||
quiet: !!flags.quiet,
|
||||
}, process.argv.slice(2));
|
||||
return;
|
||||
}
|
||||
@@ -298,6 +312,12 @@ async function main(): Promise<void> {
|
||||
name: flags.name as string | undefined,
|
||||
yes: !!flags.y || !!flags.yes,
|
||||
resume: flags.resume as string | undefined,
|
||||
role: flags.role as string | undefined,
|
||||
groups: flags.groups as string | undefined,
|
||||
"message-mode": flags["message-mode"] as string | undefined,
|
||||
"system-prompt": flags["system-prompt"] as string | undefined,
|
||||
continue: !!flags.continue,
|
||||
quiet: !!flags.quiet,
|
||||
}, process.argv.slice(2));
|
||||
return;
|
||||
}
|
||||
@@ -316,6 +336,12 @@ async function main(): Promise<void> {
|
||||
join: flags.join as string,
|
||||
yes: !!flags.y || !!flags.yes,
|
||||
resume: flags.resume as string,
|
||||
role: flags.role as string,
|
||||
groups: flags.groups as string,
|
||||
"message-mode": flags["message-mode"] as string,
|
||||
"system-prompt": flags["system-prompt"] as string,
|
||||
continue: !!flags.continue,
|
||||
quiet: !!flags.quiet,
|
||||
}, process.argv.slice(2));
|
||||
break;
|
||||
}
|
||||
@@ -324,6 +350,37 @@ async function main(): Promise<void> {
|
||||
case "delete": case "rm": { const { deleteMesh } = await import("~/commands/delete-mesh.js"); process.exit(await deleteMesh(positionals[0] ?? "", { yes: !!flags.y || !!flags.yes })); break; }
|
||||
case "rename": { const { rename } = await import("~/commands/rename.js"); process.exit(await rename(positionals[0] ?? "", positionals[1] ?? "")); break; }
|
||||
case "share": case "invite": { const { invite } = await import("~/commands/invite.js"); process.exit(await invite(positionals[0], { mesh: flags.mesh as string, json: !!flags.json })); break; }
|
||||
// workspace — alias surface for mesh-management verbs (v1.27.0 teaser; full
|
||||
// rename arrives in 1.28.0). Each sub mirrors an existing top-level verb.
|
||||
case "workspace": {
|
||||
const sub = positionals[0];
|
||||
if (!sub || sub === "launch" || sub === "connect" || sub === "open") {
|
||||
const { runLaunch } = await import("~/commands/launch.js");
|
||||
await runLaunch({
|
||||
mesh: positionals[1] ?? flags.mesh as string,
|
||||
name: flags.name as string,
|
||||
join: flags.join as string,
|
||||
yes: !!flags.y || !!flags.yes,
|
||||
resume: flags.resume as string,
|
||||
role: flags.role as string,
|
||||
groups: flags.groups as string,
|
||||
"message-mode": flags["message-mode"] as string,
|
||||
"system-prompt": flags["system-prompt"] as string,
|
||||
continue: !!flags.continue,
|
||||
quiet: !!flags.quiet,
|
||||
}, process.argv.slice(2));
|
||||
}
|
||||
else if (sub === "list" || sub === "ls") { const { runList } = await import("~/commands/list.js"); await runList(); }
|
||||
else if (sub === "info") { const { runInfo } = await import("~/commands/info.js"); await runInfo({}); }
|
||||
else if (sub === "create" || sub === "new") { const { newMesh } = await import("~/commands/new.js"); process.exit(await newMesh(positionals[1] ?? "", { json: !!flags.json })); }
|
||||
else if (sub === "join" || sub === "add") { const { runJoin } = await import("~/commands/join.js"); await runJoin(positionals.slice(1)); }
|
||||
else if (sub === "delete" || sub === "rm") { const { deleteMesh } = await import("~/commands/delete-mesh.js"); process.exit(await deleteMesh(positionals[1] ?? "", { yes: !!flags.y || !!flags.yes })); }
|
||||
else if (sub === "rename") { const { rename } = await import("~/commands/rename.js"); process.exit(await rename(positionals[1] ?? "", positionals[2] ?? "")); }
|
||||
else if (sub === "share" || sub === "invite") { const { invite } = await import("~/commands/invite.js"); process.exit(await invite(positionals[1], { mesh: flags.mesh as string, json: !!flags.json })); }
|
||||
else if (sub === "overview") { const { runMe } = await import("~/commands/me.js"); process.exit(await runMe({ mesh: flags.mesh as string, json: !!flags.json })); }
|
||||
else { console.error("Usage: claudemesh workspace <list|info|create|join|delete|rename|share|launch|overview>"); process.exit(EXIT.INVALID_ARGS); }
|
||||
break;
|
||||
}
|
||||
case "disconnect": { const { runDisconnect } = await import("~/commands/kick.js"); process.exit(await runDisconnect(positionals[0], { mesh: flags.mesh as string, stale: flags.stale as string, all: !!flags.all })); break; }
|
||||
case "kick": { const { runKick } = await import("~/commands/kick.js"); process.exit(await runKick(positionals[0], { mesh: flags.mesh as string, stale: flags.stale as string, all: !!flags.all })); break; }
|
||||
case "ban": { const { runBan } = await import("~/commands/ban.js"); process.exit(await runBan(positionals[0], { mesh: flags.mesh as string })); break; }
|
||||
|
||||
@@ -125,7 +125,7 @@ function subscribeEvents(onEvent: (e: DaemonEvent) => void): { close: () => void
|
||||
}
|
||||
if (!dataLine) continue;
|
||||
try {
|
||||
- const parsed = JSON.parse(dataLine);
+ const parsed = JSON.parse(dataLine) as Record<string, unknown>;
|
||||
onEvent({ kind, ts: String(parsed.ts ?? ""), data: parsed });
|
||||
} catch { /* malformed event; skip */ }
|
||||
}
|
||||
|
||||
@@ -1,114 +0,0 @@
|
||||
/**
|
||||
* Bridge client — CLI invocations dial the per-mesh Unix socket the
|
||||
* MCP push-pipe holds open, so they reuse its warm WS instead of opening
|
||||
* a fresh one (~5ms vs ~300-700ms).
|
||||
*
|
||||
* Usage from a command:
|
||||
*
|
||||
* const result = await tryBridge(meshSlug, "send", { to, message });
|
||||
* if (result === null) { ...fall through to cold withMesh()... }
|
||||
* else { ...warm path succeeded... }
|
||||
*
|
||||
* `tryBridge` returns null on:
|
||||
* - socket file absent (no push-pipe running)
|
||||
* - socket connect fails (push-pipe crashed without cleanup)
|
||||
* - bridge timeout
|
||||
* That null is the caller's signal to fall back to a cold WS connection
|
||||
* via `withMesh`. So the bridge is purely an optimization — every verb
|
||||
* still works without it.
|
||||
*/
|
||||
|
||||
import { createConnection } from "node:net";
|
||||
import { existsSync } from "node:fs";
|
||||
import { randomUUID } from "node:crypto";
|
||||
import {
|
||||
socketPath,
|
||||
frame,
|
||||
LineParser,
|
||||
type BridgeRequest,
|
||||
type BridgeResponse,
|
||||
type BridgeVerb,
|
||||
} from "./protocol.js";
|
||||
|
||||
const DEFAULT_TIMEOUT_MS = 5_000;
|
||||
|
||||
/**
|
||||
* Send one request and await the matching response. Returns:
|
||||
* - { ok: true, result } on success
|
||||
* - { ok: false, error } on bridge-reachable-but-broker-error
|
||||
* - null on bridge-unreachable (caller should fall back to cold WS)
|
||||
*/
|
||||
export async function tryBridge(
|
||||
meshSlug: string,
|
||||
verb: BridgeVerb,
|
||||
args: Record<string, unknown> = {},
|
||||
timeoutMs: number = DEFAULT_TIMEOUT_MS,
|
||||
): Promise<{ ok: true; result: unknown } | { ok: false; error: string } | null> {
|
||||
const path = socketPath(meshSlug);
|
||||
if (!existsSync(path)) return null;
|
||||
|
||||
return new Promise((resolve) => {
|
||||
const id = randomUUID();
|
||||
const req: BridgeRequest = { id, verb, args };
|
||||
const parser = new LineParser();
|
||||
let settled = false;
|
||||
|
||||
const finish = (
|
||||
value: { ok: true; result: unknown } | { ok: false; error: string } | null,
|
||||
): void => {
|
||||
if (settled) return;
|
||||
settled = true;
|
||||
try { socket.destroy(); } catch {}
|
||||
clearTimeout(timer);
|
||||
resolve(value);
|
||||
};
|
||||
|
||||
const socket = createConnection({ path });
|
||||
|
||||
const timer = setTimeout(() => {
|
||||
finish(null); // timeout = bridge unreachable, fall back to cold path
|
||||
}, timeoutMs);
|
||||
|
||||
socket.on("connect", () => {
|
||||
try {
|
||||
socket.write(frame(req));
|
||||
} catch {
|
||||
finish(null);
|
||||
}
|
||||
});
|
||||
|
||||
socket.on("data", (chunk) => {
|
||||
const lines = parser.feed(chunk);
|
||||
for (const line of lines) {
|
||||
if (!line.trim()) continue;
|
||||
let res: BridgeResponse;
|
||||
try {
|
||||
res = JSON.parse(line) as BridgeResponse;
|
||||
} catch {
|
||||
continue;
|
||||
}
|
||||
if (res.id !== id) continue; // not our response — keep reading
|
||||
if (res.ok) finish({ ok: true, result: res.result });
|
||||
else finish({ ok: false, error: res.error });
|
||||
return;
|
||||
}
|
||||
});
|
||||
|
||||
socket.on("error", (err) => {
|
||||
// ENOENT (file disappeared between existsSync and connect),
|
||||
// ECONNREFUSED (stale socket), EPERM (permission), etc. — all mean
|
||||
// bridge unreachable.
|
||||
const code = (err as NodeJS.ErrnoException).code;
|
||||
if (code === "ECONNREFUSED" || code === "ENOENT" || code === "EPERM") {
|
||||
finish(null);
|
||||
} else {
|
||||
finish(null);
|
||||
}
|
||||
});
|
||||
|
||||
socket.on("close", () => {
|
||||
// If we close without a response, treat as unreachable.
|
||||
finish(null);
|
||||
});
|
||||
});
|
||||
}
|
||||
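The three-valued contract documented in the removed bridge client (a success object, an error object, or `null` meaning "unreachable, use the cold path") can be sketched as a small dispatcher. This is an illustration only: `routeResult`, `BridgeOutcome`, and `Route` are hypothetical names that do not appear in the codebase.

```typescript
// Hypothetical sketch: interpret a tryBridge-style three-valued result.
type BridgeOutcome =
  | { ok: true; result: unknown }
  | { ok: false; error: string }
  | null;

type Route =
  | { path: "warm"; result: unknown }
  | { path: "warm-error"; error: string }
  | { path: "cold" };

function routeResult(outcome: BridgeOutcome): Route {
  // null means the warm transport was unreachable: fall back to a cold connection.
  if (outcome === null) return { path: "cold" };
  // The transport answered and the broker accepted the request.
  if (outcome.ok) return { path: "warm", result: outcome.result };
  // The transport answered but the broker reported an error: do not retry cold.
  return { path: "warm-error", error: outcome.error };
}
```

Keeping "unreachable" (`null`) distinct from "reachable but failed" (`ok: false`) is what lets a caller fall back only when a retry on another path could actually help.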
@@ -1,19 +1,48 @@
|
||||
// Try forwarding a send through the local daemon's IPC. Returns null if
|
||||
// the daemon isn't running or the daemon's mesh doesn't match the target
|
||||
// mesh — the caller falls back to the bridge or cold path.
|
||||
|
||||
import { existsSync } from "node:fs";
|
||||
// Daemon-routed CLI helpers. Each helper returns null (or false, for the
// boolean-returning helpers) when the daemon is unreachable AND auto-spawn
// could not bring it up; the caller is expected to fall back to its
// cold-path WS or to error out under `--strict`.
|
||||
//
|
||||
// Auto-recovery: when the daemon socket is missing or stale, every
|
||||
// helper here calls into the lifecycle module which probes, spawns
|
||||
// (under a lock), polls, and retries — so cold-path fallback only
|
||||
// fires if auto-spawn failed. The lifecycle module caches its
|
||||
// per-process result, so a script doing 50 sends pays the spawn cost
|
||||
// at most once.
|
||||
//
|
||||
// 1.28.0: the orphaned bridge tier between daemon and cold paths was
|
||||
// removed. Two paths only: daemon (with auto-spawn) → cold.
|
||||
|
||||
import { ipc } from "~/daemon/ipc/client.js";
|
||||
import { DAEMON_PATHS } from "~/daemon/paths.js";
|
||||
import { ensureDaemonReady } from "~/services/daemon/lifecycle.js";
|
||||
import { getDaemonPolicy } from "~/services/daemon/policy.js";
|
||||
import { warnDaemonState } from "~/ui/warnings.js";
|
||||
|
||||
function meshQuery(mesh?: string): string {
|
||||
return mesh ? `?mesh=${encodeURIComponent(mesh)}` : "";
|
||||
}
|
||||
|
||||
/** Common entry: ensure the daemon is reachable, emitting a one-shot
|
||||
* stderr warning describing what we did. Returns true when the daemon
|
||||
* is now reachable, false when the caller should fall back.
|
||||
*
|
||||
* --no-daemon short-circuits to false; --strict's enforcement lives at
|
||||
* the cold-path entry point (`withMesh` in commands/connect.ts) so a
|
||||
* single chokepoint covers every verb. */
|
||||
async function daemonReachable(): Promise<boolean> {
|
||||
const policy = getDaemonPolicy();
|
||||
if (policy.mode === "no-daemon") return false;
|
||||
const res = await ensureDaemonReady({ noAutoSpawn: false });
|
||||
warnDaemonState(res, {});
|
||||
return res.state === "up" || res.state === "started";
|
||||
}
|
||||
|
||||
/** Try fetching the peer list through the daemon (~1ms warm IPC).
 * Returns null when the daemon is unreachable so the caller can
 * fall back to the cold path. */
|
||||
export async function tryListPeersViaDaemon(): Promise<unknown[] | null> {
|
||||
if (!existsSync(DAEMON_PATHS.SOCK_FILE)) return null;
|
||||
export async function tryListPeersViaDaemon(mesh?: string): Promise<unknown[] | null> {
|
||||
if (!(await daemonReachable())) return null;
|
||||
try {
|
||||
const res = await ipc<{ peers?: unknown[] }>({ path: "/v1/peers", timeoutMs: 3_000 });
|
||||
const res = await ipc<{ peers?: unknown[] }>({ path: `/v1/peers${meshQuery(mesh)}`, timeoutMs: 3_000 });
|
||||
if (res.status !== 200) return null;
|
||||
return Array.isArray(res.body.peers) ? res.body.peers : [];
|
||||
} catch (err) {
|
||||
@@ -24,10 +53,10 @@ export async function tryListPeersViaDaemon(): Promise<unknown[] | null> {
|
||||
}
|
||||
|
||||
/** Try fetching mesh-published skills through the daemon. */
|
||||
export async function tryListSkillsViaDaemon(): Promise<unknown[] | null> {
|
||||
if (!existsSync(DAEMON_PATHS.SOCK_FILE)) return null;
|
||||
export async function tryListSkillsViaDaemon(mesh?: string): Promise<unknown[] | null> {
|
||||
if (!(await daemonReachable())) return null;
|
||||
try {
|
||||
const res = await ipc<{ skills?: unknown[] }>({ path: "/v1/skills", timeoutMs: 3_000 });
|
||||
const res = await ipc<{ skills?: unknown[] }>({ path: `/v1/skills${meshQuery(mesh)}`, timeoutMs: 3_000 });
|
||||
if (res.status !== 200) return null;
|
||||
return Array.isArray(res.body.skills) ? res.body.skills : [];
|
||||
} catch (err) {
|
||||
@@ -38,11 +67,11 @@ export async function tryListSkillsViaDaemon(): Promise<unknown[] | null> {
|
||||
}
|
||||
|
||||
/** Try fetching one skill body through the daemon. */
|
||||
export async function tryGetSkillViaDaemon(name: string): Promise<unknown | null> {
|
||||
if (!existsSync(DAEMON_PATHS.SOCK_FILE)) return null;
|
||||
export async function tryGetSkillViaDaemon(name: string, mesh?: string): Promise<unknown | null> {
|
||||
if (!(await daemonReachable())) return null;
|
||||
try {
|
||||
const res = await ipc<{ skill?: unknown }>({
|
||||
path: `/v1/skills/${encodeURIComponent(name)}`,
|
||||
path: `/v1/skills/${encodeURIComponent(name)}${meshQuery(mesh)}`,
|
||||
timeoutMs: 3_000,
|
||||
});
|
||||
if (res.status === 404) return null;
|
||||
@@ -51,6 +80,109 @@ export async function tryGetSkillViaDaemon(name: string): Promise<unknown | null
|
||||
} catch { return null; }
|
||||
}
|
||||
|
||||
// --- state ---
|
||||
|
||||
export type StateEntry = {
|
||||
key: string;
|
||||
value: unknown;
|
||||
updatedBy: string;
|
||||
updatedAt: string;
|
||||
mesh?: string;
|
||||
};
|
||||
|
||||
/** Try reading a single state key through the daemon. Returns:
 * - the entry when the daemon found it
 * - undefined when the daemon ran but the key is unset (404)
 * - null when the daemon is unreachable or answers with an unexpected
 *   status (caller falls back) */
|
||||
export async function tryGetStateViaDaemon(key: string, mesh?: string): Promise<StateEntry | undefined | null> {
|
||||
if (!(await daemonReachable())) return null;
|
||||
try {
|
||||
const path = `/v1/state?key=${encodeURIComponent(key)}${mesh ? `&mesh=${encodeURIComponent(mesh)}` : ""}`;
|
||||
const res = await ipc<{ state?: StateEntry; error?: string }>({ path, timeoutMs: 3_000 });
|
||||
if (res.status === 404) return undefined;
|
||||
if (res.status !== 200) return null;
|
||||
return res.body.state ?? undefined;
|
||||
} catch (err) {
|
||||
const msg = String(err);
|
||||
if (/ENOENT|ECONNREFUSED|ipc_timeout/.test(msg)) return null;
|
||||
return null;
|
||||
}
|
||||
}
|
||||
|
||||
export async function tryListStateViaDaemon(mesh?: string): Promise<StateEntry[] | null> {
|
||||
if (!(await daemonReachable())) return null;
|
||||
try {
|
||||
const res = await ipc<{ entries?: StateEntry[] }>({ path: `/v1/state${meshQuery(mesh)}`, timeoutMs: 3_000 });
|
||||
if (res.status !== 200) return null;
|
||||
return Array.isArray(res.body.entries) ? res.body.entries : [];
|
||||
} catch (err) {
|
||||
const msg = String(err);
|
||||
if (/ENOENT|ECONNREFUSED|ipc_timeout/.test(msg)) return null;
|
||||
return null;
|
||||
}
|
||||
}
|
||||
|
||||
export async function trySetStateViaDaemon(key: string, value: unknown, mesh?: string): Promise<boolean> {
|
||||
if (!(await daemonReachable())) return false;
|
||||
try {
|
||||
const res = await ipc<{ ok?: boolean; error?: string }>({
|
||||
method: "POST",
|
||||
path: "/v1/state",
|
||||
timeoutMs: 3_000,
|
||||
body: { key, value, ...(mesh ? { mesh } : {}) },
|
||||
});
|
||||
return res.status === 200 && res.body.ok === true;
|
||||
} catch { return false; }
|
||||
}
|
||||
|
||||
// --- memory ---
|
||||
|
||||
export type MemoryEntry = {
|
||||
id: string;
|
||||
content: string;
|
||||
tags: string[];
|
||||
rememberedBy: string;
|
||||
rememberedAt: string;
|
||||
mesh?: string;
|
||||
};
|
||||
|
||||
export async function tryRememberViaDaemon(content: string, tags?: string[], mesh?: string): Promise<{ id: string; mesh?: string } | null> {
|
||||
if (!(await daemonReachable())) return null;
|
||||
try {
|
||||
const res = await ipc<{ id?: string; mesh?: string; error?: string }>({
|
||||
method: "POST",
|
||||
path: "/v1/memory",
|
||||
timeoutMs: 5_000,
|
||||
body: { content, ...(tags?.length ? { tags } : {}), ...(mesh ? { mesh } : {}) },
|
||||
});
|
||||
if (res.status !== 200 || !res.body.id) return null;
|
||||
return { id: res.body.id, mesh: res.body.mesh };
|
||||
} catch { return null; }
|
||||
}
|
||||
|
||||
export async function tryRecallViaDaemon(query: string, mesh?: string): Promise<MemoryEntry[] | null> {
|
||||
if (!(await daemonReachable())) return null;
|
||||
try {
|
||||
const path = `/v1/memory?q=${encodeURIComponent(query)}${mesh ? `&mesh=${encodeURIComponent(mesh)}` : ""}`;
|
||||
const res = await ipc<{ matches?: MemoryEntry[] }>({ path, timeoutMs: 5_000 });
|
||||
if (res.status !== 200) return null;
|
||||
return Array.isArray(res.body.matches) ? res.body.matches : [];
|
||||
} catch (err) {
|
||||
const msg = String(err);
|
||||
if (/ENOENT|ECONNREFUSED|ipc_timeout/.test(msg)) return null;
|
||||
return null;
|
||||
}
|
||||
}
|
||||
|
||||
export async function tryForgetViaDaemon(id: string, mesh?: string): Promise<boolean> {
|
||||
if (!(await daemonReachable())) return false;
|
||||
try {
|
||||
const path = `/v1/memory/${encodeURIComponent(id)}${meshQuery(mesh)}`;
|
||||
const res = await ipc<{ ok?: boolean }>({ method: "DELETE", path, timeoutMs: 3_000 });
|
||||
return res.status === 200 && res.body.ok === true;
|
||||
} catch { return false; }
|
||||
}
|
||||
|
||||
export type DaemonSendOk = {
|
||||
ok: true;
|
||||
messageId: string;
|
||||
@@ -72,7 +204,7 @@ export async function trySendViaDaemon(args: {
|
||||
* right mesh by either flag or single-mesh-default. */
|
||||
expectedMesh?: string;
|
||||
}): Promise<DaemonSendResult | null> {
|
||||
if (!existsSync(DAEMON_PATHS.SOCK_FILE)) return null;
|
||||
if (!(await daemonReachable())) return null;
|
||||
|
||||
try {
|
||||
const res = await ipc<{
|
||||
|
||||
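The helpers in this hunk attach the optional `mesh` parameter in two ways: `meshQuery` emits `?mesh=...` for paths with no query string, while the state and memory lookups append `&mesh=...` by hand. A minimal sketch of one helper covering both shapes follows; `buildPath` is a hypothetical name and is not part of the diff.

```typescript
// Hypothetical helper: build an IPC path with optional query parameters.
function buildPath(base: string, params: Array<[string, string | undefined]>): string {
  const parts: string[] = [];
  for (const [key, value] of params) {
    if (!value) continue; // mirror meshQuery: omit unset or empty values
    parts.push(`${encodeURIComponent(key)}=${encodeURIComponent(value)}`);
  }
  return parts.length > 0 ? `${base}?${parts.join("&")}` : base;
}

// No mesh given: the path is returned unchanged.
const peersPath = buildPath("/v1/peers", [["mesh", undefined]]);
// Key plus mesh: both are percent-encoded and joined with "&".
const statePath = buildPath("/v1/state", [["key", "a b"], ["mesh", "team/x"]]);
```

Encoding every value through `encodeURIComponent` matches what the diff does inline, so a mesh slug or key containing `/`, `&`, or spaces cannot alter the route.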
@@ -1,93 +0,0 @@
|
||||
/**
|
||||
* Bridge protocol — wire format between the MCP push-pipe (server) and
|
||||
* CLI invocations (client) over a per-mesh Unix domain socket.
|
||||
*
|
||||
* Why: every CLI op should reuse the warm WS the push-pipe already holds
|
||||
* (~5ms) instead of opening its own (~300-700ms cold start). The bridge is
|
||||
* the load-bearing piece of the CLI-first architecture — see
|
||||
* .artifacts/specs/2026-05-02-architecture-north-star.md commitment #3.
|
||||
*
|
||||
* Wire format: line-delimited JSON. One JSON object per "\n"-terminated line.
|
||||
* Each request carries an `id` string; the response echoes it.
|
||||
*
|
||||
* Socket path: ~/.claudemesh/sockets/<mesh-slug>.sock (mode 0600).
|
||||
*
|
||||
* Connection model: persistent. A CLI invocation opens, sends one or more
|
||||
* requests, reads matching responses, then closes. Multiplexing via `id`
|
||||
* means concurrent CLI calls don't have to serialize on the same socket
|
||||
* (though current callers all do one round-trip and exit).
|
||||
*/
|
||||
|
||||
import { homedir } from "node:os";
|
||||
import { join } from "node:path";
|
||||
|
||||
export const PROTOCOL_VERSION = 1;
|
||||
|
||||
/** Socket path for a given mesh. Caller is responsible for ensuring the
|
||||
* parent directory exists (`~/.claudemesh/sockets/`). */
|
||||
export function socketPath(meshSlug: string): string {
|
||||
return join(homedir(), ".claudemesh", "sockets", `${meshSlug}.sock`);
|
||||
}
|
||||
|
||||
/** Directory holding all per-mesh sockets. Created with mode 0700 on push-pipe boot. */
|
||||
export function socketDir(): string {
|
||||
return join(homedir(), ".claudemesh", "sockets");
|
||||
}
|
||||
|
||||
/**
|
||||
 * Verbs the bridge accepts. Keep this list narrow in 1.2.0: four writes
 * (send, summary, status_set, visible), the read-shaped peers, plus ping for health.
|
||||
* Expand in 1.3.0 once the bridge is proven.
|
||||
*/
|
||||
export type BridgeVerb =
|
||||
| "ping"
|
||||
| "peers"
|
||||
| "send"
|
||||
| "summary"
|
||||
| "status_set"
|
||||
| "visible";
|
||||
|
||||
export interface BridgeRequest {
|
||||
id: string;
|
||||
verb: BridgeVerb;
|
||||
args?: Record<string, unknown>;
|
||||
}
|
||||
|
||||
export interface BridgeResponseOk {
|
||||
id: string;
|
||||
ok: true;
|
||||
result: unknown;
|
||||
}
|
||||
|
||||
export interface BridgeResponseErr {
|
||||
id: string;
|
||||
ok: false;
|
||||
error: string;
|
||||
}
|
||||
|
||||
export type BridgeResponse = BridgeResponseOk | BridgeResponseErr;
|
||||
|
||||
/** Serialise a request/response to a single line ("\n"-terminated). */
|
||||
export function frame(obj: BridgeRequest | BridgeResponse): string {
|
||||
return JSON.stringify(obj) + "\n";
|
||||
}
|
||||
|
||||
/**
|
||||
* Stateful line-buffered parser. Pass each chunk from the socket via
|
||||
* `feed`; collect completed lines from the returned array.
|
||||
*/
|
||||
export class LineParser {
|
||||
private buf = "";
|
||||
|
||||
feed(chunk: Buffer | string): string[] {
|
||||
this.buf += typeof chunk === "string" ? chunk : chunk.toString("utf-8");
|
||||
const lines: string[] = [];
|
||||
let nl = this.buf.indexOf("\n");
|
||||
while (nl !== -1) {
|
||||
lines.push(this.buf.slice(0, nl));
|
||||
this.buf = this.buf.slice(nl + 1);
|
||||
nl = this.buf.indexOf("\n");
|
||||
}
|
||||
return lines;
|
||||
}
|
||||
}
|
||||
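The line-delimited framing above can be exercised in isolation. The sketch below re-declares `LineParser` and `frame` so it stands alone, simplified to string chunks only (the original also accepts `Buffer`), and shows that a message split across two chunks is held back until its terminating newline arrives.

```typescript
// Re-declared from the removed protocol module, simplified to string input.
class LineParser {
  private buf = "";

  feed(chunk: string): string[] {
    this.buf += chunk;
    const lines: string[] = [];
    let nl = this.buf.indexOf("\n");
    while (nl !== -1) {
      lines.push(this.buf.slice(0, nl));
      this.buf = this.buf.slice(nl + 1);
      nl = this.buf.indexOf("\n");
    }
    return lines;
  }
}

function frame(obj: unknown): string {
  return JSON.stringify(obj) + "\n";
}

const parser = new LineParser();
const wire = frame({ id: "a", verb: "ping" }) + frame({ id: "b", verb: "peers" });

// Split the byte stream at an arbitrary point inside the first message.
const firstBatch = parser.feed(wire.slice(0, 10));
const secondBatch = parser.feed(wire.slice(10));
```

The first `feed` call returns no lines because no newline has been seen yet; the second returns both complete messages in order.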
@@ -1,229 +0,0 @@
|
||||
/**
|
||||
* Bridge server — the MCP push-pipe runs one of these per connected mesh.
|
||||
*
|
||||
* Listens on a Unix domain socket at `~/.claudemesh/sockets/<mesh-slug>.sock`,
|
||||
 * accepts line-delimited JSON requests from CLI invocations, dispatches each
 * request to the corresponding `BrokerClient` method, and writes the response
 * back on the same connection as a single line.
|
||||
*
|
||||
* Lifecycle:
|
||||
* - `startBridgeServer(client)` is called from the MCP push-pipe boot path
|
||||
* once the WS is connected (or even before — verbs that need an open WS
|
||||
* will return an error).
|
||||
* - On startup it `unlinks` any stale socket file (left by a crashed
|
||||
* prior process), then `listen`s.
|
||||
* - On shutdown (`stop()`) it closes the listener and unlinks the socket.
|
||||
*
|
||||
* Concurrency: each accepted connection gets its own line-buffered parser.
|
||||
* Multiple in-flight requests are correlated by `id`; the server doesn't
|
||||
* need to serialize because the underlying `BrokerClient` calls are
|
||||
* `async` and non-blocking.
|
||||
*
|
||||
* Error model: malformed lines are dropped silently (don't tear down the
|
||||
* socket). Unknown verbs return `{ok: false, error: "unknown verb"}`.
|
||||
* Broker errors are wrapped into the `error` string.
|
||||
*/
|
||||
|
||||
import { createServer, type Server, type Socket } from "node:net";
|
||||
import { mkdirSync, unlinkSync, existsSync, chmodSync } from "node:fs";
|
||||
import { dirname } from "node:path";
|
||||
import type { BrokerClient } from "~/services/broker/facade.js";
|
||||
import {
|
||||
socketPath,
|
||||
socketDir,
|
||||
frame,
|
||||
LineParser,
|
||||
type BridgeRequest,
|
||||
type BridgeResponse,
|
||||
type BridgeVerb,
|
||||
} from "./protocol.js";
|
||||
|
||||
export interface BridgeServer {
|
||||
stop(): void;
|
||||
path: string;
|
||||
}
|
||||
|
||||
type PeerStatus = "idle" | "working" | "dnd";
|
||||
|
||||
/**
|
||||
* Resolve a `to` string to a broker-friendly target spec. Mirrors what
|
||||
* `commands/send.ts` does today — display name → pubkey, hex stays hex,
|
||||
* `@group` and `*` pass through.
|
||||
*/
|
||||
async function resolveTarget(
|
||||
client: BrokerClient,
|
||||
to: string,
|
||||
): Promise<{ ok: true; spec: string } | { ok: false; error: string }> {
|
||||
if (to.startsWith("@") || to === "*" || /^[0-9a-f]{64}$/i.test(to)) {
|
||||
return { ok: true, spec: to };
|
||||
}
|
||||
const peers = await client.listPeers();
|
||||
const match = peers.find((p) => p.displayName.toLowerCase() === to.toLowerCase());
|
||||
if (!match) {
|
||||
return {
|
||||
ok: false,
|
||||
error: `peer "${to}" not found. online: ${peers.map((p) => p.displayName).join(", ") || "(none)"}`,
|
||||
};
|
||||
}
|
||||
return { ok: true, spec: match.pubkey };
|
||||
}
|
||||
|
||||
async function dispatch(
|
||||
client: BrokerClient,
|
||||
req: BridgeRequest,
|
||||
): Promise<BridgeResponse> {
|
||||
const args = req.args ?? {};
|
||||
try {
|
||||
switch (req.verb as BridgeVerb) {
|
||||
case "ping": {
|
||||
const peers = await client.listPeers();
|
||||
return {
|
||||
id: req.id,
|
||||
ok: true,
|
||||
result: {
|
||||
mesh: client.meshSlug,
|
||||
ws_status: client.status,
|
||||
peers_online: peers.length,
|
||||
push_buffer: client.pushHistory.length,
|
||||
},
|
||||
};
|
||||
}
|
||||
case "peers": {
|
||||
const peers = await client.listPeers();
|
||||
return { id: req.id, ok: true, result: peers };
|
||||
}
|
||||
case "send": {
|
||||
const to = String(args.to ?? "");
|
||||
const message = String(args.message ?? "");
|
||||
const priority = (args.priority as "now" | "next" | "low" | undefined) ?? "next";
|
||||
if (!to || !message) {
|
||||
return { id: req.id, ok: false, error: "send: `to` and `message` required" };
|
||||
}
|
||||
const resolved = await resolveTarget(client, to);
|
||||
if (!resolved.ok) return { id: req.id, ok: false, error: resolved.error };
|
||||
const result = await client.send(resolved.spec, message, priority);
|
||||
if (!result.ok) {
|
||||
return { id: req.id, ok: false, error: result.error ?? "send failed" };
|
||||
}
|
||||
return {
|
||||
id: req.id,
|
||||
ok: true,
|
||||
result: { messageId: result.messageId, target: resolved.spec },
|
||||
};
|
||||
}
|
||||
case "summary": {
|
||||
const text = String(args.summary ?? "");
|
||||
if (!text) return { id: req.id, ok: false, error: "summary: `summary` required" };
|
||||
await client.setSummary(text);
|
||||
return { id: req.id, ok: true, result: { summary: text } };
|
||||
}
|
||||
case "status_set": {
|
||||
const state = String(args.status ?? "") as PeerStatus;
|
||||
if (!["idle", "working", "dnd"].includes(state)) {
|
||||
return { id: req.id, ok: false, error: "status_set: must be idle | working | dnd" };
|
||||
}
|
||||
await client.setStatus(state);
|
||||
return { id: req.id, ok: true, result: { status: state } };
|
||||
}
|
||||
case "visible": {
|
||||
const visible = Boolean(args.visible);
|
||||
await client.setVisible(visible);
|
||||
return { id: req.id, ok: true, result: { visible } };
|
||||
}
|
||||
default:
|
||||
return { id: req.id, ok: false, error: `unknown verb: ${req.verb}` };
|
||||
}
|
||||
} catch (err) {
|
||||
return {
|
||||
id: req.id,
|
||||
ok: false,
|
||||
error: err instanceof Error ? err.message : String(err),
|
||||
};
|
||||
}
|
||||
}
|
||||
|
||||
function handleConnection(socket: Socket, client: BrokerClient): void {
|
||||
const parser = new LineParser();
|
||||
|
||||
socket.on("data", (chunk) => {
|
||||
const lines = parser.feed(chunk);
|
||||
for (const line of lines) {
|
||||
if (!line.trim()) continue;
|
||||
let req: BridgeRequest;
|
||||
try {
|
||||
req = JSON.parse(line) as BridgeRequest;
|
||||
} catch {
|
||||
continue;
|
||||
}
|
||||
if (!req || typeof req !== "object" || !req.id || !req.verb) continue;
|
||||
|
||||
// Fire-and-await without blocking the read loop.
|
||||
void dispatch(client, req).then((res) => {
|
||||
try {
|
||||
socket.write(frame(res));
|
||||
} catch {
|
||||
/* socket might have closed mid-flight; ignore */
|
||||
}
|
||||
});
|
||||
}
|
||||
});
|
||||
|
||||
socket.on("error", () => {
|
||||
// Don't crash the push-pipe on per-connection errors.
|
||||
});
|
||||
}
|
||||
|
||||
/**
|
||||
* Start the per-mesh bridge server. Returns a handle the caller stores so
|
||||
* it can `stop()` on shutdown.
|
||||
*
|
||||
 * Last-writer-wins: if a socket file already exists it is unlinked
 * unconditionally and a fresh listener is bound, with no liveness probe
 * of the previous owner. Returns null only if binding the socket throws.
|
||||
*/
|
||||
export function startBridgeServer(client: BrokerClient): BridgeServer | null {
|
||||
const path = socketPath(client.meshSlug);
|
||||
const dir = socketDir();
|
||||
|
||||
if (!existsSync(dir)) {
|
||||
mkdirSync(dir, { recursive: true, mode: 0o700 });
|
||||
}
|
||||
|
||||
// Last-writer-wins: unconditionally remove any existing socket file and
|
||||
// bind fresh. A live process previously holding it keeps its already-
|
||||
// accepted connections (sockets aren't path-based after connect), but
|
||||
// new CLI dials hit the new server. In practice this only matters when
|
||||
// two `claudemesh launch` invocations target the same mesh — rare, and
|
||||
// either instance serving CLI requests is fine because both speak to
|
||||
// the same broker.
|
||||
if (existsSync(path)) {
|
||||
try { unlinkSync(path); } catch {}
|
||||
}
|
||||
|
||||
const server: Server = createServer((socket) => handleConnection(socket, client));
|
||||
|
||||
try {
|
||||
server.listen(path);
|
||||
} catch (err) {
|
||||
process.stderr.write(`[claudemesh] bridge: failed to bind ${path}: ${String(err)}\n`);
|
||||
return null;
|
||||
}
|
||||
|
||||
server.on("error", (err) => {
|
||||
process.stderr.write(`[claudemesh] bridge: ${String(err)}\n`);
|
||||
});
|
||||
|
||||
// Tighten permissions so other users on the host can't dial in.
|
||||
try { chmodSync(path, 0o600); } catch {}
|
||||
|
||||
let stopped = false;
|
||||
return {
|
||||
path,
|
||||
stop(): void {
|
||||
if (stopped) return;
|
||||
stopped = true;
|
||||
try { server.close(); } catch {}
|
||||
try { unlinkSync(path); } catch {}
|
||||
},
|
||||
};
|
||||
}
|
||||
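The target-resolution rule used by the removed bridge server (groups, the broadcast `*`, and 64-character hex keys pass through; anything else is matched case-insensitively against display names) can be sketched as a pure function over an already-fetched peer list. `resolveTargetSync` and `PeerInfo` are hypothetical names introduced for illustration.

```typescript
// Hypothetical pure variant of resolveTarget that takes the peer list directly.
interface PeerInfo {
  displayName: string;
  pubkey: string;
}

type Resolved = { ok: true; spec: string } | { ok: false; error: string };

function resolveTargetSync(peers: PeerInfo[], to: string): Resolved {
  // Groups, broadcast, and raw 64-hex pubkeys need no lookup.
  if (to.startsWith("@") || to === "*" || /^[0-9a-f]{64}$/i.test(to)) {
    return { ok: true, spec: to };
  }
  // Otherwise treat `to` as a display name, compared case-insensitively.
  const match = peers.find((p) => p.displayName.toLowerCase() === to.toLowerCase());
  if (!match) return { ok: false, error: `peer "${to}" not found` };
  return { ok: true, spec: match.pubkey };
}

const samplePeers: PeerInfo[] = [{ displayName: "Alice", pubkey: "ab".repeat(32) }];
const byName = resolveTargetSync(samplePeers, "alice");
const byGroup = resolveTargetSync(samplePeers, "@team");
const missing = resolveTargetSync(samplePeers, "bob");
```

Separating the lookup from the network call makes the matching rule testable without a broker connection.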
72
apps/cli/src/services/broker/session-hello-sig.ts
Normal file
@@ -0,0 +1,72 @@
/**
 * CLI-side helpers for the per-session attestation flow.
 *
 * Two pieces:
 *   1. `signParentAttestation` — `claudemesh launch` calls this with the
 *      member's stable secret key to mint a long-lived (≤24h) token that
 *      vouches for an ephemeral session pubkey. The attestation travels
 *      with the session-token registration to the daemon.
 *   2. `signSessionHello` — the daemon's `SessionBrokerClient` calls this
 *      on every WS-connect to sign the canonical session-hello bytes with
 *      the session secret key (proves liveness + possession).
 *
 * Both formats mirror the broker's `canonicalSessionAttestation` /
 * `canonicalSessionHello`. Drift will surface as `bad_signature` from
 * the broker, never silent breakage.
 */

import { ensureSodium } from "~/services/crypto/keypair.js";

/** Default attestation lifetime — 12h leaves headroom under broker's 24h cap. */
export const DEFAULT_ATTESTATION_TTL_MS = 12 * 60 * 60 * 1000;

export interface ParentAttestation {
  sessionPubkey: string;
  parentMemberPubkey: string;
  expiresAt: number;
  signature: string;
}

/** Sign the parent-vouches-session attestation. */
export async function signParentAttestation(args: {
  parentMemberPubkey: string;
  parentSecretKey: string;
  sessionPubkey: string;
  /** Override the lifetime; default 12h. */
  ttlMs?: number;
  /** Override clock for tests. */
  now?: number;
}): Promise<ParentAttestation> {
  const s = await ensureSodium();
  const expiresAt = (args.now ?? Date.now()) + (args.ttlMs ?? DEFAULT_ATTESTATION_TTL_MS);
  const canonical = `claudemesh-session-attest|${args.parentMemberPubkey}|${args.sessionPubkey}|${expiresAt}`;
  const sig = s.crypto_sign_detached(
    s.from_string(canonical),
    s.from_hex(args.parentSecretKey),
  );
  return {
    sessionPubkey: args.sessionPubkey,
    parentMemberPubkey: args.parentMemberPubkey,
    expiresAt,
    signature: s.to_hex(sig),
  };
}

/** Sign the per-WS-connect session-hello bytes. */
export async function signSessionHello(args: {
  meshId: string;
  parentMemberPubkey: string;
  sessionPubkey: string;
  sessionSecretKey: string;
  now?: number;
}): Promise<{ timestamp: number; signature: string }> {
  const s = await ensureSodium();
  const timestamp = args.now ?? Date.now();
  const canonical =
    `claudemesh-session-hello|${args.meshId}|${args.parentMemberPubkey}|${args.sessionPubkey}|${timestamp}`;
  const sig = s.crypto_sign_detached(
    s.from_string(canonical),
    s.from_hex(args.sessionSecretKey),
  );
  return { timestamp, signature: s.to_hex(sig) };
}
|
||||
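The attestation format above is a pipe-delimited canonical string signed with Ed25519. The sketch below shows a sign-and-verify round trip, including an expiry check. It swaps in Node's built-in `node:crypto` Ed25519 support instead of libsodium so the snippet stands alone. The broker's verification code is not part of this diff, so `verifyAttestation`, the placeholder key strings, and the fixed clock values are all assumptions made for illustration.

```typescript
import { generateKeyPairSync, sign, verify } from "node:crypto";

// Same canonical layout as signParentAttestation in the diff.
function canonicalAttestation(parentPubkey: string, sessionPubkey: string, expiresAt: number): string {
  return `claudemesh-session-attest|${parentPubkey}|${sessionPubkey}|${expiresAt}`;
}

// Stand-in for the member's stable keypair (node:crypto rather than libsodium).
const parent = generateKeyPairSync("ed25519");

const issuedAt = 1_700_000_000_000; // fixed clock, for a deterministic example
const expiresAt = issuedAt + 12 * 60 * 60 * 1000; // 12h, matching the default TTL
const canonical = canonicalAttestation("parent-pubkey-hex", "session-pubkey-hex", expiresAt);
const signature = sign(null, Buffer.from(canonical, "utf8"), parent.privateKey);

// Hypothetical verifier: reject expired attestations, then check the signature.
function verifyAttestation(text: string, sig: Buffer, now: number, expires: number): boolean {
  if (now >= expires) return false;
  return verify(null, Buffer.from(text, "utf8"), parent.publicKey, sig);
}

const fresh = verifyAttestation(canonical, signature, issuedAt, expiresAt);
const expired = verifyAttestation(canonical, signature, expiresAt, expiresAt);
const tampered = verifyAttestation(
  canonicalAttestation("parent-pubkey-hex", "other-session", expiresAt),
  signature,
  issuedAt,
  expiresAt,
);
```

Because `expiresAt` is part of the signed string, a holder of the attestation cannot extend its lifetime without invalidating the signature.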
243
apps/cli/src/services/daemon/lifecycle.ts
Normal file
@@ -0,0 +1,243 @@
|
||||
/**
|
||||
* Daemon lifecycle helper — probe, auto-spawn, retry, fall-through.
|
||||
*
|
||||
* Every daemon-routed CLI verb passes through `ensureDaemonReady()` before
|
||||
* its IPC call. The helper:
|
||||
*
|
||||
* 1. Probes the socket via a fast `/v1/version` IPC (~5-10 ms).
|
||||
* 2. If the socket is missing OR present-but-stale, attempts a detached
|
||||
* `claudemesh daemon up` spawn under a file-lock.
|
||||
* 3. Polls for the new socket up to a budget (default 3s).
|
||||
* 4. Returns a state describing what happened, so the caller can either
|
||||
* proceed warm or fall back to the cold path with a clear warning.
|
||||
*
|
||||
* State machine:
|
||||
* - "up" daemon was already running
|
||||
* - "started" daemon was down; we spawned it; it came up
|
||||
* - "down" daemon was down; auto-spawn skipped (e.g., recursion guard)
|
||||
* - "spawn-failed" spawn attempted but socket never appeared within budget
|
||||
* - "spawn-suppressed" recently-failed marker is fresh; skipped retry
|
||||
*
|
||||
* Stale-socket handling: if the socket file exists but the IPC probe
|
||||
* fails (ECONNREFUSED / timeout), we treat the file as stale, remove
|
||||
* it, and proceed as if the daemon were down. This fixes the prior bug
|
||||
* where `existsSync(SOCK_FILE)` was a false positive after a daemon
|
||||
* crash.
|
||||
*
|
||||
* Recursion guard: when we spawn the daemon we set
|
||||
* `CLAUDEMESH_INTERNAL_NO_AUTOSPAWN=1` in its env so any nested CLI
|
||||
* calls inside the daemon skip the auto-spawn check and avoid a loop.
|
||||
*/
|
||||
|
||||
import { existsSync, readFileSync, statSync, unlinkSync, writeFileSync } from "node:fs";
|
||||
import { join } from "node:path";
|
||||
|
||||
import { ipc, IpcError } from "~/daemon/ipc/client.js";
|
||||
import { DAEMON_PATHS } from "~/daemon/paths.js";
|
||||
|
||||
export type DaemonReadyState =
|
||||
| "up"
|
||||
| "started"
|
||||
| "down"
|
||||
| "spawn-failed"
|
||||
| "spawn-suppressed";
|
||||
|
||||
export interface EnsureDaemonResult {
|
||||
state: DaemonReadyState;
|
||||
/** Total ms spent in this call (probe ± spawn ± poll). */
|
||||
durationMs: number;
|
||||
/** When state is `down`, `spawn-failed`, or `spawn-suppressed`, a one-line reason. */
|
||||
reason?: string;
|
||||
}
|
||||
|
||||
export interface EnsureDaemonOpts {
|
||||
/** Total budget for socket-appearance polling after spawn. Default 3000ms. */
|
||||
budgetMs?: number;
|
||||
/** Skip auto-spawn entirely. Used by `--no-daemon` and the recursion guard. */
|
||||
noAutoSpawn?: boolean;
|
||||
/** When auto-spawning a legacy single-mesh daemon, pin a slug. Omit for multi-mesh (default). */
|
||||
mesh?: string;
|
||||
}
|
||||
|
||||
const SPAWN_LOCK_FILE = () => join(DAEMON_PATHS.DAEMON_DIR, ".spawn.lock");
|
||||
const SPAWN_FAIL_FILE = () => join(DAEMON_PATHS.DAEMON_DIR, ".spawn-failure");
|
||||
const SPAWN_FAIL_TTL_MS = 30_000;
|
||||
const PROBE_TIMEOUT_MS = 800;
|
||||
|
||||
let lastResultThisProcess: EnsureDaemonResult | null = null;
|
||||
|
||||
/** Probe the daemon and return what we know. A successful result ("up" or
 * "started") is cached per-process, so a script with 50 sends probes and
 * spawns at most once; failed results are re-evaluated on the next call. */
|
||||
export async function ensureDaemonReady(opts: EnsureDaemonOpts = {}): Promise<EnsureDaemonResult> {
|
||||
if (lastResultThisProcess && (lastResultThisProcess.state === "up" || lastResultThisProcess.state === "started")) {
|
||||
return lastResultThisProcess;
|
||||
}
|
||||
if (process.env.CLAUDEMESH_INTERNAL_NO_AUTOSPAWN === "1") {
|
||||
opts = { ...opts, noAutoSpawn: true };
|
||||
}
|
||||
const result = await runEnsureDaemon(opts);
|
||||
lastResultThisProcess = result;
|
||||
return result;
|
||||
}
|
||||
|
||||
/** Reset the per-process cache. Test helper. */
|
||||
export function _resetDaemonReadyCache(): void {
|
||||
lastResultThisProcess = null;
|
||||
}
|
||||
|
||||
async function runEnsureDaemon(opts: EnsureDaemonOpts): Promise<EnsureDaemonResult> {
  const t0 = Date.now();

  // Step 1 — probe.
  const probe = await probeDaemon();
  if (probe === "up") return { state: "up", durationMs: Date.now() - t0 };
  if (probe === "stale") cleanupStaleFiles();

  // Step 2 — auto-spawn unless forbidden.
  if (opts.noAutoSpawn) {
    return { state: "down", durationMs: Date.now() - t0, reason: "auto-spawn disabled" };
  }
  if (recentSpawnFailureFresh()) {
    return {
      state: "spawn-suppressed",
      durationMs: Date.now() - t0,
      reason: `daemon failed to start within last ${Math.round(SPAWN_FAIL_TTL_MS / 1000)}s`,
    };
  }

  // Step 3 — spawn detached.
  const spawnRes = await spawnDaemon(opts);
  if (spawnRes.ok) {
    return { state: "started", durationMs: Date.now() - t0 };
  }

  // Step 4 — record failure for backoff and report.
  markSpawnFailure();
  return { state: "spawn-failed", durationMs: Date.now() - t0, reason: spawnRes.reason };
}

async function probeDaemon(): Promise<"up" | "absent" | "stale"> {
  if (!existsSync(DAEMON_PATHS.SOCK_FILE)) return "absent";
  try {
    const res = await ipc<{ version?: string }>({ path: "/v1/version", timeoutMs: PROBE_TIMEOUT_MS });
    if (res.status === 200) return "up";
    return "stale";
  } catch (err) {
    if (err instanceof IpcError) return "stale";
    const msg = String(err);
    if (/ENOENT|ECONNREFUSED|ipc_timeout|EPIPE|ECONNRESET/.test(msg)) return "stale";
    return "stale";
  }
}

function cleanupStaleFiles(): void {
  for (const p of [DAEMON_PATHS.SOCK_FILE, DAEMON_PATHS.PID_FILE]) {
    try { unlinkSync(p); } catch { /* best-effort */ }
  }
}

function recentSpawnFailureFresh(): boolean {
  try {
    const st = statSync(SPAWN_FAIL_FILE());
    return Date.now() - st.mtimeMs < SPAWN_FAIL_TTL_MS;
  } catch {
    return false;
  }
}

function markSpawnFailure(): void {
  try { writeFileSync(SPAWN_FAIL_FILE(), String(Date.now()), { mode: 0o600 }); } catch { /* best-effort */ }
}

function clearSpawnFailure(): void {
  try { unlinkSync(SPAWN_FAIL_FILE()); } catch { /* best-effort */ }
}

interface SpawnResult { ok: boolean; reason?: string; }

async function spawnDaemon(opts: EnsureDaemonOpts): Promise<SpawnResult> {
  const lockResult = await acquireOrShareLock(opts);
  if (lockResult === "wait-existing") {
    // Another process is spawning; just wait for the socket to appear.
    return await pollForSocket(opts.budgetMs ?? 3_000);
  }

  try {
    const { spawn } = await import("node:child_process");
    const binary = await resolveCliBinary();
    const args = ["daemon", "up"];
    if (opts.mesh) args.push("--mesh", opts.mesh);

    const child = spawn(binary, args, {
      detached: true,
      stdio: "ignore",
      env: { ...process.env, CLAUDEMESH_INTERNAL_NO_AUTOSPAWN: "1" },
    });
    child.unref();

    const polled = await pollForSocket(opts.budgetMs ?? 3_000);
    if (polled.ok) clearSpawnFailure();
    return polled;
  } catch (err) {
    return { ok: false, reason: err instanceof Error ? err.message : String(err) };
  } finally {
    releaseLock();
  }
}

/** Acquire spawn lock. If another process holds it AND its pid is alive,
 * return "wait-existing" so we share that spawn attempt. If the pid is
 * dead, take over the lock. */
async function acquireOrShareLock(_opts: EnsureDaemonOpts): Promise<"acquired" | "wait-existing"> {
  const lockPath = SPAWN_LOCK_FILE();
  if (existsSync(lockPath)) {
    try {
      const pidStr = readFileSync(lockPath, "utf8").trim();
      const pid = Number.parseInt(pidStr, 10);
      if (Number.isFinite(pid) && pid > 0) {
        try {
          process.kill(pid, 0); // signal 0 = liveness probe
          return "wait-existing";
        } catch {
          // Holder is dead — fall through to take over.
        }
      }
    } catch { /* unreadable lock — take over */ }
  }
  try {
    writeFileSync(lockPath, String(process.pid), { mode: 0o600 });
  } catch { /* best-effort; lock is advisory */ }
  return "acquired";
}

function releaseLock(): void {
  try { unlinkSync(SPAWN_LOCK_FILE()); } catch { /* best-effort */ }
}

async function pollForSocket(budgetMs: number): Promise<SpawnResult> {
  const start = Date.now();
  while (Date.now() - start < budgetMs) {
    if (existsSync(DAEMON_PATHS.SOCK_FILE)) {
      // Don't just trust file presence — confirm it answers.
      const probe = await probeDaemon();
      if (probe === "up") return { ok: true };
    }
    await new Promise((r) => setTimeout(r, 150));
  }
  return { ok: false, reason: `socket did not appear within ${budgetMs}ms` };
}

/** Resolve the absolute path to the `claudemesh` binary the user is running.
 * When invoked via tsx/bun in dev, fall back to the system `claudemesh`. */
async function resolveCliBinary(): Promise<string> {
  const argv1 = process.argv[1] ?? "claudemesh";
  if (/\.ts$/.test(argv1) || /node_modules|src\/entrypoints/.test(argv1)) {
    try {
      const { execSync } = await import("node:child_process");
      return execSync("which claudemesh", { encoding: "utf8" }).trim() || "claudemesh";
    } catch {
      return "claudemesh";
    }
  }
  return argv1;
}
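Note on the spawn lock: `acquireOrShareLock` checks for the lock file and then writes it in a separate step, so two processes racing through that window could both conclude they acquired it. The lock is documented as advisory, so this may well be acceptable. For comparison, here is a minimal sketch of an exclusive-create variant. The names `tryAcquireLock` and `pidAlive` are hypothetical and not part of this change, and the `EPERM` handling is an assumption that differs from the committed code, which treats any `process.kill` error as a dead holder.

```typescript
import { mkdtempSync, readFileSync, unlinkSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

type LockOutcome = "acquired" | "wait-existing";

/** Hypothetical helper: true when `pid` names a live process. */
function pidAlive(pid: number): boolean {
  try {
    process.kill(pid, 0); // signal 0 probes liveness without delivering a signal
    return true;
  } catch (err) {
    // Assumption: EPERM means the process exists but belongs to another user.
    return (err as NodeJS.ErrnoException).code === "EPERM";
  }
}

/** Hypothetical variant of the spawn lock built on exclusive create. */
function tryAcquireLock(lockPath: string, selfPid: number = process.pid): LockOutcome {
  for (let attempt = 0; attempt < 2; attempt++) {
    try {
      // The "wx" flag fails with EEXIST when the file already exists, so
      // only one racing process can create the lock.
      writeFileSync(lockPath, String(selfPid), { flag: "wx", mode: 0o600 });
      return "acquired";
    } catch (err) {
      if ((err as NodeJS.ErrnoException).code !== "EEXIST") throw err;
    }
    let holder = Number.NaN;
    try {
      holder = Number.parseInt(readFileSync(lockPath, "utf8").trim(), 10);
    } catch { /* lock vanished or is unreadable: treat it as stale */ }
    if (Number.isFinite(holder) && holder > 0 && pidAlive(holder)) return "wait-existing";
    // Holder is dead or the file is garbage: remove it and retry once.
    try { unlinkSync(lockPath); } catch { /* another process removed it first */ }
  }
  return "wait-existing";
}

// Demo in a throwaway directory.
const demoLock = join(mkdtempSync(join(tmpdir(), "spawn-lock-sketch-")), ".spawn.lock");
const first = tryAcquireLock(demoLock);   // nothing holds the lock yet
const second = tryAcquireLock(demoLock);  // held by this (live) process
writeFileSync(demoLock, "garbage");       // simulate a corrupt lock file
const third = tryAcquireLock(demoLock);   // stale lock is taken over
console.log(first, second, third);
```

The trade-off is one extra retry loop in exchange for closing the check-then-write window; the stale-lock takeover still has a small race between `unlinkSync` and the second create, which the retry resolves by letting exactly one creator win.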
42
apps/cli/src/services/daemon/policy.ts
Normal file
@@ -0,0 +1,42 @@
/**
 * Per-process daemon policy — set once at CLI entry from --no-daemon /
 * --strict / env var, then read by daemon-routing helpers.
 *
 * Modes:
 *   "auto" (default)   probe → auto-spawn → retry → cold fallback
 *   "strict"           probe → auto-spawn → retry → ERROR (no cold fallback)
 *   "no-daemon"        skip daemon entirely → straight to cold path
 *
 * Env equivalents (for headless/CI use):
 *   CLAUDEMESH_STRICT_DAEMON=1 → strict
 *   CLAUDEMESH_NO_DAEMON=1 → no-daemon
 *
 * Flag wins over env when both are set.
 */

export type DaemonMode = "auto" | "strict" | "no-daemon";

export interface DaemonPolicy { mode: DaemonMode; }

let policy: DaemonPolicy = readEnvDefault();

function readEnvDefault(): DaemonPolicy {
  if (process.env.CLAUDEMESH_NO_DAEMON === "1") return { mode: "no-daemon" };
  if (process.env.CLAUDEMESH_STRICT_DAEMON === "1") return { mode: "strict" };
  return { mode: "auto" };
}

export function setDaemonPolicy(mode: DaemonMode): void {
  policy = { mode };
}

export function getDaemonPolicy(): DaemonPolicy {
  return policy;
}

/** Pick a mode from parsed flags. CLI flags win over env. */
export function policyFromFlags(flags: Record<string, unknown>): DaemonMode {
  if (flags["no-daemon"]) return "no-daemon";
  if (flags.strict) return "strict";
  return readEnvDefault().mode;
}
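The precedence described in the header comment (flag over env, and `no-daemon` checked before `strict` within each source) can be illustrated with a pure function. This is a sketch only: `resolveMode` is a hypothetical name, and it takes the environment as a parameter instead of reading `process.env` so that it is easy to exercise in isolation.

```typescript
type DaemonMode = "auto" | "strict" | "no-daemon";

/** Hypothetical pure resolver: flags first, then env, then the default. */
function resolveMode(
  flags: Record<string, unknown>,
  env: Record<string, string | undefined>,
): DaemonMode {
  if (flags["no-daemon"]) return "no-daemon";
  if (flags["strict"]) return "strict";
  if (env["CLAUDEMESH_NO_DAEMON"] === "1") return "no-daemon";
  if (env["CLAUDEMESH_STRICT_DAEMON"] === "1") return "strict";
  return "auto";
}

// A --strict flag overrides CLAUDEMESH_NO_DAEMON=1 from the environment.
const fromFlag = resolveMode({ strict: true }, { CLAUDEMESH_NO_DAEMON: "1" });
// With no flags, the env var decides.
const fromEnv = resolveMode({}, { CLAUDEMESH_NO_DAEMON: "1" });
// With neither, the default applies.
const fallback = resolveMode({}, {});
console.log(fromFlag, fromEnv, fallback); // strict no-daemon auto
```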
56
apps/cli/src/services/session/resolve.ts
Normal file
@@ -0,0 +1,56 @@
/**
 * CLI-side session resolver. Reads the session token from env, asks
 * the daemon `GET /v1/sessions/me`, and caches the result for the
 * lifetime of this CLI invocation.
 *
 * Used by verbs that iterate multiple meshes client-side (peer list,
 * me, member list) so that, when invoked from inside a launched
 * session, they auto-scope to that session's workspace instead of
 * aggregating across every joined mesh.
 *
 * Returns null when:
 *   - no token in env (caller is outside a launched session, or
 *     bare `claudemesh` with no installed daemon).
 *   - token present but daemon doesn't recognize it (registry was
 *     reset by a daemon restart).
 *   - any IPC error (treat as "no scoping info, fall back to default
 *     behavior").
 */

import { ipc } from "~/daemon/ipc/client.js";
import { readSessionTokenFromEnv } from "./token.js";

export interface ResolvedSession {
  sessionId: string;
  mesh: string;
  displayName: string;
  pid: number;
  cwd?: string;
  role?: string;
  groups?: string[];
}

let cached: ResolvedSession | null | undefined = undefined;

export async function getSessionInfo(): Promise<ResolvedSession | null> {
  if (cached !== undefined) return cached;
  const tok = readSessionTokenFromEnv();
  if (!tok) { cached = null; return null; }
  try {
    const res = await ipc<{ session?: ResolvedSession }>({
      path: "/v1/sessions/me",
      timeoutMs: 1_500,
    });
    if (res.status !== 200 || !res.body.session) { cached = null; return null; }
    cached = res.body.session;
    return cached;
  } catch {
    cached = null;
    return null;
  }
}

/** Test helper. */
export function _resetSessionCache(): void {
  cached = undefined;
}
53
apps/cli/src/services/session/token.ts
Normal file
@@ -0,0 +1,53 @@
/**
 * Per-session IPC tokens — mint, persist, read.
 *
 * Each `claudemesh launch` mints a 32-byte random token, writes it to
 * `<tmpdir>/session-token` (mode 0o600), and exposes the path to the
 * spawned `claude` via `CLAUDEMESH_IPC_TOKEN_FILE`. Subprocesses
 * inheriting this env auto-attach the token to every IPC request via
 * the `Authorization: ClaudeMesh-Session <hex>` header. The daemon's
 * registry resolves the token to `{sessionId, mesh, displayName, pid,
 * cwd, ...}` in O(1) and uses it for auto-scoping + attribution.
 *
 * Why a file path env var, not the value directly:
 *   `ps eww -p <pid>` shows env values to other processes of the same
 *   uid. The path leaks; the secret in mode-0600 files inside a
 *   mode-0700 tmpdir does not. Same trick OpenSSH uses for SSH_AUTH_SOCK.
 */

import { randomBytes } from "node:crypto";
import { existsSync, readFileSync, writeFileSync } from "node:fs";

const ENV_TOKEN_FILE = "CLAUDEMESH_IPC_TOKEN_FILE";

export interface MintedToken {
  token: string;
  /** Filesystem path the token was written to. Pass via env to children. */
  filePath: string;
}

/** Generate a fresh 64-hex token and write it under `dir`. */
export function mintSessionToken(dir: string, fileName = "session-token"): MintedToken {
  const token = randomBytes(32).toString("hex");
  const filePath = `${dir}/${fileName}`;
  writeFileSync(filePath, token, { mode: 0o600 });
  return { token, filePath };
}

/** Read the session token from env. A well-formed literal
 * CLAUDEMESH_IPC_TOKEN value (for testing) is checked first; otherwise
 * the token is read from the path in CLAUDEMESH_IPC_TOKEN_FILE.
 * Returns null when neither is set or the file is unreadable. */
export function readSessionTokenFromEnv(env: NodeJS.ProcessEnv = process.env): string | null {
  const direct = env.CLAUDEMESH_IPC_TOKEN;
  if (direct && /^[0-9a-f]{64}$/i.test(direct)) return direct.toLowerCase();
  const path = env[ENV_TOKEN_FILE];
  if (!path) return null;
  try {
    if (!existsSync(path)) return null;
    const raw = readFileSync(path, "utf8").trim();
    if (/^[0-9a-f]{64}$/i.test(raw)) return raw.toLowerCase();
    return null;
  } catch { return null; }
}

export const TOKEN_FILE_ENV = ENV_TOKEN_FILE;
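A minimal round-trip sketch of the file-path handoff described in the header comment: the launcher writes the token into a private directory and passes only the path through the environment, and the child reads it back. The directory prefix and the local `readToken` helper are illustrative assumptions; the real helpers are `mintSessionToken` and `readSessionTokenFromEnv` above.

```typescript
import { randomBytes } from "node:crypto";
import { mkdtempSync, readFileSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

const TOKEN_RE = /^[0-9a-f]{64}$/i;

// Launcher side: a fresh private directory holding a mode-0600 token file.
const dir = mkdtempSync(join(tmpdir(), "claudemesh-token-sketch-"));
const tokenFile = join(dir, "session-token");
const token = randomBytes(32).toString("hex"); // 32 bytes -> 64 hex chars
writeFileSync(tokenFile, token, { mode: 0o600 });

// Only the path travels through the child's environment, never the secret.
const childEnv: Record<string, string | undefined> = {
  CLAUDEMESH_IPC_TOKEN_FILE: tokenFile,
};

/** Illustrative reader: returns the token, or null if unset or malformed. */
function readToken(env: Record<string, string | undefined>): string | null {
  const path = env["CLAUDEMESH_IPC_TOKEN_FILE"];
  if (!path) return null;
  try {
    const raw = readFileSync(path, "utf8").trim();
    return TOKEN_RE.test(raw) ? raw.toLowerCase() : null;
  } catch {
    return null; // missing or unreadable file
  }
}

const roundTripped = readToken(childEnv);
const withoutEnv = readToken({});
console.log(roundTripped === token, withoutEnv);
```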
9
apps/cli/src/types/text-import.d.ts
vendored
Normal file
@@ -0,0 +1,9 @@
/**
 * Bun's text-import attribute lets us bake `.md` content into the bundle
 * at build time. TypeScript doesn't know about the import attribute
 * syntax for non-JS modules, so we declare the wildcard here.
 */
declare module "*.md" {
  const content: string;
  export default content;
}
60
apps/cli/src/ui/warnings.ts
Normal file
@@ -0,0 +1,60 @@
/**
 * Once-per-process daemon-state warnings, routed to stderr.
 *
 * Suppressed under --quiet (caller responsibility — we never inspect
 * argv). JSON callers should consult the result's `state` field
 * directly and skip calling this helper.
 */

import type { EnsureDaemonResult } from "~/services/daemon/lifecycle.js";
import { getDaemonPolicy } from "~/services/daemon/policy.js";
import { dim } from "./styles.js";

let alreadyWarned = false;

export interface WarnDaemonOpts {
  quiet?: boolean;
  /** When true, emit nothing — the caller will surface the state in JSON. */
  json?: boolean;
}

/** Print a single, severity-appropriate line to stderr describing the
 * result of `ensureDaemonReady`. Returns whether anything was printed. */
export function warnDaemonState(
  res: EnsureDaemonResult,
  opts: WarnDaemonOpts = {},
): boolean {
  if (alreadyWarned) return false;
  if (opts.quiet || opts.json) return false;
  if (res.state === "up") return false;

  // Under --strict, the cold-path gate at `withMesh` will print its own
  // refusal message — suppress the misleading "using cold path" hint
  // here so the user sees a single, accurate error.
  if (getDaemonPolicy().mode === "strict" && res.state !== "started") return false;

  alreadyWarned = true;
  const tag = (label: string) => `[claudemesh] ${label}`;
  const hint = (s: string) => dim(s);

  switch (res.state) {
    case "started":
      process.stderr.write(`${tag("info")} daemon restarted automatically ${hint(`(took ${res.durationMs}ms)`)}\n`);
      return true;
    case "down":
      process.stderr.write(`${tag("info")} daemon not running — using cold path ${hint("(slower; run `claudemesh daemon up` for warm path)")}\n`);
      return true;
    case "spawn-suppressed":
      process.stderr.write(`${tag("warn")} ${res.reason ?? "daemon failed to start recently"} — using cold path ${hint("(run `claudemesh doctor`)")}\n`);
      return true;
    case "spawn-failed":
      process.stderr.write(`${tag("warn")} daemon spawn failed${res.reason ? `: ${res.reason}` : ""} — using cold path ${hint("(check ~/.claudemesh/daemon/daemon.log)")}\n`);
      return true;
  }
  return false;
}

/** Reset the once-per-process latch. Test helper. */
export function _resetDaemonWarningLatch(): void {
  alreadyWarned = false;
}
@@ -1,14 +1,18 @@
 import { describe, it, expect } from "vitest";
-import { execSync } from "node:child_process";
+import { spawnSync } from "node:child_process";
 import { resolve } from "node:path";
 
 const CLI = resolve(__dirname, "../../dist/entrypoints/cli.js");
 
 describe("golden: whoami --json", () => {
   it("outputs schema_version 1.0 when not signed in", () => {
+    // `whoami --json` exits 2 (EXIT.AUTH_FAILED) when not signed in.
+    // The JSON is still valid output and the contract under test —
+    // capture stdout independently of exit code.
     const env = { ...process.env, CLAUDEMESH_CONFIG_DIR: "/tmp/claudemesh-golden-test-" + Date.now() };
-    const output = execSync(`node ${CLI} whoami --json`, { encoding: "utf-8", env }).trim();
-    const json = JSON.parse(output);
+    const result = spawnSync("node", [CLI, "whoami", "--json"], { encoding: "utf-8", env });
+    expect([0, 2]).toContain(result.status);
+    const json = JSON.parse(result.stdout.trim());
     expect(json.schema_version).toBe("1.0");
     expect(json.signed_in).toBe(false);
   });
 
88
apps/cli/tests/unit/session-hello-sig.test.ts
Normal file
@@ -0,0 +1,88 @@
/**
 * CLI-side session-hello signing.
 *
 * Roundtrip: the signatures we mint with the CLI helpers must match the
 * canonical bytes the broker recomputes from the same fields. Drift here
 * shows up as `bad_signature` on the broker — easier to catch in unit
 * tests than in end-to-end flow.
 */

import { describe, expect, test } from "vitest";
import sodium from "libsodium-wrappers";
import {
  signParentAttestation,
  signSessionHello,
  DEFAULT_ATTESTATION_TTL_MS,
} from "../../src/services/broker/session-hello-sig.js";

async function makeKeypair(): Promise<{ publicKey: string; secretKey: string }> {
  await sodium.ready;
  const kp = sodium.crypto_sign_keypair();
  return {
    publicKey: sodium.to_hex(kp.publicKey),
    secretKey: sodium.to_hex(kp.privateKey),
  };
}

describe("signParentAttestation", () => {
  test("produces canonical bytes that verify against parent pubkey", async () => {
    await sodium.ready;
    const parent = await makeKeypair();
    const session = await makeKeypair();

    const att = await signParentAttestation({
      parentMemberPubkey: parent.publicKey,
      parentSecretKey: parent.secretKey,
      sessionPubkey: session.publicKey,
    });
    expect(att.parentMemberPubkey).toBe(parent.publicKey);
    expect(att.sessionPubkey).toBe(session.publicKey);
    expect(att.signature).toMatch(/^[0-9a-f]{128}$/);

    const canonical =
      `claudemesh-session-attest|${parent.publicKey}|${session.publicKey}|${att.expiresAt}`;
    const ok = sodium.crypto_sign_verify_detached(
      sodium.from_hex(att.signature),
      sodium.from_string(canonical),
      sodium.from_hex(parent.publicKey),
    );
    expect(ok).toBe(true);
  });

  test("default TTL ≤24h cap", async () => {
    const parent = await makeKeypair();
    const session = await makeKeypair();
    const now = 1_700_000_000_000;
    const att = await signParentAttestation({
      parentMemberPubkey: parent.publicKey,
      parentSecretKey: parent.secretKey,
      sessionPubkey: session.publicKey,
      now,
    });
    expect(att.expiresAt).toBe(now + DEFAULT_ATTESTATION_TTL_MS);
    expect(att.expiresAt - now).toBeLessThanOrEqual(24 * 60 * 60 * 1000);
  });
});

describe("signSessionHello", () => {
  test("signature verifies against session pubkey", async () => {
    await sodium.ready;
    const session = await makeKeypair();
    const result = await signSessionHello({
      meshId: "mesh-x",
      parentMemberPubkey: "c".repeat(64),
      sessionPubkey: session.publicKey,
      sessionSecretKey: session.secretKey,
    });
    expect(result.signature).toMatch(/^[0-9a-f]{128}$/);

    const canonical =
      `claudemesh-session-hello|mesh-x|${"c".repeat(64)}|${session.publicKey}|${result.timestamp}`;
    const ok = sodium.crypto_sign_verify_detached(
      sodium.from_hex(result.signature),
      sodium.from_string(canonical),
      sodium.from_hex(session.publicKey),
    );
    expect(ok).toBe(true);
  });
});
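For readers without libsodium at hand, the attestation check that these tests exercise can be sketched with Node's built-in `node:crypto` ed25519 support instead. This swaps in a different library than the one the CLI uses; only the canonical string layout (`claudemesh-session-attest|<parent>|<session>|<expiresAt>`) is taken from the test above. The helper `rawPubHex` and the one-hour TTL are illustrative assumptions.

```typescript
import { generateKeyPairSync, sign, verify, KeyObject } from "node:crypto";

/** Raw 32-byte ed25519 public key as hex: the tail of the 44-byte SPKI DER. */
function rawPubHex(key: KeyObject): string {
  const der = key.export({ format: "der", type: "spki" });
  return der.subarray(der.length - 32).toString("hex");
}

const parent = generateKeyPairSync("ed25519");
const session = generateKeyPairSync("ed25519");

// Assumed TTL for the demo; the real default lives in DEFAULT_ATTESTATION_TTL_MS.
const expiresAt = Date.now() + 60 * 60 * 1000;

// Canonical bytes, in the layout the unit test recomputes.
const canonical =
  `claudemesh-session-attest|${rawPubHex(parent.publicKey)}|${rawPubHex(session.publicKey)}|${expiresAt}`;

// ed25519 takes no separate digest, hence the null algorithm argument.
const signature = sign(null, Buffer.from(canonical, "utf8"), parent.privateKey);
const signatureHex = signature.toString("hex");

// Verifier side: recompute the canonical bytes and check against the parent key.
const ok = verify(null, Buffer.from(canonical, "utf8"), parent.publicKey, signature);

// Any drift in a field (here: the expiry) must invalidate the signature.
const tampered = canonical.replace(String(expiresAt), String(expiresAt + 1));
const okTampered = verify(null, Buffer.from(tampered, "utf8"), parent.publicKey, signature);

console.log(signatureHex.length, ok, okTampered);
```

The tampered case is the point of pinning canonical bytes in a unit test: if the signer and the verifier disagree on a single field or separator, verification fails in the same way.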
135
apps/cli/tests/unit/session-registry-hooks.test.ts
Normal file
@@ -0,0 +1,135 @@
/**
 * Session-registry lifecycle hooks (1.30.0+).
 *
 * The daemon's session-broker subsystem subscribes to register/deregister
 * events to open and close per-session WSes. Verifies:
 *   - hooks fire on register + deregister
 *   - replacing an entry under the same sessionId fires deregister(prior)
 *     followed by register(new)
 *   - reaper-triggered deregister fires the hook for dead pids
 *   - presence material round-trips through the registry
 */

import { afterEach, describe, expect, test, vi } from "vitest";
import {
  _resetRegistry,
  deregisterByToken,
  registerSession,
  resolveToken,
  setRegistryHooks,
  type SessionInfo,
} from "../../src/daemon/session-registry.js";

const PRESENCE = {
  sessionPubkey: "a".repeat(64),
  sessionSecretKey: "b".repeat(128),
  parentAttestation: {
    sessionPubkey: "a".repeat(64),
    parentMemberPubkey: "c".repeat(64),
    expiresAt: Date.now() + 60 * 60 * 1000,
    signature: "d".repeat(128),
  },
};

afterEach(() => {
  _resetRegistry();
});

describe("session-registry hooks", () => {
  test("onRegister fires on register", () => {
    const onRegister = vi.fn();
    const onDeregister = vi.fn();
    setRegistryHooks({ onRegister, onDeregister });

    registerSession({
      token: "t".repeat(64),
      sessionId: "sess-1",
      mesh: "alpha",
      displayName: "Alex",
      pid: 12345,
      presence: PRESENCE,
    });

    expect(onRegister).toHaveBeenCalledTimes(1);
    expect(onDeregister).not.toHaveBeenCalled();
    const arg = onRegister.mock.calls[0]![0] as SessionInfo;
    expect(arg.sessionId).toBe("sess-1");
    expect(arg.presence).toEqual(PRESENCE);
  });

  test("onDeregister fires on explicit deregister", () => {
    const onRegister = vi.fn();
    const onDeregister = vi.fn();
    setRegistryHooks({ onRegister, onDeregister });

    const token = "e".repeat(64);
    registerSession({
      token, sessionId: "sess-2", mesh: "alpha", displayName: "Alex",
      pid: 12345,
    });
    onRegister.mockClear();

    const ok = deregisterByToken(token);
    expect(ok).toBe(true);
    expect(onDeregister).toHaveBeenCalledTimes(1);
    const arg = onDeregister.mock.calls[0]![0] as SessionInfo;
    expect(arg.sessionId).toBe("sess-2");
  });

  test("re-registering same sessionId deregisters prior entry first", () => {
    const onRegister = vi.fn();
    const onDeregister = vi.fn();
    setRegistryHooks({ onRegister, onDeregister });

    const oldToken = "1".repeat(64);
    const newToken = "2".repeat(64);
    registerSession({
      token: oldToken, sessionId: "sess-3", mesh: "alpha",
      displayName: "Alex", pid: 12345,
    });
    expect(onRegister).toHaveBeenCalledTimes(1);

    // Replace under same sessionId — prior must be torn down before new one.
    registerSession({
      token: newToken, sessionId: "sess-3", mesh: "alpha",
      displayName: "Alex", pid: 12345,
    });

    expect(onDeregister).toHaveBeenCalledTimes(1);
    expect(onRegister).toHaveBeenCalledTimes(2);
    expect((onDeregister.mock.calls[0]![0] as SessionInfo).token).toBe(oldToken);
    expect((onRegister.mock.calls[1]![0] as SessionInfo).token).toBe(newToken);
    // Old token is unresolvable now.
    expect(resolveToken(oldToken)).toBeNull();
    expect(resolveToken(newToken)).toBeTruthy();
  });

  test("hooks tolerate throws (registry mutation still succeeds)", () => {
    setRegistryHooks({
      onRegister: () => { throw new Error("boom"); },
      onDeregister: () => { throw new Error("boom"); },
    });
    const token = "f".repeat(64);
    expect(() =>
      registerSession({
        token, sessionId: "sess-4", mesh: "alpha",
        displayName: "Alex", pid: 12345,
      }),
    ).not.toThrow();
    expect(resolveToken(token)).toBeTruthy();
    expect(() => deregisterByToken(token)).not.toThrow();
    expect(resolveToken(token)).toBeNull();
  });

  test("presence is preserved through resolveToken", () => {
    setRegistryHooks({});
    const token = "9".repeat(64);
    registerSession({
      token, sessionId: "sess-5", mesh: "alpha",
      displayName: "Alex", pid: 12345, presence: PRESENCE,
    });
    const got = resolveToken(token);
    expect(got).not.toBeNull();
    expect(got!.presence).toEqual(PRESENCE);
  });
});
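The contract these tests pin down (deregister the prior entry before registering its replacement, and never let a throwing hook undo a mutation) can be sketched with a small in-memory registry. This is an illustration under assumptions, not the code in `src/daemon/session-registry.ts`, which is not shown in this diff; `createRegistry` and the trimmed `SessionInfo` shape are hypothetical.

```typescript
interface SessionInfo { token: string; sessionId: string; mesh: string; }
type Hook = (s: SessionInfo) => void;
interface Hooks { onRegister?: Hook; onDeregister?: Hook; }

/** Hypothetical minimal registry with lifecycle hooks. */
function createRegistry(hooks: Hooks = {}) {
  const byToken = new Map<string, SessionInfo>();
  const tokenBySession = new Map<string, string>();

  // Hooks run after the mutation; a throwing hook is swallowed.
  const fire = (hook: Hook | undefined, s: SessionInfo): void => {
    try { hook?.(s); } catch { /* subscriber failure must not affect the registry */ }
  };

  function deregisterByToken(token: string): boolean {
    const info = byToken.get(token);
    if (!info) return false;
    byToken.delete(token);
    tokenBySession.delete(info.sessionId);
    fire(hooks.onDeregister, info);
    return true;
  }

  function registerSession(info: SessionInfo): void {
    // Replacing under the same sessionId tears down the prior entry first.
    const prior = tokenBySession.get(info.sessionId);
    if (prior !== undefined) deregisterByToken(prior);
    byToken.set(info.token, info);
    tokenBySession.set(info.sessionId, info.token);
    fire(hooks.onRegister, info);
  }

  const resolveToken = (token: string): SessionInfo | null => byToken.get(token) ?? null;
  return { registerSession, deregisterByToken, resolveToken };
}

// Demo: record the order in which hooks fire across a replacement.
const events: string[] = [];
const registry = createRegistry({
  onRegister: (s) => { events.push(`register:${s.token}`); },
  onDeregister: (s) => { events.push(`deregister:${s.token}`); },
});
registry.registerSession({ token: "old", sessionId: "sess-3", mesh: "alpha" });
registry.registerSession({ token: "new", sessionId: "sess-3", mesh: "alpha" });
console.log(events.join(" ")); // register:old deregister:old register:new
```

Firing hooks after the map mutation, inside a try/catch, is what lets a per-session WebSocket subscriber fail without leaving the registry and the hook consumer disagreeing about which token is current.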
106
docs/roadmap.md
@@ -223,43 +223,91 @@ The v0.9.0 foundation got promoted in three quick releases:
   IPC accept time, drain is a forwarder. Adds `mesh`, `target_spec`,
   `nonce`, `ciphertext`, `priority` columns to the outbox.
 - **1.25.0** — CLI thin-client routing for `peer list`,
-  `skill list`, `skill get`. Same daemon-first / bridge / cold-path
-  fallback shape as `trySendViaDaemon`.
+  `skill list`, `skill get`.
 - **1.25.0** — ambient mode: raw `claude` Just Works after
-  `claudemesh install`. No more `claudemesh launch` ceremony for the
-  common case.
+  `claudemesh install`.
 
-What this leaves on the v2.0.0 redesign roadmap is documented at
-`.artifacts/specs/2026-05-04-v2-roadmap-completion.md`: daemon
-multi-mesh, full CLI-to-thin-client conversion, mesh→workspace
-rename, HKDF identity.
+What this leaves on the v2.0.0 redesign is documented at
+`.artifacts/specs/2026-05-04-v2-roadmap-completion.md`.
 
 ---
 
-## v2.0.0 — *the daemon redesign*
+## v1.26.0 → v1.30.0 — *Sprint A toward v2* — *shipped*
 
-The single largest architectural shift. Promotes the persistent
-thing (the user's account + identity) to a persistent process (the
-daemon), demotes the ephemeral thing (the Claude session) to a thin
-client. **Half-shipped via 1.24.0 + 1.25.0; remainder spec'd at
-`.artifacts/specs/2026-05-04-v2-roadmap-completion.md`.**
+The Sprint A push completed everything spec'd for v2.0.0 *except* HKDF
+identity (deferred for security review).
 
-- **`claudemesh-daemon`** — long-lived per-user launchd / systemd
-  unit. One WebSocket per workspace, persistent across reboots and
-  Claude restarts. Listens on `~/.claudemesh/sockets/<workspace>.sock`.
-- **HKDF-derived peer keypairs** — same identity across machines,
-  no key copy ritual. Web sign-up = CLI sign-up = same crypto identity.
-- **Stateless CLI verbs** — every existing command becomes a thin
-  socket client of the daemon. ~3000 LoC removed.
-- **MCP server shrinks to ~50 LoC** — just a daemon-socket →
-  `experimental.claude/channel` adapter.
-- **`claudemesh launch` deprecated** — ambient mode means `claude`
-  works with no flags. Launch becomes a one-line alias that prints
-  "ambient mode now, just run `claude`."
-- **"Mesh" → "workspace" public surface** — DB tables keep
-  `mesh_*` names for migration sanity.
+- **1.26.0** — multi-mesh daemon. One process attaches to every joined
+  workspace simultaneously. Aggregate read routes (`/v1/peers`,
+  `/v1/skills`) tag each record with its mesh; explicit `?mesh=<slug>`
+  narrows server-side. Outbox dispatch picks the right broker via the
+  `mesh` column.
+- **1.27.0** — thin-client expansion to state + memory. `state get`,
+  `state set`, `state list`, `remember`, `recall`, `forget` all route
+  through `/v1/state` and `/v1/memory`. First teaser of the
+  `claudemesh workspace <verb>` alias surface.
+- **1.27.1** — wired six previously-dead launch flags through the CLI
+  entrypoint (`--role`, `--groups`, `--message-mode`, `--system-prompt`,
+  `--continue`, `--quiet`). Pure plumbing fix.
+- **1.27.2** — bundled `SKILL.md` gains a canonical fully-populated
+  spawn template + per-flag annotation table for unattended scripting.
+- **1.27.3** — self-healing daemon lifecycle. Every CLI verb probes
+  `/v1/version` (no more stale-socket false positives), auto-spawns a
+  detached `daemon up` under a file-lock when down, polls until live.
+  30 s recently-failed marker prevents thundering-herd retries.
+- **1.28.0** — bridge tier deletion (~600 LoC dead code removed) +
+  per-process daemon policy: `--strict` (refuse cold fallback) and
+  `--no-daemon` (skip daemon entirely). Single chokepoint at
+  `withMesh`. Env equivalents.
+- **1.29.0** — per-session IPC tokens. Every `claudemesh launch` mints
+  a 32-byte token under tmpdir mode-0600, registers it with the
+  daemon, exposes the path via `CLAUDEMESH_IPC_TOKEN_FILE` to children.
+  Daemon resolves `Authorization: ClaudeMesh-Session <hex>` to a
+  `SessionInfo`. CLI invocations from inside a launched session
+  auto-scope to its workspace instead of aggregating across all
+  joined meshes (verified: `peer list` returns 1 workspace's peers
+  with token, all 3 without). Server-side `meshFromCtx()` plumbing
+  on every read route.
+- **1.30.0** — per-session broker presence. Two `claudemesh launch`
+  sessions in the same cwd finally see each other in `peer list`. Each
+  launched session has a long-lived broker presence row owned by the
+  daemon, identified by a per-launch ephemeral keypair vouched by the
+  member's stable key (OAuth-refresh-vs-access shape). Broker gains a
+  `session_hello` handler with parent-attestation TTL ≤24h + session-
+  signature checks; daemon adds a slim `SessionBrokerClient` and
+  registry lifecycle hooks. Also fixes a latent 1.29.0 TDZ bug where
+  `claudemesh launch`'s IPC session-token registration was silently
+  failing every run. Spec at
+  `.artifacts/specs/2026-05-04-per-session-presence.md`.
 
-Spec: `.artifacts/specs/2026-05-02-roadmap.md`.
+What's left for true v2.0.0 (next sessions):
+
+- **1.31.0** — launch wizard refactor (single render loop, daemon-as-
+  step probe panel, last-used persistence, drop `@ts-nocheck`).
+- **1.32.0** — setup wizard refactor (state-detection snapshot, four-
+  branch flow, daemon install offer, post-join panel).
+- **1.33.0** — full mesh→workspace public-surface rename in help/docs/
+  site; mesh aliases tagged deprecated; protocol/DB stay `mesh_*`.
+
+---
+
+## v2.0.0 — *HKDF cross-machine identity*
+
+The remaining v2 promise after Sprint A: the user's account secret
+derives a deterministic ed25519 keypair per workspace. Same identity
+across laptop + desktop + server, no key copy ritual.
+
+- **`HKDF(account_secret, info: "claudemesh/mesh/<mesh_id>/peer",
+  salt: <user_id>)`** — derived per-workspace.
+- **Broker `account_secret` distribution** — vended on first
+  authenticated install over TLS. Needs design review on key
+  compromise recovery story.
+- **Migration** — existing keypairs in config keep working. Opt-in
+  re-enrollment for users who want cross-machine sync.
+- **Hello-sig protocol** — unchanged.
+
+Reserved as its own sprint with an explicit security-review window.
+Estimated 2-3 weeks.
 
 ---
 
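The per-workspace derivation listed in the roadmap could look roughly like the following sketch. Several details are assumptions, since the roadmap leaves them to the security review: SHA-256 as the HKDF hash, using the 32-byte HKDF output directly as the ed25519 seed, and the helper name `derivePeerPublicKeyHex`. Only the `info` string layout and the use of the user id as salt come from the roadmap text.

```typescript
import { createPrivateKey, createPublicKey, hkdfSync } from "node:crypto";

// Fixed ASN.1 header of a PKCS#8-wrapped ed25519 private key (RFC 8410);
// the 32-byte seed follows it.
const PKCS8_ED25519_PREFIX = Buffer.from("302e020100300506032b657004220420", "hex");

/** Hypothetical helper: derive a workspace peer key and return its raw public key as hex. */
function derivePeerPublicKeyHex(accountSecret: Buffer, userId: string, meshId: string): string {
  const info = `claudemesh/mesh/${meshId}/peer`;
  // Assumption: SHA-256; the roadmap does not name the hash.
  const seed = Buffer.from(hkdfSync("sha256", accountSecret, userId, info, 32));
  const privateKey = createPrivateKey({
    key: Buffer.concat([PKCS8_ED25519_PREFIX, seed]),
    format: "der",
    type: "pkcs8",
  });
  // The raw public key is the last 32 bytes of the 44-byte SPKI encoding.
  const spki = createPublicKey(privateKey).export({ format: "der", type: "spki" });
  return spki.subarray(spki.length - 32).toString("hex");
}

// Demo with a fixed, obviously non-secret input.
const demoSecret = Buffer.alloc(32, 7);
const laptop = derivePeerPublicKeyHex(demoSecret, "user-1", "mesh-a");
const desktop = derivePeerPublicKeyHex(demoSecret, "user-1", "mesh-a");
const otherMesh = derivePeerPublicKeyHex(demoSecret, "user-1", "mesh-b");
console.log(laptop === desktop, laptop === otherMesh);
```

Because the derivation is deterministic, two machines holding the same `account_secret` arrive at the same public key for a given workspace, while a different `mesh_id` yields an unrelated key. That is the "no key copy ritual" property, and it is also why distribution and compromise recovery of `account_secret` need the review the roadmap calls for.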
@@ -5,4 +5,10 @@ import { env } from "./env";
 import { schema } from "./schema";
 
 const client = postgres(env.DATABASE_URL ?? "");
-export const db = drizzle({ client, schema, casing: "snake_case" });
+// `schema` aggregates many `import * as <ns>` namespace bags. Drizzle's
+// TSchema generic struggles with namespace-typed records — the runtime
+// shape is correct but tsc can't unify the deeply-nested table/relation
+// types against DrizzleConfig's overload set. ts-expect-error keeps the
+// rest of the typecheck honest while documenting the known mismatch.
+// @ts-expect-error drizzle TSchema generic narrowing
+export const db = drizzle(client, { schema, casing: "snake_case" });