Alejandro Gutiérrez b57e47ed65 feat(broker): m1 — two-phase claim/deliver + client_ack + role-tagged presence

Three correctness fixes on top of the m1 schema migration:

1) Fix the drainForMember claim-then-push race
   ----------------------------------------------------------------
   Previously the claim CTE set delivered_at = NOW() *before* the WS
   send. If readyState !== OPEN at push time, the row was marked
   delivered and the message dropped silently — at-most-once with no
   retry hook.

   The new flow:
     - claim sets (claimed_at, claim_id, claim_expires_at = NOW()+30s)
     - delivered_at stays NULL until the recipient acks
     - re-eligibility predicate now also accepts rows whose lease
       expired, so dropped pushes redeliver (at-least-once)
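
The claim step above can be modelled in memory as follows. This is a minimal sketch, not the broker's actual CTE: the `QueuedMessage` shape, `isEligible`, `claimEligible`, and the choice to treat a lease as expired once `claimExpiresAt <= now` are all assumptions made for illustration. Only the column names and the 30s lease come from the commit message.

```typescript
// Hypothetical in-memory model of the two-phase claim.
// The real broker performs this inside a Postgres CTE.
interface QueuedMessage {
  id: string;
  claimedAt: number | null;      // mirrors claimed_at (ms since epoch)
  claimId: string | null;        // mirrors claim_id
  claimExpiresAt: number | null; // mirrors claim_expires_at
  deliveredAt: number | null;    // mirrors delivered_at; set only on ack
}

const LEASE_MS = 30_000; // NOW()+30s, per the commit message

// A row is eligible when it is undelivered and either was never
// claimed or its lease has expired (assumed: expiry <= now).
function isEligible(row: QueuedMessage, now: number): boolean {
  if (row.deliveredAt !== null) return false;
  return row.claimExpiresAt === null || row.claimExpiresAt <= now;
}

// Claim every eligible row. deliveredAt is deliberately left untouched,
// so a failed push leaves the row recoverable once the lease expires.
function claimEligible(
  rows: QueuedMessage[],
  claimId: string,
  now: number,
): QueuedMessage[] {
  const claimed: QueuedMessage[] = [];
  for (const row of rows) {
    if (!isEligible(row, now)) continue;
    row.claimedAt = now;
    row.claimId = claimId;
    row.claimExpiresAt = now + LEASE_MS;
    claimed.push(row);
  }
  return claimed;
}
```

In this model a second claim inside the lease window returns nothing, while a claim at or after the expiry picks the row up again. That redelivery is what makes the flow at-least-once.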

   Adds two helpers:
     - markDelivered() — scoped to (mesh_id, recipient pubkey) so a
       peer can only ack its own messages
     - sweepExpiredClaims() — clears expired (claimed_at, claim_id,
       claim_expires_at) every 15s, wired into startSweepers
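
A rough in-memory sketch of what the two helpers do. The function names carry an `InMemory` suffix to make clear they are illustrative stand-ins: the `Row` shape and both signatures are assumptions, and the real helpers presumably issue SQL against Postgres.

```typescript
// Hypothetical row shape; field names mirror the columns in the commit message.
interface Row {
  id: string;
  meshId: string;
  recipientPubkey: string;
  claimedAt: number | null;
  claimId: string | null;
  claimExpiresAt: number | null;
  deliveredAt: number | null;
}

// Mark a message delivered only when it belongs to the acking peer in
// that mesh, so a peer cannot ack someone else's message.
function markDeliveredInMemory(
  rows: Row[],
  meshId: string,
  recipientPubkey: string,
  messageId: string,
  now: number,
): boolean {
  const row = rows.find(
    (r) =>
      r.id === messageId &&
      r.meshId === meshId &&
      r.recipientPubkey === recipientPubkey &&
      r.deliveredAt === null,
  );
  if (row === undefined) return false;
  row.deliveredAt = now;
  return true;
}

// Clear the three claim columns on undelivered rows whose lease has
// expired. Returns how many rows were released.
function sweepExpiredClaimsInMemory(rows: Row[], now: number): number {
  let cleared = 0;
  for (const row of rows) {
    if (row.deliveredAt !== null) continue;
    if (row.claimExpiresAt === null || row.claimExpiresAt > now) continue;
    row.claimedAt = null;
    row.claimId = null;
    row.claimExpiresAt = null;
    cleared += 1;
  }
  return cleared;
}
```

A periodic caller, for example `setInterval(() => sweepExpiredClaimsInMemory(rows, Date.now()), 15_000)`, would match the 15s cadence mentioned above.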

2) Accept `client_ack` from recipients
   ----------------------------------------------------------------
   New WS message type handled in the dispatcher right after `send`.
   The acked message is looked up by clientMessageId or brokerMessageId;
   either is accepted. Until the daemon (apps/cli, separate worktree)
   starts emitting acks, leases will simply expire and redeliver, which
   is the desired retry behaviour.
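
A sketch of how a `client_ack` frame might be validated before dispatch. The wire shape is an assumption: the commit message only names the message type and the two identifier fields, so `ParsedClientAck` and `parseClientAck` are hypothetical.

```typescript
// Hypothetical normalized form of an incoming ack.
interface ParsedClientAck {
  type: "client_ack";
  clientMessageId: string | null;
  brokerMessageId: string | null;
}

// Parse a raw WS text frame. Returns null for anything that is not a
// well-formed client_ack carrying at least one of the two ids.
function parseClientAck(raw: string): ParsedClientAck | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null; // not JSON
  }
  if (typeof data !== "object" || data === null) return null;
  const msg = data as Record<string, unknown>;
  if (msg["type"] !== "client_ack") return null;

  const clientId = msg["clientMessageId"];
  const brokerId = msg["brokerMessageId"];
  const clientMessageId = typeof clientId === "string" ? clientId : null;
  const brokerMessageId = typeof brokerId === "string" ? brokerId : null;

  // Either id is enough, but an ack with neither cannot be matched.
  if (clientMessageId === null && brokerMessageId === null) return null;
  return { type: "client_ack", clientMessageId, brokerMessageId };
}
```

A dispatcher could then pass whichever id is present to the delivery-marking helper, scoped to the sender's own mesh and pubkey.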

3) Tag presence rows with `role`
   ----------------------------------------------------------------
   handleHello (member-keyed, used by the long-lived daemon WS) →
     role: 'control-plane'
   handleSessionHello (per-Claude-Code session WS) →
     role: 'session'

   listPeersInMesh exposes the new field; the peers_list response
   surfaces it. The WSPeersListMessage type adds an optional `role` plus
   the long-undocumented `memberPubkey`. The CLI-side filter swap from
   peerType to role lands in a follow-up worktree, which is why the CLI
   is untouched here, per the M1 spec.
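
The role tagging can be sketched as below. `PeerEntry` is a hypothetical subset of a peers_list entry, and `roleForHello` and `peersWithRole` are illustrative helpers, not functions from the broker. Only the two role values and the optional `role` / `memberPubkey` fields come from the commit message.

```typescript
type PresenceRole = "control-plane" | "session";

// Hypothetical subset of one entry in a peers_list response.
interface PeerEntry {
  memberPubkey: string;
  role?: PresenceRole; // optional: older rows may not carry it
}

// Which hello handler created the presence row (assumed labels).
type HelloKind = "hello" | "session_hello";

// handleHello -> control-plane, handleSessionHello -> session.
function roleForHello(kind: HelloKind): PresenceRole {
  return kind === "hello" ? "control-plane" : "session";
}

// Client-side filter by role. Entries without a role never match,
// which is an assumption about how untagged rows should be treated.
function peersWithRole(peers: PeerEntry[], role: PresenceRole): PeerEntry[] {
  return peers.filter((p) => p.role === role);
}
```

A CLI that wants to list only live Claude Code sessions could call `peersWithRole(peers, "session")` once the follow-up filter swap lands.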

Typechecks clean (apps/broker tsc --noEmit, packages/db tsc --noEmit).
The test suite needs a real DB, so it wasn't run in this worktree. The
existing dup-delivery and broker tests call drainForMember positionally
and the new claimerPresenceId arg is optional, so they should continue
to pass.

2026-05-04 18:10:25 +01:00

Name	Last commit	Date
scripts	fix(api): mint owner peer-identity row at mesh creation	2026-05-02 17:02:40 +01:00
src	feat(broker): m1 — two-phase claim/deliver + client_ack + role-tagged presence	2026-05-04 18:10:25 +01:00
tests	feat(broker): canonical session-hello + parent-attestation helpers	2026-05-04 12:57:28 +01:00
DEPLOY_SPEC.md	docs(broker): production deployment spec	2026-04-04 22:15:24 +01:00
Dockerfile	feat(ga): close remaining GA blockers (backcompat, HA prep, tests, docs)	2026-04-15 23:51:28 +01:00
eslint.config.js	feat(broker): scaffold apps/broker workspace (bun WS runtime, no port yet)	2026-04-04 21:24:17 +01:00
package.json	refactor: rename cli-v2 → cli, archive legacy cli, plus broker-side grants + auto-migrate	2026-04-15 08:44:52 +01:00
README.md	fix(broker): default port 7899 → 7900 to avoid collision with claude-intercom dev	2026-04-04 21:48:57 +01:00
tsconfig.json	feat(broker): branded react-email template for mesh invite	2026-04-15 02:04:28 +01:00
vitest.config.ts	feat(crypto): client-side direct-message encryption with crypto_box	2026-04-04 22:48:33 +01:00

README.md

@claudemesh/broker

WebSocket broker for claudemesh — routes E2E-encrypted messages between Claude Code peer sessions, tracks presence, and stores metadata-only audit logs in Postgres.

What it is

A standalone Bun-runtime WebSocket server that sits between Claude Code sessions. Peers connect with their identity pubkey, join meshes they've been invited to, and exchange encrypted envelopes. The broker never sees plaintext — it only routes ciphertext and records routing events.

Running locally

# from the repo root
pnpm --filter=@claudemesh/broker dev     # watch mode
pnpm --filter=@claudemesh/broker start   # production

Required env vars

Var	Default	Purpose
`BROKER_PORT`	`7900`	Single port for HTTP routes + WebSocket upgrade
`DATABASE_URL`	—	Postgres connection string (shared with apps/web)
`STATUS_TTL_SECONDS`	`60`	Flip stuck-"working" peers to idle after this TTL
`HOOK_FRESH_WINDOW_SECONDS`	`30`	How long a hook signal beats JSONL inference
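
As an illustration of the table above, here is a minimal sketch of a config loader that applies the documented defaults. `BrokerConfig`, `intFromEnv`, and `loadBrokerConfig` are hypothetical names, and rejecting non-positive or non-integer values is an assumption rather than documented broker behaviour.

```typescript
interface BrokerConfig {
  port: number;
  databaseUrl: string;
  statusTtlSeconds: number;
  hookFreshWindowSeconds: number;
}

// Parse a positive integer env value, falling back to the default when unset.
function intFromEnv(
  value: string | undefined,
  fallback: number,
  name: string,
): number {
  if (value === undefined || value === "") return fallback;
  const parsed = Number(value);
  if (!Number.isInteger(parsed) || parsed <= 0) {
    throw new Error(`${name} must be a positive integer, got "${value}"`);
  }
  return parsed;
}

// Takes the environment as a plain record so it can be exercised
// without touching the real process environment.
function loadBrokerConfig(
  env: Record<string, string | undefined>,
): BrokerConfig {
  const databaseUrl = env["DATABASE_URL"];
  if (databaseUrl === undefined || databaseUrl === "") {
    throw new Error("DATABASE_URL is required (no default)");
  }
  return {
    port: intFromEnv(env["BROKER_PORT"], 7900, "BROKER_PORT"),
    databaseUrl,
    statusTtlSeconds: intFromEnv(
      env["STATUS_TTL_SECONDS"], 60, "STATUS_TTL_SECONDS"),
    hookFreshWindowSeconds: intFromEnv(
      env["HOOK_FRESH_WINDOW_SECONDS"], 30, "HOOK_FRESH_WINDOW_SECONDS"),
  };
}
```

At startup this would typically be called once with the process environment, so a missing `DATABASE_URL` fails fast instead of surfacing on the first query.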

Routes (single port)

Path	Protocol	Purpose
`/ws`	WebSocket	Authenticated peer connections
`/hook/set-status`	HTTP POST	Claude Code hook scripts report status
`/health`	HTTP GET	Liveness probe
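
Because all three routes share one port, the server has to branch on path and method before deciding whether to upgrade. A minimal sketch of that decision, independent of any server runtime, follows. `RouteKind` and `classifyRequest` are hypothetical; the method checks reflect the table above plus the fact that a WebSocket upgrade arrives as an HTTP GET.

```typescript
type RouteKind =
  | "ws-upgrade"
  | "hook-set-status"
  | "health"
  | "method-not-allowed"
  | "not-found";

// Decide what to do with a request from its method and pathname alone.
function classifyRequest(method: string, pathname: string): RouteKind {
  const m = method.toUpperCase();
  switch (pathname) {
    case "/ws":
      // The WebSocket handshake is a GET carrying an Upgrade header.
      return m === "GET" ? "ws-upgrade" : "method-not-allowed";
    case "/hook/set-status":
      return m === "POST" ? "hook-set-status" : "method-not-allowed";
    case "/health":
      return m === "GET" ? "health" : "method-not-allowed";
    default:
      return "not-found";
  }
}
```

A fetch-style handler could call `classifyRequest(req.method, new URL(req.url).pathname)` and map the result to an upgrade, a JSON response, or a 404/405.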

Depends on

@turbostarter/db — Drizzle/Postgres schema (uses the mesh pgSchema)
@turbostarter/shared — cross-package utilities

Deployment

Runs as a separate process (not inside Next.js). Intended deployment targets: Fly.io, Railway, or Coolify on the surfquant VPS. The WebSocket server must be reachable at ic.claudemesh.com.

Status

Scaffold only. The broker logic (status detection, message queue, presence tracking, hook endpoints) will be ported from ~/tools/claude-intercom/broker.ts in a follow-up step.