Files
claudemesh/apps/broker
Alejandro Gutiérrez b57e47ed65 feat(broker): m1 — two-phase claim/deliver + client_ack + role-tagged presence
Three correctness fixes on top of the m1 schema migration:

1) Fix the drainForMember claim-then-push race
   ----------------------------------------------------------------
   Previously the claim CTE set delivered_at = NOW() *before* the WS
   send. If readyState !== OPEN at push time, the row was marked
   delivered and the message dropped silently — at-most-once with no
   retry hook.

   The new flow:
     - claim sets (claimed_at, claim_id, claim_expires_at = NOW()+30s)
     - delivered_at stays NULL until the recipient acks
     - re-eligibility predicate now also accepts rows whose lease
       expired, so dropped pushes redeliver (at-least-once)

   Adds two helpers:
     - markDelivered() — scoped to (mesh_id, recipient pubkey) so a
       peer can only ack its own messages
     - sweepExpiredClaims() — clears expired (claimed_at, claim_id,
       claim_expires_at) every 15s, wired into startSweepers

2) Accept `client_ack` from recipients
   ----------------------------------------------------------------
   New WS message type handled in the dispatcher right after `send`.
   Lookups by clientMessageId or brokerMessageId; either is fine. Until
   the daemon (apps/cli, separate worktree) starts emitting acks, leases
   will simply expire and re-deliver — which is the desired retry
   behaviour.

3) Tag presence rows with `role`
   ----------------------------------------------------------------
   handleHello (member-keyed, used by the long-lived daemon WS) →
     role: 'control-plane'
   handleSessionHello (per-Claude-Code session WS) →
     role: 'session'

   listPeersInMesh exposes the new field; the peers_list response
   surfaces it. WSPeersListMessage type adds an optional `role` plus the
   long-undocumented `memberPubkey`. CLI-side filter swap from peerType
   to role lands in a follow-up worktree — that's why the CLI is
   untouched here per the M1 spec.

Typechecks clean (apps/broker tsc --noEmit, packages/db tsc --noEmit).
Test suite needs a real DB so wasn't run in this worktree; existing
dup-delivery and broker tests use drainForMember positionally and the
new claimerPresenceId arg is optional, so they should continue to pass.
2026-05-04 18:10:25 +01:00
..

@claudemesh/broker

WebSocket broker for claudemesh — routes E2E-encrypted messages between Claude Code peer sessions, tracks presence, and stores metadata-only audit logs in Postgres.

What it is

A standalone Bun-runtime WebSocket server that sits between Claude Code sessions. Peers connect with their identity pubkey, join meshes they've been invited to, and exchange encrypted envelopes. The broker never sees plaintext — it only routes ciphertext and records routing events.

Running locally

# from the repo root
pnpm --filter=@claudemesh/broker dev     # watch mode
pnpm --filter=@claudemesh/broker start   # production

Required env vars

Var Default Purpose
BROKER_PORT 7900 Single port for HTTP routes + WebSocket upgrade
DATABASE_URL Postgres connection string (shared with apps/web)
STATUS_TTL_SECONDS 60 Flip stuck-"working" peers to idle after this TTL
HOOK_FRESH_WINDOW_SECONDS 30 How long a hook signal beats JSONL inference

Routes (single port)

Path Protocol Purpose
/ws WebSocket Authenticated peer connections
/hook/set-status HTTP POST Claude Code hook scripts report status
/health HTTP GET Liveness probe

Depends on

  • @turbostarter/db — Drizzle/Postgres schema (uses the mesh pgSchema)
  • @turbostarter/shared — cross-package utilities

Deployment

Runs as a separate process (not inside Next.js). Intended deployment targets: Fly.io, Railway, or Coolify on the surfquant VPS. WebSocket server must be reachable at ic.claudemesh.com.

Status

Scaffold only. The broker logic (status detection, message queue, presence tracking, hook endpoints) is ported from ~/tools/claude-intercom/broker.ts in a follow-up step.