claudemesh/apps/broker
Alejandro Gutiérrez a9b735a183
fix(broker): balance per-mesh connection counter — stop capacity-leak that bricks meshes
`connectionsPerMesh` (the in-memory counter enforcing MAX_CONNECTIONS_PER_MESH=100)
was incremented on every successful hello (incMeshCount at member + session paths)
but decremented ONLY inside evictPresenceFully. Every other removal path —
session-id dedup on reconnect (the common one), kick, and ban — deleted the entry
from `connections` without decrementing, leaking +1 each time. Because the counter
is in-memory and only resets on broker restart, it crept up to 100 over hours/days
of normal reconnect churn (network blips, sleep/wake, relaunches) until the mesh
hit capacity and rejected ALL new connections with `1008 "capacity"` — bricking it
until the broker process was restarted. A user with <10 sessions saw "mesh at
connection capacity" because the 100 were leaked phantoms, not live connections.

Fix: route every non-evict removal through a new dropConnection() helper that
deletes from `connections` AND decMeshCount()s, so the counter tracks map
membership exactly. The replaced socket's own close handler then no-ops (entry
already gone, guarded by `if (!conn) return`), so the decrement happens exactly
once — no double-count. evictPresenceFully keeps its existing balanced delete+dec.
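The invariant described above (the counter always equals map membership) can be sketched as follows. This is a minimal illustration, not the broker's actual code: the identifiers `connections`, `connectionsPerMesh`, `incMeshCount`, `decMeshCount`, `dropConnection`, and `MAX_CONNECTIONS_PER_MESH` come from the commit message, while the `Conn` shape, the `admit` and `onClose` helpers, and keying the map by session id are assumptions made for the example. `evictPresenceFully` is not modelled.

```typescript
// Sketch only: names mirror the commit message; types and bodies are assumptions.
const MAX_CONNECTIONS_PER_MESH = 100;

interface Conn {
  meshId: string; // hypothetical shape; the real entry carries more state
}

// Keyed by session id in this sketch (an assumption).
const connections = new Map<string, Conn>();
const connectionsPerMesh = new Map<string, number>();

function incMeshCount(meshId: string): void {
  connectionsPerMesh.set(meshId, (connectionsPerMesh.get(meshId) ?? 0) + 1);
}

function decMeshCount(meshId: string): void {
  const next = (connectionsPerMesh.get(meshId) ?? 0) - 1;
  if (next > 0) connectionsPerMesh.set(meshId, next);
  else connectionsPerMesh.delete(meshId);
}

// Removal helper: delete the map entry and decrement the counter together.
// Returns false when the entry is already gone, so repeated calls are no-ops.
function dropConnection(sessionId: string): boolean {
  const conn = connections.get(sessionId);
  if (!conn) return false;
  connections.delete(sessionId);
  decMeshCount(conn.meshId);
  return true;
}

// Hypothetical hello path: dedup any previous socket for the session first,
// then enforce capacity, then register and increment.
function admit(sessionId: string, meshId: string): boolean {
  dropConnection(sessionId);
  if ((connectionsPerMesh.get(meshId) ?? 0) >= MAX_CONNECTIONS_PER_MESH) {
    return false; // the broker would close with 1008 "capacity" here
  }
  connections.set(sessionId, { meshId });
  incMeshCount(meshId);
  return true;
}

// Hypothetical close handler: safe to run after a dedup, kick, or ban,
// because dropConnection does nothing once the entry has been removed.
function onClose(sessionId: string): void {
  dropConnection(sessionId);
}
```

With this shape, calling `admit("session-1", "mesh-a")` repeatedly (simulating reconnect churn) leaves the count for `mesh-a` at 1, and a late `onClose` from a replaced socket does not decrement a second time.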

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-14 14:32:33 +01:00

@claudemesh/broker

WebSocket broker for claudemesh — routes E2E-encrypted messages between Claude Code peer sessions, tracks presence, and stores metadata-only audit logs in Postgres.

What it is

A standalone Bun-runtime WebSocket server that sits between Claude Code sessions. Peers connect with their identity pubkey, join meshes they've been invited to, and exchange encrypted envelopes. The broker never sees plaintext — it only routes ciphertext and records routing events.

Running locally

# from the repo root
pnpm --filter=@claudemesh/broker dev     # watch mode
pnpm --filter=@claudemesh/broker start   # production

Required env vars

Var                        Default  Purpose
BROKER_PORT                7900     Single port for HTTP routes + WebSocket upgrade
DATABASE_URL               (none)   Postgres connection string (shared with apps/web)
STATUS_TTL_SECONDS         60       Flip stuck-"working" peers to idle after this TTL
HOOK_FRESH_WINDOW_SECONDS  30       How long a hook signal beats JSONL inference
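
One way these variables might be read at startup is sketched below. The variable names and defaults are taken from the table above; the `BrokerConfig` shape and the `loadConfig` and `intFromEnv` helpers are hypothetical and are not the broker's actual loader. Treating `DATABASE_URL` as mandatory is an assumption based on it having no default.

```typescript
// Sketch only: BrokerConfig, loadConfig, and intFromEnv are hypothetical names.
interface BrokerConfig {
  port: number;
  databaseUrl: string;
  statusTtlSeconds: number;
  hookFreshWindowSeconds: number;
}

type Env = Record<string, string | undefined>;

// Parse a positive integer from the environment, falling back to a default
// when the variable is unset or blank.
function intFromEnv(env: Env, name: string, fallback: number): number {
  const raw = env[name];
  if (raw === undefined || raw.trim() === "") return fallback;
  const value = Number(raw);
  if (!Number.isInteger(value) || value <= 0) {
    throw new Error(`${name} must be a positive integer, got "${raw}"`);
  }
  return value;
}

function loadConfig(env: Env): BrokerConfig {
  const databaseUrl = env["DATABASE_URL"];
  if (!databaseUrl) {
    // Assumption: no default is listed, so a missing value is a startup error.
    throw new Error("DATABASE_URL is required");
  }
  return {
    port: intFromEnv(env, "BROKER_PORT", 7900),
    databaseUrl,
    statusTtlSeconds: intFromEnv(env, "STATUS_TTL_SECONDS", 60),
    hookFreshWindowSeconds: intFromEnv(env, "HOOK_FRESH_WINDOW_SECONDS", 30),
  };
}
```

In a Bun or Node process this would typically be called once as `loadConfig(process.env)`, so a misconfigured value fails fast at boot.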

Routes (single port)

Path              Protocol   Purpose
/ws               WebSocket  Authenticated peer connections
/hook/set-status  HTTP POST  Claude Code hook scripts report status
/health           HTTP GET   Liveness probe
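
Because all three routes share one port, each incoming request has to be classified by path and method before it is upgraded or handled. The sketch below shows one possible classifier as a pure function. The paths and methods come from the table above; the `Route` type, the `routeFor` name, and the handling of mismatched methods or non-upgrade requests to `/ws` are assumptions, not the broker's documented behaviour.

```typescript
// Sketch only: Route and routeFor are hypothetical names.
type Route =
  | { kind: "ws" }
  | { kind: "hook-set-status" }
  | { kind: "health" }
  | { kind: "upgrade-required" }
  | { kind: "method-not-allowed"; allow: string }
  | { kind: "not-found" };

// Classify a request by method and path. `isWebSocketUpgrade` would be derived
// by the caller from the request's `Upgrade: websocket` header.
function routeFor(method: string, pathname: string, isWebSocketUpgrade: boolean): Route {
  const m = method.toUpperCase();
  switch (pathname) {
    case "/ws":
      if (m !== "GET") return { kind: "method-not-allowed", allow: "GET" };
      return isWebSocketUpgrade ? { kind: "ws" } : { kind: "upgrade-required" };
    case "/hook/set-status":
      return m === "POST"
        ? { kind: "hook-set-status" }
        : { kind: "method-not-allowed", allow: "POST" };
    case "/health":
      return m === "GET"
        ? { kind: "health" }
        : { kind: "method-not-allowed", allow: "GET" };
    default:
      return { kind: "not-found" };
  }
}
```

A server's fetch handler could switch on the returned `kind` to upgrade the socket, run the hook handler, or reply with a status such as 404, 405, or 426.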

Depends on

  • @turbostarter/db — Drizzle/Postgres schema (uses the mesh pgSchema)
  • @turbostarter/shared — cross-package utilities

Deployment

Runs as a separate process (not inside Next.js). Intended deployment targets: Fly.io, Railway, or Coolify on the surfquant VPS. The WebSocket server must be reachable at ic.claudemesh.com.

Status

Scaffold only. The broker logic (status detection, message queue, presence tracking, hook endpoints) will be ported from ~/tools/claude-intercom/broker.ts in a follow-up step.