claudemesh/apps
Alejandro Gutiérrez a9b735a183
fix(broker): balance per-mesh connection counter — stop capacity-leak that bricks meshes
`connectionsPerMesh` (the in-memory counter enforcing MAX_CONNECTIONS_PER_MESH=100)
was incremented on every successful hello (incMeshCount at member + session paths)
but decremented ONLY inside evictPresenceFully. Every other removal path —
session-id dedup on reconnect (the common one), kick, and ban — deleted the entry
from `connections` without decrementing, leaking +1 each time. Because the counter
is in-memory and only resets on broker restart, it crept up to 100 over hours/days
of normal reconnect churn (network blips, sleep/wake, relaunches) until the mesh
hit capacity and rejected ALL new connections with `1008 "capacity"` — bricking it
until the broker process was restarted. A user with <10 sessions saw "mesh at
connection capacity" because the 100 were leaked phantoms, not live connections.
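The leak can be reproduced with a minimal model of the pre-fix bookkeeping. This
is a sketch under assumptions, not the broker's code: `leakyHello`,
`leakyConnections` and `leakyCounts` are hypothetical names, the map is keyed by
session id for brevity, and the dedup step is assumed to run before the capacity
check.

```typescript
// Hypothetical, simplified model of the pre-fix behaviour (not the real broker code).
const MAX_CONNECTIONS_PER_MESH = 100;

const leakyConnections = new Map<string, { meshId: string }>();
const leakyCounts = new Map<string, number>();

// Returns true if the hello is accepted, false if it would be
// rejected with close code 1008 "capacity".
function leakyHello(sessionId: string, meshId: string): boolean {
  // Session-id dedup on reconnect: the stale entry is deleted,
  // but the per-mesh counter is NOT decremented. This is the leak.
  if (leakyConnections.has(sessionId)) {
    leakyConnections.delete(sessionId);
  }
  const count = leakyCounts.get(meshId) ?? 0;
  if (count >= MAX_CONNECTIONS_PER_MESH) return false;
  leakyConnections.set(sessionId, { meshId });
  leakyCounts.set(meshId, count + 1);
  return true;
}

// A single session reconnecting 150 times (network blips, sleep/wake, relaunches).
let accepted = 0;
for (let i = 0; i < 150; i++) {
  if (leakyHello("session-1", "mesh-a")) accepted++;
}
console.log(accepted, leakyCounts.get("mesh-a"), leakyConnections.size);
```

In this model one session is enough to exhaust the mesh: the counter climbs by
one per reconnect while the map never holds more than one entry.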

Fix: route every non-evict removal through a new dropConnection() helper that
deletes the entry from `connections` and calls decMeshCount(), so the counter
tracks map membership exactly. When the replaced socket's own close handler
later fires, it no-ops (the entry is already gone, guarded by
`if (!conn) return`), so the decrement happens exactly once and is never
double-counted. evictPresenceFully keeps its existing balanced delete+dec.
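A minimal sketch of the balanced bookkeeping, assuming a map keyed by a
per-socket connection id. The function names follow this message, but their
bodies, the `Conn` shape, and the `hello`/`onClose` wrappers are assumptions
made for illustration, not the actual broker implementation.

```typescript
// Hypothetical sketch; only the function names come from the commit message.
type Conn = { meshId: string; sessionId: string };

const connections = new Map<string, Conn>(); // keyed by an assumed connection id
const connectionsPerMesh = new Map<string, number>();

function incMeshCount(meshId: string): void {
  connectionsPerMesh.set(meshId, (connectionsPerMesh.get(meshId) ?? 0) + 1);
}

function decMeshCount(meshId: string): void {
  const next = (connectionsPerMesh.get(meshId) ?? 0) - 1;
  if (next <= 0) connectionsPerMesh.delete(meshId);
  else connectionsPerMesh.set(meshId, next);
}

// Single removal path: delete the entry AND decrement, or do nothing at all.
function dropConnection(connId: string): boolean {
  const conn = connections.get(connId);
  if (!conn) return false; // already removed, so no second decrement
  connections.delete(connId);
  decMeshCount(conn.meshId);
  return true;
}

// Successful hello: drop any stale connection for the same session, then register.
function hello(connId: string, sessionId: string, meshId: string): void {
  connections.forEach((c, id) => {
    if (c.sessionId === sessionId) dropConnection(id);
  });
  connections.set(connId, { meshId, sessionId });
  incMeshCount(meshId);
}

// Close handler of a socket; a no-op if the entry was already dropped.
function onClose(connId: string): void {
  const conn = connections.get(connId);
  if (!conn) return;
  dropConnection(connId);
}

// One session reconnecting 150 times; each replaced socket closes late.
for (let i = 0; i < 150; i++) {
  hello(`conn-${i}`, "session-1", "mesh-a");
  if (i > 0) onClose(`conn-${i - 1}`);
}
console.log(connectionsPerMesh.get("mesh-a"), connections.size);
```

Under these assumptions the counter stays equal to the number of map entries
for the mesh, however many reconnects occur.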

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-14 14:32:33 +01:00