fix: queue TTL + per-member send rate limit + size cap + no-recipient reject + ack.error

Broker (all need redeploy):
- sweepOrphanMessages: DELETE undelivered message_queue rows older
  than 7 days, run hourly. Stops unbounded growth when a sender
  mistypes a recipient name (message queued forever, never claimed).
- Per-member send rate limit: TokenBucket(60/min, burst 10) keyed on
  memberId so reconnecting can't bypass it. Surfaces as queued=false,
  error='rate_limit: ...'.
- Pre-flight size cap: reject in handleSend if nonce+ciphertext+
  targetSpec exceeds env.MAX_MESSAGE_BYTES, with a clear error
  instead of a silent WSS frame-level kill.
- No-recipient reject: for direct sends, check that a matching peer
  is connected BEFORE queueing. Kills the self-send silent drop
  (sending to your own pubkey when you only have one session
  connected) and typo-to-offline-peer silent drops.
- WSAckMessage.error field added for structured failure reasons.
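
For illustration, the per-member limit above could be sketched roughly as follows. This is a minimal sketch assuming a lazily refilled bucket; the class shape, `tryTake`, `allowSend`, and the injectable clock are hypothetical and not taken from the broker source.

```typescript
// Hypothetical sketch; the broker's real TokenBucket may differ.
class TokenBucket {
  private tokens: number;
  private last: number;
  private readonly ratePerMin: number;
  private readonly burst: number;
  private readonly now: () => number;

  constructor(ratePerMin: number, burst: number, now: () => number = Date.now) {
    this.ratePerMin = ratePerMin;
    this.burst = burst;
    this.now = now;
    this.tokens = burst; // start full so a fresh member can burst
    this.last = now();
  }

  /** Refill based on elapsed time, then take one token if available. */
  tryTake(): boolean {
    const t = this.now();
    const refill = ((t - this.last) * this.ratePerMin) / 60_000;
    this.tokens = Math.min(this.burst, this.tokens + refill);
    this.last = t;
    if (this.tokens < 1) return false;
    this.tokens -= 1;
    return true;
  }
}

// One bucket per memberId (not per connection), so a reconnect
// keeps the same budget instead of getting a fresh one.
const buckets = new Map<string, TokenBucket>();

function allowSend(memberId: string, now: () => number = Date.now): boolean {
  let bucket = buckets.get(memberId);
  if (!bucket) {
    bucket = new TokenBucket(60, 10, now); // 60/min, burst 10
    buckets.set(memberId, bucket);
  }
  return bucket.tryTake();
}
```

Keying the map on memberId rather than on the socket is what makes the limit survive reconnects; a real broker would probably also evict idle buckets.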

CLI:
- ws-client ack handler reads msg.queued and msg.error; surfaces
  rate_limit / too_large / no_recipient to callers instead of
  returning ok:true with a dummy messageId.
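
A rough sketch of how a client might map such an ack onto a caller-facing result. Only `queued` and `error` come from this commit message; the `messageId` field, the `SendResult` type, and `ackToResult` are hypothetical names used for illustration.

```typescript
// Assumed ack shape: `queued` and `error` are named in the commit
// message; `messageId` is an assumption.
interface WSAckMessage {
  queued: boolean;
  messageId?: string;
  error?: string; // e.g. "rate_limit: ...", "too_large: ...", "no_recipient: ..."
}

type SendResult =
  | { ok: true; messageId: string }
  | { ok: false; reason: string; detail: string };

/** Turn a broker ack into a result the caller can branch on. */
function ackToResult(msg: WSAckMessage): SendResult {
  if (!msg.queued || msg.error !== undefined) {
    const error = msg.error ?? "unknown";
    const idx = error.indexOf(":");
    // Split "reason: detail" so callers can match on the reason code.
    const reason = (idx === -1 ? error : error.slice(0, idx)).trim();
    const detail = idx === -1 ? "" : error.slice(idx + 1).trim();
    return { ok: false, reason, detail };
  }
  if (msg.messageId === undefined) {
    // Never invent an id: a queued ack without one is still a failure.
    return { ok: false, reason: "unknown", detail: "ack missing messageId" };
  }
  return { ok: true, messageId: msg.messageId };
}
```

Returning a tagged union means a failed send can no longer be mistaken for success by a caller that only checks `ok`.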

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Alejandro Gutiérrez
2026-04-15 14:44:09 +01:00
parent 39fe296aaa
commit 1a7a059e75
5 changed files with 116 additions and 5 deletions

@@ -293,6 +293,29 @@ export async function sweepStalePresences(): Promise<void> {
);
}
/**
* Sweep undelivered message_queue rows older than 7 days.
*
* Messages sent to non-matching targetSpecs (e.g. typos, peer disconnected
* before claim) would otherwise sit in delivered_at=NULL forever — unbounded
* growth. 7d matches invite expiry, so any legitimately held message is
* already stale by then.
*
* Returns the number of rows deleted so the caller can log + meter.
*/
export async function sweepOrphanMessages(): Promise<number> {
  const cutoff = new Date(Date.now() - 7 * 24 * 60 * 60 * 1000);
  const result = await db.execute(sql`
    DELETE FROM mesh.message_queue
    WHERE delivered_at IS NULL
      AND created_at < ${cutoff}
    RETURNING id
  `);
  const rows = (result as unknown as { rows?: unknown[]; length?: number }).rows ?? result;
  const count = Array.isArray(rows) ? rows.length : 0;
  return count;
}
/** Sweep expired pending_status entries. */
export async function sweepPendingStatuses(): Promise<void> {
const cutoff = new Date(Date.now() - PENDING_TTL_MS);
@@ -1667,6 +1690,12 @@ export function startSweepers(): void {
console.error("[broker] stale presence sweep:", e),
);
}, 30_000);
// Orphan-message sweep every hour; cheap, rows are all >7d at deletion time.
setInterval(() => {
sweepOrphanMessages()
.then((n) => { if (n > 0) console.log(`[broker] orphan msgs swept: ${n}`); })
.catch((e) => console.error("[broker] orphan msg sweep:", e));
}, 60 * 60_000).unref();
}
/** Stop background sweepers and mark all active presences disconnected. */