Commit Graph

134 Commits

Author SHA1 Message Date
Alejandro Gutiérrez
77f4316f2d feat(broker+api+cli): per-topic E2E encryption — v0.3.0 phase 3 (CLI)
Some checks failed
CI / Lint (push) Has been cancelled
CI / Typecheck (push) Has been cancelled
CI / Broker tests (Postgres) (push) Has been cancelled
CI / Docker build (linux/amd64) (push) Has been cancelled
Wire format:
  topic_member_key.encrypted_key = base64(
    <32-byte sender x25519 pubkey> || crypto_box(topic_key)
  )

Embedding sender pubkey inline lets re-sealed copies (carrying a
different sender than the original creator-seal) decode the same
way as creator copies, without an extra schema column or join.
topic.encrypted_key_pubkey stays for backwards-compat metadata
but the wire truth is the inline prefix.

API (phase 3):
  GET  /v1/topics/:name/pending-seals  list members without keys
  POST /v1/topics/:name/seal           submit a re-sealed copy
  POST /v1/messages now accepts bodyVersion (1|2); v2 skips the
  regex mention extraction (server can't read v2 ciphertext).
  GET  /messages + /stream now return bodyVersion per row.

Broker + web mutations updated to use the inline-sender format
when sealing. ensureGeneralTopic (web) also generates topic keys
per the bugfix that landed earlier today; both producers now
share one wire format.

CLI (claudemesh-cli@1.8.0):
  + apps/cli/src/services/crypto/topic-key.ts — fetch/decrypt/encrypt/seal
  + claudemesh topic post <name> <msg> — encrypted REST send (v2)
  * claudemesh topic tail <name> — decrypts v2 on render, runs a
    30s background re-seal loop for pending joiners

Web client stays on v1 plaintext until phase 3.5 (browser-side
persistent identity in IndexedDB). Mention fan-out from phase 1
already works for both versions, so /v1/notifications keeps
working through the cutover.

Spec at .artifacts/specs/2026-05-02-topic-key-onboarding.md
updated with the implemented inline-sender format and the
phase 3.5 web plan.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 21:03:11 +01:00
Alejandro Gutiérrez
82ebd2b6be chore(broker): wire mentions through WS topic_send + dedupe imports
Some checks failed
CI / Lint (push) Has been cancelled
CI / Typecheck (push) Has been cancelled
CI / Broker tests (Postgres) (push) Has been cancelled
CI / Docker build (linux/amd64) (push) Has been cancelled
WSSendMessage gains an optional mentions field; the broker forwards
it into appendTopicMessage so WS-driven topic sends get the same
write-time fan-out path as REST POST /v1/messages. v1 messages
(today's plaintext-base64) still fall back to a body regex when the
field is omitted, so existing CLIs aren't broken; v2 ciphertext
clients in phase 3 will populate it.

Also drops the duplicate meshMember import (kept the meshMember-as-
memberTable alias which the rest of the file uses).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 20:45:57 +01:00
Alejandro Gutiérrez
da5103a315 feat(broker+api): per-topic symmetric keys — schema + creator seal
Some checks failed
CI / Lint (push) Has been cancelled
CI / Typecheck (push) Has been cancelled
CI / Broker tests (Postgres) (push) Has been cancelled
CI / Docker build (linux/amd64) (push) Has been cancelled
Phase 2 (infra layer) of v0.3.0. Topics now generate a 32-byte
XSalsa20-Poly1305 key on creation; the broker seals one copy via
crypto_box for the topic creator using an ephemeral x25519
sender keypair (whose public half lives on
topic.encrypted_key_pubkey). Topic key plaintext leaves memory
immediately after the creator's seal — the broker can't read it.

Schema 0026:
  + topic.encrypted_key_pubkey (text, nullable for legacy v0.2.0)
  + topic_message.body_version  (integer, 1=plaintext / 2=v2 cipher)
  + topic_member_key            (id, topic_id, member_id,
                                 encrypted_key, nonce, rotated_at)

API:
  + GET /v1/topics/:name/key — return the calling member's sealed
    copy. 404 if no copy exists yet (joined post-creation, no peer
    has re-sealed). 409 if the topic is legacy unencrypted.

Open question parked: how new joiners get their sealed copy
without ceding plaintext to the broker. Spec at
.artifacts/specs/2026-05-02-topic-key-onboarding.md picks
member-driven re-seal (Option B). Pending-seals endpoint, seal
POST, and the actual on-the-wire encryption ship in phase 3.

Mention fan-out from phase 1 (notification table) is decoupled
from ciphertext, so /v1/notifications + MentionsSection keep
working unchanged through both phases.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 20:28:10 +01:00
Alejandro Gutiérrez
1a238d4178 feat(api+broker+web): write-time mention fan-out via notification table
Some checks failed
CI / Lint (push) Has been cancelled
CI / Typecheck (push) Has been cancelled
CI / Broker tests (Postgres) (push) Has been cancelled
CI / Docker build (linux/amd64) (push) Has been cancelled
Phase 1 of v0.3.0 — replaces the regex-on-decoded-ciphertext scan
in /v1/notifications and the dashboard MentionsSection with reads
from a new mesh.notification table populated at write time.

Schema 0025: mesh.notification (id, mesh_id, topic_id, message_id,
recipient_member_id, sender_member_id, kind, created_at, read_at)
with a unique (message_id, recipient) so a re-fanned message yields
one row per recipient. Backfills existing v0.2.0 messages by
regex-matching the (still-base64-plaintext) bodies — guarded with
a base64 + length check so binary ciphertext doesn't crash the
migration.

Writers (POST /v1/messages + broker appendTopicMessage) now
extract @-mentions from either an explicit `mentions: string[]`
on the request OR a regex over the base64 plaintext (transitional
fallback). Targets are intersected with the mesh roster + capped
at 32 per message. Web chat panel sends the explicit array now so
it keeps working after phase 2 lands.

Readers switch to JOIN-on-notification:
  /v1/notifications      — table-backed, supports ?unread=1
  POST /v1/notifications/read  — new, mark by ids or all-up-to
  MentionsSection (RSC) — same JOIN, returns readAt for each row

GET /v1/notifications also gains a read_at field per row so a
future bell UI can show unread vs read.

Once per-topic encryption (phase 2) lands, the regex fallback
becomes a no-op for v2 messages — clients MUST send `mentions`,
which they already do.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 20:23:50 +01:00
Alejandro Gutiérrez
0f32529370 fix(apikey): revoke must verify a row was actually updated
Some checks failed
CI / Lint (push) Has been cancelled
CI / Typecheck (push) Has been cancelled
CI / Broker tests (Postgres) (push) Has been cancelled
CI / Docker build (linux/amd64) (push) Has been cancelled
claudemesh apikey revoke <id> reported success even when the input
didn't match any row in mesh.api_key. The CLI's `apikey list` shows
truncated 8-char prefixes; users naturally paste those; broker did
exact-id match against meshApiKey.id; UPDATE affected 0 rows; old
revokeApiKey returned void so the CLI couldn't tell. Discovered via
end-to-end CLI smoke test against prod (roadmap validation pass).

Three-part fix:

- broker.revokeApiKey now returns
  { status: "revoked"|"not_found"|"not_unique"; id?, matches? } and
  accepts either the full id or a unique prefix (>=6 chars). Prefix
  matching is bounded to the caller's mesh and only succeeds if
  exactly one row matches; ambiguous prefixes return not_unique so
  we never silently revoke the wrong key.

- New WSApiKeyRevokeResponseMessage carries the structured status
  back to the CLI. Old apikey_revoke_ok type removed before being
  released — never shipped to users. The error path is no longer
  used for not_found/not_unique cases; the unified response carries
  both outcomes.

- CLI's apiKeyRevoke now resolves with { ok, id } | { ok: false,
  code, message }. runApiKeyRevoke surfaces the code/message and
  exits non-zero on failure (NOT_FOUND for missing, INVALID_ARGS
  for ambiguous prefix).

Net effect: pasting `claudemesh apikey revoke vq0fwjdX` now actually
revokes the key whose id starts with vq0fwjdX (or fails loud if 0
or >1 keys match). Verified against prod via the new branch's CLI
binary before commit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 18:39:25 +01:00
Alejandro Gutiérrez
2aa21fe07c fix(api): mint owner peer-identity row at mesh creation
Some checks failed
CI / Lint (push) Has been cancelled
CI / Typecheck (push) Has been cancelled
CI / Broker tests (Postgres) (push) Has been cancelled
CI / Docker build (linux/amd64) (push) Has been cancelled
Web-first owners had no mesh.member row because the broker only ever
created one on first WS hello (CLI flow). The topic chat page server
component requires that row to issue a dashboard apikey
(issuedByMemberId is a FK to mesh.member), so visiting the chat for a
web-only mesh hit notFound() on the owner's own room.

Forward fix: createMyMesh now generates a fresh ed25519 peer keypair,
inserts a mesh.member row with role=admin and dashboardUserId=userId,
and subscribes the owner to the auto-created #general topic as 'lead'.
The peer secret key is intentionally discarded — web users don't sign
anything in v0.2.0 (no DMs, base64 plaintext on topics). If the same
user later runs the CLI, the broker mints a separate member row from
its own keypair; both work for their respective surfaces.

Backfill: apps/broker/scripts/backfill-owner-members.ts walks every
non-archived mesh whose owner has no member row, generates real
ed25519 keypairs via libsodium, inserts the rows in a transaction,
and subscribes each as 'lead' on #general. Already run against prod
— 13 owner rows minted, ddtest verified end-to-end via playwriter
(send → poll → render round-trip ok).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 17:02:40 +01:00
Alejandro Gutiérrez
6de5e275fa chore(broker): comment migrate skip flag as break-glass only
Some checks failed
CI / Lint (push) Has been cancelled
CI / Typecheck (push) Has been cancelled
CI / Broker tests (Postgres) (push) Has been cancelled
CI / Docker build (linux/amd64) (push) Has been cancelled
Now that the filename-tracked runner is in place and prod is bootstrapped,
BROKER_SKIP_MIGRATE=1 is no longer needed. Removed from Coolify env;
the comment is updated to reflect that the flag is a break-glass for
ops, not the steady-state.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 16:45:36 +01:00
Alejandro Gutiérrez
c2cd67a885 feat(broker): filename-tracked migration runner replaces drizzle's
Some checks failed
CI / Lint (push) Has been cancelled
CI / Typecheck (push) Has been cancelled
CI / Broker tests (Postgres) (push) Has been cancelled
CI / Docker build (linux/amd64) (push) Has been cancelled
drizzle's _journal.json drifted to idx=11 while the file system had 25
.sql files; the prod drizzle.__drizzle_migrations table was further
behind with 3 rows. The runtime migrator silently skipped anything
outside the journal, so every new schema change required psql -f by
hand.

The new runner tracks applied files in mesh.__cmh_migrations
(filename PK + sha256 + applied_at). On startup it bootstraps the
tracking table inline, lists migrations/*.sql lexicographically,
filters out already-applied files, and runs the rest in transaction
order under the existing pg_advisory_lock. SHA mismatches on
already-applied files emit a warning but don't fail (cosmetic edits
are common); production drift detection lives elsewhere.

Bootstrap script at apps/broker/scripts/bootstrap-cmh-migrations.ts
computes file hashes and seeds the tracking table — already run
against prod with all 25 current files registered as applied. Future
deploys pick up only truly new migrations.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 16:41:51 +01:00
Alejandro Gutiérrez
2e97a0eeee feat(broker+api): every mesh ships with a default #general topic
Some checks failed
CI / Lint (push) Has been cancelled
CI / Typecheck (push) Has been cancelled
CI / Broker tests (Postgres) (push) Has been cancelled
CI / Docker build (linux/amd64) (push) Has been cancelled
The web chat surface needed a guaranteed landing room — a topic that
exists for every mesh from creation onward so the dashboard always has
somewhere to drop the user. #general is the convention; ephemeral DMs
remain ephemeral (mesh.message_queue) so agentic privacy is unchanged.

Three hooks plus a backfill:

- packages/api/src/modules/mesh/mutations.ts — createMyMesh now calls
  ensureGeneralTopic() right after the mesh insert. New helper is
  idempotent via the unique (mesh_id, name) index.
- apps/broker/src/index.ts — handleMeshCreate (CLI claudemesh new)
  inserts #general + subscribes the owner member as 'lead' in the
  same handler.
- apps/broker/src/crypto.ts — invite-claim flow auto-subscribes the
  newly minted member to #general as 'member', defensively ensuring
  the topic exists if predates this change.
- packages/db/migrations/0024_general_topic_backfill.sql — one-shot
  backfill: creates #general for every active mesh that doesn't have
  one, subscribes every active member, and marks the mesh owner as
  'lead' based on owner_user_id == member.user_id. Idempotent.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 16:32:16 +01:00
Alejandro Gutiérrez
13d691980a feat(broker+cli): apikey create/list/revoke verbs (v0.2.0 #71)
Some checks failed
CI / Lint (push) Has been cancelled
CI / Typecheck (push) Has been cancelled
CI / Broker tests (Postgres) (push) Has been cancelled
CI / Docker build (linux/amd64) (push) Has been cancelled
Issuance flow over WS for now (REST endpoints come next slice).
Plaintext secret returned ONCE on create — never recoverable.

- broker: 3 WS handlers (apikey_create/list/revoke), wire types in
  union, audit log on issuance + revoke
- ws-client: apiKeyCreate/List/Revoke with resolver maps, response
  dispatch
- CLI: claudemesh apikey create <label> [--cap a,b] [--topic c,d]
  [--expires ISO]; list shows status, scope, last-used; revoke by id
- policy: apikey create + revoke prompt by default (issuing or
  disabling a credential is meaningful)

Default capability set is "send,read" — least privilege for unscoped
keys (admin must explicitly opt-in).

Spec: .artifacts/specs/2026-05-02-v0.2.0-scope.md

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 02:13:12 +01:00
Alejandro Gutiérrez
f45380d231 feat(broker): api key schema and helpers
Some checks failed
CI / Lint (push) Has been cancelled
CI / Typecheck (push) Has been cancelled
CI / Broker tests (Postgres) (push) Has been cancelled
CI / Docker build (linux/amd64) (push) Has been cancelled
Foundation for v0.2.0 REST + external WS auth.

Bearer tokens stored as SHA-256 hashes; secrets are 256-bit CSPRNG so
Argon2 would waste cost without security gain.

Adds mesh.api_key table, migration 0023 applied manually to prod, and
helpers: createApiKey, listApiKeys, revokeApiKey, verifyApiKey.

Next slices: CLI apikey verbs and REST endpoints in apps/web router.

Spec: .artifacts/specs/2026-05-02-v0.2.0-scope.md

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 02:09:44 +01:00
Alejandro Gutiérrez
f98c2de5a3 fix(broker): topic-tagged sends bypass direct-target pre-flight
Some checks failed
CI / Lint (push) Has been cancelled
CI / Typecheck (push) Has been cancelled
CI / Broker tests (Postgres) (push) Has been cancelled
CI / Docker build (linux/amd64) (push) Has been cancelled
handleSend's pre-flight check rejected #<topicId> sends because the
target wasn't matched by @group / * / pubkey, so it fell into the
"direct" branch and looked for a peer with that pubkey. Topic targets
need their own class — delivery happens via topic_member, not by
matching connected peers.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 02:01:35 +01:00
Alejandro Gutiérrez
1afae7a507 feat(broker+cli): topics — conversation scope within a mesh (v0.2.0)
Some checks failed
CI / Lint (push) Has been cancelled
CI / Typecheck (push) Has been cancelled
CI / Broker tests (Postgres) (push) Has been cancelled
CI / Docker build (linux/amd64) (push) Has been cancelled
Adds the third axis of mesh organization: mesh = trust boundary,
group = identity tag, topic = conversation scope. Topic-tagged
messages filter delivery by topic_member rows and persist to a
topic_message history table for back-scroll on reconnect.

Schema (additive):
- mesh.topic, mesh.topic_member, mesh.topic_message tables
- topic_visibility (public|private|dm) and topic_member_role
  (lead|member|observer) enums
- migration 0022_topics.sql, hand-written following project convention
  (drizzle journal has been drifting since 0011)

Broker:
- 10 helpers (createTopic, listTopics, findTopicByName, joinTopic,
  leaveTopic, topicMembers, getMemberTopicIds, appendTopicMessage,
  topicHistory, markTopicRead)
- drainForMember matches "#<topicId>" target_specs via member's
  topic memberships
- 7 WS handlers (topic_create/list/join/leave/members/history/mark_read)
  + resolveTopicId helper accepting id-or-name
- handleSend auto-persists topic-tagged messages to history

CLI:
- claudemesh topic create/list/join/leave/members/history/read
- claudemesh send "#deploys" "..." resolves topic name to id
- bundled skill teaches Claude the DM/group/topic decision matrix
- policy-classify recognizes topic create/join/leave as writes

Spec: .artifacts/specs/2026-05-02-v0.2.0-scope.md

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 01:53:42 +01:00
Alejandro Gutiérrez
b49e9a9b61 feat(cli+broker): three-tier peer removal: disconnect, kick, ban
Some checks failed
CI / Lint (push) Has been cancelled
CI / Typecheck (push) Has been cancelled
CI / Broker tests (Postgres) (push) Has been cancelled
CI / Docker build (linux/amd64) (push) Has been cancelled
Broker (apps/broker/src/index.ts)
- Unified disconnect/kick handler uses close code 1000 for disconnect
  (CLI auto-reconnects) vs 4001 for kick (CLI exits, no reconnect).
- Ban now closes with code 4002.
- Hello handler: revoked members get a specific 'revoked' error with a
  'Contact the mesh owner to rejoin' message, then ws.close(4002).
  Previously banned users saw the generic 'unauthorized' error.
- list_bans handler returns { name, pubkey, revokedAt } for each
  revoked member.

CLI (apps/cli)
- ws-client: close codes 4001 and 4002 set .closed = true and stash
  .terminalClose so callers can surface a friendly message instead of
  the low-level 'ws terminal close' error. Revoked error in hello is
  also captured as a terminal close.
- withMesh catches terminalClose and prints:
  4001 → 'Kicked from this mesh. Run claudemesh to rejoin.'
  4002 → the broker's 'Contact the mesh owner to rejoin.' message
- kick.ts now exports runDisconnect + runKick with clear hints:
  'disconnect' → 'They will auto-reconnect within seconds.'
  'kick'       → 'They can rejoin anytime by running claudemesh.'
- cli.ts adds 'disconnect' dispatch; HELP updated.

Semantics:
  disconnect: session reset, no DB state, auto-reconnects
  kick      : session ends, no DB state, user must manually rejoin
  ban       : session ends + revokedAt set, cannot rejoin until unban

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 09:55:05 +01:00
Alejandro Gutiérrez
3ceac68e67 feat(cli+broker): kick, ban, unban, bans commands
Some checks failed
CI / Lint (push) Has been cancelled
CI / Typecheck (push) Has been cancelled
CI / Broker tests (Postgres) (push) Has been cancelled
CI / Docker build (linux/amd64) (push) Has been cancelled
Broker WS handlers:
- kick: disconnect peer(s) by name, --stale duration, or --all.
  Authz: owner or admin only. Closes WS + marks presence disconnected.
- ban: kick + set revokedAt on mesh.member. Hello already rejects
  revoked members, so ban is instant and permanent until unban.
- unban: clear revokedAt. Peer can rejoin with their existing keypair.
- list_bans: return all revoked members for a mesh.

Session-id dedup (previous commit): handleHello disconnects ghost
presences with matching (meshId, sessionId) before inserting the new
one. Eliminates duplicate entries after broker restarts.

CLI (alpha.37):
- claudemesh kick <peer|--stale 30m|--all>
- claudemesh ban/unban <peer>
- claudemesh bans [--json]
- Uses new sendAndWait() on ws-client for request-response pattern
  over WS (generic _reqId resolver).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-17 08:37:38 +01:00
Alejandro Gutiérrez
5ddb11b2d5 fix(broker): dedup presences by session_id on hello
Some checks failed
CI / Lint (push) Has been cancelled
CI / Typecheck (push) Has been cancelled
CI / Broker tests (Postgres) (push) Has been cancelled
CI / Docker build (linux/amd64) (push) Has been cancelled
When a client reconnects with the same session_id before the 90s
stale sweeper runs, the old ghost presence stays in the connections
map. Result: duplicate entries in list_peers for the same Claude
Code instance.

Now: handleHello iterates connections for matching (meshId, sessionId),
closes the old WS, deletes from map, marks disconnected in DB.
One session_id = one presence, always.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 21:40:25 +01:00
Alejandro Gutiérrez
2edbfce7d3 fix(broker): add BROKER_SKIP_MIGRATE=1 escape hatch for manual-migrated DBs
Some checks failed
CI / Lint (push) Has been cancelled
CI / Typecheck (push) Has been cancelled
CI / Broker tests (Postgres) (push) Has been cancelled
CI / Docker build (linux/amd64) (push) Has been cancelled
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 12:59:28 +01:00
Alejandro Gutiérrez
9f3a82dd63 fix(broker): use sql.unsafe for SET lock_timeout in migrate
Some checks failed
CI / Lint (push) Has been cancelled
CI / Typecheck (push) Has been cancelled
CI / Broker tests (Postgres) (push) Has been cancelled
CI / Docker build (linux/amd64) (push) Has been cancelled
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 12:55:04 +01:00
Alejandro Gutiérrez
05729ad8a4 feat(ga): close remaining GA blockers (backcompat, HA prep, tests, docs)
Some checks failed
CI / Lint (push) Has been cancelled
CI / Typecheck (push) Has been cancelled
CI / Broker tests (Postgres) (push) Has been cancelled
CI / Docker build (linux/amd64) (push) Has been cancelled
Backwards compat shim (task 27)
- requireCliAuth() falls back to body.user_id when BROKER_LEGACY_AUTH=1
  and no bearer present. Sets Deprecation + Warning headers + bumps a
  broker_legacy_auth_hits_total metric so operators can watch the
  legacy traffic drain to 0 before removing the shim.
- All handlers parse body BEFORE requireCliAuth so the fallback can
  read user_id out of it.

HA readiness (task 29)
- .artifacts/specs/2026-04-15-broker-ha-statelessness-audit.md
  documents every in-memory symbol and rollout plan (phase 0-4).
- packaging/docker-compose.ha-local.yml spins up 2 broker replicas
  behind Traefik sticky sessions for local smoke testing.
- apps/broker/src/audit.ts now wraps writes in a transaction that
  takes pg_advisory_xact_lock(meshId) and re-reads the tail hash
  inside the txn. Concurrent broker replicas can no longer fork the
  audit chain.

Deploy gate (task 30)
- /health stays permissive (200 even on transient DB blips) so
  Docker doesn't kill the container on a glitch.
- New /health/ready checks DB + optional EXPECTED_MIGRATION pin,
  returns 503 if either fails. External deploy gate can poll this
  and refuse to promote a broken deploy.

Metrics dashboard (task 32)
- packaging/grafana/claudemesh-broker.json: ready-to-import Grafana
  dashboard covering active conns, queue depth, routed/rejected
  rates, grant drops, legacy-auth hits, conn rejects.

Tests (task 28)
- audit-canonical.test.ts (4 tests) pins canonical JSON semantics.
- grants-enforcement.test.ts (6 tests) covers the member-then-
  session-pubkey lookup with default/explicit/blocked branches.

Docs (task 34)
- docs/env-vars.md catalogues every env var the broker + CLI read.

Crypto review prep (task 35)
- .artifacts/specs/2026-04-15-crypto-review-packet.md: reviewer
  brief, threat model, scope, test coverage list, deliverables.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 23:51:28 +01:00
Alejandro Gutiérrez
2be5e9dccb fix(security): resolve all 17 codex findings — auth, grants, crypto, ops
Some checks failed
CI / Lint (push) Has been cancelled
CI / Typecheck (push) Has been cancelled
CI / Broker tests (Postgres) (push) Has been cancelled
CI / Docker build (linux/amd64) (push) Has been cancelled
Critical: broker HTTP auth via cli_session bearer token on all /cli/*;
file download requires auth+membership; v2 claim gated; duplicate
claimInviteV2Core removed; grant enforcement tries member then
session pubkey; audit hash uses canonical sorted-keys JSON.

High: rate limit args fixed (burst 10, 60/min) + both buckets swept;
BROKER_ENCRYPTION_KEY fail-fast in prod; migrate uses pg_try + lock_
timeout; hello validates sessionPubkey hex; blocked DMs rejected pre-
queue; watch timers cleaned on disconnect.

Medium: inbound pushes serialized; reconnect jitter + timer guard;
hardcoded URLs through env; v2 claim path configurable.

Low: WSHelloMessage optional protocolVersion+capabilities.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 19:18:25 +01:00
Alejandro Gutiérrez
1a7a059e75 fix: queue TTL + per-member send rate limit + size cap + no-recipient reject + ack.error
Some checks failed
CI / Lint (push) Has been cancelled
CI / Typecheck (push) Has been cancelled
CI / Broker tests (Postgres) (push) Has been cancelled
CI / Docker build (linux/amd64) (push) Has been cancelled
Broker (all need redeploy):
- sweepOrphanMessages: DELETE undelivered message_queue rows older
  than 7 days; hourly sweep. Stops unbounded growth when a sender
  typos a name (queued forever, never claimed).
- Per-member send rate limit: TokenBucket(60/min, burst 10) keyed on
  memberId so reconnecting can't bypass. Surfaces as queued=false,
  error='rate_limit: ...'.
- Pre-flight size cap: reject at handleSend if nonce+ciphertext+
  targetSpec exceeds env.MAX_MESSAGE_BYTES with a clear error
  instead of silent WSS frame-level kill.
- No-recipient reject: for direct sends, check any matching peer
  is connected BEFORE queueing. Kills the self-send silent drop
  (sending to your own pubkey when you only have one session
  connected) and typo-to-offline-peer silent drops.
- WSAckMessage.error field added for structured failure reasons.

CLI:
- ws-client ack handler reads msg.queued and msg.error; surfaces
  rate_limit / too_large / no_recipient to callers instead of
  returning ok:true with a dummy messageId.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 14:44:09 +01:00
Alejandro Gutiérrez
3dfab0f792 fix(broker): don't broadcast peer_joined/peer_left/peer_returned to same-pubkey sessions
Some checks failed
CI / Lint (push) Has been cancelled
CI / Typecheck (push) Has been cancelled
CI / Broker tests (Postgres) (push) Has been cancelled
CI / Docker build (linux/amd64) (push) Has been cancelled
When a user opens multiple Claude Code instances on one laptop they
all share the same memberPubkey (one identity, one config.json). The
broker was broadcasting each Claude Code start/stop to every OTHER
session of the same user — showing as 'peer agutierrez left / joined'
spam in every active claude terminal.

Now: skip broadcast to presences whose memberPubkey equals the joining
or leaving presence's memberPubkey. Other actual peers on the mesh
still see the event.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 14:28:57 +01:00
Alejandro Gutiérrez
94e914f476 fix(broker): reject mesh create without valid pubkey
Some checks failed
CI / Lint (push) Has been cancelled
CI / Typecheck (push) Has been cancelled
CI / Broker tests (Postgres) (push) Has been cancelled
CI / Docker build (linux/amd64) (push) Has been cancelled
Older CLIs sometimes called POST /cli/mesh/create without a pubkey,
and the broker stored the string 'pending' as peer_pubkey on the
owner's mesh.member row. Every subsequent hello from the real CLI
failed the membership lookup silently, leaving the connection in
'reconnecting' forever with no useful log line.

Now: validate pubkey is 64 hex chars before creating the owner
member row. Existing 'pending' rows on prod were patched manually.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 12:50:11 +01:00
Alejandro Gutiérrez
ee12510ef1 refactor: rename cli-v2 → cli, archive legacy cli, plus broker-side grants + auto-migrate
Some checks failed
CI / Lint (push) Has been cancelled
CI / Typecheck (push) Has been cancelled
CI / Broker tests (Postgres) (push) Has been cancelled
CI / Docker build (linux/amd64) (push) Has been cancelled
- apps/cli/ is now the canonical CLI (was apps/cli-v2/).
- apps/cli/ legacy v0 archived as branch 'legacy-cli-archive' and tag
  'cli-v0-legacy-final' before deletion; git history preserves it too.
- .github/workflows/release-cli.yml paths updated.
- pnpm-lock.yaml regenerated.

Broker-side peer-grant enforcement (spec: 2026-04-15-per-peer-capabilities):
- 0020_peer-grants.sql adds peer_grants jsonb + GIN index on mesh.member.
- handleSend in broker fetches recipient grant maps once per send, drops
  messages silently when sender lacks the required capability.
- POST /cli/mesh/:slug/grants to update from CLI; broker_messages_dropped_by_grant_total metric.
- CLI grant/revoke/block now mirror to broker via syncToBroker.

Auto-migrate on broker startup:
- apps/broker/src/migrate.ts runs drizzle migrate with pg_advisory_lock
  before the HTTP server binds. Exits non-zero on failure so Coolify
  healthcheck fails closed.
- Dockerfile copies packages/db/migrations into /app/migrations.
- postgres 3.4.5 added as direct broker dep.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 08:44:52 +01:00
Alejandro Gutiérrez
ce52fcef2d feat(invite): branded email + one-command install+launch UX
Some checks failed
CI / Lint (push) Has been cancelled
CI / Typecheck (push) Has been cancelled
CI / Broker tests (Postgres) (push) Has been cancelled
CI / Docker build (linux/amd64) (push) Has been cancelled
Email (broker):
- Rebrand mesh-invitation.tsx to match site (clay accent #d97757,
  cream fg, Anthropic Serif/Mono, dark bg). Mesh glyph in header.
- Hero CTA links to the /i/short URL landing page.
- Single one-liner 'npm i -g claudemesh-cli && claudemesh launch --join URL'
  so new users copy once, paste once, done.

Web InstallToggle:
- Replace two-step numbered list with single one-liner in the first-time
  panel. Reduces copy/paste ops from 2 to 1 and stops prescribing
  'YourName' as a literal (CLI now defaults to $USER).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 02:14:27 +01:00
Alejandro Gutiérrez
77ee1d0d80 feat(broker): branded react-email template for mesh invite
Some checks failed
CI / Lint (push) Has been cancelled
CI / Typecheck (push) Has been cancelled
CI / Broker tests (Postgres) (push) Has been cancelled
CI / Docker build (linux/amd64) (push) Has been cancelled
Replaces the plain-text invite email with a standalone react-email
template (apps/broker/src/emails/mesh-invitation.tsx) using
@react-email/components + Tailwind. Rendered on demand in
handleCliMeshInvite and sent as both HtmlBody and TextBody via
Postmark (or html+text via Resend).

Self-contained — no dependency on @turbostarter/email, i18n, or ui
packages. Adds react, react-dom, @react-email/components, @react-email/render
to broker deps. Enables tsconfig jsx: react-jsx and .tsx includes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 02:04:28 +01:00
Alejandro Gutiérrez
2f27a5eef4 feat(broker): actually send invite email via Postmark, return emailed flag
Some checks failed
CI / Lint (push) Has been cancelled
CI / Typecheck (push) Has been cancelled
CI / Broker tests (Postgres) (push) Has been cancelled
CI / Docker build (linux/amd64) (push) Has been cancelled
Broker now sends the invite email when body.email is provided and
POSTMARK_API_KEY (or RESEND_API_KEY) is configured. Returns
`emailed: boolean` so the CLI can honestly report whether the email
was sent instead of falsely claiming success on link generation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 01:48:53 +01:00
Alejandro Gutiérrez
32851419e6 fix(broker): generate owner keys on CLI mesh create + proper invite signing
Some checks failed
CI / Lint (push) Has been cancelled
CI / Typecheck (push) Has been cancelled
CI / Broker tests (Postgres) (push) Has been cancelled
CI / Docker build (linux/amd64) (push) Has been cancelled
handleCliMeshCreate now generates ownerPubkey/ownerSecretKey/rootKey so
CLI-created meshes can issue invites. handleCliMeshInvite builds the
full signed v1 payload + v2 capability (matching createMyInvite in
packages/api) and self-heals meshes created by older broker versions
that are missing keys.

Fixes 500 on claudemesh share after CLI mesh create.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 00:37:16 +01:00
Alejandro Gutiérrez
e2b6e53cc1 feat(broker): add POST /cli/mesh/:slug/invite endpoint
Some checks failed
CI / Lint (push) Has been cancelled
CI / Typecheck (push) Has been cancelled
CI / Broker tests (Postgres) (push) Has been cancelled
CI / Docker build (linux/amd64) (push) Has been cancelled
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 22:10:34 +01:00
Alejandro Gutiérrez
3595fc2c4d feat(broker): add list_services and list_commands tools to telegram AI
Some checks failed
CI / Lint (push) Has been cancelled
CI / Typecheck (push) Has been cancelled
CI / Broker tests (Postgres) (push) Has been cancelled
CI / Docker build (linux/amd64) (push) Has been cancelled
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 21:20:00 +01:00
Alejandro Gutiérrez
2825ef7151 feat(broker): add conversation memory to telegram AI (10-turn window)
Some checks failed
CI / Typecheck (push) Has been cancelled
CI / Broker tests (Postgres) (push) Has been cancelled
CI / Docker build (linux/amd64) (push) Has been cancelled
CI / Lint (push) Has been cancelled
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 21:09:32 +01:00
Alejandro Gutiérrez
a9858ef876 fix(broker): teach AI difference between mesh names and peer names
Some checks failed
CI / Typecheck (push) Has been cancelled
CI / Lint (push) Has been cancelled
CI / Broker tests (Postgres) (push) Has been cancelled
CI / Docker build (linux/amd64) (push) Has been cancelled
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 21:06:08 +01:00
Alejandro Gutiérrez
6836a495a4 fix(broker): switch telegram AI to HTML formatting + strip markdown
Some checks failed
CI / Lint (push) Has been cancelled
CI / Typecheck (push) Has been cancelled
CI / Broker tests (Postgres) (push) Has been cancelled
CI / Docker build (linux/amd64) (push) Has been cancelled
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 20:58:45 +01:00
Alejandro Gutiérrez
07720f8f1e feat(broker): add list_meshes tool + multilingual AI responses
Some checks failed
CI / Lint (push) Has been cancelled
CI / Typecheck (push) Has been cancelled
CI / Broker tests (Postgres) (push) Has been cancelled
CI / Docker build (linux/amd64) (push) Has been cancelled
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 20:53:03 +01:00
Alejandro Gutiérrez
f4881b21b0 feat(broker): add claude-powered telegram bot with tool calling
Some checks failed
CI / Lint (push) Has been cancelled
CI / Typecheck (push) Has been cancelled
CI / Broker tests (Postgres) (push) Has been cancelled
CI / Docker build (linux/amd64) (push) Has been cancelled
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 20:40:16 +01:00
Alejandro Gutiérrez
4561076904 fix(broker): accept pubkey in mesh create + use in member row
Some checks failed
CI / Lint (push) Has been cancelled
CI / Typecheck (push) Has been cancelled
CI / Broker tests (Postgres) (push) Has been cancelled
CI / Docker build (linux/amd64) (push) Has been cancelled
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 19:02:09 +01:00
Alejandro Gutiérrez
0d53f2ae52 fix(broker): use raw SQL for mesh create to avoid Drizzle default issues
Some checks failed
CI / Lint (push) Has been cancelled
CI / Typecheck (push) Has been cancelled
CI / Broker tests (Postgres) (push) Has been cancelled
CI / Docker build (linux/amd64) (push) Has been cancelled
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 18:57:54 +01:00
Alejandro Gutiérrez
b328e78bd3 fix(broker): import generateId for mesh create handler
Some checks failed
CI / Lint (push) Has been cancelled
CI / Typecheck (push) Has been cancelled
CI / Broker tests (Postgres) (push) Has been cancelled
CI / Docker build (linux/amd64) (push) Has been cancelled
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 18:47:35 +01:00
Alejandro Gutiérrez
23604a125e fix(broker): mesh list includes owner meshes + auto-increment slug
Some checks failed
CI / Lint (push) Has been cancelled
CI / Typecheck (push) Has been cancelled
CI / Broker tests (Postgres) (push) Has been cancelled
CI / Docker build (linux/amd64) (push) Has been cancelled
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 18:12:06 +01:00
Alejandro Gutiérrez
b680260c8d feat(broker): add POST /cli/mesh/create endpoint
Some checks failed
CI / Lint (push) Has been cancelled
CI / Typecheck (push) Has been cancelled
CI / Broker tests (Postgres) (push) Has been cancelled
CI / Docker build (linux/amd64) (push) Has been cancelled
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 18:04:41 +01:00
Alejandro Gutiérrez
b65a545ece feat(broker): add /cli/meshes endpoint for merged mesh list
Some checks failed
CI / Typecheck (push) Has been cancelled
CI / Broker tests (Postgres) (push) Has been cancelled
CI / Lint (push) Has been cancelled
CI / Docker build (linux/amd64) (push) Has been cancelled
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 14:39:08 +01:00
Alejandro Gutiérrez
d07cff788c feat: three-token auth flow (session_id + user_code + device_code)
Some checks failed
CI / Lint (push) Has been cancelled
CI / Typecheck (push) Has been cancelled
CI / Broker tests (Postgres) (push) Has been cancelled
CI / Docker build (linux/amd64) (push) Has been cancelled
- session_id (clm_sess_...) in browser URL — identifies login attempt
- user_code (ABCD-EFGH) visual confirmation — shown in both terminal and browser
- device_code (secret) — CLI polls with this, never displayed
- CLI accepts stdin paste of JWT token while polling (race)
- Web page handles both ?session= and ?code= params

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 12:19:08 +01:00
Alejandro Gutiérrez
bb1310167e feat: granular mesh permissions + mesh delete + share picker
Some checks failed
CI / Lint (push) Has been cancelled
CI / Typecheck (push) Has been cancelled
CI / Broker tests (Postgres) (push) Has been cancelled
CI / Docker build (linux/amd64) (push) Has been cancelled
- Drizzle schema: mesh.permission table with 11 boolean flags
- Default permissions by role (owner > admin > member)
- Broker: GET/POST /cli/mesh/:slug/permissions
- Broker: DELETE /cli/mesh/:slug (owner only, soft delete)
- Broker: permission check module (getPermissions, checkPermission, setPermissions)
- CLI: mesh share with interactive mesh picker
- CLI: mesh delete with server-side delete + confirmation

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 12:03:28 +01:00
Alejandro Gutiérrez
ea4e3b03bb feat: paste-token auth flow for CLI
Some checks failed
CI / Broker tests (Postgres) (push) Has been cancelled
CI / Docker build (linux/amd64) (push) Has been cancelled
CI / Lint (push) Has been cancelled
CI / Typecheck (push) Has been cancelled
- Broker: POST /cli/token generates a 30-day JWT
- Web: /token page with Generate + Copy button
- Web: /api/auth/cli/token proxies to broker
- CLI: login option 3 "Paste a token" for headless environments

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 11:17:38 +01:00
Alejandro Gutiérrez
ca441dae45 feat(broker): device-code auth with PostgreSQL persistence
Some checks failed
CI / Typecheck (push) Has been cancelled
CI / Lint (push) Has been cancelled
CI / Broker tests (Postgres) (push) Has been cancelled
CI / Docker build (linux/amd64) (push) Has been cancelled
- Drizzle schema: device_code + cli_session tables in mesh pgSchema
- Broker endpoints: POST /cli/device-code, GET /cli/device-code/:code,
  POST /cli/device-code/:code/approve, GET /cli/sessions
- Web app API routes now proxy to broker (no in-memory state)
- Tracks devices per user: hostname, platform, arch, last_seen, token_hash
- JWT signed with CLI_SYNC_SECRET, 30-day expiry
- Session revocation support via revokedAt column

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 08:22:13 +01:00
Alejandro Gutiérrez
c1fa3bcb5c feat: anthropic-style mesh + invite redesign (wave 1 checkpoint)
Ships the user-visible friction fixes and the foundation for the v2
invite protocol. API wiring + CLI client + email UI ship in wave 2.

Meshes — shipped
- Drop global UNIQUE on mesh.slug; mesh.id is canonical everywhere
- Server derives slug from name; create form has no slug field
- Two users can freely name their mesh "platform"; no collision errors
- Migration 0017

Invites v1 — shipped (URL shortener, backward compatible)
- New invite.code column (base62, 8 chars, nullable unique index)
- createMyInvite mints both token + short code; returns shortUrl
- GET /api/public/invite-code/:code resolves short code to token
- New route /i/[code] server-redirects to /join/[token]
- Invite generator UI shows short URL; QR encodes short URL
- Advanced fields (role/maxUses/expiresInDays) collapsed under disclosure
- Migration 0018

Invites v2 — foundation (broker + DB only; API+CLI+Web wiring in wave 2)
- Broker: canonicalInviteV2, verifyInviteV2, sealRootKeyToRecipient
- Broker: POST /invites/:code/claim endpoint (atomic single-use accounting)
- Broker tests: invite-v2.test.ts (signature, expiry, revocation, exhaustion)
- DB: mesh.invite gains version/capabilityV2/claimedByPubkey columns
- DB: new mesh.pending_invite table for email invites
- Migration 0019
- Contract locked in docs/protocol.md §v2 + SPEC.md §14b

Consent landing — shipped
- /join/[token] redesigned: explicit role, inviter, mesh stats, consent
- New server components: invite-card, role-badge, inviter-line, consent-summary
- "Join [mesh] as [Role]" primary action (not just "Join")

Error surfacing — shipped
- handle() now parses {error} responses from hono route catch blocks
- onError fallback includes timestamp so handle() can match apiErrorSchema
- Real error messages reach the UI instead of "Something went wrong"

Docs
- SPEC.md §14b: v2 invite protocol
- docs/protocol.md: v2 claim wire format
- docs/roadmap.md: status
- .artifacts/specs/2026-04-10-anthropic-vision-meshes-invites.md

Deferred to wave 2/3
- API claim route wiring (packages/api)
- createMyInvite v2 capability generation
- Email invite mutation + Postmark delivery
- CLI v2 join flow (x25519 keypair + unseal)
- Web invite-generator email field + v2 display

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 13:41:11 +01:00
Alejandro Gutiérrez
dbea96960f fix(broker): plain text push messages, mesh slug in push label
Some checks failed
CI / Typecheck (push) Has been cancelled
CI / Lint (push) Has been cancelled
CI / Broker tests (Postgres) (push) Has been cancelled
CI / Docker build (linux/amd64) (push) Has been cancelled
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 02:27:22 +01:00
Alejandro Gutiérrez
a022da1998 fix(broker): show mesh slugs in /meshes + /status, remove all-meshes fallback
Some checks failed
CI / Lint (push) Has been cancelled
CI / Typecheck (push) Has been cancelled
CI / Broker tests (Postgres) (push) Has been cancelled
CI / Docker build (linux/amd64) (push) Has been cancelled
- /meshes and /status now show mesh slug names instead of truncated IDs
- meshSlug cached on connect and loaded from DB join on boot
- Remove dangerous fallback that connected to ALL meshes in email flow
- BridgeRow now includes optional meshSlug field

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 02:24:55 +01:00
Alejandro Gutiérrez
64266a75f7 fix(broker): plain text for email verification prompt (markdown parse error)
Some checks failed
CI / Lint (push) Has been cancelled
CI / Typecheck (push) Has been cancelled
CI / Broker tests (Postgres) (push) Has been cancelled
CI / Docker build (linux/amd64) (push) Has been cancelled
Masked email with asterisks broke Telegram Markdown bold syntax.
Use plain text for the code prompt message.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 17:15:10 +01:00
Alejandro Gutiérrez
2710f354a9 fix(broker): correct libsodium import in email connect callback
Some checks failed
CI / Lint (push) Has been cancelled
CI / Typecheck (push) Has been cancelled
CI / Broker tests (Postgres) (push) Has been cancelled
CI / Docker build (linux/amd64) (push) Has been cancelled
Dynamic import returns module wrapper, need .default.ready then .default
for the actual sodium functions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 17:09:32 +01:00