Commit Graph

12 Commits

Author SHA1 Message Date
Alejandro Gutiérrez
716e674473 fix(broker+cli): multi-session DM routing + broadcast self-loopback (v0.3.2)
Some checks failed
CI / Lint (push) Has been cancelled
CI / Typecheck (push) Has been cancelled
CI / Broker tests (Postgres) (push) Has been cancelled
CI / Docker build (linux/amd64) (push) Has been cancelled
Two related bugs surfaced in multi-session production use of 1.8.0:

1. Replies via `claudemesh send <from_id>` rejected with "no connected
   peer for target" when the original sender's session had rotated
   (Claude Code restart, /resume). Root cause: from_id carried the
   ephemeral session pubkey, which disappears the moment the session
   ends. Fix: handleSend pre-flight now also resolves the target
   pubkey against the persistent meshMember table and routes to the
   owning member's live session(s); MCP push channel now sets from_id
   to the stable member pubkey and exposes the ephemeral one under
   from_session_pubkey.

2. Broadcast/* and @group sends loopback'd to the sender's *sibling*
   sessions (same member, different session keypair), surfacing a
   spurious "tampered or wrong keypair" decrypt warning on the
   sender's own inboxes. Fix: broadcast/group fan-out now skips by
   memberPubkey, not just by presence_id, so the entire sender member
   is excluded — direct sends keep per-presence skip so a member can
   still DM their own sibling session intentionally.

Push envelope now also carries senderMemberPubkey alongside
senderPubkey so any other client of the WS channel can choose the
right one.
2026-05-02 22:05:11 +01:00
Alejandro Gutiérrez
038a5b5bf7 feat(broker+api+cli): topic message reply-to threading (v0.3.1)
Some checks failed
CI / Lint (push) Has been cancelled
CI / Typecheck (push) Has been cancelled
CI / Broker tests (Postgres) (push) Has been cancelled
CI / Docker build (linux/amd64) (push) Has been cancelled
Adds a reply_to_id column (self-FK on topic_message) plus end-to-end
plumbing so a message can mark itself as a reply to a previous one in
the same topic.

- Schema: 0027_topic_message_reply_to.sql adds reply_to_id with
  ON DELETE SET NULL + index for backlink lookup.
- Broker: appendTopicMessage validates parent shares the topic, writes
  reply_to_id; topicHistory + topic_history_response surface it; WS
  push envelope now carries senderMemberId, senderName, topic name,
  reply_to_id, and message_id so recipients have everything they need
  to reply without a follow-up query.
- REST: POST /v1/messages accepts replyToId (validated server-side);
  GET /messages and SSE /stream emit it per row.
- CLI: \`topic post --reply-to <id|prefix>\` resolves prefixes against
  recent history; \`topic tail\` renders an "↳ in reply to <name>:
  <snippet>" line above replies and shows a copyable #shortid tag on
  every row.
- MCP push pipe: channel attributes now include from_pubkey,
  from_member_id, message_id, topic, reply_to_id — the recipient can
  thread a reply directly from the inbound notification.
- Skill + identity prompt updated to teach Claude how to use the new
  attributes for replies.

Bumped CLI to 1.9.0.
2026-05-02 21:58:21 +01:00
Alejandro Gutiérrez
0f32529370 fix(apikey): revoke must verify a row was actually updated
Some checks failed
CI / Lint (push) Has been cancelled
CI / Typecheck (push) Has been cancelled
CI / Broker tests (Postgres) (push) Has been cancelled
CI / Docker build (linux/amd64) (push) Has been cancelled
claudemesh apikey revoke <id> reported success even when the input
didn't match any row in mesh.api_key. The CLI's `apikey list` shows
truncated 8-char prefixes; users naturally paste those; broker did
exact-id match against meshApiKey.id; UPDATE affected 0 rows; old
revokeApiKey returned void so the CLI couldn't tell. Discovered via
end-to-end CLI smoke test against prod (roadmap validation pass).

Three-part fix:

- broker.revokeApiKey now returns
  { status: "revoked"|"not_found"|"not_unique"; id?, matches? } and
  accepts either the full id or a unique prefix (>=6 chars). Prefix
  matching is bounded to the caller's mesh and only succeeds if
  exactly one row matches; ambiguous prefixes return not_unique so
  we never silently revoke the wrong key.

- New WSApiKeyRevokeResponseMessage carries the structured status
  back to the CLI. Old apikey_revoke_ok type removed before being
  released — never shipped to users. The error path is no longer
  used for not_found/not_unique cases; the unified response carries
  both outcomes.

- CLI's apiKeyRevoke now resolves with { ok, id } | { ok: false,
  code, message }. runApiKeyRevoke surfaces the code/message and
  exits non-zero on failure (NOT_FOUND for missing, INVALID_ARGS
  for ambiguous prefix).

Net effect: pasting `claudemesh apikey revoke vq0fwjdX` now actually
revokes the key whose id starts with vq0fwjdX (or fails loud if 0
or >1 keys match). Verified against prod via the new branch's CLI
binary before commit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 18:39:25 +01:00
Alejandro Gutiérrez
13d691980a feat(broker+cli): apikey create/list/revoke verbs (v0.2.0 #71)
Some checks failed
CI / Lint (push) Has been cancelled
CI / Typecheck (push) Has been cancelled
CI / Broker tests (Postgres) (push) Has been cancelled
CI / Docker build (linux/amd64) (push) Has been cancelled
Issuance flow over WS for now (REST endpoints come next slice).
Plaintext secret returned ONCE on create — never recoverable.

- broker: 3 WS handlers (apikey_create/list/revoke), wire types in
  union, audit log on issuance + revoke
- ws-client: apiKeyCreate/List/Revoke with resolver maps, response
  dispatch
- CLI: claudemesh apikey create <label> [--cap a,b] [--topic c,d]
  [--expires ISO]; list shows status, scope, last-used; revoke by id
- policy: apikey create + revoke prompt by default (issuing or
  disabling a credential is meaningful)

Default capability set is "send,read" — least privilege for unscoped
keys (admin must explicitly opt-in).

Spec: .artifacts/specs/2026-05-02-v0.2.0-scope.md

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 02:13:12 +01:00
Alejandro Gutiérrez
1afae7a507 feat(broker+cli): topics — conversation scope within a mesh (v0.2.0)
Some checks failed
CI / Lint (push) Has been cancelled
CI / Typecheck (push) Has been cancelled
CI / Broker tests (Postgres) (push) Has been cancelled
CI / Docker build (linux/amd64) (push) Has been cancelled
Adds the third axis of mesh organization: mesh = trust boundary,
group = identity tag, topic = conversation scope. Topic-tagged
messages filter delivery by topic_member rows and persist to a
topic_message history table for back-scroll on reconnect.

Schema (additive):
- mesh.topic, mesh.topic_member, mesh.topic_message tables
- topic_visibility (public|private|dm) and topic_member_role
  (lead|member|observer) enums
- migration 0022_topics.sql, hand-written following project convention
  (drizzle journal has been drifting since 0011)

Broker:
- 10 helpers (createTopic, listTopics, findTopicByName, joinTopic,
  leaveTopic, topicMembers, getMemberTopicIds, appendTopicMessage,
  topicHistory, markTopicRead)
- drainForMember matches "#<topicId>" target_specs via member's
  topic memberships
- 7 WS handlers (topic_create/list/join/leave/members/history/mark_read)
  + resolveTopicId helper accepting id-or-name
- handleSend auto-persists topic-tagged messages to history

CLI:
- claudemesh topic create/list/join/leave/members/history/read
- claudemesh send "#deploys" "..." resolves topic name to id
- bundled skill teaches Claude the DM/group/topic decision matrix
- policy-classify recognizes topic create/join/leave as writes

Spec: .artifacts/specs/2026-05-02-v0.2.0-scope.md

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 01:53:42 +01:00
Alejandro Gutiérrez
b49e9a9b61 feat(cli+broker): three-tier peer removal: disconnect, kick, ban
Some checks failed
CI / Lint (push) Has been cancelled
CI / Typecheck (push) Has been cancelled
CI / Broker tests (Postgres) (push) Has been cancelled
CI / Docker build (linux/amd64) (push) Has been cancelled
Broker (apps/broker/src/index.ts)
- Unified disconnect/kick handler uses close code 1000 for disconnect
  (CLI auto-reconnects) vs 4001 for kick (CLI exits, no reconnect).
- Ban now closes with code 4002.
- Hello handler: revoked members get a specific 'revoked' error with a
  'Contact the mesh owner to rejoin' message, then ws.close(4002).
  Previously banned users saw the generic 'unauthorized' error.
- list_bans handler returns { name, pubkey, revokedAt } for each
  revoked member.

CLI (apps/cli)
- ws-client: close codes 4001 and 4002 set .closed = true and stash
  .terminalClose so callers can surface a friendly message instead of
  the low-level 'ws terminal close' error. Revoked error in hello is
  also captured as a terminal close.
- withMesh catches terminalClose and prints:
  4001 → 'Kicked from this mesh. Run claudemesh to rejoin.'
  4002 → the broker's 'Contact the mesh owner to rejoin.' message
- kick.ts now exports runDisconnect + runKick with clear hints:
  'disconnect' → 'They will auto-reconnect within seconds.'
  'kick'       → 'They can rejoin anytime by running claudemesh.'
- cli.ts adds 'disconnect' dispatch; HELP updated.

Semantics:
  disconnect: session reset, no DB state, auto-reconnects
  kick      : session ends, no DB state, user must manually rejoin
  ban       : session ends + revokedAt set, cannot rejoin until unban

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 09:55:05 +01:00
Alejandro Gutiérrez
a5347cebc0 fix(cli): silence "session restored" log for one-shot commands (alpha.40)
Some checks failed
CI / Lint (push) Has been cancelled
CI / Typecheck (push) Has been cancelled
CI / Broker tests (Postgres) (push) Has been cancelled
CI / Docker build (linux/amd64) (push) Has been cancelled
Add quiet opt to BrokerClient; withMesh passes quiet:true so commands
like peers/state/info/remind no longer print per-mesh restore chatter.
Long-running paths (launch, MCP) stay verbose.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 19:54:53 +01:00
Alejandro Gutiérrez
3ceac68e67 feat(cli+broker): kick, ban, unban, bans commands
Some checks failed
CI / Lint (push) Has been cancelled
CI / Typecheck (push) Has been cancelled
CI / Broker tests (Postgres) (push) Has been cancelled
CI / Docker build (linux/amd64) (push) Has been cancelled
Broker WS handlers:
- kick: disconnect peer(s) by name, --stale duration, or --all.
  Authz: owner or admin only. Closes WS + marks presence disconnected.
- ban: kick + set revokedAt on mesh.member. Hello already rejects
  revoked members, so ban is instant and permanent until unban.
- unban: clear revokedAt. Peer can rejoin with their existing keypair.
- list_bans: return all revoked members for a mesh.

Session-id dedup (previous commit): handleHello disconnects ghost
presences with matching (meshId, sessionId) before inserting the new
one. Eliminates duplicate entries after broker restarts.

CLI (alpha.37):
- claudemesh kick <peer|--stale 30m|--all>
- claudemesh ban/unban <peer>
- claudemesh bans [--json]
- Uses new sendAndWait() on ws-client for request-response pattern
  over WS (generic _reqId resolver).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-17 08:37:38 +01:00
Alejandro Gutiérrez
2be5e9dccb fix(security): resolve all 17 codex findings — auth, grants, crypto, ops
Some checks failed
CI / Lint (push) Has been cancelled
CI / Typecheck (push) Has been cancelled
CI / Broker tests (Postgres) (push) Has been cancelled
CI / Docker build (linux/amd64) (push) Has been cancelled
Critical: broker HTTP auth via cli_session bearer token on all /cli/*;
file download requires auth+membership; v2 claim gated; duplicate
claimInviteV2Core removed; grant enforcement tries member then
session pubkey; audit hash uses canonical sorted-keys JSON.

High: rate limit args fixed (burst 10, 60/min) + both buckets swept;
BROKER_ENCRYPTION_KEY fail-fast in prod; migrate uses pg_try + lock_
timeout; hello validates sessionPubkey hex; blocked DMs rejected pre-
queue; watch timers cleaned on disconnect.

Medium: inbound pushes serialized; reconnect jitter + timer guard;
hardcoded URLs through env; v2 claim path configurable.

Low: WSHelloMessage optional protocolVersion+capabilities.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 19:18:25 +01:00
Alejandro Gutiérrez
1a7a059e75 fix: queue TTL + per-member send rate limit + size cap + no-recipient reject + ack.error
Some checks failed
CI / Lint (push) Has been cancelled
CI / Typecheck (push) Has been cancelled
CI / Broker tests (Postgres) (push) Has been cancelled
CI / Docker build (linux/amd64) (push) Has been cancelled
Broker (all need redeploy):
- sweepOrphanMessages: DELETE undelivered message_queue rows older
  than 7 days; hourly sweep. Stops unbounded growth when a sender
  typos a name (queued forever, never claimed).
- Per-member send rate limit: TokenBucket(60/min, burst 10) keyed on
  memberId so reconnecting can't bypass. Surfaces as queued=false,
  error='rate_limit: ...'.
- Pre-flight size cap: reject at handleSend if nonce+ciphertext+
  targetSpec exceeds env.MAX_MESSAGE_BYTES with a clear error
  instead of silent WSS frame-level kill.
- No-recipient reject: for direct sends, check any matching peer
  is connected BEFORE queueing. Kills the self-send silent drop
  (sending to your own pubkey when you only have one session
  connected) and typo-to-offline-peer silent drops.
- WSAckMessage.error field added for structured failure reasons.

CLI:
- ws-client ack handler reads msg.queued and msg.error; surfaces
  rate_limit / too_large / no_recipient to callers instead of
  returning ok:true with a dummy messageId.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 14:44:09 +01:00
Alejandro Gutiérrez
39fe296aaa fix(cli): decrypt falls back to member secret key when session key fails
Some checks failed
CI / Broker tests (Postgres) (push) Has been cancelled
CI / Docker build (linux/amd64) (push) Has been cancelled
CI / Lint (push) Has been cancelled
CI / Typecheck (push) Has been cancelled
When Alice's session-A encrypts a direct message to Bob (target = Bob's
stable member pubkey) and Bob's session-B receives it, Bob has BOTH an
ephemeral session secret key and the member secret key. The old code
only tried session_sk, then silently failed with '⚠ message from
<sender> failed to decrypt' even though the message was valid —
just encrypted to the member key.

Now: try session first, fall back to member on null. Matches the
sender side's choice freedom (encrypt using either key).

Repros when: user opens multiple Claude Code sessions (all use the
same member key but each generates its own session key), and one
session sends to another by display-name resolution which returns
the member pubkey.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 14:37:36 +01:00
Alejandro Gutiérrez
ee12510ef1 refactor: rename cli-v2 → cli, archive legacy cli, plus broker-side grants + auto-migrate
Some checks failed
CI / Lint (push) Has been cancelled
CI / Typecheck (push) Has been cancelled
CI / Broker tests (Postgres) (push) Has been cancelled
CI / Docker build (linux/amd64) (push) Has been cancelled
- apps/cli/ is now the canonical CLI (was apps/cli-v2/).
- apps/cli/ legacy v0 archived as branch 'legacy-cli-archive' and tag
  'cli-v0-legacy-final' before deletion; git history preserves it too.
- .github/workflows/release-cli.yml paths updated.
- pnpm-lock.yaml regenerated.

Broker-side peer-grant enforcement (spec: 2026-04-15-per-peer-capabilities):
- 0020_peer-grants.sql adds peer_grants jsonb + GIN index on mesh.member.
- handleSend in broker fetches recipient grant maps once per send, drops
  messages silently when sender lacks the required capability.
- POST /cli/mesh/:slug/grants to update from CLI; broker_messages_dropped_by_grant_total metric.
- CLI grant/revoke/block now mirror to broker via syncToBroker.

Auto-migrate on broker startup:
- apps/broker/src/migrate.ts runs drizzle migrate with pg_advisory_lock
  before the HTTP server binds. Exits non-zero on failure so Coolify
  healthcheck fails closed.
- Dockerfile copies packages/db/migrations into /app/migrations.
- postgres 3.4.5 added as direct broker dep.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 08:44:52 +01:00