Commit Graph

17 Commits

Author SHA1 Message Date
Alejandro Gutiérrez
1a14cef1e0 feat(cli): 1.31.0 — session autoclean + broker verification + service path
Some checks failed
CI / Lint (push) Has been cancelled
CI / Typecheck (push) Has been cancelled
CI / Broker tests (Postgres) (push) Has been cancelled
CI / Docker build (linux/amd64) (push) Has been cancelled
Three operability fixes for users running the daemon under launchd or
systemd.

PID-watcher autoclean
=====================

The session reaper already dropped registry entries with dead pids on
a 30s loop, but had two real-world gaps:

- 30s sweep let stale presence linger on the broker for half a minute
- bare process.kill(pid, 0) trusts a recycled pid; a registry entry
  could survive its real owner's death whenever the OS rolled the
  pid number forward to a new program

Process-exit IPC from claude-code is best-effort and skipped on
SIGKILL / OOM / segfault / panic, so it cannot replace the sweep.

Fix:

- New process-info.ts captures opaque per-process start-times via
  ps -o lstart= (works on macOS and Linux, ~1 ms per call)
- registerSession stores the start-time alongside the pid
- reapDead drops entries when pid is dead OR start-time changed
  since register
- Sweep cadence 30s -> 5s
- Best-effort fallback to bare liveness when start-time capture
  fails at register time

Registry hooks already close the per-session broker WS on
deregister, so peer list rebuilds within one sweep of any session
exit.

Service-managed daemon: no more "spawn failed" false alarms
===========================================================

After claudemesh install (which writes a launchd plist or systemd
unit with KeepAlive=true), users routinely saw

  [claudemesh] warn daemon spawn failed: socket did not appear
  within 3000ms

even when the daemon was running fine. Two contributing causes:

1. Probe timeout was 800ms — the first IPC after a launchd-driven
   restart can take longer (SQLite migration + broker WS opens) and
   tripped it. Bumped to 2500ms.
2. On a failed probe the CLI tried its own detached spawn, which
   collided with launchd's KeepAlive restart cycle (singleton lock
   fails, child exits) and we'd then time out polling for a socket
   that was actually about to come up.

Now: when the launchd plist or systemd unit exists, the CLI does not
attempt a spawn. It waits up to 8s for the OS-managed unit to bring
the socket up. New service-not-ready state distinguishes "OS hasn't
restarted it yet" from "we tried to spawn and it failed".

Install verifies broker connectivity, not just process start
============================================================

Previously install ended once launchctl reported the unit loaded —
a daemon that boots but cannot reach the broker (blocked :443,
expired TLS, DNS, broker outage) only surfaced on the user's first
peer list or send.

/v1/health now includes per-mesh broker WS state. install polls it
for up to 15s after service boot and prints either "broker
connected (mesh=...)" or a warning naming the meshes still in
connecting state, with a hint at common causes.

The verification is best-effort and does not fail the install — it
just surfaces the issue early.

Tests
=====

4 new vitest cases cover the reaper paths: dead pid, live pid plus
matching start-time, live pid plus mismatched start-time (PID
reuse), and the no-start-time fallback. 83 of 83 pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 14:05:44 +01:00
Alejandro Gutiérrez
71f7f81880 fix(cli): 1.30.2 — daemon service unit attaches to every joined mesh
Some checks failed
CI / Typecheck (push) Has been cancelled
CI / Lint (push) Has been cancelled
CI / Broker tests (Postgres) (push) Has been cancelled
CI / Docker build (linux/amd64) (push) Has been cancelled
claudemesh install was baking --mesh <primary> into the launchd plist /
systemd unit, locking the daemon to a single mesh and contradicting
1.26.0's multi-mesh design. users with >1 joined mesh fell off the
daemon path on every non-primary verb (cold-WS fallback, peer list
returning all meshes because the server-side filter ran against zero
attached state, "daemon spawn failed: socket did not appear" from
launched sessions in sibling meshes).

now: meshSlug is optional in InstallArgs; claudemesh install omits it
so the unit runs `claudemesh daemon up` with no flag, which attaches
to every joined mesh. `claudemesh daemon install-service --mesh <slug>`
is preserved as opt-in for single-mesh hosts and CI.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 13:44:11 +01:00
Alejandro Gutiérrez
6794aa8512 feat(daemon+mcp): daemon required for in-Claude-Code use; thin MCP shim
Some checks failed
CI / Lint (push) Has been cancelled
CI / Typecheck (push) Has been cancelled
CI / Broker tests (Postgres) (push) Has been cancelled
CI / Docker build (linux/amd64) (push) Has been cancelled
The architectural convergence v0.9.0 was building toward. CLI keeps
working without a daemon (claudemesh send/peer/inbox/...), but the MCP
push-pipe — which Claude Code uses for mid-turn channel emits, slash
commands, and resources — now requires the daemon. There is no fallback.

Daemon (additive):
- /v1/skills (list) and /v1/skills/:name (get) IPC endpoints, so the
  MCP shim can surface mesh skills without holding its own broker WS.
- listSkills() / getSkill() on DaemonBrokerClient.
- SSE 'message' event now carries plaintext body, sender_member_pubkey,
  priority, and subtype — full payload the MCP shim needs to render a
  channel notification.

MCP server: 979 → 469 LoC (470 of the remaining 469 is the unrelated
mesh-service proxy mode; the push-pipe path is ~200 LoC including
boilerplate).
- Probes ~/.claudemesh/daemon/daemon.sock at boot. Bails loudly with
  actionable instructions if missing.
- Subscribes to /v1/events SSE and translates each event into a
  notifications/claude/channel emit.
- Fetches mesh skills from the daemon for ListPrompts/GetPrompt and
  ListResources/ReadResource. ListTools returns []; the CLI is the API.
- No broker WS, no decryption, no reconnect logic. Daemon owns all of it.

claudemesh install: auto-installs and starts the daemon service for the
user's primary mesh (launchd / systemd-user). Pass --no-service to skip.

claudemesh launch: probes the daemon socket; if absent, spawns
'claudemesh daemon up --mesh <slug>' detached and waits up to 10s for
the socket. Surfaces a clear warning on timeout but doesn't block —
Claude Code's MCP shim will print the same error if the daemon really
isn't there.

Bundle: dist/entrypoints/mcp.js drops from 154KB → 104KB (gzipped 34KB
→ 19KB). Test: MCP boots cleanly via stdio, declares correct
capabilities, talks JSON-RPC; daemon /v1/skills returns the empty list
as expected on a mesh with no skills.

Released as 1.24.0 on npm.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 23:43:02 +01:00
Alejandro Gutiérrez
b4f457fceb feat(cli): 1.5.0 — CLI-first architecture, tool-less MCP, policy engine
Some checks failed
CI / Lint (push) Has been cancelled
CI / Typecheck (push) Has been cancelled
CI / Broker tests (Postgres) (push) Has been cancelled
CI / Docker build (linux/amd64) (push) Has been cancelled
CLI becomes the API; MCP becomes a tool-less push-pipe. Bundle -42%
(250 KB → 146 KB) after stripping ~1700 lines of dead tool handlers.

- Tool-less MCP: tools/list returns []. Inbound peer messages still
  arrive as experimental.claude/channel notifications mid-turn.
- Resource-noun-verb CLI: peer list, message send, memory recall, etc.
  Legacy flat verbs (peers, send, remember) remain as aliases.
- Bundled claudemesh skill auto-installed by `claudemesh install` —
  sole CLI-discoverability surface for Claude.
- Unix-socket bridge: CLI invocations dial the push-pipe's warm WS
  (~220 ms warm vs ~600 ms cold).
- --mesh <slug> flag: connect a session to multiple meshes.
- Policy engine: every broker-touching verb runs through a YAML gate
  at ~/.claudemesh/policy.yaml (auto-created). Destructive verbs
  prompt; non-TTY auto-denies. Audit log at ~/.claudemesh/audit.log.
- --approval-mode plan|read-only|write|yolo + --policy <path>.

Spec: .artifacts/specs/2026-05-02-architecture-north-star.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 01:18:19 +01:00
Alejandro Gutiérrez
ee12510ef1 refactor: rename cli-v2 → cli, archive legacy cli, plus broker-side grants + auto-migrate
Some checks failed
CI / Lint (push) Has been cancelled
CI / Typecheck (push) Has been cancelled
CI / Broker tests (Postgres) (push) Has been cancelled
CI / Docker build (linux/amd64) (push) Has been cancelled
- apps/cli/ is now the canonical CLI (was apps/cli-v2/).
- apps/cli/ legacy v0 archived as branch 'legacy-cli-archive' and tag
  'cli-v0-legacy-final' before deletion; git history preserves it too.
- .github/workflows/release-cli.yml paths updated.
- pnpm-lock.yaml regenerated.

Broker-side peer-grant enforcement (spec: 2026-04-15-per-peer-capabilities):
- 0020_peer-grants.sql adds peer_grants jsonb + GIN index on mesh.member.
- handleSend in broker fetches recipient grant maps once per send, drops
  messages silently when sender lacks the required capability.
- POST /cli/mesh/:slug/grants to update from CLI; broker_messages_dropped_by_grant_total metric.
- CLI grant/revoke/block now mirror to broker via syncToBroker.

Auto-migrate on broker startup:
- apps/broker/src/migrate.ts runs drizzle migrate with pg_advisory_lock
  before the HTTP server binds. Exits non-zero on failure so Coolify
  healthcheck fails closed.
- Dockerfile copies packages/db/migrations into /app/migrations.
- postgres 3.4.5 added as direct broker dep.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 08:44:52 +01:00
Alejandro Gutiérrez
b0dc538119 feat(cli): nudge user to join a mesh when install finds none
After MCP registration and hooks setup, `claudemesh install` now checks
the config for joined meshes. If empty, it prints actionable guidance
(join command + dashboard URL) instead of the generic "Next:" line.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-07 23:49:37 +01:00
Alejandro Gutiérrez
58ba01f20f fix(cli): sync CLAUDEMESH_TOOLS with current tool definitions and sort alphabetically
Add 4 missing tools (cancel_scheduled, grant_file_access, list_scheduled,
schedule_reminder) and sort the array alphabetically for maintainability.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-07 23:33:02 +01:00
Alejandro Gutiérrez
e76ade64d2 feat: scheduled messages — schedule_reminder, send_later, list_scheduled, cancel_scheduled
Some checks failed
CI / Lint (push) Has been cancelled
CI / Typecheck (push) Has been cancelled
CI / Broker tests (Postgres) (push) Has been cancelled
CI / Docker build (linux/amd64) (push) Has been cancelled
- Broker: schedule/list_scheduled/cancel_scheduled WS message types + in-memory delivery
- Client: scheduleMessage(), listScheduled(), cancelScheduled() with resolver Map pattern
- MCP: schedule_reminder, send_later, list_scheduled, cancel_scheduled tools
- CLI: claudemesh remind <msg> --in 2h | --at 15:00 | list | cancel <id>
- Types: WSScheduleMessage, WSScheduledAckMessage, WSScheduledListMessage, WSCancelScheduledAckMessage

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-07 14:53:42 +01:00
Alejandro Gutiérrez
9cefe863e3 fix(web): fully remove withPayload + admin routes from prod
Some checks failed
CI / Lint (push) Has been cancelled
CI / Typecheck (push) Has been cancelled
CI / Broker tests (Postgres) (push) Has been cancelled
CI / Docker build (linux/amd64) (push) Has been cancelled
withPayload crashes ALL routes with React #130 in standalone
output — even with admin page replaced by redirect. The wrapper
injects a client-side ConfigProvider that fails hydration.

Removed: withPayload wrapper, entire (payload) route group.
Kept: payload.config.ts, migrations, blog/changelog server-side
queries with graceful DB fallback.

Payload admin runs on local dev only (add withPayload back in
next.config when running pnpm dev). Production content via
static TSX pages or future API-based publishing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 02:30:26 +01:00
Alejandro Gutiérrez
8bd8d1ff76 fix(web): remove payload REST API route + cli backup guards
Some checks failed
CI / Lint (push) Has been cancelled
CI / Typecheck (push) Has been cancelled
CI / Broker tests (Postgres) (push) Has been cancelled
CI / Docker build (linux/amd64) (push) Has been cancelled
Remove Payload's /api/[...slug] route that conflicts with existing
/api/[...route]. Blog/changelog pages use Payload's local API.

Includes cli install.ts backup + assertNoMcpLoss guards (from
worktree agent).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 01:11:09 +01:00
Alejandro Gutiérrez
7ab3c8d465 feat(cli): claudemesh launch command with transparency banner (v0.1.2)
Adds `claudemesh launch [args]` that spawns Claude Code with
--dangerously-load-development-channels server:claudemesh so peer
messages arrive as <channel> system reminders mid-turn instead of
pull-only via check_messages. Windows uses shell:true to resolve
claude.cmd from PATHEXT.

Prints an info banner before spawning that explains the channel's
scope (peer text injection only), the trust model (treat as
untrusted input), and that existing tool-approval prompts remain
the safety net. --quiet skips the banner.

Install output now mentions `claudemesh launch` as the recommended
launch path; plain `claude` still works for pull-only mode.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 22:22:46 +01:00
Alejandro Gutiérrez
b1f428c44b feat(cli): wss push → mcp channel injection + status hooks in install
Some checks failed
CI / Lint (push) Has been cancelled
CI / Typecheck (push) Has been cancelled
CI / Broker tests (Postgres) (push) Has been cancelled
CI / Docker build (linux/amd64) (push) Has been cancelled
Full parity with claude-peers:

1. Push-injection (the "tap on shoulder" UX)
   - MCP server now declares experimental.claude/channel capability
   - BrokerClient onPush handlers emit server.notification({
       method: "notifications/claude/channel",
       params: { content, meta: {from_id, from_name, mesh_slug,
                  mesh_id, priority, sent_at, delivered_at, kind}}
     })
   - Claude Code injects each push as <channel source="claudemesh">
     system reminder, so the receiver session sees inbound messages
     WITHOUT calling check_messages manually
   - Updated MCP instructions with the "RESPOND IMMEDIATELY" framing
     (adapted from claude-peers)

2. Status hooks in install (default-on, --no-hooks to opt out)
   - new apps/cli/src/commands/hook.ts: reads stdin JSON (Claude Code
     hook payload), extracts cwd+session_id, POSTs /hook/set-status
     to every joined mesh's broker in parallel with process.ppid +
     1s timeout per POST. Silent fail, fire-and-forget.
   - install.ts: writes to ~/.claude/settings.json registering
     `claudemesh hook idle` on Stop + `claudemesh hook working` on
     UserPromptSubmit. Idempotent, preserves other hook entries.
   - uninstall.ts: removes both hook entries + MCP entry; leaves
     unrelated hook/MCP entries alone.
   - dedupes by brokerUrl (multiple meshes on same broker → one POST)

3. CLI surface
   - new subcommand: `claudemesh hook <status>` (internal, but
     exposed so Claude Code can invoke it via the hook shell command)
   - `install --no-hooks` for users who want bare MCP registration
   - --help updated

Coexistence with claude-peers: both tools register Stop and
UserPromptSubmit hooks, each POSTs to its own broker. Claude Code
fires multiple hooks per event without conflict.

npm version 0.1.0 → 0.1.1 (patch).

Verified:
- install with hooks → 2 entries added to settings.json ✓
- install --no-hooks → "Hooks skipped" ✓
- uninstall → both MCP entry + 2 hook entries removed ✓
- `echo '{...}' | claudemesh hook idle` with no joined meshes →
  silent no-op ("no joined meshes, nothing to do") ✓
- MCP initialize response includes experimental.claude/channel ✓

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 19:17:33 +01:00
Alejandro Gutiérrez
59e999535d feat(cli): accept https://claudemesh.com/join/<token> invite URL format
Some checks failed
CI / Lint (push) Has been cancelled
CI / Typecheck (push) Has been cancelled
CI / Broker tests (Postgres) (push) Has been cancelled
CI / Docker build (linux/amd64) (push) Has been cancelled
Pairs with claudemesh-2's new /join/[token] landing page. Users can
now paste a clickable HTTPS URL instead of the dev-only ic:// scheme.

apps/cli/src/invite/parse.ts — new extractInviteToken() handles
four input formats before handing the raw base64url token to the
existing parseInviteLink pipeline:
  - https://claudemesh.com/join/<token>   (primary, clickable)
  - https://claudemesh.com/<locale>/join/<token>   (i18n prefix)
  - ic://join/<token>                     (still supported, dev)
  - <raw-token>                           (last resort: bare base64url)

User-facing strings updated to the HTTPS form:
- cli help: "join <url>"
- install success message
- list (no-meshes) hint
- MCP server "no meshes" error
- README.md primary example
- docs/QUICKSTART.md Path A + Path B

Verified extractInviteToken() on all 4 formats — each returns the
same base64url token → same broker /join lookup.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 16:32:50 +01:00
Alejandro Gutiérrez
64ca600195 chore(cli): rename package to claudemesh-cli (unscoped) for npm publish
Some checks failed
CI / Tests / 🧪 Test (push) Has been cancelled
@claudemesh/cli was already taken on npm by an unrelated project
(claudemesh "domain packages", v1.0.7). PM picked option A: publish
unscoped as claudemesh-cli. Binary name stays "claudemesh" — users
type the natural thing on install:

  npm install -g claudemesh-cli
  claudemesh install
  claudemesh join ic://join/...

renamed references everywhere:
- apps/cli/package.json: name
- apps/cli/README.md: title + install command
- apps/cli/src/{index.ts, mcp/server.ts, commands/install.ts} headers
- docs/QUICKSTART.md: install command, version banner, npx hint
- docs/roadmap.md: package name

also (PM journey-friction #5): surface the "restart Claude Code" step
LOUDLY in install output. Added a yellow-bold warning line after the
✓ success lines so new users don't miss the restart step (MCP tools
only load on Claude Code restart).

  ⚠  RESTART CLAUDE CODE for MCP tools to appear.

ANSI colors gated on isTTY + NO_COLOR/TERM=dumb guards.

bundle rebuilt. ready for npm publish pending user's `npm adduser`.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 14:41:59 +01:00
Alejandro Gutiérrez
714d82e4e7 chore(cli): bundle for node, prep for npm publish
Some checks failed
CI / Tests / 🧪 Test (push) Has been cancelled
Makes @claudemesh/cli installable globally via npm without requiring
bun on user machines. (Bun stays the dev runtime; bundled output is
node-compatible.)

- bun build --target=node --outfile dist/index.js produces a 2.69MB
  standalone bundle with node-shebang banner
- package.json: add description/keywords/author/license/homepage/
  repository, set bin to ./dist/index.js, files=[dist, README, LICENSE],
  publishConfig.access=public, engines.node >=20
- prepublishOnly auto-runs the build
- pin zod from catalog: to 4.1.13 (npm rejects catalog: refs)
- swap Bun.spawnSync → node:child_process.spawnSync in install.ts
  (the only Bun-global usage in the package)
- strip shebang from src/index.ts (banner supplies it post-bundle)

install command now runs in two modes:
- BUNDLED (npm i -g): detects dist/index.js path, writes MCP entry
  with command "claudemesh" (relies on the global bin shim on PATH)
- SOURCE (bun src/index.ts, dev): preflights bun, writes MCP entry
  with command "bun <absolute-path> mcp"

verified end-to-end:
- node dist/index.js --help prints usage ✓
- node dist/index.js install writes correct ~/.claude.json ✓
- node dist/index.js mcp | tools/list returns all 5 tools ✓
- bun src/index.ts install (dev mode) still works ✓

NOT PUBLISHED YET — @claudemesh/cli is owned by an unrelated project
on npm. Awaiting user decision on alternative name (claudemesh-cli,
@alezmad/claudemesh-cli, or new org scope). Bundle is name-agnostic
and will reuse regardless.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 14:31:27 +01:00
Alejandro Gutiérrez
47304d2a52 feat(cli): install command auto-writes ~/.claude.json MCP entry
Some checks failed
CI / Tests / 🧪 Test (push) Has been cancelled
The previous flow printed a \`claude mcp add ...\` command and asked
users to paste it. That's 2 steps, a typo surface, and a point of
user dropoff. Replace with direct read-modify-write of ~/.claude.json.

install:
- preflights bun on PATH (clear error + Bun.com link if missing)
- verifies the MCP entry file exists on disk
- reads ~/.claude.json (empty object if absent)
- adds/updates mcpServers.claudemesh with resolved absolute path
- writes back with 0600 perms, creates parent dir if needed
- read-back verification (bails loudly if post-write state is wrong)
- idempotent: re-running returns "unchanged" if entry already matches
- preserves existing mcpServers entries + other top-level config keys

uninstall:
- removes the claudemesh entry if present
- no-ops cleanly when entry or config file doesn't exist
- doesn't touch anything else

Both print a clear post-action hint: "Restart Claude Code to load
the MCP server. Then join a mesh with claudemesh join <invite-link>".

verified locally with HOME=/tmp/fake-home:
- fresh install → ✓ added, config emitted correctly
- re-install → ✓ unchanged (idempotent)
- install alongside existing "other-mcp" entry → both preserved,
  plus unrelated top-level keys kept verbatim
- uninstall → ✓ removed, claudemesh gone, other entries intact
- uninstall again → · not present (no error)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 14:19:58 +01:00
Alejandro Gutiérrez
8931296e82 feat(cli): scaffold @claudemesh/cli MCP client package (stubs)
The user-facing tool. Two invocation modes:
  - `claudemesh mcp`           → MCP server (stdio), consumed by Claude Code
  - `claudemesh <subcommand>`  → human CLI

Layout:
  apps/cli/
  ├── package.json       bin: { claudemesh: ./src/index.ts }
  ├── README.md          install + usage
  └── src/
      ├── index.ts       dispatcher (mcp | install | join | list | leave | --help)
      ├── env.ts         CLAUDEMESH_BROKER_URL, CONFIG_DIR, DEBUG
      ├── mcp/
      │   ├── server.ts  MCP stdio server with 5 tools
      │   ├── tools.ts   tool schemas (send_message, list_peers,
      │   │              check_messages, set_summary, set_status)
      │   └── types.ts
      ├── ws/client.ts   broker connection (stub for 15b)
      ├── state/config.ts ~/.claudemesh/config.json (joined meshes + keys)
      └── commands/
          ├── install.ts print `claude mcp add ...` instruction
          ├── join.ts    parse ic://join/... (stub, Step 17)
          ├── list.ts    show joined meshes
          └── leave.ts   remove mesh from local config

Tool stubs return "not connected, run `claudemesh join <invite-link>`"
errors until 15b wires the WS client.

Verified:
- `bun src/index.ts --help` → prints usage
- `bun src/index.ts install` → prints MCP add command with resolved path
- `bun src/index.ts list` → "No meshes joined yet"
- `bun src/index.ts mcp` (via stdin) → returns tools/list with all 5 tools

Deps: @modelcontextprotocol/sdk, ws, libsodium-wrappers, zod.
Lockfile regenerated in the same commit per claudemesh-3's flag —
avoids breaking Coolify's --frozen-lockfile deploys.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 22:23:12 +01:00