claudemesh/.artifacts/specs/2026-05-02-v0.2.0-scope.md
Alejandro Gutiérrez f71218c1e1
docs(spec): v0.2.0 — humans-in-mesh interface is REST, not browser WS
Broker already plumbs peer_type. Real blocker is browser-side ed25519
hello-sig — sidestepped by exposing REST API for humans (and external
scripts/bots), with web chat UI as a thin REST client using dashboard
session auth. Collapses #2 (humans) and #3 (REST) into one deliverable.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 02:06:29 +01:00


claudemesh v0.2.0 — scope

Date: 2026-05-02
Status: draft
Predecessor: 2026-05-02-architecture-north-star.md (1.5.0 architecture lock)


Cut

Theme: from agent-only mesh to mesh of agents, humans, and external systems — with conversation context.

| # | Feature | Effort | Spine |
|---|---------|--------|-------|
| 1 | Topics (channels/rooms within a mesh) | 2-3 d | yes |
| 2 | Humans in the mesh (web chat panel) | 2-3 d | depends on #1 |
| 3 | REST API + external WS (API keys per mesh) | 2-3 d | depends on #1 |
| 4 | Bridge peer (forwards one topic between meshes) | 1 d | depends on #1 |

Optional pickup if all four ship early:

  • Local peer aliases (~0.5 d) — IRC-style local labels for hard-to-remember displayNames.
  • Semantic peer search (~0.5 d) — already in vision doc; useful once topics exist.

Total: 7-10 days (the sum of the four estimates above) plus 1-2 days slack. Target release window: 2026-05-12 to 2026-05-16.


Why this cut

The 1.5.0 architecture (CLI-first, tool-less MCP, policy engine) is finished. The next bottleneck is product surface, not engineering.

The current taxonomy (mesh + group + role) is the right organizational structure but lacks a conversational primitive. Every message is either a DM or an @group broadcast: there's no continuity for "the deploys conversation," no scoped state/memory/files, no way for a human to join a topic without joining the whole mesh, and no way for a bridge to forward a single thread of work.

Topics fix this. They are the spine of v0.2.0:

  • Without topics, "humans in mesh" floods every human with every peer's chatter.
  • Without topics, "bridge" forwards everything (loop risk, signal-to-noise problem).
  • Without topics, REST API endpoints have no natural sub-mesh scope.

Once topics exist, humans, REST, and bridge each shrink to roughly half the work, because they slot into a clean primitive instead of inventing one.


Deferred

| Item | Why later |
|------|-----------|
| Federation (broker-to-broker) | Bridges prototype it. Learn from real use first. |
| Sandboxes (E2B / Modal) | Orthogonal capability. Separate release. |
| Sim SDK (@claudemesh/sim) | Niche audience; long-tail. v0.3.0+. |
| Welcome back / persistent MCP | Already in progress as 1.6.0 patch. |
| Mesh telemetry | Pre-PMF telemetry is busywork; users first. |

Design sketches

1. Topics

Mental model: mesh is who you trust; group is who you are; topic is what you're talking about. Three orthogonal axes.

Wire shape:

topic:
  id: <ulid>
  mesh_slug: openclaw
  name: deploys           # unique within mesh
  description: "deploy + on-call"
  visibility: public      # public | private (invite-only) | dm (1:1, autocreated)
  created_by: <pubkey>
  created_at: <ts>

Membership:

topic_member:
  topic_id: <ulid>
  pubkey: <hex>           # session pubkey OR member_pubkey for durable identity
  role: lead | member | observer
  joined_at: <ts>
  last_read_at: <ts>      # for unread counts
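
To illustrate how last_read_at could drive unread counts, here is a minimal TypeScript sketch. The member fields mirror the record above; the stored-message shape, the `unreadCount` helper, and the use of epoch milliseconds for `<ts>` are assumptions made for illustration, not part of the spec.

```typescript
// Hypothetical in-memory shape mirroring the topic_member record above.
type Role = "lead" | "member" | "observer";

interface TopicMember {
  topicId: string;
  pubkey: string;
  role: Role;
  joinedAt: number;   // epoch ms (assumed representation of <ts>)
  lastReadAt: number; // epoch ms
}

// Assumed message shape; the spec does not define stored messages at this point.
interface TopicMessage {
  topicId: string;
  senderPubkey: string;
  sentAt: number; // epoch ms
}

// Count messages in the member's topic that arrived after last_read_at,
// excluding messages the member sent themselves.
function unreadCount(member: TopicMember, messages: TopicMessage[]): number {
  return messages.filter(
    (m) =>
      m.topicId === member.topicId &&
      m.sentAt > member.lastReadAt &&
      m.senderPubkey !== member.pubkey,
  ).length;
}
```

Marking a topic as read would then amount to setting lastReadAt to the timestamp of the newest message the client has displayed.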

Messages reference a topic, not just a target:

// existing send_message envelope gains a `topic` field
{
  "to": "#deploys",       // or topic id, or peer name (DM)
  "topic": "deploys",     // optional explicit, inferred from `to: #<topic>`
  "message": "...",
  "priority": "next"
}

Resolution rules:

  • to: "alice" → DM to peer alice (no topic).
  • to: "@frontend" → group broadcast (no topic — backwards compatible with 1.5.0).
  • to: "#deploys" → topic message; delivered only to topic subscribers.
  • to: "*" → mesh-wide broadcast (kept; lower-priority than topic for new comms).
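
The resolution rules above can be sketched as a small parser. This is a minimal illustration, not the broker's actual routing code: the `Target` union and the `resolveTarget` name are hypothetical, and rejecting an empty or bare-sigil target is an assumption the spec does not state.

```typescript
// Hypothetical result type for the four target forms listed above.
type Target =
  | { kind: "dm"; peer: string }
  | { kind: "group"; group: string }
  | { kind: "topic"; topic: string }
  | { kind: "broadcast" };

// Map the `to` field of a send_message envelope to a routing target.
function resolveTarget(to: string): Target {
  // Assumption: empty names are invalid rather than silently treated as DMs.
  if (to === "" || to === "#" || to === "@") {
    throw new Error(`invalid target: "${to}"`);
  }
  if (to === "*") return { kind: "broadcast" };
  if (to.startsWith("#")) return { kind: "topic", topic: to.slice(1) };
  if (to.startsWith("@")) return { kind: "group", group: to.slice(1) };
  return { kind: "dm", peer: to };
}
```

Keeping the sigil check in one place means the 1.5.0 forms (peer name, @group, *) resolve exactly as before, and only the new # prefix introduces topic routing.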

State/memory/files scoping:

  • claudemesh state set <k> <v> --topic deploys — namespace under topic.
  • claudemesh remember "..." --topic deploys — topic-scoped memory.
  • claudemesh file list --topic deploys — files visible only to topic members.
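
One way the --topic flag could translate into storage keys is a simple prefix scheme. The sketch below is only an illustration: the `scopedKey` helper and the "mesh/" and "topic/<name>/" prefixes are assumptions, since the spec says only that state is namespaced under the topic.

```typescript
// Hypothetical key layout: mesh-level keys and topic-level keys live under
// different prefixes so they can never collide.
function scopedKey(key: string, topic?: string): string {
  if (key.length === 0) throw new Error("key must not be empty");
  if (topic === undefined) return `mesh/${key}`;
  // Assumption: topic names never contain "/", so the prefix stays unambiguous.
  if (topic.length === 0 || topic.includes("/")) {
    throw new Error(`invalid topic name: "${topic}"`);
  }
  return `topic/${topic}/${key}`;
}
```

With this layout, the same key name set with and without --topic deploys refers to two independent entries.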

CLI:

claudemesh topic create deploys --description "deploy + on-call"
claudemesh topic list                          # all topics in mesh
claudemesh topic join deploys
claudemesh topic leave deploys
claudemesh topic invite deploys <peer>         # private topics
claudemesh topic members deploys
claudemesh topic delete deploys                # creator/admin only
claudemesh send "#deploys" "rolling out 1.5.1"

MCP claude/channel notification gains topic as an attribute so peers know which conversation an inbound message belongs to.

Effort breakdown: schema + drizzle migration + CLI verbs + broker routing changes (filter by topic membership) + skill update. ~250 LoC across CLI + ~200 LoC broker.
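
The broker routing change mentioned above (filter by topic membership) might look roughly like the following. This is a sketch under stated assumptions: `OnlinePeer`, `Membership`, and `topicRecipients` are hypothetical stand-ins for broker state, and excluding the sender from delivery is a design choice the spec does not prescribe.

```typescript
// Simplified stand-in for a connected peer; fields are assumptions.
interface OnlinePeer {
  pubkey: string;
  displayName: string;
}

// Topic name -> member pubkeys (as would be derived from topic_member rows).
type Membership = Map<string, Set<string>>;

// Return the online peers that should receive a message sent to `topic`.
// The sender is excluded so it does not get its own message echoed back.
function topicRecipients(
  topic: string,
  senderPubkey: string,
  online: OnlinePeer[],
  membership: Membership,
): OnlinePeer[] {
  const members = membership.get(topic);
  if (!members) return []; // unknown topic: deliver to nobody
  return online.filter(
    (p) => members.has(p.pubkey) && p.pubkey !== senderPubkey,
  );
}
```

Peers that are online but not subscribed never appear in the result, which is the property the first acceptance signal checks.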


2. Humans in the mesh

Mental model: a human is a peer with peer_type: "human" whose presence is durable (no session pubkey rotation; identity tied to an account). They join topics, not the whole mesh — so they only see relevant traffic.

Implementation update (2026-05-02): peer_type: "ai" | "human" | "connector" is already plumbed end-to-end in the broker (hello envelope, ConnectedPeer, list_peers). The missing piece is not broker support but the interface for humans, who have no browser-side ed25519 signing with which to produce the hello-sig. The realistic path is to make the REST API the human interface (rolled into #3 below). The web chat panel becomes a thin client that posts and reads via REST using the dashboard user's session auth, not its own keypair. This collapses #2 and #3 into a single deliverable: the REST API, with the UI on top.

Wire:

// hello envelope gains:
{
  "peer_type": "human",
  "session_pubkey": "<ephemeral, per browser tab>",
  "member_pubkey": "<durable, account-tied>",
  "display_name": "Alejandro"
}

Web panel (apps/web):

/dashboard/mesh/<slug>/topic/<topic-name>
  ├── topic header (members, settings)
  ├── message stream (WS-driven, infinite scroll on history)
  ├── compose box (typing indicator broadcast on focus)
  └── members sidebar (presence, profile, last_read_at)

Backend changes:

  • Persistent message history per topic (drizzle table topic_messages; existing direct messages stay ephemeral by design).
  • Topic-scoped read receipts (topic_member.last_read_at).
  • Typing indicator: short-lived broadcast on the topic channel ({type: "typing", peer: "..."}).

Privacy invariant: a human in #deploys sees only #deploys traffic + DMs sent to them. Never the whole mesh. This is the whole reason topics come first.
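
The privacy invariant can be expressed as a single delivery predicate. The sketch below is one reading of it, with hypothetical names (`Inbound`, `HumanPeer`, `visibleToHuman`); treating @group and mesh-wide broadcasts as never visible to humans is an interpretation of "never the whole mesh", not something the spec spells out.

```typescript
// Hypothetical classification of an inbound message.
type Inbound =
  | { kind: "topic"; topic: string }
  | { kind: "dm"; toPubkey: string }
  | { kind: "group"; group: string }
  | { kind: "broadcast" };

// Hypothetical view of a human peer: durable identity plus joined topics.
interface HumanPeer {
  memberPubkey: string;
  topics: Set<string>;
}

// A human sees topic traffic only for joined topics, plus DMs addressed to them.
function visibleToHuman(human: HumanPeer, msg: Inbound): boolean {
  switch (msg.kind) {
    case "topic":
      return human.topics.has(msg.topic);
    case "dm":
      return msg.toPubkey === human.memberPubkey;
    default:
      return false; // group and mesh-wide traffic is never delivered
  }
}
```

Running every outbound delivery to a human peer through one predicate like this makes the invariant easy to test in isolation.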

Effort: WS endpoint already exists (broker side). Add: topic_messages table, history endpoint, web UI components (compose, stream, members). ~3 days.


3. REST API + external WS

Auth: API keys per mesh, scoped by capability + topic.

api_key:
  id: <ulid>
  mesh_slug: openclaw
  label: "ci-bot"
  hash: <argon2id>
  capabilities: ["send", "read"]
  topic_scopes: ["#deploys"]      # null = all topics; explicit = whitelist
  created_at: <ts>
  last_used_at: <ts>
  revoked_at: <ts | null>
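
How capability and topic scoping might combine at request time can be sketched as follows. The `ApiKey` interface mirrors a subset of the record above in camelCase; the `isAllowed` helper and its check order are assumptions, and hash verification is deliberately left out.

```typescript
type Capability = "send" | "read";

// Subset of the api_key record above that matters for authorization.
interface ApiKey {
  id: string;
  meshSlug: string;
  capabilities: Capability[];
  topicScopes: string[] | null; // null = all topics; explicit = whitelist
  revokedAt: number | null;     // epoch ms, or null while the key is active
}

// Decide whether `key` may exercise `cap` on `topic` within `meshSlug`.
function isAllowed(
  key: ApiKey,
  meshSlug: string,
  cap: Capability,
  topic: string,
): boolean {
  if (key.revokedAt !== null) return false;          // revoked keys never pass
  if (key.meshSlug !== meshSlug) return false;       // keys are per mesh
  if (!key.capabilities.includes(cap)) return false; // capability check
  if (key.topicScopes === null) return true;         // unrestricted topics
  return key.topicScopes.includes(topic);            // whitelist check
}
```

A key scoped to #deploys passes for that topic and fails for #secrets, which is the behavior the third acceptance signal expects.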

CLI for issuance (admin only):

claudemesh apikey create --label "ci-bot" --topic deploys --cap send,read
claudemesh apikey list
claudemesh apikey revoke <id>

REST endpoints (paths relative to claudemesh.com/api):

POST /v1/messages                  Send a message (auth: api key).
GET  /v1/topics/:name/messages     History (with pagination cursor).
GET  /v1/peers                     List online peers (filtered by key scope).
GET  /v1/state                     Read mesh state.
POST /v1/state                     Write mesh state.
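
As a rough idea of what an external client's call to POST /v1/messages could look like, here is a small request builder. Only the path and the envelope fields (`to`, `message`) come from this document; the `Authorization: Bearer` header scheme, the JSON content type, and the `buildSendRequest` helper are assumptions, since the spec does not yet fix how the API key is presented.

```typescript
// Plain description of an HTTP request, so the sketch runs without a network.
interface SendRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Build the request for sending `message` into topic `topic`.
// `baseUrl` is the API root that the /v1 paths are relative to.
function buildSendRequest(
  baseUrl: string,
  apiKey: string,
  topic: string,
  message: string,
): SendRequest {
  return {
    url: `${baseUrl.replace(/\/+$/, "")}/v1/messages`, // strip trailing slashes
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`, // assumed header scheme
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ to: `#${topic}`, message }),
  };
}
```

In a runtime with a global fetch, the result could be sent with `fetch(req.url, req)`; separating construction from sending keeps the request shape easy to test.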

External WS: wss://ic.claudemesh.com/ws?api_key=...&topic=deploys — connects with peer_type: "external". Push-pipe parity with internal sessions; can subscribe to topic streams.

Why API keys rather than session keypairs: external clients (Zapier, GitHub Actions, mobile apps, Slack workspace bots) need long-lived, bearer-style credentials, not ephemeral keypairs. That is a different threat model, so keys are scoped tightly by topic and capability.

Effort: ~3 days. Mostly broker work; CLI gets the issuance verbs.


4. Bridge peer

Mental model: a bridge is a peer that holds memberships in two meshes and forwards traffic on a single topic between them. SDK-only (no broker changes).

Implementation (uses existing @claudemesh/sdk):

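import { Bridge } from "@claudemesh/sdk";

// Forward a single topic between two meshes the bridge peer belongs to.
const bridge = new Bridge({
  meshes: ["work", "external"],
  topic: "incidents",
  // Only forward messages that are not tagged internal-only.
  filter: (msg) => !msg.tags.includes("internal-only"),
  // Tag forwarded messages and stop after two bridge hops.
  loop_prevention: { tag: "via-bridge", max_hops: 2 },
});
await bridge.start();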

Loop prevention: every forwarded message gets a bridge_hop_<n> tag; bridges drop messages that already carry their own tag (prevents echo) and any message with max_hops exceeded.
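
The loop-prevention rule can be sketched as a pure function that either returns the message to forward or drops it. This is one possible reading, not the SDK's implementation: `BridgedMessage`, `LoopPolicy`, and `prepareForward` are hypothetical names, and giving each bridge a distinct `ownTag` (for example the configured tag plus a bridge id) is an assumption about how "their own tag" is formed.

```typescript
interface BridgedMessage {
  body: string;
  tags: string[];
}

interface LoopPolicy {
  ownTag: string;  // tag unique to this bridge (assumed format)
  maxHops: number; // corresponds to max_hops in the config above
}

const HOP_TAG = /^bridge_hop_(\d+)$/;

// Highest hop number recorded on the message, or 0 if it was never bridged.
function hopCount(msg: BridgedMessage): number {
  let hops = 0;
  for (const tag of msg.tags) {
    const m = HOP_TAG.exec(tag);
    if (m) hops = Math.max(hops, Number(m[1]));
  }
  return hops;
}

// Returns the tagged copy to forward, or null if the message must be dropped.
function prepareForward(
  msg: BridgedMessage,
  policy: LoopPolicy,
): BridgedMessage | null {
  if (msg.tags.includes(policy.ownTag)) return null; // echo of our own forward
  const hops = hopCount(msg);
  if (hops >= policy.maxHops) return null; // one more hop would exceed max_hops
  const kept = msg.tags.filter((t) => !HOP_TAG.test(t));
  return {
    body: msg.body,
    tags: [...kept, policy.ownTag, `bridge_hop_${hops + 1}`],
  };
}
```

Because the function never mutates its input, a bridge can apply it after its own filter and log dropped messages without side effects.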

CLI: claudemesh bridge run <config.yaml> runs an SDK bridge as a long-lived process, for example inside a Docker container or as a systemd unit.

What it deliberately doesn't do:

  • Cross-broker federation (that's a separate broker-to-broker protocol).
  • Bidirectional state/memory sync (only messages on a single topic).
  • Identity unification (a peer in mesh A is not the same peer in mesh B; the bridge appears as the messenger).

Effort: ~1 day on top of the existing SDK.


Acceptance signals

v0.2.0 ships when all four are demonstrable end-to-end:

  1. A peer creates #deploys, two other peers join it, traffic is topic-scoped, mesh-wide chat doesn't see it.
  2. A human signs in at claudemesh.com, joins #deploys, sends a message, a Claude session in the mesh receives it as a <channel> interrupt with topic="deploys".
  3. A curl POST against /v1/messages with an API key delivers a message into #deploys; the same API key is rejected on #secrets.
  4. A bridge peer running locally forwards #incidents between two test meshes; loop is prevented; one-shot demo recorded.

Out of scope (explicitly)

  • Topic hierarchy / nesting (flat namespace per mesh; revisit at scale).
  • Topic-scoped capability grants (grant <peer> read:#topic) — solvable later via capability extension.
  • Threads-within-topics (Slack-style). Defer.
  • Voice / video / file-upload UX for humans — text only in v0.2.0.
  • Federation, sandboxes, sim-sdk — explicitly deferred above.

Risks

  • Topics retrofit risk — existing 1.5.0 message envelope assumes "to" is peer/group/star. Adding topic is additive on the wire but changes routing logic. Test path: backfill existing meshes with a default #general topic; opt-in to topic-only routing.
  • Web chat session lifecycle — humans expect "I closed the tab and came back, my place is preserved." Ephemeral session pubkeys break that. Workaround: tie human peer identity to member_pubkey + last_read_at on the topic; session pubkey rotates per tab but membership is durable.
  • API key abuse — leaked keys = anyone can post. Mitigations: capability + topic scoping; rate limits per key; last_used_at + audit trail; revoke verb is fast.
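
One of the mitigations listed for API key abuse is a per-key rate limit. A minimal fixed-window limiter is sketched below to show the idea; the `KeyRateLimiter` class, the fixed-window strategy, and the in-memory store are all assumptions, and a real broker would likely need shared state across instances.

```typescript
// Fixed-window rate limiter keyed by API key id (illustrative only).
class KeyRateLimiter {
  private readonly limit: number;
  private readonly windowMs: number;
  private readonly windows = new Map<string, { start: number; count: number }>();

  constructor(limit: number, windowMs: number) {
    if (limit < 1 || windowMs < 1) throw new Error("limit and windowMs must be >= 1");
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true if the request is allowed. `now` is injectable for testing.
  allow(keyId: string, now: number = Date.now()): boolean {
    const w = this.windows.get(keyId);
    if (!w || now - w.start >= this.windowMs) {
      // First request for this key, or the previous window has expired.
      this.windows.set(keyId, { start: now, count: 1 });
      return true;
    }
    if (w.count >= this.limit) return false; // over budget in this window
    w.count += 1;
    return true;
  }
}
```

Each key gets its own budget, so a leaked or misbehaving key is throttled without affecting other keys in the same mesh.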

Open questions

  1. Do existing @group semantics survive intact, or do we collapse @group and #topic into one primitive? (Answer favored: keep both — different axes.)
  2. Should topics persist messages by default, or be opt-in? (Default: persist for any topic that a peer_type: "human" peer has joined; configurable per topic for agent-only ones.)
  3. Where does mesh-MCP discovery live in the topic model — per topic or per mesh? (Likely per mesh; mesh-MCP is infrastructure, not conversation.)