alezmad/claudemesh

Alejandro Gutiérrez ee12510ef1

refactor: rename cli-v2 → cli, archive legacy cli, plus broker-side grants + auto-migrate

- apps/cli/ is now the canonical CLI (was apps/cli-v2/).
- apps/cli/ legacy v0 archived as branch 'legacy-cli-archive' and tag
  'cli-v0-legacy-final' before deletion; git history preserves it too.
- .github/workflows/release-cli.yml paths updated.
- pnpm-lock.yaml regenerated.

Broker-side peer-grant enforcement (spec: 2026-04-15-per-peer-capabilities):
- 0020_peer-grants.sql adds peer_grants jsonb + GIN index on mesh.member.
- handleSend in broker fetches recipient grant maps once per send, drops
  messages silently when sender lacks the required capability.
- POST /cli/mesh/:slug/grants to update from CLI; broker_messages_dropped_by_grant_total metric.
- CLI grant/revoke/block now mirror to broker via syncToBroker.

Auto-migrate on broker startup:
- apps/broker/src/migrate.ts runs drizzle migrate with pg_advisory_lock
  before the HTTP server binds. Exits non-zero on failure so Coolify
  healthcheck fails closed.
- Dockerfile copies packages/db/migrations into /app/migrations.
- postgres 3.4.5 added as direct broker dep.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

2026-04-15 08:44:52 +01:00

claudemesh v2 — Pass 1 Parity Test Plan

Status: backlog reference
Created: 2026-04-11
Purpose: Concrete test-by-test plan that verifies v2 behaves identically to v1 for every feature in the 2026-04-11-v1-feature-inventory.md regression list. A green parity suite means v2 Pass 1 is ready to ship; a red parity suite means keep working.
Companion to: 2026-04-11-v1-feature-inventory.md (that document's §12 is the input to every test below)

1. Why this exists

v2 Pass 1 is a refactor: new folder structure, facade pattern, CLI user flows. The broker is unchanged, the backend services are unchanged, every v1 tool (the 79 MCP tools + 85 broker WS message types) must keep working. The only safe way to prove "keep working" is to run v1 and v2 side by side and assert they produce identical behavior.

v2 deliberately drops some v1 CLI commands (not tools — CLI subcommands exposed to end users). Because v2 has no users yet and no migration constraints, the v2 CLI picks the best command surface without backwards compatibility. Commands dropped from v2:

v1 command	Dropped in v2 because	Replacement
`claudemesh launch [args]`	Redundant with bare `claudemesh`. The subcommand word adds nothing.	`claudemesh` (bare, with flags)
`claudemesh disconnect telegram`	Bridge teardown is done inside Telegram (`/revoke`) or by leaving the mesh; CLI wrapper is cosmetic.	In-Telegram revoke; or leave the mesh

Preserved with UX upgrade: claudemesh connect telegram stays in v2 but is rewritten as an interactive wizard — mesh picker, QR code, t.me deep link, waits for bridge confirmation. See Pass 1 spec §5.7. v1's silent auto-pick of config.meshes[0] is a multi-mesh footgun and is replaced.

The 79 MCP tools and 85 WS message types are all preserved. Only the thin wrapper CLI subcommands that translated them are dropped. This is a conscious v2 decision to simplify the CLI surface, not a regression.

v1 has thin test coverage (2 CLI test files for ~12 k LOC, ~10 broker tests mostly covering crypto primitives). That's not enough to catch regressions during a refactor. We need a parity suite — a test layer that doesn't trust v2 to describe its own correctness, but compares it to v1 directly.

The parity suite is the acceptance criteria. v2 Pass 1 ships when it's green.

2. The seven test layers

#	Layer	Purpose	Compared against
1	Parity tests	Behavioral equivalence on the `inventory §12` regression list	v1 CLI subprocess
2	WS contract tests	Wire-format compatibility — v2 must speak the broker's v1 protocol	captured v1 envelopes
3	MCP tool handler tests	Every one of the 79 tools dispatches identically	v1 handler output
4	End-to-end smoke tests	Full user journeys against a real broker	self-consistent e2e flow
5	JSON output golden tests	`--json` output shape is stable for script consumers	v1 `--json` captures
6	Facade unit tests	Boundary invariants — no token leaks, Zod validation works	facade contract spec
7	Port-forwarded v1 tests	Existing crypto + broker tests still pass	unchanged

Tests run in parallel where possible. Layers 1 and 2 are the most load-bearing — they're the proof that v2 doesn't break existing users.

3. Layer 1 — Parity tests (inventory §12 driven)

One test file per regression check in the inventory's "must preserve" list. Every file spawns both the v1 and v2 CLIs against the same mock broker, captures their behavior, and asserts that they match on the dimensions that matter (exit code, stdout JSON fields, broker-side DB state, WS messages sent).
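The comparison step described above can be sketched as a pure function over two captured runs. The `CliRun` shape and the `diffRuns` name are hypothetical and not taken from the v2 codebase. The sketch covers only exit code and selected stdout JSON fields, and leaves out broker-side DB state and WS capture.

```typescript
// Hypothetical shape for one captured CLI run; not the project's real helper type.
interface CliRun {
  exitCode: number;
  stdout: string; // assumed to hold a single JSON document (for example from --json)
}

// Returns human-readable mismatches; an empty array means the two runs are at parity.
function diffRuns(v1: CliRun, v2: CliRun, fields: string[]): string[] {
  const mismatches: string[] = [];
  if (v1.exitCode !== v2.exitCode) {
    mismatches.push(`exitCode: v1=${v1.exitCode} v2=${v2.exitCode}`);
  }
  const a = JSON.parse(v1.stdout) as Record<string, unknown>;
  const b = JSON.parse(v2.stdout) as Record<string, unknown>;
  for (const field of fields) {
    // Compare serialized values so nested objects and arrays are covered.
    // Note: this is sensitive to key order inside nested objects.
    const left = JSON.stringify(a[field]);
    const right = JSON.stringify(b[field]);
    if (left !== right) {
      mismatches.push(`${field}: v1=${left} v2=${right}`);
    }
  }
  return mismatches;
}
```

Listing the fields explicitly keeps each parity file honest about which dimensions it asserts on, instead of diffing whole outputs and then whitelisting noise.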

File layout:

apps/cli-v2/tests/parity/
├── first-run/
├── session-lifecycle/
├── messaging/
├── crypto/
├── tools/        (this one is covered by layer 3)
├── backends/
├── scheduled/
├── telegram/
├── dashboard-sync/
├── webhooks/
└── doctor/

3.1 First-run parity (inventory §12.1)

Test file	Asserts
`first-run/install.test.ts`	`claudemesh install` writes identical entries to `~/.claude.json` + `~/.claude/settings.json`
`first-run/install-no-hooks.test.ts`	`claudemesh install --no-hooks` registers only the MCP server, not the status hooks
`first-run/uninstall.test.ts`	`claudemesh uninstall` removes everything `install` added, leaving other config entries untouched
`first-run/join-v1-invite.test.ts`	`claudemesh join <v1-invite-url>` enrolls using legacy invite format
`first-run/join-v2-invite.test.ts`	`claudemesh join <v2-invite-url>` enrolls using short-code + signed payload
`first-run/bare-first-run-welcome.test.ts`	`claudemesh` on a fresh machine (no config) shows the welcome wizard
`first-run/bare-returning-user-launches.test.ts`	`claudemesh` on a machine with config launches a session directly (no wizard)
`first-run/bare-with-flags-launches.test.ts`	`claudemesh --resume abc`, `claudemesh --mesh foo -y`, `claudemesh --name Alexis` all dispatch to the launch handler
`first-run/launch-word-unknown-command.test.ts`	`claudemesh launch` returns exit code 3 (invalid args) with a clear "Unknown command" error. The word `launch` is deliberately not a subcommand in v2.

3.2 Session lifecycle parity (inventory §12.2)

Test file	Asserts
`session-lifecycle/status-hook-start.test.ts`	Running `claudemesh hook` with a Claude Code `session_start` payload posts to `/hook/set-status` with `status: working`, source `hook`
`session-lifecycle/status-hook-stop.test.ts`	`session_stop` payload → `/hook/set-status` with `status: idle`, source `hook`
`session-lifecycle/status-priority.test.ts`	When a `hook` source status is fresh, a subsequent `manual` status is rejected (priority gating)
`session-lifecycle/status-ttl-sweep.test.ts`	After `WORKING_TTL_MS`, a stale `working` status decays to `idle` via the sweeper
`session-lifecycle/list-peers-freshness.test.ts`	`claudemesh peers` marks peers with stale hook source as degraded
`session-lifecycle/multi-mesh-status.test.ts`	Status updates in mesh A don't affect peer status in mesh B

3.3 Messaging parity (inventory §12.3)

Test file	Asserts
`messaging/send-priority-now.test.ts`	`send --priority now` delivers immediately, bypassing busy-gate
`messaging/send-priority-next.test.ts`	`send --priority next` waits for the recipient to be idle before delivery
`messaging/send-priority-low.test.ts`	`send --priority low` is pull-only (recipient must `check_messages` or `inbox`)
`messaging/send-to-group.test.ts`	`send @frontend <msg>` fans out to all group members, not individual peers
`messaging/send-broadcast.test.ts`	`send "*" <msg>` broadcasts to all connected peers in the mesh
`messaging/offline-queue-drain.test.ts`	Messages sent to an offline peer persist in `mesh.message_queue` and drain when the peer reconnects
`messaging/duplicate-delivery-prevention.test.ts`	Sending the same `messageId` twice does not double-deliver
`messaging/message-status-lookup.test.ts`	`message_status` returns correct delivery state: queued / delivered / acked
`messaging/inbox-drain.test.ts`	`claudemesh inbox` drains and prints pending messages; second run shows empty
`messaging/inbox-wait.test.ts`	`claudemesh inbox --wait 5` blocks for broker delivery up to 5s, returns early on arrival
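The three priority rows above amount to a small decision table. The sketch below restates it as code. The `Action` names and the `decideDelivery` function are hypothetical labels for illustration, not broker identifiers.

```typescript
type Priority = 'now' | 'next' | 'low';
// Hypothetical labels: push = deliver immediately, hold = wait until idle, queue = pull-only.
type Action = 'push' | 'hold' | 'queue';

function decideDelivery(priority: Priority, recipientBusy: boolean): Action {
  switch (priority) {
    case 'now':
      return 'push'; // bypasses the busy-gate
    case 'next':
      return recipientBusy ? 'hold' : 'push'; // waits for the recipient to be idle
    case 'low':
      return 'queue'; // recipient must call check_messages or inbox
  }
}
```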

3.4 Cryptographic integrity parity (inventory §12.4)

Test file	Asserts
`crypto/keypair-perms.test.ts`	Generated keypairs at `~/.claudemesh/keys/<mesh>.key` are mode `0600`, parent dir `0700`
`crypto/keypair-roundtrip.test.ts`	Keypair generation + persistence + reload produces the same public key
`crypto/hello-sig-verification.test.ts`	Valid Ed25519 hello signatures pass; altered timestamps are rejected as replay
`crypto/envelope-roundtrip.test.ts`	`send_message` ciphertext decrypts back to original on the recipient side
`crypto/file-encrypt.test.ts`	`share_file` with `to: <peer>` produces AES-GCM ciphertext + wrapped symmetric key in `mesh.file_key`
`crypto/file-decrypt.test.ts`	Recipient downloads + decrypts, content matches original
`crypto/grant-access-rewrap.test.ts`	`grant_file_access` adds a new `file_key` row for the additional recipient, re-wrapping the same symmetric key
`crypto/invite-v2-signature.test.ts`	v2 invite payloads pass Ed25519 signature verification; tampered payloads fail
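For the `keypair-perms` row, the mode check could look roughly like the sketch below. The function names are hypothetical. The comparison is split into a pure `checkModes` so it can be exercised without touching the filesystem, with a thin `checkKeyFile` wrapper around Node's `statSync`.

```typescript
import { statSync } from 'node:fs';
import { dirname } from 'node:path';

// Keep only the permission bits (owner/group/other) of a stat mode.
function permissionBits(mode: number): number {
  return mode & 0o777;
}

// Pure check: key file must be 0600, parent directory must be 0700.
function checkModes(fileMode: number, dirMode: number): string[] {
  const problems: string[] = [];
  const fileBits = permissionBits(fileMode);
  const dirBits = permissionBits(dirMode);
  if (fileBits !== 0o600) {
    problems.push(`key file mode is ${fileBits.toString(8)}, expected 600`);
  }
  if (dirBits !== 0o700) {
    problems.push(`key directory mode is ${dirBits.toString(8)}, expected 700`);
  }
  return problems;
}

// Filesystem wrapper: stats the key file and its parent directory.
function checkKeyFile(keyPath: string): string[] {
  return checkModes(statSync(keyPath).mode, statSync(dirname(keyPath)).mode);
}
```

The same check would apply to the `doctor` config-permission row, with the expected file mode passed in rather than hard-coded.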

3.5 Broker backends parity (inventory §12.6)

Test file	Asserts
`backends/postgres-mesh-execute.test.ts`	`mesh_execute "CREATE TABLE bugs..."` creates the table in the per-mesh schema
`backends/postgres-mesh-query.test.ts`	`mesh_query "SELECT * FROM bugs"` returns rows
`backends/postgres-mesh-schema.test.ts`	`mesh_schema` lists the newly-created table + columns
`backends/postgres-cross-mesh-isolation.test.ts`	Query in mesh A cannot see tables created in mesh B (schema-level isolation)
`backends/neo4j-graph-execute.test.ts`	`graph_execute "CREATE (n:Bug {id: 1})"` persists a node in the per-mesh Neo4j database
`backends/neo4j-graph-query.test.ts`	`graph_query "MATCH (n:Bug) RETURN n"` returns the created node
`backends/qdrant-vector-store.test.ts`	`vector_store collection=docs content=...` upserts into `mesh_<id>_docs` collection
`backends/qdrant-vector-search.test.ts`	`vector_search collection=docs query=...` returns nearest neighbors with metadata
`backends/qdrant-list-collections.test.ts`	`list_collections` enumerates the mesh's collections
`backends/minio-share-small-file.test.ts`	`share_file` with < 64 KB uploads and returns a fileId
`backends/minio-share-large-file.test.ts`	`share_file` with 10 MB uploads in chunks and returns a fileId
`backends/minio-get-file.test.ts`	`get_file` returns the content or presigned URL for download
`backends/minio-delete-file.test.ts`	`delete_file` removes the file from the bucket
`backends/docker-mcp-deploy.test.ts`	`mesh_mcp_deploy` with a catalog alias spawns a Docker container with the expected env + memory + network_allow
`backends/docker-mcp-logs.test.ts`	`mesh_mcp_logs` returns recent stdout/stderr from a running deployment
`backends/docker-mcp-undeploy.test.ts`	`mesh_mcp_undeploy` SIGTERMs the container cleanly

3.6 Scheduled messages + URL watch parity (inventory §12.7)

Test file	Asserts
`scheduled/one-shot-deliver-at.test.ts`	`schedule_reminder deliver_at=<ts+5s>` fires at the target timestamp
`scheduled/one-shot-in-seconds.test.ts`	`schedule_reminder in_seconds=5` fires 5 seconds after submission
`scheduled/cron-recurring.test.ts`	`schedule_reminder cron="*/1 * * * *"` fires every minute
`scheduled/persist-across-restart.test.ts`	Pending reminders survive a broker restart (re-registered from `mesh.scheduled_message` table)
`scheduled/list-cancel.test.ts`	`list_scheduled` shows pending; `cancel_scheduled <id>` prevents delivery
`scheduled/url-watch-hash-mode.test.ts`	`mesh_watch mode=hash` detects body change via SHA-256 comparison
`scheduled/url-watch-json-mode.test.ts`	`mesh_watch mode=json extract=data.status` detects value change at the jsonpath
`scheduled/url-watch-status-mode.test.ts`	`mesh_watch mode=status` detects HTTP status code change
`scheduled/url-watch-notify-on-match.test.ts`	`notify_on="match:up"` fires only when value equals `"up"`
`scheduled/url-watch-persist.test.ts`	Active watches persist across broker restart
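The three watch modes and the `notify_on` filter can be sketched as follows. This is an illustration under assumptions, not the broker's implementation: `observe` and `shouldNotify` are hypothetical names, `extract` is handled as a plain dotted path rather than full JSONPath, and a `match:` filter is assumed to fire on a change to the given value.

```typescript
import { createHash } from 'node:crypto';

type WatchMode = 'hash' | 'json' | 'status';

interface Snapshot {
  status: number; // HTTP status code
  body: string; // raw response body
}

// Reduce a response to the single value the watch compares between polls.
function observe(mode: WatchMode, snap: Snapshot, extract: string = ''): string {
  if (mode === 'status') return String(snap.status);
  if (mode === 'hash') return createHash('sha256').update(snap.body).digest('hex');
  // json mode: walk a dotted path such as "data.status".
  let node: unknown = JSON.parse(snap.body);
  for (const key of extract.split('.').filter((k) => k.length > 0)) {
    node =
      node !== null && typeof node === 'object'
        ? (node as Record<string, unknown>)[key]
        : undefined;
  }
  return typeof node === 'string' ? node : String(JSON.stringify(node));
}

// Fire on change; with notify_on="match:<value>", fire only when the new value equals <value>.
function shouldNotify(previous: string, current: string, notifyOn?: string): boolean {
  if (previous === current) return false;
  if (notifyOn !== undefined && notifyOn.startsWith('match:')) {
    return current === notifyOn.slice('match:'.length);
  }
  return true;
}
```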

3.7 Telegram bridge parity (inventory §12.8)

The Telegram bridge is a broker-side feature that continues to work in v2 Pass 1 because Pass 1 doesn't touch the broker. On the CLI side, claudemesh connect telegram is kept and rewritten as an interactive wizard (see §1), while claudemesh disconnect telegram is dropped: there are no users to migrate, and teardown happens inside Telegram (/revoke) or by leaving the mesh.

The wizard rows below cover the v2 CLI surface. The broker side of the bridge is verified via e2e tests (layer 4, §6) that connect directly to the broker's POST /tg/token endpoint, simulate inbound Telegram webhook payloads, and verify outbound routing via send_message(to: "tg:<username>").

Test file	Layer	Asserts
`telegram/connect-wizard-mesh-picker.test.ts`	parity	`claudemesh connect telegram` with >1 joined mesh shows the Ink mesh picker (v1 silently picked mesh[0] — v2 is explicit)
`telegram/connect-wizard-single-mesh.test.ts`	parity	With exactly one joined mesh, the wizard skips the picker and proceeds directly to token request
`telegram/connect-wizard-zero-mesh.test.ts`	parity	With zero joined meshes, exits with error code 5 and "run `claudemesh join` first"
`telegram/connect-wizard-happy-path.test.ts`	parity	Mock broker returns `{token, deepLink}`; wizard renders QR + link; simulated `telegram_bridge_connected` push triggers success message
`telegram/connect-wizard-poll-fallback.test.ts`	parity	When the broker does not emit a push event, the wizard falls back to polling `GET /mesh/:id/members` every 2s until a `tg:*` entry appears
`telegram/connect-wizard-rate-limited.test.ts`	parity	Broker 429 response is caught and rendered as "too many Telegram tokens in the last hour" instead of raw HTTP
`telegram/connect-wizard-link-flag.test.ts`	parity	`--link` flag prints only the deep link, no QR, no wait (scriptable)
`telegram/connect-wizard-status-flag.test.ts`	parity	`--status` flag checks existing bridge without generating a new token
`telegram/connect-wizard-ctrl-c.test.ts`	parity	Ctrl-C during the wait phase prints the "link stays valid" hint and exits 0
(e2e) `tests/e2e/telegram/broker-token-register.test.ts`	e2e	`POST /tg/token` registers a bot token, writes to `mesh.telegram_bridge`
(e2e) `tests/e2e/telegram/broker-inbound-routing.test.ts`	e2e	Simulated inbound Telegram update is routed as a mesh `send_message` with `subtype: telegram`
(parity) `telegram/send-message-to-tg-peer.test.ts`	parity	`send_message(to: "tg:<username>", ...)` via the v2 CLI calls the broker with the same WS envelope as v1 would
(parity) `telegram/list-peers-shows-tg-bridge.test.ts`	parity	When a Telegram bridge is registered on the broker, `claudemesh peers` includes `tg:<username>` entries with `type: bridge`
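The poll-fallback row can be illustrated with a small loop that is easy to drive from a test. In this sketch `waitForBridge` is a hypothetical name, and the member lookup and the sleep are injected so a test can replace the real `GET /mesh/:id/members` call and the 2s delay. The attempt cap is an assumption, since the plan does not state a timeout.

```typescript
// Polls until an entry with the "tg:" prefix appears, or gives up after maxAttempts.
async function waitForBridge(
  listMembers: () => Promise<string[]>, // stands in for GET /mesh/:id/members
  sleep: (ms: number) => Promise<void>, // injected so tests need not wait
  maxAttempts: number, // assumed cap; not specified in this plan
): Promise<string | null> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const members = await listMembers();
    const bridge = members.find((id) => id.startsWith('tg:'));
    if (bridge !== undefined) return bridge;
    await sleep(2000); // poll interval from the test description above
  }
  return null;
}
```

Injecting `sleep` lets the parity test assert the number of polls without spending real time in the wait phase.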

3.8 Dashboard sync parity (inventory §12.9)

Test file	Asserts
`dashboard-sync/browser-flow.test.ts`	`claudemesh sync` opens browser, receives JWT via `callback-listener`, fetches mesh list
`dashboard-sync/cli-sync-endpoint.test.ts`	`POST /cli-sync` with valid JWT returns the user's dashboard meshes; invalid JWT is rejected
`dashboard-sync/force-resync.test.ts`	`claudemesh sync --force` re-links even if already linked

3.9 Webhooks parity (inventory §12.10)

Test file	Asserts
`webhooks/create-returns-url.test.ts`	`create_webhook name=github` returns a POST URL
`webhooks/external-post-becomes-mesh-message.test.ts`	External `POST /hook/:meshId/:webhookId` with a JSON payload emits a mesh message to all peers
`webhooks/hmac-signature-validation.test.ts`	HMAC-signed requests pass, unsigned requests are rejected
`webhooks/list-delete.test.ts`	`list_webhooks` + `delete_webhook` round-trip works
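For the HMAC row, a verification step might look like the sketch below. The hash algorithm (SHA-256), the hex encoding, and the `signBody` / `verifySignature` names are assumptions; this plan does not specify the broker's signing scheme. The calls used (`createHmac`, `timingSafeEqual`) are standard Node `crypto` functions.

```typescript
import { createHmac, timingSafeEqual } from 'node:crypto';

// Assumed scheme: hex-encoded HMAC-SHA256 over the raw request body.
function signBody(secret: string, body: string): string {
  return createHmac('sha256', secret).update(body).digest('hex');
}

function verifySignature(secret: string, body: string, signature: string | undefined): boolean {
  if (signature === undefined) return false; // unsigned requests are rejected
  const expected = Buffer.from(signBody(secret, body), 'hex');
  const received = Buffer.from(signature, 'hex');
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return expected.length === received.length && timingSafeEqual(expected, received);
}
```

The constant-time comparison avoids leaking how many leading bytes of a forged signature were correct.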

3.10 Doctor checks parity (inventory §12.11)

Test file	Asserts
`doctor/check-node-version.test.ts`	`doctor` reports Node ≥ 20 (or warns if < 20 in a mocked env)
`doctor/check-claude-on-path.test.ts`	`doctor` detects `claude` binary on PATH
`doctor/check-mcp-registered.test.ts`	`doctor` detects MCP server entry in `~/.claude.json`
`doctor/check-hooks-registered.test.ts`	`doctor` detects status hooks in `~/.claude/settings.json`
`doctor/check-config-perms.test.ts`	`doctor` validates `~/.claudemesh/config.json` is mode `0600`
`doctor/check-keypairs-valid.test.ts`	`doctor` validates each mesh keypair can sign + verify

Parity layer total: 83 test files in the tables above, not counting the two e2e files listed in §3.7. Each file runs both v1 and v2 in the same environment and diffs the outputs.

4. Layer 2 — WS contract tests

One contract test per broker WS message type (85 total from inventory §3). Each test captures what v1's WS client would send for a given input and asserts v2 sends the byte-identical envelope (modulo legitimate non-determinism like nonces and timestamps, which are normalized before comparison).

File layout:

apps/cli-v2/tests/contract/ws/
├── lifecycle/                 (3 tests: hello, hello_ack, get_clock)
├── messaging/                 (5 tests)
├── profile/                   (5 tests)
├── groups/                    (2 tests)
├── state/                     (3 tests)
├── memory/                    (3 tests)
├── files/                     (5 tests)
├── vectors/                   (4 tests)
├── graph/                     (2 tests)
├── sql/                       (3 tests)
├── streams/                   (5 tests)
├── contexts/                  (3 tests)
├── tasks/                     (4 tests)
├── scheduling/                (3 tests)
├── metadata/                  (3 tests)
├── clock/                     (4 tests)
├── skills/                    (5 tests)
├── mcp-registry/              (12 tests)
├── vault/                     (4 tests)
├── url-watch/                 (3 tests)
├── webhooks/                  (3 tests)
└── audit/                     (2 tests)

4.1 Contract test pattern

// tests/contract/ws/state/set-state.test.ts
import { describe, it, expect } from 'bun:test';
import { normalize, captureV1Envelope, captureV2Envelope } from '@/tests/helpers/wire-capture';

describe('WS contract: set_state', () => {
  it('v2 envelope matches v1 for string value', async () => {
    const input = { meshId: 'test-mesh', key: 'sprint', value: '2026-W15' };
    const v1 = await captureV1Envelope('set_state', input);
    const v2 = await captureV2Envelope('set_state', input);
    expect(normalize(v2)).toEqual(normalize(v1));
  });

  it('v2 envelope matches v1 for JSON value', async () => {
    const input = { meshId: 'test-mesh', key: 'deploy_freeze', value: { until: '2026-04-15' } };
    const v1 = await captureV1Envelope('set_state', input);
    const v2 = await captureV2Envelope('set_state', input);
    expect(normalize(v2)).toEqual(normalize(v1));
  });

  it('v2 envelope matches v1 for null value (deletion)', async () => {
    const input = { meshId: 'test-mesh', key: 'tmp', value: null };
    const v1 = await captureV1Envelope('set_state', input);
    const v2 = await captureV2Envelope('set_state', input);
    expect(normalize(v2)).toEqual(normalize(v1));
  });
});

4.2 The `normalize()` helper

Strips fields that are legitimately non-deterministic between v1 and v2:

nonce — random per envelope
timestamp — wall clock
messageId — random UUID
_reqId — random correlation ID
ciphertext — depends on nonce + random keypair; instead of comparing ciphertext directly, both envelopes are decrypted and the plaintext is compared

Everything else (message type, meshId, priority, sender pubkey, recipient, flags) must match byte-for-byte.
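A minimal sketch of such a helper is shown below. It is not the project's `normalize()`: the `Envelope` type is a stand-in for the real wire format, and the decrypt function is passed in as a second argument purely to keep the example self-contained, whereas the helper used in §4.1 takes only the envelope.

```typescript
// Stand-in envelope type; the real wire format is defined by the broker.
type Envelope = Record<string, unknown>;

// Fields listed above as legitimately non-deterministic.
const VOLATILE_FIELDS = ['nonce', 'timestamp', 'messageId', '_reqId'];

function normalize(envelope: Envelope, decrypt: (ciphertext: string) => string): Envelope {
  const out: Envelope = {};
  for (const [key, value] of Object.entries(envelope)) {
    if (VOLATILE_FIELDS.includes(key)) continue; // drop random and wall-clock fields
    if (key === 'ciphertext' && typeof value === 'string') {
      out['plaintext'] = decrypt(value); // compare plaintext, never ciphertext
      continue;
    }
    out[key] = value; // everything else must match as-is
  }
  return out;
}
```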

4.3 Full contract test manifest (85 tests)

Every WS message type from inventory §3 gets a file:

Family	WS messages	Test files
Lifecycle	`hello`, `hello_ack`, `get_clock`	3
Messaging	`send`, `peer_dir_request`, `peer_dir_response`, `peer_file_request`, `peer_file_response`	5
Profile	`set_status`, `set_summary`, `set_visible`, `set_profile`, `set_stats`	5
Groups	`join_group`, `leave_group`	2
State	`set_state`, `get_state`, `list_state`	3
Memory	`remember`, `recall`, `forget`	3
Files	`get_file`, `list_files`, `file_status`, `grant_file_access`, `delete_file`	5
Vectors	`vector_store`, `vector_search`, `vector_delete`, `list_collections`	4
Graph	`graph_query`, `graph_execute`	2
SQL	`mesh_query`, `mesh_execute`, `mesh_schema`	3
Streams	`create_stream`, `publish`, `subscribe`, `unsubscribe`, `list_streams`	5
Contexts	`share_context`, `get_context`, `list_contexts`	3
Tasks	`create_task`, `claim_task`, `complete_task`, `list_tasks`	4
Scheduling	`schedule`, `list_scheduled`, `cancel_scheduled`	3
Metadata	`mesh_info`, `list_peers`, `message_status`	3
Clock	`set_clock`, `pause_clock`, `resume_clock`, `get_clock`	4
Skills	`share_skill`, `get_skill`, `list_skills`, `remove_skill`, `skill_deploy`	5
MCP registry	`mcp_register`, `mcp_unregister`, `mcp_list`, `mcp_call`, `mcp_call_response`, `mcp_deploy`, `mcp_undeploy`, `mcp_update`, `mcp_logs`, `mcp_scope`, `mcp_schema`, `mcp_catalog`	12
Vault	`vault_set`, `vault_get`, `vault_list`, `vault_delete`	4
URL watch	`watch`, `unwatch`, `watch_list`	3
Webhooks	`create_webhook`, `list_webhooks`, `delete_webhook`	3
Audit	`audit_query`, `audit_verify`	2

Contract layer total: ~85 test files. The per-family counts above sum to 86 because get_clock is listed under both Lifecycle and Clock.

This layer is the load-bearing proof that v2's WS client speaks the broker's v1 protocol unchanged. If any of these tests fail, v1 users running v2 against production brokers will experience silent misbehavior.

5. Layer 3 — MCP tool handler tests

One test file per MCP tool from inventory §2 (79 tools). Each file:

Invokes the tool through v2's MCP server with a fixture input
Captures the WS message v2 sends to the broker
Captures the same request through v1's MCP server
Asserts both produce identical WS envelopes and identical return values

File layout mirrors v2's src/mcp/tools/:

apps/cli-v2/tests/mcp-tools/
├── memory/
│   ├── remember.test.ts
│   ├── recall.test.ts
│   └── forget.test.ts
├── state/
│   ├── set-state.test.ts
│   ├── get-state.test.ts
│   └── list-state.test.ts
├── messaging/
│   ├── send-message.test.ts
│   ├── list-peers.test.ts
│   ├── check-messages.test.ts
│   └── message-status.test.ts
├── profile/
│   ├── set-profile.test.ts
│   ├── set-status.test.ts
│   ├── set-summary.test.ts
│   └── set-visible.test.ts
├── groups/
│   ├── join-group.test.ts
│   └── leave-group.test.ts
├── files/
│   ├── share-file.test.ts
│   ├── get-file.test.ts
│   ├── list-files.test.ts
│   ├── file-status.test.ts
│   ├── delete-file.test.ts
│   ├── grant-file-access.test.ts
│   ├── read-peer-file.test.ts
│   └── list-peer-files.test.ts
├── vectors/
│   ├── vector-store.test.ts
│   ├── vector-search.test.ts
│   ├── vector-delete.test.ts
│   └── list-collections.test.ts
├── graph/
│   ├── graph-query.test.ts
│   └── graph-execute.test.ts
├── sql/
│   ├── mesh-query.test.ts
│   ├── mesh-execute.test.ts
│   └── mesh-schema.test.ts
├── streams/
│   ├── create-stream.test.ts
│   ├── publish.test.ts
│   ├── subscribe.test.ts
│   └── list-streams.test.ts
├── contexts/
│   ├── share-context.test.ts
│   ├── get-context.test.ts
│   └── list-contexts.test.ts
├── tasks/
│   ├── create-task.test.ts
│   ├── claim-task.test.ts
│   ├── complete-task.test.ts
│   └── list-tasks.test.ts
├── scheduling/
│   ├── schedule-reminder.test.ts
│   ├── list-scheduled.test.ts
│   └── cancel-scheduled.test.ts
├── metadata/
│   ├── mesh-info.test.ts
│   ├── mesh-stats.test.ts
│   ├── mesh-clock.test.ts
│   └── ping-mesh.test.ts
├── clock-write/
│   ├── mesh-set-clock.test.ts
│   ├── mesh-pause-clock.test.ts
│   └── mesh-resume-clock.test.ts
├── skills/
│   ├── share-skill.test.ts
│   ├── get-skill.test.ts
│   ├── list-skills.test.ts
│   ├── remove-skill.test.ts
│   └── mesh-skill-deploy.test.ts
├── mcp-registry-tier1/
│   ├── mesh-mcp-register.test.ts
│   ├── mesh-mcp-list.test.ts
│   ├── mesh-tool-call.test.ts
│   └── mesh-mcp-remove.test.ts
├── mcp-registry-tier2/
│   ├── mesh-mcp-deploy.test.ts
│   ├── mesh-mcp-undeploy.test.ts
│   ├── mesh-mcp-update.test.ts
│   ├── mesh-mcp-logs.test.ts
│   ├── mesh-mcp-scope.test.ts
│   ├── mesh-mcp-schema.test.ts
│   └── mesh-mcp-catalog.test.ts
├── vault/
│   ├── vault-set.test.ts
│   ├── vault-list.test.ts
│   └── vault-delete.test.ts
├── url-watch/
│   ├── mesh-watch.test.ts
│   ├── mesh-unwatch.test.ts
│   └── mesh-watches.test.ts
└── webhooks/
    ├── create-webhook.test.ts
    ├── list-webhooks.test.ts
    └── delete-webhook.test.ts

MCP layer total: 79 test files.

5.1 MCP handler test pattern

// tests/mcp-tools/memory/remember.test.ts
import { describe, it, expect } from 'bun:test';
import { v1McpServer, v2McpServer } from '@/tests/helpers';
import { normalize } from '@/tests/helpers/wire-capture';

describe('MCP tool: remember (parity)', () => {
  it('v1 and v2 produce identical WS envelopes', async () => {
    const input = {
      content: 'Payments API rate-limits at 100 req/s after March incident',
      tags: ['payments', 'rate-limit'],
    };

    const v1Sent = await v1McpServer.invokeAndCapture('remember', input);
    const v2Sent = await v2McpServer.invokeAndCapture('remember', input);

    // Strip nonce, timestamp, messageId and _reqId (see §4.2) before comparing.
    expect(normalize(v2Sent)).toEqual(normalize(v1Sent));
  });

  it('v1 and v2 return identical tool results', async () => {
    const input = { content: 'test memory', tags: [] };
    const v1Result = await v1McpServer.invoke('remember', input);
    const v2Result = await v2McpServer.invoke('remember', input);
    expect(v2Result).toEqual(v1Result);
  });

  it('v2 rejects invalid input via Zod with same error shape as v1', async () => {
    const invalidInput = { content: '', tags: 'not-an-array' };
    // Capture both rejections so the v2 error can be compared against v1.
    const v1Error = await v1McpServer.invoke('remember', invalidInput).catch((e: unknown) => e);
    const v2Error = await v2McpServer.invoke('remember', invalidInput).catch((e: unknown) => e);
    expect(v2Error).toBeInstanceOf(Error);
    expect(v2Error).toMatchObject({
      name: (v1Error as Error).name,
      message: (v1Error as Error).message,
    });
  });
});

6. Layer 4 — End-to-end smoke tests

Full journey tests against a real broker in a Docker sandbox. Each test spins up:

A fresh Postgres + Neo4j + Qdrant + MinIO + broker stack (via testcontainers)
One or more v2 CLI instances as subprocesses
Optionally a Claude Code mock to simulate MCP client interactions

These are slower than layers 1–3 but catch integration bugs that unit-level parity tests miss.

File layout:

apps/cli-v2/tests/e2e/
├── first-run/
│   ├── install-and-join.test.ts
│   ├── fresh-mesh-create.test.ts
│   └── launch-with-mesh.test.ts
├── messaging/
│   ├── two-peer-send-receive.test.ts
│   ├── broadcast-to-group.test.ts
│   └── offline-queue-drain.test.ts
├── files/
│   ├── upload-download-small.test.ts
│   ├── upload-download-large.test.ts
│   └── e2e-encrypted-share.test.ts
├── state-and-memory/
│   ├── state-across-peers.test.ts
│   ├── memory-full-text-search.test.ts
│   └── context-share-and-query.test.ts
├── tasks/
│   ├── create-claim-complete.test.ts
│   └── list-by-status.test.ts
├── backends/
│   ├── shared-sql-roundtrip.test.ts
│   ├── neo4j-graph-roundtrip.test.ts
│   ├── qdrant-vector-roundtrip.test.ts
│   └── minio-file-roundtrip.test.ts
├── mcp-registry/
│   ├── deploy-catalog-entry.test.ts
│   ├── call-deployed-tool.test.ts
│   └── undeploy-cleanup.test.ts
├── telegram/
│   ├── connect-and-route.test.ts
│   └── disconnect-cleanup.test.ts
├── dashboard-sync/
│   └── browser-flow.test.ts
├── scheduled/
│   ├── cron-reminder-fires.test.ts
│   ├── url-watch-detects-change.test.ts
│   └── persist-across-restart.test.ts
├── webhooks/
│   └── inbound-post-becomes-message.test.ts
└── journey/
    ├── full-user-journey.test.ts         (install → join → send → file → logout)
    ├── connector-journey.test.ts         (deploy mcp → call tool → undeploy)
    └── skill-sharing-journey.test.ts     (share_skill → teammate loads → invoke)

E2E layer total: 31 test files in the layout above.

6.1 E2E test harness

Each e2e file uses a shared harness:

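import { describe, it, expect, beforeAll, afterAll } from 'bun:test';
// BrokerHandle and CliHandle are assumed to be exported by the same helper module.
import { startFreshBroker, stopBroker, spawnCli, type BrokerHandle, type CliHandle } from '@/tests/helpers/e2e';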

describe('e2e: two-peer send-receive', () => {
  let broker: BrokerHandle;
  let alice: CliHandle;
  let bob: CliHandle;

  beforeAll(async () => {
    broker = await startFreshBroker();
    alice = await spawnCli({ broker, displayName: 'Alice' });
    bob = await spawnCli({ broker, displayName: 'Bob' });
    await alice.join(broker.seedInvite);
    await bob.join(broker.seedInvite);
  });

  afterAll(async () => {
    await alice.shutdown();
    await bob.shutdown();
    await stopBroker(broker);
  });

  it('alice sends to bob; bob receives via inbox', async () => {
    await alice.send({ to: 'Bob', message: 'hello' });
    const inbox = await bob.inbox();
    expect(inbox).toHaveLength(1);
    expect(inbox[0].plaintext).toBe('hello');
  });
});

6.2 Testcontainers vs local broker

Two modes, switchable via env var:

E2E_BROKER=docker — spins up a fresh broker + all backends via testcontainers. Slow (~30s per test) but hermetic.
E2E_BROKER=local — connects to a running local broker (ic.claudemesh.com or localhost:8787). Fast but requires manual setup.

CI uses docker mode. Dev iteration uses local mode.
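The mode switch can be resolved once at harness start-up. The sketch below uses a hypothetical `resolveBrokerMode` function; defaulting to `docker` when the variable is unset is an assumption, chosen because that is the hermetic mode CI uses.

```typescript
type BrokerMode = 'docker' | 'local';

// Reads E2E_BROKER from an env-like map; pass process.env in real use.
function resolveBrokerMode(env: Record<string, string | undefined>): BrokerMode {
  const raw = env['E2E_BROKER'] ?? 'docker'; // assumed default, not stated in this plan
  if (raw === 'docker' || raw === 'local') return raw;
  throw new Error(`E2E_BROKER must be "docker" or "local", got "${raw}"`);
}
```

Failing loudly on an unknown value avoids a typo silently falling back to the slow path or, worse, to a shared broker.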

7. Layer 5 — JSON output golden tests

--json output is the stable contract for script consumers. These tests lock the shape and fields.

File layout:

apps/cli-v2/tests/golden/
├── list-json.test.ts
├── peers-json.test.ts
├── info-json.test.ts
├── inbox-json.test.ts
├── state-get-json.test.ts
├── state-list-json.test.ts
├── remember-json.test.ts
├── recall-json.test.ts
├── remind-json.test.ts
├── profile-json.test.ts
├── mcp-info-json.test.ts
└── mcp-stats-json.test.ts

7.1 Golden test pattern

// tests/golden/list-json.test.ts
import { describe, it, expect } from 'bun:test';
import { runV2Cli } from '@/tests/helpers';
import { listJsonShape } from '@/tests/fixtures/golden/list.json';

describe('golden: claudemesh list --json', () => {
  it('output shape matches locked schema', async () => {
    const output = JSON.parse(await runV2Cli(['list', '--json']));
    expect(output).toMatchObject(listJsonShape);
  });

  it('includes schema_version field', async () => {
    const output = JSON.parse(await runV2Cli(['list', '--json']));
    expect(output.schema_version).toMatch(/^\d+\.\d+$/);
  });

  it('mesh entries have all v1 fields', async () => {
    const output = JSON.parse(await runV2Cli(['list', '--json']));
    for (const mesh of output.meshes) {
      expect(mesh).toHaveProperty('slug');
      expect(mesh).toHaveProperty('name');
      expect(mesh).toHaveProperty('kind');
      expect(mesh).toHaveProperty('brokerUrl');
      expect(mesh).toHaveProperty('memberCount');
    }
  });
});

Golden fixtures live in tests/fixtures/golden/*.json and are captured from v1 CLI runs the first time the test is written, then locked.
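One way to lock shape rather than values is to reduce a captured output to a type skeleton and store that as the fixture. This is a different technique from the `toMatchObject` pattern above, offered as a sketch; `shapeOf` is a hypothetical helper, and it assumes arrays are homogeneous, so only the first element is sampled.

```typescript
type Shape = string | Shape[] | { [key: string]: Shape };

// Maps a JSON value to its structural skeleton: leaves become their typeof name.
function shapeOf(value: unknown): Shape {
  if (value === null) return 'null';
  if (Array.isArray(value)) {
    // Assumes homogeneous arrays: sample the first element only.
    return value.length > 0 ? [shapeOf(value[0])] : [];
  }
  if (typeof value === 'object') {
    const out: { [key: string]: Shape } = {};
    for (const [key, child] of Object.entries(value as Record<string, unknown>)) {
      out[key] = shapeOf(child);
    }
    return out;
  }
  return typeof value; // 'string', 'number', 'boolean', ...
}
```

Comparing `shapeOf(v2Output)` against a skeleton captured from v1 catches renamed or retyped fields without failing on values that legitimately differ between runs, such as member counts.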

Golden layer total: ~12 test files.

8. Layer 6 — Facade unit tests

Per-service colocated tests that verify the facade contract:

Every facade function validates input with Zod
Every output type passes the boundary scanner (no token, api_key, password, path-like patterns)
Error mapping via toDomainError preserves cause + logs unmapped errors
Never exposes class instances, DB connections, or raw HTTP responses
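The error-mapping rule in the list above can be sketched as follows. The `DomainError` shape, the `UNKNOWN` fallback code, and the name-to-code lookup are assumptions for illustration; only the behavior (preserve the cause, log anything unmapped) comes from this plan.

```typescript
// Hypothetical plain-object error shape returned across the facade boundary.
interface DomainError {
  code: string;
  message: string;
  cause: unknown; // the original error, preserved for debugging
}

function toDomainError(
  err: unknown,
  codesByName: Record<string, string>, // e.g. { ZodError: 'AUTH_INVALID_TOKEN' }
  logError: (message: string, context: object) => void,
): DomainError {
  const name = err instanceof Error ? err.name : 'NonError';
  const message = err instanceof Error ? err.message : String(err);
  const code = codesByName[name];
  if (code === undefined) {
    // Unmapped errors are still returned, but never silently.
    logError('unmapped error in facade', { name, message });
    return { code: 'UNKNOWN', message, cause: err };
  }
  return { code, message, cause: err };
}
```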

File layout (colocated with services):

apps/cli-v2/src/services/
├── auth/
│   └── facade.test.ts
├── mesh/
│   └── facade.test.ts
├── invite/
│   └── facade.test.ts
├── broker/
│   └── facade.test.ts
├── api/
│   └── facade.test.ts
├── crypto/
│   └── facade.test.ts
├── store/
│   └── facade.test.ts
├── config/
│   └── facade.test.ts
├── state/
│   └── facade.test.ts
├── device/
│   └── facade.test.ts
├── clipboard/
│   └── facade.test.ts
├── spawn/
│   └── facade.test.ts
├── telemetry/
│   └── facade.test.ts
├── health/
│   └── facade.test.ts
├── update/
│   └── facade.test.ts
├── i18n/
│   └── facade.test.ts
└── lifecycle/
    └── facade.test.ts

Plus one global test at tests/unit/facade-boundary-scan.test.ts that walks every facade and asserts no output type contains forbidden keys (AST-based via ts-morph, per facade-pattern spec §10.2).

Facade layer total: ~17 facade-specific test files + 1 global scanner.
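The global scanner is AST-based, per the spec. As a complementary illustration only, a runtime check over an already-serialized facade output might look like the sketch below. The function name `findBoundaryViolations`, the exact key list, and the treatment of "path-like" as a check on string values are assumptions for this example, not part of the facade-pattern spec.

```typescript
// Hypothetical runtime counterpart to the AST-based boundary scanner.
// The key list mirrors the patterns named in section 8; the authoritative list lives in the spec.
const FORBIDDEN_KEY = /^(token|api_key|password)$/i;
// Assumption: "path-like" means a string value that starts like a filesystem path.
const PATH_LIKE = /^(~|\/|[A-Za-z]:\\)/;

export function findBoundaryViolations(value: unknown, trail = '$'): string[] {
  if (typeof value === 'string') {
    return PATH_LIKE.test(value) ? [`${trail}: path-like value`] : [];
  }
  if (value === null || typeof value !== 'object') return [];

  const violations: string[] = [];
  // Object.entries also walks arrays, using the index as the key.
  for (const [key, child] of Object.entries(value as Record<string, unknown>)) {
    const here = `${trail}.${key}`;
    if (FORBIDDEN_KEY.test(key)) violations.push(`${here}: forbidden key`);
    violations.push(...findBoundaryViolations(child, here));
  }
  return violations;
}
```

A facade test could assert that `findBoundaryViolations(result)` is empty for every public function, which catches leaks that only appear at runtime (for example, a spread of a raw service object).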

8.1 Facade test pattern

// services/auth/facade.test.ts
import { describe, it, expect, vi } from 'vitest';
import * as facade from './facade';
import { getAuthService } from './index';

vi.mock('./index');

describe('auth facade contract', () => {
  it('loginWithDeviceCode rejects leaked token in output', async () => {
    vi.mocked(getAuthService).mockReturnValue({
      startDeviceCodeFlow: vi.fn().mockResolvedValue({
        user: { id: 'u1', display_name: 'Alejandro', email: 'a@b.c' },
        token: 'cm_session_SECRET',
        raw_response: { headers: {} },
      }),
    } as any);

    const result = await facade.loginWithDeviceCode();
    const serialized = JSON.stringify(result);

    expect(serialized).not.toContain('cm_session_');
    expect(serialized).not.toContain('SECRET');
    expect(serialized).not.toContain('raw_response');
  });

  it('loginWithToken validates input with Zod', async () => {
    await expect(facade.loginWithToken({ token: 'malformed' })).rejects.toMatchObject({
      code: 'AUTH_INVALID_TOKEN',
    });
  });

  it('whoAmI never throws', async () => {
    vi.mocked(getAuthService).mockReturnValue({
      getCurrentState: vi.fn().mockRejectedValue(new Error('boom')),
    } as any);
    await expect(facade.whoAmI()).resolves.toBeDefined();
  });

  it('toDomainError logs unmapped errors', async () => {
    const logSpy = vi.fn();
    vi.mocked(getAuthService).mockReturnValue({
      logout: vi.fn().mockRejectedValue(new TypeError('null pointer')),
      logger: { error: logSpy },
    } as any);

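    // The facade may rethrow a mapped domain error; only the logging side effect matters here.
    await facade.logout().catch(() => undefined);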
    expect(logSpy).toHaveBeenCalledWith(
      expect.stringContaining('unmapped error'),
      expect.any(Object),
    );
  });
});

9. Layer 7 — Port-forwarded v1 tests

v1's existing tests cover crypto primitives and broker correctness. They all apply to v2 as-is because:

v2 uses the same crypto primitives (Ed25519, NaCl crypto_box, AES-GCM)
v2 talks to the same broker, so broker tests run unchanged
v2 parses the same invite formats

9.1 Tests to port forward

| v1 test | Forward-port path | Notes |
| --- | --- | --- |
| `apps/cli/src/__tests__/crypto-roundtrip.test.ts` | `apps/cli-v2/tests/unit/crypto-roundtrip.test.ts` | Direct copy; uses shared crypto primitives |
| `apps/cli/src/__tests__/invite-parse.test.ts` | `apps/cli-v2/tests/unit/invite-parse.test.ts` | Direct copy; v2 parses same v1 + v2 invite formats |
| `apps/broker/tests/broker.test.ts` | N/A — stays in broker | Broker unchanged |
| `apps/broker/tests/invite-signature.test.ts` | N/A — stays in broker | Broker unchanged |
| `apps/broker/tests/invite-v2.test.ts` | N/A — stays in broker | Broker unchanged |
| `apps/broker/tests/hello-signature.test.ts` | N/A — stays in broker | Broker unchanged |
| `apps/broker/tests/rate-limit.test.ts` | N/A — stays in broker | Broker unchanged |
| `apps/broker/tests/encoding.test.ts` | N/A — stays in broker | Broker unchanged |
| `apps/broker/tests/dup-delivery.test.ts` | N/A — stays in broker | Broker unchanged |
| `apps/broker/tests/metrics.test.ts` | N/A — stays in broker | Broker unchanged |
| `apps/broker/tests/logging.test.ts` | N/A — stays in broker | Broker unchanged |
| `apps/broker/tests/integration/health.test.ts` | N/A — stays in broker | Broker unchanged |

Port-forward layer: 2 CLI tests copied, ~10 broker tests remain in place.

10. Test helper infrastructure

Shared helpers under apps/cli-v2/tests/helpers/:

10.1 `v1-runner.ts`

Spawns the v1 CLI as a subprocess and captures its output:

export async function runV1Cli(args: string[], opts?: RunOpts): Promise<CliResult>;
export async function v1Send(args: SendArgs): Promise<string>;  // returns messageId
export async function v1Join(inviteUrl: string): Promise<void>;
export async function v1Install(): Promise<void>;
// ... one helper per v1 command

Uses the installed claudemesh binary from apps/cli/ (v1). Tests assume v1 is available at ../../cli/dist/index.js or via npx claudemesh@0.10.5.
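A minimal sketch of the subprocess runner that `runV1Cli` could wrap is shown below. The plan names `CliResult` and `RunOpts` without defining them, so the field names here are assumptions, as is the generic `runCli` helper itself.

```typescript
import { spawn } from 'node:child_process';

// Assumed shapes: the plan names these types but does not define them.
export interface CliResult {
  stdout: string;
  stderr: string;
  exitCode: number | null;
}
export interface RunOpts {
  env?: Record<string, string>;
  cwd?: string;
}

// Runs `<current runtime> <entry> ...args` and resolves once the process closes.
export function runCli(entry: string, args: string[], opts: RunOpts = {}): Promise<CliResult> {
  return new Promise((resolve, reject) => {
    const child = spawn(process.execPath, [entry, ...args], {
      cwd: opts.cwd,
      env: { ...process.env, ...opts.env },
    });
    let stdout = '';
    let stderr = '';
    child.stdout.on('data', (chunk) => { stdout += chunk; });
    child.stderr.on('data', (chunk) => { stderr += chunk; });
    child.on('error', reject); // spawn failure, e.g. missing executable
    child.on('close', (exitCode) => resolve({ stdout, stderr, exitCode }));
  });
}
```

Under these assumptions, `runV1Cli(args, opts)` would reduce to `runCli('../../cli/dist/index.js', args, opts)`. Resolving on a non-zero exit code rather than rejecting lets parity tests compare failure behavior between v1 and v2 as well as success output.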

10.2 `v2-runner.ts`

Same interface, but spawns v2:

export async function runV2Cli(args: string[], opts?: RunOpts): Promise<CliResult>;
// ... matching v1 helper surface

Uses apps/cli-v2/dist/entrypoints/cli.js.

10.3 `wire-capture.ts`

Intercepts WS messages by routing both v1 and v2 CLIs through a test proxy that records envelopes:

export async function captureV1Envelope(op: string, input: any): Promise<WsEnvelope>;
export async function captureV2Envelope(op: string, input: any): Promise<WsEnvelope>;
export function normalize(env: WsEnvelope): WsEnvelope;

The proxy runs on a local port, the CLI's broker URL is set to ws://localhost:<port>/ws, and the proxy logs every message before forwarding to a real test broker.

10.4 `mock-broker.ts`

In-memory broker for unit tests. Implements enough of the WS protocol to test CLI-side behavior without a real database stack.

Handles:

Hello + authentication (signature check skipped in test mode)
Echo back ack for every client message
In-memory state for state_kv, memory, tasks
Configurable response fixtures for WS ops
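The behaviors listed above can be sketched as a transport-free handler that a WS server wraps. The message shapes (`id`, `replyTo`, `memberId`, `get_state`) are illustrative assumptions, not the real broker protocol; only the op names `hello_ack` and `set_state` appear elsewhere in this plan.

```typescript
// Illustrative message shapes; the real envelopes are defined by the broker protocol.
type ClientMsg =
  | { type: 'hello'; id: string; memberId: string }
  | { type: 'set_state'; id: string; key: string; value: unknown }
  | { type: 'get_state'; id: string; key: string };

type ServerMsg =
  | { type: 'hello_ack'; replyTo: string }
  | { type: 'ack'; replyTo: string }
  | { type: 'state'; replyTo: string; key: string; value: unknown };

export class MockBroker {
  private stateKv = new Map<string, unknown>();
  private fixtures = new Map<ClientMsg['type'], ServerMsg>();

  // Lets a test pin the response for a given op.
  setFixture(op: ClientMsg['type'], response: ServerMsg): void {
    this.fixtures.set(op, response);
  }

  handle(msg: ClientMsg): ServerMsg {
    const fixture = this.fixtures.get(msg.type);
    if (fixture) return fixture;

    switch (msg.type) {
      case 'hello':
        // Test mode: no signature verification.
        return { type: 'hello_ack', replyTo: msg.id };
      case 'set_state':
        this.stateKv.set(msg.key, msg.value);
        return { type: 'ack', replyTo: msg.id };
      case 'get_state':
        return { type: 'state', replyTo: msg.id, key: msg.key, value: this.stateKv.get(msg.key) };
    }
  }
}
```

Keeping the handler separate from the socket layer means most unit tests can call `handle()` directly and skip opening a port.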

10.5 `real-broker.ts` (for e2e)

Spins up a real broker + Postgres + Neo4j + Qdrant + MinIO stack via testcontainers:

export async function startFreshBroker(opts?: BrokerOpts): Promise<BrokerHandle>;
export async function stopBroker(handle: BrokerHandle): Promise<void>;

The stack is pre-configured with a seed mesh + fixture users + a known invite URL. Tests use these as starting state.

10.6 `temp-home.ts`

Creates an isolated ~/.claudemesh/ for each test:

export async function tempHome(fn: (homeDir: string) => Promise<void>): Promise<void>;

Cleans up on completion. Prevents tests from interfering with the developer's real claudemesh config.
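One possible implementation, assuming the CLI resolves `~/.claudemesh/` from the `HOME` environment variable (the plan does not say how the directory is located, and the `claudemesh-test-` prefix is made up for this sketch):

```typescript
import { mkdtemp, rm } from 'node:fs/promises';
import { tmpdir } from 'node:os';
import { join } from 'node:path';

export async function tempHome(fn: (homeDir: string) => Promise<void>): Promise<void> {
  const homeDir = await mkdtemp(join(tmpdir(), 'claudemesh-test-'));
  const previousHome = process.env.HOME;
  // Assumption: the CLI derives ~/.claudemesh/ from HOME.
  process.env.HOME = homeDir;
  try {
    await fn(homeDir);
  } finally {
    // Restore the environment and delete the directory even if the test body throws.
    if (previousHome === undefined) delete process.env.HOME;
    else process.env.HOME = previousHome;
    await rm(homeDir, { recursive: true, force: true });
  }
}
```

Because this mutates `process.env`, tests that share a process must not run `tempHome` concurrently; passing the directory to the spawned CLI through its own `env` option avoids that constraint.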

10.7 `ink-render.ts`

Snapshots Ink screens for UI tests (per cli-v2-ux-design.md §12.1):

export async function renderScreen(Component: any, props?: any): Promise<string>;
export async function waitForText(frame: () => string, text: string, timeoutMs?: number): Promise<void>;
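A minimal polling sketch of `waitForText` follows. The 1000 ms default timeout and 10 ms poll interval are assumptions chosen for the example; the UX design spec may set different values.

```typescript
export async function waitForText(
  frame: () => string,
  text: string,
  timeoutMs = 1000, // assumed default
): Promise<void> {
  const deadline = Date.now() + timeoutMs;
  // Re-read the latest frame until it contains the text or the deadline passes.
  while (!frame().includes(text)) {
    if (Date.now() >= deadline) {
      throw new Error(`Timed out after ${timeoutMs}ms waiting for "${text}"`);
    }
    await new Promise((resolve) => setTimeout(resolve, 10));
  }
}
```

Taking `frame` as a function rather than a string lets the helper observe re-renders, since each poll reads whatever the renderer last produced.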

10.8 `sqlite-fixture.ts`

(Pass 1 only needs this if we add any local caching — most of Pass 1 won't touch SQLite since the local-first work is Pass 2.)

11. Shared fixtures

Under apps/cli-v2/tests/fixtures/:

fixtures/
├── auth/
│   ├── valid-session-token.json       # sample cm_session_... token
│   ├── valid-pat.json                 # sample cm_pat_... token
│   ├── expired-token.json
│   └── malformed-token.json
├── meshes/
│   ├── sample-personal-mesh.json
│   ├── sample-shared-mesh.json
│   └── sample-guest-mesh.json
├── invites/
│   ├── v1-invite-url.txt
│   ├── v2-invite-url.txt
│   ├── expired-invite-url.txt
│   └── malformed-invite-url.txt
├── wire/
│   ├── v1-envelopes/                  # captured v1 WS envelopes, one JSON file per op
│   │   ├── send.json
│   │   ├── set_state.json
│   │   ├── remember.json
│   │   └── ... (85 files)
│   └── broker-responses/              # captured v1 broker responses
│       ├── hello_ack.json
│       ├── peers_list.json
│       └── ...
├── golden/
│   ├── list-json.json                 # expected JSON output shape
│   ├── peers-json.json
│   ├── info-json.json
│   └── ... (12 files)
├── telegram/
│   ├── sample-bot-token.json
│   ├── sample-inbound-update.json     # Telegram webhook payload
│   └── expected-routed-message.json
└── mcp-tool-inputs/
    ├── memory-remember.json           # one fixture input per tool
    ├── memory-recall.json
    └── ... (79 files, one per tool)

Total fixture count: ~180 files. Most are small JSON snippets captured once from v1 runs and locked.

12. Execution order + dependencies

Tests run in parallel where possible, but some layers depend on others:

┌─────────────────────────────────────┐
│ Layer 7: port-forwarded v1 tests   │  ← no dependencies, runs first
│ Layer 6: facade unit tests          │  ← depends on v2 services existing
└─────────────────────────────────────┘
                ↓
┌─────────────────────────────────────┐
│ Layer 2: WS contract tests          │  ← depends on wire fixtures captured
│ Layer 3: MCP tool handler tests     │  ← depends on v1 + v2 MCP servers
│ Layer 5: JSON golden tests          │  ← depends on v1 + v2 CLI built
└─────────────────────────────────────┘
                ↓
┌─────────────────────────────────────┐
│ Layer 1: parity tests               │  ← depends on mock-broker + helpers
└─────────────────────────────────────┘
                ↓
┌─────────────────────────────────────┐
│ Layer 4: e2e smoke tests            │  ← depends on real broker + testcontainers
└─────────────────────────────────────┘

Layers 1, 2, 3, 5, 6, 7 run on every PR. Layer 4 runs on main merges + release candidates (slower).

13. CI integration

13.1 PR-level pipeline

jobs:
  lint-and-typecheck:
    - biome check
    - eslint (boundaries + 3 custom rules)
    - tsc --noEmit
    - dependency-cruiser

  unit-tests:
    - bun test tests/unit/                    # facade layer
    - bun test src/services/**/*.test.ts      # colocated facade tests
    - bun test tests/golden/                   # JSON shape

  parity-tests:
    - bun test tests/parity/
    needs: [v1-cli-available, v2-cli-built]

  contract-tests:
    - bun test tests/contract/
    needs: [wire-fixtures-available]

  mcp-tool-tests:
    - bun test tests/mcp-tools/

  port-forward-tests:
    - bun test tests/unit/crypto-roundtrip.test.ts
    - bun test tests/unit/invite-parse.test.ts

13.2 Release-candidate pipeline

Adds layer 4:

jobs:
  e2e-docker:
    - E2E_BROKER=docker bun test tests/e2e/
    timeout-minutes: 60

13.3 Coverage gates

Unit tests: ≥ 80% branch coverage on src/services/**/*.ts (excluding services/broker/*)
Parity tests: 100% of inventory §12 checks mapped to at least one passing test
Contract tests: 100% of 85 WS message types have at least one passing test
MCP tool tests: 100% of 79 tools have at least one passing test
E2E tests: all 28 journey tests passing on main

If any gate fails, the PR cannot merge.
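The gates above reduce to a ratio check per layer. The sketch below shows one way a CI step might evaluate them; the `Gate` shape and `failedGates` name are hypothetical, and the covered/total numbers would come from the coverage reporter and test manifests.

```typescript
// Hypothetical gate description: `covered` of `total` items must reach `minRatio`.
export interface Gate {
  name: string;
  covered: number;
  total: number;
  minRatio: number;
}

// Returns one human-readable line per failing gate; an empty array means the PR may merge.
export function failedGates(gates: Gate[]): string[] {
  return gates
    .filter((g) => g.total === 0 || g.covered / g.total < g.minRatio)
    .map((g) => `${g.name}: ${g.covered}/${g.total} is below ${Math.round(g.minRatio * 100)}%`);
}
```

For example, `{ name: 'contract', covered: 84, total: 85, minRatio: 1 }` fails, while `{ name: 'unit-branches', covered: 410, total: 500, minRatio: 0.8 }` passes. Treating `total === 0` as a failure guards against a gate that passes only because nothing was measured.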

14. Success criteria (the ship checklist)

v2 Pass 1 ships when ALL of these are green:

Layer 1 parity: 70 test files, every inventory §12 regression check has at least one passing parity test
Layer 2 contract: 85 test files, every broker WS message type has a contract test passing against captured v1 envelopes
Layer 3 MCP tools: 79 test files, every MCP tool handler produces identical WS output between v1 and v2
Layer 4 e2e: 28 journey tests pass against a real broker in Docker
Layer 5 golden: 12 JSON output tests pass, schema_version field present and stable
Layer 6 facade: 17 service facade test files pass + 1 global boundary scanner (AST-based, no false positives)
Layer 7 port-forward: 2 forwarded v1 CLI tests pass + broker test suite unchanged and green
Coverage gates: all thresholds met
Zero new regressions: any previously-passing test that starts failing must be fixed before merge (no skipping, no .todo)

Total test files in v2 Pass 1: 294 (70 + 85 + 79 + 28 + 12 + 18 + 2).

That's a lot. Most are template-driven — one helper + one fixture + one pattern = many tests. The scaffolding pass creates the files with NotImplementedError stubs, and the implementation pass fills them in.

No time estimate. It's done when the checklist is green.

15. Scaffolding implications

When v2 is re-scaffolded, the test infrastructure ships alongside the source:

tests/helpers/ with v1-runner, v2-runner, wire-capture, mock-broker, real-broker, temp-home, ink-render as stubs with NotImplementedError
tests/fixtures/ with directory structure and placeholder JSON files
Every test file in layers 1–6 scaffolded with its describe blocks, imports, and fixture references — but the body is throw new NotImplementedError('<test name>')
A CI job that counts test files vs expected counts and fails if any are missing
A script tests/helpers/capture-v1-fixtures.ts that runs v1 once against a test broker to generate the wire fixtures
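The file-count job could be as small as the comparison sketched below. The per-layer counts come from the ship checklist in §14; the layer keys and the `missingTestFiles` function are assumptions, and gathering the actual counts (for example by globbing `tests/parity/`) is left to the caller.

```typescript
// Expected test-file counts per layer, taken from the ship checklist.
export const EXPECTED_COUNTS: Record<string, number> = {
  parity: 70,
  contract: 85,
  'mcp-tools': 79,
  e2e: 28,
  golden: 12,
  facade: 18, // 17 service facades + 1 global boundary scanner
  'port-forward': 2,
};

// Returns one line per layer that has fewer files than expected.
export function missingTestFiles(actual: Record<string, number>): string[] {
  return Object.entries(EXPECTED_COUNTS)
    .filter(([layer, expected]) => (actual[layer] ?? 0) < expected)
    .map(([layer, expected]) => `${layer}: found ${actual[layer] ?? 0}, expected ${expected}`);
}
```

The CI job would exit non-zero whenever `missingTestFiles` returns a non-empty list, so a scaffolded stub that gets deleted instead of implemented is caught immediately.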

The scaffold pass adds ~300 test files. The implementation pass replaces the NotImplementedError bodies with real assertions, one file at a time, driven by the priority order below.

The implementation pass is organized by test layer, not by feature:

Scaffold everything (layers 1–6) with stubs
Implement layer 7 (port-forwarded tests) — fastest wins, establishes test harness
Implement layer 6 (facade units) — validates each service as it's written
Implement layer 5 (golden JSON) — locks output shapes early
Implement layer 2 (WS contract) — proves wire compatibility
Implement layer 3 (MCP tool handlers) — proves tool dispatch
Implement layer 1 (parity) — full behavioral equivalence
Implement layer 4 (e2e) — end-to-end sanity

When every layer is green, ending with the layer 4 e2e suite, v2 Pass 1 is shippable.

16. What this plan does NOT cover

Explicitly out of scope for Pass 1 testing:

Broker-side tests — broker is unchanged in Pass 1; broker's own test suite runs unchanged
Performance regression tests — v2 shouldn't be slower than v1, but quantifying that is Pass 2 work with bench tests
Security audit — the spec-level security improvements (role-per-mesh Postgres, egress proxies, SSRF policies) are Pass 2
Accessibility audit — the testable a11y matrix is Pass 2 (requires the VoiceOver shim which is Pass 2)
Load tests — 10k concurrent peers, sustained message throughput — deferred
Chaos tests — broker restart mid-operation, network partition recovery — deferred to local-first Pass 2
Cross-platform tests on Windows — v2 Pass 1 targets macOS + Linux; Windows support is best-effort, tested on release candidates only

Pass 2 adds those layers when the corresponding features ship.

End of plan.
