Commit Graph

5 Commits

Author SHA1 Message Date
Alejandro Gutiérrez
05729ad8a4 feat(ga): close remaining GA blockers (backcompat, HA prep, tests, docs)
Some checks failed
CI / Lint (push) Has been cancelled
CI / Typecheck (push) Has been cancelled
CI / Broker tests (Postgres) (push) Has been cancelled
CI / Docker build (linux/amd64) (push) Has been cancelled
Backwards compat shim (task 27)
- requireCliAuth() falls back to body.user_id when BROKER_LEGACY_AUTH=1
  and no bearer present. Sets Deprecation + Warning headers + bumps a
  broker_legacy_auth_hits_total metric so operators can watch the
  legacy traffic drain to 0 before removing the shim.
- All handlers parse body BEFORE requireCliAuth so the fallback can
  read user_id out of it.

HA readiness (task 29)
- .artifacts/specs/2026-04-15-broker-ha-statelessness-audit.md
  documents every in-memory symbol and rollout plan (phase 0-4).
- packaging/docker-compose.ha-local.yml spins up 2 broker replicas
  behind Traefik sticky sessions for local smoke testing.
- apps/broker/src/audit.ts now wraps writes in a transaction that
  takes pg_advisory_xact_lock(meshId) and re-reads the tail hash
  inside the txn. Concurrent broker replicas can no longer fork the
  audit chain.

Deploy gate (task 30)
- /health stays permissive (200 even on transient DB blips) so
  Docker doesn't kill the container on a glitch.
- New /health/ready checks DB + optional EXPECTED_MIGRATION pin,
  returns 503 if either fails. External deploy gate can poll this
  and refuse to promote a broken deploy.

Metrics dashboard (task 32)
- packaging/grafana/claudemesh-broker.json: ready-to-import Grafana
  dashboard covering active conns, queue depth, routed/rejected
  rates, grant drops, legacy-auth hits, conn rejects.

Tests (task 28)
- audit-canonical.test.ts (4 tests) pins canonical JSON semantics.
- grants-enforcement.test.ts (6 tests) covers the member-then-
  session-pubkey lookup with default/explicit/blocked branches.

Docs (task 34)
- docs/env-vars.md catalogues every env var the broker + CLI read.

Crypto review prep (task 35)
- .artifacts/specs/2026-04-15-crypto-review-packet.md: reviewer
  brief, threat model, scope, test coverage list, deliverables.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 23:51:28 +01:00
Alejandro Gutiérrez
ee12510ef1 refactor: rename cli-v2 → cli, archive legacy cli, plus broker-side grants + auto-migrate
Some checks failed
CI / Lint (push) Has been cancelled
CI / Typecheck (push) Has been cancelled
CI / Broker tests (Postgres) (push) Has been cancelled
CI / Docker build (linux/amd64) (push) Has been cancelled
- apps/cli/ is now the canonical CLI (was apps/cli-v2/).
- apps/cli/ legacy v0 archived as branch 'legacy-cli-archive' and tag
  'cli-v0-legacy-final' before deletion; git history preserves it too.
- .github/workflows/release-cli.yml paths updated.
- pnpm-lock.yaml regenerated.

Broker-side peer-grant enforcement (spec: 2026-04-15-per-peer-capabilities):
- 0020_peer-grants.sql adds peer_grants jsonb + GIN index on mesh.member.
- handleSend in broker fetches recipient grant maps once per send, drops
  messages silently when sender lacks the required capability.
- POST /cli/mesh/:slug/grants to update from CLI; broker_messages_dropped_by_grant_total metric.
- CLI grant/revoke/block now mirror to broker via syncToBroker.

Auto-migrate on broker startup:
- apps/broker/src/migrate.ts runs drizzle migrate with pg_advisory_lock
  before the HTTP server binds. Exits non-zero on failure so Coolify
  healthcheck fails closed.
- Dockerfile copies packages/db/migrations into /app/migrations.
- postgres 3.4.5 added as direct broker dep.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 08:44:52 +01:00
Alejandro Gutiérrez
bb28e16c7d fix(broker): increase healthcheck start-period, catch Grammy errors
Some checks failed
CI / Lint (push) Has been cancelled
CI / Typecheck (push) Has been cancelled
CI / Broker tests (Postgres) (push) Has been cancelled
CI / Docker build (linux/amd64) (push) Has been cancelled
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 11:14:44 +01:00
Alejandro Gutiérrez
f4bcad91b0 refactor(deploy): trim docker images via pnpm deploy --legacy
Some checks failed
CI / Lint (push) Has been cancelled
CI / Typecheck (push) Has been cancelled
CI / Broker tests (Postgres) (push) Has been cancelled
CI / Docker build (linux/amd64) (push) Has been cancelled
Use pnpm deploy to flatten each package's runtime subset into /deploy,
then copy ONLY that into the runtime stage. Catalog + workspace:*
specifiers previously forced full-workspace resolution into every
image's node_modules — unnecessary for either runtime.

Results (arm64, same smoke tests pass):
- broker:   3.26GB → 341MB  (-90%, drops all devDeps incl. drizzle-kit)
- migrate:  3.27GB → 653MB  (-80%, keeps drizzle-kit which IS runtime)

Broker /health confirms GIT_SHA build-arg still propagates (gitSha:
"30bc24f" in smoke test). Migrate still reads drizzle.config.ts and
attempts the connection correctly.

--legacy flag needed because pnpm 10 defaults to inject-workspace-
packages mode which the monorepo doesn't opt into; legacy is safe here.
--ignore-scripts on deploy skips the root postinstall (sherif lint:ws)
which has nothing to do with runtime.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 15:37:21 +01:00
Alejandro Gutiérrez
c6674e971a chore(deploy): production Dockerfiles for broker + web + env template
Some checks failed
CI / Tests / 🧪 Test (push) Has been cancelled
- apps/broker/Dockerfile: oven/bun 1.2-slim runtime, multi-stage, pnpm deps,
  non-root bun user, GIT_SHA build-arg, /health-based HEALTHCHECK, port 7900
- apps/web/Dockerfile: Next.js 15 standalone, multi-stage, non-root nextjs
  user, NEXT_PUBLIC_* baked as build args, port 3000
- .env.production.template: DATABASE_URL, BetterAuth, OAuth, broker caps;
  no secrets
- Build context: repo root (pnpm workspace needs root pnpm-lock.yaml +
  pnpm-workspace.yaml); build with -f apps/{broker,web}/Dockerfile .

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 22:22:05 +01:00