Alejandro Gutiérrez
788ef84756
Phases 2-4: Requester support, batches, webhooks, scraper registry
Phase 2 - Requester & Batch Support:
- core/database.py: Added create_job params (requester_*, batch_*, priority, callback_*)
- core/database.py: Added batch methods (create_batch, get_batch, update_batch_progress, get_batches)
- core/database.py: Added update_job_callback for tracking webhook delivery
- api/routes/batches.py: New endpoints:
  - POST /api/scrape/google-reviews/batch (submit batch)
  - GET /api/batches (list batches)
  - GET /api/batches/{id} (batch detail)
  - DELETE /api/batches/{id} (cancel batch)
- api_server_production.py: Updated /api/scrape with requester, priority, callback fields
- api_server_production.py: New primary endpoint POST /api/scrape/google-reviews
Phase 3 - Webhooks:
- services/job_callback_service.py: New service with:
  - JobCallbackService: send_job_callback, send_batch_callback, retry_failed_callbacks
  - JobCallbackDispatcher: Background worker for callback monitoring
  - Payload formats per spec (job.completed, job.failed, batch.completed)
  - Exponential backoff for retries
  - Error classification for failure payloads
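
For reference, the retry backoff can be sketched as below. This is a minimal illustration, not the code in services/job_callback_service.py: the function name backoff_delay, the 2-second base, and the 300-second cap are all assumptions made for the example.

```python
def backoff_delay(attempt: int, base: float = 2.0, cap: float = 300.0) -> float:
    """Return seconds to wait before retry number `attempt` (1-based).

    Hypothetical helper: the delay doubles on each attempt and is capped
    so that a long-failing webhook does not wait indefinitely.
    """
    if attempt < 1:
        raise ValueError("attempt must be >= 1")
    return min(cap, base * 2 ** (attempt - 1))


# Delays for the first five retries under the assumed defaults.
schedule = [backoff_delay(n) for n in range(1, 6)]
```

In practice a small random jitter is often added to each delay so that many failed callbacks do not retry at the same instant.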
Phase 4 - Scraper Registry:
- scrapers/registry.py: Database-backed version routing:
  - get_scraper(): Version, variant, and A/B routing
  - _get_weighted_scraper(): Traffic-weighted random selection
  - 60-second TTL cache for performance
  - register_scraper, deprecate_scraper, update_traffic_allocation
  - LegacyScraperRegistry preserved for backwards compatibility
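
A rough sketch of how traffic-weighted selection and a 60-second TTL cache can fit together. The names pick_weighted and TTLCache, and the (version, traffic_pct) tuple shape, are assumptions for illustration and are not taken from scrapers/registry.py.

```python
import random
import time
from typing import Callable, List, Optional, Tuple

Candidates = List[Tuple[str, int]]  # assumed shape: (version, traffic_pct)


def pick_weighted(candidates: Candidates, rng: Optional[random.Random] = None) -> str:
    """Pick a version at random, proportionally to its traffic percentage."""
    live = [(version, pct) for version, pct in candidates if pct > 0]
    if not live:
        raise LookupError("no scraper has traffic allocated")
    versions = [version for version, _ in live]
    weights = [pct for _, pct in live]
    return (rng or random).choices(versions, weights=weights, k=1)[0]


class TTLCache:
    """Cache one loaded value and reload it once `ttl` seconds have passed."""

    def __init__(self, loader: Callable[[], Candidates], ttl: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._loader = loader
        self._ttl = ttl
        self._clock = clock
        self._value: Optional[Candidates] = None
        self._expires_at: Optional[float] = None  # None means nothing loaded yet

    def get(self) -> Candidates:
        now = self._clock()
        if self._expires_at is None or now >= self._expires_at:
            # In the real registry this would be a database query.
            self._value = self._loader()
            self._expires_at = now + self._ttl
        return self._value
```

The trade-off of a TTL cache is that a change to traffic allocation may take up to one TTL period to reach every worker.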
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-24 15:35:58 +00:00
Alejandro Gutiérrez
2412996c54
Phase 1: Database migrations for platform features
Migrations created:
- 001_add_job_platform_fields.sql: Add 15 new columns to jobs table
  - Requester tracking (client_id, source, purpose, metadata)
  - Batch support (batch_id, batch_index)
  - Execution tracking (job_type, scraper_version, variant, priority)
  - Webhook callbacks (url, status, sent_at, attempts)
  - Result summary (JSONB for cross-type dashboard)
  - 7 indexes for query performance
  - 5 CHECK constraints for data validation
- 002_create_batches_table.sql: Batch job grouping
  - Tracks batch progress (total/completed/failed)
  - Batch-level callbacks
  - Requester association
- 003_create_scraper_registry.sql: Scraper version management
  - Version routing (stable/beta/canary variants)
  - A/B traffic splitting (traffic_pct)
  - Priority-based routing
  - Seeds google_reviews v1.0.0 as stable default
- 004_create_api_keys.sql: API authentication
  - Secure key storage (SHA-256 hashes, not plaintext)
  - Scopes-based permissions
  - Rate limiting support
  - Key lifecycle (expiry, active status)
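
To illustrate the hash-only storage idea behind 004_create_api_keys.sql, here is a minimal sketch using the Python standard library. The function names are hypothetical and the actual application code may differ. Only the hex digest would be written to the database, while the plaintext key is shown to the client once.

```python
import hashlib
import hmac
import secrets
from typing import Tuple


def hash_api_key(plaintext: str) -> str:
    """Return the SHA-256 hex digest that would be stored in place of the key."""
    return hashlib.sha256(plaintext.encode("utf-8")).hexdigest()


def generate_api_key() -> Tuple[str, str]:
    """Create a random key; return (plaintext for the client, hash for storage)."""
    plaintext = secrets.token_urlsafe(32)
    return plaintext, hash_api_key(plaintext)


def verify_api_key(presented: str, stored_hash: str) -> bool:
    """Compare in constant time to avoid leaking match position via timing."""
    return hmac.compare_digest(hash_api_key(presented), stored_hash)
```

A plain, unsalted SHA-256 is reasonable here only because the keys are long random tokens, unlike user-chosen passwords, which need a slow salted hash.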
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-24 15:24:28 +00:00
Alejandro Gutiérrez
bb0291f265
Add CONTEXT-KEEPER.md for conversation continuity
Quick-reference document for resuming work after context compaction.
Contains: project overview, current state, spec summary, phases,
key decisions, file locations, and resumption instructions.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-24 15:14:01 +00:00