116 Commits

Author SHA1 Message Date
Alejandro Gutiérrez
479f1ee94a fix(api): Use list[Any] for strengths to preserve V2 fields
Pydantic was coercing V2 StrengthToProtect dicts to the partial
ReportStrengthResponse type, dropping fields like `percentage` and
`top_quotes`. Changed to list[Any] to pass through raw data.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-30 15:33:08 +00:00
Alejandro Gutiérrez
2a292e0754 fix(synthesis): Select most common business_id to handle data leakage
Changed the business name query to ORDER BY COUNT(*) DESC instead of
arbitrary LIMIT 1, ensuring the correct business is identified even
when trace amounts of other business data leak into a job.
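
The query change described above can be sketched against an in-memory SQLite database. This is only an illustration: the table name `reviews`, its columns, and the sample rows are assumptions, not the project's actual schema.

```python
import sqlite3
from typing import Optional

# Hypothetical table and data; the real schema is not shown in this commit.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reviews (job_id TEXT, business_id TEXT)")
rows = [("job-1", "biz-leaked")] + [("job-1", "biz-main")] * 5
conn.executemany("INSERT INTO reviews VALUES (?, ?)", rows)


def most_common_business_id(conn: sqlite3.Connection, job_id: str) -> Optional[str]:
    """Return the business_id that appears most often for a job, or None."""
    row = conn.execute(
        """
        SELECT business_id
        FROM reviews
        WHERE job_id = ?
        GROUP BY business_id
        ORDER BY COUNT(*) DESC
        LIMIT 1
        """,
        (job_id,),
    ).fetchone()
    return row[0] if row else None


dominant = most_common_business_id(conn, "job-1")
```

Because the leaked row is inserted first here, a plain LIMIT 1 without ordering could return it, while the grouped query returns the dominant business.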

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-30 15:28:02 +00:00
Alejandro Gutiérrez
69d617ca38 feat(api): Add support for V2 synthesis format in analytics endpoint
- Extended SynthesisResponse model to support both legacy (v1) and
  new 6-section (v2) report formats
- V2 format includes executive_summary, risk_scorecard, critical_issues,
  action_matrix, and tracking_kpis sections
- Frontend type guards use report_version and executive_summary fields
  to detect format and render appropriate components
- Backwards compatible: legacy v1 responses still work unchanged

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-30 15:12:41 +00:00
Alejandro Gutiérrez
0a53e98bf9 fix(pipeline): Update stage result to use new synthesis fields
Reference action_matrix instead of the old 'actions' field.
Add issues_identified and health_score to stage result data.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-30 14:50:21 +00:00
Alejandro Gutiérrez
d5ef13b58e feat(frontend): Add BusinessReport component for 6-section €60 report
- Create BusinessReport.tsx with 6 sections:
  1. Executive Summary (health score, rating, momentum)
  2. Risk Scorecard (indicators with colors/trends)
  3. Critical Issues (evidence, solutions, timelines)
  4. Strengths to Protect (quotes, leverage actions)
  5. Action Matrix (effort/impact quadrants)
  6. 90-Day Tracking (KPI targets table)

- Update types.ts with new interfaces:
  - SynthesisV2 for new report format
  - LegacySynthesis for backwards compatibility
  - Type guard isSynthesisV2() for runtime detection

- Update ReportTab to auto-detect synthesis version
- Update AnalystReport, ReviewIQDashboard, StoryView
  for backwards compatibility with union type

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-30 14:36:05 +00:00
Alejandro Gutiérrez
b4bef004e8 feat(synthesis): Redesign report to 6-section €60 business value format
Transform synthesis stage from consultant memo to productized report:

- New 6-section structure: Executive Summary, Risk Scorecard,
  Critical Issues, Strengths, Action Matrix, 90-Day Tracking
- Add chart data aggregation (7 charts: gauge, pies, trends)
- Evidence-grounded LLM prompt requiring quote citations
- Hyper-specific solution generation with WHAT/WHO/WHEN/WHY
- Taxonomy-guided solutions contextualized to business type
- Staff name extraction from quotes for leverage actions
- Success metrics tied to actual complaint keywords

Report sections now include:
- Health score (1-100) with revenue-at-risk estimates
- Risk indicators with color-coded scores and trends
- Critical issues with evidence, root cause, and specific solutions
- Strengths with named staff and specific marketing actions
- Action matrix with effort/impact quadrants and deadlines
- 90-day KPIs with 30/60/90 day targets

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-30 14:27:40 +00:00
Alejandro Gutiérrez
2f92735548 fix(ui): Replace nested button with div for accessibility
Changed the outer button to a div with role="button" to avoid the HTML
validation error caused by nested buttons (the translate button sits
inside the complaint card).

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-29 03:24:57 +00:00
Alejandro Gutiérrez
157b76040f fix(synthesis): Use correct column name 'id' instead of 'execution_id'
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-29 03:19:09 +00:00
Alejandro Gutiérrez
9b667e69a7 feat(pipeline): Add Stage 5 Synthesis for AI-generated narratives
- Add Stage5Synthesizer class that generates AI narratives and action plans
- Add generate() method to LLMClient for synthesis generation
- Integrate Stage 5 into pipeline runner after route stage
- Add synthesis JSONB column to pipeline.executions table
- Update reviewiq_analytics API to return synthesis data
- Synthesis includes: executive narrative, sentiment/category/timeline insights,
  action plan, marketing angles, and priority recommendations

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-29 03:12:53 +00:00
Alejandro Gutiérrez
c8ecb4b98f feat(reviewiq): Add AI synthesis support to dashboard components
Frontend:
- Add Synthesis type with action plan, insights, annotations
- ExecutiveSummary: Accept synthesis prop for AI narrative
- SentimentPie: Accept insight prop for contextual explanation
- IntensityHeatmap: Accept insight + highlightDomain props
- TimelineChart: Accept insight + annotations props
- All components gracefully degrade when synthesis is null

Backend:
- Add Stage 4: Synthesize for generating AI narratives
- Gathers context from classified spans
- Generates executive narrative, section insights, action plan
- Produces timeline annotations and marketing angles
- Stores synthesis in pipeline.executions table

Components show AI insights with purple gradient styling when available,
fall back to existing behavior when synthesis is not yet generated.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-29 02:59:47 +00:00
Alejandro Gutiérrez
8f9dd136cd feat(reviewiq): Redesign dashboard with user-friendly UX
- ExecutiveSummary: Add rating badge with emoji, AI narrative section,
  #1 Problem/#1 Strength cards, domain complaints with progress bars
- SentimentPie: Replace pie chart with card-based design showing
  sentiment score, emoji indicators (😊😟😐🤔), percentages
- IntensityHeatmap: Transform to Praise vs Complaints heatmap with
  friendly domain labels (👥 Staff, 💰 Pricing, etc.)
- URTBarChart: Horizontal progress bars with emojis, health indicators
- TimelineChart: Add view toggles (Sentiment/Volume/Rating), trend
  indicator, fix chronological order (oldest→newest left→right)
- ReviewIQDashboard: Streamline from 11 sections to 5, remove redundancy

Removed redundant components:
- DomainScores (merged into ExecutiveSummary)
- KPISection (stats in header)
- RatingSimulator (in ExecutiveSummary)
- StrengthsWeaknesses (in ExecutiveSummary)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-29 02:52:13 +00:00
Alejandro Gutiérrez
c6beeaa3dc feat: Add Opportunity Matrix with coordinate-based positioning
- Add 2x2 matrix visualization (Quick Wins, Critical, Strategic, Nice to Have)
- Position items based on frequency/effort coordinates with dots and leader lines
- Add L-shaped axes with arrows showing Frequency (X) and Effort (Y) directions
- Include unified coordinate grid overlay across all quadrants
- Add clickable subcode labels with hover effects

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-25 12:29:01 +00:00
Alejandro Gutiérrez
af82467595 fix: Link Analytics button to job analytics page
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-24 22:00:13 +00:00
Alejandro Gutiérrez
194e6e0fbf feat: Add view toggle between table and card views on pipeline page
- Add ViewToggle component with table/cards icons
- Default to table view with TanStack table
- Card view shows execution cards in grid layout
- Toggle persists view preference during session

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-24 21:19:30 +00:00
Alejandro Gutiérrez
4d48437b21 feat: Add TanStack table for pipeline executions with debug modal
- Create ExecutionsView component with TanStack Table
- Add status filter buttons with count badges
- Add action buttons: Analytics, Metrics, Debug
- Add debug modal with AI copy-paste button for failed executions
- Generate detailed debug report with stage metrics and error context
- Update executions page to use new component

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-24 21:16:58 +00:00
Alejandro Gutiérrez
796f587c57 feat: Add pipeline execution UI, stage metrics, and API proxy routes
- Add run pipeline page with job selection UI
- Add execution detail page with stage metrics visualization
- Add stage_metrics and total_duration_ms to pipeline.executions table
- Create Next.js API proxy routes for all pipeline endpoints
- Fix trailing slash issues in pipeline-api.ts URLs
- Add Docker volume mounts for pipeline packages
- Add REVIEWIQ_DATABASE_URL and LLM API keys to docker-compose
- Fix JSONB field parsing in execution detail endpoint

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-24 21:14:27 +00:00
Alejandro Gutiérrez
acdfed8044 fix: Improve version dropdown text contrast
Added text-gray-900 and font-medium classes to select element
for better readability.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-24 20:22:05 +00:00
Alejandro Gutiérrez
9f714913db feat: Add scraper version selector to frontend
- Add version selector dropdown in scrape confirmation modal
- Default to v1.1.0 (Multi-Sort) which bypasses ~1000 review limit
- Pass scraper_version through API proxy to backend
- Update /new page fallback to show v1.1.0 as available
- Show version description explaining multi-sort benefits

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-24 19:13:52 +00:00
Alejandro Gutiérrez
824634aa76 feat: Add extensible multi-pipeline integration system
This commit implements a plugin-like pipeline architecture with:

Pipeline Core Package (packages/pipeline-core/):
- BasePipeline abstract class all pipelines implement
- PipelineRegistry for database-backed discovery/management
- PipelineRunner for execution with status tracking
- DashboardConfig contracts for dynamic widget definitions

Database Migration (006_pipeline_registry.sql):
- pipeline.registry table for registered pipelines
- pipeline.executions table for execution history
- Views for execution stats and monitoring

ReviewIQ Pipeline Refactor:
- Implements BasePipeline interface
- Adds get_dashboard_config() with widget definitions
- Adds get_widget_data() methods for all dashboard widgets
- Maintains backward compatibility with Pipeline alias

Generic Pipeline API (api/routes/pipelines.py):
- GET /api/pipelines - List all registered pipelines
- GET /api/pipelines/{id} - Pipeline details
- POST /api/pipelines/{id}/execute - Execute pipeline
- GET /api/pipelines/{id}/dashboard - Dashboard config
- GET /api/pipelines/{id}/widgets/{w} - Widget data
- GET /api/pipelines/{id}/executions - Execution history

Frontend Dynamic Dashboard System:
- DynamicDashboard component renders from config
- WidgetRegistry maps types to components
- Widget components: StatCard, LineChart, BarChart,
  PieChart, DataTable, Heatmap
- Pipeline API client library

Frontend Pipeline Pages:
- /pipelines - List all registered pipelines
- /pipelines/[id] - Dynamic dashboard for pipeline
- /pipelines/[id]/executions - Execution history
- Pipelines nav item in Sidebar

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-24 19:05:38 +00:00
Alejandro Gutiérrez
d64f06ba9e feat: Add scraper version routing with v1.1.0 as default
- Import both v1.0.0 and v1.1.0 scraper versions
- Add SCRAPER_VERSIONS registry mapping version strings to functions
- Add get_scraper_for_version() to route based on job metadata
- Default to v1.1.0 (multi-sort) for new jobs
- Frontend can select specific version via scraper_version parameter
- Validation endpoint continues using v1.0.0 for speed
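
A minimal sketch of the version routing listed above. The names SCRAPER_VERSIONS and get_scraper_for_version come from the commit message; the key format, the function signature, the error handling, and the stand-in scraper functions are assumptions made for illustration.

```python
from typing import Callable, Dict, List

Scraper = Callable[[str], List[dict]]


# Stand-ins for the real scraper entry points, which are not shown here.
def scrape_v1_0_0(url: str) -> List[dict]:
    return [{"version": "1.0.0", "url": url}]


def scrape_v1_1_0(url: str) -> List[dict]:
    return [{"version": "1.1.0", "url": url}]


# Registry mapping version strings to scraper functions (key format assumed).
SCRAPER_VERSIONS: Dict[str, Scraper] = {
    "1.0.0": scrape_v1_0_0,
    "1.1.0": scrape_v1_1_0,
}
DEFAULT_VERSION = "1.1.0"


def get_scraper_for_version(job_metadata: dict) -> Scraper:
    """Pick a scraper from job metadata, falling back to the default version."""
    version = job_metadata.get("scraper_version") or DEFAULT_VERSION
    try:
        return SCRAPER_VERSIONS[version]
    except KeyError:
        raise ValueError(f"Unknown scraper version: {version!r}") from None
```

Keeping the mapping in one dictionary means a new version only needs one new entry, and an unknown version fails loudly instead of silently running the default.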

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-24 19:04:06 +00:00
Alejandro Gutiérrez
7771c734c6 fix: Remove undefined get_dom_reviews call in multi-sort passes
The multi-sort loop was calling get_dom_reviews() which doesn't exist.
API interception alone is sufficient for capturing reviews during
multi-sort passes, so we now use only api_reviews.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-24 18:58:04 +00:00
Alejandro Gutiérrez
fbd61ff7f7 feat: Add multi-sort scraper v1.1.0 and improve v1.0.0 reliability
v1.0.0 improvements:
- Add captcha detection (reCAPTCHA, unusual traffic, challenges)
- Block fonts, analytics, maps tiles for faster scrolling
- Add 95% close-enough threshold to skip unnecessary retries
- Stop immediately if captcha detected instead of retrying

v1.1.0 new features:
- Multi-sort strategy to bypass ~1000 review limit
- Cycles through newest/lowest/highest/relevant sorts
- Auto mode: enables multi-sort when total > 1000
- Diminishing returns detection (stops if <5% new per pass)
- Configurable sort order and thresholds
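
The diminishing-returns rule above could look roughly like the sketch below. The function name, the use of review IDs, and the exact definition of "new per pass" (new IDs divided by the size of the pass) are assumptions; the commit only states the 5% threshold.

```python
from typing import Iterable, List, Set


def merge_sort_passes(
    passes: Iterable[List[str]],
    min_new_fraction: float = 0.05,
) -> Set[str]:
    """Merge review IDs pass by pass, stopping once a pass adds too little."""
    seen: Set[str] = set()
    for review_ids in passes:
        before = len(seen)
        seen.update(review_ids)
        new_count = len(seen) - before
        # Fraction of this pass that was genuinely new (assumed definition).
        fraction_new = new_count / len(review_ids) if review_ids else 0.0
        # Never stop on the first pass: everything is new by construction.
        if before > 0 and fraction_new < min_new_fraction:
            break
    return seen
```

Each list in `passes` stands for the reviews captured under one sort order (newest, lowest, highest, relevant); the loop ends early when another sort order mostly returns reviews already collected.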

Also adds test_scraper_v110.py CLI tool for testing multi-sort.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-24 18:47:30 +00:00
Alejandro Gutiérrez
e2d7f6f118 feat: Add ScraperV1Adapter and real data pipeline test
- Add ScraperV1Adapter to transform scraped reviews into pipeline format
  - Handles relative timestamps (centerDate)
  - Generates deterministic IDs for DOM-sourced reviews
  - Filters out empty (rating-only) reviews

- Add sample barbershop reviews (79 reviews, 46 with text)
  - Real data from Las Palmas barbershop
  - Multi-language: Spanish, English, German, Norwegian, Italian

- Add test_pipeline_real_data.py for E2E testing with real data
  - Uses mock classifier based on keywords and rating
  - Full pipeline flow: raw -> enriched -> spans -> issues -> facts

Test results with real data:
- 46 reviews processed
- 6 languages detected (es: 35, en: 7, de: 1, no: 1, it: 1, ca: 1)
- 3 issues identified from negative reviews
- 29 fact records aggregated across date range 2017-2025
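
One way the adapter's deterministic IDs for DOM-sourced reviews could be derived is by hashing normalized review content, as sketched below. The choice of fields, the normalization, and the `dom_` prefix are all assumptions; the commit does not say how the IDs are built.

```python
import hashlib


def deterministic_review_id(author: str, rating: int, text: str) -> str:
    """Derive a stable ID from review content (field choice is an assumption)."""
    # Normalize whitespace and case so cosmetic differences do not change the ID.
    normalized = "|".join([
        author.strip().lower(),
        str(rating),
        " ".join(text.split()).lower(),
    ])
    digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
    # Hypothetical prefix marking the ID as derived from DOM content.
    return f"dom_{digest[:16]}"
```

The point of a content-derived ID is that scraping the same review twice yields the same key, so re-runs can be deduplicated without a source-provided identifier.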

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-24 18:35:09 +00:00
Alejandro Gutiérrez
3e57c887e9 test: Add E2E pipeline test with real database
Tests the full pipeline flow:
- Stage 1: Insert raw reviews, normalize text
- Stage 2: Mock LLM classification, insert spans
- Stage 3: Route negative spans to issues
- Stage 4: Aggregate facts by URT code and date

Validates all pipeline.* tables are populated correctly.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-24 18:28:53 +00:00
Alejandro Gutiérrez
03ed7029e2 feat: Add decoupled pipeline schema with separate PostgreSQL namespace
- Create consolidated migration (005_create_pipeline_schema.sql) with
  'pipeline' schema for all classification tables
- Update pipeline repositories to use schema prefix (pipeline.*)
- Add run_migrations() method to DatabaseManager
- Add CLI tool for running versioned migrations

Tables created in pipeline schema:
- reviews_raw, reviews_enriched (Stage 1)
- review_spans (Stage 2)
- issues, issue_spans, issue_events (Stage 3)
- fact_timeseries (Stage 4)
- urt_domains, urt_categories (taxonomy lookup)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-24 18:17:20 +00:00
Alejandro Gutiérrez
7d720f5378 feat: Add reviewiq-pipeline package for LLM-powered review classification
Implement a standalone Python package for processing customer reviews through
a 4-stage pipeline using URT (Universal Review Taxonomy) v5.1:

- Stage 1: Normalization (text cleaning, language detection, deduplication)
- Stage 2: LLM Classification (OpenAI/Anthropic span extraction with URT codes)
- Stage 3: Issue Routing (deterministic issue ID generation, span linking)
- Stage 4: Fact Aggregation (time series metrics for dashboards)

Package includes:
- TypedDict contracts matching Pipeline-Contracts-v1.md
- Async database layer with asyncpg and 5 SQL migrations
- LLM client abstraction supporting both OpenAI and Anthropic
- Sentence-transformers integration for embeddings
- Validation rules V1.x through V4.x
- CLI commands: migrate, run, validate, check
- 55 unit and integration tests (all passing)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-24 18:07:11 +00:00
Alejandro Gutiérrez
b780a23b66 fix: Correct imports in test_scraper CLI tool
- Import LogCapture from scraper module
- Remove unused StructuredLogger import
- Use correct log_capture parameter name

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-24 17:24:07 +00:00
Alejandro Gutiérrez
84f5efb5c7 feat: Add CLI tool for quick scraper testing
Usage:
  python tools/test_scraper.py "ClickRent Gran Canaria"
  python tools/test_scraper.py "Starbucks NYC" --max 100
  python tools/test_scraper.py --url "https://..." --headless
  python tools/test_scraper.py "Business" -o results.json -v

Features:
- Search by business name or direct URL
- Configurable max reviews and timeout
- Headless mode support
- JSON output option
- Real-time progress display
- Verbose logging mode

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-24 17:20:12 +00:00
Alejandro Gutiérrez
6b3f055760 fix: Prevent Chrome tab crash by removing processed DOM cards
Root cause: Cards were hidden but not removed from DOM, causing
memory buildup (400+ nodes) that crashed Chrome tabs.

Changes:
- Actually remove processed cards from DOM (not just hide them)
- Keep last 50 cards for scroll reference/continuity
- Remove adjacent separator elements along with cards
- Add logging when DOM cleanup removes cards
- Cards near scroll end stay visible for reference

This should prevent "tab crashed" errors during long scraping
sessions with 500+ reviews.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-24 17:17:21 +00:00
Alejandro Gutiérrez
65eb979c12 feat: Add "Copy Crash Report" button for failed/partial jobs
- Generate structured markdown crash report optimized for Claude
- Includes: job metadata, timeline, progress, error, logs (last 50)
- Adds context and suggested investigation steps
- Orange clipboard button appears for failed/partial jobs
- Shows green checkmark briefly after successful copy
- Fetches logs async when generating report

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-24 17:09:48 +00:00
Alejandro Gutiérrez
acd3b22e88 docs: Add pipeline development artifacts for parallel implementation
New artifacts:
- ReviewIQ-Pipeline-DevGuide.md: Entry point for pipeline work
- ReviewIQ-Pipeline-Contracts-v1.md: Stage I/O specs, validation rules, test fixtures
- ReviewIQ-Pipeline-Checklist.md: Per-stage implementation checklists
- ReviewIQ-Codebase-Overview.md: File structure, integration points
- ReviewIQ-v3.2.1-Taxonomy-Versioning.md: Taxonomy versioning addendum

Updated:
- ReviewIQ-v32-Decisions.md: Added B2 audit findings, taxonomy versioning decisions, pipeline status

These artifacts enable parallel development of pipeline stages 1-4 with:
- Independent validation (35 rules across stages)
- Clear input/output contracts
- Test fixtures for each stage
- Definition of done criteria

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-24 17:08:40 +00:00
Alejandro Gutiérrez
c2996bef1e fix: Calculate job speed using last successful data retrieval timestamp
- Use updated_at (last successful data loop) instead of Date.now()
- Speed now reflects the actual data retrieval rate instead of declining over time
- Updated in table column, monitored job view, and stats row
- Fall back to Date.now() if updated_at is not available

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-24 17:04:35 +00:00
Alejandro Gutiérrez
5165d65152 fix: Center confirmation modal using transform
- Use fixed positioning with top/left 50% and translate -50%
- More reliable centering regardless of parent containers
- Add max-width for mobile responsiveness

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-24 16:50:08 +00:00
Alejandro Gutiérrez
83b245bbfc fix: Show blue background with spinner during validation
- Keep blue background when isCheckingReviews is true
- Add cursor-wait during validation
- Move disabled styling to explicit condition check
- White spinner now visible on blue background

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-24 16:49:35 +00:00
Alejandro Gutiérrez
e0e86d2830 feat: Persist jobs to localStorage and reset search after launch
- Reset search fields after job is successfully launched
- Allow user to immediately start another scrape
- Save active jobs to localStorage for persistence across refresh
- Restore jobs from localStorage on page load
- Resume polling for non-terminal jobs (pending/running)
- Filter out jobs older than 24 hours
- Add remove button (X) to each job card
- Clean up localStorage when jobs are removed

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-24 16:47:01 +00:00
Alejandro Gutiérrez
0c8da54045 fix: Center confirmation modal properly
- Remove w-full that caused alignment issues
- Use fixed width (400px) for consistent centering

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-24 16:40:54 +00:00
Alejandro Gutiérrez
ccfe00cebe fix: Properly center map click modal
- Remove w-full and mx-auto that caused alignment issues
- Use fixed width (280px) instead of max-w-xs
- Let flex container handle centering

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-24 16:40:12 +00:00
Alejandro Gutiérrez
956d5dacda fix: Center map click modal with proper padding
- Center modal properly within map preview area
- Add 24px padding from map edges
- Make modal more compact (max-w-xs)
- Reduce text and element sizes for better fit

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-24 16:38:49 +00:00
Alejandro Gutiérrez
d4c3018429 refactor: Change search fields to horizontal layout
- Place Business Name, Location, and Validate button in same row
- Reduce padding and font sizes for compact inline layout
- Show abbreviated text on mobile (responsive)
- Use checkmark indicator for auto-detected location

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-24 16:37:08 +00:00
Alejandro Gutiérrez
82b2c51e4e feat: Split search into Business Name + Location fields
- Split single search input into two fields: Business Name (required)
  and Location (auto-detected from IP geolocation)
- Auto-fill location field with city/country from IP on page load
- Add click overlay on map iframe to prevent interaction
- Add warning modal when user clicks map, directing them to use search
- Update test URLs to use split format
- Make Validate button full-width for better UX

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-24 16:35:15 +00:00
Alejandro Gutiérrez
afab5127b3 Restore Google Maps iframe preview
- Restore original Google Maps embed iframe approach
- URL: maps.google.com/maps?q=...&output=embed&z=15
- Add "Open in Maps" overlay button on the map
- Height 300px for better visibility

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-24 16:29:33 +00:00
Alejandro Gutiérrez
43fd1515d2 Align artifacts with canonical URT v5.1 specification
Fixes inconsistencies discovered during audit against urt-taxonomy/:

- urt_profile ENUM: Add 'lite' and 'core' profiles (were missing)
- USN format: Use canonical regex from spec (was non-compliant)
- USN valence encoding: Add V0 (0) and V± (±) support
- USN grammar: Add Lite (URT:L:) and Core (URT:C:) formats
- Dimension codes: Fix temporal (TC/TR/TH/TF), evidence (ES/EI/EC),
  comparative (CR-N/CR-B/CR-W/CR-S) in decisions doc
- LLM contract: Full USN regex validation pattern

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-24 16:21:21 +00:00
Alejandro Gutiérrez
7666b7aea2 Fix: Replace broken Google Maps iframe with interactive preview + add scraper type selection
- Replace non-working Google Maps embed iframe with animated location preview
- Add "Open in Google Maps" button to open location in new tab
- Add scraper type selection dropdown fetching from /api/admin/scrapers
- Show selected scraper info with formatted labels (Google Reviews v1.0.0)
- Include scraper_version and scraper_variant in job submission

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-24 16:15:58 +00:00
Alejandro Gutiérrez
46cd54e275 Add LLM Classification Contract v1.0
Defines prompt, output schema, and validation rules for span-level
URT classification:

- System prompt with span extraction rules
- JSON schema for structured output
- 4 few-shot examples (multi-span, temporal, comparative)
- Structural and semantic validation rules
- Error handling with retry + fallback
- Performance considerations (token budget, batching, caching)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-24 16:07:31 +00:00
Alejandro Gutiérrez
3317553658 Wire frontend to real API endpoints
Dashboard page:
- Fetch top clients from /api/dashboard/by-client
- Show loading state while fetching
- Display empty state when no client data
- Show real client_id, job count, and success rate

Scrapers page:
- Fetch versions from /api/admin/scrapers
- Wire promote/deprecate buttons to real API calls
- Wire add version form to POST /api/admin/scrapers
- Wire traffic allocation to PUT /api/admin/scrapers/{id}/traffic
- Add loading and error states

Dockerfile:
- Add COPY commands for new directories (api/, core/, scrapers/, etc.)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-24 16:05:29 +00:00
Alejandro Gutiérrez
39c80fc8be Phases 5-7: Dashboard UI, Admin API, and Auth middleware
Phase 5 - Main Dashboard:
- Dashboard overview page with system health stats
- Jobs by status breakdown, success rates, top clients
- Dashboard API (/api/dashboard/overview, by-client, problems, by-version)

Phase 6 - Admin/Scraper Management:
- Scrapers management page with traffic allocation UI
- Admin API for scraper CRUD operations
- Traffic percentage updates for A/B testing
- Promote/deprecate scraper versions

Phase 7 - Authentication:
- API key authentication middleware
- SHA-256 key hashing (keys never stored in plain text)
- Scope-based authorization (jobs:read, jobs:write, admin)
- Rate limiting per API key
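
The key-hashing approach in Phase 7 can be illustrated with the standard library. This is a sketch under assumptions: the function names and key length are hypothetical, and the middleware's real storage and lookup code is not shown in the commit.

```python
import hashlib
import hmac
import secrets
from typing import Tuple


def generate_api_key() -> Tuple[str, str]:
    """Return (plaintext_key, sha256_hex); only the hash would be stored."""
    plaintext = secrets.token_urlsafe(32)
    key_hash = hashlib.sha256(plaintext.encode("utf-8")).hexdigest()
    return plaintext, key_hash


def verify_api_key(presented: str, stored_hash: str) -> bool:
    """Hash the presented key and compare it to the stored hash."""
    presented_hash = hashlib.sha256(presented.encode("utf-8")).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(presented_hash, stored_hash)
```

The plaintext key is shown to the client once at creation time; afterwards the server only ever sees and compares hashes.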

Also:
- Updated api_server_production.py to include new routers
- Extended core/database.py with dashboard query methods
- Added dashboard link to sidebar navigation
- Updated CONTEXT-KEEPER.md to mark all phases complete

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-24 15:43:00 +00:00
Alejandro Gutiérrez
788ef84756 Phases 2-4: Requester support, batches, webhooks, scraper registry
Phase 2 - Requester & Batch Support:
- core/database.py: Added create_job params (requester_*, batch_*, priority, callback_*)
- core/database.py: Added batch methods (create_batch, get_batch, update_batch_progress, get_batches)
- core/database.py: Added update_job_callback for tracking webhook delivery
- api/routes/batches.py: New endpoints:
  - POST /api/scrape/google-reviews/batch (submit batch)
  - GET /api/batches (list batches)
  - GET /api/batches/{id} (batch detail)
  - DELETE /api/batches/{id} (cancel batch)
- api_server_production.py: Updated /api/scrape with requester, priority, callback fields
- api_server_production.py: New primary endpoint POST /api/scrape/google-reviews

Phase 3 - Webhooks:
- services/job_callback_service.py: New service with:
  - JobCallbackService: send_job_callback, send_batch_callback, retry_failed_callbacks
  - JobCallbackDispatcher: Background worker for callback monitoring
  - Payload formats per spec (job.completed, job.failed, batch.completed)
  - Exponential backoff for retries
  - Error classification for failure payloads
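
Exponential backoff for callback retries might be structured as below. The base delay, growth factor, cap, and function names are assumptions; the commit does not give the actual retry schedule used by JobCallbackService.

```python
from typing import Callable, List


def backoff_delays(
    max_attempts: int,
    base_seconds: float = 1.0,
    factor: float = 2.0,
    cap_seconds: float = 60.0,
) -> List[float]:
    """Delay after each failed attempt: base * factor**n, capped."""
    return [
        min(cap_seconds, base_seconds * factor ** attempt)
        for attempt in range(max_attempts)
    ]


def deliver_with_retry(
    send: Callable[[], bool],
    delays: List[float],
    sleep: Callable[[float], None],
) -> int:
    """Call send() until it succeeds; return attempts used, or -1 if all fail."""
    for attempt, delay in enumerate(delays, start=1):
        if send():
            return attempt
        # No point waiting after the final attempt has failed.
        if attempt < len(delays):
            sleep(delay)
    return -1
```

Passing `sleep` in as a parameter keeps the sketch testable; in real use it could be `time.sleep`, and `send` would wrap the HTTP call to the callback URL.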

Phase 4 - Scraper Registry:
- scrapers/registry.py: Database-backed version routing:
  - get_scraper(): Version/variant/A/B routing
  - _get_weighted_scraper(): Traffic-weighted random selection
  - 60-second TTL cache for performance
  - register_scraper, deprecate_scraper, update_traffic_allocation
  - LegacyScraperRegistry preserved for backwards compatibility
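
Traffic-weighted selection with a 60-second cache, as listed under Phase 4, can be sketched as follows. The class and function names, the shape of the allocation mapping, and the loader callback are assumptions; the real registry reads its allocations from the database.

```python
import random
import time
from typing import Callable, Dict, Optional

Allocations = Dict[str, float]  # version -> traffic percentage (assumed shape)


def pick_weighted(allocations: Allocations, rng: random.Random) -> str:
    """Pick a version with probability proportional to its traffic share."""
    versions = [v for v, pct in allocations.items() if pct > 0]
    if not versions:
        raise ValueError("no scraper version has traffic allocated")
    weights = [allocations[v] for v in versions]
    return rng.choices(versions, weights=weights, k=1)[0]


class TTLCache:
    """Cache one loaded value and reload it after ttl_seconds."""

    def __init__(
        self,
        loader: Callable[[], Allocations],
        ttl_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._loader = loader
        self._ttl = ttl_seconds
        self._clock = clock
        self._value: Optional[Allocations] = None
        self._loaded_at = 0.0

    def get(self) -> Allocations:
        now = self._clock()
        if self._value is None or now - self._loaded_at >= self._ttl:
            self._value = self._loader()
            self._loaded_at = now
        return self._value
```

With this split, a change to traffic allocation takes effect within one TTL window, and routing a job costs one cache lookup plus one weighted draw rather than a database query.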

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-24 15:35:58 +00:00
Alejandro Gutiérrez
2412996c54 Phase 1: Database migrations for platform features
Migrations created:
- 001_add_job_platform_fields.sql: Add 15 new columns to jobs table
  - Requester tracking (client_id, source, purpose, metadata)
  - Batch support (batch_id, batch_index)
  - Execution tracking (job_type, scraper_version, variant, priority)
  - Webhook callbacks (url, status, sent_at, attempts)
  - Result summary (JSONB for cross-type dashboard)
  - 7 indexes for query performance
  - 5 CHECK constraints for data validation

- 002_create_batches_table.sql: Batch job grouping
  - Tracks batch progress (total/completed/failed)
  - Batch-level callbacks
  - Requester association

- 003_create_scraper_registry.sql: Scraper version management
  - Version routing (stable/beta/canary variants)
  - A/B traffic splitting (traffic_pct)
  - Priority-based routing
  - Seeds google_reviews v1.0.0 as stable default

- 004_create_api_keys.sql: API authentication
  - Secure key storage (SHA-256 hashes, not plaintext)
  - Scopes-based permissions
  - Rate limiting support
  - Key lifecycle (expiry, active status)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-24 15:24:28 +00:00
Alejandro Gutiérrez
544e028c3f Phase 0: Project restructure to ReviewIQ platform architecture
New structure:
- scrapers/google_reviews/v1_0_0.py (was modules/scraper_clean.py)
- scrapers/base.py (BaseScraper interface)
- scrapers/registry.py (ScraperRegistry for version routing)
- core/database.py, models.py, config.py, enums.py
- utils/logger.py, crash_analyzer.py, health_checks.py, helpers.py, date_converter.py
- workers/chrome_pool.py
- services/webhook_service.py
- api/ routes structure (empty, ready for Phase 2)
- tests/ structure mirroring source

All imports updated in:
- api_server_production.py (7 import paths updated)
- utils/health_checks.py (scraper import path)

Legacy modules moved to modules/_legacy/:
- data_storage.py, image_handler.py, s3_handler.py (unused)

Syntax verified, frontend build passing.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-24 15:22:08 +00:00
Alejandro Gutiérrez
bb0291f265 Add CONTEXT-KEEPER.md for conversation continuity
Quick-reference document for resuming work after context compaction.
Contains: project overview, current state, spec summary, phases,
key decisions, file locations, and resumption instructions.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-24 15:14:01 +00:00