Commit Graph

12 Commits

Author SHA1 Message Date
Alejandro Gutiérrez
5324542e72 feat(reputation-report): Add production-grade reputation report generator v8
Production fixes:
- Cross-business join safety: all queries join on (review_id, business_id)
- Timestamp normalization: iso_z() for all output timestamps
- Score formula alignment: matches PERIOD_SCORES_QUERY for consistency
- Invariant check: fails if scores.overall != comparisons.current
- primary_run_id: uses max(created_at) in time_window mode
- Language normalization: auto/auto-detect -> unknown
- Review language: majority voting over spans per review
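The timestamp and language fixes above can be sketched as follows. This is a minimal illustration, not the code from this commit: only the name iso_z() comes from the message; normalize_language() and review_language() are hypothetical helpers, and the choices to treat naive datetimes as UTC, to ignore "unknown" votes, and to break ties alphabetically are assumptions.

```python
from collections import Counter
from datetime import datetime, timezone


def iso_z(ts: datetime) -> str:
    """Render a datetime as UTC ISO 8601 with a trailing 'Z'."""
    if ts.tzinfo is None:
        # Assumption: naive values are already in UTC.
        ts = ts.replace(tzinfo=timezone.utc)
    return ts.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def normalize_language(code):
    """Map placeholder codes such as 'auto' or 'auto-detect' to 'unknown'."""
    if code is None or code.strip().lower() in {"", "auto", "auto-detect"}:
        return "unknown"
    return code.strip().lower()


def review_language(span_languages):
    """Pick the most frequent normalized language among a review's spans."""
    votes = Counter(normalize_language(code) for code in span_languages)
    # Assumption: 'unknown' never outvotes a detected language.
    known = {lang: n for lang, n in votes.items() if lang != "unknown"}
    if not known:
        return "unknown"
    # Highest count wins; ties break alphabetically for determinism.
    return min(known, key=lambda lang: (-known[lang], lang))
```

For example, a review whose spans were tagged ["es", "es", "auto"] would resolve to "es" under these assumptions.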

Executive summary guardrails:
- Weakness priority: negative driver > qualifying dip > none
- Dip qualification: within 90 days AND review_count >= 3
- Most recent dip selection when multiple qualify
- No contradiction: "dip" cannot pair with "no major issues"
- Action grounding: must tie to cited weakness or top positive driver
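One way to read the weakness priority and dip qualification rules is sketched below. The Dip type, the function names, and the interpretation of "within 90 days" as measured back from the report date are all assumptions made for illustration; the commit message does not specify them.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional, Sequence, Tuple


@dataclass
class Dip:
    """Hypothetical record for a score dip in one reporting period."""
    period_end: date
    review_count: int


def qualifying_dip(dips: Sequence[Dip], today: date,
                   max_age_days: int = 90, min_reviews: int = 3) -> Optional[Dip]:
    """Return the most recent dip that is recent enough and has enough reviews."""
    eligible = [
        d for d in dips
        if timedelta(0) <= today - d.period_end <= timedelta(days=max_age_days)
        and d.review_count >= min_reviews
    ]
    return max(eligible, key=lambda d: d.period_end, default=None)


def pick_weakness(negative_driver: Optional[str], dips: Sequence[Dip],
                  today: date) -> Tuple[str, object]:
    """Apply the priority order: negative driver > qualifying dip > none."""
    if negative_driver:
        return ("negative_driver", negative_driver)
    dip = qualifying_dip(dips, today)
    if dip is not None:
        return ("dip", dip)
    return ("none", None)
```

Returning an explicit ("none", None) case makes the "no contradiction" rule easier to enforce downstream: a summary may say "no major issues" only when the first element is "none".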

CLI options:
- --no-summary: disable executive summary
- --require-summary: exit code 2 if LLM fails
- --summary-model: configurable model (default gpt-4o-mini)
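The three flags could be wired up with argparse roughly as follows. The flag names, the default model, and exit code 2 come from the message; the program name "reputation-report", the exit_code() helper, and the rule that --no-summary takes precedence over --require-summary are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # "reputation-report" is a placeholder program name.
    parser = argparse.ArgumentParser(prog="reputation-report")
    parser.add_argument("--no-summary", action="store_true",
                        help="disable the executive summary")
    parser.add_argument("--require-summary", action="store_true",
                        help="exit with code 2 if the LLM summary fails")
    parser.add_argument("--summary-model", default="gpt-4o-mini",
                        help="model used for the executive summary")
    return parser


def exit_code(args: argparse.Namespace, summary_ok: bool) -> int:
    """Map the summary outcome to a process exit code."""
    if args.no_summary:
        # Assumption: with the summary disabled, its failure cannot occur.
        return 0
    if args.require_summary and not summary_ok:
        return 2
    return 0
```

Without --require-summary, a failed summary is treated as non-fatal in this sketch and the report is still emitted.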

Includes unit test suite (16 tests) for narrative guardrails.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-31 23:10:25 +00:00
Alejandro Gutiérrez
c797470421 fix(synthesis): Calculate analysis_period from actual data range
The analysis period was previously hardcoded as "Last 12 months", which was
misleading when data spanned multiple years. It is now calculated from the
earliest and latest review dates in the dataset.
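A minimal sketch of deriving the period from the data, assuming review dates are available as datetime.date values. The function name, the output format, and the empty-dataset label are hypothetical; the commit message does not show the format the report actually uses.

```python
from datetime import date
from typing import Iterable


def analysis_period(review_dates: Iterable[date]) -> str:
    """Describe the span from the earliest to the latest review date."""
    dates = list(review_dates)
    if not dates:
        return "No reviews"  # placeholder label for an empty dataset
    start, end = min(dates), max(dates)
    return f"{start.isoformat()} to {end.isoformat()}"
```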

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-30 15:49:51 +00:00
Alejandro Gutiérrez
2a292e0754 fix(synthesis): Select most common business_id to handle data leakage
Changed the business name query to ORDER BY COUNT(*) DESC before LIMIT 1
instead of taking an arbitrary LIMIT 1, so the most common business_id is
selected even when trace amounts of other business data leak into a job.
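The selection can be illustrated with a small query. This sketch uses sqlite3 from the standard library so it runs anywhere, rather than the project's PostgreSQL layer; the reviews table, its job_id column, and the dominant_business() helper are hypothetical names.

```python
import sqlite3

# Group by business, then take the one with the most rows for the job.
# The secondary sort key keeps the result deterministic when counts tie.
QUERY = """
SELECT business_id, COUNT(*) AS n
FROM reviews
WHERE job_id = ?
GROUP BY business_id
ORDER BY n DESC, business_id
LIMIT 1
"""


def dominant_business(conn: sqlite3.Connection, job_id: str):
    """Return the business_id with the most rows for a job, or None."""
    row = conn.execute(QUERY, (job_id,)).fetchone()
    return row[0] if row else None
```

A bare LIMIT 1 with no ORDER BY returns whichever row the database produces first, so a single leaked row could decide the report's business name.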

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-30 15:28:02 +00:00
Alejandro Gutiérrez
0a53e98bf9 fix(pipeline): Update stage result to use new synthesis fields
Reference action_matrix instead of the old 'actions' field.
Add issues_identified and health_score to stage result data.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-30 14:50:21 +00:00
Alejandro Gutiérrez
b4bef004e8 feat(synthesis): Redesign report to 6-section €60 business value format
Transform synthesis stage from consultant memo to productized report:

- New 6-section structure: Executive Summary, Risk Scorecard,
  Critical Issues, Strengths, Action Matrix, 90-Day Tracking
- Add chart data aggregation (7 charts: gauge, pies, trends)
- Evidence-grounded LLM prompt requiring quote citations
- Hyper-specific solution generation with WHAT/WHO/WHEN/WHY
- Taxonomy-guided solutions contextualized to business type
- Staff name extraction from quotes for leverage actions
- Success metrics tied to actual complaint keywords

Report sections now include:
- Health score (1-100) with revenue-at-risk estimates
- Risk indicators with color-coded scores and trends
- Critical issues with evidence, root cause, and specific solutions
- Strengths with named staff and specific marketing actions
- Action matrix with effort/impact quadrants and deadlines
- 90-day KPIs with 30/60/90 day targets

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-30 14:27:40 +00:00
Alejandro Gutiérrez
157b76040f fix(synthesis): Use correct column name 'id' instead of 'execution_id'
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-29 03:19:09 +00:00
Alejandro Gutiérrez
9b667e69a7 feat(pipeline): Add Stage 5 Synthesis for AI-generated narratives
- Add Stage5Synthesizer class that generates AI narratives and action plans
- Add generate() method to LLMClient for synthesis generation
- Integrate Stage 5 into pipeline runner after route stage
- Add synthesis JSONB column to pipeline.executions table
- Update reviewiq_analytics API to return synthesis data
- Synthesis includes: executive narrative, sentiment/category/timeline insights,
  action plan, marketing angles, and priority recommendations

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-29 03:12:53 +00:00
Alejandro Gutiérrez
c8ecb4b98f feat(reviewiq): Add AI synthesis support to dashboard components
Frontend:
- Add Synthesis type with action plan, insights, annotations
- ExecutiveSummary: Accept synthesis prop for AI narrative
- SentimentPie: Accept insight prop for contextual explanation
- IntensityHeatmap: Accept insight + highlightDomain props
- TimelineChart: Accept insight + annotations props
- All components gracefully degrade when synthesis is null

Backend:
- Add Stage 4: Synthesize for generating AI narratives
- Gathers context from classified spans
- Generates executive narrative, section insights, action plan
- Produces timeline annotations and marketing angles
- Stores synthesis in pipeline.executions table

Components show AI insights with purple gradient styling when available,
fall back to existing behavior when synthesis is not yet generated.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-29 02:59:47 +00:00
Alejandro Gutiérrez
824634aa76 feat: Add extensible multi-pipeline integration system
This commit implements a plugin-like pipeline architecture with:

Pipeline Core Package (packages/pipeline-core/):
- BasePipeline abstract class all pipelines implement
- PipelineRegistry for database-backed discovery/management
- PipelineRunner for execution with status tracking
- DashboardConfig contracts for dynamic widget definitions

Database Migration (006_pipeline_registry.sql):
- pipeline.registry table for registered pipelines
- pipeline.executions table for execution history
- Views for execution stats and monitoring

ReviewIQ Pipeline Refactor:
- Implements BasePipeline interface
- Adds get_dashboard_config() with widget definitions
- Adds get_widget_data() methods for all dashboard widgets
- Maintains backward compatibility with Pipeline alias
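The shape of this refactor might look like the sketch below. BasePipeline, get_dashboard_config(), get_widget_data(), and the Pipeline alias are named in this message; the method signatures, the run() method, the ReviewIQPipeline class name, and the widget payloads are assumptions and may not match the actual package.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict


class BasePipeline(ABC):
    """Contract every pipeline implements (signatures are assumed)."""

    @abstractmethod
    def run(self, payload: Dict[str, Any]) -> Dict[str, Any]:
        """Execute the pipeline and return a result summary."""

    @abstractmethod
    def get_dashboard_config(self) -> Dict[str, Any]:
        """Describe the widgets the dashboard should render."""

    @abstractmethod
    def get_widget_data(self, widget_id: str) -> Dict[str, Any]:
        """Return the data backing a single widget."""


class ReviewIQPipeline(BasePipeline):
    """Hypothetical concrete pipeline with placeholder data."""

    def run(self, payload: Dict[str, Any]) -> Dict[str, Any]:
        return {"status": "completed", "reviews": len(payload.get("reviews", []))}

    def get_dashboard_config(self) -> Dict[str, Any]:
        return {"widgets": [{"id": "total_reviews", "type": "StatCard"}]}

    def get_widget_data(self, widget_id: str) -> Dict[str, Any]:
        if widget_id != "total_reviews":
            raise KeyError(widget_id)
        return {"value": 0}


# Existing imports of `Pipeline` keep working after the rename.
Pipeline = ReviewIQPipeline
```

Because the abstract methods are enforced at instantiation time, a pipeline that forgets get_widget_data() fails as soon as it is constructed rather than when a dashboard first requests a widget.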

Generic Pipeline API (api/routes/pipelines.py):
- GET /api/pipelines - List all registered pipelines
- GET /api/pipelines/{id} - Pipeline details
- POST /api/pipelines/{id}/execute - Execute pipeline
- GET /api/pipelines/{id}/dashboard - Dashboard config
- GET /api/pipelines/{id}/widgets/{w} - Widget data
- GET /api/pipelines/{id}/executions - Execution history

Frontend Dynamic Dashboard System:
- DynamicDashboard component renders from config
- WidgetRegistry maps types to components
- Widget components: StatCard, LineChart, BarChart,
  PieChart, DataTable, Heatmap
- Pipeline API client library

Frontend Pipeline Pages:
- /pipelines - List all registered pipelines
- /pipelines/[id] - Dynamic dashboard for pipeline
- /pipelines/[id]/executions - Execution history
- Pipelines nav item in Sidebar

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-24 19:05:38 +00:00
Alejandro Gutiérrez
e2d7f6f118 feat: Add ScraperV1Adapter and real data pipeline test
- Add ScraperV1Adapter to transform scraped reviews into pipeline format
  - Handles relative timestamps (centerDate)
  - Generates deterministic IDs for DOM-sourced reviews
  - Filters out empty (rating-only) reviews

- Add sample barbershop reviews (79 reviews, 46 with text)
  - Real data from Las Palmas barbershop
  - Multi-language: Spanish, English, German, Norwegian, Italian

- Add test_pipeline_real_data.py for E2E testing with real data
  - Uses mock classifier based on keywords and rating
  - Full pipeline flow: raw -> enriched -> spans -> issues -> facts
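Two of the adapter behaviors listed above, deterministic IDs and filtering of rating-only reviews, can be sketched as follows. The field names (author, text, rating), the SHA-256 based scheme, and the 16-character truncation are assumptions for illustration; the message does not say how ScraperV1Adapter derives its IDs.

```python
import hashlib
from typing import Any, Dict, Iterable, List


def deterministic_review_id(author: str, text: str, rating: int) -> str:
    """Derive a stable ID from review content so re-scrapes map to the same row."""
    # The unit separator avoids collisions between adjacent fields.
    key = "\x1f".join([author.strip().lower(), text.strip(), str(rating)])
    return hashlib.sha256(key.encode("utf-8")).hexdigest()[:16]


def has_text(review: Dict[str, Any]) -> bool:
    """True when the review carries non-blank text, not just a rating."""
    return bool((review.get("text") or "").strip())


def adapt(reviews: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Drop rating-only reviews and attach a deterministic ID to the rest."""
    return [
        {**r, "review_id": deterministic_review_id(r.get("author", ""), r["text"], r.get("rating", 0))}
        for r in reviews
        if has_text(r)
    ]
```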

Test results with real data:
- 46 reviews processed
- 6 languages detected (es: 35, en: 7, de: 1, no: 1, it: 1, ca: 1)
- 3 issues identified from negative reviews
- 29 fact records aggregated across date range 2017-2025

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-24 18:35:09 +00:00
Alejandro Gutiérrez
03ed7029e2 feat: Add decoupled pipeline schema with separate PostgreSQL namespace
- Create consolidated migration (005_create_pipeline_schema.sql) with
  'pipeline' schema for all classification tables
- Update pipeline repositories to use schema prefix (pipeline.*)
- Add run_migrations() method to DatabaseManager
- Add CLI tool for running versioned migrations

Tables created in pipeline schema:
- reviews_raw, reviews_enriched (Stage 1)
- review_spans (Stage 2)
- issues, issue_spans, issue_events (Stage 3)
- fact_timeseries (Stage 4)
- urt_domains, urt_categories (taxonomy lookup)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-24 18:17:20 +00:00
Alejandro Gutiérrez
7d720f5378 feat: Add reviewiq-pipeline package for LLM-powered review classification
Implement a standalone Python package for processing customer reviews through
a 4-stage pipeline using URT (Universal Review Taxonomy) v5.1:

- Stage 1: Normalization (text cleaning, language detection, deduplication)
- Stage 2: LLM Classification (OpenAI/Anthropic span extraction with URT codes)
- Stage 3: Issue Routing (deterministic issue ID generation, span linking)
- Stage 4: Fact Aggregation (time series metrics for dashboards)

Package includes:
- TypedDict contracts matching Pipeline-Contracts-v1.md
- Async database layer with asyncpg and 5 SQL migrations
- LLM client abstraction supporting both OpenAI and Anthropic
- Sentence-transformers integration for embeddings
- Validation rules V1.x through V4.x
- CLI commands: migrate, run, validate, check
- 55 unit and integration tests (all passing)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-24 18:07:11 +00:00