# Reputation Report Pipeline

**Version:** 1.0 (v8)
**Status:** Production-ready
**Location:** `packages/reviewiq-pipeline/scripts/reputation_report.py`

## Overview

The Reputation Report generates business-facing, time-windowed reputation analytics from classified review spans. It produces a €50-value report suitable for SMB business owners, including:

- Overall performance score (0-100 scale)
- Domain and primitive breakdowns
- Positive and negative drivers with evidence
- Time comparisons (current vs previous period)
- Sector benchmarks
- Timeline visualization data
- LLM-generated executive summary

## Quick Start

```bash
# Basic usage (last 365 days)
python scripts/reputation_report.py --business "Go Karts Mar Menor" --days 365

# With output file
python scripts/reputation_report.py --business "Business Name" --days 30 --output report.json

# Custom date range
python scripts/reputation_report.py --business "Business Name" --start 2025-01-01 --end 2025-12-31

# Production mode (fail if LLM summary fails)
python scripts/reputation_report.py --business "Business Name" --days 365 --require-summary
```

## CLI Options

| Option | Description | Default |
|--------|-------------|---------|
| `--business` | Business ID or search pattern (required) | - |
| `--days` | Last N days to analyze | 30 |
| `--start` | Window start (ISO-8601) | - |
| `--end` | Window end (ISO-8601) | - |
| `--run-id` | Specific run ID (overrides time window) | - |
| `--timezone` | IANA timezone for window | UTC |
| `--output, -o` | Output file path | stdout |
| `--quiet, -q` | Suppress console summary | false |
| `--no-summary` | Disable executive summary | false |
| `--require-summary` | Exit code 2 if LLM fails | false |
| `--summary-model` | LLM model for summary | gpt-4o-mini |

## Data Flow

```
┌─────────────────────────────────────────────────────────────────┐
│                              INPUT                              │
├─────────────────────────────────────────────────────────────────┤
│  detected_spans_v2 ←──JOIN──→ review_facts_v1                   │
│  (primitives, valence,        (review_time_utc, rating,         │
│  confidence, intensity)       business_id)                      │
└─────────────────────────────────────────────────────────────────┘
                                 │
                                 ▼
┌─────────────────────────────────────────────────────────────────┐
│                         SPAN SELECTION                          │
├─────────────────────────────────────────────────────────────────┤
│  Mode: time_window                                              │
│    → Filter by review_time_utc in [start, end)                  │
│    → Join on (review_id, business_id) for data isolation        │
│                                                                 │
│  Mode: latest_run                                               │
│    → Filter by run_id                                           │
└─────────────────────────────────────────────────────────────────┘
                                 │
                                 ▼
┌─────────────────────────────────────────────────────────────────┐
│                           COMPUTATION                           │
├─────────────────────────────────────────────────────────────────┤
│  1. Population stats (review count, language distribution)      │
│  2. Overall score: 100 × Σ(valence × conf × intensity) / Σ(...) │
│  3. Domain scores (O/P/J/E/V weighted averages)                 │
│  4. Primitive scores (per-primitive breakdown)                  │
│  5. Drivers (impact = weighted share of total)                  │
│  6. Alerts (SAFETY, UNMAPPED thresholds)                        │
│  7. Recommendations (templated playbooks)                       │
└─────────────────────────────────────────────────────────────────┘
                                 │
                                 ▼
┌─────────────────────────────────────────────────────────────────┐
│                        TIME COMPARISONS                         │
├─────────────────────────────────────────────────────────────────┤
│  Previous Window:                                               │
│    → Same duration, immediately preceding current               │
│    → Requires MIN_REVIEWS_FOR_COMPARISON (10)                   │
│    → Requires MIN_COVERAGE_FOR_COMPARISON (80%)                 │
│                                                                 │
│  Sector Benchmark:                                              │
│    → Requires 500+ spans, 3+ businesses in sector               │
│    → Status: ok | insufficient_data | missing_sector_code       │
└─────────────────────────────────────────────────────────────────┘
                                 │
                                 ▼
┌─────────────────────────────────────────────────────────────────┐
│                        EXECUTIVE SUMMARY                        │
├─────────────────────────────────────────────────────────────────┤
│  LLM-generated (gpt-4o-mini) with narrative guardrails:         │
│                                                                 │
│  Weakness Priority:                                             │
│    1. Negative driver (if drivers.negatives non-empty)          │
│    2. Qualifying dip (within 90d, review_count ≥ 3)             │
│    3. None ("no persistent weaknesses surfaced")                │
│                                                                 │
│  Guardrails:                                                    │
│    - No "recent dip" + "no major issues" contradiction          │
│    - Most recent qualifying dip if multiple exist               │
│    - Action must tie to cited weakness or top positive          │
│                                                                 │
│  Fallback: Deterministic summary if LLM unavailable             │
└─────────────────────────────────────────────────────────────────┘
                                 │
                                 ▼
┌─────────────────────────────────────────────────────────────────┐
│                             OUTPUT                              │
├─────────────────────────────────────────────────────────────────┤
│  JSON Report (schema_version: 1.0)                              │
│    - business, window, population                               │
│    - scores (overall, domains, primitives)                      │
│    - drivers (positives, negatives with evidence)               │
│    - alerts, recommendations                                    │
│    - comparisons (previous_window, sector_benchmark)            │
│    - timeline (granularity, points)                             │
│    - executive_summary, executive_summary_meta                  │
└─────────────────────────────────────────────────────────────────┘
```

## Output Schema

### Top-Level Fields

```json
{
  "schema_version": "1.0",
  "report_id": "uuid",
  "primary_run_id": "uuid | null",
  "generated_at": "2026-01-31T12:00:00Z",
  "window": { "start", "end", "timezone", "mode" },
  "business": { "business_id", "sector_code", "gbp_path" },
  "population": { ... },
  "scores": { "overall", "domains", "primitives" },
  "drivers": { "positives", "negatives" },
  "alerts": [ ... ],
  "recommendations": [ ... ],
  "comparisons": { "previous_window", "sector_benchmark" },
  "timeline": { "granularity", "points" },
  "executive_summary": "string | null",
  "executive_summary_meta": { "enabled", "generated", "model", "error", "fallback_used" }
}
```

### Scores Structure

```json
{
  "overall": {
    "score": 85.3,
    "score_domain_weighted": 85.7,
    "positive_share": 0.897,
    "negative_share": 0.077,
    "mixed_share": 0.013,
    "neutral_share": 0.013
  },
  "domains": {
    "O": { "score": 100.0, "volume": 10 },
    "P": { "score": 86.2, "volume": 17 },
    "J": { "score": -23.4, "volume": 5 },
    "E": { "score": 94.8, "volume": 35 },
    "V": { "score": 100.0, "volume": 10 }
  },
  "primitives": {
    "VALUE_FOR_MONEY": {
      "domain": "V",
      "score": 100.0,
      "volume": 10,
      "valence_counts": { "+": 10, "-": 0, "0": 0, "±": 0 },
      "top_entities": [ ... ]
    }
  }
}
```

### Driver Structure

```json
{
  "positives": [
    {
      "primitive": "VALUE_FOR_MONEY",
      "impact": 0.147,
      "summary": "Positive V/VALUE_FOR_MONEY mentions.",
      "evidence": [
        {
          "review_id": "abc123",
          "language": "en",
          "span_text": "the prices are super affordable.",
          "valence": "+",
          "intensity": 2,
          "confidence": 0.9
        }
      ]
    }
  ],
  "negatives": [ ... ]
}
```

### Timeline Structure

```json
{
  "granularity": "month",
  "points": [
    {
      "bucket_start_utc": "2025-12-01T00:00:00Z",
      "review_count": 8,
      "span_count": 25,
      "positive_count": 15,
      "negative_count": 8,
      "avg_rating": 2.88,
      "strength_score": -32.6
    }
  ]
}
```

## Production Guardrails

### Data Isolation

All queries join `detected_spans_v2` with `review_facts_v1` on **both** `review_id` AND `business_id` to prevent cross-business contamination:

```sql
JOIN pipeline.review_facts_v1 f
  ON f.review_id = s.review_id
 AND f.business_id = s.business_id
```

### Score Consistency

An invariant check ensures `scores.overall.score` matches `comparisons.previous_window.scores.overall.current`. If delta > 1.0, an `internal_inconsistency` alert is emitted.

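As a rough illustration, the invariant could be checked along the following lines. This is a minimal sketch, not the pipeline's actual code: the function name `check_score_consistency`, the constant `SCORE_DELTA_TOLERANCE`, and the shape of the returned alert are all assumptions made for the example. Only the two field paths, the 1.0 tolerance, and the alert name come from the description above.

```python
from typing import Optional

# Tolerance from the text above: a delta greater than 1.0 raises the alert.
SCORE_DELTA_TOLERANCE = 1.0


def check_score_consistency(report: dict) -> Optional[dict]:
    """Return a hypothetical internal_inconsistency alert, or None if scores agree."""
    overall = report["scores"]["overall"]["score"]

    # Walk comparisons.previous_window.scores.overall.current, tolerating
    # reports where the comparison block is absent or null.
    previous = (report.get("comparisons") or {}).get("previous_window") or {}
    current = ((previous.get("scores") or {}).get("overall") or {}).get("current")
    if current is None:
        return None  # nothing to compare against

    delta = abs(overall - current)
    if delta > SCORE_DELTA_TOLERANCE:
        # Alert shape is an assumption; only the alert name comes from the docs.
        return {
            "type": "internal_inconsistency",
            "message": f"overall score {overall} vs comparison value {current}",
            "delta": round(delta, 2),
        }
    return None


consistent = {
    "scores": {"overall": {"score": 85.3}},
    "comparisons": {"previous_window": {"scores": {"overall": {"current": 85.3}}}},
}
drifted = {
    "scores": {"overall": {"score": 85.3}},
    "comparisons": {"previous_window": {"scores": {"overall": {"current": 82.0}}}},
}

print(check_score_consistency(consistent))       # no alert expected
print(check_score_consistency(drifted)["type"])  # alert expected
```

Returning the alert instead of raising keeps the check non-fatal, which fits the description of an alert being emitted alongside the report.
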
### Executive Summary Meta

```json
{
  "enabled": true,
  "generated": true,
  "model": "gpt-4o-mini",
  "error": null,
  "generated_at": "2026-01-31T12:00:00Z",
  "fallback_used": false
}
```

- `enabled`: Whether summary generation was requested
- `generated`: Whether LLM successfully produced a summary
- `error`: Error message if generation failed
- `fallback_used`: Whether deterministic fallback was used

### Exit Codes

| Code | Meaning |
|------|---------|
| 0 | Success |
| 1 | Business not found or no spans |
| 2 | `--require-summary` and LLM failed |

## Scoring Formula

### Overall Score

Same formula as `PERIOD_SCORES_QUERY` for consistency:

```
score = 100 × Σ(valence × confidence × intensity) / Σ(confidence × intensity)
```

Where:

- valence: +1 for positive, -1 for negative, 0 for neutral/mixed
- confidence: 0.0 to 1.0
- intensity: 1 to 3

### Domain-Weighted Score

Alternative metric (exposed as `score_domain_weighted`):

```
score = Σ(domain_score × domain_volume) / Σ(domain_volume)
```

### Primitive Score

```
score = 100 × Σ(w × valence_num) / Σ(w)
w = confidence × (0.75 + 0.25×(detail-1)) × (0.8 + 0.2×(intensity-1))
```

## Thresholds

| Threshold | Value | Purpose |
|-----------|-------|---------|
| MIN_REVIEWS_FOR_COMPARISON | 10 | Minimum reviews per period for trend |
| MIN_COVERAGE_FOR_COMPARISON | 0.80 | Minimum review_time coverage |
| Sector benchmark spans | 500 | Minimum sector spans for benchmark |
| Sector benchmark businesses | 3 | Minimum businesses in sector |
| UNMAPPED rate warn | 0.10 | Alert if >10% unmapped |
| UNMAPPED rate critical | 0.15 | Critical alert if >15% unmapped |
| SAFETY negative warn | 0.05 | Alert if >5% SAFETY negative |
| SAFETY negative critical | 0.10 | Critical alert if >10% SAFETY negative |
| Dip recency | 90 days | Maximum age for "recent" dip |
| Dip volume | 3 reviews | Minimum reviews to qualify as dip |

## Domain Mapping

| Domain | Code | Primitives |
|--------|------|------------|
| Output/Product | O | TASTE, CRAFT, FRESHNESS, TEMPERATURE, EFFECTIVENESS, ACCURACY, CONDITION, CONSISTENCY |
| People/Service | P | MANNER, COMPETENCE, ATTENTIVENESS, COMMUNICATION |
| Journey/Process | J | SPEED, FRICTION, RELIABILITY, AVAILABILITY |
| Environment | E | CLEANLINESS, COMFORT, SAFETY, AMBIANCE, ACCESSIBILITY, DIGITAL_UX |
| Value | V | PRICE_LEVEL, PRICE_FAIRNESS, PRICE_TRANSPARENCY, VALUE_FOR_MONEY |
| Meta | meta | HONESTY, ETHICS, PROMISES, ACKNOWLEDGMENT, RESPONSE_QUALITY, RECOVERY, RETURN_INTENT, RECOMMEND, RECOGNITION, UNMAPPED, NON_INFORMATIVE |

## Testing

```bash
cd packages/reviewiq-pipeline
python -m pytest tests/test_executive_summary.py -v
```

16 tests covering:

- Negative driver priority over dips
- Qualifying dip selection (90 days + review_count ≥ 3)
- Most recent dip when multiple qualify
- Contradiction detection (dip + "no major issues")
- Non-qualifying dips not cited as "recent"
- Summary input construction

## Environment Variables

| Variable | Required | Description |
|----------|----------|-------------|
| DATABASE_URL | Yes | PostgreSQL connection string |
| OPENAI_API_KEY | No | Required for LLM summary (fallback used if missing) |

## Example Output

```bash
$ python scripts/reputation_report.py --business "Go Karts Mar Menor" --days 365 --quiet
Report written to stdout

============================================================
REPUTATION REPORT: Go Karts Mar Menor
============================================================
Window: 2025-01-31T12:00:00Z - 2026-01-31T12:00:00Z
Reviews: 27
Content spans: 78

Overall score: 85.3
Positive share: 89.7%
Negative share: 7.7%

Top positive drivers:
  VALUE_FOR_MONEY: 14.7% impact
  RECOMMEND: 14.5% impact
  MANNER: 13.5% impact

Top negative drivers:

============================================================
```

## Changelog

### v8 (2026-01-31)

- Initial production release
- Cross-business join safety
- Score formula alignment
- Executive summary with narrative guardrails
- Comprehensive test suite
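
## Appendix: Scoring Sketch

The overall and primitive score formulas from the Scoring Formula section can be illustrated with a short, self-contained sketch. It is not the pipeline's implementation: the function names, the span dictionary layout, and the `detail` field (which the primitive weight formula uses but this document does not define) are assumptions made for the example. The valence mapping follows the `valence_counts` keys shown in the Scores Structure example.

```python
# Numeric valence per the scoring section: neutral and mixed both count as 0.
VALENCE_NUM = {"+": 1, "-": -1, "0": 0, "±": 0}


def overall_score(spans):
    """100 × Σ(valence × confidence × intensity) / Σ(confidence × intensity)."""
    num = sum(VALENCE_NUM[s["valence"]] * s["confidence"] * s["intensity"] for s in spans)
    den = sum(s["confidence"] * s["intensity"] for s in spans)
    return 100.0 * num / den if den else None


def primitive_weight(confidence, detail, intensity):
    """w = confidence × (0.75 + 0.25×(detail-1)) × (0.8 + 0.2×(intensity-1))."""
    return confidence * (0.75 + 0.25 * (detail - 1)) * (0.8 + 0.2 * (intensity - 1))


def primitive_score(spans):
    """100 × Σ(w × valence_num) / Σ(w) over the spans of one primitive."""
    weights = [primitive_weight(s["confidence"], s["detail"], s["intensity"]) for s in spans]
    num = sum(w * VALENCE_NUM[s["valence"]] for w, s in zip(weights, spans))
    den = sum(weights)
    return 100.0 * num / den if den else None


# Made-up spans for illustration only; `detail` is an assumed numeric field.
spans = [
    {"valence": "+", "confidence": 0.9, "intensity": 2, "detail": 2},
    {"valence": "-", "confidence": 0.8, "intensity": 1, "detail": 1},
    {"valence": "0", "confidence": 1.0, "intensity": 1, "detail": 1},
]

print(round(overall_score(spans), 1))    # 27.8
print(round(primitive_score(spans), 1))  # 21.2
```

Both functions return `None` for an empty span list instead of dividing by zero; how the real script handles that case is not stated here.
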