Alejandro Gutiérrez 5324542e72 feat(reputation-report): Add production-grade reputation report generator v8

Production fixes:
- Cross-business join safety: all queries join on (review_id, business_id)
- Timestamp normalization: iso_z() for all output timestamps
- Score formula alignment: matches PERIOD_SCORES_QUERY for consistency
- Invariant check: fails if scores.overall != comparisons.current
- primary_run_id: uses max(created_at) in time_window mode
- Language normalization: auto/auto-detect -> unknown
- Review language: majority voting over spans per review
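
As a rough illustration of the timestamp and language items above, here is a minimal sketch. Only the name `iso_z` comes from the commit message; its body, the helper names `normalize_language` and `review_language`, the treatment of naive timestamps as UTC, and the tie-breaking rule are assumptions, not the repository's actual code.

```python
from collections import Counter
from datetime import datetime, timezone
from typing import List, Optional


def iso_z(ts: datetime) -> str:
    """Render a timestamp as UTC ISO 8601 with a trailing 'Z' (assumed behaviour)."""
    if ts.tzinfo is None:
        # Assumption: naive timestamps are already expressed in UTC.
        ts = ts.replace(tzinfo=timezone.utc)
    return ts.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def normalize_language(code: Optional[str]) -> str:
    """Map detector placeholders ('auto', 'auto-detect') to 'unknown'."""
    value = (code or "").strip().lower()
    # Assumption: empty or missing codes are also treated as 'unknown'.
    return "unknown" if value in {"", "auto", "auto-detect"} else value


def review_language(span_languages: List[str]) -> str:
    """Pick one language per review by majority vote over its spans."""
    votes = Counter(normalize_language(lang) for lang in span_languages)
    if not votes:
        return "unknown"
    # Ties resolve to the language seen first (an assumption of this sketch).
    return votes.most_common(1)[0][0]
```

With these helpers, a review whose spans are tagged `["en", "en", "es"]` resolves to `en`, and a review with no spans resolves to `unknown`.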

Executive summary guardrails:
- Weakness priority: negative driver > qualifying dip > none
- Dip qualification: within 90 days AND review_count >= 3
- Most recent dip selection when multiple qualify
- No contradiction: "dip" cannot pair with "no major issues"
- Action grounding: must tie to cited weakness or top positive driver
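
The priority and qualification rules above could be sketched roughly as follows. The `Dip` type, its field names, and the function names are hypothetical, and "within 90 days" is assumed to be measured back from the report date; the commit message does not say which reference date is used.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional, Sequence, Tuple


@dataclass
class Dip:
    period_end: date   # hypothetical field: last day of the dipping period
    review_count: int  # number of reviews observed in that period


def qualifying_dip(dips: Sequence[Dip], today: date) -> Optional[Dip]:
    """Return the most recent dip within 90 days that has at least 3 reviews."""
    cutoff = today - timedelta(days=90)
    candidates = [d for d in dips if d.period_end >= cutoff and d.review_count >= 3]
    return max(candidates, key=lambda d: d.period_end, default=None)


def pick_weakness(
    negative_driver: Optional[str], dips: Sequence[Dip], today: date
) -> Tuple[str, Optional[object]]:
    """Apply the priority order: negative driver > qualifying dip > none."""
    if negative_driver:
        return ("negative_driver", negative_driver)
    dip = qualifying_dip(dips, today)
    if dip is not None:
        return ("dip", dip)
    return ("none", None)
```

Returning the kind of weakness alongside its value lets a later check reject a summary that mentions a dip while also claiming "no major issues".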

CLI options:
- --no-summary: disable executive summary
- --require-summary: exit code 2 if LLM fails
- --summary-model: configurable model (default gpt-4o-mini)
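
A minimal `argparse` sketch of how these three flags might be wired up. The flag names, the default model, and exit code 2 come from the list above; the function names and the behaviour when `--no-summary` and `--require-summary` are combined are assumptions.

```python
import argparse
from typing import Optional


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Reputation report generator (sketch)")
    parser.add_argument("--no-summary", action="store_true",
                        help="disable the executive summary")
    parser.add_argument("--require-summary", action="store_true",
                        help="exit with code 2 if the LLM summary fails")
    parser.add_argument("--summary-model", default="gpt-4o-mini",
                        help="model used for the executive summary")
    return parser


def exit_code(args: argparse.Namespace, summary: Optional[str]) -> int:
    """Return 2 when a required summary is missing, otherwise 0."""
    # Assumption: --no-summary takes precedence over --require-summary.
    if args.require_summary and not args.no_summary and summary is None:
        return 2
    return 0
```

A caller would pass the result to `sys.exit(exit_code(args, summary))` once report generation has finished.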

Includes unit test suite (16 tests) for narrative guardrails.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

2026-01-31 23:10:25 +00:00

scripts

feat(reputation-report): Add production-grade reputation report generator v8

2026-01-31 23:10:25 +00:00

src/reviewiq_pipeline

fix(synthesis): Calculate analysis_period from actual data range

2026-01-30 15:49:51 +00:00

tests

feat(reputation-report): Add production-grade reputation report generator v8

2026-01-31 23:10:25 +00:00

pyproject.toml

feat: Add extensible multi-pipeline integration system

2026-01-24 19:05:38 +00:00

README.md

feat: Add reviewiq-pipeline package for LLM-powered review classification

2026-01-24 18:07:11 +00:00

README.md

ReviewIQ Pipeline

LLM-powered review classification and analysis pipeline using URT (Universal Review Taxonomy) v5.1.

Features

Stage 1: Normalization - Text cleaning, language detection, deduplication
Stage 2: LLM Classification - Span extraction with URT codes using OpenAI/Anthropic
Stage 3: Issue Routing - Route negative spans to issues for tracking
Stage 4: Fact Aggregation - Pre-aggregate metrics for dashboard queries
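
To make Stage 1 more concrete, here is a minimal sketch of text cleaning and exact-duplicate removal. It is not the package's implementation: the function names are hypothetical, language detection is left out, and hashing the case-folded text is just one possible deduplication strategy.

```python
import hashlib
import re
import unicodedata
from typing import Iterable, List


def clean_text(text: str) -> str:
    """Normalize Unicode forms and collapse runs of whitespace."""
    text = unicodedata.normalize("NFKC", text)
    return re.sub(r"\s+", " ", text).strip()


def dedupe(reviews: Iterable[str]) -> List[str]:
    """Keep the first occurrence of each review, comparing cleaned, case-folded text."""
    seen = set()
    unique = []
    for raw in reviews:
        cleaned = clean_text(raw)
        digest = hashlib.sha256(cleaned.casefold().encode("utf-8")).hexdigest()
        # Skip empty reviews and anything already seen.
        if cleaned and digest not in seen:
            seen.add(digest)
            unique.append(cleaned)
    return unique
```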

Installation

pip install reviewiq-pipeline

Or install from source:

pip install -e packages/reviewiq-pipeline

Quick Start

Python API

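import asyncio

from reviewiq_pipeline import Pipeline, Config


# The pipeline methods are coroutines, so they must be awaited inside an
# async function. scraper_output, business_id, date, and job_id are supplied
# by the caller.
async def main(scraper_output, business_id, date, job_id):
    # Initialize (replace the placeholder values with real settings)
    config = Config(
        database_url="postgresql://...",
        llm_provider="openai",
        llm_api_key="sk-...",
        taxonomy_version="v5.1",
    )
    pipeline = Pipeline(config)

    # Run the full pipeline
    result = await pipeline.process(scraper_output)

    # Or run individual stages, passing each result to the next stage
    stage1_result = await pipeline.normalize(scraper_output)
    stage2_result = await pipeline.classify(stage1_result)
    stage3_result = await pipeline.route(stage2_result)
    stage4_result = await pipeline.aggregate(business_id, date)

    # Validate
    validation = await pipeline.validate(job_id)
    return result, validation


# From synchronous code:
# asyncio.run(main(scraper_output, business_id, date, job_id))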

CLI

# Run migrations
reviewiq-pipeline migrate --database-url $DATABASE_URL

# Process a job
reviewiq-pipeline run --job-id <UUID> --stages 1,2,3,4

# Validate pipeline output
reviewiq-pipeline validate --job-id <UUID>

Configuration

Environment variables:

DATABASE_URL - PostgreSQL connection string
LLM_PROVIDER - openai or anthropic
OPENAI_API_KEY - OpenAI API key (if using OpenAI)
ANTHROPIC_API_KEY - Anthropic API key (if using Anthropic)
TAXONOMY_VERSION - URT taxonomy version (default: v5.1)
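
One way to read these variables into a typed settings object is sketched below. The `EnvSettings` class and `load_settings` function are hypothetical and are not the package's `Config`; the sketch only shows which key applies to which provider and where the `v5.1` default comes in.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class EnvSettings:  # hypothetical name, not part of reviewiq_pipeline
    database_url: str
    llm_provider: str
    llm_api_key: str
    taxonomy_version: str


def load_settings(env: Mapping[str, str] = os.environ) -> EnvSettings:
    """Read the documented environment variables and fail early if any are missing."""
    provider = env.get("LLM_PROVIDER", "").lower()
    if provider not in {"openai", "anthropic"}:
        raise ValueError("LLM_PROVIDER must be 'openai' or 'anthropic'")
    # Only the key for the selected provider is required.
    key_var = "OPENAI_API_KEY" if provider == "openai" else "ANTHROPIC_API_KEY"
    missing = [name for name in ("DATABASE_URL", key_var) if not env.get(name)]
    if missing:
        raise ValueError("missing environment variables: " + ", ".join(missing))
    return EnvSettings(
        database_url=env["DATABASE_URL"],
        llm_provider=provider,
        llm_api_key=env[key_var],
        taxonomy_version=env.get("TAXONOMY_VERSION", "v5.1"),
    )
```

Passing the environment in as a mapping keeps the function easy to test with a plain dictionary instead of the real process environment.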

Development

# Install with dev dependencies
pip install -e "packages/reviewiq-pipeline[dev]"

# Run tests
pytest

# Run with coverage
pytest --cov=reviewiq_pipeline

# Type checking
mypy src/reviewiq_pipeline

# Linting
ruff check src/reviewiq_pipeline

License

MIT