Commit Graph

2 Commits

Author SHA1 Message Date
Alejandro Gutiérrez
03ed7029e2 feat: Add decoupled pipeline schema with separate PostgreSQL namespace
- Create consolidated migration (005_create_pipeline_schema.sql) with
  'pipeline' schema for all classification tables
- Update pipeline repositories to use schema prefix (pipeline.*)
- Add run_migrations() method to DatabaseManager
- Add CLI tool for running versioned migrations

Tables created in pipeline schema:
- reviews_raw, reviews_enriched (Stage 1)
- review_spans (Stage 2)
- issues, issue_spans, issue_events (Stage 3)
- fact_timeseries (Stage 4)
- urt_domains, urt_categories (taxonomy lookup)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-24 18:17:20 +00:00
Alejandro Gutiérrez
7d720f5378 feat: Add reviewiq-pipeline package for LLM-powered review classification
Implement a standalone Python package for processing customer reviews through
a 4-stage pipeline using URT (Universal Review Taxonomy) v5.1:

- Stage 1: Normalization (text cleaning, language detection, deduplication)
- Stage 2: LLM Classification (OpenAI/Anthropic span extraction with URT codes)
- Stage 3: Issue Routing (deterministic issue ID generation, span linking)
- Stage 4: Fact Aggregation (time series metrics for dashboards)

Package includes:
- TypedDict contracts matching Pipeline-Contracts-v1.md
- Async database layer with asyncpg and 5 SQL migrations
- LLM client abstraction supporting both OpenAI and Anthropic
- Sentence-transformers integration for embeddings
- Validation rules V1.x through V4.x
- CLI commands: migrate, run, validate, check
- 55 unit and integration tests (all passing)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-24 18:07:11 +00:00