feat(pipeline): Add Stage 5 Synthesis for AI-generated narratives

- Add Stage5Synthesizer class that generates AI narratives and action plans
- Add generate() method to LLMClient for synthesis generation
- Integrate Stage 5 into pipeline runner after route stage
- Add synthesis JSONB column to pipeline.executions table
- Update reviewiq_analytics API to return synthesis data
- Synthesis includes: executive narrative, sentiment/category/timeline insights,
  action plan, marketing angles, and priority recommendations

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Alejandro Gutiérrez
2026-01-29 03:12:53 +00:00
parent c8ecb4b98f
commit 9b667e69a7
5 changed files with 3129 additions and 67 deletions


@@ -29,28 +29,205 @@ Your task is to extract semantic spans from customer reviews and classify each s
## SPAN EXTRACTION RULES
-1. **Split on contrasting conjunctions**: but, however, although, despite, yet, though
-2. **Split on topic/target change**: food → service → bathroom = 3 spans
-3. **Split on valence change**: positive → negative = split
-4. **Split on domain change**: O (Offering) → J (Journey) → E (Environment) = split
-5. **Keep together**: cause→effect within same feedback unit ("X because Y" = 1 span)
**CRITICAL: Use TOPIC-BASED splitting, NOT sentence-based splitting.**
A span = all consecutive text about the SAME topic/domain, regardless of sentence count.
### When to KEEP TOGETHER (same span):
- Multiple sentences about the same topic: "The food was great. I loved the pasta. The sauce was perfect." → ONE span (all about Offering)
- Cause and effect: "The wait was long because they were understaffed" → ONE span
- Elaboration: "Staff was rude. They ignored us for 20 minutes." → ONE span (both about People)
- Single-topic reviews: Even if 5 sentences, if all about food → ONE span
### When to SPLIT (separate spans):
- Contrasting conjunctions that change topic: "Food was great BUT service was slow" → TWO spans
- Domain change: food (O) → staff (P) → ambiance (E) = split at each change
- Target change: "The waiter was nice but the manager was rude" → TWO spans (different people)
### Examples:
- "Amazing food. Best burger ever. Fries were crispy too." → 1 span (all Offering, V+)
- "Food was great but we waited an hour." → 2 spans (Offering V+, Journey V-)
- "I've been coming here for years. Always consistent quality." → 1 span (Relationship)
- "The staff are lovely and amazing with kids. More highchairs are definitely needed though." → 2 spans (People V+, Access V-)
**Guardrails**:
-- Max 3 spans per sentence (if 4+, re-check for over-splitting)
-- Min 1 span per review (even single-word reviews)
-- Spans must be non-overlapping and cover meaningful content
- Prefer FEWER, LARGER spans over many small ones
- Most reviews should have 1-3 spans, rarely more
- Min 1 span per review
- Spans must be non-overlapping
-## URT DOMAINS (Tier-3 codes: X#.##)
+## URT TAXONOMY - COMPLETE (138 codes, use EXACT codes)
-| Domain | Code | Description |
-|--------|------|-------------|
-| Offering | O1-O4 | Product/service quality, features, variety |
-| Price | P1-P4 | Value, pricing, promotions, payment |
-| Journey | J1-J4 | Timing, process, convenience, accessibility |
-| Environment | E1-E4 | Physical space, ambiance, cleanliness, digital UX |
-| Attitude | A1-A4 | Staff behavior, helpfulness, professionalism |
-| Voice | V1-V4 | Brand, communication, marketing, transparency |
-| Relationship | R1-R4 | Loyalty, trust, consistency, personalization |
### O - OFFERING (Product/Service Quality) - 18 codes
O1.01 Works/Doesn't Work: Basic functionality success or failure
O1.02 Performance Level: How well it operates
O1.03 Durability: Longevity and resistance to wear
O1.04 Reliability: Consistency of function over time
O1.05 Outcome Achievement: Did customer accomplish their goal?
O2.01 Materials/Inputs: Quality of components or ingredients
O2.02 Craftsmanship: Skill of construction or execution
O2.03 Presentation: Visual and aesthetic quality
O2.04 Attention to Detail: Finishing touches and refinement
O2.05 Condition at Delivery: State when received
O3.01 All Components Present: Nothing missing from what was promised
O3.02 Feature Availability: Promised features actually work
O3.03 Scope Delivery: Full scope of work completed
O3.04 Documentation: Supporting materials provided
O4.01 Specification Match: Matches what was ordered
O4.02 Personalization: Adapted to individual preferences
O4.03 Flexibility: Can be modified or adjusted
O4.04 Appropriateness: Right solution for the need
### P - PEOPLE (Staff Interactions) - 20 codes
P1.01 Warmth: Friendly and welcoming manner
P1.02 Respect: Treated with dignity
P1.03 Patience: Calm and tolerant approach
P1.04 Enthusiasm: Energy and engagement
P1.05 Empathy: Understanding feelings
P2.01 Knowledge: Expertise and understanding
P2.02 Skill: Technical ability
P2.03 Problem Solving: Ability to find solutions
P2.04 Advice Quality: Helpful recommendations
P2.05 Training Level: Staff training evident
P3.01 Attentiveness: Being present and engaged
P3.02 Initiative: Proactive help
P3.03 Follow-through: Completing promised actions
P3.04 Availability: Being available when needed
P3.05 Dedication: Commitment to helping
P4.01 Clarity: Clear communication
P4.02 Listening: Understanding customer needs
P4.03 Transparency: Honest and open
P4.04 Honesty: Truthful communication
P4.05 Proactive Updates: Keeping customer informed
### J - JOURNEY (Process & Timing) - 20 codes
J1.01 Speed: How fast things happen
J1.02 Punctuality: On-time delivery
J1.03 Queue Management: Handling of waiting customers
J1.04 Punctuality: Meeting scheduled times
J1.05 Pacing: Appropriate speed (not rushed/dragged)
J2.01 Simplicity: Easy process
J2.02 Friction: Obstacles encountered
J2.03 Navigation: Finding what you need
J2.04 Booking Availability: Slots/capacity when needed
J2.05 Inventory: Stock availability
J3.01 Consistency: Same experience every time
J3.02 Accuracy: Getting it right
J3.03 Uptime: System availability
J3.04 Data Accuracy: Correct info in systems
J3.05 Integration: Systems work together
J4.01 Problem Recognition: Acknowledging issues
J4.02 Resolution Speed: How fast problems get fixed
J4.03 Resolution Fairness: Fair handling of issues
J4.04 Escalation: Getting to right person
J4.05 Closure: Issue fully resolved
### E - ENVIRONMENT (Physical & Digital Space) - 20 codes
E1.01 Cleanliness: How clean the space is
E1.02 Comfort: Physical comfort
E1.03 Space Design: Layout and organization
E1.04 Ambiance: Atmosphere and vibe
E1.05 Comfort: Physical comfort
E2.01 Lighting: Light quality and level
E2.02 Sound/Noise: Audio environment
E2.03 Temperature: Climate control
E2.04 Visual Design: Aesthetics of interface
E2.05 Mobile Experience: Mobile usability
E3.01 Interface Design: Digital UX/UI
E3.02 App/Website Speed: Digital performance
E3.03 Usability: Ease of digital use
E3.04 Health Safety: Health precautions
E3.05 Cyber Security: Digital security
E4.01 Safety: Physical safety
E4.02 Security: Protection of belongings/data
E4.03 Health/Hygiene: Health standards
E4.04 Social Responsibility: Ethical practices
E4.05 Community Impact: Local community effect
### A - ACCESS (Availability & Accessibility) - 20 codes
A1.01 Hours: Operating hours
A1.02 Booking Availability: Appointment slots
A1.03 Inventory: Product availability
A1.04 Wayfinding: Finding destination
A1.05 Physical Accessibility: Disability accommodations
A2.01 Physical Access: Mobility accessibility
A2.02 Language Access: Language accommodation
A2.03 Digital Accessibility: Screen reader/a11y
A2.04 Language Accessibility: Multilingual support
A2.05 Hours of Operation: Service availability times
A3.01 Diversity Welcome: All backgrounds welcome
A3.02 Accommodation: Special needs accommodation
A3.03 Response Time: Speed of getting answers
A3.04 Documentation Clarity: Clear instructions
A3.05 Support Accessibility: Getting help when needed
A4.01 Location: Physical location convenience
A4.02 Parking: Parking availability
A4.03 Multiple Channels: Ways to engage
A4.04 Payment Flexibility: Multiple payment options
A4.05 Refund Accessibility: Getting money back
### V - VALUE (Pricing & Costs) - 20 codes ⚠️ USE FOR ALL PRICE/COST/FEE MENTIONS
V1.01 Price Level: Cost amount ("cheap", "expensive", "affordable", "", "$")
V1.02 Price Fairness: Fair for what you get
V1.03 Hidden Costs: Unexpected charges, surprise fees, hidden fees, extra charges
V1.04 Price Transparency: Clear pricing upfront
V1.05 Price Stability: Consistent pricing
V2.01 Clear Pricing: Easy to understand costs
V2.02 Honest Billing: Accurate charges
V2.03 Policy Clarity: Clear terms and conditions
V2.04 Quality-Price Ratio: Worth vs cost
V2.05 Competitive Value: Compared to alternatives
V3.01 Time Investment: Time required
V3.02 Hassle Factor: Difficulty and inconvenience
V3.03 Mental Load: Cognitive effort required
V3.04 Promotion Clarity: Clear offer terms
V3.05 Reward Redemption: Using points/rewards
V4.01 Value for Money: Worth what you paid
V4.02 ROI: Return on investment
V4.03 Overall Satisfaction: Happy with the exchange
V4.04 Billing Accuracy: Correct charges
V4.05 Billing Resolution: Fixing billing issues
### R - RELATIONSHIP (Trust & Loyalty) - 20 codes
R1.01 Honesty: Truthfulness
R1.02 Ethics: Ethical behavior, deceptive practices, scams
R1.03 Promises Kept: Following through on promises
R1.04 Ethics: Ethical behavior
R1.05 Accountability: Taking responsibility
R2.01 Consistency: Reliable over time
R2.02 Trustworthiness: Can be trusted
R2.03 Accountability: Takes responsibility
R2.04 Predictability: Consistent experience
R2.05 Standards: Meeting quality standards
R3.01 Error Acknowledgment: Admits mistakes
R3.02 Apology Quality: Sincere apologies
R3.03 Making It Right: Correcting mistakes
R3.04 Personal Connection: Human touch
R3.05 Going Extra Mile: Beyond expectations
R4.01 Customer Recognition: Remembers customers
R4.02 Loyalty Rewards: Rewards for loyalty
R4.03 Long-term Relationship: Builds relationships
R4.04 Service Recovery: Making things right
R4.05 Feedback Response: Acting on feedback
## CLASSIFICATION EXAMPLES (Critical Distinctions)
**PRICING/COSTS → V codes (Value), NOT P codes:**
- "Cheap prices", "good price", "€50" → V1.01 Price Level
- "Hidden charges", "surprise fees", "extra €35" → V1.03 Hidden Costs
- "Great value for money" → V4.01 Value for Money
- "Overcharged", "wrong amount" → V4.04 Billing Accuracy
**STAFF BEHAVIOR → P codes (People):**
- "Staff was friendly", "welcoming" → P1.01 Warmth
- "Rude", "disrespectful", "ignored us" → P1.02 Respect
- "Patient", "took their time" → P1.03 Patience
- "Knowledgeable", "expert" → P2.01 Knowledge
**DECEPTION/ETHICS → R codes (Relationship):**
- "They lied", "misleading" → R1.01 Honesty
- "Felt scammed", "dishonest practices" → R1.02 Ethics
- "Didn't honor the deal" → R1.03 Promises Kept
## DIMENSION CODES
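The prompt asks the model to "use EXACT codes", so a consumer of the classifier output may want a cheap shape check before trusting a returned code. The following is a minimal sketch, assuming every code follows the `X#.##` pattern visible in the listing (domain letter, tier 1-4, sub-code 01-05). The names `URT_CODE_RE`, `DOMAIN_NAMES`, `is_well_formed_urt_code` and `domain_of` are hypothetical and do not appear in the commit.

```python
import re

# Assumption: domain letter + tier digit 1-4 + "." + sub-code 01-05,
# inferred from the taxonomy listing; this checks shape only.
URT_CODE_RE = re.compile(r"[OPJEAVR][1-4]\.0[1-5]")

# Domain letters as defined by the new taxonomy headings.
DOMAIN_NAMES = {
    "O": "Offering",
    "P": "People",
    "J": "Journey",
    "E": "Environment",
    "A": "Access",
    "V": "Value",
    "R": "Relationship",
}


def is_well_formed_urt_code(code: str) -> bool:
    """Return True if the code has the X#.## shape (hypothetical helper)."""
    return URT_CODE_RE.fullmatch(code) is not None


def domain_of(code: str) -> str:
    """Map a well-formed code to its domain name, or raise ValueError."""
    if not is_well_formed_urt_code(code):
        raise ValueError(f"not a URT code: {code!r}")
    return DOMAIN_NAMES[code[0]]
```

A shape check like this still accepts codes the listing does not define (for example `O3.05`, since O3 stops at `O3.04`), so an exact-membership check against the stored taxonomy would be stricter.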
@@ -159,6 +336,20 @@ class LLMClientBase(ABC):
        self.config = config
        self.total_tokens_used = 0
        self.total_cost_usd = 0.0
        self._custom_prompt: str | None = None

    def set_prompt(self, prompt: str) -> None:
        """
        Set a custom system prompt (e.g., built dynamically from database).

        Args:
            prompt: The system prompt to use for classification
        """
        self._custom_prompt = prompt

    def get_prompt(self) -> str:
        """Get the current system prompt (custom or default)."""
        return self._custom_prompt or SYSTEM_PROMPT

    @abstractmethod
    async def classify(
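The `set_prompt` / `get_prompt` pair falls back to the module-level default through `or`, which means an empty string is treated the same as "no custom prompt". A minimal sketch of that behaviour, assuming a placeholder `SYSTEM_PROMPT` value and a hypothetical `PromptHolder` class that stands in for the relevant part of `LLMClientBase`:

```python
from typing import Optional

# Placeholder for the module-level constant; the real prompt is much longer.
SYSTEM_PROMPT = "default classification prompt"


class PromptHolder:
    """Hypothetical stand-in for the prompt handling added to LLMClientBase."""

    def __init__(self) -> None:
        self._custom_prompt: Optional[str] = None

    def set_prompt(self, prompt: str) -> None:
        self._custom_prompt = prompt

    def get_prompt(self) -> str:
        # `or` falls back on None and on "" alike.
        return self._custom_prompt or SYSTEM_PROMPT


holder = PromptHolder()
default_prompt = holder.get_prompt()   # nothing set yet, so the default is used

holder.set_prompt("prompt built from the database")
custom_prompt = holder.get_prompt()    # the custom prompt wins

holder.set_prompt("")
after_empty = holder.get_prompt()      # empty string falls back to the default
```

If an empty custom prompt should ever be meaningful, an explicit `is not None` test would be needed instead of `or`.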
@@ -178,6 +369,28 @@ class LLMClientBase(ABC):
        """
        pass

    @abstractmethod
    async def generate(
        self,
        system_prompt: str,
        user_prompt: str,
        temperature: float = 0.7,
        max_tokens: int = 4000,
    ) -> str:
        """
        Generate text using the LLM (for synthesis, narratives, etc.).

        Args:
            system_prompt: System instructions
            user_prompt: User content/context
            temperature: Creativity level (0-1)
            max_tokens: Maximum response length

        Returns:
            Generated text response
        """
        pass

    @abstractmethod
    async def close(self) -> None:
        """Close the client and cleanup resources."""
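Because `generate()` is declared abstract, every concrete client has to implement it before it can be instantiated, and a test double only needs to honour the same signature. A minimal sketch, assuming a hypothetical `GenerateContract` base that mirrors just the `generate()` part of `LLMClientBase`, a hypothetical `CannedClient`, and an illustrative payload key that is not taken from the commit:

```python
import asyncio
import json
from abc import ABC, abstractmethod


class GenerateContract(ABC):
    """Hypothetical stand-in for the generate() part of LLMClientBase."""

    @abstractmethod
    async def generate(
        self,
        system_prompt: str,
        user_prompt: str,
        temperature: float = 0.7,
        max_tokens: int = 4000,
    ) -> str:
        ...


class CannedClient(GenerateContract):
    """Test double that returns a fixed JSON payload instead of calling an LLM."""

    def __init__(self, payload: dict) -> None:
        self.payload = payload
        self.calls: list = []

    async def generate(
        self,
        system_prompt: str,
        user_prompt: str,
        temperature: float = 0.7,
        max_tokens: int = 4000,
    ) -> str:
        # Record the call so a test can inspect what the caller sent.
        self.calls.append((system_prompt, user_prompt, temperature, max_tokens))
        return json.dumps(self.payload)


class MissingGenerate(GenerateContract):
    """Subclass that forgot to implement generate(); cannot be instantiated."""


def demo() -> dict:
    # The payload key is illustrative only.
    client = CannedClient({"executive_narrative": "stub"})
    raw = asyncio.run(client.generate("system text", "user text"))
    return json.loads(raw)
```

Instantiating `MissingGenerate()` raises `TypeError`, which is the same failure any existing subclass of the real base class would hit until it gains a `generate()` method.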
@@ -211,7 +424,7 @@ class OpenAIClient(LLMClientBase):
         start_time = time.time()
         messages = [
-            {"role": "system", "content": SYSTEM_PROMPT},
+            {"role": "system", "content": self.get_prompt()},
             {
                 "role": "user",
                 "content": f'Classify this review:\n\n"{review_text}"',
@@ -255,6 +468,43 @@ class OpenAIClient(LLMClientBase):
        return result, metadata

    async def generate(
        self,
        system_prompt: str,
        user_prompt: str,
        temperature: float = 0.7,
        max_tokens: int = 4000,
    ) -> str:
        """Generate text using OpenAI."""
        messages = [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt},
        ]

        response = await self.client.chat.completions.create(
            model=self.model,
            messages=messages,
            temperature=temperature,
            max_tokens=max_tokens,
            response_format={"type": "json_object"},
            timeout=self.config.llm_timeout_seconds,
        )

        content = response.choices[0].message.content
        if not content:
            raise ValueError("Empty response from OpenAI")

        # Track usage
        if response.usage:
            input_tokens = response.usage.prompt_tokens
            output_tokens = response.usage.completion_tokens
            pricing = self.PRICING.get(self.model, {"input": 0.15, "output": 0.60})
            cost = (input_tokens * pricing["input"] + output_tokens * pricing["output"]) / 1_000_000
            self.total_tokens_used += input_tokens + output_tokens
            self.total_cost_usd += cost

        return content

    async def close(self) -> None:
        """Close the OpenAI client."""
        await self.client.close()
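The usage-tracking arithmetic in `generate()` divides by 1,000,000, which implies the `PRICING` rates are expressed in USD per million tokens. A minimal worked sketch of that formula, using the fallback rates that appear in the two `generate()` implementations; the function name `estimate_cost_usd` and the token counts are hypothetical:

```python
import math

# Fallback rates copied from the diff (interpreted as USD per 1M tokens).
OPENAI_FALLBACK = {"input": 0.15, "output": 0.60}
ANTHROPIC_FALLBACK = {"input": 3.0, "output": 15.0}


def estimate_cost_usd(input_tokens: int, output_tokens: int, pricing: dict) -> float:
    """Same formula as the usage tracking: rate-weighted tokens / 1M."""
    weighted = input_tokens * pricing["input"] + output_tokens * pricing["output"]
    return weighted / 1_000_000


# Illustrative call sizes: 2,000 prompt tokens and 1,000 completion tokens.
openai_cost = estimate_cost_usd(2_000, 1_000, OPENAI_FALLBACK)        # about 0.0009 USD
anthropic_cost = estimate_cost_usd(2_000, 1_000, ANTHROPIC_FALLBACK)  # about 0.021 USD

# Floating-point results should be compared with a tolerance, not ==.
assert math.isclose(openai_cost, 0.0009)
assert math.isclose(anthropic_cost, 0.021)
```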
@@ -289,7 +539,7 @@ class AnthropicClient(LLMClientBase):
         response = await self.client.messages.create(
             model=self.model,
             max_tokens=4096,
-            system=SYSTEM_PROMPT,
+            system=self.get_prompt(),
             messages=[
                 {
                     "role": "user",
@@ -329,6 +579,58 @@ class AnthropicClient(LLMClientBase):
        return result, metadata

    async def generate(
        self,
        system_prompt: str,
        user_prompt: str,
        temperature: float = 0.7,
        max_tokens: int = 4000,
    ) -> str:
        """Generate text using Anthropic."""
        response = await self.client.messages.create(
            model=self.model,
            max_tokens=max_tokens,
            system=system_prompt,
            messages=[{"role": "user", "content": user_prompt}],
            temperature=temperature,
        )

        content = response.content[0].text if response.content else ""
        if not content:
            raise ValueError("Empty response from Anthropic")

        # Track usage
        input_tokens = response.usage.input_tokens
        output_tokens = response.usage.output_tokens
        pricing = self.PRICING.get(self.model, {"input": 3.0, "output": 15.0})
        cost = (input_tokens * pricing["input"] + output_tokens * pricing["output"]) / 1_000_000
        self.total_tokens_used += input_tokens + output_tokens
        self.total_cost_usd += cost

        # Extract JSON from response (handles code blocks)
        return self._extract_json_string(content)

    def _extract_json_string(self, content: str) -> str:
        """Extract JSON string from response, handling markdown code blocks."""
        import re

        content = content.strip()

        # If it starts with {, return as-is
        if content.startswith("{"):
            return content

        # Try to find JSON in code blocks
        json_match = re.search(r"```(?:json)?\s*([\s\S]*?)\s*```", content)
        if json_match:
            return json_match.group(1)

        # Try to find JSON object
        json_match = re.search(r"\{[\s\S]*\}", content)
        if json_match:
            return json_match.group(0)

        return content

    def _extract_json(self, content: str) -> dict[str, Any]:
        """Extract JSON from response, handling markdown code blocks."""
        content = content.strip()
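The three fallbacks in `_extract_json_string` can be exercised in isolation. The sketch below re-implements the same steps as a free function (hypothetical name `extract_json_string`) so it runs without an Anthropic client; the fence marker is built from `"`" * 3` only to keep the example readable, and the sample inputs are invented.

```python
import json
import re

FENCE = "`" * 3  # three backticks, built programmatically for readability

# Same pattern as in the diff: optional "json" tag, lazily captured body.
FENCED_RE = re.compile(re.escape(FENCE) + r"(?:json)?\s*([\s\S]*?)\s*" + re.escape(FENCE))
OBJECT_RE = re.compile(r"\{[\s\S]*\}")


def extract_json_string(content: str) -> str:
    """Standalone copy of the extraction steps, for illustration only."""
    content = content.strip()
    if content.startswith("{"):
        return content                      # step 1: already looks like JSON
    fenced = FENCED_RE.search(content)
    if fenced:
        return fenced.group(1)              # step 2: body of a fenced block
    obj = OBJECT_RE.search(content)
    if obj:
        return obj.group(0)                 # step 3: first "{" through last "}"
    return content                          # nothing found: return unchanged


bare = extract_json_string('  {"a": 1}  ')
fenced = extract_json_string("Here you go:\n" + FENCE + 'json\n{"a": 1}\n' + FENCE)
wrapped = extract_json_string('Result: {"a": {"b": 2}} Hope that helps.')

parsed = [json.loads(s) for s in (bare, fenced, wrapped)]
```

Two edge cases follow from the logic as written: a response that starts with `{` but has trailing prose is returned unchanged by the first branch, and the greedy object pattern spans from the first `{` to the last `}`, so callers should still expect `json.loads` to fail on some inputs.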