feat(pipeline): Add Stage 5 Synthesis for AI-generated narratives

- Add Stage5Synthesizer class that generates AI narratives and action plans
- Add generate() method to LLMClient for synthesis generation
- Integrate Stage 5 into pipeline runner after route stage
- Add synthesis JSONB column to pipeline.executions table
- Update reviewiq_analytics API to return synthesis data
- Synthesis includes: executive narrative, sentiment/category/timeline insights, action plan, marketing angles, and priority recommendations

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@@ -29,28 +29,205 @@ Your task is to extract semantic spans from customer reviews and classify each s
 
 ## SPAN EXTRACTION RULES
 
-1. **Split on contrasting conjunctions**: but, however, although, despite, yet, though
-2. **Split on topic/target change**: food → service → bathroom = 3 spans
-3. **Split on valence change**: positive → negative = split
-4. **Split on domain change**: O (Offering) → J (Journey) → E (Environment) = split
-5. **Keep together**: cause→effect within same feedback unit ("X because Y" = 1 span)
+**CRITICAL: Use TOPIC-BASED splitting, NOT sentence-based splitting.**
+
+A span = all consecutive text about the SAME topic/domain, regardless of sentence count.
+
+### When to KEEP TOGETHER (same span):
+- Multiple sentences about the same topic: "The food was great. I loved the pasta. The sauce was perfect." → ONE span (all about Offering)
+- Cause and effect: "The wait was long because they were understaffed" → ONE span
+- Elaboration: "Staff was rude. They ignored us for 20 minutes." → ONE span (both about People)
+- Single-topic reviews: Even if 5 sentences, if all about food → ONE span
+
+### When to SPLIT (separate spans):
+- Contrasting conjunctions that change topic: "Food was great BUT service was slow" → TWO spans
+- Domain change: food (O) → staff (P) → ambiance (E) = split at each change
+- Target change: "The waiter was nice but the manager was rude" → TWO spans (different people)
+
+### Examples:
+- "Amazing food. Best burger ever. Fries were crispy too." → 1 span (all Offering, V+)
+- "Food was great but we waited an hour." → 2 spans (Offering V+, Journey V-)
+- "I've been coming here for years. Always consistent quality." → 1 span (Relationship)
+- "The staff are lovely and amazing with kids. More highchairs are definitely needed though." → 2 spans (People V+, Access V-)
 
 **Guardrails**:
-- Max 3 spans per sentence (if 4+, re-check for over-splitting)
-- Min 1 span per review (even single-word reviews)
-- Spans must be non-overlapping and cover meaningful content
+- Prefer FEWER, LARGER spans over many small ones
+- Most reviews should have 1-3 spans, rarely more
+- Min 1 span per review
+- Spans must be non-overlapping
 
-## URT DOMAINS (Tier-3 codes: X#.##)
+## URT TAXONOMY - COMPLETE (138 codes, use EXACT codes)
 
-| Domain | Code | Description |
-|--------|------|-------------|
-| Offering | O1-O4 | Product/service quality, features, variety |
-| Price | P1-P4 | Value, pricing, promotions, payment |
-| Journey | J1-J4 | Timing, process, convenience, accessibility |
-| Environment | E1-E4 | Physical space, ambiance, cleanliness, digital UX |
-| Attitude | A1-A4 | Staff behavior, helpfulness, professionalism |
-| Voice | V1-V4 | Brand, communication, marketing, transparency |
-| Relationship | R1-R4 | Loyalty, trust, consistency, personalization |
+### O - OFFERING (Product/Service Quality) - 18 codes
+O1.01 Works/Doesn't Work: Basic functionality success or failure
+O1.02 Performance Level: How well it operates
+O1.03 Durability: Longevity and resistance to wear
+O1.04 Reliability: Consistency of function over time
+O1.05 Outcome Achievement: Did customer accomplish their goal?
+O2.01 Materials/Inputs: Quality of components or ingredients
+O2.02 Craftsmanship: Skill of construction or execution
+O2.03 Presentation: Visual and aesthetic quality
+O2.04 Attention to Detail: Finishing touches and refinement
+O2.05 Condition at Delivery: State when received
+O3.01 All Components Present: Nothing missing from what was promised
+O3.02 Feature Availability: Promised features actually work
+O3.03 Scope Delivery: Full scope of work completed
+O3.04 Documentation: Supporting materials provided
+O4.01 Specification Match: Matches what was ordered
+O4.02 Personalization: Adapted to individual preferences
+O4.03 Flexibility: Can be modified or adjusted
+O4.04 Appropriateness: Right solution for the need
+
+### P - PEOPLE (Staff Interactions) - 20 codes
+P1.01 Warmth: Friendly and welcoming manner
+P1.02 Respect: Treated with dignity
+P1.03 Patience: Calm and tolerant approach
+P1.04 Enthusiasm: Energy and engagement
+P1.05 Empathy: Understanding feelings
+P2.01 Knowledge: Expertise and understanding
+P2.02 Skill: Technical ability
+P2.03 Problem Solving: Ability to find solutions
+P2.04 Advice Quality: Helpful recommendations
+P2.05 Training Level: Staff training evident
+P3.01 Attentiveness: Being present and engaged
+P3.02 Initiative: Proactive help
+P3.03 Follow-through: Completing promised actions
+P3.04 Availability: Being available when needed
+P3.05 Dedication: Commitment to helping
+P4.01 Clarity: Clear communication
+P4.02 Listening: Understanding customer needs
+P4.03 Transparency: Honest and open
+P4.04 Honesty: Truthful communication
+P4.05 Proactive Updates: Keeping customer informed
+
+### J - JOURNEY (Process & Timing) - 20 codes
+J1.01 Speed: How fast things happen
+J1.02 Punctuality: On-time delivery
+J1.03 Queue Management: Handling of waiting customers
+J1.04 Punctuality: Meeting scheduled times
+J1.05 Pacing: Appropriate speed (not rushed/dragged)
+J2.01 Simplicity: Easy process
+J2.02 Friction: Obstacles encountered
+J2.03 Navigation: Finding what you need
+J2.04 Booking Availability: Slots/capacity when needed
+J2.05 Inventory: Stock availability
+J3.01 Consistency: Same experience every time
+J3.02 Accuracy: Getting it right
+J3.03 Uptime: System availability
+J3.04 Data Accuracy: Correct info in systems
+J3.05 Integration: Systems work together
+J4.01 Problem Recognition: Acknowledging issues
+J4.02 Resolution Speed: How fast problems get fixed
+J4.03 Resolution Fairness: Fair handling of issues
+J4.04 Escalation: Getting to right person
+J4.05 Closure: Issue fully resolved
+
+### E - ENVIRONMENT (Physical & Digital Space) - 20 codes
+E1.01 Cleanliness: How clean the space is
+E1.02 Comfort: Physical comfort
+E1.03 Space Design: Layout and organization
+E1.04 Ambiance: Atmosphere and vibe
+E1.05 Comfort: Physical comfort
+E2.01 Lighting: Light quality and level
+E2.02 Sound/Noise: Audio environment
+E2.03 Temperature: Climate control
+E2.04 Visual Design: Aesthetics of interface
+E2.05 Mobile Experience: Mobile usability
+E3.01 Interface Design: Digital UX/UI
+E3.02 App/Website Speed: Digital performance
+E3.03 Usability: Ease of digital use
+E3.04 Health Safety: Health precautions
+E3.05 Cyber Security: Digital security
+E4.01 Safety: Physical safety
+E4.02 Security: Protection of belongings/data
+E4.03 Health/Hygiene: Health standards
+E4.04 Social Responsibility: Ethical practices
+E4.05 Community Impact: Local community effect
+
+### A - ACCESS (Availability & Accessibility) - 20 codes
+A1.01 Hours: Operating hours
+A1.02 Booking Availability: Appointment slots
+A1.03 Inventory: Product availability
+A1.04 Wayfinding: Finding destination
+A1.05 Physical Accessibility: Disability accommodations
+A2.01 Physical Access: Mobility accessibility
+A2.02 Language Access: Language accommodation
+A2.03 Digital Accessibility: Screen reader/a11y
+A2.04 Language Accessibility: Multilingual support
+A2.05 Hours of Operation: Service availability times
+A3.01 Diversity Welcome: All backgrounds welcome
+A3.02 Accommodation: Special needs accommodation
+A3.03 Response Time: Speed of getting answers
+A3.04 Documentation Clarity: Clear instructions
+A3.05 Support Accessibility: Getting help when needed
+A4.01 Location: Physical location convenience
+A4.02 Parking: Parking availability
+A4.03 Multiple Channels: Ways to engage
+A4.04 Payment Flexibility: Multiple payment options
+A4.05 Refund Accessibility: Getting money back
+
+### V - VALUE (Pricing & Costs) - 20 codes ⚠️ USE FOR ALL PRICE/COST/FEE MENTIONS
+V1.01 Price Level: Cost amount ("cheap", "expensive", "affordable", "€", "$")
+V1.02 Price Fairness: Fair for what you get
+V1.03 Hidden Costs: Unexpected charges, surprise fees, hidden fees, extra charges
+V1.04 Price Transparency: Clear pricing upfront
+V1.05 Price Stability: Consistent pricing
+V2.01 Clear Pricing: Easy to understand costs
+V2.02 Honest Billing: Accurate charges
+V2.03 Policy Clarity: Clear terms and conditions
+V2.04 Quality-Price Ratio: Worth vs cost
+V2.05 Competitive Value: Compared to alternatives
+V3.01 Time Investment: Time required
+V3.02 Hassle Factor: Difficulty and inconvenience
+V3.03 Mental Load: Cognitive effort required
+V3.04 Promotion Clarity: Clear offer terms
+V3.05 Reward Redemption: Using points/rewards
+V4.01 Value for Money: Worth what you paid
+V4.02 ROI: Return on investment
+V4.03 Overall Satisfaction: Happy with the exchange
+V4.04 Billing Accuracy: Correct charges
+V4.05 Billing Resolution: Fixing billing issues
+
+### R - RELATIONSHIP (Trust & Loyalty) - 20 codes
+R1.01 Honesty: Truthfulness
+R1.02 Ethics: Ethical behavior, deceptive practices, scams
+R1.03 Promises Kept: Following through on promises
+R1.04 Ethics: Ethical behavior
+R1.05 Accountability: Taking responsibility
+R2.01 Consistency: Reliable over time
+R2.02 Trustworthiness: Can be trusted
+R2.03 Accountability: Takes responsibility
+R2.04 Predictability: Consistent experience
+R2.05 Standards: Meeting quality standards
+R3.01 Error Acknowledgment: Admits mistakes
+R3.02 Apology Quality: Sincere apologies
+R3.03 Making It Right: Correcting mistakes
+R3.04 Personal Connection: Human touch
+R3.05 Going Extra Mile: Beyond expectations
+R4.01 Customer Recognition: Remembers customers
+R4.02 Loyalty Rewards: Rewards for loyalty
+R4.03 Long-term Relationship: Builds relationships
+R4.04 Service Recovery: Making things right
+R4.05 Feedback Response: Acting on feedback
+
+## CLASSIFICATION EXAMPLES (Critical Distinctions)
+
+**PRICING/COSTS → V codes (Value), NOT P codes:**
+- "Cheap prices", "good price", "€50" → V1.01 Price Level
+- "Hidden charges", "surprise fees", "extra €35" → V1.03 Hidden Costs
+- "Great value for money" → V4.01 Value for Money
+- "Overcharged", "wrong amount" → V4.04 Billing Accuracy
+
+**STAFF BEHAVIOR → P codes (People):**
+- "Staff was friendly", "welcoming" → P1.01 Warmth
+- "Rude", "disrespectful", "ignored us" → P1.02 Respect
+- "Patient", "took their time" → P1.03 Patience
+- "Knowledgeable", "expert" → P2.01 Knowledge
+
+**DECEPTION/ETHICS → R codes (Relationship):**
+- "They lied", "misleading" → R1.01 Honesty
+- "Felt scammed", "dishonest practices" → R1.02 Ethics
+- "Didn't honor the deal" → R1.03 Promises Kept
 
 ## DIMENSION CODES
 
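The prompt in this hunk asks the model to use exact codes of the form `X#.##` across seven domain letters (O, P, J, E, A, V, R). As a rough sketch of how classifier output could be shape-checked downstream, the following validates the format only. The names `URT_DOMAINS`, `is_valid_urt_code`, and `domain_of` are hypothetical and not part of this commit, and the check does not confirm membership in the 138-code list (for example, it accepts `O3.05`, which the taxonomy does not define).

```python
import re

# Domain letters and names as they appear in the taxonomy headings of the prompt.
URT_DOMAINS = {
    "O": "Offering",
    "P": "People",
    "J": "Journey",
    "E": "Environment",
    "A": "Access",
    "V": "Value",
    "R": "Relationship",
}

# Shape X#.##: domain letter, category digit 1-4, dot, two-digit index 01-05.
_URT_CODE_RE = re.compile(r"^([OPJEAVR])([1-4])\.(0[1-5])$")


def is_valid_urt_code(code: str) -> bool:
    """Return True if the string has the shape of a URT code."""
    return _URT_CODE_RE.match(code) is not None


def domain_of(code: str) -> str:
    """Map a well-formed code to its domain name, or raise ValueError."""
    match = _URT_CODE_RE.match(code)
    if match is None:
        raise ValueError(f"not a URT-shaped code: {code!r}")
    return URT_DOMAINS[match.group(1)]
```

A stricter check would compare against the full set of codes listed in the prompt; the regex is only a cheap first filter for malformed output.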
@@ -159,6 +336,20 @@ class LLMClientBase(ABC):
         self.config = config
         self.total_tokens_used = 0
         self.total_cost_usd = 0.0
+        self._custom_prompt: str | None = None
+
+    def set_prompt(self, prompt: str) -> None:
+        """
+        Set a custom system prompt (e.g., built dynamically from database).
+
+        Args:
+            prompt: The system prompt to use for classification
+        """
+        self._custom_prompt = prompt
+
+    def get_prompt(self) -> str:
+        """Get the current system prompt (custom or default)."""
+        return self._custom_prompt or SYSTEM_PROMPT
 
     @abstractmethod
     async def classify(
@@ -178,6 +369,28 @@ class LLMClientBase(ABC):
         """
         pass
 
+    @abstractmethod
+    async def generate(
+        self,
+        system_prompt: str,
+        user_prompt: str,
+        temperature: float = 0.7,
+        max_tokens: int = 4000,
+    ) -> str:
+        """
+        Generate text using the LLM (for synthesis, narratives, etc.).
+
+        Args:
+            system_prompt: System instructions
+            user_prompt: User content/context
+            temperature: Creativity level (0-1)
+            max_tokens: Maximum response length
+
+        Returns:
+            Generated text response
+        """
+        pass
+
     @abstractmethod
     async def close(self) -> None:
         """Close the client and cleanup resources."""
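The abstract `generate()` added above can be exercised without a real LLM by substituting a test double. The sketch below is only an illustration of the calling pattern: `GenerativeClient`, `EchoClient`, `synthesize`, and the `executive_narrative` key are hypothetical stand-ins, and only the `generate()` signature is taken from the diff.

```python
import asyncio
import json
from abc import ABC, abstractmethod


class GenerativeClient(ABC):
    """Hypothetical stand-in for LLMClientBase, reduced to generate()."""

    @abstractmethod
    async def generate(
        self,
        system_prompt: str,
        user_prompt: str,
        temperature: float = 0.7,
        max_tokens: int = 4000,
    ) -> str:
        ...


class EchoClient(GenerativeClient):
    """Test double: returns a fixed JSON string instead of calling an LLM."""

    async def generate(
        self,
        system_prompt: str,
        user_prompt: str,
        temperature: float = 0.7,
        max_tokens: int = 4000,
    ) -> str:
        return json.dumps({"executive_narrative": f"Summary of: {user_prompt}"})


async def synthesize(client: GenerativeClient, context: str) -> dict:
    """Ask the client for a JSON string and parse it into a dict."""
    raw = await client.generate(
        system_prompt="Return a JSON object.",
        user_prompt=context,
        temperature=0.3,
    )
    return json.loads(raw)


result = asyncio.run(synthesize(EchoClient(), "42 reviews, mostly positive"))
```

Because callers depend only on the abstract signature, the same `synthesize` coroutine would work with either provider client from this commit, assuming both return a JSON string.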
@@ -211,7 +424,7 @@ class OpenAIClient(LLMClientBase):
         start_time = time.time()
 
         messages = [
-            {"role": "system", "content": SYSTEM_PROMPT},
+            {"role": "system", "content": self.get_prompt()},
             {
                 "role": "user",
                 "content": f'Classify this review:\n\n"{review_text}"',
@@ -255,6 +468,43 @@ class OpenAIClient(LLMClientBase):
 
         return result, metadata
 
+    async def generate(
+        self,
+        system_prompt: str,
+        user_prompt: str,
+        temperature: float = 0.7,
+        max_tokens: int = 4000,
+    ) -> str:
+        """Generate text using OpenAI."""
+        messages = [
+            {"role": "system", "content": system_prompt},
+            {"role": "user", "content": user_prompt},
+        ]
+
+        response = await self.client.chat.completions.create(
+            model=self.model,
+            messages=messages,
+            temperature=temperature,
+            max_tokens=max_tokens,
+            response_format={"type": "json_object"},
+            timeout=self.config.llm_timeout_seconds,
+        )
+
+        content = response.choices[0].message.content
+        if not content:
+            raise ValueError("Empty response from OpenAI")
+
+        # Track usage
+        if response.usage:
+            input_tokens = response.usage.prompt_tokens
+            output_tokens = response.usage.completion_tokens
+            pricing = self.PRICING.get(self.model, {"input": 0.15, "output": 0.60})
+            cost = (input_tokens * pricing["input"] + output_tokens * pricing["output"]) / 1_000_000
+            self.total_tokens_used += input_tokens + output_tokens
+            self.total_cost_usd += cost
+
+        return content
+
     async def close(self) -> None:
         """Close the OpenAI client."""
         await self.client.close()
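The usage tracking in this hunk divides by 1,000,000, so the `PRICING` values are rates per million tokens. A small worked sketch of the same arithmetic, using the fallback rates from the diff; `estimate_cost_usd` is a hypothetical helper name, and the token counts are made up for illustration:

```python
def estimate_cost_usd(
    input_tokens: int,
    output_tokens: int,
    pricing: dict[str, float],
) -> float:
    """Cost in USD, with pricing["input"] and pricing["output"] given per 1M tokens."""
    return (
        input_tokens * pricing["input"] + output_tokens * pricing["output"]
    ) / 1_000_000


# Fallback rates used in the OpenAI client when the model is not in PRICING.
fallback_pricing = {"input": 0.15, "output": 0.60}

# Illustrative call: 2,000 prompt tokens and 1,000 completion tokens.
# 2000 * 0.15 + 1000 * 0.60 = 900, and 900 / 1_000_000 is roughly 0.0009 USD.
cost = estimate_cost_usd(2_000, 1_000, fallback_pricing)
```

With these rates a single synthesis call at the default `max_tokens=4000` stays well under one cent, which may be why the commit tracks totals on the client rather than per call.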
@@ -289,7 +539,7 @@ class AnthropicClient(LLMClientBase):
         response = await self.client.messages.create(
             model=self.model,
             max_tokens=4096,
-            system=SYSTEM_PROMPT,
+            system=self.get_prompt(),
             messages=[
                 {
                     "role": "user",
@@ -329,6 +579,58 @@ class AnthropicClient(LLMClientBase):
 
         return result, metadata
 
+    async def generate(
+        self,
+        system_prompt: str,
+        user_prompt: str,
+        temperature: float = 0.7,
+        max_tokens: int = 4000,
+    ) -> str:
+        """Generate text using Anthropic."""
+        response = await self.client.messages.create(
+            model=self.model,
+            max_tokens=max_tokens,
+            system=system_prompt,
+            messages=[{"role": "user", "content": user_prompt}],
+            temperature=temperature,
+        )
+
+        content = response.content[0].text if response.content else ""
+        if not content:
+            raise ValueError("Empty response from Anthropic")
+
+        # Track usage
+        input_tokens = response.usage.input_tokens
+        output_tokens = response.usage.output_tokens
+        pricing = self.PRICING.get(self.model, {"input": 3.0, "output": 15.0})
+        cost = (input_tokens * pricing["input"] + output_tokens * pricing["output"]) / 1_000_000
+        self.total_tokens_used += input_tokens + output_tokens
+        self.total_cost_usd += cost
+
+        # Extract JSON from response (handles code blocks)
+        return self._extract_json_string(content)
+
+    def _extract_json_string(self, content: str) -> str:
+        """Extract JSON string from response, handling markdown code blocks."""
+        import re
+        content = content.strip()
+
+        # If it starts with {, return as-is
+        if content.startswith("{"):
+            return content
+
+        # Try to find JSON in code blocks
+        json_match = re.search(r"```(?:json)?\s*([\s\S]*?)\s*```", content)
+        if json_match:
+            return json_match.group(1)
+
+        # Try to find JSON object
+        json_match = re.search(r"\{[\s\S]*\}", content)
+        if json_match:
+            return json_match.group(0)
+
+        return content
+
     def _extract_json(self, content: str) -> dict[str, Any]:
         """Extract JSON from response, handling markdown code blocks."""
         content = content.strip()
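The three-step fallback in `_extract_json_string` (raw object, then fenced block, then first-to-last brace) can be tried in isolation. The sketch below is a standalone re-statement of that logic as a free function, not the committed method; `extract_json_string` and the sample inputs are illustrative, and the fence marker is built from a variable only so the example does not embed a literal code fence.

```python
import json
import re

# Three backticks, assembled indirectly to avoid a literal fence in this example.
FENCE = "`" * 3
_FENCED_RE = re.compile(FENCE + r"(?:json)?\s*([\s\S]*?)\s*" + FENCE)
_BRACES_RE = re.compile(r"\{[\s\S]*\}")


def extract_json_string(content: str) -> str:
    """Return the most likely JSON substring of an LLM response."""
    content = content.strip()

    # Case 1: the response already starts with a JSON object.
    if content.startswith("{"):
        return content

    # Case 2: the JSON sits inside a markdown code fence.
    fenced = _FENCED_RE.search(content)
    if fenced:
        return fenced.group(1)

    # Case 3: take everything from the first "{" to the last "}".
    braces = _BRACES_RE.search(content)
    if braces:
        return braces.group(0)

    # Nothing JSON-like found; hand back the stripped text unchanged.
    return content


plain = extract_json_string('  {"x": 2}  ')
wrapped = extract_json_string("Here you go:\n" + FENCE + 'json\n{"ok": true}\n' + FENCE)
inline = extract_json_string('Sure: {"a": 1} done')
parsed = json.loads(wrapped)
```

Note that the function returns a string in every case, so a caller still needs `json.loads` and should be prepared for it to fail when the final fallback returns non-JSON text.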