Add LLM Classification Contract v1.0

Defines prompt, output schema, and validation rules for span-level URT classification:

- System prompt with span extraction rules
- JSON schema for structured output
- 4 few-shot examples (multi-span, temporal, comparative)
- Structural and semantic validation rules
- Error handling with retry + fallback
- Performance considerations (token budget, batching, caching)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

754
.artifacts/LLM-Classification-Contract-v1.md
Normal file
@@ -0,0 +1,754 @@
# LLM Classification Contract v1.0

**Purpose**: Define the prompt, output schema, and validation rules for span-level URT classification.
**Target Model**: Claude 3.5 Sonnet / GPT-4o (structured output mode)
**Date**: 2026-01-24

---

## 1. Overview

The LLM receives a single review text and returns an array of **spans**: semantically distinct units of feedback. Each span is independently classified using URT v5.1.

**Pipeline position**:
```
reviews_raw.text → LLM → spans[] → review_spans table
```

---

## 2. System Prompt

````
You are a review classification system using URT (Universal Review Taxonomy) v5.1.

Your task is to extract semantic spans from customer reviews and classify each span independently.

## SPAN EXTRACTION RULES

1. **Split on contrasting conjunctions**: but, however, although, despite, yet, though
2. **Split on topic/target change**: food → service → bathroom = 3 spans
3. **Split on valence change**: positive → negative = split
4. **Split on domain change**: O (Offering) → J (Journey) → E (Environment) = split
5. **Keep together**: cause→effect within same feedback unit ("X because Y" = 1 span)

**Guardrails**:
- Max 3 spans per sentence (if 4+, re-check for over-splitting)
- Min 1 span per review (even single-word reviews)
- Spans must be non-overlapping and cover meaningful content

## URT DOMAINS (Tier-3 codes: X#.##)

| Domain | Code | Description |
|--------|------|-------------|
| Offering | O1-O4 | Product/service quality, features, variety |
| Price | P1-P4 | Value, pricing, promotions, payment |
| Journey | J1-J4 | Timing, process, convenience, accessibility |
| Environment | E1-E4 | Physical space, ambiance, cleanliness, digital UX |
| Attitude | A1-A4 | Staff behavior, helpfulness, professionalism |
| Voice | V1-V4 | Brand, communication, marketing, transparency |
| Relationship | R1-R4 | Loyalty, trust, consistency, personalization |

## DIMENSION CODES

### Valence
- V+ : Positive sentiment
- V- : Negative sentiment
- V0 : Neutral/factual
- V± : Mixed within the span

### Intensity
- I1 : Low ("okay", "fine", "decent")
- I2 : Moderate ("good", "bad", "slow")
- I3 : High ("amazing", "terrible", "unacceptable")

### Specificity
- S1 : Vague ("it was bad")
- S2 : Some detail ("the food was cold")
- S3 : Precise ("waited 45 minutes for appetizers")

### Actionability
- A1 : No clear action possible
- A2 : Possible actions, unclear which
- A3 : Clear, specific action ("train staff on X", "fix Y")

### Temporal
- TC : Current visit (default when no markers)
- TR : Recent pattern ("lately", "recently", "again")
- TH : Historical ("for years", "always", "used to")
- TF : Future ("won't return", "next time", "I expect")

### Evidence
- ES : Stated explicitly in text (default)
- EI : Inferred logically (not stated, but entailed)
- EC : Contextual (depends on surrounding text)

### Comparative
- CR-N : No comparison (default)
- CR-B : Better than alternatives
- CR-W : Worse than alternatives
- CR-S : Same as alternatives

## PRIMARY SPAN SELECTION

Mark exactly ONE span as is_primary=true using this order:
1. Highest intensity (I3 > I2 > I1)
2. Tie-break: negative over positive (V- > V± > V0 > V+)
3. Tie-break: earliest span_index

## USN (URT String Notation)

Generate a USN string for each span:
```
URT:S:{primary}[+{sec1}][+{sec2}]:{valence_sign}{intensity_num}:{S#}{A#}{temporal}.{evidence}.{CR_suffix}
```

Examples:
- `URT:S:J1.03:-2:22TC.ES.N` (J1.03, V-, I2, S2, A2, TC, ES, CR-N)
- `URT:S:P1.01+O2.03:+3:33TR.ES.B` (P1.01 primary, O2.03 secondary, V+, I3, S3, A3, TR, ES, CR-B)

Valence encoding: + for V+, - for V-, 0 for V0, ± for V±
Specificity/actionability encoding: digits only (S2 + A2 → 22)
CR suffix: N=CR-N, B=CR-B, W=CR-W, S=CR-S

## OUTPUT FORMAT

Return valid JSON matching the schema exactly. No markdown, no explanations.
````
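
The USN rules in the prompt can also be applied on the validation side to rebuild the string from a span's classified fields and compare it with the `usn` the model returned. The following is a minimal sketch under that assumption; `build_usn` is a hypothetical helper name, not part of the contract.

```python
def build_usn(span: dict) -> str:
    """Assemble a USN string from a span's classified fields (illustrative helper)."""
    # Primary code first, then up to two secondary codes joined with "+".
    codes = "+".join([span["urt_primary"], *span["urt_secondary"]])
    valence_sign = span["valence"][1:]              # "V+" -> "+", "V0" -> "0", "V±" -> "±"
    intensity_num = span["intensity"][1:]           # "I3" -> "3"
    spec_num = span["specificity"][1:]              # "S2" -> "2"
    action_num = span["actionability"][1:]          # "A1" -> "1"
    cr_suffix = span["comparative"].split("-")[1]   # "CR-B" -> "B"
    return (
        f"URT:S:{codes}:{valence_sign}{intensity_num}:"
        f"{spec_num}{action_num}{span['temporal']}.{span['evidence']}.{cr_suffix}"
    )
```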

---

## 3. Output JSON Schema

```json
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "title": "URT Span Extraction Response",
  "type": "object",
  "required": ["spans", "review_summary"],
  "additionalProperties": false,
  "properties": {
    "spans": {
      "type": "array",
      "minItems": 1,
      "maxItems": 15,
      "items": {
        "type": "object",
        "required": [
          "span_index",
          "span_text",
          "span_start",
          "span_end",
          "urt_primary",
          "urt_secondary",
          "valence",
          "intensity",
          "specificity",
          "actionability",
          "temporal",
          "evidence",
          "comparative",
          "is_primary",
          "usn"
        ],
        "additionalProperties": false,
        "properties": {
          "span_index": {
            "type": "integer",
            "minimum": 0,
            "description": "0-based position in review"
          },
          "span_text": {
            "type": "string",
            "minLength": 1,
            "description": "Exact text extracted from review"
          },
          "span_start": {
            "type": "integer",
            "minimum": 0,
            "description": "Character offset start (0-indexed)"
          },
          "span_end": {
            "type": "integer",
            "minimum": 1,
            "description": "Character offset end (exclusive)"
          },
          "urt_primary": {
            "type": "string",
            "pattern": "^[OPJEAVR][1-4]\\.[0-9]{2}$",
            "description": "Primary URT Tier-3 code"
          },
          "urt_secondary": {
            "type": "array",
            "maxItems": 2,
            "items": {
              "type": "string",
              "pattern": "^[OPJEAVR][1-4]\\.[0-9]{2}$"
            },
            "description": "Secondary codes (max 2, different domains preferred)"
          },
          "valence": {
            "type": "string",
            "enum": ["V+", "V-", "V0", "V±"]
          },
          "intensity": {
            "type": "string",
            "enum": ["I1", "I2", "I3"]
          },
          "specificity": {
            "type": "string",
            "enum": ["S1", "S2", "S3"]
          },
          "actionability": {
            "type": "string",
            "enum": ["A1", "A2", "A3"]
          },
          "temporal": {
            "type": "string",
            "enum": ["TC", "TR", "TH", "TF"]
          },
          "evidence": {
            "type": "string",
            "enum": ["ES", "EI", "EC"]
          },
          "comparative": {
            "type": "string",
            "enum": ["CR-N", "CR-B", "CR-W", "CR-S"]
          },
          "is_primary": {
            "type": "boolean",
            "description": "True for exactly one span per review"
          },
          "confidence": {
            "type": "string",
            "enum": ["high", "medium", "low"],
            "default": "medium"
          },
          "entity": {
            "type": ["string", "null"],
            "description": "Named entity if present (staff name, product, location)"
          },
          "entity_type": {
            "type": ["string", "null"],
            "enum": ["location", "staff", "product", "process", "time", "other", null]
          },
          "relation_type": {
            "type": ["string", "null"],
            "enum": ["cause_of", "effect_of", "contrast", "resolution", null],
            "description": "Relationship to another span in this review"
          },
          "related_span_index": {
            "type": ["integer", "null"],
            "minimum": 0,
            "description": "Index of related span (must be different from this span)"
          },
          "usn": {
            "type": "string",
            "pattern": "^URT:S:[OPJEAVR][1-4]\\.[0-9]{2}",
            "description": "URT String Notation for audit"
          }
        }
      }
    },
    "review_summary": {
      "type": "object",
      "required": ["dominant_valence", "dominant_domain", "span_count"],
      "properties": {
        "dominant_valence": {
          "type": "string",
          "enum": ["V+", "V-", "V0", "V±"]
        },
        "dominant_domain": {
          "type": "string",
          "pattern": "^[OPJEAVR]$"
        },
        "span_count": {
          "type": "integer",
          "minimum": 1
        },
        "has_comparative": {
          "type": "boolean"
        },
        "has_entity": {
          "type": "boolean"
        }
      }
    }
  }
}
```
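
The schema does not say how `review_summary` relates to the spans, so a schema-valid response can still carry a summary that disagrees with them. The sketch below checks the three fields that can be recomputed mechanically. It assumes, based only on the few-shot examples, that `has_comparative` means "any span is not CR-N" and `has_entity` means "any span has a non-null entity"; `summary_mismatches` is a hypothetical helper name.

```python
def summary_mismatches(response: dict) -> list[str]:
    """Return the review_summary fields that disagree with the spans (illustrative)."""
    spans = response["spans"]
    summary = response["review_summary"]
    problems = []

    if summary["span_count"] != len(spans):
        problems.append("span_count")

    # Assumption: a review "has a comparative" when any span is not CR-N.
    has_comparative = any(s["comparative"] != "CR-N" for s in spans)
    if "has_comparative" in summary and summary["has_comparative"] != has_comparative:
        problems.append("has_comparative")

    # Assumption: a review "has an entity" when any span carries a non-null entity.
    has_entity = any(s.get("entity") is not None for s in spans)
    if "has_entity" in summary and summary["has_entity"] != has_entity:
        problems.append("has_entity")

    return problems
```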

---

## 4. Few-Shot Examples

### Example 1: Multi-span review (3 spans)

**Input**:
```json
{
  "review_text": "The food was incredible, best steak I've ever had. But we waited 45 minutes for a table even with a reservation. The manager comped our drinks which was nice.",
  "profile": "standard"
}
```

**Output**:
```json
{
  "spans": [
    {
      "span_index": 0,
      "span_text": "The food was incredible, best steak I've ever had.",
      "span_start": 0,
      "span_end": 50,
      "urt_primary": "O1.01",
      "urt_secondary": [],
      "valence": "V+",
      "intensity": "I3",
      "specificity": "S2",
      "actionability": "A1",
      "temporal": "TC",
      "evidence": "ES",
      "comparative": "CR-B",
      "is_primary": false,
      "confidence": "high",
      "entity": "steak",
      "entity_type": "product",
      "relation_type": null,
      "related_span_index": null,
      "usn": "URT:S:O1.01:+3:21TC.ES.B"
    },
    {
      "span_index": 1,
      "span_text": "But we waited 45 minutes for a table even with a reservation.",
      "span_start": 51,
      "span_end": 112,
      "urt_primary": "J1.02",
      "urt_secondary": [],
      "valence": "V-",
      "intensity": "I3",
      "specificity": "S3",
      "actionability": "A3",
      "temporal": "TC",
      "evidence": "ES",
      "comparative": "CR-N",
      "is_primary": true,
      "confidence": "high",
      "entity": null,
      "entity_type": null,
      "relation_type": null,
      "related_span_index": null,
      "usn": "URT:S:J1.02:-3:33TC.ES.N"
    },
    {
      "span_index": 2,
      "span_text": "The manager comped our drinks which was nice.",
      "span_start": 113,
      "span_end": 158,
      "urt_primary": "A2.01",
      "urt_secondary": [],
      "valence": "V+",
      "intensity": "I2",
      "specificity": "S2",
      "actionability": "A1",
      "temporal": "TC",
      "evidence": "ES",
      "comparative": "CR-N",
      "is_primary": false,
      "confidence": "high",
      "entity": "manager",
      "entity_type": "staff",
      "relation_type": "resolution",
      "related_span_index": 1,
      "usn": "URT:S:A2.01:+2:21TC.ES.N"
    }
  ],
  "review_summary": {
    "dominant_valence": "V±",
    "dominant_domain": "J",
    "span_count": 3,
    "has_comparative": true,
    "has_entity": true
  }
}
```

**Why this classification**:
- Span 0: Food quality (O1.01), superlative language (I3), "best ever" is comparative (CR-B)
- Span 1: Wait time (J1.02), specific time (S3), actionable (A3), negative + I3 = primary
- Span 2: Staff action (A2.01), resolution of span 1's problem

---

### Example 2: Single-span review

**Input**:
```json
{
  "review_text": "Great place!",
  "profile": "standard"
}
```

**Output**:
```json
{
  "spans": [
    {
      "span_index": 0,
      "span_text": "Great place!",
      "span_start": 0,
      "span_end": 12,
      "urt_primary": "O1.01",
      "urt_secondary": [],
      "valence": "V+",
      "intensity": "I2",
      "specificity": "S1",
      "actionability": "A1",
      "temporal": "TC",
      "evidence": "ES",
      "comparative": "CR-N",
      "is_primary": true,
      "confidence": "medium",
      "entity": null,
      "entity_type": null,
      "relation_type": null,
      "related_span_index": null,
      "usn": "URT:S:O1.01:+2:11TC.ES.N"
    }
  ],
  "review_summary": {
    "dominant_valence": "V+",
    "dominant_domain": "O",
    "span_count": 1,
    "has_comparative": false,
    "has_entity": false
  }
}
```

**Why this classification**:
- Vague positive (S1, A1), default to O1.01 for general positive
- Lower confidence due to lack of specificity

---

### Example 3: Historical + future temporal markers

**Input**:
```json
{
  "review_text": "I've been coming here for years and the quality has always been consistent. Won't be back after today though - they've really gone downhill.",
  "profile": "standard"
}
```

**Output**:
```json
{
  "spans": [
    {
      "span_index": 0,
      "span_text": "I've been coming here for years and the quality has always been consistent.",
      "span_start": 0,
      "span_end": 75,
      "urt_primary": "R1.01",
      "urt_secondary": ["O1.01"],
      "valence": "V+",
      "intensity": "I2",
      "specificity": "S2",
      "actionability": "A1",
      "temporal": "TH",
      "evidence": "ES",
      "comparative": "CR-N",
      "is_primary": false,
      "confidence": "high",
      "entity": null,
      "entity_type": null,
      "relation_type": "contrast",
      "related_span_index": 1,
      "usn": "URT:S:R1.01+O1.01:+2:21TH.ES.N"
    },
    {
      "span_index": 1,
      "span_text": "Won't be back after today though - they've really gone downhill.",
      "span_start": 76,
      "span_end": 140,
      "urt_primary": "R1.02",
      "urt_secondary": [],
      "valence": "V-",
      "intensity": "I3",
      "specificity": "S1",
      "actionability": "A1",
      "temporal": "TF",
      "evidence": "ES",
      "comparative": "CR-N",
      "is_primary": true,
      "confidence": "high",
      "entity": null,
      "entity_type": null,
      "relation_type": "contrast",
      "related_span_index": 0,
      "usn": "URT:S:R1.02:-3:11TF.ES.N"
    }
  ],
  "review_summary": {
    "dominant_valence": "V-",
    "dominant_domain": "R",
    "span_count": 2,
    "has_comparative": false,
    "has_entity": false
  }
}
```

**Why this classification**:
- Span 0: Historical loyalty (TH), secondary O1.01 for quality mention
- Span 1: Future intent (TF), I3 due to "really" + finality of "won't be back"
- Mutual contrast relationship

---

### Example 4: Comparative review

**Input**:
```json
{
  "review_text": "Way better than the other coffee shops in the area. Their lattes are smoother and the prices are actually reasonable.",
  "profile": "standard"
}
```

**Output**:
```json
{
  "spans": [
    {
      "span_index": 0,
      "span_text": "Way better than the other coffee shops in the area.",
      "span_start": 0,
      "span_end": 51,
      "urt_primary": "O1.01",
      "urt_secondary": [],
      "valence": "V+",
      "intensity": "I3",
      "specificity": "S1",
      "actionability": "A1",
      "temporal": "TC",
      "evidence": "ES",
      "comparative": "CR-B",
      "is_primary": true,
      "confidence": "high",
      "entity": null,
      "entity_type": null,
      "relation_type": null,
      "related_span_index": null,
      "usn": "URT:S:O1.01:+3:11TC.ES.B"
    },
    {
      "span_index": 1,
      "span_text": "Their lattes are smoother",
      "span_start": 52,
      "span_end": 77,
      "urt_primary": "O1.02",
      "urt_secondary": [],
      "valence": "V+",
      "intensity": "I2",
      "specificity": "S2",
      "actionability": "A1",
      "temporal": "TC",
      "evidence": "ES",
      "comparative": "CR-B",
      "is_primary": false,
      "confidence": "high",
      "entity": "lattes",
      "entity_type": "product",
      "relation_type": null,
      "related_span_index": null,
      "usn": "URT:S:O1.02:+2:21TC.ES.B"
    },
    {
      "span_index": 2,
      "span_text": "and the prices are actually reasonable.",
      "span_start": 78,
      "span_end": 117,
      "urt_primary": "P1.01",
      "urt_secondary": [],
      "valence": "V+",
      "intensity": "I2",
      "specificity": "S2",
      "actionability": "A1",
      "temporal": "TC",
      "evidence": "ES",
      "comparative": "CR-B",
      "is_primary": false,
      "confidence": "high",
      "entity": null,
      "entity_type": null,
      "relation_type": null,
      "related_span_index": null,
      "usn": "URT:S:P1.01:+2:21TC.ES.B"
    }
  ],
  "review_summary": {
    "dominant_valence": "V+",
    "dominant_domain": "O",
    "span_count": 3,
    "has_comparative": true,
    "has_entity": true
  }
}
```

---

## 5. Validation Rules

### 5.1 Structural Validation (pre-insert)

| Rule | Check | Error |
|------|-------|-------|
| Span count | `1 <= spans.length <= 15` | INVALID_SPAN_COUNT |
| Exactly one primary | `spans.filter(s => s.is_primary).length === 1` | INVALID_PRIMARY_COUNT |
| Contiguous indices | `spans[i].span_index === i` for all i | NON_CONTIGUOUS_INDEX |
| Non-overlapping | `spans[i].span_end <= spans[i+1].span_start` | OVERLAPPING_SPANS |
| Valid offsets | `span_end > span_start && span_start >= 0` | INVALID_OFFSETS |
| Text matches | `review_text.slice(span_start, span_end) ~= span_text` | TEXT_MISMATCH |
| USN format | Matches regex for profile | INVALID_USN |
| Self-reference | `related_span_index !== span_index` | SELF_REFERENCE |
| Related exists | `related_span_index < spans.length` | INVALID_RELATION |
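
The offset, index, and relation checks in the table can be sketched in Python as follows. This is an illustrative reading of the table, not the contract's reference validator: `structural_errors` is a hypothetical name, each error code is reported at most once, a null `related_span_index` is treated as "no relation", and the TEXT_MISMATCH and INVALID_USN checks are left out.

```python
def structural_errors(spans: list[dict]) -> list[str]:
    """Return the structural error codes triggered by a list of spans (illustrative)."""
    errors = []
    if not 1 <= len(spans) <= 15:
        errors.append("INVALID_SPAN_COUNT")
    if sum(1 for s in spans if s["is_primary"]) != 1:
        errors.append("INVALID_PRIMARY_COUNT")
    if any(s["span_index"] != i for i, s in enumerate(spans)):
        errors.append("NON_CONTIGUOUS_INDEX")
    # Compare each span with its successor; end offsets are exclusive.
    if any(a["span_end"] > b["span_start"] for a, b in zip(spans, spans[1:])):
        errors.append("OVERLAPPING_SPANS")
    if any(s["span_start"] < 0 or s["span_end"] <= s["span_start"] for s in spans):
        errors.append("INVALID_OFFSETS")
    for s in spans:
        related = s.get("related_span_index")
        if related is None:  # assumption: null means the span has no relation
            continue
        if related == s["span_index"] and "SELF_REFERENCE" not in errors:
            errors.append("SELF_REFERENCE")
        if related >= len(spans) and "INVALID_RELATION" not in errors:
            errors.append("INVALID_RELATION")
    return errors
```

An empty list means the spans passed these checks; a non-empty list can be fed back into the retry prompt described in Section 6.1.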

### 5.2 Semantic Validation (warnings, not errors)

| Rule | Check | Warning |
|------|-------|---------|
| Secondary domain | Secondary codes should differ from primary domain | SAME_DOMAIN_SECONDARY |
| Over-splitting | More than 3 spans per sentence | POSSIBLE_OVERSPLIT |
| Intensity/valence match | I3 + V0 is unusual | UNUSUAL_INTENSITY_VALENCE |
| Specificity/actionability | S1 + A3 is rare | UNUSUAL_SPEC_ACTION |
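
Three of these warnings depend only on a single span's fields and can be sketched directly. The helper name `semantic_warnings` is hypothetical, and POSSIBLE_OVERSPLIT is omitted because it needs a sentence segmenter, which this contract does not specify.

```python
def semantic_warnings(span: dict) -> list[str]:
    """Return the per-span semantic warning codes (illustrative, non-blocking)."""
    warnings = []
    # The domain is the leading letter of a Tier-3 code, e.g. "O" in "O1.01".
    primary_domain = span["urt_primary"][0]
    if any(code[0] == primary_domain for code in span["urt_secondary"]):
        warnings.append("SAME_DOMAIN_SECONDARY")
    if span["intensity"] == "I3" and span["valence"] == "V0":
        warnings.append("UNUSUAL_INTENSITY_VALENCE")
    if span["specificity"] == "S1" and span["actionability"] == "A3":
        warnings.append("UNUSUAL_SPEC_ACTION")
    return warnings
```

Because these are warnings rather than errors, a caller would typically log them or lower the span's `confidence` instead of triggering a retry.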

### 5.3 Text Matching Rules

Before comparing `span_text` with the slice at the span's offsets, normalize both strings:
- Whitespace collapse: multiple spaces → single space
- Trim: leading/trailing whitespace
- Case: must match exactly (no case normalization)

```python
def text_matches(review_text: str, span: dict) -> bool:
    """Check span_text against the review slice at the span's offsets."""
    expected = review_text[span['span_start']:span['span_end']]
    actual = span['span_text']

    # Normalize whitespace: split() drops leading/trailing whitespace and
    # collapses runs; case is left untouched, so the match is case-sensitive.
    expected_norm = ' '.join(expected.split())
    actual_norm = ' '.join(actual.split())

    return expected_norm == actual_norm
```

---

## 6. Error Handling

### 6.1 Retry Strategy

| Error Type | Action |
|------------|--------|
| JSON parse error | Retry with "Return ONLY valid JSON" appended |
| Schema validation error | Retry with specific field errors in prompt |
| Offset mismatch | Retry with "Offsets must match exactly" warning |
| No primary span | Auto-select using primary selection rules |
| Multiple primary spans | Keep first by selection rules, unset others |
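
The last two rows can be repaired locally instead of retried. Below is a minimal sketch that applies the prompt's selection order (intensity, then valence, then earliest index); `repair_primary` and the two rank tables are hypothetical names, and the function assumes a non-empty span list.

```python
# Lower rank wins, mirroring I3 > I2 > I1 and V- > V± > V0 > V+.
INTENSITY_RANK = {"I3": 0, "I2": 1, "I1": 2}
VALENCE_RANK = {"V-": 0, "V±": 1, "V0": 2, "V+": 3}


def repair_primary(spans: list[dict]) -> list[dict]:
    """Return copies of the spans with exactly one is_primary=True (illustrative)."""
    # With several flagged spans, choose among them; with none, choose among all.
    flagged = [s for s in spans if s["is_primary"]]
    candidates = flagged or spans
    winner = min(
        candidates,
        key=lambda s: (INTENSITY_RANK[s["intensity"]], VALENCE_RANK[s["valence"]], s["span_index"]),
    )
    return [{**s, "is_primary": s["span_index"] == winner["span_index"]} for s in spans]
```

The function returns new dictionaries rather than mutating its input, so the raw model output can still be stored for auditing.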

### 6.2 Fallback Behavior

If the LLM still fails after 3 retries, substitute a single-span response that covers the whole review:

```python
def fallback_single_span(review_text: str) -> dict:
    """Create minimal valid response for failed classification."""
    return {
        "spans": [{
            "span_index": 0,
            "span_text": review_text,
            "span_start": 0,
            "span_end": len(review_text),
            "urt_primary": "O1.01",  # Default: general offering
            "urt_secondary": [],
            "valence": "V0",  # Neutral - we don't know
            "intensity": "I1",
            "specificity": "S1",
            "actionability": "A1",
            "temporal": "TC",
            "evidence": "ES",
            "comparative": "CR-N",
            "is_primary": True,
            "confidence": "low",
            "entity": None,
            "entity_type": None,
            "relation_type": None,
            "related_span_index": None,
            "usn": "URT:S:O1.01:01:11TC.ES.N"
        }],
        "review_summary": {
            "dominant_valence": "V0",
            "dominant_domain": "O",
            "span_count": 1,
            "has_comparative": False,
            "has_entity": False
        },
        # These two keys are not in the Section 3 schema, whose top level sets
        # additionalProperties: false; strip them before schema validation.
        "_fallback": True,
        "_error": "Classification failed after 3 retries"
    }
```
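
One way to tie the retry table and the fallback together is a small driver loop. The sketch below is an assumption about how the pieces could be wired, not part of the contract: `call_llm`, `validate`, and `fallback` are hypothetical callables supplied by the caller, `call_llm` is assumed to raise `ValueError` on unparseable output (`json.JSONDecodeError` is a `ValueError` subclass), and "3 retries" is read as one initial attempt plus three more.

```python
from typing import Callable


def classify_with_retries(
    review_text: str,
    call_llm: Callable[[str, list[str]], dict],
    validate: Callable[[str, dict], list[str]],
    fallback: Callable[[str], dict],
    max_retries: int = 3,
) -> dict:
    """Call the model, feeding validation errors back, then fall back (illustrative)."""
    feedback: list[str] = []
    for _ in range(1 + max_retries):  # initial attempt + retries
        try:
            response = call_llm(review_text, feedback)
        except ValueError as exc:  # parse failure: ask for JSON only next time
            feedback = ["Return ONLY valid JSON", str(exc)]
            continue
        errors = validate(review_text, response)
        if not errors:
            return response
        feedback = errors  # specific errors go into the next prompt
    return fallback(review_text)
```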

---

## 7. Performance Considerations

### 7.1 Prompt Token Budget

| Component | Tokens (approx) |
|-----------|-----------------|
| System prompt | ~800 |
| Schema | ~400 |
| 3 few-shot examples | ~1,200 |
| Average review input | ~100 |
| **Total input** | ~2,500 |
| Average output | ~300-800 |

### 7.2 Batching

For high-volume processing, consider:
- Batch 5-10 short reviews per request
- Use `review_id` field in input/output for correlation
- Validate each review's spans independently
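
A minimal sketch of the batching and correlation steps, assuming each input review is a dictionary with `review_id` and `review_text` keys and each batch result echoes `review_id` back. Both helper names are hypothetical.

```python
def make_batches(reviews: list[dict], batch_size: int = 5) -> list[list[dict]]:
    """Split reviews into consecutive batches of at most batch_size items."""
    return [reviews[i:i + batch_size] for i in range(0, len(reviews), batch_size)]


def correlate_results(batch: list[dict], results: list[dict]) -> dict:
    """Map each review_id in the batch to its result, or None if the model dropped it."""
    by_id = {r["review_id"]: r for r in results}
    return {review["review_id"]: by_id.get(review["review_id"]) for review in batch}
```

Reviews that map to `None` can be resubmitted individually, which keeps one bad item from invalidating the rest of its batch.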

### 7.3 Caching

Cache key: `sha256(review_text + model_version + prompt_version)`

Invalidate on:
- Model version change
- Prompt version change
- URT code taxonomy change
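
The cache key can be computed with the standard library's `hashlib`. This sketch departs from the plain concatenation above in two labeled ways: it joins the parts with a separator so that different splits of the same characters cannot collide, and it adds a hypothetical `taxonomy_version` parameter so a taxonomy change also changes the key.

```python
import hashlib


def cache_key(
    review_text: str,
    model_version: str,
    prompt_version: str,
    taxonomy_version: str = "5.1",  # assumption: not part of the documented key
) -> str:
    """Return a hex SHA-256 digest identifying one classification request."""
    # The unit separator (\x1f) keeps ("ab", "c") distinct from ("a", "bc").
    payload = "\x1f".join([review_text, model_version, prompt_version, taxonomy_version])
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```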

---

## 8. Version History

| Version | Date | Changes |
|---------|------|---------|
| 1.0 | 2026-01-24 | Initial contract for URT-Standard profile |

---

## 9. Future Extensions (v2.0)

- **Full profile support**: Add `causal_chain` to output schema
- **Confidence calibration**: Train confidence based on validation results
- **Entity linking**: Link entities across reviews for trend detection
- **Multi-language**: Add language detection and localized prompts

---

*End of LLM Classification Contract v1.0*