
# Universal Review Taxonomy (URT) v5.1 Reference

## Overview

The Universal Review Taxonomy (URT) is a classification system for customer feedback. It provides a structured approach to categorizing, annotating, and analyzing review content across any industry.

### Key Characteristics

- **Three Profiles**: Core, Standard, Full (increasing detail)
- **Seven Domains**: Covering all aspects of customer experience
- **Tier-3 Canonical Codes**: Format `X#.##` (e.g., J1.02, P2.15)
- **Dimensional Annotation**: Valence, intensity, specificity, and more
- **Causal Analysis**: Root cause chains (Full profile)

---

## Domain Codes

URT organizes feedback into seven domains, each identified by a single letter.

| Domain | Letter | Description |
|--------|--------|-------------|
| Offering | O | Product/service quality |
| Price | P | Value, pricing, promotions |
| Journey | J | Customer experience, timing, process |
| Environment | E | Physical/digital space |
| Attitude | A | Staff behavior, service attitude |
| Voice | V | Brand, communication, marketing |
| Relationship | R | Loyalty, trust, long-term relationship |

### Tier-3 Code Format

```
Pattern: [OPJEAVR][1-4]\.[0-9]{2}
```

Examples:
- `J1.02` - Journey domain, category 1, subcategory 02
- `P2.15` - Price domain, category 2, subcategory 15
- `A3.01` - Attitude domain, category 3, subcategory 01
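
As a minimal sketch, the pattern above can be applied with Python's `re` module to validate a code and split it into its parts. The function name `parse_tier3` and the returned dictionary keys are illustrative assumptions, not part of URT.

```python
import re

# Pattern copied from the Tier-3 format above; fullmatch rejects extra text.
TIER3_RE = re.compile(r"[OPJEAVR][1-4]\.[0-9]{2}")

# Domain letters and names from the Domain Codes table.
DOMAINS = {
    "O": "Offering", "P": "Price", "J": "Journey", "E": "Environment",
    "A": "Attitude", "V": "Voice", "R": "Relationship",
}


def parse_tier3(code: str) -> dict:
    """Split a Tier-3 code into domain, category, and subcategory.

    Hypothetical helper: the name and return shape are assumptions.
    """
    if TIER3_RE.fullmatch(code) is None:
        raise ValueError(f"not a Tier-3 code: {code!r}")
    return {
        "domain": DOMAINS[code[0]],
        "category": int(code[1]),
        "subcategory": code[3:],  # kept as a string to preserve the leading zero
    }
```

For example, `parse_tier3("J1.02")` yields the Journey domain, category 1, subcategory `"02"`, while `parse_tier3("J5.02")` raises `ValueError` because the pattern allows only categories 1 to 4.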

---

## Dimension Codes

### Valence

Indicates the sentiment direction of the feedback.

| Code | Meaning |
|------|---------|
| V+ | Positive |
| V- | Negative |
| V0 | Neutral |
| V± | Mixed |

### Intensity

Indicates the strength of the expressed sentiment.

| Code | Meaning |
|------|---------|
| I1 | Low intensity |
| I2 | Moderate intensity |
| I3 | High intensity |

### Specificity (Standard+)

Indicates how detailed the feedback is.

| Code | Meaning |
|------|---------|
| S1 | Low - vague, general |
| S2 | Medium - some detail |
| S3 | High - specific, precise |

### Actionability (Standard+)

Indicates whether clear actions can be derived from the feedback.

| Code | Meaning |
|------|---------|
| A1 | None - no clear action |
| A2 | Unclear - possible actions |
| A3 | Clear - specific, actionable |

### Temporal (Standard+)

Indicates the time frame referenced in the feedback.

| Code | Meaning | Markers |
|------|---------|---------|
| TC | Current - this visit | "today", "this time", "yesterday" |
| TR | Recent - last few visits | "lately", "recently", "again" |
| TH | Historical - long-standing | "for years", "always", "historically" |
| TF | Future - expectations | "I won't come back", "next time" |

**Default**: TC when no temporal language exists.

### Evidence (Standard+)

Indicates how the information was obtained from the text.

| Code | Meaning | Example |
|------|---------|---------|
| ES | Stated - explicit in text | "Waited 45 minutes" |
| EI | Inferred - logically entailed | "Took 3 weeks to reply" → slow response |
| EC | Contextual - depends on context | "That happened again" |

**Default**: ES. Use EI or EC only when the information is not stated explicitly in the text.

### Comparative (Standard+)

Indicates whether the feedback compares to alternatives.

| Code | Meaning |
|------|---------|
| CR-N | No comparison |
| CR-B | Better than alternatives |
| CR-W | Worse than alternatives |
| CR-S | Same as alternatives |

---

## USN (URT String Notation)

USN is a compact string encoding for URT annotations.

### Grammar

```
Standard: URT:S:{codes}:{V}{I}:{S}{A}{T}.{E}.{CR}
Full:     URT:F:{codes}:{V}{I}:{S}{A}{T}.{E}.{CR}:{causal}
```

### Encoding Rules

**Valence**:
- `+` for V+
- `-` for V-

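**Intensity**:
- `1` for I1
- `2` for I2
- `3` for I3

The examples below also show how the remaining fields are written:
- **Specificity and Actionability**: digit only (`22` for S2, A2)
- **Temporal and Evidence**: full code (`TC`, `ES`)
- **Comparative**: suffix letter only (`N` for CR-N, `S` for CR-S)
- **Causal** (Full only): comma-separated codes, with a dot in place of the hyphen (`CD.O` for CD-O)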

### Examples

**Standard Profile**:
```
URT:S:J1.03:-2:22TC.ES.N
```
Decoded:
- Profile: Standard
- Code: J1.03
- Valence: V- (negative)
- Intensity: I2
- Specificity: S2
- Actionability: A2
- Temporal: TC
- Evidence: ES
- Comparative: CR-N

**Full Profile with Causal Chain**:
```
URT:F:J1.01+A1.04:-3:23TR.EI.S:CD.O,MG.O
```
Decoded:
- Profile: Full
- Codes: J1.01, A1.04
- Valence: V- (negative)
- Intensity: I3
- Specificity: S2
- Actionability: A3
- Temporal: TR
- Evidence: EI
- Comparative: CR-S
- Causal: CD.O (Conditions-Operational), MG.O (Management-Oversight)
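
The two decodings above can be reproduced with a small parser. This is a sketch under stated assumptions, not a reference implementation: `parse_usn` and its field names are hypothetical, only the `+` and `-` valence symbols listed under Encoding Rules are accepted, and the comparative letters `B` and `W` are extrapolated from the `N` and `S` seen in the examples.

```python
import re

PROFILES = {"S": "Standard", "F": "Full"}
# Only "+" and "-" are documented; V0 and V± have no listed USN symbol.
VALENCE = {"+": "V+", "-": "V-"}
# {S}{A}{T}.{E}.{CR}; the CR letters B and W are an assumption (see lead-in).
DIMS_RE = re.compile(r"([1-3])([1-3])(TC|TR|TH|TF)\.(ES|EI|EC)\.([NBWS])")


def parse_usn(usn: str) -> dict:
    """Decode a Standard or Full USN string into named fields."""
    parts = usn.split(":")
    if len(parts) not in (5, 6) or parts[0] != "URT" or parts[1] not in PROFILES:
        raise ValueError(f"not a USN string: {usn!r}")
    if len(parts) == 6 and parts[1] != "F":
        raise ValueError("only the Full profile carries a causal segment")

    vi = parts[3]
    if len(vi) != 2 or vi[0] not in VALENCE or vi[1] not in "123":
        raise ValueError(f"bad valence/intensity segment: {vi!r}")

    dims = DIMS_RE.fullmatch(parts[4])
    if dims is None:
        raise ValueError(f"bad dimension segment: {parts[4]!r}")
    s, a, t, e, cr = dims.groups()

    return {
        "profile": PROFILES[parts[1]],
        "codes": parts[2].split("+"),
        "valence": VALENCE[vi[0]],
        "intensity": f"I{vi[1]}",
        "specificity": f"S{s}",
        "actionability": f"A{a}",
        "temporal": t,
        "evidence": e,
        "comparative": f"CR-{cr}",
        # Causal codes are kept exactly as written in the string (e.g. "CD.O").
        "causal": parts[5].split(",") if len(parts) == 6 else [],
    }
```

Calling `parse_usn("URT:F:J1.01+A1.04:-3:23TR.EI.S:CD.O,MG.O")` returns the same fields as the Full profile decoding above. The sketch does not check that each entry in `codes` is a valid Tier-3 code.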

---

## Causal Chain (Full Profile Only)

The causal chain identifies root causes across three layers, ordered from immediate to systemic.

### Layers

| Layer | Codes | Scope |
|-------|-------|-------|
| conditions | CD-S, CD-T, CD-E, CD-F, CD-O | Staff State, Team Dynamics, Equipment, Facility, Operational |
| management | MG-P, MG-T, MG-O, MG-R, MG-C | Planning, Training, Oversight, Resources, Communication |
| systemic | SY-R, SY-P, SY-C, SY-S, SY-H, SY-X | Resource Decisions, Policy, Culture, Standards, Human Capital, External |

### Code Reference

**Conditions Layer**:
- `CD-S` - Staff State
- `CD-T` - Team Dynamics
- `CD-E` - Equipment
- `CD-F` - Facility
- `CD-O` - Operational

**Management Layer**:
- `MG-P` - Planning
- `MG-T` - Training
- `MG-O` - Oversight
- `MG-R` - Resources
- `MG-C` - Communication

**Systemic Layer**:
- `SY-R` - Resource Decisions
- `SY-P` - Policy
- `SY-C` - Culture
- `SY-S` - Standards
- `SY-H` - Human Capital
- `SY-X` - External

### JSONB Schema

```json
[
  {"layer": "conditions", "code": "CD-O", "evidence": "ES"},
  {"layer": "management", "code": "MG-P", "evidence": "EI"}
]
```

### Constraints

- Maximum 3 entries (one per layer)
- Include an entry only when the text explicitly supports it
- Order: conditions → management → systemic
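
The structural constraints can be checked mechanically against entries shaped like the JSONB schema above. The following is a minimal sketch; `validate_causal_chain` is a hypothetical helper, and it cannot verify whether the review text supports an entry, which remains an annotator judgment.

```python
# Layer membership copied from the Layers table above.
LAYER_CODES = {
    "conditions": {"CD-S", "CD-T", "CD-E", "CD-F", "CD-O"},
    "management": {"MG-P", "MG-T", "MG-O", "MG-R", "MG-C"},
    "systemic": {"SY-R", "SY-P", "SY-C", "SY-S", "SY-H", "SY-X"},
}
LAYER_ORDER = ["conditions", "management", "systemic"]


def validate_causal_chain(chain: list[dict]) -> list[str]:
    """Return a list of constraint violations; an empty list means valid."""
    errors = []
    if len(chain) > 3:
        errors.append("more than 3 entries")

    for entry in chain:
        layer, code = entry.get("layer"), entry.get("code")
        if layer not in LAYER_CODES:
            errors.append(f"unknown layer: {layer!r}")
        elif code not in LAYER_CODES[layer]:
            errors.append(f"code {code!r} does not belong to layer {layer!r}")

    # Uniqueness and ordering are checked on recognized layers only.
    known = [e.get("layer") for e in chain if e.get("layer") in LAYER_CODES]
    if len(set(known)) != len(known):
        errors.append("more than one entry for a layer")
    ranks = [LAYER_ORDER.index(layer) for layer in known]
    if ranks != sorted(ranks):
        errors.append("entries are not ordered conditions, management, systemic")
    return errors
```

Returning a list of messages rather than raising on the first problem lets a caller report every violation in one pass.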

---

## Span Boundary Detection Rules

Spans are detected at the clause or topic level, not at the sentence level.

### Split Rules (in priority order)

1. **Split on contrasting conjunctions**: but, however, although, despite, yet
2. **Split when subject/target changes** (topic shift)
3. **Split when valence changes** (positive ↔ negative)
4. **Split when domain changes** (O/P/J/E/A/V/R)
5. **Keep together** for cause→effect within same feedback unit

### Guidelines

- **Maximum**: ~3 spans per sentence
- **Validation**: If 4+ spans detected, re-check for over-splitting

### Example

**Input**:
> "The food was great but the service was slow and the bathroom was dirty."

**Output**: 3 spans
1. "The food was great" (Offering, positive)
2. "the service was slow" (Journey/Attitude, negative)
3. "the bathroom was dirty" (Environment, negative)

**Reasoning**: Topic shift + domain shift at each boundary.

---

## Primary Span Selection

When a review contains multiple spans, select the primary span using these criteria in order:

### Selection Priority

1. **Highest intensity** (I3 > I2 > I1)
2. **Tie-break**: Negative over positive (V- > V± > V0 > V+)
3. **Tie-break**: Earliest span_index

### Example

Given spans:
- Span 0: I2, V+
- Span 1: I3, V+
- Span 2: I3, V-

**Primary**: Span 2 (highest intensity I3, negative valence wins tie-break)
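
The three criteria map naturally onto a composite sort key. Below is a minimal sketch, assuming each span is a dictionary with `intensity`, `valence`, and `span_index` keys; that shape and the name `select_primary` are illustrative assumptions.

```python
# Lower rank wins. Orderings are taken from the Selection Priority list.
INTENSITY_RANK = {"I3": 0, "I2": 1, "I1": 2}
VALENCE_RANK = {"V-": 0, "V±": 1, "V0": 2, "V+": 3}


def select_primary(spans: list[dict]) -> dict:
    """Pick the primary span: intensity, then valence, then earliest index."""
    return min(
        spans,
        key=lambda s: (
            INTENSITY_RANK[s["intensity"]],
            VALENCE_RANK[s["valence"]],
            s["span_index"],
        ),
    )


# The three spans from the example above.
spans = [
    {"span_index": 0, "intensity": "I2", "valence": "V+"},
    {"span_index": 1, "intensity": "I3", "valence": "V+"},
    {"span_index": 2, "intensity": "I3", "valence": "V-"},
]
primary = select_primary(spans)
```

Here `primary["span_index"]` is 2, matching the example: spans 1 and 2 tie on intensity, and the negative valence breaks the tie.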

---

## Secondary Codes Rules

Secondary codes capture additional topics mentioned in a span.

### Constraints

- **Maximum**: 2 secondary codes
- **Format**: Must be Tier-3 (X#.##)
- **Recommendation**: Should come from a different domain than the primary code

### Example

Primary: `J1.03` (Journey)
Secondary: `A2.01`, `E1.05` (Attitude, Environment)
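
A short sketch of how these constraints might be checked. It treats the maximum and the format as hard errors and the domain recommendation as a warning only; the function name `check_secondary_codes` and that error/warning split are assumptions made for illustration.

```python
import re

# Tier-3 pattern from the Tier-3 Code Format section.
TIER3_RE = re.compile(r"[OPJEAVR][1-4]\.[0-9]{2}")


def check_secondary_codes(primary: str, secondary: list[str]) -> tuple[list[str], list[str]]:
    """Return (errors, warnings) for a span's secondary codes."""
    errors, warnings = [], []
    if len(secondary) > 2:
        errors.append("more than 2 secondary codes")
    for code in secondary:
        if TIER3_RE.fullmatch(code) is None:
            errors.append(f"not a Tier-3 code: {code!r}")
        elif code[0] == primary[0]:
            # Same domain letter as the primary: allowed, but discouraged.
            warnings.append(f"{code} shares domain {primary[0]} with the primary")
    return errors, warnings
```

With the example above, `check_secondary_codes("J1.03", ["A2.01", "E1.05"])` returns two empty lists.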

---

## Quick Reference Card

### Profiles

| Profile | Dimensions | Causal Chain |
|---------|------------|--------------|
| Core | V, I | No |
| Standard | V, I, S, A, T, E, CR | No |
| Full | V, I, S, A, T, E, CR | Yes |

### USN Quick Format

```
URT:{S|F}:{codes}:{V}{I}:{S}{A}{T}.{E}.{CR}[:{causal}]
```

### Domain Letters

```
O P J E A V R
│ │ │ │ │ │ └─ Relationship
│ │ │ │ │ └─── Voice
│ │ │ │ └───── Attitude
│ │ │ └─────── Environment
│ │ └───────── Journey
│ └─────────── Price
└───────────── Offering
```