Alejandro Gutiérrez 544e028c3f Phase 0: Project restructure to ReviewIQ platform architecture

New structure:
- scrapers/google_reviews/v1_0_0.py (was modules/scraper_clean.py)
- scrapers/base.py (BaseScraper interface)
- scrapers/registry.py (ScraperRegistry for version routing)
- core/database.py, models.py, config.py, enums.py
- utils/logger.py, crash_analyzer.py, health_checks.py, helpers.py, date_converter.py
- workers/chrome_pool.py
- services/webhook_service.py
- api/ routes structure (empty, ready for Phase 2)
- tests/ structure mirroring source

All imports updated in:
- api_server_production.py (7 import paths updated)
- utils/health_checks.py (scraper import path)

Legacy modules moved to modules/_legacy/:
- data_storage.py, image_handler.py, s3_handler.py (unused)

Syntax verified, frontend build passing.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

2026-01-24 15:22:08 +00:00


Universal Review Taxonomy (URT) v5.1 Reference

Overview

The Universal Review Taxonomy (URT) is a classification system for customer feedback. It provides a structured approach to categorizing, annotating, and analyzing review content across any industry.

Key Characteristics

Three Profiles: Core, Standard, Full (increasing detail)
Seven Domains: Covering all aspects of customer experience
Tier-3 Canonical Codes: Format X#.## (e.g., J1.02, P2.15)
Dimensional Annotation: Valence, intensity, specificity, and more
Causal Analysis: Root cause chains (Full profile)

Domain Codes

URT organizes feedback into seven domains, each identified by a single letter.

Domain	Letter	Description
Offering	O	Product/service quality
Price	P	Value, pricing, promotions
Journey	J	Customer experience, timing, process
Environment	E	Physical/digital space
Attitude	A	Staff behavior, service attitude
Voice	V	Brand, communication, marketing
Relationship	R	Loyalty, trust, long-term relationship

Tier-3 Code Format

Pattern: [OPJEAVR][1-4]\.[0-9]{2}

Examples:

J1.02 - Journey domain, category 1, subcategory 02
P2.15 - Price domain, category 2, subcategory 15
A3.01 - Attitude domain, category 3, subcategory 01
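The pattern above can be checked mechanically. The following is a minimal sketch, not part of the URT reference itself: the function name `parse_tier3` and the returned dictionary keys are illustrative assumptions, while the pattern and the domain letters are taken from this document.

```python
import re

# Pattern from "Tier-3 Code Format", with capture groups added for each part.
TIER3_PATTERN = re.compile(r"([OPJEAVR])([1-4])\.([0-9]{2})")

# Domain letters from the "Domain Codes" table.
DOMAINS = {
    "O": "Offering",
    "P": "Price",
    "J": "Journey",
    "E": "Environment",
    "A": "Attitude",
    "V": "Voice",
    "R": "Relationship",
}


def parse_tier3(code: str) -> dict:
    """Split a Tier-3 code into domain, category, and subcategory.

    Hypothetical helper: the name and return shape are assumptions.
    """
    # fullmatch rejects leading or trailing characters such as "J1.02x".
    match = TIER3_PATTERN.fullmatch(code)
    if match is None:
        raise ValueError(f"not a Tier-3 code: {code!r}")
    letter, category, subcategory = match.groups()
    return {
        "domain": DOMAINS[letter],
        "category": int(category),
        "subcategory": subcategory,  # kept as text to preserve the leading zero
    }


print(parse_tier3("J1.02"))
```

Codes outside the pattern, such as `X1.02` or `J5.01`, raise `ValueError` in this sketch.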

Dimension Codes

Valence

Indicates the sentiment direction of the feedback.

Code	Meaning
V+	Positive
V-	Negative
V0	Neutral
V±	Mixed

Intensity

Indicates the strength of the expressed sentiment.

Code	Meaning
I1	Low intensity
I2	Moderate intensity
I3	High intensity

Specificity (Standard+)

Indicates how detailed the feedback is.

Code	Meaning
S1	Low - vague, general
S2	Medium - some detail
S3	High - specific, precise

Actionability (Standard+)

Indicates whether clear actions can be derived from the feedback.

Code	Meaning
A1	None - no clear action
A2	Unclear - possible actions
A3	Clear - specific and actionable

Temporal (Standard+)

Indicates the time frame referenced in the feedback.

Code	Meaning	Markers
TC	Current - this visit	"today", "this time", "yesterday"
TR	Recent - last few visits	"lately", "recently", "again"
TH	Historical - long-standing	"for years", "always", "historically"
TF	Future - expectations	"I won't come back", "next time"

Default: TC when no temporal language exists.
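As a rough illustration of the marker table and the TC default, a keyword lookup could look like the sketch below. This is a naive heuristic under stated assumptions: the function name `guess_temporal` and the order in which the codes are checked are hypothetical, and real reviews will need more than the listed marker phrases.

```python
import re

# Marker phrases from the Temporal table. The check order (TF, TH, TR, TC)
# is an assumption made for this sketch; the reference does not define one.
TEMPORAL_MARKERS = [
    ("TF", ["i won't come back", "next time"]),
    ("TH", ["for years", "always", "historically"]),
    ("TR", ["lately", "recently", "again"]),
    ("TC", ["today", "this time", "yesterday"]),
]


def guess_temporal(text: str) -> str:
    """Return the first temporal code whose marker appears in the text."""
    lowered = text.lower()
    for code, markers in TEMPORAL_MARKERS:
        for marker in markers:
            # Word boundaries stop "again" from matching inside "against".
            if re.search(r"\b" + re.escape(marker) + r"\b", lowered):
                return code
    # Default from the reference: TC when no temporal language exists.
    return "TC"


print(guess_temporal("The line was slow again"))
```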

Evidence (Standard+)

Indicates how the information was obtained from the text.

Code	Meaning	Example
ES	Stated - explicit in text	"Waited 45 minutes"
EI	Inferred - logically entailed	"Took 3 weeks to reply" → slow response
EC	Contextual - depends on context	"That happened again"

Default: ES. Use EI or EC only when the information is not stated explicitly in the text.

Comparative (Standard+)

Indicates whether the feedback compares to alternatives.

Code	Meaning
CR-N	No comparison
CR-B	Better than alternatives
CR-W	Worse than alternatives
CR-S	Same as alternatives

USN (URT String Notation)

USN is a compact string encoding for URT annotations.

Grammar

Standard: URT:S:{codes}:{V}{I}:{S}{A}{T}.{E}.{CR}
Full:     URT:F:{codes}:{V}{I}:{S}{A}{T}.{E}.{CR}:{causal}

Encoding Rules

Valence:

+ for V+
- for V-

Intensity:

1 for I1
2 for I2
3 for I3

Examples

Standard Profile:

URT:S:J1.03:-2:22TC.ES.N

Decoded:

Profile: Standard
Code: J1.03
Valence: V- (negative)
Intensity: I2
Specificity: S2
Actionability: A2
Temporal: TC
Evidence: ES
Comparative: CR-N

Full Profile with Causal Chain:

URT:F:J1.01+A1.04:-3:23TR.EI.S:CD.O,MG.O

Decoded:

Profile: Full
Codes: J1.01, A1.04
Valence: V- (negative)
Intensity: I3
Specificity: S2
Actionability: A3
Temporal: TR
Evidence: EI
Comparative: CR-S
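Causal: CD.O (Conditions: Operational), MG.O (Management: Oversight). In this example the USN string writes the layer codes CD-O and MG-O with a dot in place of the hyphen.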
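The two decoded examples can be reproduced with a small parser. The sketch below is an illustration only: `parse_usn` and its output keys are hypothetical names, and because the Encoding Rules list symbols only for V+ and V-, the pattern accepts just `+` and `-` for valence. It assumes multiple codes are joined with `+` and causal entries with `,`, as in the Full example.

```python
import re

# One regular expression for both grammars; the causal part is optional.
USN_PATTERN = re.compile(
    r"URT:(?P<profile>[SF])"
    r":(?P<codes>[^:]+)"
    r":(?P<valence>[+-])(?P<intensity>[1-3])"
    r":(?P<spec>[1-3])(?P<act>[1-3])(?P<temporal>T[CRHF])"
    r"\.(?P<evidence>E[SIC])"
    r"\.(?P<comparative>[NBWS])"
    r"(?::(?P<causal>.+))?"
)


def parse_usn(usn: str) -> dict:
    """Decode a Standard or Full USN string into its dimension codes."""
    m = USN_PATTERN.fullmatch(usn)
    if m is None:
        raise ValueError(f"not a USN string: {usn!r}")
    profile = {"S": "Standard", "F": "Full"}[m["profile"]]
    causal = m["causal"].split(",") if m["causal"] else []
    if causal and profile != "Full":
        # The causal chain belongs to the Full profile only.
        raise ValueError("causal chain requires the Full profile")
    return {
        "profile": profile,
        "codes": m["codes"].split("+"),
        "valence": "V" + m["valence"],
        "intensity": "I" + m["intensity"],
        "specificity": "S" + m["spec"],
        "actionability": "A" + m["act"],
        "temporal": m["temporal"],
        "evidence": m["evidence"],
        "comparative": "CR-" + m["comparative"],
        "causal": causal,  # kept in the dotted form used by the string
    }


print(parse_usn("URT:F:J1.01+A1.04:-3:23TR.EI.S:CD.O,MG.O"))
```

The sketch does not validate the individual Tier-3 codes; that check could be layered on top with the Tier-3 pattern.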

Causal Chain (Full Profile Only)

The causal chain identifies root causes across three layers, ordered from immediate to systemic.

Layers

Layer	Codes	Scope
conditions	CD-S, CD-T, CD-E, CD-F, CD-O	Staff State, Team Dynamics, Equipment, Facility, Operational
management	MG-P, MG-T, MG-O, MG-R, MG-C	Planning, Training, Oversight, Resources, Communication
systemic	SY-R, SY-P, SY-C, SY-S, SY-H, SY-X	Resource Decisions, Policy, Culture, Standards, Human Capital, External

Code Reference

Conditions Layer:

CD-S - Staff State
CD-T - Team Dynamics
CD-E - Equipment
CD-F - Facility
CD-O - Operational

Management Layer:

MG-P - Planning
MG-T - Training
MG-O - Oversight
MG-R - Resources
MG-C - Communication

Systemic Layer:

SY-R - Resource Decisions
SY-P - Policy
SY-C - Culture
SY-S - Standards
SY-H - Human Capital
SY-X - External

JSONB Schema

[
  {"layer": "conditions", "code": "CD-O", "evidence": "ES"},
  {"layer": "management", "code": "MG-P", "evidence": "EI"}
]

Constraints

Maximum 3 entries (one per layer)
Only include when text explicitly supports it
Order: conditions → management → systemic
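The structural constraints above (entry count, one entry per layer, layer order) and the layer code lists can be checked before a chain is stored. This is a minimal sketch; `validate_causal_chain` and its error messages are hypothetical, and the rule that the text must explicitly support each entry cannot be checked mechanically and is left out.

```python
# Layer codes from the "Layers" table.
LAYER_CODES = {
    "conditions": {"CD-S", "CD-T", "CD-E", "CD-F", "CD-O"},
    "management": {"MG-P", "MG-T", "MG-O", "MG-R", "MG-C"},
    "systemic": {"SY-R", "SY-P", "SY-C", "SY-S", "SY-H", "SY-X"},
}
LAYER_ORDER = ["conditions", "management", "systemic"]


def validate_causal_chain(chain: list) -> list:
    """Return a list of constraint violations; an empty list means valid."""
    errors = []
    if len(chain) > 3:
        errors.append("more than 3 entries")
    for entry in chain:
        layer, code = entry.get("layer"), entry.get("code")
        if layer not in LAYER_CODES:
            errors.append(f"unknown layer: {layer!r}")
        elif code not in LAYER_CODES[layer]:
            errors.append(f"code {code!r} does not belong to layer {layer!r}")
    # Only known layers take part in the uniqueness and order checks.
    known = [e.get("layer") for e in chain if e.get("layer") in LAYER_CODES]
    if len(set(known)) != len(known):
        errors.append("more than one entry for a layer")
    ranks = [LAYER_ORDER.index(layer) for layer in known]
    if ranks != sorted(ranks):
        errors.append("layers out of order")
    return errors


chain = [
    {"layer": "conditions", "code": "CD-O", "evidence": "ES"},
    {"layer": "management", "code": "MG-P", "evidence": "EI"},
]
print(validate_causal_chain(chain))
```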

Span Boundary Detection Rules

Spans are detected at the clause/topic level, not sentence level.

Split Rules (in priority order)

Split on contrasting conjunctions: but, however, although, despite, yet
Split when subject/target changes (topic shift)
Split when valence changes (positive ↔ negative)
Split when domain changes (O/P/J/E/A/V/R)
Keep cause→effect statements together when they belong to the same feedback unit
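Only the first rule is purely lexical, so it is the easiest to sketch in code. The helper below, with the hypothetical name `split_on_contrast`, splits on the listed conjunctions and nothing else; topic, valence, and domain shifts need a classifier and are not attempted here. It will also over-split phrases such as "not yet", so treat it as a starting point only.

```python
import re

# Contrasting conjunctions from the first split rule.
CONTRAST = re.compile(r"\b(?:but|however|although|despite|yet)\b", re.IGNORECASE)


def split_on_contrast(sentence: str) -> list:
    """Split a sentence at contrasting conjunctions (first split rule only)."""
    parts = CONTRAST.split(sentence)
    # Trim surrounding spaces and punctuation, then drop empty fragments.
    cleaned = [part.strip(" ,;.") for part in parts]
    return [part for part in cleaned if part]


print(split_on_contrast("The food was great but the wait was long."))
```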

Guidelines

Maximum: ~3 spans per sentence
Validation: If 4+ spans detected, re-check for over-splitting

Example

Input:

"The food was great but the service was slow and the bathroom was dirty."

Output: 3 spans

"The food was great" (Offering, positive)
"the service was slow" (Journey/Attitude, negative)
"the bathroom was dirty" (Environment, negative)

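Reasoning: The first boundary follows the contrasting conjunction "but", where the valence also flips from positive to negative. The second boundary follows a topic and domain shift (service to bathroom, Journey/Attitude to Environment).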

Primary Span Selection

When a review contains multiple spans, select the primary span using these criteria in order:

Selection Priority

Highest intensity (I3 > I2 > I1)
Tie-break: Negative over positive (V- > V± > V0 > V+)
Tie-break: Earliest span_index

Example

Given spans:

Span 0: I2, V+
Span 1: I3, V+
Span 2: I3, V-

Primary: Span 2 (highest intensity I3, negative valence wins tie-break)
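The selection priority maps directly onto a sort key. The sketch below assumes each span is a dictionary with `span_index`, `intensity`, and `valence` keys; that shape and the function name `select_primary` are illustrative assumptions, not a documented schema.

```python
# Lower rank wins the valence tie-break: V- > V± > V0 > V+.
VALENCE_RANK = {"V-": 0, "V±": 1, "V0": 2, "V+": 3}


def select_primary(spans: list) -> dict:
    """Pick the primary span by intensity, then valence, then position."""
    return min(
        spans,
        key=lambda span: (
            -int(span["intensity"][1:]),     # I3 before I2 before I1
            VALENCE_RANK[span["valence"]],   # negative before positive
            span["span_index"],              # earliest span last tie-break
        ),
    )


spans = [
    {"span_index": 0, "intensity": "I2", "valence": "V+"},
    {"span_index": 1, "intensity": "I3", "valence": "V+"},
    {"span_index": 2, "intensity": "I3", "valence": "V-"},
]
print(select_primary(spans)["span_index"])  # 2
```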

Secondary Codes Rules

Secondary codes capture additional topics mentioned in a span.

Constraints

Maximum: 2 secondary codes
Format: Must be Tier-3 (X#.##)
Recommendation: Should be different domain from primary

Example

Primary: J1.03 (Journey)
Secondary: A2.01, E1.05 (Attitude, Environment)
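These constraints can be expressed as a small check. In the sketch below, `check_secondary_codes` is a hypothetical helper; it treats the count and format limits as errors and the different-domain recommendation as a warning, which is one possible reading of "should".

```python
import re

# Tier-3 pattern from the "Tier-3 Code Format" section.
TIER3 = re.compile(r"[OPJEAVR][1-4]\.[0-9]{2}")


def check_secondary_codes(primary: str, secondary: list) -> tuple:
    """Return (errors, warnings) for a span's secondary codes."""
    errors, warnings = [], []
    if len(secondary) > 2:
        errors.append("more than 2 secondary codes")
    for code in secondary:
        if not TIER3.fullmatch(code):
            errors.append(f"not a Tier-3 code: {code}")
        elif code[0] == primary[0]:
            # Same domain letter as the primary: allowed but discouraged.
            warnings.append(f"{code} shares the primary domain {primary[0]}")
    return errors, warnings


print(check_secondary_codes("J1.03", ["A2.01", "E1.05"]))
```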

Quick Reference Card

Profiles

Profile	Dimensions	Causal Chain
Core	V, I	No
Standard	V, I, S, A, T, E, CR	No
Full	V, I, S, A, T, E, CR	Yes

USN Quick Format

URT:{S|F}:{tier3_codes}:{valence}{intensity}:{S}{A}{T}.{E}.{CR}[:{causal}]

Domain Letters

O P J E A V R
│ │ │ │ │ │ └─ Relationship
│ │ │ │ │ └─── Voice
│ │ │ │ └───── Attitude
│ │ │ └─────── Environment
│ │ └───────── Journey
│ └─────────── Price
└───────────── Offering
