Final polish: v3.1.2 operational safety constraints

- Add chk_dedup_scoped constraint enforcing tenant-scoped dedup format
- Filter location_type='owned' in populate_facts() for 'ALL' rollup
- Document competitor exclusion from 'ALL' sentinel rollups
- Add explicit comments in aggregation code for maintainability

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Alejandro Gutiérrez
2026-01-24 12:55:31 +00:00
parent c6443166b2
commit f99827717f


@@ -271,6 +271,11 @@ CREATE INDEX idx_enriched_embedding ON reviews_enriched
ALTER TABLE reviews_enriched
ADD CONSTRAINT fk_enriched_location
FOREIGN KEY (business_id, place_id) REFERENCES locations(business_id, place_id);
+-- Enforce tenant-scoped dedup format
+ALTER TABLE reviews_enriched
+    ADD CONSTRAINT chk_dedup_scoped
+    CHECK (dedup_group_id IS NULL OR dedup_group_id LIKE business_id || ':%');
```
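As a rough illustration of what `chk_dedup_scoped` accepts and rejects, the sketch below mirrors the check in Python. The function name `is_tenant_scoped` is hypothetical and not part of the commit; it assumes the intended rule is "NULL, or prefixed with `<business_id>:`".

```python
from typing import Optional


def is_tenant_scoped(dedup_group_id: Optional[str], business_id: str) -> bool:
    """Approximate chk_dedup_scoped (hypothetical helper, not from the commit).

    A NULL (None) dedup_group_id passes; otherwise the value must start
    with '<business_id>:'.
    """
    if dedup_group_id is None:
        return True
    return dedup_group_id.startswith(f"{business_id}:")


if __name__ == "__main__":
    print(is_tenant_scoped("biz1:group-7", "biz1"))  # scoped to the same tenant
    print(is_tenant_scoped("biz2:group-7", "biz1"))  # scoped to another tenant
```

One difference worth keeping in mind: SQL `LIKE` treats `%` and `_` in the pattern as wildcards, so the database check is looser than this prefix test if `business_id` ever contains those characters.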
### 2.3 Issue Tables (Relational, No Arrays)
@@ -488,6 +493,11 @@ CREATE INDEX idx_facts_all_locations ON fact_timeseries(business_id, period_date
| `domain` | ⚡ Derived | Rollup from urt_code at query time |
| `issue` | 🔜 Optional | Recommended for issue timelines (v3.2) |
+**v3.1 Rollup Rules**:
+- `place_id='ALL'` includes **owned locations only** (not competitors)
+- Competitor facts live at their `competitor_place_id`, never in `'ALL'` rollup
+- Competitor comparison queries explicitly join on `competitor_place_id`
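The rollup rules above can be sketched on in-memory data. This is a minimal illustration, not the project's aggregation code: the row shape `(place_id, location_type, review_count)` and the function name `rollup_all` are assumptions made for the example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Hypothetical row shape: (place_id, location_type, review_count)
Row = Tuple[str, str, int]


def rollup_all(rows: Iterable[Row]) -> Dict[str, int]:
    """Sum counts per place_id and add an 'ALL' entry from owned rows only."""
    totals: Dict[str, int] = defaultdict(int)
    for place_id, location_type, review_count in rows:
        # Every location, owned or competitor, keeps its own per-place total.
        totals[place_id] += review_count
        # Only owned locations contribute to the 'ALL' sentinel.
        if location_type == "owned":
            totals["ALL"] += review_count
    return dict(totals)


rows = [
    ("p1", "owned", 10),
    ("p2", "owned", 5),
    ("c1", "competitor", 7),
]
result = rollup_all(rows)
```

Here `result["ALL"]` is 15, while the competitor's 7 reviews appear only under `c1`.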
**v3.1 Trust Score Usage**:
- `trust_score` is applied to **issue priority scoring** and **filtering** (see §4.2)
- `trust_weighted_strength` / `trust_weighted_negative` columns are **reserved but not populated** in v3.1
@@ -985,23 +995,34 @@ async def populate_facts(business_id: str, date: date, bucket_type: str = 'day')
next_month = period_start.replace(day=28) + timedelta(days=4)
period_end = next_month.replace(day=1)
-    locations = await db.query(
-        "SELECT place_id FROM locations WHERE business_id = %s AND is_active = TRUE",
+    # Get owned locations (competitors excluded from 'ALL' rollup)
+    owned_locations = await db.query(
+        "SELECT place_id FROM locations WHERE business_id = %s AND is_active = TRUE AND location_type = 'owned'",
         [business_id]
     )
+    owned_place_ids = [loc['place_id'] for loc in owned_locations]
+    # Get competitor locations (facts per place only, no 'ALL' rollup)
+    competitor_locations = await db.query(
+        "SELECT place_id FROM locations WHERE business_id = %s AND is_active = TRUE AND location_type = 'competitor'",
+        [business_id]
+    )
-    all_place_ids = [loc['place_id'] for loc in locations]
-    # Per-location facts
-    for loc in locations:
-        place_id = loc['place_id']
+    # Per-location facts (owned)
+    for loc in owned_locations:
         await populate_location_facts(
-            business_id, place_id, period_start, period_end, bucket_type
+            business_id, loc['place_id'], period_start, period_end, bucket_type
         )
-    # All-locations rollup (place_id='ALL')
+    # Per-location facts (competitors — no 'ALL' rollup)
+    for loc in competitor_locations:
+        await populate_location_facts(
+            business_id, loc['place_id'], period_start, period_end, bucket_type
+        )
+    # All-locations rollup (owned only — place_id='ALL')
     await populate_all_locations_facts(
-        business_id, all_place_ids, period_start, period_end, bucket_type
+        business_id, owned_place_ids, period_start, period_end, bucket_type
     )
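
The resulting control flow can be exercised without a database. The sketch below is a stand-in, not the commit's code: `LOCATIONS`, `fetch_active_locations`, and `plan_fact_jobs` are hypothetical names, and it fetches active locations once and partitions them in Python rather than issuing the two queries shown in the diff.

```python
import asyncio
from typing import Dict, List

# Hypothetical stand-in for the locations table.
LOCATIONS = [
    {"place_id": "p1", "location_type": "owned", "is_active": True},
    {"place_id": "p2", "location_type": "owned", "is_active": False},
    {"place_id": "p3", "location_type": "owned", "is_active": True},
    {"place_id": "c1", "location_type": "competitor", "is_active": True},
]


async def fetch_active_locations(business_id: str) -> List[dict]:
    """Stub for the database query; business_id is ignored in this sketch."""
    return [loc for loc in LOCATIONS if loc["is_active"]]


async def plan_fact_jobs(business_id: str) -> Dict[str, List[str]]:
    """Decide which place_ids get per-location facts and which feed 'ALL'."""
    locations = await fetch_active_locations(business_id)
    owned = [l["place_id"] for l in locations if l["location_type"] == "owned"]
    competitors = [
        l["place_id"] for l in locations if l["location_type"] == "competitor"
    ]
    return {
        # Both owned and competitor locations get their own facts.
        "per_location": owned + competitors,
        # Only owned locations feed the place_id='ALL' rollup.
        "all_rollup": owned,
    }


plan = asyncio.run(plan_fact_jobs("biz1"))
```

With the sample data, `plan["per_location"]` covers `p1`, `p3`, and `c1`, while `plan["all_rollup"]` contains only `p1` and `p3`.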