whyrating-engine-legacy/modules/scraper_clean.py
Alejandro Gutiérrez 218927bd9b Filter out garbage API data (language codes, metadata)
- Reject authors with <= 3 chars (language codes like "es", "it", "no")
- Reject known non-review authors ("google", "maps", etc.)
- Reject timestamps that are URLs or very short strings

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-21 20:47:08 +00:00

18 KiB
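
The three rules in the commit message above can be sketched roughly as follows. This is a minimal illustration, not the actual contents of `scraper_clean.py`: the function names, the `author` and `timestamp` dictionary keys, the blocklist contents beyond "google" and "maps", and the minimum timestamp length are all assumptions, since the commit message does not specify them.

```python
from typing import Optional

# Hypothetical blocklist: the commit names only "google" and "maps" and
# leaves the rest as "etc.", so nothing further is guessed here.
NON_REVIEW_AUTHORS = {"google", "maps"}

# Assumed threshold: the commit says "very short strings" without a number.
MIN_TIMESTAMP_LENGTH = 4


def is_plausible_author(author: Optional[str]) -> bool:
    """Return False for values that look like metadata rather than a reviewer name."""
    if not author:
        return False
    name = author.strip()
    # Rule 1: reject authors with <= 3 chars (language codes like "es", "it", "no").
    if len(name) <= 3:
        return False
    # Rule 2: reject known non-review authors, compared case-insensitively.
    if name.lower() in NON_REVIEW_AUTHORS:
        return False
    return True


def is_plausible_timestamp(timestamp: Optional[str]) -> bool:
    """Return False for timestamps that are URLs or very short strings."""
    if not timestamp:
        return False
    text = timestamp.strip()
    # Rule 3a: a URL in the timestamp slot indicates a misparsed field.
    if text.lower().startswith(("http://", "https://")):
        return False
    # Rule 3b: very short strings are unlikely to be real timestamps.
    if len(text) < MIN_TIMESTAMP_LENGTH:
        return False
    return True


def is_plausible_review(review: dict) -> bool:
    """Combine both checks; the "author" and "timestamp" keys are assumed names."""
    return is_plausible_author(review.get("author")) and is_plausible_timestamp(
        review.get("timestamp")
    )
```

A caller could then drop garbage rows with something like `reviews = [r for r in raw_reviews if is_plausible_review(r)]`, where `raw_reviews` is a hypothetical list of parsed API records. Note that a length cutoff of 3 also rejects genuinely short names, which may or may not be acceptable depending on the data.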