Files
whyrating-engine-legacy/tools/test_scraper_v110.py
Alejandro Gutiérrez fbd61ff7f7 feat: Add multi-sort scraper v1.1.0 and improve v1.0.0 reliability
v1.0.0 improvements:
- Add captcha detection (reCAPTCHA, unusual traffic, challenges)
- Block fonts, analytics, maps tiles for faster scrolling
- Add 95% close-enough threshold to skip unnecessary retries
- Stop immediately if captcha detected instead of retrying

v1.1.0 new features:
- Multi-sort strategy to bypass ~1000 review limit
- Cycles through newest/lowest/highest/relevant sorts
- Auto mode: enables multi-sort when total > 1000
- Diminishing returns detection (stops if <5% new per pass)
- Configurable sort order and thresholds

Also adds test_scraper_v110.py CLI tool for testing multi-sort.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-24 18:47:30 +00:00

6.6 KiB