whyrating-engine-legacy/modules/scraper_clean.py
Alejandro Gutiérrez 6934838a69 Real-time parsing + image blocking for large datasets
Key improvements:
- Parse reviews immediately during scroll (not at end)
- Fixes virtual scroll issue that lost reviews after ~1000
- Block images via CDP for faster loading
- Smart recovery: 4 methods (keys, wheel, scroll up/down, click card)
- Dynamic timeout based on scroll state and content growth
- Spinner + network activity detection resets idle timer
- Sort by newest first option
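
As a rough illustration of the first two points, here is a minimal sketch of collecting items on every scroll step rather than once at the end. It does not reproduce the code in scraper_clean.py: `collect_while_scrolling`, `FakeVirtualList`, the `id` field, and the idle-round cutoff are all hypothetical names and assumptions, and the fake list stands in for a page whose virtual scroll keeps only a window of cards in the DOM.

```python
from typing import Callable, Dict, Iterable, List

Review = Dict[str, str]


def collect_while_scrolling(
    read_visible: Callable[[], Iterable[Review]],
    scroll_once: Callable[[], None],
    max_idle_rounds: int = 3,
) -> List[Review]:
    """Parse whatever is visible after every scroll step, keyed by review id.

    Stops after `max_idle_rounds` consecutive steps that yield nothing new.
    (Hypothetical helper; the real scraper uses a dynamic timeout instead.)
    """
    seen: Dict[str, Review] = {}
    idle = 0
    while idle < max_idle_rounds:
        before = len(seen)
        for review in read_visible():
            # setdefault keeps the first copy, so re-seen cards are ignored
            seen.setdefault(review["id"], review)
        idle = idle + 1 if len(seen) == before else 0
        scroll_once()
    return list(seen.values())


class FakeVirtualList:
    """Stand-in for a virtualized list that keeps only `window` items rendered."""

    def __init__(self, total: int, window: int) -> None:
        self.total = total
        self.window = window
        self.top = 0

    def read_visible(self) -> List[Review]:
        end = min(self.top + self.window, self.total)
        return [{"id": f"r{i}", "text": f"review {i}"} for i in range(self.top, end)]

    def scroll_once(self) -> None:
        # Advance by half a window, clamped so the last window stays in range
        last_top = max(self.total - self.window, 0)
        self.top = min(self.top + self.window // 2, last_top)
```

With a fake list of 25 items and a window of 10, collecting during the scroll sees every item, while reading the list only after scrolling has finished sees just the final window.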

Results: 1930 reviews scraped (was 990) on a location with 2433 reviews

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-21 22:25:26 +00:00

24 KiB