Add get_business_card_info to scraper_clean with multilingual support

Replaces fast_scraper validation with efficient polling-based extraction
using the same navigation pattern as scrape_reviews:
- 10ms polling for consent handling (no fixed waits)
- 100ms polling for data extraction
- Exits early once the data is found
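
The polling pattern listed above could be sketched roughly as follows. This is an illustration, not the code in scraper_clean: `poll_until`, `probe`, `interval_s` and `timeout_s` are hypothetical names, and the real function presumably runs its checks against a browser page rather than a plain callable.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def poll_until(
    probe: Callable[[], Optional[T]],
    interval_s: float,
    timeout_s: float,
) -> Optional[T]:
    """Call probe() repeatedly until it returns a non-None value.

    Hypothetical helper: returns the first non-None result, or None
    if timeout_s elapses first. There is no fixed up-front wait.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        result = probe()
        if result is not None:
            return result  # exit early as soon as data is available
        if time.monotonic() >= deadline:
            return None  # give up; the caller decides how to recover
        time.sleep(interval_s)
```

Under these assumptions, consent handling would use something like `poll_until(check_consent, 0.01, timeout)` and data extraction `poll_until(read_card, 0.1, timeout)`, where `check_consent` and `read_card` are placeholders for the page-specific checks.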

Supports multiple languages:
- Rating: stars/estrellas/étoiles/sterne/stelle
- Reviews: reviews/reseñas/avis/bewertungen/recensioni
- Handles comma decimals (4,8 -> 4.8)
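
A minimal sketch of how the keyword matching and comma-decimal handling above might look. The regexes and the names `parse_rating` and `parse_total_reviews` are assumptions made for illustration; the actual patterns in scraper_clean may differ.

```python
import re
from typing import Optional

# Keyword lists taken from the commit message; the patterns themselves
# are illustrative assumptions.
RATING_RE = re.compile(
    r"(\d+(?:[.,]\d+)?)\s*(?:stars|estrellas|étoiles|sterne|stelle)",
    re.IGNORECASE,
)
# Thousands separators are limited to "." and "," so that a preceding
# rating such as "4.8" is not merged into the review count.
REVIEWS_RE = re.compile(
    r"(\d+(?:[.,]\d{3})*)\s*(?:reviews|reseñas|avis|bewertungen|recensioni)",
    re.IGNORECASE,
)


def parse_rating(text: str) -> Optional[float]:
    """Return the rating as a float, accepting '4,8' as well as '4.8'."""
    match = RATING_RE.search(text)
    if match is None:
        return None
    return float(match.group(1).replace(",", "."))


def parse_total_reviews(text: str) -> Optional[int]:
    """Return the review count, ignoring thousands separators."""
    match = REVIEWS_RE.search(text)
    if match is None:
        return None
    return int(re.sub(r"\D", "", match.group(1)))
```

With these definitions, `parse_rating("4,8 estrellas")` yields `4.8` and `parse_total_reviews("1.234 Bewertungen")` yields `1234`.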

Result: 6.3s to extract name, address, rating, total_reviews

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Alejandro Gutiérrez
2026-01-23 17:52:06 +00:00
parent 47bb032011
commit 0682c0ec61
2 changed files with 125 additions and 2 deletions


@@ -21,8 +21,7 @@ from fastapi.responses import JSONResponse, StreamingResponse
 from modules.database import DatabaseManager, JobStatus
 from modules.webhooks import WebhookDispatcher, WebhookManager
 from modules.health_checks import HealthCheckSystem
-from modules.scraper_clean import fast_scrape_reviews, LogCapture # Clean scraper with hard refresh recovery
-from modules.fast_scraper import check_reviews_available, get_business_card_info # Helper functions
+from modules.scraper_clean import fast_scrape_reviews, LogCapture, get_business_card_info # Clean scraper
 from modules.chrome_pool import (
     start_worker_pools,
     stop_worker_pools,