Add API interception for hybrid scraping and update selectors

- Add new api_interceptor.py module for CDP network interception
- Capture Google Maps internal API responses during scrolling
- Parse protobuf-like JSON responses to extract review data
- Merge API-captured reviews with DOM-scraped data
- Update CSS selectors for January 2026 Google Maps structure
- Add cookie consent dismissal for multiple languages
- Add --api-intercept CLI flag and config option
- Fix review card and pane selectors (.jftiEf, .XiKgde)
- Improve review ID extraction from card elements

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
This commit is contained in:
Alejandro Gutiérrez
2026-01-17 21:51:10 +00:00
parent 262f0c0be7
commit bdffb5eaac
5 changed files with 782 additions and 36 deletions

View File

@@ -47,7 +47,13 @@ class RawReview:
except Exception:
pass
# Try to get data-review-id from the card itself, or from a child element
rid = card.get_attribute("data-review-id") or ""
if not rid:
# Try to find it in a child element
review_id_elem = try_find(card, "[data-review-id]")
if review_id_elem:
rid = review_id_elem[0].get_attribute("data-review-id") or ""
author = first_text(card, 'div[class*="d4r55"]')
profile = first_attr(card, 'button[data-review-id]', "data-href")
avatar = first_attr(card, 'button[data-review-id] img', "src")