- Use updated_at (last successful data loop) instead of Date.now()
- Speed now reflects the actual data retrieval rate instead of declining over time
- Updated in table column, monitored job view, and stats row
- Fall back to Date.now() if updated_at is not available
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Use fixed positioning with top/left 50% and translate -50%
- More reliable centering regardless of parent containers
- Add max-width for mobile responsiveness
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Keep blue background when isCheckingReviews is true
- Add cursor-wait during validation
- Move disabled styling to explicit condition check
- White spinner now visible on blue background
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Reset search fields after job is successfully launched
- Allow user to immediately start another scrape
- Save active jobs to localStorage for persistence across refresh
- Restore jobs from localStorage on page load
- Resume polling for non-terminal jobs (pending/running)
- Filter out jobs older than 24 hours
- Add remove button (X) to each job card
- Clean up localStorage when jobs are removed
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Remove w-full that caused alignment issues
- Use fixed width (400px) for consistent centering
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Remove w-full and mx-auto that caused alignment issues
- Use fixed width (280px) instead of max-w-xs
- Let flex container handle centering
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Center modal properly within map preview area
- Add 24px padding from map edges
- Make modal more compact (max-w-xs)
- Reduce text and element sizes for better fit
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Place Business Name, Location, and Validate button in same row
- Reduce padding and font sizes for compact inline layout
- Show abbreviated text on mobile (responsive)
- Use checkmark indicator for auto-detected location
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Split single search input into two fields: Business Name (required)
  and Location (auto-detected from IP geolocation)
- Auto-fill location field with city/country from IP on page load
- Add click overlay on map iframe to prevent interaction
- Add warning modal when user clicks map, directing them to use search
- Update test URLs to use split format
- Make Validate button full-width for better UX
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Restore original Google Maps embed iframe approach
- URL: maps.google.com/maps?q=...&output=embed&z=15
- Add "Open in Maps" overlay button on the map
- Height 300px for better visibility
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Replace non-working Google Maps embed iframe with animated location preview
- Add "Open in Google Maps" button to open location in new tab
- Add scraper type selection dropdown fetching from /api/admin/scrapers
- Show selected scraper info with formatted labels (Google Reviews v1.0.0)
- Include scraper_version and scraper_variant in job submission
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Dashboard page:
- Fetch top clients from /api/dashboard/by-client
- Show loading state while fetching
- Display empty state when no client data
- Show real client_id, job count, and success rate
Scrapers page:
- Fetch versions from /api/admin/scrapers
- Wire promote/deprecate buttons to real API calls
- Wire add version form to POST /api/admin/scrapers
- Wire traffic allocation to PUT /api/admin/scrapers/{id}/traffic
- Add loading and error states
Dockerfile:
- Add COPY commands for new directories (api/, core/, scrapers/, etc.)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Phase 5 - Main Dashboard:
- Dashboard overview page with system health stats
- Jobs by status breakdown, success rates, top clients
- Dashboard API (/api/dashboard/overview, by-client, problems, by-version)
Phase 6 - Admin/Scraper Management:
- Scrapers management page with traffic allocation UI
- Admin API for scraper CRUD operations
- Traffic percentage updates for A/B testing
- Promote/deprecate scraper versions
Phase 7 - Authentication:
- API key authentication middleware
- SHA-256 key hashing (keys never stored in plain text)
- Scope-based authorization (jobs:read, jobs:write, admin)
- Rate limiting per API key
Also:
- Updated api_server_production.py to include new routers
- Extended core/database.py with dashboard query methods
- Added dashboard link to sidebar navigation
- Updated CONTEXT-KEEPER.md to mark all phases complete
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
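
A minimal sketch of the Phase 7 key handling described above (SHA-256 hashing plus scope checks). This is not the project's middleware: `hash_api_key`, `authorize`, `KEY_STORE`, and the demo key are hypothetical names, and treating `admin` as a superset of the other scopes is an assumption.

```python
import hashlib
import hmac

def hash_api_key(raw_key: str) -> str:
    """Return the hex SHA-256 digest of an API key; only the digest is stored."""
    return hashlib.sha256(raw_key.encode("utf-8")).hexdigest()

# Hypothetical in-memory store: key digest -> granted scopes.
# A real deployment would keep this in the database.
KEY_STORE = {
    hash_api_key("demo-key-123"): {"jobs:read", "jobs:write"},
}

def authorize(raw_key: str, required_scope: str) -> bool:
    """Check that the key is known and carries the required scope."""
    digest = hash_api_key(raw_key)
    for stored_digest, scopes in KEY_STORE.items():
        # Constant-time comparison of the two digests.
        if hmac.compare_digest(stored_digest, digest):
            # Assumption: the admin scope implies every other scope.
            return required_scope in scopes or "admin" in scopes
    return False
```

Because only digests are stored, a leaked key table does not expose usable keys; the plain-text key exists only in the incoming request.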
Quick-reference document for resuming work after context compaction.
Contains: project overview, current state, spec summary, phases,
key decisions, file locations, and resumption instructions.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Wrap handleJobsChange in useCallback to prevent infinite re-renders
caused by onJobsChange dependency changing on every render.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Task #18: Complete integration of all JobDevTools components
- Updated job detail page (/jobs/[id]) with full JobDevTools UI
- Connected SSE stream for real-time structured logs + metrics
- Added crash-report and retry API routes for Next.js
- Added format conversion for old/new log formats
- Added DevTools links to JobsView modal and actions column
- Wired up CrashReport retry with auto-fix parameters
- Integrated SessionPanel for fingerprint display
- Integrated MetricsDashboard for real-time charts
Job DevTools implementation complete: 18/18 tasks
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add chk_dedup_scoped constraint enforcing tenant-scoped dedup format
- Filter location_type='owned' in populate_facts() for 'ALL' rollup
- Document competitor exclusion from 'ALL' sentinel rollups
- Add explicit comments in aggregation code for maintainability
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Two micro-risk mitigations documented:
1. dedup_group_id: Format "{business_id}:{hash}" to prevent
   cross-tenant collision on similar reviews.
2. Sentinel conventions: 'ALL' (spatial) vs 'all' (semantic).
   Case matters: do not normalize.
Spec frozen as v3.1.2.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
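
A small sketch of the tenant-scoped dedup_group_id format from the commit above. The spec only fixes the "{business_id}:{hash}" shape; the whitespace/case normalization, the choice of SHA-256, the 16-character truncation, and the names `make_dedup_group_id`, `SPATIAL_ALL`, and `SEMANTIC_ALL` are assumptions for illustration.

```python
import hashlib

# Sentinels are case-sensitive by convention and must not be normalized.
SPATIAL_ALL = "ALL"   # rollup across locations
SEMANTIC_ALL = "all"  # rollup across semantic categories

def make_dedup_group_id(business_id: str, review_text: str) -> str:
    """Build a dedup group id of the form "{business_id}:{hash}"."""
    # Assumed normalization: lowercase and collapse runs of whitespace.
    normalized = " ".join(review_text.lower().split())
    digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()[:16]
    # Prefixing with the tenant id keeps identical text from colliding
    # across tenants.
    return f"{business_id}:{digest}"
```

With this shape, the same review text under two different business_id values yields two different group ids, so near-duplicate detection never merges rows across tenants.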
Final fixes for production-ready spec:
1. locations.location_type: Added 'owned'|'competitor' flag.
   Competitors now inserted into locations (preserves FK integrity).
2. Competitor fact query: Added business_id filter to prevent
   cross-tenant contamination when same competitor tracked by
   multiple customers.
3. issue_events versioning: Added source + review_version columns
   for complete review reference in audit log.
4. Enrichment tenant-scoping: business_id now passed from ingest
   job (not looked up). Validates place_id exists under tenant.
5. Footer: Fixed version string v3.1.1 → v3.1.2.
Status: Ship-ready specification.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Three final fixes applied:
1. issue_spans versioning: Added source + review_version columns
   with FK to reviews_enriched(source, review_id, review_version).
   Spans now correctly reference the exact review version.
2. Competitor business_id rule: Clarified that competitor reviews
   use customer's business_id + competitor's place_id (not NULL).
   Keeps facts and joins working without special-case logic.
3. Trust-weighted facts: Clarified trust_weighted_* columns are
   reserved but not populated in v3.1. Trust scoring applies to
   issue priority only. Aggregation deferred to v3.2.
Status: Production-grade architecture specification.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Complete pipeline architecture for Google Reviews intelligence:
- Versioned reviews_enriched with (source, review_id, version) PK
- Tenant-scoped locations with (business_id, place_id) PK
- Relational issue_spans replacing array aggregation
- Unified fact_timeseries spine with 'ALL' sentinel for rollups
- Clean competitor model (separate table, no fake business_ids)
- Trust scoring and dedup support
- KPI-ready join keys
Reviewed and fixed: PK for edited reviews, multi-tenant overlap,
param ordering bugs, fact population scope, entity field deferral.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- reviewiq-pipeline-v1-final.md: Earlier pipeline specification
- test_metadata_extraction.py: Test script for metadata extraction
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Replace client-side state switching with proper Next.js routes:
- /new - New scrape form
- /jobs - Jobs list with table view
- /jobs/[id] - Individual job details and logs
- /analytics - Analytics overview (completed jobs)
- /analytics/[id] - Analytics for specific job
Add JobsContext for shared state across routes. Update Sidebar
to use next/link with pathname matching. Root page redirects to /new.
Also adds partial job status styling to JobsView.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Transfer user's browser fingerprint (user-agent, viewport, timezone,
language, geolocation) to Chrome for more authentic scraping
- Display review topics from Google Maps in analytics dashboard
- Show business category badge in analytics header
- Fix date_text null handling in analytics (handle undefined/timestamp fields)
- Add review_topics and business_category to JobStatus interface
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Clear cookies and navigate to about:blank before loading URL
(ensures clean state when reusing pooled driver)
- Simplified regex patterns for rating/reviews extraction
- Uses partial word matching like scrape_reviews
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
All functionality now in scraper_clean.py:
- fast_scrape_reviews (main scraper)
- get_business_card_info (validation)
Updated health_checks.py to import from scraper_clean.
Removes 1,935 lines of duplicate/obsolete code.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Replaces fast_scraper validation with efficient polling-based extraction
using the same navigation pattern as scrape_reviews:
- 10ms polling for consent handling (no fixed waits)
- 100ms polling for data extraction
- Exits early when data found
Supports multiple languages:
- Rating: stars/estrellas/étoiles/sterne/stelle
- Reviews: reviews/reseñas/avis/bewertungen/recensioni
- Handles comma decimals (4,8 -> 4.8)
Result: 6.3s to extract name, address, rating, total_reviews
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
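
A rough sketch of the multi-language rating and review-count parsing listed above, using partial word matching and comma-decimal handling. The regexes and the `parse_card_text` helper are illustrative assumptions, not the patterns used in the scraper itself.

```python
import re
from typing import Optional, Tuple

# Partial word stems cover stars/estrellas/étoiles/sterne/stelle.
RATING_RE = re.compile(
    r"(\d+(?:[.,]\d+)?)\s*(?:star|estrella|étoile|stern|stell)",
    re.IGNORECASE,
)
# Stems cover reviews/reseñas/avis/bewertungen/recensioni.
# The count may use ".", "," or a space as thousands separator.
REVIEWS_RE = re.compile(
    r"(\d{1,3}(?:[.,\s]\d{3})*|\d+)\s*(?:review|reseña|avis|bewertung|recension)",
    re.IGNORECASE,
)

def parse_card_text(text: str) -> Tuple[Optional[float], Optional[int]]:
    """Extract (rating, total_reviews) from business card text, if present."""
    rating: Optional[float] = None
    total: Optional[int] = None
    match = RATING_RE.search(text)
    if match:
        # Comma decimals such as "4,8" become "4.8" before conversion.
        rating = float(match.group(1).replace(",", "."))
    match = REVIEWS_RE.search(text)
    if match:
        # Drop thousands separators before converting to int.
        total = int(re.sub(r"\D", "", match.group(1)))
    return rating, total
```

For example, `parse_card_text("4,8 estrellas 589 reseñas")` yields a rating of 4.8 and a count of 589, and text with neither pattern yields `(None, None)`.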
The continue statement was skipping the card.style.display='none'
and card.innerHTML='' cleanup for cards already seen via API
interception. This caused DOM to grow unbounded during long scrapes.
Now ALL processed cards are hidden regardless of data source.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Small (~79 reviews): R. Fleitas Peluqueros
- Medium (~589 reviews): ClickRent Gran Canaria
- Large (~2000+ reviews): Hospital Doctor Negrín
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Import fast_scrape_reviews from scraper_clean instead of fast_scraper
- Keeps helper functions (check_reviews_available, get_business_card_info) from fast_scraper
- Production now uses clean scraper with hard refresh recovery
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add fast_scrape_reviews() wrapper to scraper_clean.py for API compatibility
- Set window size (1200x900) in wrapper to ensure proper Google Maps rendering
- Update job_manager.py to import from scraper_clean instead of fast_scraper
- Production now uses clean scraper with:
  - Hard refresh recovery when stuck after 8+ soft recovery attempts
  - API interception + DOM parsing for complete data collection
  - Automatic deduplication across refreshes
Tested: 589/589 reviews collected in 55s
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
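
One way the deduplication across refreshes mentioned above could work, sketched under the assumption that each review dict carries a stable `review_id` key. `merge_reviews` is a hypothetical helper, not a function from scraper_clean.py.

```python
from typing import Dict, Iterable

def merge_reviews(seen: Dict[str, dict], batch: Iterable[dict]) -> int:
    """Add unseen reviews from batch into seen; return how many were new."""
    added = 0
    for review in batch:
        review_id = review.get("review_id")
        # Skip records without an id and records collected before the refresh.
        if review_id is None or review_id in seen:
            continue
        seen[review_id] = review
        added += 1
    return added
```

Keeping `seen` alive across a hard refresh means reviews re-served by the page after reload are counted only once, regardless of whether they arrived via API interception or DOM parsing.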