whyrating-engine-legacy/.artifacts/job-devtools-tasks.md
Alejandro Gutiérrez 59368a5bd5 Add Job DevTools implementation task breakdown
18 tasks organized in 5 parallel tracks:
- Track A: Backend logging infrastructure (4 tasks)
- Track B: Frontend log viewer (5 tasks)
- Track C: Crash analysis (4 tasks)
- Track D: Session & metrics (3 tasks)
- Track E: Review topics (2 tasks)

Includes dependency graph and 7-wave execution plan
for parallel AI agent workflow.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-24 11:14:02 +00:00


Job DevTools - Implementation Tasks

Dependency Graph

Wave 1 (Parallel start):
  #1 StructuredLogger ──┬──▶ #2 Migrate scraper ──▶ #3 SSE stream ──▶ #5 JobDevTools
                        │                                              │
                        ├──▶ #4 DB schema ──┬──▶ #10 Crash analyzer   ▼
                        │                   │         │            #6 LogViewer
                        │                   │         ▼                │
                        │                   ├──▶ #11 Crash API         ▼
                        │                   │         │            #7 CopyToolbar
                        │                   │         ▼                │
                        │                   │    #12 CrashReport       ▼
                        │                   │                      #8 LogEntry
                        │                   └──▶ #13 Session capture   │
                        │                              │               │
                        └──▶ #9 Crash detection        ▼               │
                                   │            #14 SessionPanel       │
                                   │                   │               │
                                   └───────────────────┼───────────────┘
                                                       │
  #16 Topics inference ──▶ #17 Topic tags              ▼
                                              #15 MetricsDashboard
                                                       │
                                                       ▼
                                              #18 INTEGRATION

Task Details

Track A: Backend Logging Infrastructure

Task #1: Create StructuredLogger class in Python backend

Priority: P0 (Foundation) Blocks: #2, #3, #4, #9

Create modules/structured_logger.py:

from dataclasses import dataclass, asdict
from typing import Optional, Dict, Any, List, Literal
from datetime import datetime, timezone
import threading

LogLevel = Literal['DEBUG', 'INFO', 'WARN', 'ERROR', 'FATAL']
LogCategory = Literal['scraper', 'browser', 'network', 'system']

@dataclass
class LogEntry:
    timestamp: str       # ISO 8601 in UTC, 'Z' suffix
    timestamp_ms: int    # Unix epoch milliseconds
    level: LogLevel
    category: LogCategory
    message: str
    metrics: Optional[Dict[str, Any]] = None
    network: Optional[Dict[str, Any]] = None
    snapshot_id: Optional[str] = None

class StructuredLogger:
    """Thread-safe, in-memory log buffer capped at max_entries."""

    def __init__(self, max_entries: int = 10000):
        self._logs: List[LogEntry] = []
        self._lock = threading.Lock()
        self._max_entries = max_entries

    def _log(self, level: LogLevel, category: LogCategory, message: str,
             metrics: Optional[Dict[str, Any]] = None,
             network: Optional[Dict[str, Any]] = None,
             snapshot_id: Optional[str] = None) -> None:
        # One clock reading feeds both timestamp fields, so they always agree.
        # datetime.utcnow() is deprecated as of Python 3.12.
        now = datetime.now(timezone.utc)
        entry = LogEntry(
            timestamp=now.isoformat().replace('+00:00', 'Z'),
            timestamp_ms=int(now.timestamp() * 1000),
            level=level,
            category=category,
            message=message,
            metrics=metrics,
            network=network,
            snapshot_id=snapshot_id
        )
        with self._lock:
            self._logs.append(entry)
            # Drop the oldest entries once the cap is exceeded.
            if len(self._logs) > self._max_entries:
                self._logs = self._logs[-self._max_entries:]

    def debug(self, category: LogCategory, message: str, **kwargs):
        self._log('DEBUG', category, message, **kwargs)

    def info(self, category: LogCategory, message: str, **kwargs):
        self._log('INFO', category, message, **kwargs)

    def warn(self, category: LogCategory, message: str, **kwargs):
        self._log('WARN', category, message, **kwargs)

    def error(self, category: LogCategory, message: str, **kwargs):
        self._log('ERROR', category, message, **kwargs)

    def fatal(self, category: LogCategory, message: str, **kwargs):
        self._log('FATAL', category, message, **kwargs)

    def get_logs(self) -> List[Dict]:
        with self._lock:
            return [asdict(e) for e in self._logs]

    def get_logs_by_category(self, category: LogCategory) -> List[Dict]:
        with self._lock:
            return [asdict(e) for e in self._logs if e.category == category]
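The list-slice trim in `_log` copies the buffer on every append once the cap is reached. One possible alternative is a `collections.deque` with `maxlen`, which discards the oldest entry automatically. The sketch below is a reduced illustration rather than the planned class: `BoundedLogBuffer` is a hypothetical name, and entries are plain dicts instead of `LogEntry` objects.

```python
import threading
from collections import deque
from typing import Any, Deque, Dict, List


class BoundedLogBuffer:
    """Hypothetical ring buffer that keeps only the newest `max_entries` items."""

    def __init__(self, max_entries: int = 10000):
        # deque(maxlen=...) evicts from the left when a new item is appended.
        self._entries: Deque[Dict[str, Any]] = deque(maxlen=max_entries)
        self._lock = threading.Lock()

    def append(self, entry: Dict[str, Any]) -> None:
        with self._lock:
            self._entries.append(entry)

    def snapshot(self) -> List[Dict[str, Any]]:
        # Copy under the lock so callers never see a buffer that is mid-update.
        with self._lock:
            return list(self._entries)


buf = BoundedLogBuffer(max_entries=3)
for i in range(5):
    buf.append({'level': 'INFO', 'category': 'scraper', 'message': f'entry {i}'})
messages = [e['message'] for e in buf.snapshot()]
```

With a cap of 3 and five appends, only the last three entries remain in `messages`.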

Task #2: Migrate scraper_clean.py to use StructuredLogger

Blocked by: #1 Blocks: #3

Update all log calls in modules/scraper_clean.py:

  • Replace LogCapture with StructuredLogger
  • Add category to each log call
  • Add metrics where relevant (scroll_count, reviews_count, memory_mb)
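A migrated call site might look roughly like the sketch below. The legacy call shape shown in the comment is an assumption, since the `LogCapture` API is not described here. `RecordingLogger` is a hypothetical stand-in that mirrors only the `info()` signature from Task #1, so the example runs on its own.

```python
from typing import Any, Dict, List


class RecordingLogger:
    """Hypothetical stand-in with the same info() signature as StructuredLogger."""

    def __init__(self) -> None:
        self.entries: List[Dict[str, Any]] = []

    def info(self, category: str, message: str, **kwargs: Any) -> None:
        self.entries.append(
            {'level': 'INFO', 'category': category, 'message': message, **kwargs}
        )


def log_scroll_progress(log: RecordingLogger, scroll_count: int,
                        reviews_count: int, memory_mb: float) -> None:
    # Before (assumed legacy style): a bare string with no category or metrics.
    #   log.info(f"Scroll {scroll_count}: {reviews_count} reviews")
    # After: explicit category plus machine-readable metrics.
    log.info(
        'scraper',
        f'Scroll {scroll_count}: {reviews_count} reviews loaded',
        metrics={
            'scroll_count': scroll_count,
            'reviews_count': reviews_count,
            'memory_mb': memory_mb,
        },
    )


log = RecordingLogger()
log_scroll_progress(log, scroll_count=12, reviews_count=240, memory_mb=512.5)
```

Keeping the numbers in `metrics` rather than only in the message string lets the frontend chart them without parsing text.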

Task #3: Update SSE stream to emit structured log events

Blocked by: #1, #2 Blocks: #5, #15

Update api_server_production.py:

  • Change log event format to include full LogEntry structure
  • Add a metrics event type, emitted every 5 seconds
  • Keep the stream backward compatible with old clients
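As an illustration of the event framing, the sketch below serializes `log` and `metrics` events in the standard Server-Sent Events wire format. The helper names and the metrics payload keys are hypothetical. The backward-compatibility approach also rests on an assumption: old clients are taken to read only a top-level `message` field, which the full `LogEntry` structure still carries.

```python
import json
from typing import Any, Dict


def format_sse(event: str, data: Dict[str, Any]) -> str:
    """Build one SSE frame: an event name, one data line, then a blank line."""
    # json.dumps never emits raw newlines, so the payload fits on one data line.
    payload = json.dumps(data, separators=(',', ':'))
    return f'event: {event}\ndata: {payload}\n\n'


def log_event(entry: Dict[str, Any]) -> str:
    # Assumption: legacy clients only read entry['message'], which is preserved.
    return format_sse('log', entry)


def metrics_event(sample: Dict[str, Any]) -> str:
    # Intended to be emitted by the server on a 5-second timer.
    return format_sse('metrics', sample)


entry = {
    'timestamp': '2026-01-24T11:14:02Z',
    'timestamp_ms': 1769253242000,
    'level': 'INFO',
    'category': 'scraper',
    'message': 'Scroll 12: 240 reviews loaded',
    'metrics': {'scroll_count': 12, 'reviews_count': 240},
    'network': None,
    'snapshot_id': None,
}
log_frame = log_event(entry)
metrics_frame = metrics_event({'memory_mb': 512.5, 'reviews_count': 240})
```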

Task #4: Add crash_reports table and schema

Blocked by: #1 Blocks: #10, #11, #13

Add to modules/database.py:

CREATE TABLE IF NOT EXISTS crash_reports (
    crash_id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    job_id UUID REFERENCES jobs(job_id) ON DELETE CASCADE,
    created_at TIMESTAMP NOT NULL DEFAULT NOW(),
    crash_type VARCHAR(50) NOT NULL,
    error_message TEXT,
    state JSONB NOT NULL,
    metrics_history JSONB,
    logs_before_crash JSONB,
    analysis JSONB,
    screenshot_url TEXT
);

ALTER TABLE jobs ADD COLUMN IF NOT EXISTS session_fingerprint JSONB;
ALTER TABLE jobs ADD COLUMN IF NOT EXISTS metrics_history JSONB;
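Writing a row into this table could look like the following sketch. It only builds the SQL and parameter tuple. The `%s` placeholder style assumes a psycopg-like driver, and `crash_report_params` is a hypothetical helper, not part of modules/database.py.

```python
import json
from typing import Any, Dict, List, Optional, Tuple

# JSONB columns receive JSON text and are cast explicitly on the server side.
INSERT_CRASH_REPORT = """
INSERT INTO crash_reports
    (job_id, crash_type, error_message, state, metrics_history, logs_before_crash)
VALUES (%s, %s, %s, %s::jsonb, %s::jsonb, %s::jsonb)
RETURNING crash_id
"""


def crash_report_params(
    job_id: str,
    crash_type: str,
    error_message: Optional[str],
    state: Dict[str, Any],
    metrics_history: Optional[List[Dict[str, Any]]] = None,
    logs_before_crash: Optional[List[Dict[str, Any]]] = None,
) -> Tuple[Any, ...]:
    """Return parameters in the column order used by INSERT_CRASH_REPORT."""
    if len(crash_type) > 50:
        raise ValueError('crash_type exceeds VARCHAR(50)')

    def to_json(value: Any) -> Optional[str]:
        # Preserve SQL NULL for optional JSONB columns.
        return None if value is None else json.dumps(value)

    return (
        job_id,
        crash_type,
        error_message,
        json.dumps(state),  # state is NOT NULL in the schema
        to_json(metrics_history),
        to_json(logs_before_crash),
    )


params = crash_report_params(
    job_id='00000000-0000-0000-0000-000000000001',
    crash_type='scroll_timeout',
    error_message='scroll stalled',
    state={'scroll_count': 12},
)
```

The tuple would then be passed to the driver's `execute(INSERT_CRASH_REPORT, params)` call.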

Track B: Frontend Log Viewer

Task #5: Create JobDevTools React container component

Blocked by: #3 Blocks: #6, #18

Create web/components/JobDevTools/index.tsx:

  • Tab bar: All, Scraper, Browser, Network, System
  • Count badges per tab
  • Renders LogViewer, CopyToolbar, SessionPanel, CrashReport

Task #6: Create LogViewer component with virtualized list

Blocked by: #5 Blocks: #7, #18

Create web/components/JobDevTools/LogViewer.tsx:

  • Virtualized list (react-window)
  • Level filter, search, auto-scroll toggle
  • Timestamp format toggle

Task #7: Create CopyToolbar and copy utilities

Blocked by: #6 Blocks: #8, #18

Create:

  • web/components/JobDevTools/CopyToolbar.tsx
  • web/lib/copy-utils.ts

Task #8: Create LogEntry row component with click-to-copy

Blocked by: #7 Blocks: #18

Create web/components/JobDevTools/LogEntry.tsx:

  • Click to copy, shift+click for range
  • Level/category badges with colors
  • Expandable metrics view

Track C: Crash Analysis

Task #9: Implement crash detection wrapper in scraper

Blocked by: #1 Blocks: #10

Add to modules/scraper_clean.py:

  • Wrap execution in try/except
  • Periodic metrics sampling (5s interval)
  • Compile CrashReport on failure
  • Helpers: get_chrome_memory(), get_dom_node_count(), classify_crash()
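The wrapper described above can be sketched as follows. This is a minimal outline under stated assumptions: `run_with_crash_capture` and the report keys are hypothetical, `sample_metrics` stands in for calls such as `get_chrome_memory()` and `get_dom_node_count()`, and the exception-to-label mapping in `classify_crash` is invented for illustration.

```python
import threading
import time
import traceback
from typing import Any, Callable, Dict, List, Optional


def classify_crash(exc: BaseException) -> str:
    """Hypothetical mapping from exception type to a crash_type label."""
    if isinstance(exc, MemoryError):
        return 'memory_exhaustion'
    if isinstance(exc, TimeoutError):
        return 'scroll_timeout'
    return 'unknown'


def run_with_crash_capture(
    job: Callable[[], None],
    sample_metrics: Callable[[], Dict[str, Any]],
    interval_s: float = 5.0,
) -> Optional[Dict[str, Any]]:
    """Run job(); return None on success or a crash report dict on failure."""
    history: List[Dict[str, Any]] = []
    stop = threading.Event()

    def sampler() -> None:
        # Event.wait is an interruptible sleep; it returns True once stop is set.
        while not stop.wait(interval_s):
            history.append(sample_metrics())

    thread = threading.Thread(target=sampler, daemon=True)
    thread.start()
    error: Optional[BaseException] = None
    trace = ''
    try:
        job()
    except Exception as exc:
        error = exc
        trace = traceback.format_exc()
    finally:
        # Stop sampling before the history is read.
        stop.set()
        thread.join()
    if error is None:
        return None
    return {
        'crash_type': classify_crash(error),
        'error_message': str(error),
        'traceback': trace,
        'metrics_history': history,
    }


def failing_job() -> None:
    time.sleep(0.3)
    raise TimeoutError('scroll stalled')


report = run_with_crash_capture(failing_job, lambda: {'memory_mb': 512}, interval_s=0.02)
```

The short interval in the usage lines is only there to keep the demo fast; the plan calls for 5-second sampling.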

Task #10: Create crash pattern analyzer

Blocked by: #4, #9 Blocks: #11

Create modules/crash_analyzer.py:

  • Pattern detection: memory_exhaustion, dom_bloat, rate_limited, consent_loop, scroll_timeout, element_stale
  • Confidence scoring
  • Suggested fix generation
  • Auto-fix parameters
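A rule-based analyzer along these lines might be structured as below. Only two of the six patterns are shown. Every threshold, confidence value, fix text, and auto-fix parameter name is a made-up placeholder, and `analyze_crash` is a hypothetical function name.

```python
from typing import Any, Callable, Dict, Optional, Tuple

Report = Dict[str, Any]


def _memory_exhaustion(report: Report) -> float:
    samples = [m.get('memory_mb', 0) for m in report.get('metrics_history') or []]
    if not samples:
        return 0.0
    # Placeholder scale: confidence grows linearly and saturates at 2048 MB.
    return min(max(samples) / 2048.0, 1.0)


def _rate_limited(report: Report) -> float:
    text = (report.get('error_message') or '').lower()
    return 0.9 if '429' in text or 'too many requests' in text else 0.0


# pattern -> (scoring function, suggested fix, auto-fix parameters)
RULES: Dict[str, Tuple[Callable[[Report], float], str, Dict[str, Any]]] = {
    'memory_exhaustion': (
        _memory_exhaustion,
        'Restart the browser more often',
        {'restart_every_n_scrolls': 50},
    ),
    'rate_limited': (
        _rate_limited,
        'Slow down scrolling',
        {'scroll_delay_ms': 2000},
    ),
}


def analyze_crash(report: Report, min_confidence: float = 0.5) -> Optional[Dict[str, Any]]:
    """Return the highest-confidence pattern at or above the threshold, if any."""
    best: Optional[Dict[str, Any]] = None
    for pattern, (score, fix, params) in RULES.items():
        confidence = score(report)
        if confidence < min_confidence:
            continue
        if best is None or confidence > best['confidence']:
            best = {
                'pattern': pattern,
                'confidence': confidence,
                'suggested_fix': fix,
                'auto_fix_params': params,
            }
    return best


memory_result = analyze_crash({'metrics_history': [{'memory_mb': 800}, {'memory_mb': 1900}]})
rate_result = analyze_crash({'error_message': 'HTTP 429 Too Many Requests'})
no_match = analyze_crash({'error_message': 'element not found'})
```

Keeping each pattern as an independent scoring function makes it straightforward to add the remaining patterns (dom_bloat, consent_loop, scroll_timeout, element_stale) without touching the selection logic.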

Task #11: Add crash report API endpoints

Blocked by: #4, #10 Blocks: #12

Add to api_server_production.py:

  • GET /jobs/{job_id}/crash-report
  • POST /jobs/{job_id}/retry?apply_fix=...
  • GET /crashes/stats

Task #12: Create CrashReport frontend component

Blocked by: #11 Blocks: #18

Create web/components/JobDevTools/CrashReport.tsx:

  • Timeline-to-crash visualization
  • Pattern analysis display
  • "Apply Fix & Retry" button
  • Collapsible logs before crash

Track D: Session & Metrics

Task #13: Capture and store session fingerprint in backend

Blocked by: #4 Blocks: #14

Add to modules/scraper_clean.py:

  • Compile SessionFingerprint at job start
  • Run bot detection tests
  • Store in job metadata

Task #14: Create SessionPanel frontend component

Blocked by: #13 Blocks: #18

Create web/components/JobDevTools/SessionPanel.tsx:

  • "What Google Saw" display
  • Identity, Geolocation, Viewport sections
  • Bot detection indicators (green/yellow/red)

Task #15: Create MetricsDashboard with real-time charts

Blocked by: #3 Blocks: #18

Create web/components/JobDevTools/MetricsDashboard.tsx:

  • Extraction rate line chart
  • Cumulative reviews area chart
  • Memory usage line chart
  • API vs DOM pie chart

Track E: Review Topics

Task #16: Implement review topics inference algorithm

Blocks: #17

Add to modules/scraper_clean.py:

  • infer_review_topics(review_text, topics) function
  • Word boundary matching
  • Simple stemming variants
  • Add 'topics' field to each review
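One way the matching could work is sketched below. It assumes `topics` is a list of single-word strings, which this task list does not specify. The `_variants` helper is hypothetical, and its suffix rules are a naive stand-in for "simple stemming", not a real stemmer.

```python
import re
from typing import Iterable, List


def _variants(term: str) -> List[str]:
    """Generate naive inflected forms of a term (plural, past tense, gerund)."""
    term = term.lower()
    forms = {term, term + 's', term + 'es', term + 'ed', term + 'ing'}
    if term.endswith('e'):
        forms |= {term[:-1] + 'ing', term + 'd'}  # price -> pricing, priced
    if term.endswith('y'):
        forms.add(term[:-1] + 'ies')              # delivery -> deliveries
    return sorted(forms, key=len, reverse=True)


def infer_review_topics(review_text: str, topics: Iterable[str]) -> List[str]:
    """Return the topics whose variants appear as whole words in the review."""
    matched: List[str] = []
    for topic in topics:
        alternation = '|'.join(re.escape(v) for v in _variants(topic))
        # \b on both sides prevents matches inside longer words.
        if re.search(rf'\b(?:{alternation})\b', review_text, flags=re.IGNORECASE):
            matched.append(topic)
    return matched


review = {'text': 'Prices were fair, but the party room had slow service.'}
review['topics'] = infer_review_topics(review['text'], ['price', 'service', 'art'])
```

Here `art` is not matched even though it occurs inside "party", because the word boundaries require a whole-word hit.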

Task #17: Add topic tags to review cards in frontend

Blocked by: #16

Update:

  • web/components/ReviewAnalytics.tsx
  • web/lib/analytics.ts

Add topic tags to reviews, a topic filter, and a topic distribution chart.


Task #18: Integrate JobDevTools into job detail page

Blocked by: #5, #6, #7, #8, #12, #14, #15

Replace current log display with JobDevTools component. Handle both old and new log formats. Connect SSE stream for real-time updates.


Execution Waves

Wave  Tasks               Parallel Agents
1     #1, #16             2
2     #2, #4, #9, #17     4
3     #3, #10, #13        3
4     #5, #11, #14, #15   4
5     #6, #12             2
6     #7 → #8             1 (sequential)
7     #18                 1

Critical Path: #1 → #2 → #3 → #5 → #6 → #7 → #8 → #18
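The waves and critical path can be cross-checked against the "Blocked by" lines with a short script. The sketch below assigns each task a level one greater than its deepest dependency. The function names are hypothetical, and the code assumes the dependency map has no cycles. Levels 6 and 7 (#7, then #8) correspond to the single sequential wave 6 above, so the script reports 8 levels for the plan's 7 waves.

```python
from typing import Dict, List

# Transcribed from the "Blocked by" lines of the task details.
BLOCKED_BY: Dict[int, List[int]] = {
    1: [], 2: [1], 3: [1, 2], 4: [1],
    5: [3], 6: [5], 7: [6], 8: [7],
    9: [1], 10: [4, 9], 11: [4, 10], 12: [11],
    13: [4], 14: [13], 15: [3],
    16: [], 17: [16],
    18: [5, 6, 7, 8, 12, 14, 15],
}


def compute_levels(blocked_by: Dict[int, List[int]]) -> Dict[int, int]:
    """Level = 1 + the deepest dependency's level (tasks with none get 1)."""
    levels: Dict[int, int] = {}

    def level(task: int) -> int:
        if task not in levels:
            levels[task] = 1 + max((level(d) for d in blocked_by[task]), default=0)
        return levels[task]

    for task in blocked_by:
        level(task)
    return levels


def critical_path(blocked_by: Dict[int, List[int]], levels: Dict[int, int]) -> List[int]:
    """Walk back from the deepest task through its deepest dependency each step."""
    task = max(levels, key=lambda t: levels[t])
    path = [task]
    while blocked_by[task]:
        task = max(blocked_by[task], key=lambda t: levels[t])
        path.append(task)
    return path[::-1]


levels = compute_levels(BLOCKED_BY)
waves: Dict[int, List[int]] = {}
for task, lvl in sorted(levels.items()):
    waves.setdefault(lvl, []).append(task)
path = critical_path(BLOCKED_BY, levels)
```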