Add Job DevTools implementation task breakdown
18 tasks organized in 5 parallel tracks:
- Track A: Backend logging infrastructure (4 tasks)
- Track B: Frontend log viewer (5 tasks)
- Track C: Crash analysis (4 tasks)
- Track D: Session & metrics (3 tasks)
- Track E: Review topics (2 tasks)

Includes dependency graph and 7-wave execution plan for parallel AI agent workflow.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
336
.artifacts/job-devtools-tasks.md
Normal file
@@ -0,0 +1,336 @@
# Job DevTools - Implementation Tasks

## Dependency Graph

```
Wave 1 (Parallel start):
#1 StructuredLogger ──┬──▶ #2 Migrate scraper ──▶ #3 SSE stream ──▶ #5 JobDevTools
                      │                                               │
                      ├──▶ #4 DB schema ──┬──▶ #10 Crash analyzer     ▼
                      │                   │           │          #6 LogViewer
                      │                   │           ▼               │
                      │                   ├──▶ #11 Crash API          ▼
                      │                   │           │         #7 CopyToolbar
                      │                   │           ▼               │
                      │                   │    #12 CrashReport        ▼
                      │                   │                      #8 LogEntry
                      │                   └──▶ #13 Session capture    │
                      │                               │               │
                      └──▶ #9 Crash detection         ▼               │
                                  │            #14 SessionPanel       │
                                  │                   │               │
                                  └───────────────────┼───────────────┘
                                                      │
#16 Topics inference ──▶ #17 Topic tags               ▼
                                             #15 MetricsDashboard
                                                      │
                                                      ▼
                                               #18 INTEGRATION
```

---

## Task Details

### Track A: Backend Logging Infrastructure

#### Task #1: Create StructuredLogger class in Python backend
**Priority:** P0 (Foundation)
**Blocks:** #2, #3, #4, #9

Create `modules/structured_logger.py`:

```python
from collections import deque
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
from typing import Any, Deque, Dict, List, Literal, Optional
import threading

LogLevel = Literal['DEBUG', 'INFO', 'WARN', 'ERROR', 'FATAL']
LogCategory = Literal['scraper', 'browser', 'network', 'system']


@dataclass
class LogEntry:
    timestamp: str       # ISO 8601 UTC with a trailing 'Z'
    timestamp_ms: int    # same instant, in milliseconds since the Unix epoch
    level: LogLevel
    category: LogCategory
    message: str
    metrics: Optional[Dict[str, Any]] = None
    network: Optional[Dict[str, Any]] = None
    snapshot_id: Optional[str] = None


class StructuredLogger:
    def __init__(self, max_entries: int = 10000):
        # A bounded deque drops the oldest entry on overflow without
        # copying the whole buffer on every append.
        self._logs: Deque[LogEntry] = deque(maxlen=max_entries)
        self._lock = threading.Lock()
        self._max_entries = max_entries

    def _log(self, level: LogLevel, category: LogCategory, message: str,
             metrics: Optional[Dict[str, Any]] = None,
             network: Optional[Dict[str, Any]] = None,
             snapshot_id: Optional[str] = None) -> None:
        # Read the clock once so both timestamp fields describe the same instant.
        # datetime.utcnow() is deprecated since Python 3.12; use an aware datetime.
        now = datetime.now(timezone.utc)
        entry = LogEntry(
            timestamp=now.isoformat().replace('+00:00', 'Z'),
            timestamp_ms=int(now.timestamp() * 1000),
            level=level,
            category=category,
            message=message,
            metrics=metrics,
            network=network,
            snapshot_id=snapshot_id,
        )
        with self._lock:
            self._logs.append(entry)

    # Convenience wrappers; kwargs may carry metrics, network, snapshot_id.
    def debug(self, category: LogCategory, message: str, **kwargs) -> None:
        self._log('DEBUG', category, message, **kwargs)

    def info(self, category: LogCategory, message: str, **kwargs) -> None:
        self._log('INFO', category, message, **kwargs)

    def warn(self, category: LogCategory, message: str, **kwargs) -> None:
        self._log('WARN', category, message, **kwargs)

    def error(self, category: LogCategory, message: str, **kwargs) -> None:
        self._log('ERROR', category, message, **kwargs)

    def fatal(self, category: LogCategory, message: str, **kwargs) -> None:
        self._log('FATAL', category, message, **kwargs)

    def get_logs(self) -> List[Dict[str, Any]]:
        # Return plain dicts so callers never hold references to live entries.
        with self._lock:
            return [asdict(e) for e in self._logs]

    def get_logs_by_category(self, category: LogCategory) -> List[Dict[str, Any]]:
        with self._lock:
            return [asdict(e) for e in self._logs if e.category == category]
```

---

#### Task #2: Migrate scraper_clean.py to use StructuredLogger
**Blocked by:** #1
**Blocks:** #3

Update all log calls in `modules/scraper_clean.py`:
- Replace `LogCapture` with `StructuredLogger`
- Add category to each log call
- Add metrics where relevant (scroll_count, reviews_count, memory_mb)

---

#### Task #3: Update SSE stream to emit structured log events
**Blocked by:** #1, #2
**Blocks:** #5, #15

Update `api_server_production.py`:
- Change log event format to include full LogEntry structure
- Add metrics event type emitted every 5 seconds
- Backward compatibility for old clients
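
A minimal sketch of how a structured entry might be serialized for the stream, assuming the standard Server-Sent Events text framing (an `event:` field, a `data:` field, and a blank line as terminator). The helper name `format_sse_event` and the event names `log` and `metrics` are illustrative assumptions, not taken from `api_server_production.py`. One way to keep older clients working would be to leave any fields they already read in place and only add new ones; the existing event format is not shown here, so treat that as a suggestion only.

```python
import json
from typing import Any, Dict


def format_sse_event(event_type: str, payload: Dict[str, Any]) -> str:
    """Serialize one payload as a Server-Sent Events message (hypothetical helper)."""
    # json.dumps emits a single line by default, so one `data:` field is enough.
    data = json.dumps(payload, separators=(",", ":"))
    return f"event: {event_type}\ndata: {data}\n\n"


# A structured log entry shaped like the LogEntry dataclass from Task #1.
entry = {
    "timestamp": "2025-01-01T12:00:00Z",
    "timestamp_ms": 1735732800000,
    "level": "INFO",
    "category": "scraper",
    "message": "Scrolled review pane",
    "metrics": {"scroll_count": 12, "reviews_count": 240},
    "network": None,
    "snapshot_id": None,
}

log_event = format_sse_event("log", entry)
# The periodic metrics event could reuse the same framing (payload keys assumed).
metrics_event = format_sse_event("metrics", {"memory_mb": 512})
```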

---

#### Task #4: Add crash_reports table and schema
**Blocked by:** #1
**Blocks:** #10, #11, #13

Add to `modules/database.py`:
```sql
CREATE TABLE crash_reports (
    crash_id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    job_id UUID REFERENCES jobs(job_id) ON DELETE CASCADE,
    created_at TIMESTAMP NOT NULL DEFAULT NOW(),
    crash_type VARCHAR(50) NOT NULL,
    error_message TEXT,
    state JSONB NOT NULL,
    metrics_history JSONB,
    logs_before_crash JSONB,
    analysis JSONB,
    screenshot_url TEXT
);

ALTER TABLE jobs ADD COLUMN IF NOT EXISTS session_fingerprint JSONB;
ALTER TABLE jobs ADD COLUMN IF NOT EXISTS metrics_history JSONB;
```

---

### Track B: Frontend Log Viewer

#### Task #5: Create JobDevTools React container component
**Blocked by:** #3
**Blocks:** #6, #18

Create `web/components/JobDevTools/index.tsx`:
- Tab bar: All, Scraper, Browser, Network, System
- Count badges per tab
- Renders LogViewer, CopyToolbar, SessionPanel, CrashReport

---

#### Task #6: Create LogViewer component with virtualized list
**Blocked by:** #5
**Blocks:** #7, #18

Create `web/components/JobDevTools/LogViewer.tsx`:
- Virtualized list (react-window)
- Level filter, search, auto-scroll toggle
- Timestamp format toggle

---

#### Task #7: Create CopyToolbar and copy utilities
**Blocked by:** #6
**Blocks:** #8, #18

Create:
- `web/components/JobDevTools/CopyToolbar.tsx`
- `web/lib/copy-utils.ts`

---

#### Task #8: Create LogEntry row component with click-to-copy
**Blocked by:** #7
**Blocks:** #18

Create `web/components/JobDevTools/LogEntry.tsx`:
- Click to copy, shift+click for range
- Level/category badges with colors
- Expandable metrics view

---

### Track C: Crash Analysis

#### Task #9: Implement crash detection wrapper in scraper
**Blocked by:** #1
**Blocks:** #10

Add to `modules/scraper_clean.py`:
- Wrap execution in try/except
- Periodic metrics sampling (5s interval)
- Compile CrashReport on failure
- Helper: get_chrome_memory(), get_dom_node_count(), classify_crash()
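
The wrapper described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the planned implementation: `run_with_crash_capture` and `MetricsSampler` are hypothetical names, the crash-type labels returned by `classify_crash` are placeholders, and `sample_fn` stands in for whatever combines `get_chrome_memory()` and `get_dom_node_count()`. The report is returned as a plain dict whose keys loosely mirror the `crash_reports` columns from Task #4.

```python
import threading
import traceback
from typing import Any, Callable, Dict, List, Optional


def classify_crash(exc: BaseException) -> str:
    """Map an exception to a coarse crash type (labels are illustrative)."""
    if isinstance(exc, MemoryError):
        return "memory"
    if isinstance(exc, TimeoutError):
        return "timeout"
    return "unknown"


class MetricsSampler:
    """Calls `sample_fn` every `interval_s` seconds on a background thread."""

    def __init__(self, sample_fn: Callable[[], Dict[str, Any]], interval_s: float = 5.0):
        self._sample_fn = sample_fn
        self._interval_s = interval_s
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self.history: List[Dict[str, Any]] = []

    def _run(self) -> None:
        # Event.wait returns False on timeout and True once stop() is called.
        while not self._stop.wait(self._interval_s):
            self.history.append(self._sample_fn())

    def start(self) -> None:
        self._thread.start()

    def stop(self) -> None:
        self._stop.set()
        self._thread.join()


def run_with_crash_capture(job: Callable[[], Any],
                           sample_fn: Callable[[], Dict[str, Any]],
                           interval_s: float = 5.0) -> Dict[str, Any]:
    """Run `job`; on failure return a crash report instead of raising."""
    sampler = MetricsSampler(sample_fn, interval_s)
    sampler.start()
    result: Any = None
    crash_report: Optional[Dict[str, Any]] = None
    try:
        result = job()
    except Exception as exc:
        crash_report = {
            "crash_type": classify_crash(exc),
            "error_message": str(exc),
            "traceback": traceback.format_exc(),
        }
    finally:
        sampler.stop()
    if crash_report is not None:
        # Attach the samples only after the sampler thread has fully stopped.
        crash_report["metrics_history"] = list(sampler.history)
    return {"ok": crash_report is None, "result": result, "crash_report": crash_report}


def failing_job() -> None:
    raise MemoryError("out of memory")


outcome = run_with_crash_capture(failing_job, lambda: {"memory_mb": 0}, interval_s=0.01)
```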

---

#### Task #10: Create crash pattern analyzer
**Blocked by:** #4, #9
**Blocks:** #11

Create `modules/crash_analyzer.py`:
- Pattern detection: memory_exhaustion, dom_bloat, rate_limited, consent_loop, scroll_timeout, element_stale
- Confidence scoring
- Suggested fix generation
- Auto-fix parameters
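
To make the scoring idea concrete, here is a small sketch covering three of the listed patterns. Everything numeric is an assumption made for illustration: the thresholds, the 0.5 cutoff, the fixed 0.9 score for rate limiting, the metric keys `memory_mb` and `dom_nodes`, and the auto-fix parameter names. The real analyzer may score patterns quite differently.

```python
from typing import Any, Dict, List

# Hypothetical limits, chosen only to make the example concrete.
MEMORY_LIMIT_MB = 2048.0
DOM_NODE_LIMIT = 50_000.0
MIN_CONFIDENCE = 0.5


def _peak(history: List[Dict[str, Any]], key: str) -> float:
    """Largest value of `key` across samples, or 0.0 if it never appears."""
    return max((float(s[key]) for s in history if key in s), default=0.0)


def _ratio_finding(pattern: str, peak: float, limit: float,
                   suggested_fix: str, auto_fix: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Score a pattern by how close the observed peak came to its limit."""
    confidence = round(min(peak / limit, 1.0), 2)
    if confidence < MIN_CONFIDENCE:
        return []
    return [{"pattern": pattern, "confidence": confidence,
             "suggested_fix": suggested_fix, "auto_fix": auto_fix}]


def analyze_crash(error_message: str,
                  metrics_history: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Return candidate patterns sorted by descending confidence."""
    findings: List[Dict[str, Any]] = []
    findings += _ratio_finding(
        "memory_exhaustion", _peak(metrics_history, "memory_mb"), MEMORY_LIMIT_MB,
        "Restart the browser more often", {"restart_every_n_scrolls": 100})
    findings += _ratio_finding(
        "dom_bloat", _peak(metrics_history, "dom_nodes"), DOM_NODE_LIMIT,
        "Prune processed review nodes", {"prune_dom": True})
    if "429" in error_message or "rate limit" in error_message.lower():
        findings.append({"pattern": "rate_limited", "confidence": 0.9,
                         "suggested_fix": "Slow down requests",
                         "auto_fix": {"delay_multiplier": 2}})
    return sorted(findings, key=lambda f: f["confidence"], reverse=True)


history = [{"memory_mb": 1024}, {"memory_mb": 1900, "dom_nodes": 10000}]
findings = analyze_crash("Too many requests (429)", history)
```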

---

#### Task #11: Add crash report API endpoints
**Blocked by:** #4, #10
**Blocks:** #12

Add to `api_server_production.py`:
- GET /jobs/{job_id}/crash-report
- POST /jobs/{job_id}/retry?apply_fix=...
- GET /crashes/stats

---

#### Task #12: Create CrashReport frontend component
**Blocked by:** #11
**Blocks:** #18

Create `web/components/JobDevTools/CrashReport.tsx`:
- Timeline to crash visualization
- Pattern analysis display
- "Apply Fix & Retry" button
- Collapsible logs before crash

---

### Track D: Session & Metrics

#### Task #13: Capture and store session fingerprint in backend
**Blocked by:** #4
**Blocks:** #14

Add to `modules/scraper_clean.py`:
- Compile SessionFingerprint at job start
- Run bot detection tests
- Store in job metadata

---

#### Task #14: Create SessionPanel frontend component
**Blocked by:** #13
**Blocks:** #18

Create `web/components/JobDevTools/SessionPanel.tsx`:
- "What Google Saw" display
- Identity, Geolocation, Viewport sections
- Bot detection indicators (green/yellow/red)

---

#### Task #15: Create MetricsDashboard with real-time charts
**Blocked by:** #3
**Blocks:** #18

Create `web/components/JobDevTools/MetricsDashboard.tsx`:
- Extraction rate line chart
- Cumulative reviews area chart
- Memory usage line chart
- API vs DOM pie chart

---

### Track E: Review Topics

#### Task #16: Implement review topics inference algorithm
**Blocks:** #17

Add to `modules/scraper_clean.py`:
- `infer_review_topics(review_text, topics)` function
- Word boundary matching
- Simple stemming variants
- Add 'topics' field to each review
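
One possible shape for `infer_review_topics`, assuming topics are short words or phrases and that "simple stemming variants" means naive suffix forms rather than a real stemmer. The `_variants` helper and the `text` key on the review dict are assumptions made for this sketch. Word-boundary anchors keep a topic such as "price" from matching inside "priceless".

```python
import re
from typing import Iterable, List


def _variants(topic: str) -> List[str]:
    """Naive suffix variants of a topic word (hypothetical helper, not a stemmer)."""
    base = topic.lower()
    forms = {base, base + "s", base + "es"}
    if base.endswith("e"):
        forms.update({base + "d", base[:-1] + "ing"})   # price -> priced, pricing
    else:
        forms.update({base + "ed", base + "ing"})       # clean -> cleaned, cleaning
    if base.endswith("y"):
        forms.add(base[:-1] + "ies")                    # delivery -> deliveries
    # Longest first so the regex alternation prefers the fullest form.
    return sorted(forms, key=len, reverse=True)


def infer_review_topics(review_text: str, topics: Iterable[str]) -> List[str]:
    """Return the topics whose word forms appear in the review, in input order."""
    matched: List[str] = []
    for topic in topics:
        alternation = "|".join(re.escape(v) for v in _variants(topic))
        # \b on both sides requires a whole-word match.
        if re.search(rf"\b(?:{alternation})\b", review_text, flags=re.IGNORECASE):
            matched.append(topic)
    return matched


# Attaching the result to a review record (the "text" key is an assumption).
review = {"text": "The staff were friendly but parking was awful and prices are high."}
review["topics"] = infer_review_topics(review["text"], ["staff", "park", "price", "food"])
```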

---

#### Task #17: Add topic tags to review cards in frontend
**Blocked by:** #16

Update:
- `web/components/ReviewAnalytics.tsx`
- `web/lib/analytics.ts`

Add topic tags to reviews, topic filter, topic distribution chart.

---

#### Task #18: Integrate JobDevTools into job detail page
**Blocked by:** #5, #6, #7, #8, #12, #14, #15

Replace current log display with JobDevTools component.
Handle both old and new log formats.
Connect SSE stream for real-time updates.

---

## Execution Waves

| Wave | Tasks | Parallel Agents |
|------|-------|-----------------|
| 1 | #1, #16 | 2 |
| 2 | #2, #4, #9, #17 | 4 |
| 3 | #3, #10, #13 | 3 |
| 4 | #5, #11, #14, #15 | 4 |
| 5 | #6, #12 | 2 |
| 6 | #7 → #8 | 1 (sequential) |
| 7 | #18 | 1 |

**Critical Path:** #1 → #2 → #3 → #5 → #6 → #7 → #8 → #18
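
The wave assignment can be cross-checked from the "Blocked by" lists with a small layering routine, sketched below. The function names are illustrative. A task's level is one more than the deepest level among its blockers, so a strict layering yields eight levels: levels 1 to 5 match the table, #7 and #8 land on levels 6 and 7 because #8 depends on #7, and #18 lands on level 8. The table reaches seven waves by running #7 and #8 back to back inside wave 6, and the eight levels correspond to the eight tasks on the critical path.

```python
from typing import Dict, List

# "Blocked by" lists copied from the task details above.
BLOCKED_BY: Dict[int, List[int]] = {
    1: [], 2: [1], 3: [1, 2], 4: [1],
    5: [3], 6: [5], 7: [6], 8: [7],
    9: [1], 10: [4, 9], 11: [4, 10], 12: [11],
    13: [4], 14: [13], 15: [3],
    16: [], 17: [16],
    18: [5, 6, 7, 8, 12, 14, 15],
}


def compute_levels(blocked_by: Dict[int, List[int]]) -> Dict[int, int]:
    """Level of a task = 1 + the deepest level among its blockers."""
    levels: Dict[int, int] = {}

    def level(task: int) -> int:
        # Memoized depth-first walk; assumes the graph has no cycles.
        if task not in levels:
            levels[task] = 1 + max((level(dep) for dep in blocked_by[task]), default=0)
        return levels[task]

    for task in blocked_by:
        level(task)
    return levels


def group_by_level(levels: Dict[int, int]) -> Dict[int, List[int]]:
    """Collect task numbers per level, in ascending task order."""
    waves: Dict[int, List[int]] = {}
    for task, lvl in sorted(levels.items()):
        waves.setdefault(lvl, []).append(task)
    return waves


levels = compute_levels(BLOCKED_BY)
waves = group_by_level(levels)
```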