feat: whyrating - initial project from turbostarter boilerplate

2026-02-04 01:54:52 +01:00
commit 5cdc07cd39
1618 changed files with 338230 additions and 0 deletions
--- a/docs/tcs/TCS-CLAUDE-CODE.md
+++ b/docs/tcs/TCS-CLAUDE-CODE.md
@@ -0,0 +1,627 @@
+# TCS with Claude Code CLI
+
+> Run Triangulated Compiler Synthesis entirely within Claude Code using multi-agent Task orchestration. No external API calls needed.
+
+## Overview
+
+Claude Code can execute TCS by spawning parallel Task agents that act as the generators, judge, and builder roles. All computation happens within Claude Code's context.
+
+```mermaid
+graph TD
+    U[User] -->|"Run TCS for X"| CC[Claude Code Main]
+    CC -->|spawn| G1[Task: Generator 1]
+    CC -->|spawn| G2[Task: Generator 2]
+    CC -->|spawn| G3[Task: Generator 3]
+    G1 & G2 & G3 -->|results| CC
+    CC -->|spawn| J[Task: Judge]
+    J -->|findings| CC
+    CC -->|if gaps| E[Task: Evolver]
+    E -->|new spec| CC
+    CC -->|spawn parallel| B1[Task: Scanner]
+    CC -->|spawn parallel| B2[Task: Parser]
+    CC -->|spawn parallel| B3[Task: Emitter]
+    B1 & B2 & B3 -->|code| CC
+    CC -->|write| F[Files]
+```
+
+## How It Works
+
+1. **User provides**: Domain, initial spec, target sample count
+2. **Claude Code orchestrates**: Spawns Task agents in parallel
+3. **No API calls**: All LLM work is done by Claude Code itself via Task tool
+4. **Files written**: Spec, compiler code, test samples saved to disk
+
+## Execution Protocol
+
+When asked to run TCS, Claude Code should:
+
+### Step 1: Gather Parameters
+
+Ask the user:
+```
+To run TCS, I need:
+1. Domain name (e.g., "form-builder", "chart-dsl")
+2. Path to initial spec (or I can create one)
+3. Number of iterations (recommended: 5-10)
+4. Samples per iteration (recommended: 5-10)
+5. Output directory
+```
+
+### Step 2: Initialize
+
+```
+Create output structure:
+  {output}/
+  ├── spec/
+  │   └── SPEC-v0.md (initial)
+  ├── samples/
+  │   ├── valid/
+  │   └── invalid/
+  ├── compiler/
+  │   ├── scanner.ts
+  │   ├── parser.ts
+  │   └── emitter.ts
+  └── state.json (checkpoint)
+```
+
+### Step 3: Run Iteration Loop
+
+For each iteration:
+
+#### 3a. Generate Samples (Parallel Tasks)
+
+Spawn N parallel Task agents with `subagent_type: "general-purpose"`:
+
+```
+Task 1: "Generate sample for [domain].
+         Prompt: 'Create a [random scenario]'
+
+         Return JSON:
+         {
+           'prompt': '...',
+           'rep1_visual': '... (description/JSX)',
+           'rep2_structured': { ... (JSON schema) },
+           'rep3_dsl': '... (DSL code)'
+         }
+
+         Use this spec: [current spec content]"
+
+Task 2: (same with different random prompt)
+Task 3: (same with different random prompt)
+...
+Task N: (same with different random prompt)
+```
+
+#### 3b. Triangulate (Parallel Tasks)
+
+For each sample, spawn a judge Task:
+
+```
+Task: "Compare these 3 representations for semantic equivalence:
+
+       VISUAL: [rep1]
+       STRUCTURED: [rep2]
+       DSL: [rep3]
+
+       Return JSON:
+       {
+         'equivalent': true/false,
+         'disagreements': ['...'],
+         'spec_gaps': ['...']
+       }"
+```
+
+#### 3c. Evolve Spec (Single Task if gaps found)
+
+```
+Task: "Based on these findings from triangulation:
+       [list of spec_gaps]
+
+       Current spec:
+       [spec content]
+
+       Return the COMPLETE updated spec with changes marked."
+```
+
+Write new spec version to `spec/SPEC-v{N}.md`
+
+#### 3d. Build/Update Compiler (Parallel Tasks)
+
+Spawn 3 parallel Tasks:
+
+```
+Task Scanner: "Write a TypeScript scanner for this DSL spec:
+               [spec content]
+
+               Return complete scanner.ts code."
+
+Task Parser:  "Write a TypeScript parser for this DSL spec:
+               [spec content]
+
+               Return complete parser.ts code."
+
+Task Emitter: "Write a TypeScript emitter for this DSL spec:
+               [spec content]
+               Target output: [structured format]
+
+               Return complete emitter.ts code."
+```
+
+Write files to `compiler/`
+
+#### 3e. Test (Main Agent)
+
+Claude Code main agent:
+- Reads validated samples
+- Runs compiler mentally or via Bash if executable
+- Compares output to expected
+- Records pass/fail
+
+#### 3f. Checkpoint
+
+Write to `state.json`:
+```json
+{
+  "iteration": 3,
+  "spec_version": 3,
+  "samples_validated": 25,
+  "samples_invalid": 3,
+  "pass_rate": 0.92,
+  "last_findings": ["..."]
+}
+```
+
+### Step 4: Convergence Check
+
+Stop when:
+- `pass_rate >= 0.95` AND
+- `last 2 iterations had 0 spec changes` AND
+- `samples_validated >= target`
+
+Or max iterations reached.
+
+### Step 5: Final Output
+
+```
+TCS Complete!
+
+Results:
+- Iterations: 7
+- Final spec: spec/SPEC-v7.md
+- Validated samples: 52
+- Pass rate: 98%
+- Compiler: compiler/{scanner,parser,emitter}.ts
+
+Files written to: {output}/
+```
+
+## Parallel Task Strategy
+
+**Maximize parallelism** by launching independent tasks together:
+
+```
+Iteration N:
+├── [PARALLEL] Generate 10 samples (10 Tasks)
+│   └── Wait for all
+├── [PARALLEL] Triangulate 10 samples (10 Tasks)
+│   └── Wait for all
+├── [SEQUENTIAL] Evolve spec (1 Task, only if gaps)
+├── [PARALLEL] Build compiler (3 Tasks)
+│   └── Wait for all
+└── [SEQUENTIAL] Test & checkpoint (Main agent)
+```
+
+## Sample Task Prompts
+
+### Generator Task Prompt Template
+
+```
+You are generating a TCS sample for domain: {domain}
+
+SPEC:
+---
+{spec_content}
+---
+
+USER SCENARIO: {random_prompt}
+
+Generate 3 semantically equivalent representations:
+
+1. VISUAL (description or JSX):
+   Describe what the user would see/interact with.
+
+2. STRUCTURED (JSON):
+   The underlying data model.
+
+3. DSL (compact notation):
+   Using the spec syntax above.
+
+Return as JSON:
+{
+  "prompt": "{random_prompt}",
+  "rep1_visual": "...",
+  "rep2_structured": {...},
+  "rep3_dsl": "..."
+}
+```
+
+### Judge Task Prompt Template
+
+```
+You are a TCS triangulation judge.
+
+Compare these 3 representations for SEMANTIC EQUIVALENCE:
+
+REP1 (Visual):
+{rep1}
+
+REP2 (Structured):
+{rep2}
+
+REP3 (DSL):
+{rep3}
+
+Check:
+1. Do all 3 represent the SAME thing?
+2. Are there missing fields in any representation?
+3. Are there contradictions?
+4. Does the DSL follow the spec correctly?
+
+SPEC for reference:
+{spec}
+
+Return JSON:
+{
+  "equivalent": true/false,
+  "disagreements": [
+    "Rep1 has X but Rep2 missing",
+    "Rep3 uses Y syntax not in spec"
+  ],
+  "spec_gaps": [
+    "Spec doesn't define how to handle X",
+    "Need syntax for Y"
+  ]
+}
+```
+
+### Evolver Task Prompt Template
+
+```
+You are evolving a DSL specification based on TCS findings.
+
+CURRENT SPEC:
+---
+{spec}
+---
+
+FINDINGS FROM TRIANGULATION:
+{findings_list}
+
+Update the spec to address these gaps. Rules:
+1. Add missing syntax/semantics
+2. Clarify ambiguities
+3. Add examples for new features
+4. Keep backwards compatible if possible
+5. Mark changes with <!-- NEW v{N} -->
+
+Return the COMPLETE updated spec.
+```
+
+### Builder Task Prompt Templates
+
+**Scanner:**
+```
+Write a TypeScript scanner/tokenizer for this DSL:
+
+SPEC:
+{spec}
+
+Requirements:
+- Export: scan(source: string): Token[]
+- Token type: { type: string, value: string, line: number, column: number }
+- Handle all syntax in spec
+- Good error messages with line/column
+
+Return complete scanner.ts file.
+```
+
+**Parser:**
+```
+Write a TypeScript parser for this DSL:
+
+SPEC:
+{spec}
+
+TOKEN TYPES (from scanner):
+{token_types}
+
+Requirements:
+- Export: parse(tokens: Token[]): AST
+- AST should mirror the structured representation
+- Handle all syntax in spec
+- Good error messages
+
+Return complete parser.ts file.
+```
+
+**Emitter:**
+```
+Write a TypeScript emitter for this DSL:
+
+SPEC:
+{spec}
+
+AST TYPE:
+{ast_type}
+
+TARGET:
+{target_format}
+
+Requirements:
+- Export: emit(ast: AST): TargetType
+- Handle all node types
+- Preserve all semantic information
+
+Return complete emitter.ts file.
+```
+
+## Error Handling
+
+If a Task fails:
+1. Log the error
+2. Retry once with clarified prompt
+3. If still fails, mark sample as invalid
+4. Continue with other samples
+
+If compiler build fails:
+1. Spawn a "Reflector" Task to diagnose
+2. Apply suggested fix
+3. Retry build
+
+## Usage Example
+
+User says:
+> "Run TCS to build a chart DSL compiler. 5 iterations, 8 samples each."
+
+Claude Code responds:
+> "Starting TCS for chart-dsl domain.
+>
+> Configuration:
+> - Iterations: 5
+> - Samples/iteration: 8
+> - Output: ./output/chart-dsl/
+>
+> Creating initial spec... Do you have an existing spec or should I create one?"
+
+User:
+> "Create one"
+
+Claude Code:
+> "Created initial spec at ./output/chart-dsl/spec/SPEC-v0.md
+>
+> Starting iteration 1...
+> [Spawns 8 parallel generator Tasks]
+> Generated 8 samples.
+>
+> [Spawns 8 parallel judge Tasks]
+> Triangulation complete: 6 valid, 2 with disagreements.
+>
+> Findings:
+> - Axis label syntax undefined
+> - Legend position options missing
+>
+> [Spawns evolver Task]
+> Spec updated to v1.
+>
+> [Spawns 3 parallel builder Tasks]
+> Compiler built: scanner.ts, parser.ts, emitter.ts
+>
+> Testing... 5/6 passing (83%)
+>
+> Iteration 1 complete. Continuing..."
+
+## Checkpoint & Resume
+
+State file enables resume:
+
+```json
+{
+  "domain": "chart-dsl",
+  "iteration": 3,
+  "max_iterations": 5,
+  "samples_per_iteration": 8,
+  "spec_version": 3,
+  "validated_samples": ["s001", "s002", ...],
+  "invalid_samples": ["s003"],
+  "compiler_version": 3,
+  "pass_rate": 0.88
+}
+```
+
+To resume:
+> "Resume TCS from ./output/chart-dsl/"
+
+Claude Code reads state.json and continues from last checkpoint.
+
+## Best Practices
+
+1. **Use `run_in_background: false`** for Tasks - need results immediately
+2. **Batch parallel Tasks** in single message for efficiency
+3. **Keep spec in memory** during iteration, write to disk at checkpoints
+4. **Accumulate samples** across iterations for comprehensive testing
+5. **Use Opus** for compiler/spec tasks, **Sonnet** for judge/generator tasks
+
+## Model Selection per Task
+
+Use the best models for critical tasks:
+
+```
+| Task Type  | Model    | Reason                              |
+|------------|----------|-------------------------------------|
+| Scanner    | opus     | Compiler code - highest quality     |
+| Parser     | opus     | Compiler code - complex logic       |
+| Emitter    | opus     | Compiler code - correctness critical|
+| Evolver    | opus     | Spec design - architecture decisions|
+| Architect  | opus     | Redesign - needs deep reasoning     |
+| Judge      | sonnet   | Comparison - thorough but fast      |
+| Reflector  | sonnet   | Diagnosis - good reasoning          |
+| Generator  | sonnet   | Sample generation - quality matters |
+```
+
+**Philosophy**: Use Opus 4.5 for anything that produces code or evolves the spec.
+Use Sonnet for validation, comparison, and generation tasks.
+
+Specify in Task call:
+```
+Task(subagent_type="general-purpose", model="opus", prompt="...")   // For compiler/spec
+Task(subagent_type="general-purpose", model="sonnet", prompt="...")  // For judge/generator
+```
+
+## Quality Principles (CRITICAL)
+
+These principles are **non-negotiable** for TCS execution:
+
+### 1. No Bypassing Issues
+
+```
+❌ WRONG: "Add a try-catch to suppress the error"
+❌ WRONG: "Skip this edge case for now"
+❌ WRONG: "Use 'any' type to fix compilation"
+
+✅ RIGHT: "The error reveals a spec gap - evolve the spec"
+✅ RIGHT: "This edge case needs proper handling in the parser"
+✅ RIGHT: "Define proper types that match the semantic model"
+```
+
+When a test fails or error occurs:
+1. **Understand the root cause** - Why did it fail?
+2. **Trace to spec or architecture** - Is the spec incomplete? Is the design wrong?
+3. **Fix at the source** - Update spec, redesign module, or add proper handling
+4. **Never suppress** - Errors are signals, not noise
+
+### 2. No Isolated Patches
+
+```
+❌ WRONG: "Add special case for this one sample"
+❌ WRONG: "Hardcode this value to make test pass"
+❌ WRONG: "Add if-statement to handle user X's edge case"
+
+✅ RIGHT: "This pattern appears in 3 samples - generalize the solution"
+✅ RIGHT: "Update the grammar to properly handle this construct"
+✅ RIGHT: "Refactor to use a consistent approach across all cases"
+```
+
+When fixing issues:
+1. **Look for patterns** - Is this a one-off or systemic?
+2. **Generalize the fix** - Solution should work for all similar cases
+3. **Update tests** - Add samples that would catch similar issues
+4. **Refactor if needed** - Don't bolt-on, integrate properly
+
+### 3. Production-Ready Code Only
+
+```
+❌ WRONG: "// TODO: handle this later"
+❌ WRONG: "console.log('debug:', value)"
+❌ WRONG: "function parse(x: any): any"
+
+✅ RIGHT: Complete implementation with all cases handled
+✅ RIGHT: Proper error types with descriptive messages
+✅ RIGHT: Full type safety with strict TypeScript
+```
+
+Every compiler module must have:
+- **Strict TypeScript** - `strict: true`, `noUncheckedIndexedAccess: true`
+- **Complete error handling** - All error paths with useful messages
+- **No TODOs** - Every feature fully implemented
+- **No debug code** - Clean, production-ready output
+- **Proper exports** - Well-defined public API
+
+### 4. Architecture Changes When Needed
+
+```
+❌ WRONG: "Add another parameter to this 10-parameter function"
+❌ WRONG: "Nest another if-statement in this 200-line function"
+❌ WRONG: "Keep the broken design, just work around it"
+
+✅ RIGHT: "The scanner needs a state machine - redesign it"
+✅ RIGHT: "Split this into separate passes for clarity"
+✅ RIGHT: "The AST structure doesn't match semantics - restructure"
+```
+
+Signs you need architecture change:
+- **Same bug keeps appearing** in different forms
+- **Adding features is painful** - too many special cases
+- **Code is hard to understand** - even for the builder agent
+- **Tests are brittle** - small changes break many tests
+
+When architecture needs changing:
+1. **Spawn Architect Task** to design new structure
+2. **Rebuild affected modules** from scratch (don't patch)
+3. **Migrate tests** to new structure
+4. **Verify all samples** still pass
+
+### Quality Gate Checklist
+
+Before marking an iteration complete:
+
+```
+[ ] All compiler code compiles with strict TypeScript
+[ ] No 'any' types (except unavoidable external interfaces)
+[ ] No suppressed errors or warnings
+[ ] No TODO comments
+[ ] No console.log/debug statements
+[ ] All error paths have descriptive messages
+[ ] Code handles all spec-defined constructs
+[ ] No isolated patches - all fixes are generalized
+[ ] Architecture supports future spec evolution
+```
+
+### Reflector Task for Quality Issues
+
+When quality problems are detected, spawn a Reflector:
+
+```
+Task: "Review this compiler module for quality issues:
+
+       CODE:
+       {module_code}
+
+       SPEC:
+       {spec}
+
+       Check for:
+       1. Bypassed errors (try-catch hiding issues)
+       2. Isolated patches (special cases that should be generalized)
+       3. Missing type safety
+       4. Incomplete error handling
+       5. Architecture problems
+
+       Return JSON:
+       {
+         'quality_score': 0-100,
+         'issues': [
+           {
+             'severity': 'critical|major|minor',
+             'type': 'bypass|patch|types|errors|architecture',
+             'location': 'line or function name',
+             'problem': 'description',
+             'solution': 'how to fix properly'
+           }
+         ],
+         'needs_redesign': true/false,
+         'redesign_reason': 'why architecture change needed'
+       }"
+```
+
+### Iteration Quality Standards
+
+| Metric | Minimum | Target |
+|--------|---------|--------|
+| TypeScript strict compliance | 100% | 100% |
+| Test pass rate | 90% | 98%+ |
+| Code quality score | 80 | 95+ |
+| Isolated patches | 0 | 0 |
+| Bypassed errors | 0 | 0 |
+| TODO comments | 0 | 0 |
+
+**Do not proceed to next iteration if minimums not met.**