# Phase 3: Extensibility + Guardrails - Autonomous TDD Workflow ## Objective Add multi-language/framework support, enhanced safety guardrails, TUI interface, and extensibility for IDE/editor integration. ## Scope - Multi-language test runner support (pytest, go test, etc.) - Enhanced safety: diff preview, confirmation gates, minimal-change prompts - Optional TUI panel with tmux integration - State-based extension API for IDE integration - Parallel subtask execution (experimental) ## Deliverables ### 1. Multi-Language Test Runner Support **Extend TestRunnerAdapter:** ```typescript class TestRunnerAdapter { // Existing methods... async detectLanguage(): Promise async detectFramework(language: Language): Promise async getFrameworkAdapter(framework: Framework): Promise } enum Language { JavaScript = 'javascript', TypeScript = 'typescript', Python = 'python', Go = 'go', Rust = 'rust' } enum Framework { Vitest = 'vitest', Jest = 'jest', Pytest = 'pytest', GoTest = 'gotest', CargoTest = 'cargotest' } interface FrameworkAdapter { runTargeted(pattern: string): Promise runAll(): Promise parseCoverage(output: string): Promise getTestFilePattern(): string getTestFileExtension(): string } ``` **Framework-specific adapters:** **PytestAdapter** (`packages/tm-core/src/services/test-adapters/pytest-adapter.ts`): ```typescript class PytestAdapter implements FrameworkAdapter { async runTargeted(pattern: string): Promise { const output = await exec(`pytest ${pattern} --json-report`) return this.parseResults(output) } async runAll(): Promise { const output = await exec('pytest --cov --json-report') return this.parseResults(output) } parseCoverage(output: string): Promise { // Parse pytest-cov XML output } getTestFilePattern(): string { return '**/test_*.py' } getTestFileExtension(): string { return '.py' } } ``` **GoTestAdapter** (`packages/tm-core/src/services/test-adapters/gotest-adapter.ts`): ```typescript class GoTestAdapter implements FrameworkAdapter { async runTargeted(pattern: string): Promise { const output = await exec(`go test ${pattern} -json`) return this.parseResults(output) } async runAll(): Promise { const output = await exec('go test ./... -coverprofile=coverage.out -json') return this.parseResults(output) } parseCoverage(output: string): Promise { // Parse go test coverage output } getTestFilePattern(): string { return '**/*_test.go' } getTestFileExtension(): string { return '_test.go' } } ``` **Detection Logic:** ```typescript async function detectFramework(): Promise { // Check for package.json if (await exists('package.json')) { const pkg = await readJSON('package.json') if (pkg.devDependencies?.vitest) return Framework.Vitest if (pkg.devDependencies?.jest) return Framework.Jest } // Check for Python files if (await exists('pytest.ini') || await exists('setup.py')) { return Framework.Pytest } // Check for Go files if (await exists('go.mod')) { return Framework.GoTest } // Check for Rust files if (await exists('Cargo.toml')) { return Framework.CargoTest } throw new Error('Could not detect test framework') } ``` ### 2. Enhanced Safety Guardrails **Diff Preview Mode:** ```bash $ tm autopilot 42 --preview-diffs [2/3] Subtask 42.2: Add collection endpoint RED ✓ Tests created: src/api/__tests__/metrics.test.js GREEN Implementing code... ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Proposed changes (src/api/metrics.js): + import { MetricsSchema } from '../models/schema.js' + + export async function createMetric(data) { + const validated = MetricsSchema.parse(data) + const result = await db.metrics.create(validated) + return result + } ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Apply these changes? [Y/n/e(dit)/s(kip)] Y - Apply and continue n - Reject and retry GREEN phase e - Open in editor for manual changes s - Skip this subtask ``` **Minimal Change Enforcement:** Add to system prompt: ```markdown CRITICAL: Make MINIMAL changes to pass the failing tests. - Only modify files directly related to the subtask - Do not refactor existing code unless absolutely necessary - Do not add features beyond the acceptance criteria - Keep changes under 50 lines per file when possible - Prefer composition over modification ``` **Change Size Warnings:** ```bash ⚠️ Large change detected: Files modified: 5 Lines changed: +234, -12 This subtask was expected to be small (~50 lines). Consider: - Breaking into smaller subtasks - Reviewing acceptance criteria - Checking for unintended changes Continue anyway? [y/N] ``` ### 3. TUI Interface with tmux **Layout:** ``` ┌──────────────────────────────────┬─────────────────────────────────┐ │ Task Navigator (left) │ Executor Terminal (right) │ │ │ │ │ Project: my-app │ $ tm autopilot --executor-mode │ │ Branch: analytics/task-42 │ > Running subtask 42.2 GREEN... │ │ Tag: analytics │ > Implementing endpoint... │ │ │ > Tests: 3 passed, 0 failed │ │ Tasks: │ > Ready to commit │ │ → 42 [in-progress] User metrics │ │ │ → 42.1 [done] Schema │ [Live output from executor] │ │ → 42.2 [active] Endpoint ◀ │ │ │ → 42.3 [pending] Dashboard │ │ │ │ │ │ [s] start [p] pause [q] quit │ │ └──────────────────────────────────┴─────────────────────────────────┘ ``` **Implementation:** **TUI Navigator** (`apps/cli/src/ui/tui/navigator.ts`): ```typescript import blessed from 'blessed' class AutopilotTUI { private screen: blessed.Widgets.Screen private taskList: blessed.Widgets.ListElement private statusBox: blessed.Widgets.BoxElement private executorPane: string // tmux pane ID async start(taskId?: string) { // Create blessed screen this.screen = blessed.screen() // Create task list widget this.taskList = blessed.list({ label: 'Tasks', keys: true, vi: true, style: { selected: { bg: 'blue' } } }) // Spawn tmux pane for executor this.executorPane = await this.spawnExecutorPane() // Watch state file for updates this.watchStateFile() // Handle keybindings this.setupKeybindings() } private async spawnExecutorPane(): Promise { const paneId = await exec('tmux split-window -h -P -F "#{pane_id}"') await exec(`tmux send-keys -t ${paneId} "tm autopilot --executor-mode" Enter`) return paneId.trim() } private watchStateFile() { watch('.taskmaster/state/current-run.json', (event, filename) => { this.updateDisplay() }) } private setupKeybindings() { this.screen.key(['s'], () => this.startTask()) this.screen.key(['p'], () => this.pauseTask()) this.screen.key(['q'], () => this.quit()) this.screen.key(['up', 'down'], () => this.navigateTasks()) } } ``` **Executor Mode:** ```bash $ tm autopilot 42 --executor-mode # Runs in executor pane, writes state to shared file # Left pane reads state file and updates display ``` **State File** (`.taskmaster/state/current-run.json`): ```json { "runId": "2025-01-15-142033", "taskId": "42", "status": "running", "currentPhase": "green", "currentSubtask": "42.2", "lastOutput": "Implementing endpoint...", "testsStatus": { "passed": 3, "failed": 0 } } ``` ### 4. Extension API for IDE Integration **State-based API:** Expose run state via JSON files that IDEs can read: - `.taskmaster/state/current-run.json` - live run state - `.taskmaster/reports/runs//manifest.json` - run metadata - `.taskmaster/reports/runs//log.jsonl` - event stream **WebSocket API (optional):** ```typescript // packages/tm-core/src/services/autopilot-server.ts class AutopilotServer { private wss: WebSocketServer start(port: number = 7890) { this.wss = new WebSocketServer({ port }) this.wss.on('connection', (ws) => { // Send current state ws.send(JSON.stringify(this.getCurrentState())) // Stream events this.orchestrator.on('*', (event) => { ws.send(JSON.stringify(event)) }) }) } } ``` **Usage from IDE extension:** ```typescript // VS Code extension example const ws = new WebSocket('ws://localhost:7890') ws.on('message', (data) => { const event = JSON.parse(data) if (event.type === 'subtask:complete') { vscode.window.showInformationMessage( `Subtask ${event.subtaskId} completed` ) } }) ``` ### 5. Parallel Subtask Execution (Experimental) **Dependency Analysis:** ```typescript class SubtaskScheduler { async buildDependencyGraph(subtasks: Subtask[]): Promise { const graph = new DAG() for (const subtask of subtasks) { graph.addNode(subtask.id) for (const depId of subtask.dependencies) { graph.addEdge(depId, subtask.id) } } return graph } async getParallelBatches(graph: DAG): Promise { const batches: Subtask[][] = [] const completed = new Set() while (completed.size < graph.size()) { const ready = graph.nodes.filter(node => !completed.has(node.id) && node.dependencies.every(dep => completed.has(dep)) ) batches.push(ready) ready.forEach(node => completed.add(node.id)) } return batches } } ``` **Parallel Execution:** ```bash $ tm autopilot 42 --parallel [Batch 1] Running 2 subtasks in parallel: → 42.1: Add metrics schema → 42.4: Add API documentation 42.1 RED ✓ Tests created 42.4 RED ✓ Tests created 42.1 GREEN ✓ Implementation complete 42.4 GREEN ✓ Implementation complete 42.1 COMMIT ✓ Committed: a1b2c3d 42.4 COMMIT ✓ Committed: e5f6g7h [Batch 2] Running 2 subtasks in parallel (depend on 42.1): → 42.2: Add collection endpoint → 42.3: Add dashboard widget ... ``` **Conflict Detection:** ```typescript async function detectConflicts(subtasks: Subtask[]): Promise { const conflicts: Conflict[] = [] for (let i = 0; i < subtasks.length; i++) { for (let j = i + 1; j < subtasks.length; j++) { const filesA = await predictAffectedFiles(subtasks[i]) const filesB = await predictAffectedFiles(subtasks[j]) const overlap = filesA.filter(f => filesB.includes(f)) if (overlap.length > 0) { conflicts.push({ subtasks: [subtasks[i].id, subtasks[j].id], files: overlap }) } } } return conflicts } ``` ### 6. Advanced Configuration **Add to `.taskmaster/config.json`:** ```json { "autopilot": { "safety": { "previewDiffs": false, "maxChangeLinesPerFile": 100, "warnOnLargeChanges": true, "requireConfirmOnLargeChanges": true }, "parallel": { "enabled": false, "maxConcurrent": 3, "detectConflicts": true }, "tui": { "enabled": false, "tmuxSession": "taskmaster-autopilot" }, "api": { "enabled": false, "port": 7890, "allowRemote": false } }, "test": { "frameworks": { "python": { "runner": "pytest", "coverageCommand": "pytest --cov", "testPattern": "**/test_*.py" }, "go": { "runner": "go test", "coverageCommand": "go test ./... -coverprofile=coverage.out", "testPattern": "**/*_test.go" } } } } ``` ## CLI Updates **New commands:** ```bash tm autopilot --tui # Launch TUI interface tm autopilot --parallel # Enable parallel execution tm autopilot --preview-diffs # Show diffs before applying tm autopilot --executor-mode # Run as executor pane tm autopilot-server start # Start WebSocket API ``` ## Success Criteria - Supports Python projects with pytest - Supports Go projects with go test - Diff preview prevents unwanted changes - TUI provides better visibility for long-running tasks - IDE extensions can integrate via state files or WebSocket - Parallel execution reduces total time for independent subtasks ## Out of Scope - Full Electron/web GUI - AI executor selection UI (defer to Phase 4) - Multi-repository support - Remote execution on cloud runners ## Testing Strategy - Test with Python project (pytest) - Test with Go project (go test) - Test diff preview UI with mock changes - Test parallel execution with independent subtasks - Test conflict detection with overlapping file changes - Test TUI with mock tmux environment ## Dependencies - Phase 2 completed (PR + resumability) - tmux installed (for TUI) - blessed or ink library (for TUI rendering) ## Estimated Effort 3-4 weeks ## Risks & Mitigations - **Risk:** Parallel execution causes git conflicts - **Mitigation:** Conservative conflict detection, sequential fallback - **Risk:** TUI adds complexity and maintenance burden - **Mitigation:** Keep TUI optional, state-based design allows alternatives - **Risk:** Framework adapters hard to maintain across versions - **Mitigation:** Abstract common parsing logic, document adapter interface - **Risk:** Diff preview slows down workflow - **Mitigation:** Make optional, use --preview-diffs flag only when needed ## Validation Test with: - Python project with pytest and pytest-cov - Go project with go test - Large changes requiring confirmation - Parallel execution with 3+ independent subtasks - TUI with task selection and live status updates - VS Code extension reading state files