chore: prepare branch
125
.taskmaster/docs/tdd-workflow-phase-0-spike.md
Normal file
@@ -0,0 +1,125 @@
# Phase 0: Spike - Autonomous TDD Workflow

## Objective
Validate feasibility and build foundational understanding before full implementation.

## Scope
- Implement CLI skeleton `tm autopilot` with dry-run mode
- Show planned steps from a real task with subtasks
- Detect test runner from package.json
- Detect git state and render preflight report

## Deliverables

### 1. CLI Command Skeleton
- Create `apps/cli/src/commands/autopilot.command.ts`
- Support `tm autopilot <taskId>` command
- Implement `--dry-run` flag
- Basic help text and usage information

### 2. Preflight Detection System
- Detect test runner from package.json (npm test, pnpm test, etc.)
- Check git working tree state (clean/dirty)
- Validate required tools are available (git, gh, node/npm)
- Detect default branch

### 3. Dry-Run Execution Plan Display
Display planned execution for a task including:
- Preflight checks status
- Branch name that would be created
- Tag that would be set
- List of subtasks in execution order
- For each subtask:
  - RED phase: test file that would be created
  - GREEN phase: implementation files that would be modified
  - COMMIT: commit message that would be used
- Finalization steps: test suite run, coverage check, push, PR creation

### 4. Task Loading & Validation
- Load task from TaskMaster state
- Validate task exists and has subtasks
- If the task has no subtasks, show a message that it must be expanded first
- Show dependency order for subtasks

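The dependency order for subtasks could be computed with a topological sort. The following is a minimal sketch, not TaskMaster's actual implementation: the `SubtaskNode` shape and the `orderSubtasks` name are hypothetical and only illustrate the idea.

```typescript
// Hypothetical shape for illustration; the real TaskMaster subtask type may differ.
interface SubtaskNode {
  id: string;
  dependencies: string[];
}

// Kahn-style topological sort: repeatedly emit the subtasks whose dependencies
// have all been emitted. Throws on unknown dependencies and on cycles.
function orderSubtasks(subtasks: SubtaskNode[]): string[] {
  const remaining = new Map<string, Set<string>>();
  subtasks.forEach((s) => remaining.set(s.id, new Set(s.dependencies)));

  subtasks.forEach((s) =>
    s.dependencies.forEach((dep) => {
      if (!remaining.has(dep)) {
        throw new Error(`Subtask ${s.id} depends on unknown subtask ${dep}`);
      }
    })
  );

  const ordered: string[] = [];
  while (remaining.size > 0) {
    // Map keys iterate in insertion order, so ties keep the input order.
    const ready = Array.from(remaining.keys()).filter(
      (id) => remaining.get(id)!.size === 0
    );
    if (ready.length === 0) {
      throw new Error('Dependency cycle detected among subtasks');
    }
    ready.forEach((id) => {
      ordered.push(id);
      remaining.delete(id);
    });
    remaining.forEach((deps) => ready.forEach((id) => deps.delete(id)));
  }
  return ordered;
}
```

Failing loudly on cycles and unknown IDs fits a dry run, where the goal is to surface problems before anything is executed.
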
## Example Output

```bash
$ tm autopilot 42 --dry-run

Autopilot Plan for Task #42 [analytics]: User metrics tracking
─────────────────────────────────────────────────────────────

Preflight Checks:
  ✓ Working tree is clean
  ✓ Test command detected: npm test
  ✓ Tools available: git, gh, node, npm
  ✓ Current branch: main (will create new branch)
  ✓ Task has 3 subtasks ready to execute

Branch & Tag:
  → Will create branch: analytics/task-42-user-metrics
  → Will set active tag: analytics

Execution Plan (3 subtasks):

  1. Subtask 42.1: Add metrics schema
     RED: Generate tests → src/__tests__/schema.test.js
     GREEN: Implement code → src/schema.js
     COMMIT: "feat(metrics): add metrics schema (task 42.1)"

  2. Subtask 42.2: Add collection endpoint [depends on 42.1]
     RED: Generate tests → src/api/__tests__/metrics.test.js
     GREEN: Implement code → src/api/metrics.js
     COMMIT: "feat(metrics): add collection endpoint (task 42.2)"

  3. Subtask 42.3: Add dashboard widget [depends on 42.2]
     RED: Generate tests → src/components/__tests__/MetricsWidget.test.jsx
     GREEN: Implement code → src/components/MetricsWidget.jsx
     COMMIT: "feat(metrics): add dashboard widget (task 42.3)"

Finalization:
  → Run full test suite with coverage (threshold: 80%)
  → Push branch to origin (will confirm)
  → Create PR targeting main

Estimated commits: 3
Estimated duration: ~20-30 minutes (depends on implementation complexity)

Run without --dry-run to execute.
```

## Success Criteria
- Dry-run output is clear and matches expected workflow
- Preflight detection works correctly on the project repo
- Task loading integrates with existing TaskMaster state
- No actual git operations or file modifications occur in dry-run mode

## Out of Scope
- Actual test generation
- Actual code implementation
- Git operations (branch creation, commits, push)
- PR creation
- Test execution

## Implementation Notes
- Reuse existing `TaskService` from `packages/tm-core`
- Use existing git utilities from `scripts/modules/utils/git-utils.js`
- Load task/subtask data from `.taskmaster/tasks/tasks.json`
- Detect test command via package.json → `scripts.test` field

## Dependencies
- Existing TaskMaster CLI structure
- Existing task storage format
- Git utilities

## Estimated Effort
2-3 days

## Validation
Test dry-run mode with:
- Task with 1 subtask
- Task with multiple subtasks
- Task with dependencies between subtasks
- Task without subtasks (should show warning)
- Dirty git working tree (should warn)
- Missing tools (should error with helpful message)
380
.taskmaster/docs/tdd-workflow-phase-1-core-rails.md
Normal file
@@ -0,0 +1,380 @@
# Phase 1: Core Rails - Autonomous TDD Workflow

## Objective
Implement the core autonomous TDD workflow with safe git operations, test generation/execution, and commit gating.

## Scope
- WorkflowOrchestrator with event stream
- Git and Test adapters
- Subtask loop (RED → GREEN → COMMIT)
- Framework-agnostic test generation using Surgical Test Generator
- Test execution with detected test command
- Commit gating on passing tests and coverage
- Branch/tag mapping
- Run report persistence

## Deliverables

### 1. WorkflowOrchestrator (`packages/tm-core/src/services/workflow-orchestrator.ts`)

**Responsibilities:**
- State machine driving phases: Preflight → BranchSetup → SubtaskLoop → Finalize → Complete
- Event emission for progress tracking
- Coordination of Git, Test, and Executor adapters
- Run state persistence

**API:**
```typescript
class WorkflowOrchestrator {
  async executeTask(taskId: string, options: AutopilotOptions): Promise<RunResult>
  async resume(runId: string): Promise<RunResult>
  on(event: string, handler: (data: any) => void): void

  // Events emitted:
  // - 'phase:start' { phase, timestamp }
  // - 'phase:complete' { phase, status, timestamp }
  // - 'subtask:start' { subtaskId, phase }
  // - 'subtask:complete' { subtaskId, phase, status }
  // - 'test:run' { subtaskId, phase, results }
  // - 'commit:created' { subtaskId, sha, message }
  // - 'error' { phase, error, recoverable }
}
```

**State Machine Phases:**
1. Preflight - validate environment
2. BranchSetup - create branch, set tag
3. SubtaskLoop - for each subtask: RED → GREEN → COMMIT
4. Finalize - full test suite, coverage check
5. Complete - run report, cleanup

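One way to drive these phases is a small transition table walked by a loop that emits `phase:start` and `phase:complete` around each handler. This is a sketch under assumptions: the `NEXT_PHASE` table, the `PhaseEvent` type, and `runPhases` are hypothetical names, and real error handling and persistence are left out.

```typescript
// Phase names follow the "State Machine Phases" list; the transition table is an assumption.
type Phase = 'Preflight' | 'BranchSetup' | 'SubtaskLoop' | 'Finalize' | 'Complete';

const NEXT_PHASE: Record<Phase, Phase | null> = {
  Preflight: 'BranchSetup',
  BranchSetup: 'SubtaskLoop',
  SubtaskLoop: 'Finalize',
  Finalize: 'Complete',
  Complete: null,
};

interface PhaseEvent {
  type: 'phase:start' | 'phase:complete';
  phase: Phase;
  timestamp: string;
}

// Hypothetical runner: executes each phase handler in order, starting at `start`
// (which allows resuming mid-run). A handler that throws stops the walk, and the
// rejection propagates to the caller.
async function runPhases(
  handlers: Record<Phase, () => Promise<void>>,
  emit: (event: PhaseEvent) => void,
  start: Phase = 'Preflight'
): Promise<void> {
  let phase: Phase | null = start;
  while (phase !== null) {
    emit({ type: 'phase:start', phase, timestamp: new Date().toISOString() });
    await handlers[phase]();
    emit({ type: 'phase:complete', phase, timestamp: new Date().toISOString() });
    phase = NEXT_PHASE[phase];
  }
}
```

Keeping the transitions in data rather than in nested calls makes it straightforward to start from a persisted phase when a run is resumed.
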
### 2. GitAdapter (`packages/tm-core/src/services/git-adapter.ts`)

**Responsibilities:**
- All git operations with safety checks
- Branch name generation from tag/task
- Confirmation gates for destructive operations

**API:**
```typescript
class GitAdapter {
  async isWorkingTreeClean(): Promise<boolean>
  async getCurrentBranch(): Promise<string>
  async getDefaultBranch(): Promise<string>
  async createBranch(name: string): Promise<void>
  async checkoutBranch(name: string): Promise<void>
  async commit(message: string, files?: string[]): Promise<string>
  async push(branch: string, remote?: string): Promise<void>

  // Safety checks
  async assertNotOnDefaultBranch(): Promise<void>
  async assertCleanOrConfirm(): Promise<void>

  // Branch naming
  generateBranchName(tag: string, taskId: string, slug: string): string
}
```

**Guardrails:**
- Never allow commits on the default branch
- Always check the working tree before branch creation
- Confirm destructive operations unless the `--no-confirm` flag is set

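The guardrails can be expressed as pure checks over a snapshot of git state, which keeps them easy to unit test without a repository. This is only a sketch: `GitState`, `GitGuardError`, and both function names are hypothetical and are not part of the `GitAdapter` API listed above.

```typescript
// Hypothetical snapshot of the values the adapter would read from git.
interface GitState {
  currentBranch: string;
  defaultBranch: string;
  workingTreeClean: boolean;
}

// Hypothetical error type so callers can tell guardrail failures from other errors.
class GitGuardError extends Error {}

// Guardrail: never allow commits on the default branch.
function assertCommitAllowed(state: GitState): void {
  if (state.currentBranch === state.defaultBranch) {
    throw new GitGuardError(
      `Refusing to commit on default branch "${state.defaultBranch}"`
    );
  }
}

// Guardrail: check the working tree before branch creation.
// `confirmed` stands for either an interactive "yes" or the --no-confirm flag.
function assertBranchCreationAllowed(state: GitState, confirmed: boolean): void {
  if (!state.workingTreeClean && !confirmed) {
    throw new GitGuardError(
      'Working tree has uncommitted changes; commit, stash, or confirm to continue'
    );
  }
}
```
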
### 3. TestRunnerAdapter (`packages/tm-core/src/services/test-runner-adapter.ts`)

**Responsibilities:**
- Detect test command from package.json
- Execute tests (targeted and full suite)
- Parse test results and coverage
- Enforce coverage thresholds

**API:**
```typescript
class TestRunnerAdapter {
  async detectTestCommand(): Promise<string>
  async runTargeted(pattern: string): Promise<TestResults>
  async runAll(): Promise<TestResults>
  async getCoverage(): Promise<CoverageReport>
  async meetsThresholds(coverage: CoverageReport): Promise<boolean>
}

interface TestResults {
  exitCode: number
  duration: number
  summary: {
    total: number
    passed: number
    failed: number
    skipped: number
  }
  failures: Array<{
    test: string
    error: string
    stack?: string
  }>
}

// All values are percentages, compared against config.test.coverageThresholds
interface CoverageReport {
  lines: number
  branches: number
  functions: number
  statements: number
}
```

**Detection Logic:**
- Check package.json → `scripts.test`
- Support: npm test, pnpm test, yarn test, bun test
- Fall back to explicit command from config

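The detection logic could look roughly like the sketch below. Two things here are assumptions rather than part of the plan: the package manager is inferred from whichever lockfile is present, and the placeholder script that `npm init` generates is treated as "no test command". `detectTestCommand` is written as a pure function over already-loaded data so it stays testable.

```typescript
type PackageManager = 'npm' | 'pnpm' | 'yarn' | 'bun';

interface PackageJsonLike {
  scripts?: { [name: string]: string };
}

// Assumption: the package manager is inferred from the lockfile in the project root.
const LOCKFILES: Array<[string, PackageManager]> = [
  ['pnpm-lock.yaml', 'pnpm'],
  ['yarn.lock', 'yarn'],
  ['bun.lockb', 'bun'],
  ['package-lock.json', 'npm'],
];

function detectTestCommand(
  pkg: PackageJsonLike,
  rootFiles: string[],
  configuredCommand?: string
): string {
  const script = pkg.scripts ? pkg.scripts.test : undefined;
  // `npm init` writes a placeholder that only prints an error; treat it as missing.
  const usable =
    script !== undefined &&
    script.trim() !== '' &&
    script.indexOf('no test specified') === -1;

  if (!usable) {
    // Fall back to the explicit command from config, if any.
    if (configuredCommand) return configuredCommand;
    throw new Error(
      'No test command found: add scripts.test to package.json or set one in config'
    );
  }

  for (const [file, manager] of LOCKFILES) {
    if (rootFiles.indexOf(file) !== -1) return `${manager} test`;
  }
  return 'npm test';
}
```
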
### 4. Test Generation Integration

**Use Surgical Test Generator:**
- Load prompt from `.claude/agents/surgical-test-generator.md`
- Compose with task/subtask context
- Generate tests via executor (Claude)
- Write test files to detected locations

**Prompt Composition:**
```typescript
async function composeRedPrompt(subtask: Subtask, context: ProjectContext): Promise<string> {
  // Awaiting covers an asynchronous loadFile and is harmless if it is synchronous
  const systemPrompts = await Promise.all([
    loadFile('.cursor/rules/git_workflow.mdc'),
    loadFile('.cursor/rules/test_workflow.mdc'),
    loadFile('.claude/agents/surgical-test-generator.md')
  ])

  const taskContext = formatTaskContext(subtask)
  const instruction = formatRedInstruction(subtask, context)

  // Join the sections with blank lines so each block stays visually distinct
  return [
    ...systemPrompts,
    '<TASK CONTEXT>',
    taskContext,
    '<INSTRUCTION>',
    instruction
  ].join('\n\n')
}
```

### 5. Subtask Loop Implementation

**RED Phase:**
1. Compose test generation prompt with subtask context
2. Execute via Claude executor
3. Parse generated test file paths and code
4. Write test files to filesystem
5. Run tests to confirm they fail (red state)
6. Store test results in run artifacts
7. If tests pass unexpectedly, warn and skip to next subtask

**GREEN Phase:**
1. Compose implementation prompt with test failures
2. Execute via Claude executor with max attempts (default: 3)
3. Parse implementation changes
4. Apply changes to filesystem
5. Run tests to verify passing (green state)
6. If tests still fail after max attempts:
   - Save current state
   - Emit pause event
   - Return resumable checkpoint
7. If tests pass, proceed to COMMIT

**COMMIT Phase:**
1. Verify all tests pass
2. Check coverage meets thresholds (if enabled)
3. Generate conventional commit message
4. Stage test files + implementation files
5. Commit with message
6. Update subtask status to 'done'
7. Emit commit event with SHA
8. Continue to next subtask

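The GREEN phase's bounded retry can be sketched as a loop that either reaches a green state or returns a paused outcome once attempts run out. The `AttemptResult` and `GreenOutcome` types and the `implementAndTest` callback are hypothetical stand-ins for the executor and test runner; only the default of 3 attempts comes from the steps above.

```typescript
// Hypothetical result of one "implement, then run tests" attempt.
interface AttemptResult {
  passed: number;
  failed: number;
}

type GreenOutcome =
  | { status: 'green'; attempts: number }
  | { status: 'paused'; attempts: number; reason: 'max_attempts_reached' };

// Tries up to `maxAttempts` times. Green requires at least one passing test and no
// failures, so an empty test run is not mistaken for success (an assumption).
async function runGreenPhase(
  implementAndTest: (attempt: number) => Promise<AttemptResult>,
  maxAttempts: number = 3
): Promise<GreenOutcome> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const result = await implementAndTest(attempt);
    if (result.failed === 0 && result.passed > 0) {
      return { status: 'green', attempts: attempt };
    }
  }
  // The caller would save state, emit the pause event, and return a checkpoint.
  return { status: 'paused', attempts: maxAttempts, reason: 'max_attempts_reached' };
}
```

Returning a value for the paused case, instead of throwing, keeps "could not reach green" distinct from unexpected errors.
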
### 6. Branch & Tag Management

**Integration with existing tag system:**
- Use `scripts/modules/task-manager/tag-management.js`
- Switch tags explicitly when the branch is created
- Store branch ↔ tag mapping in run state

**Branch Naming:**
- Pattern from config: `{tag}/task-{id}-{slug}`
- Example: `analytics/task-42-user-metrics`
- Sanitize: lowercase, replace spaces with hyphens

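A sketch of how the branch name could be generated from the pattern. The plan only specifies lowercasing and replacing spaces with hyphens; dropping other characters and collapsing repeated hyphens are extra assumptions, and `slugify` is a hypothetical helper.

```typescript
// Lowercase and hyphenate as described above. Removing characters outside
// [a-z0-9-] and collapsing repeated hyphens are additional assumptions.
function slugify(text: string): string {
  return text
    .toLowerCase()
    .trim()
    .replace(/\s+/g, '-')
    .replace(/[^a-z0-9-]/g, '')
    .replace(/-+/g, '-')
    .replace(/^-|-$/g, '');
}

// Fills the `{tag}`, `{id}`, and `{slug}` placeholders of the configured pattern.
// Function replacers avoid special handling of `$` in replacement strings.
function generateBranchName(
  tag: string,
  taskId: string,
  slug: string,
  pattern: string = '{tag}/task-{id}-{slug}'
): string {
  return pattern
    .replace('{tag}', () => slugify(tag))
    .replace('{id}', () => taskId)
    .replace('{slug}', () => slugify(slug));
}
```

For example, `generateBranchName('analytics', '42', 'User Metrics')` yields `analytics/task-42-user-metrics`.
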
### 7. Run Artifacts & State Persistence

**Directory structure:**
```
.taskmaster/reports/runs/<run-id>/
├── manifest.json          # run metadata
├── log.jsonl              # event stream
├── commits.txt            # commit SHAs
├── test-results/
│   ├── subtask-42.1-red.json
│   ├── subtask-42.1-green.json
│   ├── subtask-42.2-red.json
│   ├── subtask-42.2-green-attempt1.json
│   ├── subtask-42.2-green-attempt2.json
│   └── final-suite.json
└── state.json             # resumable checkpoint
```

**manifest.json:**
```json
{
  "runId": "2025-01-15-142033",
  "taskId": "42",
  "tag": "analytics",
  "branch": "analytics/task-42-user-metrics",
  "startTime": "2025-01-15T14:20:33Z",
  "endTime": null,
  "status": "in-progress",
  "currentPhase": "subtask-loop",
  "currentSubtask": "42.2",
  "subtasksCompleted": ["42.1"],
  "subtasksFailed": [],
  "totalCommits": 1
}
```

**log.jsonl** (append-only event log):
```jsonl
{"ts":"2025-01-15T14:20:33Z","event":"phase:start","phase":"preflight","status":"ok"}
{"ts":"2025-01-15T14:21:00Z","event":"subtask:start","subtask":"42.1","phase":"red"}
{"ts":"2025-01-15T14:22:00Z","event":"test:run","subtask":"42.1","phase":"red","results":{"passed":0,"failed":3}}
{"ts":"2025-01-15T14:23:00Z","event":"subtask:start","subtask":"42.1","phase":"green"}
{"ts":"2025-01-15T14:24:30Z","event":"test:run","subtask":"42.1","phase":"green","attempt":1,"results":{"passed":3,"failed":0}}
{"ts":"2025-01-15T14:24:35Z","event":"commit:created","subtask":"42.1","sha":"a1b2c3d","message":"feat(metrics): add metrics schema (task 42.1)"}
```

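Because `log.jsonl` is append-only with one JSON object per line, writing and reading it needs very little code. The sketch below assumes a hypothetical `RunEvent` shape with the `ts` and `event` fields from the sample log; it covers only serialization and parsing, not file access.

```typescript
// Minimal event shape based on the sample log; extra fields are allowed.
interface RunEvent {
  ts: string;
  event: string;
  [key: string]: unknown;
}

// One event becomes exactly one line. JSON.stringify escapes newlines inside
// strings, so a serialized event can never span multiple lines.
function toJsonlLine(event: RunEvent): string {
  return JSON.stringify(event) + '\n';
}

// Parses a whole log, skipping blank lines such as the one after the final newline.
function parseJsonl(content: string): RunEvent[] {
  return content
    .split('\n')
    .filter((line) => line.trim() !== '')
    .map((line) => JSON.parse(line) as RunEvent);
}
```

Each line could then be appended with Node's `fs.promises.appendFile`, which preserves the existing log if a run is interrupted.
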
### 8. CLI Command Implementation

**Update `tm autopilot` command:**
- Remove the `--dry-run`-only behavior
- Execute the actual workflow when the flag is not present
- Add progress reporting via orchestrator events
- Support `--no-confirm` for CI/automation
- Support `--max-attempts` to override the default number of GREEN attempts

**Real-time output:**
```bash
$ tm autopilot 42

🚀 Starting autopilot for Task #42 [analytics]: User metrics tracking

✓ Preflight checks passed
✓ Created branch: analytics/task-42-user-metrics
✓ Set active tag: analytics

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

[1/3] Subtask 42.1: Add metrics schema

  RED     Generating tests... ⏳
  RED     ✓ Tests created: src/__tests__/schema.test.js
  RED     ✓ Tests failing: 3 failed, 0 passed

  GREEN   Implementing code... ⏳
  GREEN   ✓ Tests passing: 3 passed, 0 failed (attempt 1)

  COMMIT  ✓ Committed: a1b2c3d
          "feat(metrics): add metrics schema (task 42.1)"

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

[2/3] Subtask 42.2: Add collection endpoint
...
```

## Success Criteria
- Can execute a simple task end-to-end without manual intervention
- All commits are made on the feature branch, never on the default branch
- Tests are generated before implementation (RED → GREEN order enforced)
- Commits happen only when tests pass and coverage meets the threshold
- Run state is persisted and can be inspected post-run
- Failures produce clear error messages that identify the phase that failed
- Orchestrator events allow the CLI to show live progress

## Configuration

**Add to `.taskmaster/config.json`:**
```json
{
  "autopilot": {
    "enabled": true,
    "requireCleanWorkingTree": true,
    "commitTemplate": "{type}({scope}): {msg}",
    "defaultCommitType": "feat",
    "maxGreenAttempts": 3,
    "testTimeout": 300000
  },
  "test": {
    "runner": "auto",
    "coverageThresholds": {
      "lines": 80,
      "branches": 80,
      "functions": 80,
      "statements": 80
    },
    "targetedRunPattern": "**/*.test.js"
  },
  "git": {
    "branchPattern": "{tag}/task-{id}-{slug}",
    "defaultRemote": "origin"
  }
}
```

## Out of Scope (defer to Phase 2)
- PR creation (gh integration)
- Resume functionality (`--resume` flag)
- Lint/format step
- Multiple executor support (only Claude)

## Implementation Order
1. GitAdapter with safety checks
2. TestRunnerAdapter with detection logic
3. WorkflowOrchestrator state machine skeleton
4. RED phase: test generation integration
5. GREEN phase: implementation with retry logic
6. COMMIT phase: gating and persistence
7. CLI command wiring with event handling
8. Run artifacts and logging

## Testing Strategy
- Unit tests for each adapter (mock git/test commands)
- Integration tests with real git repo (temporary directory)
- End-to-end test with sample task in test project
- Verify no commits on default branch (security test)
- Verify commit gating works (force test failure, ensure no commit)

## Dependencies
- Phase 0 completed (CLI skeleton, preflight checks)
- Existing TaskService and executor infrastructure
- Surgical Test Generator prompt file exists

## Estimated Effort
2-3 weeks

## Risks & Mitigations
- **Risk:** Test generation produces invalid/wrong tests
  - **Mitigation:** Use Surgical Test Generator prompt, add manual review step in early iterations

- **Risk:** Implementation attempts timeout/fail repeatedly
  - **Mitigation:** Max attempts with pause/resume; store state for manual intervention

- **Risk:** Coverage parsing fails on different test frameworks
  - **Mitigation:** Start with one framework (vitest), add parsers incrementally

- **Risk:** Git operations fail (conflicts, permissions)
  - **Mitigation:** Detailed error messages, save state before destructive ops

## Validation
Test with:
- Simple task (1 subtask, clear requirements)
- Medium task (3 subtasks with dependencies)
- Task requiring multiple GREEN attempts
- Task with dirty working tree (should error)
- Task on default branch (should error)
- Project without test command (should error with helpful message)
433
.taskmaster/docs/tdd-workflow-phase-2-pr-resumability.md
Normal file
@@ -0,0 +1,433 @@
# Phase 2: PR + Resumability - Autonomous TDD Workflow

## Objective
Add PR creation with GitHub CLI integration, resumable checkpoints for interrupted runs, and enhanced guardrails with coverage enforcement.

## Scope
- GitHub PR creation via `gh` CLI
- Well-formed PR body using run report
- Resumable checkpoints and `--resume` flag
- Coverage enforcement before finalization
- Optional lint/format step
- Enhanced error recovery

## Deliverables

### 1. PR Creation Integration

**PRAdapter** (`packages/tm-core/src/services/pr-adapter.ts`):
```typescript
class PRAdapter {
  async isGHAvailable(): Promise<boolean>
  async createPR(options: PROptions): Promise<PRResult>
  async getPRTemplate(runReport: RunReport): Promise<string>

  // Fallback for missing gh CLI
  async getManualPRInstructions(options: PROptions): Promise<string>
}

interface PROptions {
  branch: string
  base: string
  title: string
  body: string
  draft?: boolean
}

interface PRResult {
  url: string
  number: number
}
```

**PR Title Format:**
```
Task #<id> [<tag>]: <title>
```

Example: `Task #42 [analytics]: User metrics tracking`

**PR Body Template:**

Located at `.taskmaster/templates/pr-body.md`:

```markdown
## Summary

Implements Task #42 from TaskMaster autonomous workflow.

**Branch:** {branch}
**Tag:** {tag}
**Subtasks completed:** {subtaskCount}

{taskDescription}

## Subtasks

{subtasksList}

## Test Coverage

| Metric | Coverage |
|--------|----------|
| Lines | {lines}% |
| Branches | {branches}% |
| Functions | {functions}% |
| Statements | {statements}% |

**All subtasks passed with {totalTests} tests.**

## Commits

{commitsList}

## Run Report

Full execution report: `.taskmaster/reports/runs/{runId}/`

---

🤖 Generated with [Task Master](https://github.com/cline/task-master) autonomous TDD workflow
```

**Token replacement:**
- `{branch}` → branch name
- `{tag}` → active tag
- `{subtaskCount}` → number of completed subtasks
- `{taskDescription}` → task description from TaskMaster
- `{subtasksList}` → markdown list of subtask titles
- `{lines}`, `{branches}`, `{functions}`, `{statements}` → coverage percentages
- `{totalTests}` → total test count
- `{commitsList}` → markdown list of commit SHAs and messages
- `{runId}` → run ID timestamp

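Token replacement of this kind can be done with a single regular-expression pass. In this sketch, `renderTemplate` is a hypothetical helper name, and leaving unknown tokens untouched (rather than replacing them with an empty string) is an assumption chosen so that missing data stays visible in the rendered PR body.

```typescript
// Replaces every `{token}` whose name is a key of `values`.
// Unknown tokens are left as-is so gaps are easy to spot in the output.
function renderTemplate(
  template: string,
  values: { [token: string]: string | number }
): string {
  return template.replace(/\{(\w+)\}/g, (match: string, key: string) =>
    Object.prototype.hasOwnProperty.call(values, key) ? String(values[key]) : match
  );
}
```

Using a replacer function means values containing `$` are inserted literally instead of being interpreted as replacement patterns.
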
### 2. GitHub CLI Integration

**Detection:**
```bash
which gh
```

If not found, show fallback instructions:
```bash
✓ Branch pushed: analytics/task-42-user-metrics
✗ gh CLI not found - cannot create PR automatically

To create the PR manually, install gh and run:
  gh pr create \
    --base main \
    --head analytics/task-42-user-metrics \
    --title "Task #42 [analytics]: User metrics tracking" \
    --body-file .taskmaster/reports/runs/2025-01-15-142033/pr.md

Or visit:
  https://github.com/org/repo/compare/main...analytics/task-42-user-metrics
```

**Confirmation gate:**
```bash
Ready to create PR:
  Title: Task #42 [analytics]: User metrics tracking
  Base: main
  Head: analytics/task-42-user-metrics

Create PR? [Y/n]
```

The confirmation prompt is skipped when the `--no-confirm` flag is set.

### 3. Resumable Workflow

**State Checkpoint** (`state.json`):
```json
{
  "runId": "2025-01-15-142033",
  "taskId": "42",
  "phase": "subtask-loop",
  "currentSubtask": "42.2",
  "currentPhase": "green",
  "attempts": 3,
  "completedSubtasks": ["42.1"],
  "commits": ["a1b2c3d"],
  "branch": "analytics/task-42-user-metrics",
  "tag": "analytics",
  "canResume": true,
  "pausedAt": "2025-01-15T14:25:35Z",
  "pausedReason": "max_attempts_reached",
  "nextAction": "manual_review_required"
}
```

**Resume Command:**
```bash
$ tm autopilot --resume

Resuming run: 2025-01-15-142033
  Task: #42 [analytics] User metrics tracking
  Branch: analytics/task-42-user-metrics
  Last subtask: 42.2 (GREEN phase, attempt 3/3 failed)
  Paused: 5 minutes ago

  Reason: Could not achieve green state after 3 attempts
  Last error: POST /api/metrics returns 500 instead of 201

Resume from subtask 42.2 GREEN phase? [Y/n]
```

**Resume logic:**
1. Load state from `.taskmaster/reports/runs/<runId>/state.json`
2. Verify branch still exists and is checked out
3. Verify no uncommitted changes (unless `--force`)
4. Continue from last checkpoint phase
5. Update state file as execution progresses

**Multiple interrupted runs:**
```bash
$ tm autopilot --resume

Found 2 resumable runs:
  1. 2025-01-15-142033 - Task #42 (paused 5 min ago at subtask 42.2 GREEN)
  2. 2025-01-14-103022 - Task #38 (paused 2 hours ago at subtask 38.3 RED)

Select run to resume [1-2]:
```

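Loading a checkpoint safely means validating it before trusting it, and the selection list needs the resumable runs sorted by how recently they paused. The sketch below uses field names from the `state.json` example, but the validation rules and both function names (`validateCheckpoint`, `listResumable`) are assumptions.

```typescript
// Subset of the state.json fields needed to offer a run for resumption.
interface Checkpoint {
  runId: string;
  taskId: string;
  currentSubtask: string;
  currentPhase: string;
  branch: string;
  canResume: boolean;
  pausedAt: string; // ISO 8601 timestamp
}

// Validates parsed JSON before it is used; throws with the offending field name.
function validateCheckpoint(raw: unknown): Checkpoint {
  if (typeof raw !== 'object' || raw === null) {
    throw new Error('Checkpoint must be a JSON object');
  }
  const record = raw as { [key: string]: unknown };
  const stringFields = ['runId', 'taskId', 'currentSubtask', 'currentPhase', 'branch', 'pausedAt'];
  for (const field of stringFields) {
    if (typeof record[field] !== 'string') {
      throw new Error(`Checkpoint field "${field}" must be a string`);
    }
  }
  if (typeof record.canResume !== 'boolean') {
    throw new Error('Checkpoint field "canResume" must be a boolean');
  }
  if (isNaN(Date.parse(record.pausedAt as string))) {
    throw new Error('Checkpoint field "pausedAt" is not a valid timestamp');
  }
  return record as unknown as Checkpoint;
}

// Resumable runs only, most recently paused first (the order shown in the prompt).
function listResumable(checkpoints: Checkpoint[]): Checkpoint[] {
  return checkpoints
    .filter((c) => c.canResume)
    .sort((a, b) => Date.parse(b.pausedAt) - Date.parse(a.pausedAt));
}
```
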
### 4. Coverage Enforcement

**Coverage Check Phase** (before finalization):
```typescript
// Assumes `testRunner`, `config`, `CoverageError`, and `storeRunArtifact` are in scope
async function enforceCoverage(runId: string): Promise<void> {
  // Run the full suite so the coverage data reflects the final state of the branch
  await testRunner.runAll()
  const coverage = await testRunner.getCoverage()

  const thresholds = config.test.coverageThresholds
  const failures: string[] = []

  // Check every metric so the error reports all shortfalls at once
  const metrics = ['lines', 'branches', 'functions', 'statements'] as const
  for (const metric of metrics) {
    if (coverage[metric] < thresholds[metric]) {
      const label = metric[0].toUpperCase() + metric.slice(1)
      failures.push(`${label}: ${coverage[metric]}% < ${thresholds[metric]}%`)
    }
  }

  if (failures.length > 0) {
    throw new CoverageError(
      `Coverage thresholds not met:\n${failures.join('\n')}`
    )
  }

  // Store coverage in run report
  await storeRunArtifact(runId, 'coverage.json', coverage)
}
```

**Handling coverage failures:**
```bash
⚠️ Coverage check failed:
  Lines: 78.5% < 80%
  Branches: 75.0% < 80%

Options:
  1. Add more tests and resume
  2. Lower thresholds in .taskmaster/config.json
  3. Skip coverage check: tm autopilot --resume --skip-coverage

Run paused. Fix coverage and resume with:
  tm autopilot --resume
```

### 5. Optional Lint/Format Step

**Configuration:**
```json
{
  "autopilot": {
    "finalization": {
      "lint": {
        "enabled": true,
        "command": "npm run lint",
        "fix": true,
        "failOnError": false
      },
      "format": {
        "enabled": true,
        "command": "npm run format",
        "commitChanges": true
      }
    }
  }
}
```

**Execution:**
```bash
Finalization Steps:

  ✓ All tests passing (12 tests, 0 failures)
  ✓ Coverage thresholds met (85% lines, 82% branches)

  LINT    Running linter... ⏳
  LINT    ✓ No lint errors

  FORMAT  Running formatter... ⏳
  FORMAT  ✓ Formatted 3 files
  FORMAT  ✓ Committed formatting changes: "chore: auto-format code"

  PUSH    Pushing to origin... ⏳
  PUSH    ✓ Pushed analytics/task-42-user-metrics

  PR      Creating pull request... ⏳
  PR      ✓ Created PR #123
          https://github.com/org/repo/pull/123
```

### 6. Enhanced Error Recovery

**Pause Points:**
- Max GREEN attempts reached (existing since Phase 1)
- Coverage check failed (new)
- Lint errors (if `failOnError: true`)
- Git push failed (new)
- PR creation failed (new)

**Each pause saves:**
- Full state checkpoint
- Last command output
- Suggested next actions
- Resume instructions

**Automatic recovery attempts:**
- Git push: retry up to 3 times with backoff
- PR creation: fall back to manual instructions
- Lint: auto-fix if enabled, otherwise pause

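The push retry could be built on a generic helper like the one sketched here. Several details are assumptions: `retries` is read as the total number of attempts, the backoff is exponential (the plan only says "with backoff"), and the injectable `sleep` exists purely to keep tests fast. `retryWithBackoff` and `RetryOptions` are hypothetical names.

```typescript
interface RetryOptions {
  retries: number; // total attempts, e.g. git.pushRetries
  delayMs: number; // base delay, e.g. git.pushRetryDelay
  sleep?: (ms: number) => Promise<void>; // injectable so tests need not wait
}

function defaultSleep(ms: number): Promise<void> {
  return new Promise<void>((resolve) => setTimeout(resolve, ms));
}

// Runs `operation` until it succeeds or the attempts are exhausted, then rethrows
// the last error so the caller can pause the run and save a checkpoint.
async function retryWithBackoff<T>(
  operation: () => Promise<T>,
  options: RetryOptions
): Promise<T> {
  const sleep = options.sleep || defaultSleep;
  const attempts = Math.max(1, options.retries);
  let lastError: unknown;
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      if (attempt < attempts) {
        // Exponential backoff (an assumption): the delay doubles after each failure.
        await sleep(options.delayMs * Math.pow(2, attempt - 1));
      }
    }
  }
  throw lastError;
}
```

With `retries: 3` and `delayMs: 5000`, a push that fails twice and then succeeds would wait 5 s and then 10 s between attempts.
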
### 7. Finalization Phase Enhancement

**Updated workflow:**
1. Run full test suite
2. Check coverage thresholds → pause if failed
3. Run lint (if enabled) → pause if failed and `failOnError: true`
4. Run format (if enabled) → auto-commit changes
5. Confirm push (unless `--no-confirm`)
6. Push branch → retry on failure
7. Generate PR body from template
8. Create PR via gh → fall back to manual instructions
9. Update task status to 'review' (configurable)
10. Save final run report

**Final output:**
```bash
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

✅ Task #42 [analytics]: User metrics tracking - COMPLETE

  Branch: analytics/task-42-user-metrics
  Subtasks completed: 3/3
  Commits: 3
  Total tests: 12 (12 passed, 0 failed)
  Coverage: 85% lines, 82% branches, 88% functions, 85% statements

  PR #123: https://github.com/org/repo/pull/123

  Run report: .taskmaster/reports/runs/2025-01-15-142033/

Next steps:
  - Review PR and request changes if needed
  - Merge when ready
  - Task status updated to 'review'

Completed in 24 minutes
```

## CLI Updates

**New flags:**
- `--resume` → Resume from last checkpoint
- `--skip-coverage` → Skip coverage checks
- `--skip-lint` → Skip lint step
- `--skip-format` → Skip format step
- `--skip-pr` → Push branch but don't create PR
- `--draft-pr` → Create draft PR instead of ready-for-review

## Configuration Updates

**Add to `.taskmaster/config.json`:**
```json
{
  "autopilot": {
    "finalization": {
      "lint": {
        "enabled": false,
        "command": "npm run lint",
        "fix": true,
        "failOnError": false
      },
      "format": {
        "enabled": false,
        "command": "npm run format",
        "commitChanges": true
      },
      "updateTaskStatus": "review"
    }
  },
  "git": {
    "pr": {
      "enabled": true,
      "base": "default",
      "bodyTemplate": ".taskmaster/templates/pr-body.md",
      "draft": false
    },
    "pushRetries": 3,
    "pushRetryDelay": 5000
  }
}
```

## Success Criteria
- Can create PR automatically with well-formed body
- Can resume interrupted runs from any checkpoint
- Coverage checks block finalization (push and PR creation) when thresholds are not met
- Clear error messages and recovery paths for all failure modes
- Run reports include full PR context for review

## Out of Scope (defer to Phase 3)
- Multiple test framework support (pytest, go test)
- Diff preview before commits
- TUI panel implementation
- Extension/IDE integration

## Testing Strategy
- Mock `gh` CLI for PR creation tests
- Test resume from each possible pause point
- Test coverage failure scenarios
- Test lint/format integration with mock commands
- End-to-end test with PR creation on test repo

## Dependencies
- Phase 1 completed (core workflow)
- GitHub CLI (`gh`) installed (optional, fallback provided)
- Test framework supports coverage output

## Estimated Effort
1-2 weeks

## Risks & Mitigations
- **Risk:** GitHub CLI auth issues
  - **Mitigation:** Clear auth setup docs, fallback to manual instructions

- **Risk:** PR body template doesn't match all project needs
  - **Mitigation:** Make template customizable via config path

- **Risk:** Resume state gets corrupted
  - **Mitigation:** Validate state on load, provide `--force-reset` option

- **Risk:** Coverage calculation differs between runs
  - **Mitigation:** Store coverage with each test run for comparison

## Validation
Test with:
- Successful PR creation end-to-end
- Resume from GREEN attempt failure
- Resume from coverage failure
- Resume from lint failure
- Missing `gh` CLI (fallback to manual)
- Lint/format integration enabled
- Multiple interrupted runs (selection UI)
@@ -0,0 +1,534 @@
# Phase 3: Extensibility + Guardrails - Autonomous TDD Workflow

## Objective
Add multi-language/framework support, enhanced safety guardrails, TUI interface, and extensibility for IDE/editor integration.

## Scope
- Multi-language test runner support (pytest, go test, etc.)
- Enhanced safety: diff preview, confirmation gates, minimal-change prompts
- Optional TUI panel with tmux integration
- State-based extension API for IDE integration
- Parallel subtask execution (experimental)

## Deliverables

### 1. Multi-Language Test Runner Support

**Extend TestRunnerAdapter:**
```typescript
// Signatures only; method bodies are omitted in this sketch
declare class TestRunnerAdapter {
  // Existing methods...

  detectLanguage(): Promise<Language>
  detectFramework(language: Language): Promise<Framework>
  getFrameworkAdapter(framework: Framework): Promise<FrameworkAdapter>
}

enum Language {
  JavaScript = 'javascript',
  TypeScript = 'typescript',
  Python = 'python',
  Go = 'go',
  Rust = 'rust'
}

enum Framework {
  Vitest = 'vitest',
  Jest = 'jest',
  Pytest = 'pytest',
  GoTest = 'gotest',
  CargoTest = 'cargotest'
}

interface FrameworkAdapter {
  runTargeted(pattern: string): Promise<TestResults>
  runAll(): Promise<TestResults>
  parseCoverage(output: string): Promise<CoverageReport>
  getTestFilePattern(): string
  getTestFileExtension(): string
}
```

**Framework-specific adapters:**

**PytestAdapter** (`packages/tm-core/src/services/test-adapters/pytest-adapter.ts`):
```typescript
class PytestAdapter implements FrameworkAdapter {
  async runTargeted(pattern: string): Promise<TestResults> {
    // --json-report comes from the pytest-json-report plugin
    const output = await exec(`pytest ${pattern} --json-report`)
    return this.parseResults(output)
  }

  async runAll(): Promise<TestResults> {
    // --cov-report=xml makes pytest-cov write the XML that parseCoverage reads
    const output = await exec('pytest --cov --cov-report=xml --json-report')
    return this.parseResults(output)
  }

  async parseCoverage(output: string): Promise<CoverageReport> {
    // Parse pytest-cov XML output
    throw new Error('Not implemented')
  }

  getTestFilePattern(): string {
    return '**/test_*.py'
  }

  getTestFileExtension(): string {
    return '.py'
  }
}
```

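The `parseCoverage` body above is left open. As a rough sketch, assuming the coverage data is the Cobertura-style XML that pytest-cov writes when run with `--cov-report=xml`, overall line coverage can be read from the `line-rate` attribute on the root `<coverage>` element. The helper name `parseCoberturaLineRate` and its return shape are hypothetical, and a real adapter would probably use an XML parser rather than a regular expression.

```typescript
// Hypothetical helper: extracts overall line coverage (0-100) from Cobertura XML.
// Returns null when the root <coverage> element or its line-rate is missing.
function parseCoberturaLineRate(xml: string): number | null {
  // line-rate is a fraction between 0 and 1 on the root <coverage> element
  const match = /<coverage\b[^>]*\bline-rate="([0-9.]+)"/.exec(xml)
  if (!match) return null

  const rate = parseFloat(match[1] ?? '')
  if (isNaN(rate)) return null

  // Convert the fraction to a percentage rounded to two decimals
  return Math.round(rate * 10000) / 100
}

// Trimmed-down sample of a coverage.xml header (values are made up)
const coberturaSample = `<?xml version="1.0" ?>
<coverage version="7.4" lines-valid="200" lines-covered="171" line-rate="0.855" branch-rate="0">
  <packages/>
</coverage>`

console.log(parseCoberturaLineRate(coberturaSample))
```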
**GoTestAdapter** (`packages/tm-core/src/services/test-adapters/gotest-adapter.ts`):
```typescript
class GoTestAdapter implements FrameworkAdapter {
  async runTargeted(pattern: string): Promise<TestResults> {
    const output = await exec(`go test ${pattern} -json`)
    return this.parseResults(output)
  }

  async runAll(): Promise<TestResults> {
    const output = await exec('go test ./... -coverprofile=coverage.out -json')
    return this.parseResults(output)
  }

  async parseCoverage(output: string): Promise<CoverageReport> {
    // Parse go test coverage output
    throw new Error('Not implemented')
  }

  getTestFilePattern(): string {
    return '**/*_test.go'
  }

  getTestFileExtension(): string {
    return '_test.go'
  }
}
```

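For the Go adapter, one possible starting point for `parseCoverage` is the plain-text summary line that `go test` prints per package when coverage is enabled (`coverage: 85.7% of statements`). The sketch below only collects those percentages; the helper name `parseGoCoverageLines` is hypothetical, and it assumes plain-text rather than `-json` output. A project-wide total would more likely come from `go tool cover -func=coverage.out`.

```typescript
// Hypothetical helper: collects the per-package percentages that `go test`
// prints as "coverage: 85.7% of statements" when coverage is enabled.
function parseGoCoverageLines(output: string): number[] {
  const percents: number[] = []
  const pattern = /coverage: ([0-9]+(?:\.[0-9]+)?)% of statements/g

  // exec() with the g flag walks through every match in the text
  let match: RegExpExecArray | null
  while ((match = pattern.exec(output)) !== null) {
    percents.push(parseFloat(match[1] ?? ''))
  }
  return percents
}

// Sample output for two packages (package paths are made up)
const goTestSample =
  'ok  \texample.com/app/api\t0.012s\tcoverage: 85.7% of statements\n' +
  'ok  \texample.com/app/db\t0.004s\tcoverage: 100.0% of statements\n'

console.log(parseGoCoverageLines(goTestSample))
```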
**Detection Logic:**
```typescript
async function detectFramework(): Promise<Framework> {
  // Check for package.json
  if (await exists('package.json')) {
    const pkg = await readJSON('package.json')
    if (pkg.devDependencies?.vitest) return Framework.Vitest
    if (pkg.devDependencies?.jest) return Framework.Jest
  }

  // Check for Python files
  if (await exists('pytest.ini') || await exists('setup.py')) {
    return Framework.Pytest
  }

  // Check for Go files
  if (await exists('go.mod')) {
    return Framework.GoTest
  }

  // Check for Rust files
  if (await exists('Cargo.toml')) {
    return Framework.CargoTest
  }

  throw new Error('Could not detect test framework')
}
```

### 2. Enhanced Safety Guardrails

**Diff Preview Mode:**
```bash
$ tm autopilot 42 --preview-diffs

[2/3] Subtask 42.2: Add collection endpoint

  RED    ✓ Tests created: src/api/__tests__/metrics.test.js

  GREEN  Implementing code...

  ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
  Proposed changes (src/api/metrics.js):

  + import { MetricsSchema } from '../models/schema.js'
  +
  + export async function createMetric(data) {
  +   const validated = MetricsSchema.parse(data)
  +   const result = await db.metrics.create(validated)
  +   return result
  + }

  ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

  Apply these changes? [Y/n/e(dit)/s(kip)]
    Y - Apply and continue
    n - Reject and retry GREEN phase
    e - Open in editor for manual changes
    s - Skip this subtask
```

**Minimal Change Enforcement:**

Add to system prompt:
```markdown
CRITICAL: Make MINIMAL changes to pass the failing tests.
- Only modify files directly related to the subtask
- Do not refactor existing code unless absolutely necessary
- Do not add features beyond the acceptance criteria
- Keep changes under 50 lines per file when possible
- Prefer composition over modification
```

**Change Size Warnings:**
```bash
⚠️  Large change detected:
  Files modified: 5
  Lines changed: +234, -12

This subtask was expected to be small (~50 lines).
Consider:
  - Breaking into smaller subtasks
  - Reviewing acceptance criteria
  - Checking for unintended changes

Continue anyway? [y/N]
```

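The "Files modified" and "Lines changed" figures in the warning above could be derived from `git diff --numstat`, which prints one `<added>\t<deleted>\t<path>` row per file (binary files show `-` for both counts). The following is a minimal sketch under that assumption; `summarizeNumstat`, `isLargeChange`, and the `maxLinesPerFile` parameter are hypothetical names, not part of the existing codebase.

```typescript
interface ChangeSummary {
  filesModified: number
  linesAdded: number
  linesDeleted: number
  largestFile: number // most lines changed (added + deleted) in a single file
}

// Hypothetical helper: summarizes the text output of `git diff --numstat`.
function summarizeNumstat(numstat: string): ChangeSummary {
  const summary: ChangeSummary = { filesModified: 0, linesAdded: 0, linesDeleted: 0, largestFile: 0 }

  for (const line of numstat.split('\n')) {
    const parts = line.split('\t')
    if (parts.length < 3) continue // skip blank or malformed rows

    const [addedRaw = '', deletedRaw = ''] = parts
    // Binary files report "-" instead of a count; treat them as zero lines
    const added = addedRaw === '-' ? 0 : parseInt(addedRaw, 10)
    const deleted = deletedRaw === '-' ? 0 : parseInt(deletedRaw, 10)
    if (isNaN(added) || isNaN(deleted)) continue

    summary.filesModified += 1
    summary.linesAdded += added
    summary.linesDeleted += deleted
    summary.largestFile = Math.max(summary.largestFile, added + deleted)
  }
  return summary
}

// Hypothetical threshold check: true when any one file exceeds the limit
function isLargeChange(summary: ChangeSummary, maxLinesPerFile: number): boolean {
  return summary.largestFile > maxLinesPerFile
}

// Sample numstat rows (paths and counts are made up)
const numstatSample =
  '200\t10\tsrc/api/metrics.js\n' +
  '34\t2\tsrc/models/schema.js\n' +
  '-\t-\tassets/logo.png\n'

const changeSummary = summarizeNumstat(numstatSample)
console.log(changeSummary, isLargeChange(changeSummary, 100))
```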
### 3. TUI Interface with tmux

**Layout:**
```
┌──────────────────────────────────┬─────────────────────────────────┐
│ Task Navigator (left)            │ Executor Terminal (right)       │
│                                  │                                 │
│ Project: my-app                  │ $ tm autopilot --executor-mode  │
│ Branch: analytics/task-42        │ > Running subtask 42.2 GREEN... │
│ Tag: analytics                   │ > Implementing endpoint...      │
│                                  │ > Tests: 3 passed, 0 failed     │
│ Tasks:                           │ > Ready to commit               │
│ → 42 [in-progress] User metrics  │                                 │
│   → 42.1 [done] Schema           │ [Live output from executor]     │
│   → 42.2 [active] Endpoint ◀     │                                 │
│   → 42.3 [pending] Dashboard     │                                 │
│                                  │                                 │
│ [s] start [p] pause [q] quit     │                                 │
└──────────────────────────────────┴─────────────────────────────────┘
```

**Implementation:**

**TUI Navigator** (`apps/cli/src/ui/tui/navigator.ts`):
```typescript
import { watch } from 'node:fs'
import blessed from 'blessed'

class AutopilotTUI {
  private screen: blessed.Widgets.Screen
  private taskList: blessed.Widgets.ListElement
  private statusBox: blessed.Widgets.BoxElement
  private executorPane: string // tmux pane ID

  async start(taskId?: string) {
    // Create blessed screen
    this.screen = blessed.screen()

    // Create task list widget
    this.taskList = blessed.list({
      label: 'Tasks',
      keys: true,
      vi: true,
      style: { selected: { bg: 'blue' } }
    })

    // Spawn tmux pane for executor
    this.executorPane = await this.spawnExecutorPane()

    // Watch state file for updates
    this.watchStateFile()

    // Handle keybindings
    this.setupKeybindings()
  }

  private async spawnExecutorPane(): Promise<string> {
    // tmux prints the new pane ID followed by a newline; trim it before reuse
    const paneId = (await exec('tmux split-window -h -P -F "#{pane_id}"')).trim()
    await exec(`tmux send-keys -t ${paneId} "tm autopilot --executor-mode" Enter`)
    return paneId
  }

  private watchStateFile() {
    watch('.taskmaster/state/current-run.json', (event, filename) => {
      this.updateDisplay()
    })
  }

  private setupKeybindings() {
    this.screen.key(['s'], () => this.startTask())
    this.screen.key(['p'], () => this.pauseTask())
    this.screen.key(['q'], () => this.quit())
    this.screen.key(['up', 'down'], () => this.navigateTasks())
  }
}
```

**Executor Mode:**
```bash
$ tm autopilot 42 --executor-mode

# Runs in executor pane, writes state to shared file
# Left pane reads state file and updates display
```

**State File** (`.taskmaster/state/current-run.json`):
```json
{
  "runId": "2025-01-15-142033",
  "taskId": "42",
  "status": "running",
  "currentPhase": "green",
  "currentSubtask": "42.2",
  "lastOutput": "Implementing endpoint...",
  "testsStatus": {
    "passed": 3,
    "failed": 0
  }
}
```

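Because the navigator re-reads this file whenever it changes, it may occasionally see a partially written or malformed document. A defensive reader could look roughly like the sketch below. The field types are inferred from the sample above only, and `parseRunState` and `isRecord` are hypothetical helpers rather than existing TaskMaster APIs.

```typescript
// Shape mirrors the sample state file; field types are inferred from that sample
interface RunState {
  runId: string
  taskId: string
  status: string
  currentPhase: string
  currentSubtask: string
  lastOutput: string
  testsStatus: { passed: number; failed: number }
}

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === 'object' && value !== null && !Array.isArray(value)
}

// Hypothetical helper: returns the parsed state, or null if the text is not
// valid JSON or any expected field is missing or has the wrong type.
function parseRunState(text: string): RunState | null {
  let data: unknown
  try {
    data = JSON.parse(text)
  } catch {
    return null // e.g. the file was read while it was still being written
  }
  if (!isRecord(data)) return null

  const stringFields = ['runId', 'taskId', 'status', 'currentPhase', 'currentSubtask', 'lastOutput']
  for (const field of stringFields) {
    if (typeof data[field] !== 'string') return null
  }

  const tests = data['testsStatus']
  if (!isRecord(tests)) return null
  if (typeof tests['passed'] !== 'number' || typeof tests['failed'] !== 'number') return null

  return data as unknown as RunState
}

const stateSample = JSON.stringify({
  runId: '2025-01-15-142033',
  taskId: '42',
  status: 'running',
  currentPhase: 'green',
  currentSubtask: '42.2',
  lastOutput: 'Implementing endpoint...',
  testsStatus: { passed: 3, failed: 0 }
})

console.log(parseRunState(stateSample))
```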
### 4. Extension API for IDE Integration

**State-based API:**

Expose run state via JSON files that IDEs can read:
- `.taskmaster/state/current-run.json` - live run state
- `.taskmaster/reports/runs/<runId>/manifest.json` - run metadata
- `.taskmaster/reports/runs/<runId>/log.jsonl` - event stream

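An extension that consumes the `log.jsonl` event stream needs to tolerate a last line that is still being written. A minimal sketch of such a reader follows, assuming one JSON object per line; the helper name `parseEventLog` and the sample event fields (`type`, `subtaskId`) are illustrative assumptions rather than a defined schema.

```typescript
// Hypothetical helper: parses JSON Lines text into objects, skipping blank lines
// and any line that is not a valid JSON object (such as a partially written last line).
function parseEventLog(text: string): Array<Record<string, unknown>> {
  const events: Array<Record<string, unknown>> = []

  for (const line of text.split('\n')) {
    const trimmed = line.trim()
    if (trimmed === '') continue

    try {
      const value: unknown = JSON.parse(trimmed)
      if (typeof value === 'object' && value !== null && !Array.isArray(value)) {
        events.push(value as Record<string, unknown>)
      }
    } catch {
      // Ignore malformed lines rather than failing the whole read
    }
  }
  return events
}

// Two complete events followed by a truncated third line (fields are made up)
const eventLogSample =
  '{"type":"subtask:start","subtaskId":"42.2"}\n' +
  '{"type":"subtask:complete","subtaskId":"42.2"}\n' +
  '{"type":"subtask:sta'

console.log(parseEventLog(eventLogSample))
```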
**WebSocket API (optional):**
```typescript
// packages/tm-core/src/services/autopilot-server.ts
import { WebSocketServer } from 'ws'

class AutopilotServer {
  private wss: WebSocketServer

  start(port: number = 7890) {
    this.wss = new WebSocketServer({ port })

    this.wss.on('connection', (ws) => {
      // Send current state
      ws.send(JSON.stringify(this.getCurrentState()))

      // Stream events (assumes the orchestrator's emitter supports a '*' wildcard)
      const forward = (event: unknown) => ws.send(JSON.stringify(event))
      this.orchestrator.on('*', forward)

      // Stop forwarding once the client disconnects
      ws.on('close', () => this.orchestrator.off('*', forward))
    })
  }
}
```

**Usage from IDE extension:**
```typescript
// VS Code extension example
import * as vscode from 'vscode'
import WebSocket from 'ws'

const ws = new WebSocket('ws://localhost:7890')

ws.on('message', (data) => {
  // ws delivers raw data (a Buffer by default), so convert it before parsing
  const event = JSON.parse(data.toString())

  if (event.type === 'subtask:complete') {
    vscode.window.showInformationMessage(
      `Subtask ${event.subtaskId} completed`
    )
  }
})
```

### 5. Parallel Subtask Execution (Experimental)

**Dependency Analysis:**
```typescript
class SubtaskScheduler {
  async buildDependencyGraph(subtasks: Subtask[]): Promise<DAG> {
    const graph = new DAG()

    for (const subtask of subtasks) {
      graph.addNode(subtask.id)

      for (const depId of subtask.dependencies) {
        graph.addEdge(depId, subtask.id)
      }
    }

    return graph
  }

  async getParallelBatches(graph: DAG): Promise<Subtask[][]> {
    const batches: Subtask[][] = []
    const completed = new Set<string>()

    while (completed.size < graph.size()) {
      const ready = graph.nodes.filter(node =>
        !completed.has(node.id) &&
        node.dependencies.every(dep => completed.has(dep))
      )

      // No runnable node left means a cycle or a missing dependency;
      // without this guard the loop would never terminate
      if (ready.length === 0) {
        throw new Error('Unresolvable subtask dependencies')
      }

      batches.push(ready)
      ready.forEach(node => completed.add(node.id))
    }

    return batches
  }
}
```

**Parallel Execution:**
```bash
$ tm autopilot 42 --parallel

[Batch 1] Running 2 subtasks in parallel:
  → 42.1: Add metrics schema
  → 42.4: Add API documentation

  42.1 RED    ✓ Tests created
  42.4 RED    ✓ Tests created

  42.1 GREEN  ✓ Implementation complete
  42.4 GREEN  ✓ Implementation complete

  42.1 COMMIT ✓ Committed: a1b2c3d
  42.4 COMMIT ✓ Committed: e5f6a7b

[Batch 2] Running 2 subtasks in parallel (depend on 42.1):
  → 42.2: Add collection endpoint
  → 42.3: Add dashboard widget
  ...
```

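The batching in the output above can be reproduced with a small self-contained sketch. It assumes a plain map from subtask ID to the IDs it depends on instead of the `DAG` class used earlier; `DependencyMap` and `planBatches` are hypothetical names used only for illustration.

```typescript
// Hypothetical input shape: subtask ID -> IDs of the subtasks it depends on
type DependencyMap = { [id: string]: string[] }

// Groups subtasks into batches where every dependency of a batch member
// was completed in an earlier batch.
function planBatches(deps: DependencyMap): string[][] {
  const done: { [id: string]: boolean } = {}
  const batches: string[][] = []
  let remaining = Object.keys(deps)

  while (remaining.length > 0) {
    // A subtask is ready once all of its dependencies are marked done
    const ready = remaining.filter(id => (deps[id] ?? []).every(dep => done[dep] === true))

    // Nothing ready means a cycle or a dependency on an unknown subtask
    if (ready.length === 0) {
      throw new Error('Cycle or unknown dependency among: ' + remaining.join(', '))
    }

    for (const id of ready) done[id] = true
    batches.push(ready)
    remaining = remaining.filter(id => done[id] !== true)
  }
  return batches
}

// Dependencies matching the example run: 42.2 and 42.3 both depend on 42.1
const batches42 = planBatches({
  '42.1': [],
  '42.2': ['42.1'],
  '42.3': ['42.1'],
  '42.4': []
})

console.log(batches42)
```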
**Conflict Detection:**
```typescript
async function detectConflicts(subtasks: Subtask[]): Promise<Conflict[]> {
  const conflicts: Conflict[] = []

  // Predict each subtask's files once instead of once per pair
  const files = await Promise.all(subtasks.map(s => predictAffectedFiles(s)))

  for (let i = 0; i < subtasks.length; i++) {
    for (let j = i + 1; j < subtasks.length; j++) {
      const overlap = files[i].filter(f => files[j].includes(f))

      if (overlap.length > 0) {
        conflicts.push({
          subtasks: [subtasks[i].id, subtasks[j].id],
          files: overlap
        })
      }
    }
  }

  return conflicts
}
```

### 6. Advanced Configuration

**Add to `.taskmaster/config.json`:**
```json
{
  "autopilot": {
    "safety": {
      "previewDiffs": false,
      "maxChangeLinesPerFile": 100,
      "warnOnLargeChanges": true,
      "requireConfirmOnLargeChanges": true
    },
    "parallel": {
      "enabled": false,
      "maxConcurrent": 3,
      "detectConflicts": true
    },
    "tui": {
      "enabled": false,
      "tmuxSession": "taskmaster-autopilot"
    },
    "api": {
      "enabled": false,
      "port": 7890,
      "allowRemote": false
    }
  },
  "test": {
    "frameworks": {
      "python": {
        "runner": "pytest",
        "coverageCommand": "pytest --cov",
        "testPattern": "**/test_*.py"
      },
      "go": {
        "runner": "go test",
        "coverageCommand": "go test ./... -coverprofile=coverage.out",
        "testPattern": "**/*_test.go"
      }
    }
  }
}
```

## CLI Updates

**New flags and commands:**
```bash
tm autopilot <taskId> --tui            # Launch TUI interface
tm autopilot <taskId> --parallel       # Enable parallel execution
tm autopilot <taskId> --preview-diffs  # Show diffs before applying
tm autopilot <taskId> --executor-mode  # Run as executor pane
tm autopilot-server start              # Start WebSocket API
```

## Success Criteria
- Supports Python projects with pytest
- Supports Go projects with go test
- Diff preview lets the user reject unwanted changes before they are applied
- TUI provides better visibility for long-running tasks
- IDE extensions can integrate via state files or WebSocket
- Parallel execution reduces total time for independent subtasks

## Out of Scope
- Full Electron/web GUI
- AI executor selection UI (defer to Phase 4)
- Multi-repository support
- Remote execution on cloud runners

## Testing Strategy
- Test with Python project (pytest)
- Test with Go project (go test)
- Test diff preview UI with mock changes
- Test parallel execution with independent subtasks
- Test conflict detection with overlapping file changes
- Test TUI with mock tmux environment

## Dependencies
- Phase 2 completed (PR + resumability)
- tmux installed (for TUI)
- blessed or ink library (for TUI rendering)

## Estimated Effort
3-4 weeks

## Risks & Mitigations
- **Risk:** Parallel execution causes git conflicts
  - **Mitigation:** Conservative conflict detection, sequential fallback

- **Risk:** TUI adds complexity and maintenance burden
  - **Mitigation:** Keep TUI optional, state-based design allows alternatives

- **Risk:** Framework adapters hard to maintain across versions
  - **Mitigation:** Abstract common parsing logic, document adapter interface

- **Risk:** Diff preview slows down workflow
  - **Mitigation:** Make optional, use `--preview-diffs` flag only when needed

## Validation
Test with:
- Python project with pytest and pytest-cov
- Go project with go test
- Large changes requiring confirmation
- Parallel execution with 3+ independent subtasks
- TUI with task selection and live status updates
- VS Code extension reading state files