ros/claude-task-master

Ralph Khreish ae65a7e6c2 chore: prepare branch

2025-10-16 22:32:20 +02:00

Phase 1: Core Rails - Autonomous TDD Workflow

Objective

Implement the core autonomous TDD workflow with safe git operations, test generation/execution, and commit gating.

Scope

WorkflowOrchestrator with event stream
Git and Test adapters
Subtask loop (RED → GREEN → COMMIT)
Framework-agnostic test generation using Surgical Test Generator
Test execution with detected test command
Commit gating on passing tests and coverage
Branch/tag mapping
Run report persistence

Deliverables

1. WorkflowOrchestrator (`packages/tm-core/src/services/workflow-orchestrator.ts`)

Responsibilities:

State machine driving phases: Preflight → BranchSetup → SubtaskLoop → Finalize → Complete
Event emission for progress tracking
Coordination of Git, Test, and Executor adapters
Run state persistence

API:

class WorkflowOrchestrator {
  async executeTask(taskId: string, options: AutopilotOptions): Promise<RunResult>
  async resume(runId: string): Promise<RunResult>
  on(event: string, handler: (data: any) => void): void

  // Events emitted:
  // - 'phase:start' { phase, timestamp }
  // - 'phase:complete' { phase, status, timestamp }
  // - 'subtask:start' { subtaskId, phase }
  // - 'subtask:complete' { subtaskId, phase, status }
  // - 'test:run' { subtaskId, phase, results }
  // - 'commit:created' { subtaskId, sha, message }
  // - 'error' { phase, error, recoverable }
}

State Machine Phases:

Preflight - validate environment
BranchSetup - create branch, set tag
SubtaskLoop - for each subtask: RED → GREEN → COMMIT
Finalize - full test suite, coverage check
Complete - run report, cleanup
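
The phase sequence above can be sketched as a small linear state machine that emits the `phase:start`, `phase:complete`, and `error` events. This is a minimal illustration under stated assumptions, not the orchestrator's implementation: the phase names come from the list above, while `runPhases`, `PhaseHandler`, and the synchronous handlers are hypothetical (a real orchestrator would most likely be asynchronous and carry timestamps).

```typescript
// Phase names mirror the list above; every other name here is hypothetical.
const PHASES = ["Preflight", "BranchSetup", "SubtaskLoop", "Finalize", "Complete"] as const;
type Phase = (typeof PHASES)[number];

type PhaseEvent =
  | { event: "phase:start"; phase: Phase }
  | { event: "phase:complete"; phase: Phase; status: "ok" }
  | { event: "error"; phase: Phase; error: string };

// A handler performs the phase's work and throws to signal failure.
type PhaseHandler = () => void;

// Runs the phases in order and stops at the first one that throws.
// Returns the failing phase, or null when every phase completed.
function runPhases(
  handlers: Partial<Record<Phase, PhaseHandler>>,
  emit: (e: PhaseEvent) => void
): Phase | null {
  for (const phase of PHASES) {
    emit({ event: "phase:start", phase });
    try {
      const handler = handlers[phase];
      if (handler) handler();
    } catch (err) {
      const error = err instanceof Error ? err.message : String(err);
      emit({ event: "error", phase, error });
      return phase;
    }
    emit({ event: "phase:complete", phase, status: "ok" });
  }
  return null;
}
```

Returning the failing phase instead of rethrowing keeps the caller in control of what to persist before stopping, which fits the run-state persistence responsibility listed earlier.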

2. GitAdapter (`packages/tm-core/src/services/git-adapter.ts`)

Responsibilities:

All git operations with safety checks
Branch name generation from tag/task
Confirmation gates for destructive operations

API:

class GitAdapter {
  async isWorkingTreeClean(): Promise<boolean>
  async getCurrentBranch(): Promise<string>
  async getDefaultBranch(): Promise<string>
  async createBranch(name: string): Promise<void>
  async checkoutBranch(name: string): Promise<void>
  async commit(message: string, files?: string[]): Promise<string>
  async push(branch: string, remote?: string): Promise<void>

  // Safety checks
  async assertNotOnDefaultBranch(): Promise<void>
  async assertCleanOrConfirm(): Promise<void>

  // Branch naming
  generateBranchName(tag: string, taskId: string, slug: string): string
}

Guardrails:

Never allow commits on default branch
Always check working tree before branch creation
Confirm destructive operations unless --no-confirm flag
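
The first two guardrails can be sketched as pure checks over a snapshot of repository state. This is an illustration only: `RepoState`, `assertCleanBeforeBranch`, and the error messages are hypothetical, and a real `GitAdapter` would obtain these values by querying git instead of receiving them as an argument. The confirmation gate is left out because its exact behavior is not specified here.

```typescript
// Hypothetical snapshot of repository state; a real adapter would query git.
interface RepoState {
  currentBranch: string;
  defaultBranch: string;
  workingTreeClean: boolean;
}

// Guardrail: never allow commits on the default branch.
function assertNotOnDefaultBranch(state: RepoState): void {
  if (state.currentBranch === state.defaultBranch) {
    throw new Error(
      `Refusing to commit on default branch "${state.defaultBranch}"; create a feature branch first`
    );
  }
}

// Guardrail: check the working tree before creating a branch.
function assertCleanBeforeBranch(state: RepoState): void {
  if (!state.workingTreeClean) {
    throw new Error("Working tree has uncommitted changes; commit or stash them before creating a branch");
  }
}
```

Keeping the checks free of side effects makes them easy to unit test without a real repository, in line with the mocked adapter tests described under Testing Strategy.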

3. TestRunnerAdapter (`packages/tm-core/src/services/test-runner-adapter.ts`)

Responsibilities:

Detect test command from package.json
Execute tests (targeted and full suite)
Parse test results and coverage
Enforce coverage thresholds

API:

class TestRunnerAdapter {
  async detectTestCommand(): Promise<string>
  async runTargeted(pattern: string): Promise<TestResults>
  async runAll(): Promise<TestResults>
  async getCoverage(): Promise<CoverageReport>
  async meetsThresholds(coverage: CoverageReport): Promise<boolean>
}

interface TestResults {
  exitCode: number
  duration: number
  summary: {
    total: number
    passed: number
    failed: number
    skipped: number
  }
  failures: Array<{
    test: string
    error: string
    stack?: string
  }>
}

interface CoverageReport {
  lines: number
  branches: number
  functions: number
  statements: number
}

Detection Logic:

Check package.json → scripts.test
Support: npm test, pnpm test, yarn test, bun test
Fall back to explicit command from config
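
A minimal sketch of the detection order above, written as a pure function so it can be tested without touching the filesystem. How the package manager is chosen is not specified in this plan; inferring it from the lockfile in the project root is an assumption made for illustration, as are the `PackageJson` shape and the parameter names.

```typescript
interface PackageJson {
  scripts?: Record<string, string>;
}

type PackageManager = "npm" | "pnpm" | "yarn" | "bun";

// Assumption: the package manager is inferred from the lockfile that is present.
const LOCKFILES: Array<[string, PackageManager]> = [
  ["pnpm-lock.yaml", "pnpm"],
  ["yarn.lock", "yarn"],
  ["bun.lockb", "bun"],
  ["package-lock.json", "npm"],
];

function detectTestCommand(
  pkg: PackageJson,
  filesInRoot: string[],
  configuredCommand?: string
): string {
  const script = pkg.scripts ? pkg.scripts["test"] : undefined;
  if (script !== undefined && script.trim() !== "") {
    // scripts.test exists: run it through the detected manager (npm by default).
    let manager: PackageManager = "npm";
    for (const [file, candidate] of LOCKFILES) {
      if (filesInRoot.indexOf(file) !== -1) {
        manager = candidate;
        break;
      }
    }
    return `${manager} test`;
  }
  // No scripts.test: fall back to the explicit command from config, if any.
  if (configuredCommand) return configuredCommand;
  throw new Error("No test command found: add scripts.test to package.json or set a test command in config");
}
```

The final `throw` corresponds to the validation case "Project without test command (should error with helpful message)".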

4. Test Generation Integration

Use Surgical Test Generator:

Load prompt from .claude/agents/surgical-test-generator.md
Compose with task/subtask context
Generate tests via executor (Claude)
Write test files to detected locations

Prompt Composition:

async function composeRedPrompt(subtask: Subtask, context: ProjectContext): Promise<string> {
  // Await the loads so this works whether loadFile returns a string or a Promise<string>
  const systemPrompts = await Promise.all([
    loadFile('.cursor/rules/git_workflow.mdc'),
    loadFile('.cursor/rules/test_workflow.mdc'),
    loadFile('.claude/agents/surgical-test-generator.md')
  ])

  const taskContext = formatTaskContext(subtask)
  const instruction = formatRedInstruction(subtask, context)

  // Order matters: rules and generator prompt first, then task context, then the instruction
  return [
    ...systemPrompts,
    '<TASK CONTEXT>',
    taskContext,
    '<INSTRUCTION>',
    instruction
  ].join('\n\n')
}

5. Subtask Loop Implementation

RED Phase:

Compose test generation prompt with subtask context
Execute via Claude executor
Parse generated test file paths and code
Write test files to filesystem
Run tests to confirm they fail (red state)
Store test results in run artifacts
If tests pass unexpectedly, warn and skip to next subtask

GREEN Phase:

Compose implementation prompt with test failures
Execute via Claude executor with max attempts (default: 3)
Parse implementation changes
Apply changes to filesystem
Run tests to verify passing (green state)
If tests still fail after max attempts:
- Save current state
- Emit pause event
- Return resumable checkpoint
If tests pass, proceed to COMMIT
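
The GREEN retry behavior above can be sketched as a bounded loop that either reaches a green state or returns a paused outcome for later inspection. This is a simplified illustration: `runGreenPhase`, `AttemptResult`, and `GreenOutcome` are hypothetical names, the attempt callback stands in for "execute, apply changes, run tests", and it is synchronous here only for brevity.

```typescript
interface AttemptResult {
  passed: number;
  failed: number;
}

type GreenOutcome =
  | { status: "green"; attempts: number }
  | { status: "paused"; attempts: number; lastResult: AttemptResult | null };

// Calls `attempt` up to maxAttempts times and stops at the first fully passing run.
function runGreenPhase(
  attempt: (attemptNumber: number) => AttemptResult,
  maxAttempts: number = 3
): GreenOutcome {
  let lastResult: AttemptResult | null = null;
  for (let n = 1; n <= maxAttempts; n++) {
    lastResult = attempt(n);
    if (lastResult.failed === 0 && lastResult.passed > 0) {
      return { status: "green", attempts: n };
    }
  }
  // Out of attempts: the caller would save state and emit the pause event here.
  return { status: "paused", attempts: maxAttempts, lastResult };
}
```

Requiring at least one passing test (not just zero failures) is a deliberate choice in this sketch so that an empty test run is not mistaken for a green state.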

COMMIT Phase:

Verify all tests pass
Check coverage meets thresholds (if enabled)
Generate conventional commit message
Stage test files + implementation files
Commit with message
Update subtask status to 'done'
Emit commit event with SHA
Continue to next subtask
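
The first two COMMIT steps amount to a gate: no failing tests, and every configured coverage metric at or above its threshold. A minimal sketch follows, assuming coverage values are percentages like the thresholds in the configuration section; `checkCommitGate` and its return shape are hypothetical.

```typescript
interface Coverage {
  lines: number;
  branches: number;
  functions: number;
  statements: number;
}

const METRICS: Array<keyof Coverage> = ["lines", "branches", "functions", "statements"];

// Returns the reasons a commit must be blocked; an empty array means the gate is open.
// Pass no thresholds to skip the coverage check ("if enabled" in the list above).
function checkCommitGate(
  failedTests: number,
  coverage: Coverage,
  thresholds?: Partial<Coverage>
): string[] {
  const reasons: string[] = [];
  if (failedTests > 0) {
    reasons.push(`${failedTests} test(s) failing`);
  }
  if (thresholds) {
    for (const metric of METRICS) {
      const min = thresholds[metric];
      if (min !== undefined && coverage[metric] < min) {
        reasons.push(`${metric} coverage ${coverage[metric]}% is below ${min}%`);
      }
    }
  }
  return reasons;
}
```

Returning reasons instead of a bare boolean lets the CLI report exactly why a commit was withheld.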

6. Branch & Tag Management

Integration with existing tag system:

Use scripts/modules/task-manager/tag-management.js
Explicit tag switching when branch created
Store branch ↔ tag mapping in run state

Branch Naming:

Pattern from config: {tag}/task-{id}-{slug}
Example: analytics/task-42-user-metrics
Sanitize: lowercase, replace spaces with hyphens
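
A minimal sketch of the naming rule, applying only the two sanitization steps listed above (lowercase, spaces to hyphens). The optional `pattern` parameter and the `sanitize` helper are assumptions for illustration; characters that git rejects in ref names are not handled here.

```typescript
// Sanitization as listed above: lowercase, replace whitespace runs with hyphens.
function sanitize(part: string): string {
  return part.trim().toLowerCase().replace(/\s+/g, "-");
}

// Fills the {tag}, {id}, and {slug} placeholders of the branch pattern.
function generateBranchName(
  tag: string,
  taskId: string,
  slug: string,
  pattern: string = "{tag}/task-{id}-{slug}"
): string {
  // split/join avoids the special "$" sequences of String.prototype.replace.
  return pattern
    .split("{tag}").join(sanitize(tag))
    .split("{id}").join(taskId)
    .split("{slug}").join(sanitize(slug));
}

const branch = generateBranchName("analytics", "42", "User Metrics");
// branch === "analytics/task-42-user-metrics"
```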

7. Run Artifacts & State Persistence

Directory structure:

.taskmaster/reports/runs/<run-id>/
├── manifest.json          # run metadata
├── log.jsonl              # event stream
├── commits.txt            # commit SHAs
├── test-results/
│   ├── subtask-42.1-red.json
│   ├── subtask-42.1-green.json
│   ├── subtask-42.2-red.json
│   ├── subtask-42.2-green-attempt1.json
│   ├── subtask-42.2-green-attempt2.json
│   └── final-suite.json
└── state.json             # resumable checkpoint

manifest.json:

{
  "runId": "2025-01-15-142033",
  "taskId": "42",
  "tag": "analytics",
  "branch": "analytics/task-42-user-metrics",
  "startTime": "2025-01-15T14:20:33Z",
  "endTime": null,
  "status": "in-progress",
  "currentPhase": "subtask-loop",
  "currentSubtask": "42.2",
  "subtasksCompleted": ["42.1"],
  "subtasksFailed": [],
  "totalCommits": 1
}

log.jsonl (append-only event log):

{"ts":"2025-01-15T14:20:33Z","event":"phase:start","phase":"preflight","status":"ok"}
{"ts":"2025-01-15T14:21:00Z","event":"subtask:start","subtask":"42.1","phase":"red"}
{"ts":"2025-01-15T14:22:00Z","event":"test:run","subtask":"42.1","phase":"red","results":{"passed":0,"failed":3}}
{"ts":"2025-01-15T14:23:00Z","event":"subtask:start","subtask":"42.1","phase":"green"}
{"ts":"2025-01-15T14:24:30Z","event":"test:run","subtask":"42.1","phase":"green","attempt":1,"results":{"passed":3,"failed":0}}
{"ts":"2025-01-15T14:24:35Z","event":"commit:created","subtask":"42.1","sha":"a1b2c3d","message":"feat(metrics): add metrics schema (task 42.1)"}
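
Because each line of `log.jsonl` is a standalone JSON object, writing and reading the log needs little more than `JSON.stringify` and `JSON.parse`. The sketch below is illustrative: `LogEntry`, `toJsonlLine`, `parseJsonl`, and `commitShas` are hypothetical helpers, and file I/O is left out so the functions stay pure.

```typescript
interface LogEntry {
  ts: string;
  event: string;
  [key: string]: unknown;
}

// One event per line; appending this string to the file preserves earlier entries.
function toJsonlLine(entry: LogEntry): string {
  return JSON.stringify(entry) + "\n";
}

// Parses the whole log, skipping blank lines such as the trailing newline.
function parseJsonl(text: string): LogEntry[] {
  return text
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line) as LogEntry);
}

// Example query: collect the SHAs of all commits recorded in the log.
function commitShas(entries: LogEntry[]): string[] {
  return entries
    .filter((e) => e.event === "commit:created")
    .map((e) => String(e["sha"]));
}
```

An append-only format like this stays readable even if a run is interrupted mid-phase, since every completed line is already valid on its own.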

8. CLI Command Implementation

Update tm autopilot command:

Remove the dry-run-only behavior
Execute the actual workflow when --dry-run is not passed
Add progress reporting via orchestrator events
Support --no-confirm for CI/automation
Support --max-attempts to override default

Real-time output:

$ tm autopilot 42

🚀 Starting autopilot for Task #42 [analytics]: User metrics tracking

✓ Preflight checks passed
✓ Created branch: analytics/task-42-user-metrics
✓ Set active tag: analytics

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

[1/3] Subtask 42.1: Add metrics schema

  RED   Generating tests... ⏳
  RED   ✓ Tests created: src/__tests__/schema.test.js
  RED   ✓ Tests failing: 3 failed, 0 passed

  GREEN Implementing code... ⏳
  GREEN ✓ Tests passing: 3 passed, 0 failed (attempt 1)

  COMMIT ✓ Committed: a1b2c3d
         "feat(metrics): add metrics schema (task 42.1)"

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

[2/3] Subtask 42.2: Add collection endpoint
  ...

Success Criteria

Can execute a simple task end-to-end without manual intervention
All commits made on feature branch, never on default branch
Tests are generated before implementation (RED → GREEN order enforced)
Only commits when tests pass and coverage meets threshold
Run state is persisted and can be inspected post-run
Failures produce clear error messages that name the phase, the error, and whether the run is recoverable
Orchestrator events allow CLI to show live progress

Configuration

Add to .taskmaster/config.json:

{
  "autopilot": {
    "enabled": true,
    "requireCleanWorkingTree": true,
    "commitTemplate": "{type}({scope}): {msg}",
    "defaultCommitType": "feat",
    "maxGreenAttempts": 3,
    "testTimeout": 300000
  },
  "test": {
    "runner": "auto",
    "coverageThresholds": {
      "lines": 80,
      "branches": 80,
      "functions": 80,
      "statements": 80
    },
    "targetedRunPattern": "**/*.test.js"
  },
  "git": {
    "branchPattern": "{tag}/task-{id}-{slug}",
    "defaultRemote": "origin"
  }
}
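
The `commitTemplate` value uses `{name}` placeholders. One way to expand it is a small substitution helper, sketched below; `renderTemplate` is a hypothetical name, and leaving unknown placeholders untouched is a design choice of this sketch, not a documented behavior.

```typescript
// Replaces each {name} placeholder with values[name]; unknown names are left as-is.
function renderTemplate(template: string, values: Record<string, string>): string {
  return template.replace(/\{(\w+)\}/g, (placeholder: string, key: string) => {
    const value = Object.prototype.hasOwnProperty.call(values, key) ? values[key] : undefined;
    return value !== undefined ? value : placeholder;
  });
}

const commitMessage = renderTemplate("{type}({scope}): {msg}", {
  type: "feat",
  scope: "metrics",
  msg: "add metrics schema (task 42.1)",
});
// commitMessage === "feat(metrics): add metrics schema (task 42.1)"
```

Leaving unknown placeholders visible makes a misconfigured template easy to spot in the resulting commit message.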

Out of Scope (defer to Phase 2)

PR creation (gh integration)
Resume functionality (--resume flag); Phase 1 only saves the resumable checkpoint (state.json)
Lint/format step
Multiple executor support (only Claude)

Implementation Order

GitAdapter with safety checks
TestRunnerAdapter with detection logic
WorkflowOrchestrator state machine skeleton
RED phase: test generation integration
GREEN phase: implementation with retry logic
COMMIT phase: gating and persistence
CLI command wiring with event handling
Run artifacts and logging

Testing Strategy

Unit tests for each adapter (mock git/test commands)
Integration tests with real git repo (temporary directory)
End-to-end test with sample task in test project
Verify no commits on default branch (safety test)
Verify commit gating works (force test failure, ensure no commit)

Dependencies

Phase 0 completed (CLI skeleton, preflight checks)
Existing TaskService and executor infrastructure
Surgical Test Generator prompt file exists

Estimated Effort

2-3 weeks

Risks & Mitigations

Risk: Test generation produces invalid/wrong tests
- Mitigation: Use Surgical Test Generator prompt, add manual review step in early iterations
Risk: Implementation attempts timeout/fail repeatedly
- Mitigation: Max attempts with pause/resume; store state for manual intervention
Risk: Coverage parsing fails on different test frameworks
- Mitigation: Start with one framework (vitest), add parsers incrementally
Risk: Git operations fail (conflicts, permissions)
- Mitigation: Detailed error messages, save state before destructive ops

Validation

Test with:

Simple task (1 subtask, clear requirements)
Medium task (3 subtasks with dependencies)
Task requiring multiple GREEN attempts
Task with dirty working tree (should error)
Task on default branch (should error)
Project without test command (should error with helpful message)
