Compare commits

Comparing branches `docs/auto-…` and `next` (29 commits).

| SHA1 |
|---|
| c1cd85e667 |
| d3f8e0c52d |
| dc6652ccd2 |
| 518d7ea8dc |
| ccb87a516a |
| b8830d9508 |
| 548beb4344 |
| 555da2b5b9 |
| 662e3865f3 |
| 8649c8a347 |
| f7cab246b0 |
| 5aca107827 |
| fb68c9fe1f |
| ff3bd7add8 |
| c8228e913b |
| 218b68a31e |
| 6bc75c0ac6 |
| d7fca1844f |
| a98d96ef04 |
| a69d8c91dc |
| 474a86cebb |
| 3283506444 |
| 9acb900153 |
| c4f5d89e72 |
| e308cf4f46 |
| 11b7354010 |
| 4c1ef2ca94 |
| 663aa2dfe9 |
| 8f60a0561e |
@@ -1,7 +0,0 @@
---
"task-master-ai": minor
---

Add changelog highlights to auto-update notifications

When the CLI auto-updates to a new version, it now displays a "What's New" section.

@@ -11,6 +11,7 @@
"access": "public",
"baseBranch": "main",
"ignore": [
"docs"
"docs",
"@tm/claude-code-plugin"
]
}

.changeset/dirty-hairs-know.md (new file, 5 lines)
@@ -0,0 +1,5 @@
---
"task-master-ai": patch
---

Improve auth token refresh flow

.changeset/fix-parent-directory-traversal.md (new file, 7 lines)
@@ -0,0 +1,7 @@
---
"task-master-ai": patch
---

Enable Task Master commands to traverse parent directories to find project root from nested paths

Fixes #1301
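A minimal sketch of the parent-directory traversal described in the changeset above, assuming the project root is identified by a `.taskmaster` directory (the marker and helper name are illustrative, not the shipped implementation):

```typescript
import { existsSync } from 'node:fs';
import { dirname, join, resolve } from 'node:path';

/** Walk up from `startDir` until a directory containing `.taskmaster` is found. */
export function findProjectRoot(startDir: string): string | null {
  let current = resolve(startDir);
  while (true) {
    if (existsSync(join(current, '.taskmaster'))) return current;
    const parent = dirname(current);
    if (parent === current) return null; // reached the filesystem root without finding a marker
    current = parent;
  }
}
```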
.changeset/fix-warning-box-alignment.md (new file, 5 lines)
@@ -0,0 +1,5 @@
---
"@tm/cli": patch
---

Fix warning message box width to match dashboard box width for consistent UI alignment

.changeset/kind-lines-melt.md (new file, 21 lines)
@@ -0,0 +1,21 @@
---
"task-master-ai": patch
---

Fix MCP server compatibility with Draft-07 clients (Augment IDE, gemini-cli, gemini code assist)

- Resolves #1284

**Problem:**

- MCP tools were using Zod v4, which outputs JSON Schema Draft 2020-12
- MCP clients only support Draft-07
- Tools were not discoverable in gemini-cli and other clients

**Solution:**

- Updated all MCP tools to import from `zod/v3` instead of `zod`
- Zod v3 schemas convert to Draft-07 via FastMCP's zod-to-json-schema
- Fixed logger to use stderr instead of stdout (MCP protocol requirement)

This is a temporary workaround until FastMCP adds JSON Schema version configuration.
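A hypothetical illustration of the import change described in the solution above; the tool schema is made up, and `zod/v3` refers to the compatibility subpath shipped with recent Zod releases (verify against the Zod version actually installed):

```typescript
// Before: the default Zod v4 entry point converts to JSON Schema Draft 2020-12
// import { z } from 'zod';

// After: the v3 API converts to Draft-07 via zod-to-json-schema
import { z } from 'zod/v3';

export const getTaskParams = z.object({
  id: z.string().describe('Task ID to fetch'),
  withSubtasks: z.boolean().optional()
});
```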
.changeset/light-owls-stay.md (new file, 35 lines)
@@ -0,0 +1,35 @@
---
"task-master-ai": minor
---

Add configurable MCP tool loading to optimize LLM context usage

You can now control which Task Master MCP tools are loaded by setting the `TASK_MASTER_TOOLS` environment variable in your MCP configuration. This helps reduce context usage for LLMs by only loading the tools you need.

**Configuration Options:**

- `all` (default): Load all 36 tools
- `core` or `lean`: Load only 7 essential tools for daily development
  - Includes: `get_tasks`, `next_task`, `get_task`, `set_task_status`, `update_subtask`, `parse_prd`, `expand_task`
- `standard`: Load 15 commonly used tools (all core tools plus 8 more)
  - Additional tools: `initialize_project`, `analyze_project_complexity`, `expand_all`, `add_subtask`, `remove_task`, `generate`, `add_task`, `complexity_report`
- Custom list: Comma-separated tool names (e.g., `get_tasks,next_task,set_task_status`)

**Example .mcp.json configuration:**

```json
{
  "mcpServers": {
    "task-master-ai": {
      "command": "npx",
      "args": ["-y", "task-master-ai"],
      "env": {
        "TASK_MASTER_TOOLS": "standard",
        "ANTHROPIC_API_KEY": "your_key_here"
      }
    }
  }
}
```

For complete details on all available tools, configuration examples, and usage guidelines, see the [MCP Tools documentation](https://docs.task-master.dev/capabilities/mcp#configurable-tool-loading).
@@ -1,47 +0,0 @@
---
"task-master-ai": minor
---

Add Claude Code plugin with marketplace distribution

This release introduces official Claude Code plugin support, marking the evolution from legacy `.claude` directory copying to a modern plugin-based architecture.

## 🎉 New: Claude Code Plugin

Task Master AI commands and agents are now distributed as a proper Claude Code plugin:

- **49 slash commands** with clean naming (`/taskmaster:command-name`)
- **3 specialized AI agents** (task-orchestrator, task-executor, task-checker)
- **MCP server integration** for deep Claude Code integration

**Installation:**

```bash
/plugin marketplace add eyaltoledano/claude-task-master
/plugin install taskmaster@taskmaster
```

The `rules add claude` command no longer copies commands and agents to `.claude/commands/` and `.claude/agents/`. Instead, it now:

- Shows plugin installation instructions
- Only manages CLAUDE.md imports for agent instructions
- Directs users to install the official plugin

**Migration for Existing Users:**

If you previously used `rules add claude`:

1. The old commands in `.claude/commands/` will continue to work but won't receive updates
2. Install the plugin for the latest features: `/plugin install taskmaster@taskmaster`
3. Remove the old `.claude/commands/` and `.claude/agents/` directories

**Why This Change?**

Claude Code plugins provide:

- ✅ Automatic updates when we release new features
- ✅ Better command organization and naming
- ✅ Seamless integration with Claude Code
- ✅ No manual file copying or management

The plugin system is the future of Task Master AI integration with Claude Code!
.changeset/metal-rocks-help.md (new file, 5 lines)
@@ -0,0 +1,5 @@
---
"task-master-ai": minor
---

Improve the `next` command to work with remote

@@ -1,17 +0,0 @@
---
"task-master-ai": minor
---

Add RPG (Repository Planning Graph) method template for structured PRD creation. The new `example_prd_rpg.txt` template teaches AI agents and developers the RPG methodology through embedded instructions, inline good/bad examples, and XML-style tags for structure. This template enables creation of dependency-aware PRDs that automatically generate topologically-ordered task graphs when parsed with Task Master.

Key features:
- Method-as-template: teaches RPG principles (dual-semantics, explicit dependencies, topological order) while being used
- Inline instructions at decision points guide AI through each section
- Good/bad examples for immediate pattern matching
- Flexible plain-text format with XML-style tags for parseability
- Critical dependency-graph section ensures correct task ordering
- Automatic inclusion during `task-master init`
- Comprehensive documentation at [docs.task-master.dev/capabilities/rpg-method](https://docs.task-master.dev/capabilities/rpg-method)
- Tool recommendations for code-context-aware PRD creation (Claude Code, Cursor, Gemini CLI, Codex/Grok)

The RPG template complements the existing `example_prd.txt` and provides a more structured approach for complex projects requiring clear module boundaries and dependency chains.

.changeset/open-tips-notice.md (new file, 5 lines)
@@ -0,0 +1,5 @@
---
"task-master-ai": minor
---

Add Haiku 4.5 and Sonnet 4.5 to supported models for the claude-code and anthropic AI providers

@@ -1,7 +0,0 @@
---
"task-master-ai": patch
---

Fix cross-level task dependencies not being saved

Fixes an issue where adding dependencies between subtasks and top-level tasks (e.g., `task-master add-dependency --id=2.2 --depends-on=11`) would report success but fail to persist the changes. Dependencies can now be created in both directions between any task levels.
@@ -2,20 +2,24 @@
"mode": "exit",
"tag": "rc",
"initialVersions": {
"task-master-ai": "0.28.0",
"task-master-ai": "0.29.0",
"@tm/cli": "",
"docs": "0.0.5",
"extension": "0.25.5",
"docs": "0.0.6",
"extension": "0.25.6",
"@tm/mcp": "0.28.0-rc.2",
"@tm/ai-sdk-provider-grok-cli": "",
"@tm/build-config": "",
"@tm/claude-code-plugin": "0.0.1",
"@tm/claude-code-plugin": "0.0.2",
"@tm/core": ""
},
"changesets": [
"auto-update-changelog-highlights",
"mean-planes-wave",
"nice-ways-hope",
"plain-falcons-serve",
"smart-owls-relax"
"dirty-hairs-know",
"fix-parent-directory-traversal",
"fix-warning-box-alignment",
"kind-lines-melt",
"light-owls-stay",
"metal-rocks-help",
"open-tips-notice",
"some-dodos-wonder"
]
}

@@ -1,16 +0,0 @@
---
"task-master-ai": minor
---

Enhance `expand_all` to intelligently use complexity analysis recommendations when expanding tasks.

The expand-all operation now automatically leverages recommendations from `analyze-complexity` to determine optimal subtask counts for each task, resulting in more accurate and context-aware task breakdowns.

Key improvements:
- Automatic integration with complexity analysis reports
- Tag-aware complexity report path resolution
- Intelligent subtask count determination based on task complexity
- Falls back to defaults when complexity analysis is unavailable
- Enhanced logging for better visibility into expansion decisions

When you run `task-master expand --all` after `task-master analyze-complexity`, Task Master now uses the recommended subtask counts from the complexity analysis instead of applying uniform defaults, ensuring each task is broken down according to its actual complexity.

.changeset/some-dodos-wonder.md (new file, 36 lines)
@@ -0,0 +1,36 @@
---
"task-master-ai": minor
---

Add autonomous TDD workflow automation system with new `tm autopilot` commands and MCP tools for AI-driven test-driven development.

**New CLI Commands:**

- `tm autopilot start <taskId>` - Initialize TDD workflow
- `tm autopilot next` - Get next action in workflow
- `tm autopilot status` - Check workflow progress
- `tm autopilot complete` - Advance phase with test results
- `tm autopilot commit` - Save progress with metadata
- `tm autopilot resume` - Continue from checkpoint
- `tm autopilot abort` - Cancel workflow

**New MCP Tools:**

Seven new autopilot tools for programmatic control: `autopilot_start`, `autopilot_next`, `autopilot_status`, `autopilot_complete_phase`, `autopilot_commit`, `autopilot_resume`, `autopilot_abort`

**Features:**

- Complete RED → GREEN → COMMIT cycle enforcement
- Intelligent commit message generation with metadata
- Activity logging and state persistence
- Configurable workflow settings via `.taskmaster/config.json`
- Comprehensive AI agent integration documentation

**Documentation:**

- AI Agent Integration Guide (2,800+ lines)
- TDD Quick Start Guide
- Example prompts and integration patterns

> **Learn more:** [TDD Workflow Quickstart Guide](https://dev.task-master.dev/tdd-workflow/quickstart)

This release enables AI agents to autonomously execute test-driven development workflows with full state management and recovery capabilities.
@@ -1,10 +1,7 @@
reviews:
profile: assertive
profile: chill
poem: false
auto_review:
enabled: true
base_branches:
- rc
- beta
- alpha
- production
- next
- ".*"

@@ -14,4 +14,4 @@ OLLAMA_API_KEY=YOUR_OLLAMA_API_KEY_HERE
VERTEX_PROJECT_ID=your-gcp-project-id
VERTEX_LOCATION=us-central1
# Optional: Path to service account credentials JSON file (alternative to API key)
GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account-credentials.json
GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account-credentials.json

.gitignore (vendored, 5 lines changed)
@@ -96,4 +96,7 @@ apps/extension/.vscode-test/
apps/extension/vsix-build/

# turbo
.turbo
.turbo

# TaskMaster Workflow State (now stored in ~/.taskmaster/sessions/)
# No longer needed in .gitignore as state is stored globally

@@ -1,8 +1,8 @@
{
"models": {
"main": {
"provider": "anthropic",
"modelId": "claude-sonnet-4-20250514",
"provider": "claude-code",
"modelId": "sonnet",
"maxTokens": 64000,
"temperature": 0.2
},
@@ -35,6 +35,7 @@
"defaultTag": "master"
},
"claudeCode": {},
"codexCli": {},
"grokCli": {
"timeout": 120000,
"workingDirectory": null,

.taskmaster/docs/autonomous-tdd-git-workflow.md (new file, 912 lines)
@@ -0,0 +1,912 @@
## Summary

- Put the existing git and test workflows on rails: a repeatable, automated process that can run autonomously, with guardrails and a compact TUI for visibility.

- Flow: for a selected task, create a branch named with the tag + task id → generate tests for the first subtask (red) using the Surgical Test Generator → implement code (green) → verify tests → commit → repeat per subtask → final verify → push → open a PR against the default branch.

- Build on existing rules: .cursor/rules/git_workflow.mdc, .cursor/rules/test_workflow.mdc, .claude/agents/surgical-test-generator.md, and existing CLI/core services.

## Goals

- Deterministic, resumable automation to execute the TDD loop per subtask with minimal human intervention.

- Strong guardrails: never commit to the default branch; only commit when tests pass; enforce status transitions; persist logs/state for debuggability.

- Visibility: a compact terminal UI (like lazygit) to pick a tag, view tasks, and start work; a right-side pane opens an executor terminal (via tmux) for agent coding.

- Extensible: framework-agnostic test generation via the Surgical Test Generator; detect and use the repo's test command for execution with coverage thresholds.

## Non-Goals (initial)

- Full multi-language runner parity beyond detecting and executing the project's test command.

- Complex GUI; start with CLI/TUI + a tmux pane. The IDE/extension can hook into the same state later.

- Rich executor selection UX (codex/gemini/claude) — we'll prompt per run; defaults can come later.

## Success Criteria

- One command can autonomously complete a task's subtasks via TDD and open a PR when done.

- All commits are made on a branch that includes the tag and task id (see Branch Naming); no commits to the default branch directly.

- Every subtask iteration: failing tests added first (red), then code added to pass them (green), commit only after green.

- End-to-end logs + artifacts stored in .taskmaster/reports/runs/<timestamp-or-id>/.

## Success Metrics (Phase 1)

- **Adoption**: 80% of tasks in a pilot repo completed via `tm autopilot`
- **Safety**: 0 commits to the default branch; 100% of commits have green tests
- **Efficiency**: average time from task start to PR < 30 min for simple subtasks
- **Reliability**: < 5% of runs require manual intervention (timeouts/conflicts)

## User Stories

- As a developer, I can run tm autopilot <taskId> and watch a structured, safe workflow execute.

- As a reviewer, I can inspect commits per subtask, and a PR summarizing the work when the task completes.

- As an operator, I can see the current step, active subtask, test status, and logs in a compact CLI view and read a final run report.
## Example Workflow Traces
|
||||
|
||||
### Happy Path: Complete a 3-subtask feature
|
||||
|
||||
```bash
|
||||
# Developer starts
|
||||
$ tm autopilot 42
|
||||
→ Checks preflight: ✓ clean tree, ✓ npm test detected
|
||||
→ Creates branch: analytics/task-42-user-metrics
|
||||
→ Subtask 42.1: "Add metrics schema"
|
||||
RED: generates test_metrics_schema.test.js → 3 failures
|
||||
GREEN: implements schema.js → all pass
|
||||
COMMIT: "feat(metrics): add metrics schema (task 42.1)"
|
||||
→ Subtask 42.2: "Add collection endpoint"
|
||||
RED: generates test_metrics_endpoint.test.js → 5 failures
|
||||
GREEN: implements api/metrics.js → all pass
|
||||
COMMIT: "feat(metrics): add collection endpoint (task 42.2)"
|
||||
→ Subtask 42.3: "Add dashboard widget"
|
||||
RED: generates test_metrics_widget.test.js → 4 failures
|
||||
GREEN: implements components/MetricsWidget.jsx → all pass
|
||||
COMMIT: "feat(metrics): add dashboard widget (task 42.3)"
|
||||
→ Final: all 3 subtasks complete
|
||||
✓ Run full test suite → all pass
|
||||
✓ Coverage check → 85% (meets 80% threshold)
|
||||
PUSH: confirms with user → pushed to origin
|
||||
PR: opens #123 "Task #42 [analytics]: User metrics tracking"
|
||||
|
||||
✓ Task 42 complete. PR: https://github.com/org/repo/pull/123
|
||||
Run report: .taskmaster/reports/runs/2025-01-15-142033/
|
||||
```
|
||||
|
||||
### Error Recovery: Failing tests timeout
|
||||
|
||||
```bash
|
||||
$ tm autopilot 42
|
||||
→ Subtask 42.2 GREEN phase: attempt 1 fails (2 tests still red)
|
||||
→ Subtask 42.2 GREEN phase: attempt 2 fails (1 test still red)
|
||||
→ Subtask 42.2 GREEN phase: attempt 3 fails (1 test still red)
|
||||
|
||||
⚠️ Paused: Could not achieve green state after 3 attempts
|
||||
📋 State saved to: .taskmaster/reports/runs/2025-01-15-142033/
|
||||
Last error: "POST /api/metrics returns 500 instead of 201"
|
||||
|
||||
Next steps:
|
||||
- Review diff: git diff HEAD
|
||||
- Inspect logs: cat .taskmaster/reports/runs/2025-01-15-142033/log.jsonl
|
||||
- Check test output: cat .taskmaster/reports/runs/2025-01-15-142033/test-results/subtask-42.2-green-attempt3.json
|
||||
- Resume after manual fix: tm autopilot --resume
|
||||
|
||||
# Developer manually fixes the issue, then:
|
||||
$ tm autopilot --resume
|
||||
→ Resuming subtask 42.2 GREEN phase
|
||||
GREEN: all tests pass
|
||||
COMMIT: "feat(metrics): add collection endpoint (task 42.2)"
|
||||
→ Continuing to subtask 42.3...
|
||||
```
|
||||
|
||||
### Dry Run: Preview before execution
|
||||
|
||||
```bash
|
||||
$ tm autopilot 42 --dry-run
|
||||
Autopilot Plan for Task #42 [analytics]: User metrics tracking
|
||||
─────────────────────────────────────────────────────────────
|
||||
Preflight:
|
||||
✓ Working tree is clean
|
||||
✓ Test command detected: npm test
|
||||
✓ Tools available: git, gh, node, npm
|
||||
✓ Current branch: main (will create new branch)
|
||||
|
||||
Branch & Tag:
|
||||
→ Create branch: analytics/task-42-user-metrics
|
||||
→ Set active tag: analytics
|
||||
|
||||
Subtasks (3 pending):
|
||||
1. 42.1: Add metrics schema
|
||||
- RED: generate tests in src/__tests__/schema.test.js
|
||||
- GREEN: implement src/schema.js
|
||||
- COMMIT: "feat(metrics): add metrics schema (task 42.1)"
|
||||
|
||||
2. 42.2: Add collection endpoint [depends on 42.1]
|
||||
- RED: generate tests in src/api/__tests__/metrics.test.js
|
||||
- GREEN: implement src/api/metrics.js
|
||||
- COMMIT: "feat(metrics): add collection endpoint (task 42.2)"
|
||||
|
||||
3. 42.3: Add dashboard widget [depends on 42.2]
|
||||
- RED: generate tests in src/components/__tests__/MetricsWidget.test.jsx
|
||||
- GREEN: implement src/components/MetricsWidget.jsx
|
||||
- COMMIT: "feat(metrics): add dashboard widget (task 42.3)"
|
||||
|
||||
Finalization:
|
||||
→ Run full test suite with coverage
|
||||
→ Push branch to origin (will confirm)
|
||||
→ Create PR targeting main
|
||||
|
||||
Run without --dry-run to execute.
|
||||
```
|
||||
|
||||
## High-Level Workflow

1) Pre-flight

- Verify a clean working tree or confirm the staging/commit policy (configurable).

- Detect the repo type and the project's test command (e.g., npm test, pnpm test, pytest, go test).

- Validate tools: git, gh (optional for PR), node/npm, and (if used) the claude CLI.

- Load TaskMaster state and the selected task; if no subtasks exist, automatically run "expand" before working.

2) Branch & Tag Setup

- Check out the default branch and update (optional), then create a branch using Branch Naming (below).

- Map branch ↔ tag via existing tag management; explicitly set the active tag to the branch's tag.

3) Subtask Loop (for each pending/in-progress subtask in dependency order)

- Select the next eligible subtask using the tm-core TaskService getNextTask() and subtask eligibility logic.

- Red: generate or update failing tests for the subtask

  - Use the Surgical Test Generator system prompt (.claude/agents/surgical-test-generator.md) to produce high-signal tests following project conventions.

  - Run tests to confirm red; record results. If not red (already passing), skip to the next subtask or escalate.

- Green: implement code to pass the tests

  - Use the executor to implement changes (initial: claude CLI prompt with focused context).

  - Re-run tests until green or until the timeout/backoff policy triggers.

- Commit: when green

  - Commit tests + code with a conventional commit message. Optionally update the subtask status to done.

  - Persist run step metadata/logs.

4) Finalization

- Run the full test suite and coverage (if configured); optionally lint/format.

- Commit any final adjustments.

- Push the branch (ask the user to confirm); create a PR (via gh pr create) targeting the default branch. Title format: Task #<id> [<tag>]: <title>.

5) Post-Run

- Update the task status if desired (e.g., review).

- Persist a run report (JSON + markdown summary) to .taskmaster/reports/runs/<run-id>/.

## Guardrails

- Never commit to the default branch.

- Commit only if all tests (targeted and suite) pass; allow override flags.

- Enforce 80% coverage thresholds (lines/branches/functions/statements) by default; configurable (see the sketch after this list).

- Timebox model operations and retries; if not green within N attempts, pause with actionable state for resume.

- Always log actions, commands, and outcomes; include a dry-run mode.

- Ask before branch creation, pushing, and opening a PR unless --no-confirm is set.
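A minimal sketch of the coverage gate mentioned above, assuming a coverage summary in the Istanbul/Vitest `coverage-summary.json` shape; the types and default thresholds here are illustrative, not the final adapter API:

```typescript
interface CoverageTotals {
  lines: { pct: number };
  branches: { pct: number };
  functions: { pct: number };
  statements: { pct: number };
}

/** Returns the metrics that fall below their configured threshold. */
export function coverageFailures(
  totals: CoverageTotals,
  thresholds = { lines: 80, branches: 80, functions: 80, statements: 80 }
): string[] {
  return (Object.keys(thresholds) as (keyof typeof thresholds)[])
    .filter((metric) => totals[metric].pct < thresholds[metric])
    .map((metric) => `${metric}: ${totals[metric].pct}% < ${thresholds[metric]}%`);
}

// Usage: block the commit step whenever coverageFailures(totals).length > 0.
```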
## Integration Points (Current Repo)

- CLI: apps/cli provides the command structure and UI components.

  - New command: tm autopilot (alias: task-master autopilot).

  - Reuse UI components under apps/cli/src/ui/components/ for headers/task details/next-task.

- Core services: packages/tm-core

  - TaskService for selection, status, tags.

  - TaskExecutionService for prompt formatting and executor prep.

  - Executors: the claude executor and ExecutorFactory to run external tools.

  - Proposed new: WorkflowOrchestrator to drive the autonomous loop and emit progress events.

- Tag/Git utilities: scripts/modules/utils/git-utils.js and scripts/modules/task-manager/tag-management.js for branch→tag mapping and explicit tag switching.

- Rules: .cursor/rules/git_workflow.mdc and .cursor/rules/test_workflow.mdc to steer behavior and ensure consistency.

- Test generation prompt: .claude/agents/surgical-test-generator.md.

## Proposed Components

- Orchestrator (tm-core): WorkflowOrchestrator (new)

  - State machine driving phases: Preflight → Branch/Tag → SubtaskIter (Red/Green/Commit) → Finalize → PR.

  - Exposes an evented API (progress events) that the CLI can render.

  - Stores run state artifacts.

- Test Runner Adapter

  - Detects and runs tests via the project's test command (e.g., npm test), with targeted runs where feasible.

  - API: runTargeted(files/pattern), runAll(), report summary (failures, duration, coverage), enforce the 80% threshold by default (an interface sketch for the adapters follows this list).

- Git/PR Adapter

  - Encapsulates git ops: branch create/checkout, add/commit, push.

  - Optional gh integration to open the PR; falls back to printed instructions if gh is unavailable.

  - Confirmation gates for branch creation and pushes.

- Prompt/Exec Adapter

  - Uses the existing executor service to call the selected coding assistant (initially claude) with tight prompts: task/subtask context, surgical tests first, then minimal code to green.

- Run State + Reporting

  - JSONL log of steps, timestamps, commands, test results.

  - Markdown summary for the PR description and post-run artifact.
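A rough TypeScript sketch of the adapter surfaces described above; the names and shapes are illustrative, not the final tm-core API:

```typescript
interface TestRunSummary {
  passed: number;
  failed: number;
  durationMs: number;
  coverage?: { lines: number; branches: number; functions: number; statements: number };
}

interface TestRunnerAdapter {
  detectCommand(): Promise<string>;                  // e.g. "npm test", read from package.json scripts
  runTargeted(pattern: string): Promise<TestRunSummary>;
  runAll(): Promise<TestRunSummary>;
}

interface GitAdapter {
  createBranch(name: string): Promise<void>;
  commit(message: string, paths: string[]): Promise<string>;  // returns the commit SHA
  push(branch: string): Promise<void>;
  openPullRequest(opts: { title: string; body: string; base: string }): Promise<string>; // PR URL
}
```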
## CLI UX (MVP)

- Command: tm autopilot [taskId]

- Flags: --dry-run, --no-push, --no-pr, --no-confirm, --force, --max-attempts <n>, --runner <auto|custom>, --commit-scope <scope>

- Output: compact header (project, tag, branch), current phase, subtask line, last test summary, next actions.

- Resume: If interrupted, tm autopilot --resume picks up from the last checkpoint in run state.

### TUI with tmux (Linear Execution)

- Left pane: Tag selector, task list (status/priority), start/expand shortcuts; "Start" triggers the next task or a selected task.

- Right pane: Executor terminal (tmux split) that runs the coding agent (claude-code/codex). Autopilot can hand over to the right pane during green.

- MCP integration: use MCP tools for task queries/updates and for shell/test invocations where available.

## TUI Layout (tmux-based)

### Pane Structure
```
|
||||
┌─────────────────────────────────────┬──────────────────────────────────┐
|
||||
│ Task Navigator (left) │ Executor Terminal (right) │
|
||||
│ │ │
|
||||
│ Project: my-app │ $ tm autopilot --executor-mode │
|
||||
│ Branch: analytics/task-42 │ > Running subtask 42.2 GREEN... │
|
||||
│ Tag: analytics │ > Implementing endpoint... │
|
||||
│ │ > Tests: 3 passed, 0 failed │
|
||||
│ Tasks: │ > Ready to commit │
|
||||
│ → 42 [in-progress] User metrics │ │
|
||||
│ → 42.1 [done] Schema │ [Live output from Claude Code] │
|
||||
│ → 42.2 [active] Endpoint ◀ │ │
|
||||
│ → 42.3 [pending] Dashboard │ │
|
||||
│ │ │
|
||||
│ [s] start [p] pause [q] quit │ │
|
||||
└─────────────────────────────────────┴──────────────────────────────────┘
|
||||
```
|
||||
|
||||
### Implementation Notes

- **Left pane**: `apps/cli/src/ui/tui/navigator.ts` (new, uses `blessed` or `ink`)
- **Right pane**: spawned via `tmux split-window -h` running `tm autopilot --executor-mode`
- **Communication**: shared state file `.taskmaster/state/current-run.json` + file watching or an event stream (see the watcher sketch after this list)
- **Keybindings**:
  - `s` - Start selected task
  - `p` - Pause/resume current run
  - `q` - Quit (with confirmation if a run is active)
  - `↑/↓` - Navigate task list
  - `Enter` - Expand/collapse subtasks
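A minimal sketch of the shared-state communication between the panes, assuming the navigator simply re-reads `.taskmaster/state/current-run.json` whenever it changes; the state shape is illustrative:

```typescript
import { readFileSync, watch } from 'node:fs';

interface CurrentRunState {
  taskId: string;
  subtaskId?: string;
  phase: 'RED' | 'GREEN' | 'COMMIT' | 'FINALIZE';
  status: 'running' | 'paused' | 'error' | 'done';
}

const STATE_FILE = '.taskmaster/state/current-run.json';

export function watchRunState(onChange: (state: CurrentRunState) => void): void {
  const read = () => {
    try {
      onChange(JSON.parse(readFileSync(STATE_FILE, 'utf8')));
    } catch {
      // File may be missing or mid-write; ignore and wait for the next event.
    }
  };
  read();
  watch(STATE_FILE, { persistent: false }, read);
}
```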
## Prompt Composition (Detailed)

### System Prompt Assembly

Prompts are composed in three layers (a small assembly sketch follows the list):

1. **Base rules** (loaded in order from `.cursor/rules/` and `.claude/agents/`):
   - `git_workflow.mdc` → git commit conventions, branch policy, PR guidelines
   - `test_workflow.mdc` → TDD loop requirements, coverage thresholds, test structure
   - `surgical-test-generator.md` → test generation methodology, project-specific test patterns

2. **Task context injection**:
   ```
   You are implementing:
   Task #42 [analytics]: User metrics tracking
   Subtask 42.2: Add collection endpoint

   Description:
   Implement POST /api/metrics endpoint to collect user metrics events

   Acceptance criteria:
   - POST /api/metrics accepts { userId, eventType, timestamp }
   - Validates input schema (reject missing/invalid fields)
   - Persists to database
   - Returns 201 on success with created record
   - Returns 400 on validation errors

   Dependencies:
   - Subtask 42.1 (metrics schema) is complete

   Current phase: RED (generate failing tests)
   Test command: npm test
   Test file convention: src/**/*.test.js (vitest framework detected)
   Branch: analytics/task-42-user-metrics
   Project language: JavaScript (Node.js)
   ```

3. **Phase-specific instructions**:
   - **RED phase**: "Generate minimal failing tests for this subtask. Do NOT implement any production code. Only create test files. Confirm tests fail with clear error messages indicating missing implementation."
   - **GREEN phase**: "Implement minimal code to pass the failing tests. Follow existing project patterns in `src/`. Only modify files necessary for this subtask. Keep changes focused and reviewable."
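A minimal sketch of the assembly step, assuming the prompt file paths come from the `prompts` section of `.taskmaster/config.json` shown later in this document; the function name is illustrative:

```typescript
import { readFileSync } from 'node:fs';
import { join } from 'node:path';

interface PromptConfig {
  rulesPath: string;          // e.g. ".cursor/rules"
  testGeneratorPath: string;  // e.g. ".claude/agents/surgical-test-generator.md"
  loadOrder: string[];        // e.g. ["git_workflow.mdc", "test_workflow.mdc"]
}

export function assembleSystemPrompt(
  cfg: PromptConfig,
  taskContext: string,
  phaseInstruction: string
): string {
  const baseRules = cfg.loadOrder.map((file) => readFileSync(join(cfg.rulesPath, file), 'utf8'));
  const testGenerator = readFileSync(cfg.testGeneratorPath, 'utf8');
  // Layer 1: base rules, layer 2: task context, layer 3: phase-specific instruction.
  return [...baseRules, testGenerator, taskContext, phaseInstruction].join('\n\n');
}
```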
### Example Full Prompt (RED Phase)
|
||||
|
||||
```markdown
|
||||
<SYSTEM PROMPT>
|
||||
[Contents of .cursor/rules/git_workflow.mdc]
|
||||
[Contents of .cursor/rules/test_workflow.mdc]
|
||||
[Contents of .claude/agents/surgical-test-generator.md]
|
||||
|
||||
<TASK CONTEXT>
|
||||
You are implementing:
|
||||
Task #42.2: Add collection endpoint
|
||||
|
||||
Description:
|
||||
Implement POST /api/metrics endpoint to collect user metrics events
|
||||
|
||||
Acceptance criteria:
|
||||
- POST /api/metrics accepts { userId, eventType, timestamp }
|
||||
- Validates input schema (reject missing/invalid fields)
|
||||
- Persists to database using MetricsSchema from subtask 42.1
|
||||
- Returns 201 on success with created record
|
||||
- Returns 400 on validation errors with details
|
||||
|
||||
Dependencies: Subtask 42.1 (metrics schema) is complete
|
||||
|
||||
<INSTRUCTION>
|
||||
Generate failing tests for this subtask. Follow project conventions:
|
||||
- Test file: src/api/__tests__/metrics.test.js
|
||||
- Framework: vitest (detected from package.json)
|
||||
- Test cases to cover:
|
||||
* POST /api/metrics with valid payload → should return 201 (will fail: endpoint not implemented)
|
||||
* POST /api/metrics with missing userId → should return 400 (will fail: validation not implemented)
|
||||
* POST /api/metrics with invalid timestamp → should return 400 (will fail: validation not implemented)
|
||||
* POST /api/metrics should persist to database → should save record (will fail: persistence not implemented)
|
||||
|
||||
Do NOT implement the endpoint code yet. Only create test file(s).
|
||||
Confirm tests fail with messages like "Cannot POST /api/metrics" or "endpoint not defined".
|
||||
|
||||
Output format:
|
||||
1. File path to create: src/api/__tests__/metrics.test.js
|
||||
2. Complete test code
|
||||
3. Command to run: npm test src/api/__tests__/metrics.test.js
|
||||
```
|
||||
|
||||
### Example Full Prompt (GREEN Phase)
|
||||
|
||||
```markdown
|
||||
<SYSTEM PROMPT>
|
||||
[Contents of .cursor/rules/git_workflow.mdc]
|
||||
[Contents of .cursor/rules/test_workflow.mdc]
|
||||
|
||||
<TASK CONTEXT>
|
||||
Task #42.2: Add collection endpoint
|
||||
[same context as RED phase]
|
||||
|
||||
<CURRENT STATE>
|
||||
Tests created in RED phase:
|
||||
- src/api/__tests__/metrics.test.js
|
||||
- 5 tests written, all failing as expected
|
||||
|
||||
Test output:
|
||||
```
|
||||
FAIL src/api/__tests__/metrics.test.js
|
||||
POST /api/metrics
|
||||
✗ should return 201 with valid payload (endpoint not found)
|
||||
✗ should return 400 with missing userId (endpoint not found)
|
||||
✗ should return 400 with invalid timestamp (endpoint not found)
|
||||
✗ should persist to database (endpoint not found)
|
||||
```
|
||||
|
||||
<INSTRUCTION>
|
||||
Implement minimal code to make all tests pass.
|
||||
|
||||
Guidelines:
|
||||
- Create/modify file: src/api/metrics.js
|
||||
- Use existing patterns from src/api/ (e.g., src/api/users.js for reference)
|
||||
- Import MetricsSchema from subtask 42.1 (src/models/schema.js)
|
||||
- Implement validation, persistence, and response handling
|
||||
- Follow project error handling conventions
|
||||
- Keep implementation focused on this subtask only
|
||||
|
||||
After implementation:
|
||||
1. Run tests: npm test src/api/__tests__/metrics.test.js
|
||||
2. Confirm all 5 tests pass
|
||||
3. Report results
|
||||
|
||||
Output format:
|
||||
1. File(s) created/modified
|
||||
2. Implementation code
|
||||
3. Test command and results
|
||||
```
|
||||
|
||||
### Prompt Loading Configuration

See `.taskmaster/config.json` → `prompts` section for paths and load order.

## Configuration Schema

### .taskmaster/config.json

```json
{
  "autopilot": {
    "enabled": true,
    "requireCleanWorkingTree": true,
    "commitTemplate": "{type}({scope}): {msg}",
    "defaultCommitType": "feat",
    "maxGreenAttempts": 3,
    "testTimeout": 300000
  },
  "test": {
    "runner": "auto",
    "coverageThresholds": {
      "lines": 80,
      "branches": 80,
      "functions": 80,
      "statements": 80
    },
    "targetedRunPattern": "**/*.test.js"
  },
  "git": {
    "branchPattern": "{tag}/task-{id}-{slug}",
    "pr": {
      "enabled": true,
      "base": "default",
      "bodyTemplate": ".taskmaster/templates/pr-body.md"
    }
  },
  "prompts": {
    "rulesPath": ".cursor/rules",
    "testGeneratorPath": ".claude/agents/surgical-test-generator.md",
    "loadOrder": ["git_workflow.mdc", "test_workflow.mdc"]
  }
}
```

### Configuration Fields

#### autopilot
- `enabled` (boolean): Enable/disable autopilot functionality
- `requireCleanWorkingTree` (boolean): Require clean git state before starting
- `commitTemplate` (string): Template for commit messages (tokens: `{type}`, `{scope}`, `{msg}`)
- `defaultCommitType` (string): Default commit type (feat, fix, chore, etc.)
- `maxGreenAttempts` (number): Maximum retry attempts to achieve green tests (default: 3)
- `testTimeout` (number): Timeout in milliseconds per test run (default: 300000 = 5 min)

#### test
- `runner` (string): Test runner detection mode (`"auto"` or an explicit command like `"npm test"`)
- `coverageThresholds` (object): Minimum coverage percentages required
  - `lines`, `branches`, `functions`, `statements` (number): Threshold percentages (0-100)
- `targetedRunPattern` (string): Glob pattern for targeted subtask test runs

#### git
- `branchPattern` (string): Branch naming pattern (tokens: `{tag}`, `{id}`, `{slug}`)
- `pr.enabled` (boolean): Enable automatic PR creation
- `pr.base` (string): Target branch for PRs (`"default"` uses the repo default, or specify one like `"main"`)
- `pr.bodyTemplate` (string): Path to a PR body template file (optional)

#### prompts
- `rulesPath` (string): Directory containing rule files (e.g., `.cursor/rules`)
- `testGeneratorPath` (string): Path to the test generator prompt file
- `loadOrder` (array): Order in which to load rule files from `rulesPath`

### Environment Variables

```bash
# Required for executor
ANTHROPIC_API_KEY=sk-ant-...   # Claude API key

# Optional: for PR creation
GITHUB_TOKEN=ghp_...           # GitHub personal access token

# Optional: for other executors (future)
OPENAI_API_KEY=sk-...
GOOGLE_API_KEY=...
```
## Run Artifacts & Observability
|
||||
|
||||
### Per-Run Artifact Structure
|
||||
|
||||
Each autopilot run creates a timestamped directory with complete traceability:
|
||||
|
||||
```
|
||||
.taskmaster/reports/runs/2025-01-15-142033/
|
||||
├── manifest.json # run metadata (task id, start/end time, status)
|
||||
├── log.jsonl # timestamped event stream
|
||||
├── commits.txt # list of commit SHAs made during run
|
||||
├── test-results/
|
||||
│ ├── subtask-42.1-red.json
|
||||
│ ├── subtask-42.1-green.json
|
||||
│ ├── subtask-42.2-red.json
|
||||
│ ├── subtask-42.2-green-attempt1.json
|
||||
│ ├── subtask-42.2-green-attempt2.json
|
||||
│ ├── subtask-42.2-green-attempt3.json
|
||||
│ └── final-suite.json
|
||||
└── pr.md # generated PR body
|
||||
```
|
||||
|
||||
### manifest.json Format
|
||||
|
||||
```json
|
||||
{
|
||||
"runId": "2025-01-15-142033",
|
||||
"taskId": "42",
|
||||
"tag": "analytics",
|
||||
"branch": "analytics/task-42-user-metrics",
|
||||
"startTime": "2025-01-15T14:20:33Z",
|
||||
"endTime": "2025-01-15T14:45:12Z",
|
||||
"status": "completed",
|
||||
"subtasksCompleted": ["42.1", "42.2", "42.3"],
|
||||
"subtasksFailed": [],
|
||||
"totalCommits": 3,
|
||||
"prUrl": "https://github.com/org/repo/pull/123",
|
||||
"finalCoverage": {
|
||||
"lines": 85.3,
|
||||
"branches": 82.1,
|
||||
"functions": 88.9,
|
||||
"statements": 85.0
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### log.jsonl Format
|
||||
|
||||
Event stream in JSON Lines format for easy parsing and debugging:
|
||||
|
||||
```jsonl
|
||||
{"ts":"2025-01-15T14:20:33Z","phase":"preflight","status":"ok","details":{"testCmd":"npm test","gitClean":true}}
|
||||
{"ts":"2025-01-15T14:20:45Z","phase":"branch","status":"ok","branch":"analytics/task-42-user-metrics"}
|
||||
{"ts":"2025-01-15T14:21:00Z","phase":"red","subtask":"42.1","status":"ok","tests":{"failed":3,"passed":0}}
|
||||
{"ts":"2025-01-15T14:22:15Z","phase":"green","subtask":"42.1","status":"ok","tests":{"passed":3,"failed":0},"attempts":2}
|
||||
{"ts":"2025-01-15T14:22:20Z","phase":"commit","subtask":"42.1","status":"ok","sha":"a1b2c3d","message":"feat(metrics): add metrics schema (task 42.1)"}
|
||||
{"ts":"2025-01-15T14:23:00Z","phase":"red","subtask":"42.2","status":"ok","tests":{"failed":5,"passed":0}}
|
||||
{"ts":"2025-01-15T14:25:30Z","phase":"green","subtask":"42.2","status":"error","tests":{"passed":3,"failed":2},"attempts":3,"error":"Max attempts reached"}
|
||||
{"ts":"2025-01-15T14:25:35Z","phase":"pause","reason":"max_attempts","nextAction":"manual_review"}
|
||||
```
|
||||
|
||||
### Test Results Format
|
||||
|
||||
Each test run stores detailed results:
|
||||
|
||||
```json
|
||||
{
|
||||
"subtask": "42.2",
|
||||
"phase": "green",
|
||||
"attempt": 3,
|
||||
"timestamp": "2025-01-15T14:25:30Z",
|
||||
"command": "npm test src/api/__tests__/metrics.test.js",
|
||||
"exitCode": 1,
|
||||
"duration": 2340,
|
||||
"summary": {
|
||||
"total": 5,
|
||||
"passed": 3,
|
||||
"failed": 2,
|
||||
"skipped": 0
|
||||
},
|
||||
"failures": [
|
||||
{
|
||||
"test": "POST /api/metrics should return 201 with valid payload",
|
||||
"error": "Expected status 201, got 500",
|
||||
"stack": "..."
|
||||
}
|
||||
],
|
||||
"coverage": {
|
||||
"lines": 78.5,
|
||||
"branches": 75.0,
|
||||
"functions": 80.0,
|
||||
"statements": 78.5
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Execution Model

### Orchestration vs Direct Execution

The autopilot system uses an **orchestration model** rather than direct code execution:

**Orchestrator Role** (tm-core WorkflowOrchestrator):
- Maintains a state machine tracking the current phase (RED/GREEN/COMMIT) per subtask
- Validates preconditions (tests pass, git state clean, etc.)
- Returns "work units" describing what needs to be done next
- Records completion and advances to the next phase
- Persists state for resumability

**Executor Role** (Claude Code/AI session via MCP):
- Queries the orchestrator for the next work unit
- Executes the work (generates tests, writes code, runs tests, makes commits)
- Reports results back to the orchestrator
- Handles file operations and tool invocations

**Why This Approach?**
- Leverages existing AI capabilities (Claude Code) rather than duplicating them
- The MCP protocol provides clean separation between state management and execution
- Allows human oversight and intervention at each phase
- Simpler to implement: the orchestrator is pure state logic, no code generation needed
- Enables multiple executor types (Claude Code, other AI tools, human developers)

**Example Flow**:
```typescript
// Claude Code (via MCP) queries the orchestrator
const workUnit = await orchestrator.getNextWorkUnit('42');
// => {
//   phase: 'RED',
//   subtask: '42.1',
//   action: 'Generate failing tests for metrics schema',
//   context: { title, description, dependencies, testFile: 'src/__tests__/schema.test.js' }
// }

// Claude Code executes the work (writes the test file, runs tests)
// Then reports back
await orchestrator.completeWorkUnit('42', '42.1', 'RED', {
  success: true,
  testsCreated: ['src/__tests__/schema.test.js'],
  testsFailed: 3
});

// Query again for the next phase
const nextWorkUnit = await orchestrator.getNextWorkUnit('42');
// => { phase: 'GREEN', subtask: '42.1', action: 'Implement code to pass tests', ... }
```
## Design Decisions

### Why commit per subtask instead of per task?

**Decision**: Commit after each subtask's green state, not after the entire task.

**Rationale**:
- Atomic commits make code review easier (reviewers can see the logical progression)
- Easier to revert a single subtask if it causes issues downstream
- Matches the TDD loop's natural checkpoint and cognitive boundary
- Provides resumability points if the run is interrupted

**Trade-off**: More commits per task (can use squash-merge in PRs if desired)

### Why not support parallel subtask execution?

**Decision**: Sequential subtask execution in Phase 1; parallel execution deferred to Phase 3.

**Rationale**:
- Subtasks often have implicit dependencies (e.g., schema before endpoint, endpoint before UI)
- Simpler orchestrator state machine (less complexity = faster to ship)
- Parallel execution requires an explicit dependency DAG and conflict resolution
- Can be added in Phase 3 once the core workflow is proven stable

**Trade-off**: Slower for truly independent subtasks (mitigated by keeping subtasks small and focused)

### Why require 80% coverage by default?

**Decision**: Enforce an 80% coverage threshold (lines/branches/functions/statements) before allowing commits.

**Rationale**:
- Industry-standard baseline for production code quality
- Forces test generation to be comprehensive, not superficial
- Configurable per project via `.taskmaster/config.json` if too strict
- Prevents "green tests" that only test happy paths

**Trade-off**: May require more test generation iterations; can be lowered per project

### Why use tmux instead of a rich GUI?

**Decision**: The MVP uses tmux split panes for the TUI, not an Electron/web-based GUI.

**Rationale**:
- Tmux is universally available on dev machines; no installation burden
- Terminal-first workflows match the developer mental model (no context switching)
- Simpler to implement and maintain; a GUI can be added later via extensions
- State stored in files allows IDE/extension integration without coupling

**Trade-off**: Less visual polish than a GUI; requires tmux familiarity

### Why not support multiple executors (codex/gemini/claude) in Phase 1?

**Decision**: Start with the Claude executor only; add others in Phase 2+.

**Rationale**:
- Reduces scope and complexity for the initial delivery
- Claude Code is already integrated with the existing executor service
- The executor abstraction already exists; adding more is straightforward later
- Different executors may need different prompt strategies (requires experimentation)

**Trade-off**: Users are locked to Claude initially; can work around with manual executor selection

## Risks and Mitigations

- Model hallucination/large diffs: restrict prompt scope; enforce minimal changes; show diff previews (optional) before commit.

- Flaky tests: allow retries, isolate targeted runs for speed, then run the full suite before commit.

- Environment variability: detect runners/tools; provide fallbacks and actionable errors.

- PR creation fails: still push and print manual commands; persist the PR body for reuse.

## Open Questions

1) Slugging rules for branch names; any length limits or normalization beyond the {slug} token sanitization?

2) PR body standard sections beyond the run report (e.g., checklist, coverage table)?

3) Default executor prompt fine-tuning once codex/gemini integration is available.

4) Where should persistent TUI state (pane layout, last selection) be stored (e.g., .taskmaster/state.json)?

## Branch Naming

- Include both the tag and the task id in the branch name to make lineage explicit.

- Default pattern: <tag>/task-<id>[-slug] (e.g., master/task-12, tag-analytics/task-4-user-auth).

- Configurable via .taskmaster/config.json: git.branchPattern supports tokens {tag}, {id}, {slug} (see the sketch below).
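A minimal sketch of expanding the `git.branchPattern` tokens into a branch name. Slug normalization rules are one of the open questions above, so the slugify step and length cap here are only illustrative:

```typescript
export function buildBranchName(
  pattern: string,                       // e.g. "{tag}/task-{id}-{slug}"
  task: { id: string; title: string },
  tag: string
): string {
  const slug = task.title
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-')  // collapse non-alphanumerics to dashes
    .replace(/^-+|-+$/g, '')
    .slice(0, 40);                // arbitrary length cap for readability
  return pattern
    .replace('{tag}', tag)
    .replace('{id}', task.id)
    .replace('{slug}', slug);
}

// buildBranchName('{tag}/task-{id}-{slug}', { id: '42', title: 'User metrics tracking' }, 'analytics')
// => "analytics/task-42-user-metrics-tracking"
```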
## PR Base Branch

- Use the repository's default branch (detected via git) unless overridden.

- Title format: Task #<id> [<tag>]: <title>.

## RPG Mapping (Repository Planning Graph)

Functional nodes (capabilities):

- Autopilot Orchestration → drives the TDD loop and lifecycle
- Test Generation (Surgical) → produces failing tests from subtask context
- Test Execution + Coverage → runs the suite, enforces thresholds
- Git/Branch/PR Management → safe operations and PR creation
- TUI/Terminal Integration → interactive control and visibility via tmux
- MCP Integration → structured task/status/context operations

Structural nodes (code organization):

- packages/tm-core:
  - services/workflow-orchestrator.ts (new)
  - services/test-runner-adapter.ts (new)
  - services/git-adapter.ts (new)
  - existing: task-service.ts, task-execution-service.ts, executors/*
- apps/cli:
  - src/commands/autopilot.command.ts (new)
  - src/ui/tui/ (new tmux/TUI helpers)
- scripts/modules:
  - reuse utils/git-utils.js, task-manager/tag-management.js
- .claude/agents/:
  - surgical-test-generator.md

Edges (data/control flow):

- Autopilot → Test Generation → Test Execution → Git Commit → loop
- Autopilot → Git Adapter (branch, tag, PR)
- Autopilot → TUI (event stream) → tmux pane control
- Autopilot → MCP tools for task/status updates
- Test Execution → Coverage gate → Autopilot decision

Topological traversal (implementation order):

1) Git/Test adapters (foundations)
2) Orchestrator skeleton + events
3) CLI autopilot command and dry-run
4) Surgical test-gen integration and execution gate
5) PR creation, run reports, resumability

## Phased Roadmap

- Phase 0: Spike

  - Implement the CLI skeleton tm autopilot with a dry-run showing planned steps from a real task + subtasks.
  - Detect the test runner (package.json) and git state; render a preflight report.

- Phase 1: Core Rails (State Machine & Orchestration)

  - Implement WorkflowOrchestrator in tm-core as a **state machine** that tracks TDD phases per subtask.
  - The orchestrator **guides** the current AI session (Claude Code/MCP client) rather than executing code itself.
  - Add Git/Test adapters for status checks and validation (not direct execution).
  - WorkflowOrchestrator API:
    - `getNextWorkUnit(taskId)` → returns the next phase to execute (RED/GREEN/COMMIT) with context
    - `completeWorkUnit(taskId, subtaskId, phase, result)` → records completion and advances state
    - `getRunState(taskId)` → returns current progress and resumability data
  - MCP integration: expose work unit endpoints so Claude Code can query "what to do next" and report back.
  - Branch/tag mapping via existing tag-management APIs.
  - Run report persisted under .taskmaster/reports/runs/ with state checkpoints for resumability.

- Phase 2: PR + Resumability

  - Add gh PR creation with a well-formed body using the run report.
  - Introduce resumable checkpoints and the --resume flag.
  - Add coverage enforcement and an optional lint/format step.

- Phase 3: Extensibility + Guardrails

  - Add support for basic pytest/go test adapters.
  - Add safeguards: diff preview mode, manual confirm gates, aggressive minimal-change prompts.
  - Optional: a small TUI panel and extension panel leveraging the same run state file.

## References (Repo)

- Test Workflow: .cursor/rules/test_workflow.mdc
- Git Workflow: .cursor/rules/git_workflow.mdc
- CLI: apps/cli/src/commands/start.command.ts, apps/cli/src/ui/components/*.ts
- Core Services: packages/tm-core/src/services/task-service.ts, task-execution-service.ts
- Executors: packages/tm-core/src/executors/*
- Git Utilities: scripts/modules/utils/git-utils.js
- Tag Management: scripts/modules/task-manager/tag-management.js
- Surgical Test Generator: .claude/agents/surgical-test-generator.md
.taskmaster/docs/tdd-workflow-phase-0-spike.md (new file, 130 lines)
@@ -0,0 +1,130 @@
# Phase 0: Spike - Autonomous TDD Workflow ✅ COMPLETE

## Objective
Validate feasibility and build foundational understanding before full implementation.

## Status
**COMPLETED** - All deliverables implemented and validated.

See `apps/cli/src/commands/autopilot.command.ts` for the implementation.

## Scope
- Implement CLI skeleton `tm autopilot` with dry-run mode
- Show planned steps from a real task with subtasks
- Detect test runner from package.json
- Detect git state and render a preflight report

## Deliverables

### 1. CLI Command Skeleton
- Create `apps/cli/src/commands/autopilot.command.ts`
- Support the `tm autopilot <taskId>` command
- Implement the `--dry-run` flag
- Basic help text and usage information

### 2. Preflight Detection System
- Detect test runner from package.json (npm test, pnpm test, etc.)
- Check git working tree state (clean/dirty)
- Validate required tools are available (git, gh, node/npm)
- Detect the default branch

### 3. Dry-Run Execution Plan Display
Display the planned execution for a task including:
- Preflight checks status
- Branch name that would be created
- Tag that would be set
- List of subtasks in execution order
- For each subtask:
  - RED phase: test file that would be created
  - GREEN phase: implementation files that would be modified
  - COMMIT: commit message that would be used
- Finalization steps: test suite run, coverage check, push, PR creation

### 4. Task Loading & Validation
- Load the task from TaskMaster state
- Validate the task exists and has subtasks
- If there are no subtasks, show a message about needing to expand first
- Show dependency order for subtasks

## Example Output
```bash
|
||||
$ tm autopilot 42 --dry-run
|
||||
|
||||
Autopilot Plan for Task #42 [analytics]: User metrics tracking
|
||||
─────────────────────────────────────────────────────────────
|
||||
|
||||
Preflight Checks:
|
||||
✓ Working tree is clean
|
||||
✓ Test command detected: npm test
|
||||
✓ Tools available: git, gh, node, npm
|
||||
✓ Current branch: main (will create new branch)
|
||||
✓ Task has 3 subtasks ready to execute
|
||||
|
||||
Branch & Tag:
|
||||
→ Will create branch: analytics/task-42-user-metrics
|
||||
→ Will set active tag: analytics
|
||||
|
||||
Execution Plan (3 subtasks):
|
||||
|
||||
1. Subtask 42.1: Add metrics schema
|
||||
RED: Generate tests → src/__tests__/schema.test.js
|
||||
GREEN: Implement code → src/schema.js
|
||||
COMMIT: "feat(metrics): add metrics schema (task 42.1)"
|
||||
|
||||
2. Subtask 42.2: Add collection endpoint [depends on 42.1]
|
||||
RED: Generate tests → src/api/__tests__/metrics.test.js
|
||||
GREEN: Implement code → src/api/metrics.js
|
||||
COMMIT: "feat(metrics): add collection endpoint (task 42.2)"
|
||||
|
||||
3. Subtask 42.3: Add dashboard widget [depends on 42.2]
|
||||
RED: Generate tests → src/components/__tests__/MetricsWidget.test.jsx
|
||||
GREEN: Implement code → src/components/MetricsWidget.jsx
|
||||
COMMIT: "feat(metrics): add dashboard widget (task 42.3)"
|
||||
|
||||
Finalization:
|
||||
→ Run full test suite with coverage (threshold: 80%)
|
||||
→ Push branch to origin (will confirm)
|
||||
→ Create PR targeting main
|
||||
|
||||
Estimated commits: 3
|
||||
Estimated duration: ~20-30 minutes (depends on implementation complexity)
|
||||
|
||||
Run without --dry-run to execute.
|
||||
```
|
||||
|
||||
## Success Criteria
- Dry-run output is clear and matches the expected workflow
- Preflight detection works correctly on the project repo
- Task loading integrates with existing TaskMaster state
- No actual git operations or file modifications occur in dry-run mode

## Out of Scope
- Actual test generation
- Actual code implementation
- Git operations (branch creation, commits, push)
- PR creation
- Test execution

## Implementation Notes
- Reuse the existing `TaskService` from `packages/tm-core`
- Use existing git utilities from `scripts/modules/utils/git-utils.js`
- Load task/subtask data from `.taskmaster/tasks/tasks.json`
- Detect the test command via the package.json → scripts.test field (see the sketch below)
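A minimal sketch of the package.json detection mentioned above; the package-manager choice is simplified (the real implementation may also check yarn/bun lockfiles or the `test.runner` config field):

```typescript
import { existsSync, readFileSync } from 'node:fs';
import { join } from 'node:path';

/** Returns the command to run the project's test suite, or null if none is defined. */
export function detectTestCommand(projectRoot: string): string | null {
  const pkgPath = join(projectRoot, 'package.json');
  if (!existsSync(pkgPath)) return null;
  const pkg = JSON.parse(readFileSync(pkgPath, 'utf8'));
  if (!pkg.scripts?.test) return null;
  // Prefer pnpm when a pnpm lockfile is present, otherwise fall back to npm.
  return existsSync(join(projectRoot, 'pnpm-lock.yaml')) ? 'pnpm test' : 'npm test';
}
```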
## Dependencies
|
||||
- Existing TaskMaster CLI structure
|
||||
- Existing task storage format
|
||||
- Git utilities
|
||||
|
||||
## Estimated Effort
|
||||
2-3 days
|
||||
|
||||
## Validation
|
||||
Test dry-run mode with:
|
||||
- Task with 1 subtask
|
||||
- Task with multiple subtasks
|
||||
- Task with dependencies between subtasks
|
||||
- Task without subtasks (should show warning)
|
||||
- Dirty git working tree (should warn)
|
||||
- Missing tools (should error with helpful message)
|
||||
.taskmaster/docs/tdd-workflow-phase-1-core-rails.md (new file, 1144 lines): diff suppressed because it is too large
.taskmaster/docs/tdd-workflow-phase-1-orchestrator.md (new file, 369 lines):
|
||||
# Phase 1: Core Rails - State Machine & Orchestration
|
||||
|
||||
## Objective
|
||||
Build the WorkflowOrchestrator as a state machine that guides AI sessions through TDD workflow, rather than directly executing code.
|
||||
|
||||
## Architecture Overview
|
||||
|
||||
### Execution Model
|
||||
The orchestrator acts as a **state manager and guide**, not a code executor:
|
||||
|
||||
```
|
||||
┌─────────────────────────────────────────────────────────────┐
|
||||
│ Claude Code (MCP Client) │
|
||||
│ - Queries "what to do next" │
|
||||
│ - Executes work (writes tests, code, runs commands) │
|
||||
│ - Reports completion │
|
||||
└────────────────┬────────────────────────────────────────────┘
|
||||
│ MCP Protocol
|
||||
▼
|
||||
┌─────────────────────────────────────────────────────────────┐
|
||||
│ WorkflowOrchestrator (tm-core) │
|
||||
│ - Maintains state machine (RED → GREEN → COMMIT) │
|
||||
│ - Returns work units with context │
|
||||
│ - Validates preconditions │
|
||||
│ - Records progress │
|
||||
│ - Persists state for resumability │
|
||||
└─────────────────────────────────────────────────────────────┘
|
||||
```
|
||||
|
||||
### Why This Approach?
|
||||
1. **Separation of Concerns**: State management separate from code execution
|
||||
2. **Leverage Existing Tools**: Uses Claude Code's capabilities instead of reimplementing
|
||||
3. **Human-in-the-Loop**: Easy to inspect state and intervene at any phase
|
||||
4. **Simpler Implementation**: Orchestrator is pure logic, no AI model integration needed
|
||||
5. **Flexible Executors**: Any tool (Claude Code, human, other AI) can execute work units
|
||||
|
||||
## Core Components
|
||||
|
||||
### 1. WorkflowOrchestrator Service
|
||||
**Location**: `packages/tm-core/src/services/workflow-orchestrator.service.ts`
|
||||
|
||||
**Responsibilities**:
|
||||
- Track current phase (RED/GREEN/COMMIT) per subtask
|
||||
- Generate work units with context for each phase
|
||||
- Validate phase completion criteria
|
||||
- Advance state machine on successful completion
|
||||
- Handle errors and retry logic
|
||||
- Persist run state for resumability
|
||||
|
||||
**API**:
|
||||
```typescript
|
||||
interface WorkflowOrchestrator {
|
||||
// Start a new autopilot run
|
||||
startRun(taskId: string, options?: RunOptions): Promise<RunContext>;
|
||||
|
||||
// Get next work unit to execute
|
||||
getNextWorkUnit(runId: string): Promise<WorkUnit | null>;
|
||||
|
||||
// Report work unit completion
|
||||
completeWorkUnit(
|
||||
runId: string,
|
||||
workUnitId: string,
|
||||
result: WorkUnitResult
|
||||
): Promise<void>;
|
||||
|
||||
// Get current run state
|
||||
getRunState(runId: string): Promise<RunState>;
|
||||
|
||||
// Pause/resume
|
||||
pauseRun(runId: string): Promise<void>;
|
||||
resumeRun(runId: string): Promise<void>;
|
||||
}
|
||||
|
||||
interface WorkUnit {
|
||||
id: string; // Unique work unit ID
|
||||
phase: 'RED' | 'GREEN' | 'COMMIT';
|
||||
subtaskId: string; // e.g., "42.1"
|
||||
action: string; // Human-readable description
|
||||
context: WorkUnitContext; // All info needed to execute
|
||||
preconditions: Precondition[]; // Checks before execution
|
||||
}
|
||||
|
||||
interface WorkUnitContext {
|
||||
taskId: string;
|
||||
taskTitle: string;
|
||||
subtaskTitle: string;
|
||||
subtaskDescription: string;
|
||||
dependencies: string[]; // Completed subtask IDs
|
||||
testCommand: string; // e.g., "npm test"
|
||||
|
||||
// Phase-specific context
|
||||
redPhase?: {
|
||||
testFile: string; // Where to create test
|
||||
testFramework: string; // e.g., "vitest"
|
||||
acceptanceCriteria: string[];
|
||||
};
|
||||
|
||||
greenPhase?: {
|
||||
testFile: string; // Test to make pass
|
||||
implementationHints: string[];
|
||||
expectedFiles: string[]; // Files likely to modify
|
||||
};
|
||||
|
||||
commitPhase?: {
|
||||
commitMessage: string; // Pre-generated message
|
||||
filesToCommit: string[]; // Files modified in RED+GREEN
|
||||
};
|
||||
}
|
||||
|
||||
interface WorkUnitResult {
|
||||
success: boolean;
|
||||
phase: 'RED' | 'GREEN' | 'COMMIT';
|
||||
|
||||
// RED phase results
|
||||
testsCreated?: string[];
|
||||
testsFailed?: number;
|
||||
|
||||
// GREEN phase results
|
||||
testsPassed?: number;
|
||||
filesModified?: string[];
|
||||
attempts?: number;
|
||||
|
||||
// COMMIT phase results
|
||||
commitSha?: string;
|
||||
|
||||
// Common
|
||||
error?: string;
|
||||
logs?: string;
|
||||
}
|
||||
|
||||
interface RunState {
|
||||
runId: string;
|
||||
taskId: string;
|
||||
status: 'running' | 'paused' | 'completed' | 'failed';
|
||||
currentPhase: 'RED' | 'GREEN' | 'COMMIT';
|
||||
currentSubtask: string;
|
||||
completedSubtasks: string[];
|
||||
failedSubtasks: string[];
|
||||
startTime: Date;
|
||||
lastUpdateTime: Date;
|
||||
|
||||
// Resumability
|
||||
checkpoint: {
|
||||
subtaskId: string;
|
||||
phase: 'RED' | 'GREEN' | 'COMMIT';
|
||||
attemptNumber: number;
|
||||
};
|
||||
}
|
||||
```
|
||||
|
||||
### 2. State Machine Logic
|
||||
|
||||
**Phase Transitions**:
|
||||
```
|
||||
START → RED(subtask 1) → GREEN(subtask 1) → COMMIT(subtask 1)
|
||||
↓
|
||||
RED(subtask 2) ← ─ ─ ─ ┘
|
||||
↓
|
||||
GREEN(subtask 2)
|
||||
↓
|
||||
COMMIT(subtask 2)
|
||||
↓
|
||||
(repeat for remaining subtasks)
|
||||
↓
|
||||
FINALIZE → END
|
||||
```
|
||||
|
||||
**Phase Rules**:
|
||||
- **RED**: Can only transition to GREEN if tests created and failing
|
||||
- **GREEN**: Can only transition to COMMIT if tests passing (attempt < maxAttempts)
|
||||
- **COMMIT**: Can only transition to next RED if commit successful
|
||||
- **FINALIZE**: Can only start if all subtasks completed
|
||||
|
||||
**Preconditions**:
|
||||
- RED: No uncommitted changes (or staged from previous GREEN that failed)
|
||||
- GREEN: RED phase complete, tests exist and are failing
|
||||
- COMMIT: GREEN phase complete, all tests passing, coverage meets threshold
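
These rules and preconditions reduce to a small, pure transition function. A minimal sketch; the types and the `maxAttempts` default are illustrative rather than the final API:

```typescript
type Phase = 'RED' | 'GREEN' | 'COMMIT';

interface PhaseResult {
  testsFailing?: number;  // RED: number of failing tests created
  testsPassing?: boolean; // GREEN: whether the targeted tests now pass
  commitSha?: string;     // COMMIT: resulting commit
  attempts?: number;      // GREEN: attempts used so far
}

// Pure transition check mirroring the phase rules above.
export function nextPhase(
  current: Phase,
  result: PhaseResult,
  maxAttempts = 3
): Phase | 'BLOCKED' {
  switch (current) {
    case 'RED':
      // RED → GREEN only when tests exist and are failing
      return (result.testsFailing ?? 0) > 0 ? 'GREEN' : 'BLOCKED';
    case 'GREEN':
      // GREEN → COMMIT only when tests pass within the attempt budget
      return result.testsPassing && (result.attempts ?? 1) <= maxAttempts
        ? 'COMMIT'
        : 'BLOCKED';
    case 'COMMIT':
      // COMMIT → next subtask's RED only after a successful commit
      return result.commitSha ? 'RED' : 'BLOCKED';
  }
}
```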
|
||||
|
||||
### 3. MCP Integration
|
||||
|
||||
**New MCP Tools** (expose WorkflowOrchestrator via MCP):
|
||||
```typescript
|
||||
// Start an autopilot run
|
||||
mcp__task_master_ai__autopilot_start(taskId: string, dryRun?: boolean)
|
||||
|
||||
// Get next work unit
|
||||
mcp__task_master_ai__autopilot_next_work_unit(runId: string)
|
||||
|
||||
// Complete current work unit
|
||||
mcp__task_master_ai__autopilot_complete_work_unit(
|
||||
runId: string,
|
||||
workUnitId: string,
|
||||
result: WorkUnitResult
|
||||
)
|
||||
|
||||
// Get run state
|
||||
mcp__task_master_ai__autopilot_get_state(runId: string)
|
||||
|
||||
// Pause/resume
|
||||
mcp__task_master_ai__autopilot_pause(runId: string)
|
||||
mcp__task_master_ai__autopilot_resume(runId: string)
|
||||
```
|
||||
|
||||
### 4. Git/Test Adapters
|
||||
|
||||
**GitAdapter** (`packages/tm-core/src/services/git-adapter.service.ts`):
|
||||
- Check working tree status
|
||||
- Validate branch state
|
||||
- Read git config (user, remote, default branch)
|
||||
- **Does NOT execute** git commands (that's executor's job)
|
||||
|
||||
**TestAdapter** (`packages/tm-core/src/services/test-adapter.service.ts`):
|
||||
- Detect test framework from package.json
|
||||
- Parse test output (failures, passes, coverage)
|
||||
- Validate coverage thresholds
|
||||
- **Does NOT run** tests (that's executor's job)
|
||||
|
||||
### 5. Run State Persistence
|
||||
|
||||
**Storage Location**: `.taskmaster/reports/runs/<runId>/`
|
||||
|
||||
**Files**:
|
||||
- `state.json` - Current run state (for resumability)
|
||||
- `log.jsonl` - Event stream (timestamped work unit completions)
|
||||
- `manifest.json` - Run metadata
|
||||
- `work-units.json` - All work units generated for this run
|
||||
|
||||
**Example `state.json`**:
|
||||
```json
|
||||
{
|
||||
"runId": "2025-01-15-142033",
|
||||
"taskId": "42",
|
||||
"status": "paused",
|
||||
"currentPhase": "GREEN",
|
||||
"currentSubtask": "42.2",
|
||||
"completedSubtasks": ["42.1"],
|
||||
"failedSubtasks": [],
|
||||
"checkpoint": {
|
||||
"subtaskId": "42.2",
|
||||
"phase": "GREEN",
|
||||
"attemptNumber": 2
|
||||
},
|
||||
"startTime": "2025-01-15T14:20:33Z",
|
||||
"lastUpdateTime": "2025-01-15T14:35:12Z"
|
||||
}
|
||||
```
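
A minimal persistence sketch for these files, assuming Node's `fs` module and the `RunState` shape above (Date fields round-trip as ISO strings; helper names are illustrative):

```typescript
import { appendFileSync, mkdirSync, readFileSync, writeFileSync } from 'node:fs';
import { join } from 'node:path';

const runDir = (runId: string) =>
  join('.taskmaster', 'reports', 'runs', runId);

// Persist the full run state after every work-unit completion.
export function saveRunState(state: RunState): void {
  const dir = runDir(state.runId);
  mkdirSync(dir, { recursive: true });
  writeFileSync(join(dir, 'state.json'), JSON.stringify(state, null, 2));
}

// Append a timestamped event to the JSONL log for auditability.
export function logRunEvent(runId: string, event: Record<string, unknown>): void {
  const line = JSON.stringify({ ts: new Date().toISOString(), ...event });
  appendFileSync(join(runDir(runId), 'log.jsonl'), line + '\n');
}

// Load state when resuming an interrupted run.
export function loadRunState(runId: string): RunState {
  return JSON.parse(
    readFileSync(join(runDir(runId), 'state.json'), 'utf8')
  ) as RunState;
}
```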
|
||||
|
||||
## Implementation Plan
|
||||
|
||||
### Step 1: WorkflowOrchestrator Skeleton
|
||||
- [ ] Create `workflow-orchestrator.service.ts` with interfaces
|
||||
- [ ] Implement state machine logic (phase transitions)
|
||||
- [ ] Add run state persistence (state.json, log.jsonl)
|
||||
- [ ] Write unit tests for state machine
|
||||
|
||||
### Step 2: Work Unit Generation
|
||||
- [ ] Implement `getNextWorkUnit()` with context assembly
|
||||
- [ ] Generate RED phase work units (test file paths, criteria)
|
||||
- [ ] Generate GREEN phase work units (implementation hints)
|
||||
- [ ] Generate COMMIT phase work units (commit messages)
|
||||
|
||||
### Step 3: Git/Test Adapters
|
||||
- [ ] Create GitAdapter for status checks only
|
||||
- [ ] Create TestAdapter for output parsing only
|
||||
- [ ] Add precondition validation using adapters
|
||||
- [ ] Write adapter unit tests
|
||||
|
||||
### Step 4: MCP Integration
|
||||
- [ ] Add MCP tool definitions in `packages/mcp-server/src/tools/`
|
||||
- [ ] Wire up WorkflowOrchestrator to MCP tools
|
||||
- [ ] Test MCP tools via Claude Code
|
||||
- [ ] Document MCP workflow in CLAUDE.md
|
||||
|
||||
### Step 5: CLI Integration
|
||||
- [ ] Update `autopilot.command.ts` to call WorkflowOrchestrator
|
||||
- [ ] Add `--interactive` mode that shows work units and waits for completion
|
||||
- [ ] Add `--resume` flag to continue paused runs
|
||||
- [ ] Test end-to-end flow
|
||||
|
||||
### Step 6: Integration Testing
|
||||
- [ ] Create test task with 2-3 subtasks
|
||||
- [ ] Run autopilot start → get work unit → complete → repeat
|
||||
- [ ] Verify state persistence and resumability
|
||||
- [ ] Test failure scenarios (test failures, git issues)
|
||||
|
||||
## Success Criteria
|
||||
- [ ] WorkflowOrchestrator can generate work units for all phases
|
||||
- [ ] MCP tools allow Claude Code to query and complete work units
|
||||
- [ ] State persists correctly between work unit completions
|
||||
- [ ] Run can be paused and resumed from checkpoint
|
||||
- [ ] Adapters validate preconditions without executing commands
|
||||
- [ ] End-to-end: Claude Code can complete a simple task via work units
|
||||
|
||||
## Out of Scope (Phase 1)
|
||||
- Actual git operations (branch creation, commits) - executor handles this
|
||||
- Actual test execution - executor handles this
|
||||
- PR creation - deferred to Phase 2
|
||||
- TUI interface - deferred to Phase 3
|
||||
- Coverage enforcement - deferred to Phase 2
|
||||
|
||||
## Example Usage Flow
|
||||
|
||||
```bash
|
||||
# Terminal 1: Claude Code session
|
||||
$ claude
|
||||
|
||||
# In Claude Code (via MCP):
|
||||
> Start autopilot for task 42
|
||||
[Calls mcp__task_master_ai__autopilot_start(42)]
|
||||
→ Run started: run-2025-01-15-142033
|
||||
|
||||
> Get next work unit
|
||||
[Calls mcp__task_master_ai__autopilot_next_work_unit(run-2025-01-15-142033)]
|
||||
→ Work unit: RED phase for subtask 42.1
|
||||
→ Action: Generate failing tests for metrics schema
|
||||
→ Test file: src/__tests__/schema.test.js
|
||||
→ Framework: vitest
|
||||
|
||||
> [Claude Code creates test file, runs tests]
|
||||
|
||||
> Complete work unit
|
||||
[Calls mcp__task_master_ai__autopilot_complete_work_unit(
|
||||
run-2025-01-15-142033,
|
||||
workUnit-42.1-RED,
|
||||
{ success: true, testsCreated: ['src/__tests__/schema.test.js'], testsFailed: 3 }
|
||||
)]
|
||||
→ Work unit completed. State saved.
|
||||
|
||||
> Get next work unit
|
||||
[Calls mcp__task_master_ai__autopilot_next_work_unit(run-2025-01-15-142033)]
|
||||
→ Work unit: GREEN phase for subtask 42.1
|
||||
→ Action: Implement code to pass failing tests
|
||||
→ Test file: src/__tests__/schema.test.js
|
||||
→ Expected implementation: src/schema.js
|
||||
|
||||
> [Claude Code implements schema.js, runs tests, confirms all pass]
|
||||
|
||||
> Complete work unit
|
||||
[...]
|
||||
→ Work unit completed. Ready for COMMIT.
|
||||
|
||||
> Get next work unit
|
||||
[...]
|
||||
→ Work unit: COMMIT phase for subtask 42.1
|
||||
→ Commit message: "feat(metrics): add metrics schema (task 42.1)"
|
||||
→ Files to commit: src/__tests__/schema.test.js, src/schema.js
|
||||
|
||||
> [Claude Code stages files and commits]
|
||||
|
||||
> Complete work unit
|
||||
[...]
|
||||
→ Subtask 42.1 complete! Moving to 42.2...
|
||||
```
|
||||
|
||||
## Dependencies
|
||||
- Existing TaskService (task loading, status updates)
|
||||
- Existing PreflightChecker (environment validation)
|
||||
- Existing TaskLoaderService (dependency ordering)
|
||||
- MCP server infrastructure
|
||||
|
||||
## Estimated Effort
|
||||
7-10 days
|
||||
|
||||
## Next Phase
|
||||
Phase 2 will add:
|
||||
- PR creation via gh CLI
|
||||
- Coverage enforcement
|
||||
- Enhanced error recovery
|
||||
- Full resumability testing
|
||||
.taskmaster/docs/tdd-workflow-phase-2-pr-resumability.md (new file, 433 lines):
|
||||
# Phase 2: PR + Resumability - Autonomous TDD Workflow
|
||||
|
||||
## Objective
|
||||
Add PR creation with GitHub CLI integration, resumable checkpoints for interrupted runs, and enhanced guardrails with coverage enforcement.
|
||||
|
||||
## Scope
|
||||
- GitHub PR creation via `gh` CLI
|
||||
- Well-formed PR body using run report
|
||||
- Resumable checkpoints and `--resume` flag
|
||||
- Coverage enforcement before finalization
|
||||
- Optional lint/format step
|
||||
- Enhanced error recovery
|
||||
|
||||
## Deliverables
|
||||
|
||||
### 1. PR Creation Integration
|
||||
|
||||
**PRAdapter** (`packages/tm-core/src/services/pr-adapter.ts`):
|
||||
```typescript
|
||||
class PRAdapter {
|
||||
async isGHAvailable(): Promise<boolean>
|
||||
async createPR(options: PROptions): Promise<PRResult>
|
||||
async getPRTemplate(runReport: RunReport): Promise<string>
|
||||
|
||||
// Fallback for missing gh CLI
|
||||
async getManualPRInstructions(options: PROptions): Promise<string>
|
||||
}
|
||||
|
||||
interface PROptions {
|
||||
branch: string
|
||||
base: string
|
||||
title: string
|
||||
body: string
|
||||
draft?: boolean
|
||||
}
|
||||
|
||||
interface PRResult {
|
||||
url: string
|
||||
number: number
|
||||
}
|
||||
```
|
||||
|
||||
**PR Title Format:**
|
||||
```
|
||||
Task #<id> [<tag>]: <title>
|
||||
```
|
||||
|
||||
Example: `Task #42 [analytics]: User metrics tracking`
|
||||
|
||||
**PR Body Template:**
|
||||
|
||||
Located at `.taskmaster/templates/pr-body.md`:
|
||||
|
||||
```markdown
|
||||
## Summary
|
||||
|
||||
Implements Task #42 from TaskMaster autonomous workflow.
|
||||
|
||||
**Branch:** {branch}
|
||||
**Tag:** {tag}
|
||||
**Subtasks completed:** {subtaskCount}
|
||||
|
||||
{taskDescription}
|
||||
|
||||
## Subtasks
|
||||
|
||||
{subtasksList}
|
||||
|
||||
## Test Coverage
|
||||
|
||||
| Metric | Coverage |
|
||||
|--------|----------|
|
||||
| Lines | {lines}% |
|
||||
| Branches | {branches}% |
|
||||
| Functions | {functions}% |
|
||||
| Statements | {statements}% |
|
||||
|
||||
**All subtasks passed with {totalTests} tests.**
|
||||
|
||||
## Commits
|
||||
|
||||
{commitsList}
|
||||
|
||||
## Run Report
|
||||
|
||||
Full execution report: `.taskmaster/reports/runs/{runId}/`
|
||||
|
||||
---
|
||||
|
||||
🤖 Generated with [Task Master](https://github.com/cline/task-master) autonomous TDD workflow
|
||||
```
|
||||
|
||||
**Token replacement:**
|
||||
- `{branch}` → branch name
|
||||
- `{tag}` → active tag
|
||||
- `{subtaskCount}` → number of completed subtasks
|
||||
- `{taskDescription}` → task description from TaskMaster
|
||||
- `{subtasksList}` → markdown list of subtask titles
|
||||
- `{lines}`, `{branches}`, `{functions}`, `{statements}` → coverage percentages
|
||||
- `{totalTests}` → total test count
|
||||
- `{commitsList}` → markdown list of commit SHAs and messages
|
||||
- `{runId}` → run ID timestamp
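
Token replacement itself can stay a one-liner over the template; unknown tokens are left untouched so template errors stay visible in the PR body. A sketch (function name is illustrative):

```typescript
// Replace {token} placeholders in the PR body template with run data.
export function renderPRBody(
  template: string,
  tokens: Record<string, string | number>
): string {
  return template.replace(/\{(\w+)\}/g, (match, key) =>
    key in tokens ? String(tokens[key]) : match
  );
}

// Example:
// renderPRBody(templateText, { branch: 'analytics/task-42-user-metrics', lines: 85 })
```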
|
||||
|
||||
### 2. GitHub CLI Integration
|
||||
|
||||
**Detection:**
|
||||
```bash
|
||||
which gh
|
||||
```
|
||||
|
||||
If not found, show fallback instructions:
|
||||
```bash
|
||||
✓ Branch pushed: analytics/task-42-user-metrics
|
||||
✗ gh CLI not found - cannot create PR automatically
|
||||
|
||||
To create PR manually:
|
||||
gh pr create \
|
||||
--base main \
|
||||
--head analytics/task-42-user-metrics \
|
||||
--title "Task #42 [analytics]: User metrics tracking" \
|
||||
--body-file .taskmaster/reports/runs/2025-01-15-142033/pr.md
|
||||
|
||||
Or visit:
|
||||
https://github.com/org/repo/compare/main...analytics/task-42-user-metrics
|
||||
```
|
||||
|
||||
**Confirmation gate:**
|
||||
```bash
|
||||
Ready to create PR:
|
||||
Title: Task #42 [analytics]: User metrics tracking
|
||||
Base: main
|
||||
Head: analytics/task-42-user-metrics
|
||||
|
||||
Create PR? [Y/n]
|
||||
```
|
||||
|
||||
Unless `--no-confirm` flag is set.
|
||||
|
||||
### 3. Resumable Workflow
|
||||
|
||||
**State Checkpoint** (`state.json`):
|
||||
```json
|
||||
{
|
||||
"runId": "2025-01-15-142033",
|
||||
"taskId": "42",
|
||||
"phase": "subtask-loop",
|
||||
"currentSubtask": "42.2",
|
||||
"currentPhase": "green",
|
||||
"attempts": 2,
|
||||
"completedSubtasks": ["42.1"],
|
||||
"commits": ["a1b2c3d"],
|
||||
"branch": "analytics/task-42-user-metrics",
|
||||
"tag": "analytics",
|
||||
"canResume": true,
|
||||
"pausedAt": "2025-01-15T14:25:35Z",
|
||||
"pausedReason": "max_attempts_reached",
|
||||
"nextAction": "manual_review_required"
|
||||
}
|
||||
```
|
||||
|
||||
**Resume Command:**
|
||||
```bash
|
||||
$ tm autopilot --resume
|
||||
|
||||
Resuming run: 2025-01-15-142033
|
||||
Task: #42 [analytics] User metrics tracking
|
||||
Branch: analytics/task-42-user-metrics
|
||||
  Last subtask: 42.2 (GREEN phase, attempt 3/3 failed)
|
||||
Paused: 5 minutes ago
|
||||
|
||||
Reason: Could not achieve green state after 3 attempts
|
||||
Last error: POST /api/metrics returns 500 instead of 201
|
||||
|
||||
Resume from subtask 42.2 GREEN phase? [Y/n]
|
||||
```
|
||||
|
||||
**Resume logic:**
|
||||
1. Load state from `.taskmaster/reports/runs/<runId>/state.json`
|
||||
2. Verify branch still exists and is checked out
|
||||
3. Verify no uncommitted changes (unless `--force`)
|
||||
4. Continue from last checkpoint phase
|
||||
5. Update state file as execution progresses
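
Steps 2-3 amount to a small validation pass before the loop continues. A sketch, where `GitStatus` stands in for the existing git utilities:

```typescript
interface GitStatus {
  getCurrentBranch(): Promise<string>;
  isWorkingTreeClean(): Promise<boolean>;
}

// Illustrative resume preflight covering steps 2-3 above.
export async function validateResume(
  state: { runId: string; branch: string },
  git: GitStatus,
  opts: { force?: boolean } = {}
): Promise<void> {
  const current = await git.getCurrentBranch();
  if (current !== state.branch) {
    throw new Error(
      `Run ${state.runId} expects branch ${state.branch}, but ${current} is checked out`
    );
  }
  if (!opts.force && !(await git.isWorkingTreeClean())) {
    throw new Error(
      'Working tree has uncommitted changes; commit, stash, or rerun with --force'
    );
  }
}
```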
|
||||
|
||||
**Multiple interrupted runs:**
|
||||
```bash
|
||||
$ tm autopilot --resume
|
||||
|
||||
Found 2 resumable runs:
|
||||
1. 2025-01-15-142033 - Task #42 (paused 5 min ago at subtask 42.2 GREEN)
|
||||
2. 2025-01-14-103022 - Task #38 (paused 2 hours ago at subtask 38.3 RED)
|
||||
|
||||
Select run to resume [1-2]:
|
||||
```
|
||||
|
||||
### 4. Coverage Enforcement
|
||||
|
||||
**Coverage Check Phase** (before finalization):
|
||||
```typescript
|
||||
async function enforceCoverage(runId: string): Promise<void> {
|
||||
const testResults = await testRunner.runAll()
|
||||
const coverage = await testRunner.getCoverage()
|
||||
|
||||
const thresholds = config.test.coverageThresholds
|
||||
const failures = []
|
||||
|
||||
if (coverage.lines < thresholds.lines) {
|
||||
failures.push(`Lines: ${coverage.lines}% < ${thresholds.lines}%`)
|
||||
}
|
||||
// ... check branches, functions, statements
|
||||
|
||||
if (failures.length > 0) {
|
||||
throw new CoverageError(
|
||||
`Coverage thresholds not met:\n${failures.join('\n')}`
|
||||
)
|
||||
}
|
||||
|
||||
// Store coverage in run report
|
||||
await storeRunArtifact(runId, 'coverage.json', coverage)
|
||||
}
|
||||
```
|
||||
|
||||
**Handling coverage failures:**
|
||||
```bash
|
||||
⚠️ Coverage check failed:
|
||||
Lines: 78.5% < 80%
|
||||
Branches: 75.0% < 80%
|
||||
|
||||
Options:
|
||||
1. Add more tests and resume
|
||||
2. Lower thresholds in .taskmaster/config.json
|
||||
3. Skip coverage check: tm autopilot --resume --skip-coverage
|
||||
|
||||
Run paused. Fix coverage and resume with:
|
||||
tm autopilot --resume
|
||||
```
|
||||
|
||||
### 5. Optional Lint/Format Step
|
||||
|
||||
**Configuration:**
|
||||
```json
|
||||
{
|
||||
"autopilot": {
|
||||
"finalization": {
|
||||
"lint": {
|
||||
"enabled": true,
|
||||
"command": "npm run lint",
|
||||
"fix": true,
|
||||
"failOnError": false
|
||||
},
|
||||
"format": {
|
||||
"enabled": true,
|
||||
"command": "npm run format",
|
||||
"commitChanges": true
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
**Execution:**
|
||||
```bash
|
||||
Finalization Steps:
|
||||
|
||||
✓ All tests passing (12 tests, 0 failures)
|
||||
✓ Coverage thresholds met (85% lines, 82% branches)
|
||||
|
||||
LINT Running linter... ⏳
|
||||
LINT ✓ No lint errors
|
||||
|
||||
FORMAT Running formatter... ⏳
|
||||
FORMAT ✓ Formatted 3 files
|
||||
FORMAT ✓ Committed formatting changes: "chore: auto-format code"
|
||||
|
||||
PUSH Pushing to origin... ⏳
|
||||
PUSH ✓ Pushed analytics/task-42-user-metrics
|
||||
|
||||
PR Creating pull request... ⏳
|
||||
PR ✓ Created PR #123
|
||||
https://github.com/org/repo/pull/123
|
||||
```
|
||||
|
||||
### 6. Enhanced Error Recovery
|
||||
|
||||
**Pause Points:**
|
||||
- Max GREEN attempts reached (current)
|
||||
- Coverage check failed (new)
|
||||
- Lint errors (if `failOnError: true`)
|
||||
- Git push failed (new)
|
||||
- PR creation failed (new)
|
||||
|
||||
**Each pause saves:**
|
||||
- Full state checkpoint
|
||||
- Last command output
|
||||
- Suggested next actions
|
||||
- Resume instructions
|
||||
|
||||
**Automatic recovery attempts:**
|
||||
- Git push: retry up to 3 times with backoff
|
||||
- PR creation: fall back to manual instructions
|
||||
- Lint: auto-fix if enabled, otherwise pause
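
The push retry could be as simple as a bounded loop with backoff driven by the `pushRetries`/`pushRetryDelay` config values. A sketch, where `pushBranch` is an assumed helper wrapping `git push origin <branch>`:

```typescript
// Retry git push with linear backoff, per the recovery policy above.
export async function pushWithRetry(
  pushBranch: () => Promise<void>,
  retries = 3,      // config: git.pushRetries
  delayMs = 5000    // config: git.pushRetryDelay
): Promise<void> {
  for (let attempt = 1; attempt <= retries; attempt++) {
    try {
      await pushBranch();
      return;
    } catch (err) {
      if (attempt === retries) throw err; // exhausted: pause run with resume instructions
      await new Promise((resolve) => setTimeout(resolve, delayMs * attempt));
    }
  }
}
```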
|
||||
|
||||
### 7. Finalization Phase Enhancement
|
||||
|
||||
**Updated workflow:**
|
||||
1. Run full test suite
|
||||
2. Check coverage thresholds → pause if failed
|
||||
3. Run lint (if enabled) → pause if failed and `failOnError: true`
|
||||
4. Run format (if enabled) → auto-commit changes
|
||||
5. Confirm push (unless `--no-confirm`)
|
||||
6. Push branch → retry on failure
|
||||
7. Generate PR body from template
|
||||
8. Create PR via gh → fall back to manual instructions
|
||||
9. Update task status to 'review' (configurable)
|
||||
10. Save final run report
|
||||
|
||||
**Final output:**
|
||||
```bash
|
||||
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
|
||||
|
||||
✅ Task #42 [analytics]: User metrics tracking - COMPLETE
|
||||
|
||||
Branch: analytics/task-42-user-metrics
|
||||
Subtasks completed: 3/3
|
||||
Commits: 3
|
||||
Total tests: 12 (12 passed, 0 failed)
|
||||
Coverage: 85% lines, 82% branches, 88% functions, 85% statements
|
||||
|
||||
PR #123: https://github.com/org/repo/pull/123
|
||||
|
||||
Run report: .taskmaster/reports/runs/2025-01-15-142033/
|
||||
|
||||
Next steps:
|
||||
- Review PR and request changes if needed
|
||||
- Merge when ready
|
||||
- Task status updated to 'review'
|
||||
|
||||
Completed in 24 minutes
|
||||
```
|
||||
|
||||
## CLI Updates
|
||||
|
||||
**New flags:**
|
||||
- `--resume` → Resume from last checkpoint
|
||||
- `--skip-coverage` → Skip coverage checks
|
||||
- `--skip-lint` → Skip lint step
|
||||
- `--skip-format` → Skip format step
|
||||
- `--skip-pr` → Push branch but don't create PR
|
||||
- `--draft-pr` → Create draft PR instead of ready-for-review
|
||||
|
||||
## Configuration Updates
|
||||
|
||||
**Add to `.taskmaster/config.json`:**
|
||||
```json
|
||||
{
|
||||
"autopilot": {
|
||||
"finalization": {
|
||||
"lint": {
|
||||
"enabled": false,
|
||||
"command": "npm run lint",
|
||||
"fix": true,
|
||||
"failOnError": false
|
||||
},
|
||||
"format": {
|
||||
"enabled": false,
|
||||
"command": "npm run format",
|
||||
"commitChanges": true
|
||||
},
|
||||
"updateTaskStatus": "review"
|
||||
}
|
||||
},
|
||||
"git": {
|
||||
"pr": {
|
||||
"enabled": true,
|
||||
"base": "default",
|
||||
"bodyTemplate": ".taskmaster/templates/pr-body.md",
|
||||
"draft": false
|
||||
},
|
||||
"pushRetries": 3,
|
||||
"pushRetryDelay": 5000
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Success Criteria
|
||||
- Can create PR automatically with well-formed body
|
||||
- Can resume interrupted runs from any checkpoint
|
||||
- Coverage checks prevent low-quality code from being merged
|
||||
- Clear error messages and recovery paths for all failure modes
|
||||
- Run reports include full PR context for review
|
||||
|
||||
## Out of Scope (defer to Phase 3)
|
||||
- Multiple test framework support (pytest, go test)
|
||||
- Diff preview before commits
|
||||
- TUI panel implementation
|
||||
- Extension/IDE integration
|
||||
|
||||
## Testing Strategy
|
||||
- Mock `gh` CLI for PR creation tests
|
||||
- Test resume from each possible pause point
|
||||
- Test coverage failure scenarios
|
||||
- Test lint/format integration with mock commands
|
||||
- End-to-end test with PR creation on test repo
|
||||
|
||||
## Dependencies
|
||||
- Phase 1 completed (core workflow)
|
||||
- GitHub CLI (`gh`) installed (optional, fallback provided)
|
||||
- Test framework supports coverage output
|
||||
|
||||
## Estimated Effort
|
||||
1-2 weeks
|
||||
|
||||
## Risks & Mitigations
|
||||
- **Risk:** GitHub CLI auth issues
|
||||
- **Mitigation:** Clear auth setup docs, fallback to manual instructions
|
||||
|
||||
- **Risk:** PR body template doesn't match all project needs
|
||||
- **Mitigation:** Make template customizable via config path
|
||||
|
||||
- **Risk:** Resume state gets corrupted
|
||||
- **Mitigation:** Validate state on load, provide --force-reset option
|
||||
|
||||
- **Risk:** Coverage calculation differs between runs
|
||||
- **Mitigation:** Store coverage with each test run for comparison
|
||||
|
||||
## Validation
|
||||
Test with:
|
||||
- Successful PR creation end-to-end
|
||||
- Resume from GREEN attempt failure
|
||||
- Resume from coverage failure
|
||||
- Resume from lint failure
|
||||
- Missing `gh` CLI (fallback to manual)
|
||||
- Lint/format integration enabled
|
||||
- Multiple interrupted runs (selection UI)
|
||||
|
||||
# Phase 3: Extensibility + Guardrails - Autonomous TDD Workflow
|
||||
|
||||
## Objective
|
||||
Add multi-language/framework support, enhanced safety guardrails, TUI interface, and extensibility for IDE/editor integration.
|
||||
|
||||
## Scope
|
||||
- Multi-language test runner support (pytest, go test, etc.)
|
||||
- Enhanced safety: diff preview, confirmation gates, minimal-change prompts
|
||||
- Optional TUI panel with tmux integration
|
||||
- State-based extension API for IDE integration
|
||||
- Parallel subtask execution (experimental)
|
||||
|
||||
## Deliverables
|
||||
|
||||
### 1. Multi-Language Test Runner Support
|
||||
|
||||
**Extend TestRunnerAdapter:**
|
||||
```typescript
|
||||
class TestRunnerAdapter {
|
||||
// Existing methods...
|
||||
|
||||
async detectLanguage(): Promise<Language>
|
||||
async detectFramework(language: Language): Promise<Framework>
|
||||
async getFrameworkAdapter(framework: Framework): Promise<FrameworkAdapter>
|
||||
}
|
||||
|
||||
enum Language {
|
||||
JavaScript = 'javascript',
|
||||
TypeScript = 'typescript',
|
||||
Python = 'python',
|
||||
Go = 'go',
|
||||
Rust = 'rust'
|
||||
}
|
||||
|
||||
enum Framework {
|
||||
Vitest = 'vitest',
|
||||
Jest = 'jest',
|
||||
Pytest = 'pytest',
|
||||
GoTest = 'gotest',
|
||||
CargoTest = 'cargotest'
|
||||
}
|
||||
|
||||
interface FrameworkAdapter {
|
||||
runTargeted(pattern: string): Promise<TestResults>
|
||||
runAll(): Promise<TestResults>
|
||||
parseCoverage(output: string): Promise<CoverageReport>
|
||||
getTestFilePattern(): string
|
||||
getTestFileExtension(): string
|
||||
}
|
||||
```
|
||||
|
||||
**Framework-specific adapters:**
|
||||
|
||||
**PytestAdapter** (`packages/tm-core/src/services/test-adapters/pytest-adapter.ts`):
|
||||
```typescript
|
||||
class PytestAdapter implements FrameworkAdapter {
|
||||
async runTargeted(pattern: string): Promise<TestResults> {
|
||||
const output = await exec(`pytest ${pattern} --json-report`)
|
||||
return this.parseResults(output)
|
||||
}
|
||||
|
||||
async runAll(): Promise<TestResults> {
|
||||
const output = await exec('pytest --cov --json-report')
|
||||
return this.parseResults(output)
|
||||
}
|
||||
|
||||
parseCoverage(output: string): Promise<CoverageReport> {
|
||||
// Parse pytest-cov XML output
|
||||
}
|
||||
|
||||
getTestFilePattern(): string {
|
||||
return '**/test_*.py'
|
||||
}
|
||||
|
||||
getTestFileExtension(): string {
|
||||
return '.py'
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
**GoTestAdapter** (`packages/tm-core/src/services/test-adapters/gotest-adapter.ts`):
|
||||
```typescript
|
||||
class GoTestAdapter implements FrameworkAdapter {
|
||||
async runTargeted(pattern: string): Promise<TestResults> {
|
||||
const output = await exec(`go test ${pattern} -json`)
|
||||
return this.parseResults(output)
|
||||
}
|
||||
|
||||
async runAll(): Promise<TestResults> {
|
||||
const output = await exec('go test ./... -coverprofile=coverage.out -json')
|
||||
return this.parseResults(output)
|
||||
}
|
||||
|
||||
parseCoverage(output: string): Promise<CoverageReport> {
|
||||
// Parse go test coverage output
|
||||
}
|
||||
|
||||
getTestFilePattern(): string {
|
||||
return '**/*_test.go'
|
||||
}
|
||||
|
||||
getTestFileExtension(): string {
|
||||
return '_test.go'
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
**Detection Logic:**
|
||||
```typescript
|
||||
async function detectFramework(): Promise<Framework> {
|
||||
// Check for package.json
|
||||
if (await exists('package.json')) {
|
||||
const pkg = await readJSON('package.json')
|
||||
if (pkg.devDependencies?.vitest) return Framework.Vitest
|
||||
if (pkg.devDependencies?.jest) return Framework.Jest
|
||||
}
|
||||
|
||||
// Check for Python files
|
||||
if (await exists('pytest.ini') || await exists('setup.py')) {
|
||||
return Framework.Pytest
|
||||
}
|
||||
|
||||
// Check for Go files
|
||||
if (await exists('go.mod')) {
|
||||
return Framework.GoTest
|
||||
}
|
||||
|
||||
// Check for Rust files
|
||||
if (await exists('Cargo.toml')) {
|
||||
return Framework.CargoTest
|
||||
}
|
||||
|
||||
throw new Error('Could not detect test framework')
|
||||
}
|
||||
```
|
||||
|
||||
### 2. Enhanced Safety Guardrails
|
||||
|
||||
**Diff Preview Mode:**
|
||||
```bash
|
||||
$ tm autopilot 42 --preview-diffs
|
||||
|
||||
[2/3] Subtask 42.2: Add collection endpoint
|
||||
|
||||
RED ✓ Tests created: src/api/__tests__/metrics.test.js
|
||||
|
||||
GREEN Implementing code...
|
||||
|
||||
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
|
||||
Proposed changes (src/api/metrics.js):
|
||||
|
||||
+ import { MetricsSchema } from '../models/schema.js'
|
||||
+
|
||||
+ export async function createMetric(data) {
|
||||
+ const validated = MetricsSchema.parse(data)
|
||||
+ const result = await db.metrics.create(validated)
|
||||
+ return result
|
||||
+ }
|
||||
|
||||
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
|
||||
|
||||
Apply these changes? [Y/n/e(dit)/s(kip)]
|
||||
Y - Apply and continue
|
||||
n - Reject and retry GREEN phase
|
||||
e - Open in editor for manual changes
|
||||
s - Skip this subtask
|
||||
```
|
||||
|
||||
**Minimal Change Enforcement:**
|
||||
|
||||
Add to system prompt:
|
||||
```markdown
|
||||
CRITICAL: Make MINIMAL changes to pass the failing tests.
|
||||
- Only modify files directly related to the subtask
|
||||
- Do not refactor existing code unless absolutely necessary
|
||||
- Do not add features beyond the acceptance criteria
|
||||
- Keep changes under 50 lines per file when possible
|
||||
- Prefer composition over modification
|
||||
```
|
||||
|
||||
**Change Size Warnings:**
|
||||
```bash
|
||||
⚠️ Large change detected:
|
||||
Files modified: 5
|
||||
Lines changed: +234, -12
|
||||
|
||||
This subtask was expected to be small (~50 lines).
|
||||
Consider:
|
||||
- Breaking into smaller subtasks
|
||||
- Reviewing acceptance criteria
|
||||
- Checking for unintended changes
|
||||
|
||||
Continue anyway? [y/N]
|
||||
```
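
The change-size numbers behind this warning can come from `git diff --numstat`. A sketch, assuming the working tree is compared against `HEAD` before the COMMIT phase:

```typescript
import { execSync } from 'node:child_process';

// Measure change size (files touched, lines added/removed) vs HEAD.
export function getChangeSize(): { files: number; added: number; removed: number } {
  const out = execSync('git diff --numstat HEAD', { encoding: 'utf8' }).trim();
  if (!out) return { files: 0, added: 0, removed: 0 };

  let added = 0;
  let removed = 0;
  const lines = out.split('\n');
  for (const line of lines) {
    const [a, r] = line.split('\t');
    // Binary files report "-" for both columns; count them as zero lines
    added += Number.isNaN(Number(a)) ? 0 : Number(a);
    removed += Number.isNaN(Number(r)) ? 0 : Number(r);
  }
  return { files: lines.length, added, removed };
}
```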
|
||||
|
||||
### 3. TUI Interface with tmux
|
||||
|
||||
**Layout:**
|
||||
```
|
||||
┌──────────────────────────────────┬─────────────────────────────────┐
|
||||
│ Task Navigator (left) │ Executor Terminal (right) │
|
||||
│ │ │
|
||||
│ Project: my-app │ $ tm autopilot --executor-mode │
|
||||
│ Branch: analytics/task-42 │ > Running subtask 42.2 GREEN... │
|
||||
│ Tag: analytics │ > Implementing endpoint... │
|
||||
│ │ > Tests: 3 passed, 0 failed │
|
||||
│ Tasks: │ > Ready to commit │
|
||||
│ → 42 [in-progress] User metrics │ │
|
||||
│ → 42.1 [done] Schema │ [Live output from executor] │
|
||||
│ → 42.2 [active] Endpoint ◀ │ │
|
||||
│ → 42.3 [pending] Dashboard │ │
|
||||
│ │ │
|
||||
│ [s] start [p] pause [q] quit │ │
|
||||
└──────────────────────────────────┴─────────────────────────────────┘
|
||||
```
|
||||
|
||||
**Implementation:**
|
||||
|
||||
**TUI Navigator** (`apps/cli/src/ui/tui/navigator.ts`):
|
||||
```typescript
|
||||
import blessed from 'blessed'
|
||||
|
||||
class AutopilotTUI {
|
||||
private screen: blessed.Widgets.Screen
|
||||
private taskList: blessed.Widgets.ListElement
|
||||
private statusBox: blessed.Widgets.BoxElement
|
||||
private executorPane: string // tmux pane ID
|
||||
|
||||
async start(taskId?: string) {
|
||||
// Create blessed screen
|
||||
this.screen = blessed.screen()
|
||||
|
||||
// Create task list widget
|
||||
this.taskList = blessed.list({
|
||||
label: 'Tasks',
|
||||
keys: true,
|
||||
vi: true,
|
||||
style: { selected: { bg: 'blue' } }
|
||||
})
|
||||
|
||||
// Spawn tmux pane for executor
|
||||
this.executorPane = await this.spawnExecutorPane()
|
||||
|
||||
// Watch state file for updates
|
||||
this.watchStateFile()
|
||||
|
||||
// Handle keybindings
|
||||
this.setupKeybindings()
|
||||
}
|
||||
|
||||
private async spawnExecutorPane(): Promise<string> {
|
||||
const paneId = await exec('tmux split-window -h -P -F "#{pane_id}"')
|
||||
await exec(`tmux send-keys -t ${paneId} "tm autopilot --executor-mode" Enter`)
|
||||
return paneId.trim()
|
||||
}
|
||||
|
||||
private watchStateFile() {
|
||||
watch('.taskmaster/state/current-run.json', (event, filename) => {
|
||||
this.updateDisplay()
|
||||
})
|
||||
}
|
||||
|
||||
private setupKeybindings() {
|
||||
this.screen.key(['s'], () => this.startTask())
|
||||
this.screen.key(['p'], () => this.pauseTask())
|
||||
this.screen.key(['q'], () => this.quit())
|
||||
this.screen.key(['up', 'down'], () => this.navigateTasks())
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
**Executor Mode:**
|
||||
```bash
|
||||
$ tm autopilot 42 --executor-mode
|
||||
|
||||
# Runs in executor pane, writes state to shared file
|
||||
# Left pane reads state file and updates display
|
||||
```
|
||||
|
||||
**State File** (`.taskmaster/state/current-run.json`):
|
||||
```json
|
||||
{
|
||||
"runId": "2025-01-15-142033",
|
||||
"taskId": "42",
|
||||
"status": "running",
|
||||
"currentPhase": "green",
|
||||
"currentSubtask": "42.2",
|
||||
"lastOutput": "Implementing endpoint...",
|
||||
"testsStatus": {
|
||||
"passed": 3,
|
||||
"failed": 0
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### 4. Extension API for IDE Integration
|
||||
|
||||
**State-based API:**
|
||||
|
||||
Expose run state via JSON files that IDEs can read:
|
||||
- `.taskmaster/state/current-run.json` - live run state
|
||||
- `.taskmaster/reports/runs/<runId>/manifest.json` - run metadata
|
||||
- `.taskmaster/reports/runs/<runId>/log.jsonl` - event stream
|
||||
|
||||
**WebSocket API (optional):**
|
||||
```typescript
|
||||
// packages/tm-core/src/services/autopilot-server.ts
|
||||
class AutopilotServer {
|
||||
private wss: WebSocketServer
|
||||
|
||||
start(port: number = 7890) {
|
||||
this.wss = new WebSocketServer({ port })
|
||||
|
||||
this.wss.on('connection', (ws) => {
|
||||
// Send current state
|
||||
ws.send(JSON.stringify(this.getCurrentState()))
|
||||
|
||||
// Stream events
|
||||
this.orchestrator.on('*', (event) => {
|
||||
ws.send(JSON.stringify(event))
|
||||
})
|
||||
})
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
**Usage from IDE extension:**
|
||||
```typescript
|
||||
// VS Code extension example
|
||||
const ws = new WebSocket('ws://localhost:7890')
|
||||
|
||||
ws.on('message', (data) => {
|
||||
const event = JSON.parse(data)
|
||||
|
||||
if (event.type === 'subtask:complete') {
|
||||
vscode.window.showInformationMessage(
|
||||
`Subtask ${event.subtaskId} completed`
|
||||
)
|
||||
}
|
||||
})
|
||||
```
|
||||
|
||||
### 5. Parallel Subtask Execution (Experimental)
|
||||
|
||||
**Dependency Analysis:**
|
||||
```typescript
|
||||
class SubtaskScheduler {
|
||||
async buildDependencyGraph(subtasks: Subtask[]): Promise<DAG> {
|
||||
const graph = new DAG()
|
||||
|
||||
for (const subtask of subtasks) {
|
||||
graph.addNode(subtask.id)
|
||||
|
||||
for (const depId of subtask.dependencies) {
|
||||
graph.addEdge(depId, subtask.id)
|
||||
}
|
||||
}
|
||||
|
||||
return graph
|
||||
}
|
||||
|
||||
async getParallelBatches(graph: DAG): Promise<Subtask[][]> {
|
||||
const batches: Subtask[][] = []
|
||||
const completed = new Set<string>()
|
||||
|
||||
while (completed.size < graph.size()) {
|
||||
const ready = graph.nodes.filter(node =>
|
||||
!completed.has(node.id) &&
|
||||
node.dependencies.every(dep => completed.has(dep))
|
||||
)
|
||||
|
||||
batches.push(ready)
|
||||
ready.forEach(node => completed.add(node.id))
|
||||
}
|
||||
|
||||
return batches
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
**Parallel Execution:**
|
||||
```bash
|
||||
$ tm autopilot 42 --parallel
|
||||
|
||||
[Batch 1] Running 2 subtasks in parallel:
|
||||
→ 42.1: Add metrics schema
|
||||
→ 42.4: Add API documentation
|
||||
|
||||
42.1 RED ✓ Tests created
|
||||
42.4 RED ✓ Tests created
|
||||
|
||||
42.1 GREEN ✓ Implementation complete
|
||||
42.4 GREEN ✓ Implementation complete
|
||||
|
||||
42.1 COMMIT ✓ Committed: a1b2c3d
|
||||
42.4 COMMIT ✓ Committed: e5f6g7h
|
||||
|
||||
[Batch 2] Running 2 subtasks in parallel (depend on 42.1):
|
||||
→ 42.2: Add collection endpoint
|
||||
→ 42.3: Add dashboard widget
|
||||
...
|
||||
```
|
||||
|
||||
**Conflict Detection:**
|
||||
```typescript
|
||||
async function detectConflicts(subtasks: Subtask[]): Promise<Conflict[]> {
|
||||
const conflicts: Conflict[] = []
|
||||
|
||||
for (let i = 0; i < subtasks.length; i++) {
|
||||
for (let j = i + 1; j < subtasks.length; j++) {
|
||||
const filesA = await predictAffectedFiles(subtasks[i])
|
||||
const filesB = await predictAffectedFiles(subtasks[j])
|
||||
|
||||
const overlap = filesA.filter(f => filesB.includes(f))
|
||||
|
||||
if (overlap.length > 0) {
|
||||
conflicts.push({
|
||||
subtasks: [subtasks[i].id, subtasks[j].id],
|
||||
files: overlap
|
||||
})
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
return conflicts
|
||||
}
|
||||
```
|
||||
|
||||
### 6. Advanced Configuration
|
||||
|
||||
**Add to `.taskmaster/config.json`:**
|
||||
```json
|
||||
{
|
||||
"autopilot": {
|
||||
"safety": {
|
||||
"previewDiffs": false,
|
||||
"maxChangeLinesPerFile": 100,
|
||||
"warnOnLargeChanges": true,
|
||||
"requireConfirmOnLargeChanges": true
|
||||
},
|
||||
"parallel": {
|
||||
"enabled": false,
|
||||
"maxConcurrent": 3,
|
||||
"detectConflicts": true
|
||||
},
|
||||
"tui": {
|
||||
"enabled": false,
|
||||
"tmuxSession": "taskmaster-autopilot"
|
||||
},
|
||||
"api": {
|
||||
"enabled": false,
|
||||
"port": 7890,
|
||||
"allowRemote": false
|
||||
}
|
||||
},
|
||||
"test": {
|
||||
"frameworks": {
|
||||
"python": {
|
||||
"runner": "pytest",
|
||||
"coverageCommand": "pytest --cov",
|
||||
"testPattern": "**/test_*.py"
|
||||
},
|
||||
"go": {
|
||||
"runner": "go test",
|
||||
"coverageCommand": "go test ./... -coverprofile=coverage.out",
|
||||
"testPattern": "**/*_test.go"
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## CLI Updates
|
||||
|
||||
**New commands:**
|
||||
```bash
|
||||
tm autopilot <taskId> --tui # Launch TUI interface
|
||||
tm autopilot <taskId> --parallel # Enable parallel execution
|
||||
tm autopilot <taskId> --preview-diffs # Show diffs before applying
|
||||
tm autopilot <taskId> --executor-mode # Run as executor pane
|
||||
tm autopilot-server start # Start WebSocket API
|
||||
```
|
||||
|
||||
## Success Criteria
|
||||
- Supports Python projects with pytest
|
||||
- Supports Go projects with go test
|
||||
- Diff preview prevents unwanted changes
|
||||
- TUI provides better visibility for long-running tasks
|
||||
- IDE extensions can integrate via state files or WebSocket
|
||||
- Parallel execution reduces total time for independent subtasks
|
||||
|
||||
## Out of Scope
|
||||
- Full Electron/web GUI
|
||||
- AI executor selection UI (defer to Phase 4)
|
||||
- Multi-repository support
|
||||
- Remote execution on cloud runners
|
||||
|
||||
## Testing Strategy
|
||||
- Test with Python project (pytest)
|
||||
- Test with Go project (go test)
|
||||
- Test diff preview UI with mock changes
|
||||
- Test parallel execution with independent subtasks
|
||||
- Test conflict detection with overlapping file changes
|
||||
- Test TUI with mock tmux environment
|
||||
|
||||
## Dependencies
|
||||
- Phase 2 completed (PR + resumability)
|
||||
- tmux installed (for TUI)
|
||||
- blessed or ink library (for TUI rendering)
|
||||
|
||||
## Estimated Effort
|
||||
3-4 weeks
|
||||
|
||||
## Risks & Mitigations
|
||||
- **Risk:** Parallel execution causes git conflicts
|
||||
- **Mitigation:** Conservative conflict detection, sequential fallback
|
||||
|
||||
- **Risk:** TUI adds complexity and maintenance burden
|
||||
- **Mitigation:** Keep TUI optional, state-based design allows alternatives
|
||||
|
||||
- **Risk:** Framework adapters hard to maintain across versions
|
||||
- **Mitigation:** Abstract common parsing logic, document adapter interface
|
||||
|
||||
- **Risk:** Diff preview slows down workflow
|
||||
- **Mitigation:** Make optional, use --preview-diffs flag only when needed
|
||||
|
||||
## Validation
|
||||
Test with:
|
||||
- Python project with pytest and pytest-cov
|
||||
- Go project with go test
|
||||
- Large changes requiring confirmation
|
||||
- Parallel execution with 3+ independent subtasks
|
||||
- TUI with task selection and live status updates
|
||||
- VS Code extension reading state files
|
||||
Task complexity analysis report (new file, 197 lines):
|
||||
{
|
||||
"meta": {
|
||||
"generatedAt": "2025-10-07T09:46:06.248Z",
|
||||
"tasksAnalyzed": 23,
|
||||
"totalTasks": 23,
|
||||
"analysisCount": 23,
|
||||
"thresholdScore": 5,
|
||||
"projectName": "Taskmaster",
|
||||
"usedResearch": false
|
||||
},
|
||||
"complexityAnalysis": [
|
||||
{
|
||||
"taskId": 31,
|
||||
"taskTitle": "Create WorkflowOrchestrator service foundation",
|
||||
"complexityScore": 7,
|
||||
"recommendedSubtasks": 5,
|
||||
"expansionPrompt": "Break down the WorkflowOrchestrator foundation into its core architectural components: phase management system, event emitter infrastructure, state management interfaces, service integration, and lifecycle control methods. Each subtask should focus on a specific architectural concern with clear interfaces and testable units.",
|
||||
"reasoning": "This is a foundational service requiring state machine implementation, event-driven architecture, and integration with existing services. The complexity is high due to the need for robust phase management, error handling, and service orchestration patterns."
|
||||
},
|
||||
{
|
||||
"taskId": 32,
|
||||
"taskTitle": "Implement GitAdapter for repository operations",
|
||||
"complexityScore": 6,
|
||||
"recommendedSubtasks": 4,
|
||||
"expansionPrompt": "Decompose the GitAdapter implementation into: TypeScript wrapper creation around existing git-utils.js, core git operation methods with comprehensive error handling, branch naming pattern system with token replacement, and confirmation gates for destructive operations. Focus on type safety and existing code integration.",
|
||||
"reasoning": "Moderate-high complexity due to TypeScript integration over existing JavaScript utilities, branch pattern implementation, and safety mechanisms. The existing git-utils.js provides a solid foundation, reducing complexity."
|
||||
},
|
||||
{
|
||||
"taskId": 33,
|
||||
"taskTitle": "Create TestRunnerAdapter for framework detection and execution",
|
||||
"complexityScore": 8,
|
||||
"recommendedSubtasks": 6,
|
||||
"expansionPrompt": "Break down TestRunnerAdapter into framework detection logic, test execution engine with process management, Jest-specific result parsing, Vitest-specific result parsing, unified result interfaces, and final integration. Each framework parser should be separate to handle their unique output formats.",
|
||||
"reasoning": "High complexity due to multiple framework support (Jest, Vitest), child process management, result parsing from different formats, coverage reporting, and timeout handling. Each framework has unique output formats requiring specialized parsers."
|
||||
},
|
||||
{
|
||||
"taskId": 34,
|
||||
"taskTitle": "Implement autopilot CLI command structure",
|
||||
"complexityScore": 5,
|
||||
"recommendedSubtasks": 4,
|
||||
"expansionPrompt": "Structure the autopilot command into: basic command setup with Commander.js integration, comprehensive flag handling and validation system, preflight check validation with environment validation, and WorkflowOrchestrator integration with dry-run execution planning. Follow existing CLI patterns from the codebase.",
|
||||
"reasoning": "Moderate complexity involving CLI structure, flag handling, and integration with WorkflowOrchestrator. The existing CLI patterns and Commander.js usage in the codebase provide good guidance, reducing implementation complexity."
|
||||
},
|
||||
{
|
||||
"taskId": 35,
|
||||
"taskTitle": "Integrate surgical test generator with WorkflowOrchestrator",
|
||||
"complexityScore": 6,
|
||||
"recommendedSubtasks": 4,
|
||||
"expansionPrompt": "Decompose the test generation integration into: TaskExecutionService enhancement for test generation mode, TestGenerationService creation using executor framework, prompt composition system for rule integration, and framework-specific test pattern support. Leverage existing executor patterns from the codebase.",
|
||||
"reasoning": "Moderate-high complexity due to integration with existing services, prompt composition system, and framework-specific test generation. The existing executor framework and TaskExecutionService provide good integration points."
|
||||
},
|
||||
{
|
||||
"taskId": 36,
|
||||
"taskTitle": "Implement subtask TDD loop execution",
|
||||
"complexityScore": 9,
|
||||
"recommendedSubtasks": 7,
|
||||
"expansionPrompt": "Break down the TDD loop into: SubtaskExecutor class architecture, RED phase test generation, GREEN phase code generation, COMMIT phase with conventional commits, retry mechanism for GREEN phase, timeout and backoff policies, and TaskService integration. Each phase should be independently testable.",
|
||||
"reasoning": "Very high complexity due to implementing the complete TDD red-green-commit cycle with AI integration, retry logic, timeout handling, and git operations. This is the core autonomous workflow requiring robust error handling and state management."
|
||||
},
|
||||
{
|
||||
"taskId": 37,
|
||||
"taskTitle": "Add configuration schema for autopilot settings",
|
||||
"complexityScore": 4,
|
||||
"recommendedSubtasks": 3,
|
||||
"expansionPrompt": "Expand configuration support into: extending configuration interfaces with autopilot settings, updating ConfigManager validation logic, and implementing default configuration values. Build on existing configuration patterns and maintain backward compatibility.",
|
||||
"reasoning": "Low-moderate complexity involving schema extension and validation logic. The existing configuration system provides clear patterns to follow, making this primarily an extension task rather than new architecture."
|
||||
},
|
||||
{
|
||||
"taskId": 38,
|
||||
"taskTitle": "Implement run state persistence and logging",
|
||||
"complexityScore": 6,
|
||||
"recommendedSubtasks": 5,
|
||||
"expansionPrompt": "Structure run state management into: RunStateManager service class creation, run directory structure and manifest creation, JSONL event logging system, test result and commit tracking storage, and state checkpointing with resume functionality. Focus on data integrity and structured logging.",
|
||||
"reasoning": "Moderate-high complexity due to file system operations, structured logging, state serialization, and resume functionality. Requires careful design of data formats and error handling for persistence operations."
|
||||
},
|
||||
{
|
||||
"taskId": 39,
|
||||
"taskTitle": "Add GitHub PR creation with run reports",
|
||||
"complexityScore": 5,
|
||||
"recommendedSubtasks": 4,
|
||||
"expansionPrompt": "Decompose PR creation into: PRAdapter service foundation with interfaces, GitHub CLI integration and command execution, PR body generation from run data and test results, and custom PR template system with configuration support. Leverage existing git-utils.js patterns for CLI integration.",
|
||||
"reasoning": "Moderate complexity involving GitHub CLI integration, report generation, and template systems. The existing git-utils.js provides patterns for CLI tool integration, reducing implementation complexity."
|
||||
},
|
||||
{
|
||||
"taskId": 40,
|
||||
"taskTitle": "Implement task dependency resolution for subtask ordering",
|
||||
"complexityScore": 6,
|
||||
"recommendedSubtasks": 4,
|
||||
"expansionPrompt": "Break down dependency resolution into: dependency resolution algorithm with cycle detection, topological sorting for subtask ordering, task eligibility checking system, and TaskService integration. Implement graph algorithms for dependency management with proper error handling.",
|
||||
"reasoning": "Moderate-high complexity due to graph algorithm implementation, cycle detection, and integration with existing task management. Requires careful design of dependency resolution logic and edge case handling."
|
||||
},
|
||||
{
|
||||
"taskId": 41,
|
||||
"taskTitle": "Create resume functionality for interrupted runs",
|
||||
"complexityScore": 7,
|
||||
"recommendedSubtasks": 5,
|
||||
"expansionPrompt": "Structure resume functionality into: checkpoint creation in RunStateManager, state restoration logic with validation, state validation for safe resume operations, CLI flag implementation for resume command, and partial phase resume functionality. Focus on data integrity and workflow consistency.",
|
||||
"reasoning": "High complexity due to state serialization/deserialization, workflow restoration, validation logic, and CLI integration. Requires robust error handling and state consistency checks for reliable resume operations."
|
||||
},
|
||||
{
|
||||
"taskId": 42,
|
||||
"taskTitle": "Add coverage threshold enforcement",
|
||||
"complexityScore": 5,
|
||||
"recommendedSubtasks": 4,
|
||||
"expansionPrompt": "Decompose coverage enforcement into: coverage report parsing from Jest/Vitest, configurable threshold validation logic, coverage gates integration in workflow phases, and detailed coverage failure reporting system. Build on existing TestRunnerAdapter patterns.",
|
||||
"reasoning": "Moderate complexity involving coverage report parsing, validation logic, and workflow integration. The existing TestRunnerAdapter provides good foundation for extending coverage capabilities."
|
||||
},
|
||||
{
|
||||
"taskId": 43,
|
||||
"taskTitle": "Implement tmux-based TUI navigator",
|
||||
"complexityScore": 8,
|
||||
"recommendedSubtasks": 6,
|
||||
"expansionPrompt": "Break down TUI implementation into: framework selection and basic structure setup, left pane interface layout with status indicators, tmux integration and terminal coordination, navigation system with keybindings, real-time status updates system, and comprehensive event handling with UX polish. Each component should be independently testable.",
|
||||
"reasoning": "High complexity due to terminal UI framework integration, tmux session management, real-time updates, keyboard event handling, and terminal interface design. Requires expertise in terminal UI libraries and tmux integration."
|
||||
},
|
||||
{
|
||||
"taskId": 44,
|
||||
"taskTitle": "Add prompt composition system for context-aware test generation",
|
||||
"complexityScore": 6,
|
||||
"recommendedSubtasks": 4,
|
||||
"expansionPrompt": "Structure prompt composition into: PromptComposer service foundation, template processing engine with token replacement, rule loading system with precedence handling, and context injection with phase-specific prompt generation. Focus on flexible template system and rule management.",
|
||||
"reasoning": "Moderate-high complexity due to template processing, rule precedence systems, and context injection logic. Requires careful design of template syntax and rule loading mechanisms."
|
||||
},
|
||||
{
|
||||
"taskId": 45,
|
||||
"taskTitle": "Implement tag-branch mapping and automatic tag switching",
|
||||
"complexityScore": 5,
|
||||
"recommendedSubtasks": 3,
|
||||
"expansionPrompt": "Decompose tag-branch mapping into: GitAdapter enhancement with branch-to-tag extraction logic, automatic tag switching workflow integration, and branch-to-tag mapping persistence with validation. Build on existing git-utils.js and tag management functionality.",
|
||||
"reasoning": "Moderate complexity involving pattern matching, tag management integration, and workflow automation. The existing git-utils.js and tag management systems provide good foundation for implementation."
|
||||
},
|
||||
{
|
||||
"taskId": 46,
|
||||
"taskTitle": "Add comprehensive error handling and recovery",
|
||||
"complexityScore": 7,
|
||||
"recommendedSubtasks": 5,
|
||||
"expansionPrompt": "Structure error handling into: error classification system with specific error types, recovery suggestion engine with actionable recommendations, error context management and preservation, force flag implementation with selective bypass, and logging/reporting system integration. Focus on actionable error messages and automated recovery where possible.",
|
||||
"reasoning": "High complexity due to comprehensive error taxonomy, recovery automation, context preservation, and integration across all workflow components. Requires deep understanding of failure modes and recovery strategies."
|
||||
},
|
||||
{
|
||||
"taskId": 47,
|
||||
"taskTitle": "Implement conventional commit message generation",
|
||||
"complexityScore": 4,
|
||||
"recommendedSubtasks": 3,
|
||||
"expansionPrompt": "Break down commit message generation into: template system creation with variable substitution, commit type auto-detection based on task content and file changes, and validation with GitAdapter integration. Follow conventional commit standards and integrate with existing git operations.",
|
||||
"reasoning": "Low-moderate complexity involving template processing, pattern matching for commit type detection, and validation logic. Well-defined conventional commit standards provide clear implementation guidance."
|
||||
},
|
||||
{
|
||||
"taskId": 48,
|
||||
"taskTitle": "Add multi-framework test execution support",
|
||||
"complexityScore": 7,
|
||||
"recommendedSubtasks": 5,
|
||||
"expansionPrompt": "Expand test framework support into: framework detection system for multiple languages, common adapter interface design, Python pytest adapter implementation, Go and Rust adapter implementations, and integration with existing TestRunnerAdapter. Each language adapter should follow the unified interface pattern.",
|
||||
"reasoning": "High complexity due to multi-language support, framework detection across different ecosystems, and adapter pattern implementation. Each language has unique testing conventions and output formats."
|
||||
},
|
||||
{
|
||||
"taskId": 49,
|
||||
"taskTitle": "Implement workflow event streaming for real-time monitoring",
|
||||
"complexityScore": 6,
|
||||
"recommendedSubtasks": 4,
|
||||
"expansionPrompt": "Structure event streaming into: WorkflowOrchestrator EventEmitter enhancement, structured event format with metadata, event persistence to run logs, and optional WebSocket streaming for external monitoring. Focus on event consistency and real-time delivery.",
|
||||
"reasoning": "Moderate-high complexity due to event-driven architecture, structured event formats, persistence integration, and WebSocket implementation. Requires careful design of event schemas and delivery mechanisms."
|
||||
},
|
||||
{
|
||||
"taskId": 50,
|
||||
"taskTitle": "Add intelligent test targeting for faster feedback",
|
||||
"complexityScore": 7,
|
||||
"recommendedSubtasks": 5,
|
||||
"expansionPrompt": "Decompose test targeting into: file change detection system, test dependency analysis engine, framework-specific targeting adapters, test impact calculation algorithm, and fallback integration with TestRunnerAdapter. Focus on accuracy and performance optimization.",
|
||||
"reasoning": "High complexity due to dependency analysis, impact calculation algorithms, framework-specific targeting, and integration with existing test execution. Requires sophisticated analysis of code relationships and test dependencies."
|
||||
},
|
||||
{
|
||||
"taskId": 51,
|
||||
"taskTitle": "Implement dry-run visualization with execution timeline",
|
||||
"complexityScore": 6,
|
||||
"recommendedSubtasks": 4,
|
||||
"expansionPrompt": "Structure dry-run visualization into: timeline calculation engine with duration estimates, estimation algorithms based on task complexity, ASCII art progress visualization with formatting, and resource validation with preflight checks. Focus on accurate planning and clear visual presentation.",
|
||||
"reasoning": "Moderate-high complexity due to timeline calculation, estimation algorithms, ASCII visualization, and resource validation. Requires understanding of workflow timing and visual formatting for terminal output."
|
||||
},
|
||||
{
|
||||
"taskId": 52,
|
||||
"taskTitle": "Add autopilot workflow integration tests",
|
||||
"complexityScore": 8,
|
||||
"recommendedSubtasks": 6,
|
||||
"expansionPrompt": "Structure integration testing into: isolated test environment infrastructure, mock integrations and service stubs, end-to-end workflow test scenarios, performance benchmarking and resource monitoring, test isolation and parallelization strategies, and comprehensive result validation and reporting. Focus on realistic test scenarios and reliable automation.",
|
||||
"reasoning": "High complexity due to end-to-end testing requirements, mock service integration, performance testing, isolation mechanisms, and comprehensive validation. Requires sophisticated test infrastructure and scenario design."
|
||||
},
|
||||
{
|
||||
"taskId": 53,
|
||||
"taskTitle": "Finalize autopilot documentation and examples",
|
||||
"complexityScore": 3,
|
||||
"recommendedSubtasks": 4,
|
||||
"expansionPrompt": "Structure documentation into: comprehensive autopilot documentation covering setup and usage, example PRD files and templates for different project types, troubleshooting guide for common issues and solutions, and demo materials with workflow visualization. Focus on clarity and practical examples.",
|
||||
"reasoning": "Low complexity involving documentation writing, example creation, and demo material production. The main challenge is ensuring accuracy and completeness rather than technical implementation."
|
||||
}
|
||||
]
|
||||
}
|
||||
@@ -0,0 +1,93 @@
|
||||
{
|
||||
"meta": {
|
||||
"generatedAt": "2025-10-09T12:47:27.960Z",
|
||||
"tasksAnalyzed": 10,
|
||||
"totalTasks": 10,
|
||||
"analysisCount": 10,
|
||||
"thresholdScore": 5,
|
||||
"projectName": "Taskmaster",
|
||||
"usedResearch": false
|
||||
},
|
||||
"complexityAnalysis": [
|
||||
{
|
||||
"taskId": 1,
|
||||
"taskTitle": "Design and Implement Global Storage System",
|
||||
"complexityScore": 7,
|
||||
"recommendedSubtasks": 6,
|
||||
"expansionPrompt": "Break down the global storage system implementation into: 1) Path normalization utilities with cross-platform support, 2) Run ID generation and validation, 3) Manifest.json structure and management, 4) Activity.jsonl append-only logging, 5) State.json mutable checkpoint handling, and 6) Directory structure creation and cleanup. Focus on robust error handling, atomic operations, and isolation between different runs.",
|
||||
"reasoning": "Complex system requiring cross-platform path handling, multiple file formats (JSON/JSONL), atomic operations, and state management. The existing codebase shows sophisticated file operations infrastructure but this extends beyond current patterns. Implementation involves filesystem operations, concurrency concerns, and data integrity."
|
||||
},
|
||||
{
|
||||
"taskId": 2,
|
||||
"taskTitle": "Build GitAdapter with Safety Checks",
|
||||
"complexityScore": 8,
|
||||
"recommendedSubtasks": 7,
|
||||
"expansionPrompt": "Decompose GitAdapter into: 1) Git repository detection and validation, 2) Working tree status checking with detailed reporting, 3) Branch operations (create, checkout, list) with safety guards, 4) Commit operations with metadata embedding, 5) Default branch detection and protection logic, 6) Push operations with conflict handling, and 7) Branch name generation from patterns. Emphasize safety checks, confirmation gates, and comprehensive error messages.",
|
||||
"reasoning": "High complexity due to git operations safety requirements, multiple git commands integration, error handling for various git states, and safety mechanisms. The PRD emphasizes never allowing commits on default branch and requiring clean working tree - critical safety features that need robust implementation."
|
||||
},
|
||||
{
|
||||
"taskId": 3,
|
||||
"taskTitle": "Implement Test Result Validator",
|
||||
"complexityScore": 5,
|
||||
"recommendedSubtasks": 4,
|
||||
"expansionPrompt": "Split test validation into: 1) Input validation and schema definition for test results, 2) RED phase validation logic (ensuring failures exist), 3) GREEN phase validation logic (ensuring all tests pass), and 4) Coverage threshold validation with configurable limits. Include comprehensive validation messages and suggestions for common failure scenarios.",
|
||||
"reasoning": "Moderate complexity focused on business logic validation. The validator is framework-agnostic (only validates reported numbers), has clear validation rules, and well-defined input/output. The existing codebase shows validation patterns that can be leveraged."
|
||||
},
|
||||
{
|
||||
"taskId": 4,
|
||||
"taskTitle": "Develop WorkflowOrchestrator State Machine",
|
||||
"complexityScore": 9,
|
||||
"recommendedSubtasks": 8,
|
||||
"expansionPrompt": "Structure the orchestrator into: 1) State machine definition and transitions (Preflight → BranchSetup → SubtaskLoop → Finalize), 2) Event emission system with comprehensive event types, 3) State persistence and recovery mechanisms, 4) Phase coordination and validation, 5) Subtask iteration and progress tracking, 6) Error handling and recovery strategies, 7) Resume functionality from checkpoints, and 8) Integration points for Git, Test, and other adapters.",
|
||||
"reasoning": "Very high complexity as the central coordination component. Must orchestrate multiple adapters, handle state transitions, event emission, persistence, and recovery. The state machine needs to be robust, resumable, and coordinate all other components. Critical for the entire workflow's reliability."
|
||||
},
|
||||
{
|
||||
"taskId": 5,
|
||||
"taskTitle": "Create Enhanced Commit Message Generator",
|
||||
"complexityScore": 4,
|
||||
"recommendedSubtasks": 3,
|
||||
"expansionPrompt": "Organize commit message generation into: 1) Template parsing and variable substitution with configurable templates, 2) Scope detection from changed files with intelligent categorization, and 3) Metadata embedding (task context, test results, coverage) with conventional commits compliance. Ensure messages are parseable and contain all required task metadata.",
|
||||
"reasoning": "Relatively straightforward text processing and template system. The conventional commits format is well-defined, and the metadata requirements are clear. The existing package.json shows commander dependency for CLI patterns that can be leveraged."
|
||||
},
|
||||
{
|
||||
"taskId": 6,
|
||||
"taskTitle": "Implement Subtask TDD Loop",
|
||||
"complexityScore": 8,
|
||||
"recommendedSubtasks": 6,
|
||||
"expansionPrompt": "Break down the TDD loop into: 1) RED phase orchestration with test generation coordination, 2) GREEN phase orchestration with implementation guidance, 3) COMMIT phase with file staging and commit creation, 4) Attempt tracking and maximum retry logic, 5) Phase transition validation and state updates, and 6) Activity logging for all phase transitions. Focus on robust state management and clear error recovery paths.",
|
||||
"reasoning": "High complexity due to coordinating multiple phases, state transitions, retry logic, and integration with multiple adapters (Git, Test, State). This is the core workflow execution engine requiring careful orchestration and error handling."
|
||||
},
|
||||
{
|
||||
"taskId": 7,
|
||||
"taskTitle": "Build CLI Commands for AI Agent Orchestration",
|
||||
"complexityScore": 6,
|
||||
"recommendedSubtasks": 5,
|
||||
"expansionPrompt": "Structure CLI commands into: 1) Command registration and argument parsing setup, 2) `start` and `resume` commands with initialization logic, 3) `next` and `status` commands with JSON output formatting, 4) `complete` command with result validation integration, and 5) `commit` and `abort` commands with git operation coordination. Ensure consistent JSON output for machine parsing and comprehensive error handling.",
|
||||
"reasoning": "Moderate complexity leveraging existing CLI infrastructure. The codebase shows commander usage patterns and CLI structure. Main complexity is in JSON output formatting, argument validation, and integration with the orchestrator component."
|
||||
},
|
||||
{
|
||||
"taskId": 8,
|
||||
"taskTitle": "Develop MCP Tools for AI Agent Integration",
|
||||
"complexityScore": 6,
|
||||
"recommendedSubtasks": 5,
|
||||
"expansionPrompt": "Organize MCP tools into: 1) Tool schema definition and parameter validation, 2) `autopilot_start` and `autopilot_resume` tool implementation, 3) `autopilot_next` and `autopilot_status` tools with context provision, 4) `autopilot_complete_phase` tool with validation integration, and 5) `autopilot_commit` tool with git operations. Ensure parity with CLI functionality and proper error handling.",
|
||||
"reasoning": "Moderate complexity building on existing MCP infrastructure. The codebase shows extensive MCP tooling patterns. Main work is adapting CLI functionality to MCP interface patterns and ensuring consistent behavior between CLI and MCP interfaces."
|
||||
},
|
||||
{
|
||||
"taskId": 9,
|
||||
"taskTitle": "Write AI Agent Integration Documentation and Templates",
|
||||
"complexityScore": 2,
|
||||
"recommendedSubtasks": 2,
|
||||
"expansionPrompt": "Structure documentation into: 1) Comprehensive workflow documentation with step-by-step examples, command usage, and integration patterns, and 2) Template creation for CLAUDE.md integration, example prompts, and troubleshooting guides. Focus on clear examples and practical integration guidance.",
|
||||
"reasoning": "Low complexity documentation task. Requires understanding of the implemented system but primarily involves writing clear instructions and examples. The existing codebase shows good documentation patterns that can be followed."
|
||||
},
|
||||
{
|
||||
"taskId": 10,
|
||||
"taskTitle": "Implement Configuration System and Project Hygiene",
|
||||
"complexityScore": 5,
|
||||
"recommendedSubtasks": 4,
|
||||
"expansionPrompt": "Structure configuration into: 1) Configuration schema definition with comprehensive validation using ajv, 2) Default configuration setup and loading mechanisms, 3) Gitignore management and project directory hygiene rules, and 4) Configuration validation and error reporting. Ensure configurations are validated on startup and provide clear error messages for invalid settings.",
|
||||
"reasoning": "Moderate complexity involving schema validation, file operations, and configuration management. The package.json shows ajv dependency is available. Configuration systems require careful validation and user-friendly error reporting, but follow established patterns."
|
||||
}
|
||||
]
|
||||
}
|
||||
@@ -0,0 +1,93 @@
|
||||
{
|
||||
"meta": {
|
||||
"generatedAt": "2025-10-07T14:16:40.283Z",
|
||||
"tasksAnalyzed": 10,
|
||||
"totalTasks": 10,
|
||||
"analysisCount": 10,
|
||||
"thresholdScore": 5,
|
||||
"projectName": "Taskmaster",
|
||||
"usedResearch": false
|
||||
},
|
||||
"complexityAnalysis": [
|
||||
{
|
||||
"taskId": 1,
|
||||
"taskTitle": "Create autopilot command CLI skeleton",
|
||||
"complexityScore": 4,
|
||||
"recommendedSubtasks": 3,
|
||||
"expansionPrompt": "Break down the autopilot command creation into: 1) Create AutopilotCommand class extending Commander.Command with proper argument parsing and options, 2) Implement command structure with help text and validation following existing patterns, 3) Add basic registration method and placeholder action handler",
|
||||
"reasoning": "Medium complexity due to following established patterns in the codebase. The command-registry.ts and start.command.ts provide clear templates for implementation. Main complexity is argument parsing and option validation."
|
||||
},
|
||||
{
|
||||
"taskId": 2,
|
||||
"taskTitle": "Implement preflight detection system",
|
||||
"complexityScore": 7,
|
||||
"recommendedSubtasks": 5,
|
||||
"expansionPrompt": "Create PreflightChecker with these subtasks: 1) Package.json test script detection and validation, 2) Git working tree status checking using system commands, 3) Tool availability validation (git, gh, node/npm), 4) Default branch detection via git commands, 5) Structured result reporting with success/failure indicators and error messages",
|
||||
"reasoning": "High complexity due to system integration requirements. Needs to interact with multiple external tools (git, npm, gh), parse various file formats, and handle different system configurations. Error handling for missing tools adds complexity."
|
||||
},
|
||||
{
|
||||
"taskId": 3,
|
||||
"taskTitle": "Implement task loading and validation",
|
||||
"complexityScore": 5,
|
||||
"recommendedSubtasks": 3,
|
||||
"expansionPrompt": "Implement task loading: 1) Use existing TaskService from @tm/core to load tasks by ID with proper error handling, 2) Validate task structure including subtask existence and dependency validation, 3) Provide user-friendly error messages for missing tasks or need to expand subtasks first",
|
||||
"reasoning": "Medium-high complexity. While leveraging existing TaskService reduces implementation effort, the validation logic for subtasks and dependencies requires careful handling of edge cases. Task structure validation adds complexity."
|
||||
},
|
||||
{
|
||||
"taskId": 4,
|
||||
"taskTitle": "Create execution plan display logic",
|
||||
"complexityScore": 6,
|
||||
"recommendedSubtasks": 4,
|
||||
"expansionPrompt": "Build ExecutionPlanDisplay: 1) Create display formatter using boxen and chalk for consistent CLI styling, 2) Format preflight check results with color-coded status indicators, 3) Display subtask execution order with RED/GREEN/COMMIT phase visualization, 4) Show branch/tag info and finalization steps with duration estimates",
|
||||
"reasoning": "Moderate-high complexity due to complex formatting requirements and dependency on multiple other components. The display needs to coordinate information from preflight, task validation, and execution planning. CLI styling consistency adds complexity."
|
||||
},
|
||||
{
|
||||
"taskId": 5,
|
||||
"taskTitle": "Implement branch and tag planning",
|
||||
"complexityScore": 3,
|
||||
"recommendedSubtasks": 2,
|
||||
"expansionPrompt": "Create BranchPlanner: 1) Implement branch name generation using pattern <tag>/task-<id>-<slug> with kebab-case conversion and special character handling, 2) Add TaskMaster config integration to determine active tag and handle existing branch conflicts",
|
||||
"reasoning": "Low-medium complexity. String manipulation and naming convention implementation is straightforward. The main complexity is handling edge cases with special characters and existing branch conflicts."
|
||||
},
|
||||
{
|
||||
"taskId": 6,
|
||||
"taskTitle": "Create subtask execution order calculation",
|
||||
"complexityScore": 8,
|
||||
"recommendedSubtasks": 4,
|
||||
"expansionPrompt": "Implement dependency resolution: 1) Build dependency graph from subtask data with proper parsing, 2) Implement topological sort algorithm for execution order, 3) Add circular dependency detection with clear error reporting, 4) Create parallel execution grouping for independent subtasks",
|
||||
"reasoning": "High complexity due to graph algorithms and dependency resolution. Topological sorting, circular dependency detection, and parallel grouping require algorithmic sophistication. Edge cases in dependency chains add significant complexity."
|
||||
},
|
||||
{
|
||||
"taskId": 7,
|
||||
"taskTitle": "Implement TDD phase planning for subtasks",
|
||||
"complexityScore": 6,
|
||||
"recommendedSubtasks": 4,
|
||||
"expansionPrompt": "Create TDDPhasePlanner: 1) Implement test file path detection for common project structures (src/, tests/, __tests__), 2) Parse implementation files from subtask details and descriptions, 3) Generate conventional commit messages for RED/GREEN/COMMIT phases, 4) Add implementation complexity estimation based on subtask content",
|
||||
"reasoning": "Moderate-high complexity due to project structure detection and file path inference. Conventional commit message generation and complexity estimation require understanding of different project layouts and parsing subtask content effectively."
|
||||
},
|
||||
{
|
||||
"taskId": 8,
|
||||
"taskTitle": "Add finalization steps planning",
|
||||
"complexityScore": 4,
|
||||
"recommendedSubtasks": 3,
|
||||
"expansionPrompt": "Create FinalizationPlanner: 1) Implement test suite execution planning with coverage threshold detection from package.json, 2) Add git operations planning (branch push, PR creation) using existing git patterns, 3) Create duration estimation algorithm based on subtask count and complexity metrics",
|
||||
"reasoning": "Medium complexity. Building on existing git utilities and test command detection reduces complexity. Main challenges are coverage threshold parsing and duration estimation algorithms."
|
||||
},
|
||||
{
|
||||
"taskId": 9,
|
||||
"taskTitle": "Integrate command with existing CLI infrastructure",
|
||||
"complexityScore": 3,
|
||||
"recommendedSubtasks": 2,
|
||||
"expansionPrompt": "Complete CLI integration: 1) Add AutopilotCommand to command-registry.ts following existing patterns and update command metadata, 2) Test command registration and help system integration with proper cleanup and error handling",
|
||||
"reasoning": "Low-medium complexity. The command-registry.ts provides a clear pattern to follow. Main work is registration and ensuring proper integration with existing CLI infrastructure. Well-established patterns reduce complexity."
|
||||
},
|
||||
{
|
||||
"taskId": 10,
|
||||
"taskTitle": "Add comprehensive error handling and edge cases",
|
||||
"complexityScore": 7,
|
||||
"recommendedSubtasks": 5,
|
||||
"expansionPrompt": "Implement error handling: 1) Add missing task and invalid task structure error handling with helpful messages, 2) Handle git state errors (dirty working tree, missing tools), 3) Add dependency validation errors (circular, invalid references), 4) Implement missing tool detection with installation guidance, 5) Create user-friendly error messages following existing CLI patterns",
|
||||
"reasoning": "High complexity due to comprehensive error scenarios. Each component (preflight, task loading, dependency resolution) has multiple failure modes that need proper handling. Providing helpful error messages and recovery suggestions adds complexity."
|
||||
}
|
||||
]
|
||||
}
|
||||
@@ -1,6 +1,6 @@
|
||||
{
|
||||
"currentTag": "master",
|
||||
"lastSwitched": "2025-09-12T22:25:27.535Z",
|
||||
"currentTag": "tdd-phase-1-core-rails",
|
||||
"lastSwitched": "2025-10-09T12:41:40.367Z",
|
||||
"branchTagMapping": {
|
||||
"v017-adds": "v017-adds",
|
||||
"next": "next"
|
||||
|
||||
File diff suppressed because one or more lines are too long

CHANGELOG.md (187 lines changed)
@@ -1,5 +1,192 @@
|
||||
# task-master-ai
|
||||
|
||||
## 0.30.0-rc.1
|
||||
|
||||
### Minor Changes
|
||||
|
||||
- [#1317](https://github.com/eyaltoledano/claude-task-master/pull/1317) [`548beb4`](https://github.com/eyaltoledano/claude-task-master/commit/548beb434453c69041572eb5927ee7da7075c213) Thanks [@Crunchyman-ralph](https://github.com/Crunchyman-ralph)! - Add 4.5 haiku and sonnet to supported models for claude-code and anthropic ai providers
|
||||
|
||||
- [#1309](https://github.com/eyaltoledano/claude-task-master/pull/1309) [`ccb87a5`](https://github.com/eyaltoledano/claude-task-master/commit/ccb87a516a11f7ec4b03133c8f24f4fd8c3a45fc) Thanks [@Crunchyman-ralph](https://github.com/Crunchyman-ralph)! - Add autonomous TDD workflow automation system with new `tm autopilot` commands and MCP tools for AI-driven test-driven development.
|
||||
|
||||
**New CLI Commands:**
|
||||
- `tm autopilot start <taskId>` - Initialize TDD workflow
|
||||
- `tm autopilot next` - Get next action in workflow
|
||||
- `tm autopilot status` - Check workflow progress
|
||||
- `tm autopilot complete` - Advance phase with test results
|
||||
- `tm autopilot commit` - Save progress with metadata
|
||||
- `tm autopilot resume` - Continue from checkpoint
|
||||
- `tm autopilot abort` - Cancel workflow
|
||||
|
||||
**New MCP Tools:**
|
||||
Seven new autopilot tools for programmatic control: `autopilot_start`, `autopilot_next`, `autopilot_status`, `autopilot_complete_phase`, `autopilot_commit`, `autopilot_resume`, `autopilot_abort`
|
||||
|
||||
**Features:**
|
||||
- Complete RED → GREEN → COMMIT cycle enforcement
|
||||
- Intelligent commit message generation with metadata
|
||||
- Activity logging and state persistence
|
||||
- Configurable workflow settings via `.taskmaster/config.json`
|
||||
- Comprehensive AI agent integration documentation
|
||||
|
||||
**Documentation:**
|
||||
- AI Agent Integration Guide (2,800+ lines)
|
||||
- TDD Quick Start Guide
|
||||
- Example prompts and integration patterns
|
||||
|
||||
> **Learn more:** [TDD Workflow Quickstart Guide](https://dev.task-master.dev/tdd-workflow/quickstart)
|
||||
|
||||
This release enables AI agents to autonomously execute test-driven development workflows with full state management and recovery capabilities.
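As a rough illustration of how these commands chain together in practice (the task ID and the exact ordering below are placeholders, not a prescribed sequence):

```bash
# Start an autonomous TDD workflow for a task (ID 42 is illustrative)
tm autopilot start 42

# Ask the orchestrator what to do next, then check progress
tm autopilot next
tm autopilot status

# Save progress with generated commit metadata, or pick up later from the checkpoint
tm autopilot commit
tm autopilot resume
```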
|
||||
|
||||
### Patch Changes
|
||||
|
||||
- [#1323](https://github.com/eyaltoledano/claude-task-master/pull/1323) [`dc6652c`](https://github.com/eyaltoledano/claude-task-master/commit/dc6652ccd2b50b91eb55d92899ebf70c7b4d6601) Thanks [@Crunchyman-ralph](https://github.com/Crunchyman-ralph)! - Fix MCP server compatibility with Draft-07 clients (Augment IDE, gemini-cli, gemini code assist)
|
||||
- Resolves #1284
|
||||
|
||||
**Problem:**
|
||||
- MCP tools were using Zod v4, which outputs JSON Schema Draft 2020-12
|
||||
- MCP clients only support Draft-07
|
||||
- Tools were not discoverable in gemini-cli and other clients
|
||||
|
||||
**Solution:**
|
||||
- Updated all MCP tools to import from `zod/v3` instead of `zod`
|
||||
- Zod v3 schemas convert to Draft-07 via FastMCP's zod-to-json-schema
|
||||
- Fixed logger to use stderr instead of stdout (MCP protocol requirement)
|
||||
|
||||
This is a temporary workaround until FastMCP adds JSON Schema version configuration.
|
||||
|
||||
## 0.30.0-rc.0
|
||||
|
||||
### Minor Changes
|
||||
|
||||
- [#1181](https://github.com/eyaltoledano/claude-task-master/pull/1181) [`a69d8c9`](https://github.com/eyaltoledano/claude-task-master/commit/a69d8c91dc9205a3fdaf9d32276144fa3bcad55d) Thanks [@karol-f](https://github.com/karol-f)! - Add configurable MCP tool loading to optimize LLM context usage
|
||||
|
||||
You can now control which Task Master MCP tools are loaded by setting the `TASK_MASTER_TOOLS` environment variable in your MCP configuration. This helps reduce context usage for LLMs by only loading the tools you need.
|
||||
|
||||
**Configuration Options:**
|
||||
- `all` (default): Load all 36 tools
|
||||
- `core` or `lean`: Load only 7 essential tools for daily development
|
||||
- Includes: `get_tasks`, `next_task`, `get_task`, `set_task_status`, `update_subtask`, `parse_prd`, `expand_task`
|
||||
- `standard`: Load 15 commonly used tools (all core tools plus 8 more)
|
||||
- Additional tools: `initialize_project`, `analyze_project_complexity`, `expand_all`, `add_subtask`, `remove_task`, `generate`, `add_task`, `complexity_report`
|
||||
- Custom list: Comma-separated tool names (e.g., `get_tasks,next_task,set_task_status`)
|
||||
|
||||
**Example .mcp.json configuration:**
|
||||
|
||||
```json
{
	"mcpServers": {
		"task-master-ai": {
			"command": "npx",
			"args": ["-y", "task-master-ai"],
			"env": {
				"TASK_MASTER_TOOLS": "standard",
				"ANTHROPIC_API_KEY": "your_key_here"
			}
		}
	}
}
```
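
The same `env` block can also carry a comma-separated tool list instead of a preset name; a minimal sketch using the tools named in the custom-list option above:

```json
{
	"mcpServers": {
		"task-master-ai": {
			"command": "npx",
			"args": ["-y", "task-master-ai"],
			"env": {
				"TASK_MASTER_TOOLS": "get_tasks,next_task,set_task_status",
				"ANTHROPIC_API_KEY": "your_key_here"
			}
		}
	}
}
```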
|
||||
|
||||
For complete details on all available tools, configuration examples, and usage guidelines, see the [MCP Tools documentation](https://docs.task-master.dev/capabilities/mcp#configurable-tool-loading).
|
||||
|
||||
- [#1312](https://github.com/eyaltoledano/claude-task-master/pull/1312) [`d7fca18`](https://github.com/eyaltoledano/claude-task-master/commit/d7fca1844f24ad8ce079c21d9799a3c4b4413381) Thanks [@Crunchyman-ralph](https://github.com/Crunchyman-ralph)! - Improve next command to work with remote
|
||||
|
||||
### Patch Changes
|
||||
|
||||
- [#1314](https://github.com/eyaltoledano/claude-task-master/pull/1314) [`6bc75c0`](https://github.com/eyaltoledano/claude-task-master/commit/6bc75c0ac68b59cb10cee70574a689f83e4de768) Thanks [@Crunchyman-ralph](https://github.com/Crunchyman-ralph)! - Improve auth token refresh flow
|
||||
|
||||
- [#1302](https://github.com/eyaltoledano/claude-task-master/pull/1302) [`3283506`](https://github.com/eyaltoledano/claude-task-master/commit/3283506444d59896ecb97721ef2e96e290eb84d3) Thanks [@bjcoombs](https://github.com/bjcoombs)! - Enable Task Master commands to traverse parent directories to find project root from nested paths
|
||||
|
||||
Fixes #1301
|
||||
|
||||
## 0.29.0
|
||||
|
||||
### Minor Changes
|
||||
|
||||
- [#1286](https://github.com/eyaltoledano/claude-task-master/pull/1286) [`f12a16d`](https://github.com/eyaltoledano/claude-task-master/commit/f12a16d09649f62148515f11f616157c7d0bd2d5) Thanks [@Crunchyman-ralph](https://github.com/Crunchyman-ralph)! - Add changelog highlights to auto-update notifications
|
||||
|
||||
When the CLI auto-updates to a new version, it now displays a "What's New" section.
|
||||
|
||||
- [#1293](https://github.com/eyaltoledano/claude-task-master/pull/1293) [`3010b90`](https://github.com/eyaltoledano/claude-task-master/commit/3010b90d98f3a7d8636caa92fc33d6ee69d4bed0) Thanks [@Crunchyman-ralph](https://github.com/Crunchyman-ralph)! - Add Claude Code plugin with marketplace distribution
|
||||
|
||||
This release introduces official Claude Code plugin support, marking the evolution from legacy `.claude` directory copying to a modern plugin-based architecture.
|
||||
|
||||
## 🎉 New: Claude Code Plugin
|
||||
|
||||
Task Master AI commands and agents are now distributed as a proper Claude Code plugin:
|
||||
- **49 slash commands** with clean naming (`/taskmaster:command-name`)
|
||||
- **3 specialized AI agents** (task-orchestrator, task-executor, task-checker)
|
||||
- **MCP server integration** for deep Claude Code integration
|
||||
|
||||
**Installation:**
|
||||
|
||||
```bash
/plugin marketplace add eyaltoledano/claude-task-master
/plugin install taskmaster@taskmaster
```
|
||||
|
||||
**Changed behavior:** The `rules add claude` command no longer copies commands and agents to `.claude/commands/` and `.claude/agents/`. Instead, it now:
- Shows plugin installation instructions
|
||||
- Only manages CLAUDE.md imports for agent instructions
|
||||
- Directs users to install the official plugin
|
||||
|
||||
**Migration for Existing Users:**
|
||||
|
||||
If you previously used `rules add claude`:
|
||||
1. The old commands in `.claude/commands/` will continue to work but won't receive updates
|
||||
2. Install the plugin for the latest features: `/plugin install taskmaster@taskmaster`
|
||||
3. Remove the old `.claude/commands/` and `.claude/agents/` directories
|
||||
|
||||
**Why This Change?**
|
||||
|
||||
Claude Code plugins provide:
|
||||
- ✅ Automatic updates when we release new features
|
||||
- ✅ Better command organization and naming
|
||||
- ✅ Seamless integration with Claude Code
|
||||
- ✅ No manual file copying or management
|
||||
|
||||
The plugin system is the future of Task Master AI integration with Claude Code!
|
||||
|
||||
- [#1285](https://github.com/eyaltoledano/claude-task-master/pull/1285) [`2a910a4`](https://github.com/eyaltoledano/claude-task-master/commit/2a910a40bac375f9f61d797bf55597303d556b48) Thanks [@Crunchyman-ralph](https://github.com/Crunchyman-ralph)! - Add RPG (Repository Planning Graph) method template for structured PRD creation. The new `example_prd_rpg.txt` template teaches AI agents and developers the RPG methodology through embedded instructions, inline good/bad examples, and XML-style tags for structure. This template enables creation of dependency-aware PRDs that automatically generate topologically-ordered task graphs when parsed with Task Master.
|
||||
|
||||
Key features:
|
||||
- Method-as-template: teaches RPG principles (dual-semantics, explicit dependencies, topological order) while being used
|
||||
- Inline instructions at decision points guide AI through each section
|
||||
- Good/bad examples for immediate pattern matching
|
||||
- Flexible plain-text format with XML-style tags for parseability
|
||||
- Critical dependency-graph section ensures correct task ordering
|
||||
- Automatic inclusion during `task-master init`
|
||||
- Comprehensive documentation at [docs.task-master.dev/capabilities/rpg-method](https://docs.task-master.dev/capabilities/rpg-method)
|
||||
- Tool recommendations for code-context-aware PRD creation (Claude Code, Cursor, Gemini CLI, Codex/Grok)
|
||||
|
||||
The RPG template complements the existing `example_prd.txt` and provides a more structured approach for complex projects requiring clear module boundaries and dependency chains.
|
||||
|
||||
- [#1287](https://github.com/eyaltoledano/claude-task-master/pull/1287) [`90e6bdc`](https://github.com/eyaltoledano/claude-task-master/commit/90e6bdcf1c59f65ad27fcdfe3b13b9dca7e77654) Thanks [@Crunchyman-ralph](https://github.com/Crunchyman-ralph)! - Enhance `expand_all` to intelligently use complexity analysis recommendations when expanding tasks.
|
||||
|
||||
The expand-all operation now automatically leverages recommendations from `analyze-complexity` to determine optimal subtask counts for each task, resulting in more accurate and context-aware task breakdowns.
|
||||
|
||||
Key improvements:
|
||||
- Automatic integration with complexity analysis reports
|
||||
- Tag-aware complexity report path resolution
|
||||
- Intelligent subtask count determination based on task complexity
|
||||
- Falls back to defaults when complexity analysis is unavailable
|
||||
- Enhanced logging for better visibility into expansion decisions
|
||||
|
||||
When you run `task-master expand --all` after `task-master analyze-complexity`, Task Master now uses the recommended subtask counts from the complexity analysis instead of applying uniform defaults, ensuring each task is broken down according to its actual complexity.
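
A minimal sketch of that flow, using only the two commands named above:

```bash
# Generate the complexity report first, then expand every task using its recommended subtask count
task-master analyze-complexity
task-master expand --all
```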
|
||||
|
||||
### Patch Changes
|
||||
|
||||
- [#1191](https://github.com/eyaltoledano/claude-task-master/pull/1191) [`aaf903f`](https://github.com/eyaltoledano/claude-task-master/commit/aaf903ff2f606c779a22e9a4b240ab57b3683815) Thanks [@Crunchyman-ralph](https://github.com/Crunchyman-ralph)! - Fix cross-level task dependencies not being saved
|
||||
|
||||
Fixes an issue where adding dependencies between subtasks and top-level tasks (e.g., `task-master add-dependency --id=2.2 --depends-on=11`) would report success but fail to persist the changes. Dependencies can now be created in both directions between any task levels.
|
||||
|
||||
- [#1299](https://github.com/eyaltoledano/claude-task-master/pull/1299) [`4c1ef2c`](https://github.com/eyaltoledano/claude-task-master/commit/4c1ef2ca94411c53bcd2a78ec710b06c500236dd) Thanks [@Crunchyman-ralph](https://github.com/Crunchyman-ralph)! - Improve refresh token when authenticating
|
||||
|
||||
## 0.29.0-rc.1
|
||||
|
||||
### Patch Changes
|
||||
|
||||
- [#1299](https://github.com/eyaltoledano/claude-task-master/pull/1299) [`a6c5152`](https://github.com/eyaltoledano/claude-task-master/commit/a6c5152f20edd8717cf1aea34e7c178b1261aa99) Thanks [@Crunchyman-ralph](https://github.com/Crunchyman-ralph)! - Improve refresh token when authenticating
|
||||
|
||||
## 0.29.0-rc.0
|
||||
|
||||
### Minor Changes
|
||||
|
||||
CLAUDE.md (16 lines changed)
@@ -6,13 +6,20 @@
|
||||
|
||||
## Test Guidelines
|
||||
|
||||
### Test File Placement
|
||||
|
||||
- **Package & tests**: Place in `packages/<package-name>/src/<module>/<file>.spec.ts` or `apps/<app-name>/src/<module>/<file>.spec.ts` alongside source
|
||||
- **Package integration tests**: Place in `packages/<package-name>/tests/integration/<module>/<file>.test.ts` or `apps/<app-name>/tests/integration/<module>/<file>.test.ts` alongside source
|
||||
- **Isolated unit tests**: Use `tests/unit/packages/<package-name>/` only when parallel placement isn't possible
|
||||
- **Test extension**: Always use `.ts` for TypeScript tests, never `.js`
|
||||
|
||||
### Synchronous Tests
|
||||
- **NEVER use async/await in test functions** unless testing actual asynchronous operations
|
||||
- Use synchronous top-level imports instead of dynamic `await import()`
|
||||
- Test bodies should be synchronous whenever possible
|
||||
- Example:
|
||||
```javascript
|
||||
// ✅ CORRECT - Synchronous imports
|
||||
```typescript
|
||||
// ✅ CORRECT - Synchronous imports with .ts extension
|
||||
import { MyClass } from '../src/my-class.js';
|
||||
|
||||
it('should verify behavior', () => {
|
||||
@@ -26,6 +33,11 @@
|
||||
});
|
||||
```
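
A complete, self-contained sketch of a spec file that follows these rules (the file path, `Example` class, and assertion are illustrative only; assumes Vitest, which the workspace test scripts already use):

```typescript
// packages/<package-name>/src/example/example.spec.ts (placed alongside the source it tests)
import { describe, expect, it } from 'vitest';
import { Example } from './example.js';

describe('Example', () => {
	it('should verify behavior', () => {
		// Synchronous body: no async/await, no dynamic import()
		const instance = new Example();
		expect(instance).toBeDefined();
	});
});
```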
|
||||
|
||||
## Documentation Guidelines
|
||||
|
||||
- **Documentation location**: Write docs in `apps/docs/` (Mintlify site source), not `docs/`
|
||||
- **Documentation URL**: Reference docs at https://docs.task-master.dev, not local file paths
|
||||
|
||||
## Changeset Guidelines
|
||||
|
||||
- When creating changesets, remember that it's user-facing, meaning we don't have to get into the specifics of the code, but rather mention what the end-user is getting or fixing from this changeset.
|
||||
README.md (80 lines changed)
@@ -119,6 +119,7 @@ MCP (Model Control Protocol) lets you run Task Master directly from your editor.
|
||||
"command": "npx",
|
||||
"args": ["-y", "task-master-ai"],
|
||||
"env": {
|
||||
// "TASK_MASTER_TOOLS": "all", // Options: "all", "standard", "core", or comma-separated list of tools
|
||||
"ANTHROPIC_API_KEY": "YOUR_ANTHROPIC_API_KEY_HERE",
|
||||
"PERPLEXITY_API_KEY": "YOUR_PERPLEXITY_API_KEY_HERE",
|
||||
"OPENAI_API_KEY": "YOUR_OPENAI_KEY_HERE",
|
||||
@@ -148,6 +149,7 @@ MCP (Model Control Protocol) lets you run Task Master directly from your editor.
|
||||
"command": "npx",
|
||||
"args": ["-y", "task-master-ai"],
|
||||
"env": {
|
||||
// "TASK_MASTER_TOOLS": "all", // Options: "all", "standard", "core", or comma-separated list of tools
|
||||
"ANTHROPIC_API_KEY": "YOUR_ANTHROPIC_API_KEY_HERE",
|
||||
"PERPLEXITY_API_KEY": "YOUR_PERPLEXITY_API_KEY_HERE",
|
||||
"OPENAI_API_KEY": "YOUR_OPENAI_KEY_HERE",
|
||||
@@ -196,7 +198,7 @@ Initialize taskmaster-ai in my project
|
||||
|
||||
#### 5. Make sure you have a PRD (Recommended)
|
||||
|
||||
For **new projects**: Create your PRD at `.taskmaster/docs/prd.txt`
|
||||
For **new projects**: Create your PRD at `.taskmaster/docs/prd.txt`.
|
||||
For **existing projects**: You can use `scripts/prd.txt` or migrate with `task-master migrate`
|
||||
|
||||
An example PRD template is available after initialization in `.taskmaster/templates/example_prd.txt`.
|
||||
@@ -282,6 +284,76 @@ task-master generate
|
||||
task-master rules add windsurf,roo,vscode
|
||||
```
|
||||
|
||||
## Tool Loading Configuration
|
||||
|
||||
### Optimizing MCP Tool Loading
|
||||
|
||||
Task Master's MCP server supports selective tool loading to reduce context window usage. By default, all 36 tools are loaded (~21,000 tokens) to maintain backward compatibility with existing installations.
|
||||
|
||||
You can optimize performance by configuring the `TASK_MASTER_TOOLS` environment variable:
|
||||
|
||||
### Available Modes
|
||||
|
||||
| Mode | Tools | Context Usage | Use Case |
|------|-------|--------------|----------|
| `all` (default) | 36 | ~21,000 tokens | Complete feature set - all tools available |
| `standard` | 15 | ~10,000 tokens | Common task management operations |
| `core` (or `lean`) | 7 | ~5,000 tokens | Essential daily development workflow |
| `custom` | Variable | Variable | Comma-separated list of specific tools |
|
||||
|
||||
### Configuration Methods
|
||||
|
||||
#### Method 1: Environment Variable in MCP Configuration
|
||||
|
||||
Add `TASK_MASTER_TOOLS` to your MCP configuration file's `env` section:
|
||||
|
||||
```jsonc
{
	"mcpServers": { // or "servers" for VS Code
		"task-master-ai": {
			"command": "npx",
			"args": ["-y", "--package=task-master-ai", "task-master-ai"],
			"env": {
				"TASK_MASTER_TOOLS": "standard", // Options: "all", "standard", "core", "lean", or comma-separated list
				"ANTHROPIC_API_KEY": "your-key-here",
				// ... other API keys
			}
		}
	}
}
```
|
||||
|
||||
#### Method 2: Claude Code CLI (One-Time Setup)
|
||||
|
||||
For Claude Code users, you can set the mode during installation:
|
||||
|
||||
```bash
# Core mode example (~70% token reduction)
claude mcp add task-master-ai --scope user \
  --env TASK_MASTER_TOOLS="core" \
  -- npx -y task-master-ai@latest

# Custom tools example
claude mcp add task-master-ai --scope user \
  --env TASK_MASTER_TOOLS="get_tasks,next_task,set_task_status" \
  -- npx -y task-master-ai@latest
```
|
||||
|
||||
### Tool Sets Details
|
||||
|
||||
**Core Tools (7):** `get_tasks`, `next_task`, `get_task`, `set_task_status`, `update_subtask`, `parse_prd`, `expand_task`
|
||||
|
||||
**Standard Tools (15):** All core tools plus `initialize_project`, `analyze_project_complexity`, `expand_all`, `add_subtask`, `remove_task`, `generate`, `add_task`, `complexity_report`
|
||||
|
||||
**All Tools (36):** Complete set including project setup, task management, analysis, dependencies, tags, research, and more
|
||||
|
||||
### Recommendations
|
||||
|
||||
- **New users**: Start with `"standard"` mode for a good balance
|
||||
- **Large projects**: Use `"core"` mode to minimize token usage
|
||||
- **Complex workflows**: Use `"all"` mode or custom selection
|
||||
- **Backward compatibility**: If not specified, defaults to `"all"` mode
|
||||
|
||||
## Claude Code Support
|
||||
|
||||
Task Master now supports Claude models through the Claude Code CLI, which requires no API key:
|
||||
@@ -310,6 +382,12 @@ cd claude-task-master
|
||||
node scripts/init.js
|
||||
```
|
||||
|
||||
## Join Our Team
|
||||
|
||||
<a href="https://tryhamster.com" target="_blank">
|
||||
<img src="./images/hamster-hiring.png" alt="Join Hamster's founding team" />
|
||||
</a>
|
||||
|
||||
## Contributors
|
||||
|
||||
<a href="https://github.com/eyaltoledano/claude-task-master/graphs/contributors">
|
||||
|
||||
@@ -11,6 +11,13 @@
|
||||
|
||||
### Patch Changes
|
||||
|
||||
- Updated dependencies []:
|
||||
- @tm/core@null
|
||||
|
||||
## null
|
||||
|
||||
### Patch Changes
|
||||
|
||||
- Updated dependencies []:
|
||||
- @tm/core@null
|
||||
|
||||
|
||||
@@ -22,6 +22,7 @@
|
||||
"test:ci": "vitest run --coverage --reporter=dot"
|
||||
},
|
||||
"dependencies": {
|
||||
"@inquirer/search": "^3.2.0",
|
||||
"@tm/core": "*",
|
||||
"boxen": "^8.0.1",
|
||||
"chalk": "5.6.2",
|
||||
|
||||
@@ -8,11 +8,13 @@ import { Command } from 'commander';
|
||||
// Import all commands
|
||||
import { ListTasksCommand } from './commands/list.command.js';
|
||||
import { ShowCommand } from './commands/show.command.js';
|
||||
import { NextCommand } from './commands/next.command.js';
|
||||
import { AuthCommand } from './commands/auth.command.js';
|
||||
import { ContextCommand } from './commands/context.command.js';
|
||||
import { StartCommand } from './commands/start.command.js';
|
||||
import { SetStatusCommand } from './commands/set-status.command.js';
|
||||
import { ExportCommand } from './commands/export.command.js';
|
||||
import { AutopilotCommand } from './commands/autopilot/index.js';
|
||||
|
||||
/**
|
||||
* Command metadata for registration
|
||||
@@ -45,6 +47,12 @@ export class CommandRegistry {
|
||||
commandClass: ShowCommand as any,
|
||||
category: 'task'
|
||||
},
|
||||
{
|
||||
name: 'next',
|
||||
description: 'Find the next available task to work on',
|
||||
commandClass: NextCommand as any,
|
||||
category: 'task'
|
||||
},
|
||||
{
|
||||
name: 'start',
|
||||
description: 'Start working on a task with claude-code',
|
||||
@@ -63,6 +71,13 @@ export class CommandRegistry {
|
||||
commandClass: ExportCommand as any,
|
||||
category: 'task'
|
||||
},
|
||||
{
|
||||
name: 'autopilot',
|
||||
description:
|
||||
'AI agent orchestration for TDD workflow (start, resume, next, complete, commit, status, abort)',
|
||||
commandClass: AutopilotCommand as any,
|
||||
category: 'development'
|
||||
},
|
||||
|
||||
// Authentication & Context Commands
|
||||
{
|
||||
|
||||
@@ -14,6 +14,8 @@ import {
|
||||
type AuthCredentials
|
||||
} from '@tm/core/auth';
|
||||
import * as ui from '../utils/ui.js';
|
||||
import { ContextCommand } from './context.command.js';
|
||||
import { displayError } from '../utils/error-handler.js';
|
||||
|
||||
/**
|
||||
* Result type from auth command
|
||||
@@ -116,8 +118,7 @@ export class AuthCommand extends Command {
|
||||
process.exit(0);
|
||||
}, 100);
|
||||
} catch (error: any) {
|
||||
this.handleError(error);
|
||||
process.exit(1);
|
||||
displayError(error);
|
||||
}
|
||||
}
|
||||
|
||||
@@ -133,8 +134,7 @@ export class AuthCommand extends Command {
|
||||
process.exit(1);
|
||||
}
|
||||
} catch (error: any) {
|
||||
this.handleError(error);
|
||||
process.exit(1);
|
||||
displayError(error);
|
||||
}
|
||||
}
|
||||
|
||||
@@ -146,8 +146,7 @@ export class AuthCommand extends Command {
|
||||
const result = this.displayStatus();
|
||||
this.setLastResult(result);
|
||||
} catch (error: any) {
|
||||
this.handleError(error);
|
||||
process.exit(1);
|
||||
displayError(error);
|
||||
}
|
||||
}
|
||||
|
||||
@@ -163,8 +162,7 @@ export class AuthCommand extends Command {
|
||||
process.exit(1);
|
||||
}
|
||||
} catch (error: any) {
|
||||
this.handleError(error);
|
||||
process.exit(1);
|
||||
displayError(error);
|
||||
}
|
||||
}
|
||||
|
||||
@@ -187,19 +185,29 @@ export class AuthCommand extends Command {
|
||||
if (credentials.expiresAt) {
|
||||
const expiresAt = new Date(credentials.expiresAt);
|
||||
const now = new Date();
|
||||
const hoursRemaining = Math.floor(
|
||||
(expiresAt.getTime() - now.getTime()) / (1000 * 60 * 60)
|
||||
);
|
||||
const timeRemaining = expiresAt.getTime() - now.getTime();
|
||||
const hoursRemaining = Math.floor(timeRemaining / (1000 * 60 * 60));
|
||||
const minutesRemaining = Math.floor(timeRemaining / (1000 * 60));
|
||||
|
||||
if (hoursRemaining > 0) {
|
||||
console.log(
|
||||
chalk.gray(
|
||||
` Expires: ${expiresAt.toLocaleString()} (${hoursRemaining} hours remaining)`
|
||||
)
|
||||
);
|
||||
if (timeRemaining > 0) {
|
||||
// Token is still valid
|
||||
if (hoursRemaining > 0) {
|
||||
console.log(
|
||||
chalk.gray(
|
||||
` Expires at: ${expiresAt.toLocaleString()} (${hoursRemaining} hours remaining)`
|
||||
)
|
||||
);
|
||||
} else {
|
||||
console.log(
|
||||
chalk.gray(
|
||||
` Expires at: ${expiresAt.toLocaleString()} (${minutesRemaining} minutes remaining)`
|
||||
)
|
||||
);
|
||||
}
|
||||
} else {
|
||||
// Token has expired
|
||||
console.log(
|
||||
chalk.yellow(` Token expired at: ${expiresAt.toLocaleString()}`)
|
||||
chalk.yellow(` Expired at: ${expiresAt.toLocaleString()}`)
|
||||
);
|
||||
}
|
||||
} else {
|
||||
@@ -341,6 +349,37 @@ export class AuthCommand extends Command {
|
||||
chalk.gray(` Logged in as: ${credentials.email || credentials.userId}`)
|
||||
);
|
||||
|
||||
// Post-auth: Set up workspace context
|
||||
console.log(); // Add spacing
|
||||
try {
|
||||
const contextCommand = new ContextCommand();
|
||||
const contextResult = await contextCommand.setupContextInteractive();
|
||||
if (contextResult.success) {
|
||||
if (contextResult.orgSelected && contextResult.briefSelected) {
|
||||
console.log(
|
||||
chalk.green('✓ Workspace context configured successfully')
|
||||
);
|
||||
} else if (contextResult.orgSelected) {
|
||||
console.log(chalk.green('✓ Organization selected'));
|
||||
}
|
||||
} else {
|
||||
console.log(
|
||||
chalk.yellow('⚠ Context setup was skipped or encountered issues')
|
||||
);
|
||||
console.log(
|
||||
chalk.gray(' You can set up context later with "tm context"')
|
||||
);
|
||||
}
|
||||
} catch (contextError) {
|
||||
console.log(chalk.yellow('⚠ Context setup encountered an error'));
|
||||
console.log(
|
||||
chalk.gray(' You can set up context later with "tm context"')
|
||||
);
|
||||
if (process.env.DEBUG) {
|
||||
console.error(chalk.gray((contextError as Error).message));
|
||||
}
|
||||
}
|
||||
|
||||
return {
|
||||
success: true,
|
||||
action: 'login',
|
||||
@@ -348,7 +387,7 @@ export class AuthCommand extends Command {
|
||||
message: 'Authentication successful'
|
||||
};
|
||||
} catch (error) {
|
||||
this.handleAuthError(error as AuthenticationError);
|
||||
displayError(error, { skipExit: true });
|
||||
|
||||
return {
|
||||
success: false,
|
||||
@@ -411,51 +450,6 @@ export class AuthCommand extends Command {
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Handle authentication errors
|
||||
*/
|
||||
private handleAuthError(error: AuthenticationError): void {
|
||||
console.error(chalk.red(`\n✗ ${error.message}`));
|
||||
|
||||
switch (error.code) {
|
||||
case 'NETWORK_ERROR':
|
||||
ui.displayWarning(
|
||||
'Please check your internet connection and try again.'
|
||||
);
|
||||
break;
|
||||
case 'INVALID_CREDENTIALS':
|
||||
ui.displayWarning('Please check your credentials and try again.');
|
||||
break;
|
||||
case 'AUTH_EXPIRED':
|
||||
ui.displayWarning(
|
||||
'Your session has expired. Please authenticate again.'
|
||||
);
|
||||
break;
|
||||
default:
|
||||
if (process.env.DEBUG) {
|
||||
console.error(chalk.gray(error.stack || ''));
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Handle general errors
|
||||
*/
|
||||
private handleError(error: any): void {
|
||||
if (error instanceof AuthenticationError) {
|
||||
this.handleAuthError(error);
|
||||
} else {
|
||||
const msg = error?.getSanitizedDetails?.() ?? {
|
||||
message: error?.message ?? String(error)
|
||||
};
|
||||
console.error(chalk.red(`Error: ${msg.message || 'Unexpected error'}`));
|
||||
|
||||
if (error.stack && process.env.DEBUG) {
|
||||
console.error(chalk.gray(error.stack));
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Set the last result for programmatic access
|
||||
*/
|
||||
|
||||
apps/cli/src/commands/autopilot.command.ts (new file, 515 lines)
@@ -0,0 +1,515 @@
|
||||
/**
|
||||
* @fileoverview AutopilotCommand using Commander's native class pattern
|
||||
* Extends Commander.Command for better integration with the framework
|
||||
* This is a thin presentation layer over @tm/core's autopilot functionality
|
||||
*/
|
||||
|
||||
import { Command } from 'commander';
|
||||
import chalk from 'chalk';
|
||||
import boxen from 'boxen';
|
||||
import ora, { type Ora } from 'ora';
|
||||
import {
|
||||
createTaskMasterCore,
|
||||
type TaskMasterCore,
|
||||
type Task,
|
||||
type Subtask
|
||||
} from '@tm/core';
|
||||
import * as ui from '../utils/ui.js';
|
||||
|
||||
/**
|
||||
* CLI-specific options interface for the autopilot command
|
||||
*/
|
||||
export interface AutopilotCommandOptions {
|
||||
format?: 'text' | 'json';
|
||||
project?: string;
|
||||
dryRun?: boolean;
|
||||
}
|
||||
|
||||
/**
|
||||
* Preflight check result for a single check
|
||||
*/
|
||||
export interface PreflightCheckResult {
|
||||
success: boolean;
|
||||
message?: string;
|
||||
}
|
||||
|
||||
/**
|
||||
* Overall preflight check results
|
||||
*/
|
||||
export interface PreflightResult {
|
||||
success: boolean;
|
||||
testCommand: PreflightCheckResult;
|
||||
gitWorkingTree: PreflightCheckResult;
|
||||
requiredTools: PreflightCheckResult;
|
||||
defaultBranch: PreflightCheckResult;
|
||||
}
|
||||
|
||||
/**
|
||||
* CLI-specific result type from autopilot command
|
||||
*/
|
||||
export interface AutopilotCommandResult {
|
||||
success: boolean;
|
||||
taskId: string;
|
||||
task?: Task;
|
||||
error?: string;
|
||||
message?: string;
|
||||
}
|
||||
|
||||
/**
|
||||
* AutopilotCommand extending Commander's Command class
|
||||
* This is a thin presentation layer over @tm/core's autopilot functionality
|
||||
*/
|
||||
export class AutopilotCommand extends Command {
|
||||
private tmCore?: TaskMasterCore;
|
||||
private lastResult?: AutopilotCommandResult;
|
||||
|
||||
constructor(name?: string) {
|
||||
super(name || 'autopilot');
|
||||
|
||||
// Configure the command
|
||||
this.description(
|
||||
'Execute a task autonomously using TDD workflow with git integration'
|
||||
)
|
||||
.argument('<taskId>', 'Task ID to execute autonomously')
|
||||
.option('-f, --format <format>', 'Output format (text, json)', 'text')
|
||||
.option('-p, --project <path>', 'Project root directory', process.cwd())
|
||||
.option(
|
||||
'--dry-run',
|
||||
'Show what would be executed without performing actions'
|
||||
)
|
||||
.action(async (taskId: string, options: AutopilotCommandOptions) => {
|
||||
await this.executeCommand(taskId, options);
|
||||
});
|
||||
}
|
||||
|
||||
/**
|
||||
* Execute the autopilot command
|
||||
*/
|
||||
private async executeCommand(
|
||||
taskId: string,
|
||||
options: AutopilotCommandOptions
|
||||
): Promise<void> {
|
||||
let spinner: Ora | null = null;
|
||||
|
||||
try {
|
||||
// Validate options
|
||||
if (!this.validateOptions(options)) {
|
||||
process.exit(1);
|
||||
}
|
||||
|
||||
// Validate task ID format
|
||||
if (!this.validateTaskId(taskId)) {
|
||||
ui.displayError(`Invalid task ID format: ${taskId}`);
|
||||
process.exit(1);
|
||||
}
|
||||
|
||||
// Initialize tm-core with spinner
|
||||
spinner = ora('Initializing Task Master...').start();
|
||||
await this.initializeCore(options.project || process.cwd());
|
||||
spinner.succeed('Task Master initialized');
|
||||
|
||||
// Load and validate task existence
|
||||
spinner = ora(`Loading task ${taskId}...`).start();
|
||||
const task = await this.loadTask(taskId);
|
||||
|
||||
if (!task) {
|
||||
spinner.fail(`Task ${taskId} not found`);
|
||||
ui.displayError(`Task with ID ${taskId} does not exist`);
|
||||
process.exit(1);
|
||||
}
|
||||
|
||||
spinner.succeed(`Task ${taskId} loaded`);
|
||||
|
||||
// Display task information
|
||||
this.displayTaskInfo(task, options.dryRun || false);
|
||||
|
||||
// Execute autopilot logic (placeholder for now)
|
||||
const result = await this.performAutopilot(taskId, task, options);
|
||||
|
||||
// Store result for programmatic access
|
||||
this.setLastResult(result);
|
||||
|
||||
// Display results
|
||||
this.displayResults(result, options);
|
||||
} catch (error: unknown) {
|
||||
if (spinner) {
|
||||
spinner.fail('Operation failed');
|
||||
}
|
||||
this.handleError(error);
|
||||
process.exit(1);
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Validate command options
|
||||
*/
|
||||
private validateOptions(options: AutopilotCommandOptions): boolean {
|
||||
// Validate format
|
||||
if (options.format && !['text', 'json'].includes(options.format)) {
|
||||
console.error(chalk.red(`Invalid format: ${options.format}`));
|
||||
console.error(chalk.gray(`Valid formats: text, json`));
|
||||
return false;
|
||||
}
|
||||
|
||||
return true;
|
||||
}
|
||||
|
||||
/**
|
||||
* Validate task ID format
|
||||
*/
|
||||
private validateTaskId(taskId: string): boolean {
|
||||
// Task ID should be a number or number.number format (e.g., "1" or "1.2")
|
||||
const taskIdPattern = /^\d+(\.\d+)*$/;
|
||||
return taskIdPattern.test(taskId);
|
||||
}
|
||||
|
||||
/**
|
||||
* Initialize TaskMasterCore
|
||||
*/
|
||||
private async initializeCore(projectRoot: string): Promise<void> {
|
||||
if (!this.tmCore) {
|
||||
this.tmCore = await createTaskMasterCore({ projectPath: projectRoot });
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Load task from tm-core
|
||||
*/
|
||||
private async loadTask(taskId: string): Promise<Task | null> {
|
||||
if (!this.tmCore) {
|
||||
throw new Error('TaskMasterCore not initialized');
|
||||
}
|
||||
|
||||
try {
|
||||
const { task } = await this.tmCore.getTaskWithSubtask(taskId);
|
||||
return task;
|
||||
} catch (error) {
|
||||
return null;
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Display task information before execution
|
||||
*/
|
||||
private displayTaskInfo(task: Task, isDryRun: boolean): void {
|
||||
const prefix = isDryRun ? '[DRY RUN] ' : '';
|
||||
console.log();
|
||||
console.log(
|
||||
boxen(
|
||||
chalk.cyan.bold(`${prefix}Autopilot Task Execution`) +
|
||||
'\n\n' +
|
||||
chalk.white(`Task ID: ${task.id}`) +
|
||||
'\n' +
|
||||
chalk.white(`Title: ${task.title}`) +
|
||||
'\n' +
|
||||
chalk.white(`Status: ${task.status}`) +
|
||||
(task.description ? '\n\n' + chalk.gray(task.description) : ''),
|
||||
{
|
||||
padding: 1,
|
||||
borderStyle: 'round',
|
||||
borderColor: 'cyan',
|
||||
width: process.stdout.columns ? process.stdout.columns * 0.95 : 100
|
||||
}
|
||||
)
|
||||
);
|
||||
console.log();
|
||||
}
|
||||
|
||||
/**
|
||||
* Perform autopilot execution using PreflightChecker and TaskLoader
|
||||
*/
|
||||
private async performAutopilot(
|
||||
taskId: string,
|
||||
task: Task,
|
||||
options: AutopilotCommandOptions
|
||||
): Promise<AutopilotCommandResult> {
|
||||
// Run preflight checks
|
||||
const preflightResult = await this.runPreflightChecks(options);
|
||||
if (!preflightResult.success) {
|
||||
return {
|
||||
success: false,
|
||||
taskId,
|
||||
task,
|
||||
error: 'Preflight checks failed',
|
||||
message: 'Please resolve the issues above before running autopilot'
|
||||
};
|
||||
}
|
||||
|
||||
// Validate task structure and get execution order
|
||||
const validationResult = await this.validateTaskStructure(
|
||||
taskId,
|
||||
task,
|
||||
options
|
||||
);
|
||||
if (!validationResult.success) {
|
||||
return validationResult;
|
||||
}
|
||||
|
||||
// Display execution plan
|
||||
this.displayExecutionPlan(
|
||||
validationResult.task!,
|
||||
validationResult.orderedSubtasks!,
|
||||
options
|
||||
);
|
||||
|
||||
return {
|
||||
success: true,
|
||||
taskId,
|
||||
task: validationResult.task,
|
||||
message: options.dryRun
|
||||
? 'Dry run completed successfully'
|
||||
: 'Autopilot execution ready (actual execution not yet implemented)'
|
||||
};
|
||||
}
|
||||
|
||||
/**
|
||||
* Run preflight checks and display results
|
||||
*/
|
||||
private async runPreflightChecks(
|
||||
options: AutopilotCommandOptions
|
||||
): Promise<PreflightResult> {
|
||||
const { PreflightChecker } = await import('@tm/core');
|
||||
|
||||
console.log();
|
||||
console.log(chalk.cyan.bold('Running preflight checks...'));
|
||||
|
||||
const preflightChecker = new PreflightChecker(
|
||||
options.project || process.cwd()
|
||||
);
|
||||
const result = await preflightChecker.runAllChecks();
|
||||
|
||||
this.displayPreflightResults(result);
|
||||
|
||||
return result;
|
||||
}
|
||||
|
||||
/**
|
||||
* Validate task structure and get execution order
|
||||
*/
|
||||
private async validateTaskStructure(
|
||||
taskId: string,
|
||||
task: Task,
|
||||
options: AutopilotCommandOptions
|
||||
): Promise<AutopilotCommandResult & { orderedSubtasks?: Subtask[] }> {
|
||||
const { TaskLoaderService } = await import('@tm/core');
|
||||
|
||||
console.log();
|
||||
console.log(chalk.cyan.bold('Validating task structure...'));
|
||||
|
||||
const taskLoader = new TaskLoaderService(options.project || process.cwd());
|
||||
const validationResult = await taskLoader.loadAndValidateTask(taskId);
|
||||
|
||||
if (!validationResult.success) {
|
||||
await taskLoader.cleanup();
|
||||
return {
|
||||
success: false,
|
||||
taskId,
|
||||
task,
|
||||
error: validationResult.errorMessage,
|
||||
message: validationResult.suggestion
|
||||
};
|
||||
}
|
||||
|
||||
const orderedSubtasks = taskLoader.getExecutionOrder(
|
||||
validationResult.task!
|
||||
);
|
||||
|
||||
await taskLoader.cleanup();
|
||||
|
||||
return {
|
||||
success: true,
|
||||
taskId,
|
||||
task: validationResult.task,
|
||||
orderedSubtasks
|
||||
};
|
||||
}
|
||||
|
||||
/**
|
||||
* Display execution plan with subtasks and TDD workflow
|
||||
*/
|
||||
private displayExecutionPlan(
|
||||
task: Task,
|
||||
orderedSubtasks: Subtask[],
|
||||
options: AutopilotCommandOptions
|
||||
): void {
|
||||
console.log();
|
||||
console.log(chalk.green.bold('✓ All checks passed!'));
|
||||
console.log();
|
||||
console.log(chalk.cyan.bold('Execution Plan:'));
|
||||
console.log(chalk.white(`Task: ${task.title}`));
|
||||
console.log(
|
||||
chalk.gray(
|
||||
`${orderedSubtasks.length} subtasks will be executed in dependency order`
|
||||
)
|
||||
);
|
||||
console.log();
|
||||
|
||||
// Display subtasks
|
||||
orderedSubtasks.forEach((subtask: Subtask, index: number) => {
|
||||
console.log(
|
||||
chalk.yellow(`${index + 1}. ${task.id}.${subtask.id}: ${subtask.title}`)
|
||||
);
|
||||
if (subtask.dependencies && subtask.dependencies.length > 0) {
|
||||
console.log(
|
||||
chalk.gray(` Dependencies: ${subtask.dependencies.join(', ')}`)
|
||||
);
|
||||
}
|
||||
});
|
||||
|
||||
console.log();
|
||||
console.log(
|
||||
chalk.cyan('Autopilot would execute each subtask using TDD workflow:')
|
||||
);
|
||||
console.log(chalk.gray(' 1. RED phase: Write failing test'));
|
||||
console.log(chalk.gray(' 2. GREEN phase: Implement code to pass test'));
|
||||
console.log(chalk.gray(' 3. COMMIT phase: Commit changes'));
|
||||
console.log();
|
||||
|
||||
if (options.dryRun) {
|
||||
console.log(
|
||||
chalk.yellow('This was a dry run. Use without --dry-run to execute.')
|
||||
);
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Display preflight check results
|
||||
*/
|
||||
private displayPreflightResults(result: PreflightResult): void {
|
||||
const checks = [
|
||||
{ name: 'Test command', result: result.testCommand },
|
||||
{ name: 'Git working tree', result: result.gitWorkingTree },
|
||||
{ name: 'Required tools', result: result.requiredTools },
|
||||
{ name: 'Default branch', result: result.defaultBranch }
|
||||
];
|
||||
|
||||
checks.forEach((check) => {
|
||||
const icon = check.result.success ? chalk.green('✓') : chalk.red('✗');
|
||||
const status = check.result.success
|
||||
? chalk.green('PASS')
|
||||
: chalk.red('FAIL');
|
||||
console.log(`${icon} ${chalk.white(check.name)}: ${status}`);
|
||||
if (check.result.message) {
|
||||
console.log(chalk.gray(` ${check.result.message}`));
|
||||
}
|
||||
});
|
||||
}
|
||||
|
||||
/**
|
||||
* Display results based on format
|
||||
*/
|
||||
private displayResults(
|
||||
result: AutopilotCommandResult,
|
||||
options: AutopilotCommandOptions
|
||||
): void {
|
||||
const format = options.format || 'text';
|
||||
|
||||
switch (format) {
|
||||
case 'json':
|
||||
this.displayJson(result);
|
||||
break;
|
||||
|
||||
case 'text':
|
||||
default:
|
||||
this.displayTextResult(result);
|
||||
break;
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Display in JSON format
|
||||
*/
|
||||
private displayJson(result: AutopilotCommandResult): void {
|
||||
console.log(JSON.stringify(result, null, 2));
|
||||
}
|
||||
|
||||
/**
|
||||
* Display result in text format
|
||||
*/
|
||||
private displayTextResult(result: AutopilotCommandResult): void {
|
||||
if (result.success) {
|
||||
console.log(
|
||||
boxen(
|
||||
chalk.green.bold('✓ Autopilot Command Completed') +
|
||||
'\n\n' +
|
||||
chalk.white(result.message || 'Execution complete'),
|
||||
{
|
||||
padding: 1,
|
||||
borderStyle: 'round',
|
||||
borderColor: 'green',
|
||||
margin: { top: 1 }
|
||||
}
|
||||
)
|
||||
);
|
||||
} else {
|
||||
console.log(
|
||||
boxen(
|
||||
chalk.red.bold('✗ Autopilot Command Failed') +
|
||||
'\n\n' +
|
||||
chalk.white(result.error || 'Unknown error'),
|
||||
{
|
||||
padding: 1,
|
||||
borderStyle: 'round',
|
||||
borderColor: 'red',
|
||||
margin: { top: 1 }
|
||||
}
|
||||
)
|
||||
);
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Handle general errors
|
||||
*/
|
||||
private handleError(error: unknown): void {
|
||||
const errorObj = error as {
|
||||
getSanitizedDetails?: () => { message: string };
|
||||
message?: string;
|
||||
stack?: string;
|
||||
};
|
||||
|
||||
const msg = errorObj?.getSanitizedDetails?.() ?? {
|
||||
message: errorObj?.message ?? String(error)
|
||||
};
|
||||
console.error(chalk.red(`Error: ${msg.message || 'Unexpected error'}`));
|
||||
|
||||
// Show stack trace in development mode or when DEBUG is set
|
||||
const isDevelopment = process.env.NODE_ENV !== 'production';
|
||||
if ((isDevelopment || process.env.DEBUG) && errorObj.stack) {
|
||||
console.error(chalk.gray(errorObj.stack));
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Set the last result for programmatic access
|
||||
*/
|
||||
private setLastResult(result: AutopilotCommandResult): void {
|
||||
this.lastResult = result;
|
||||
}
|
||||
|
||||
/**
|
||||
* Get the last result (for programmatic usage)
|
||||
*/
|
||||
getLastResult(): AutopilotCommandResult | undefined {
|
||||
return this.lastResult;
|
||||
}
|
||||
|
||||
/**
|
||||
* Clean up resources
|
||||
*/
|
||||
async cleanup(): Promise<void> {
|
||||
if (this.tmCore) {
|
||||
await this.tmCore.close();
|
||||
this.tmCore = undefined;
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Register this command on an existing program
|
||||
*/
|
||||
static register(program: Command, name?: string): AutopilotCommand {
|
||||
const autopilotCommand = new AutopilotCommand(name);
|
||||
program.addCommand(autopilotCommand);
|
||||
return autopilotCommand;
|
||||
}
|
||||
}
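For programmatic callers, a minimal sketch (assuming an existing commander program and that commander has already parsed argv) of registering this command and reading the structured result afterwards:

import { Command } from 'commander';

const program = new Command();
const autopilot = AutopilotCommand.register(program);
// ...after program.parseAsync(process.argv) has run the action:
const result = autopilot.getLastResult();
if (result && !result.success) {
	process.exitCode = 1;
}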
|
||||
119
apps/cli/src/commands/autopilot/abort.command.ts
Normal file
@@ -0,0 +1,119 @@
|
||||
/**
|
||||
* @fileoverview Abort Command - Safely terminate workflow
|
||||
*/
|
||||
|
||||
import { Command } from 'commander';
|
||||
import { WorkflowOrchestrator } from '@tm/core';
|
||||
import {
|
||||
AutopilotBaseOptions,
|
||||
hasWorkflowState,
|
||||
loadWorkflowState,
|
||||
deleteWorkflowState,
|
||||
OutputFormatter
|
||||
} from './shared.js';
|
||||
import inquirer from 'inquirer';
|
||||
|
||||
interface AbortOptions extends AutopilotBaseOptions {
|
||||
force?: boolean;
|
||||
}
|
||||
|
||||
/**
|
||||
* Abort Command - Safely terminate workflow and clean up state
|
||||
*/
|
||||
export class AbortCommand extends Command {
|
||||
constructor() {
|
||||
super('abort');
|
||||
|
||||
this.description('Abort the current TDD workflow and clean up state')
|
||||
.option('-f, --force', 'Force abort without confirmation')
|
||||
.action(async (options: AbortOptions) => {
|
||||
await this.execute(options);
|
||||
});
|
||||
}
|
||||
|
||||
private async execute(options: AbortOptions): Promise<void> {
|
||||
// Inherit parent options
|
||||
const parentOpts = this.parent?.opts() as AutopilotBaseOptions;
|
||||
const mergedOptions: AbortOptions = {
|
||||
...parentOpts,
|
||||
...options,
|
||||
projectRoot:
|
||||
options.projectRoot || parentOpts?.projectRoot || process.cwd()
|
||||
};
|
||||
|
||||
const formatter = new OutputFormatter(mergedOptions.json || false);
|
||||
|
||||
try {
|
||||
// Check for workflow state
|
||||
const hasState = await hasWorkflowState(mergedOptions.projectRoot!);
|
||||
if (!hasState) {
|
||||
formatter.warning('No active workflow to abort');
|
||||
return;
|
||||
}
|
||||
|
||||
// Load state
|
||||
const state = await loadWorkflowState(mergedOptions.projectRoot!);
|
||||
if (!state) {
|
||||
formatter.error('Failed to load workflow state');
|
||||
process.exit(1);
|
||||
}
|
||||
|
||||
// Restore orchestrator
|
||||
const orchestrator = new WorkflowOrchestrator(state.context);
|
||||
orchestrator.restoreState(state);
|
||||
|
||||
// Get progress before abort
|
||||
const progress = orchestrator.getProgress();
|
||||
const currentSubtask = orchestrator.getCurrentSubtask();
|
||||
|
||||
// Confirm abort if not forced or in JSON mode
|
||||
if (!mergedOptions.force && !mergedOptions.json) {
|
||||
const { confirmed } = await inquirer.prompt([
|
||||
{
|
||||
type: 'confirm',
|
||||
name: 'confirmed',
|
||||
message:
|
||||
`This will abort the workflow for task ${state.context.taskId}. ` +
|
||||
`Progress: ${progress.completed}/${progress.total} subtasks completed. ` +
|
||||
`Continue?`,
|
||||
default: false
|
||||
}
|
||||
]);
|
||||
|
||||
if (!confirmed) {
|
||||
formatter.info('Abort cancelled');
|
||||
return;
|
||||
}
|
||||
}
|
||||
|
||||
// Trigger abort in orchestrator
|
||||
orchestrator.transition({ type: 'ABORT' });
|
||||
|
||||
// Delete workflow state
|
||||
await deleteWorkflowState(mergedOptions.projectRoot!);
|
||||
|
||||
// Output result
|
||||
formatter.success('Workflow aborted', {
|
||||
taskId: state.context.taskId,
|
||||
branchName: state.context.branchName,
|
||||
progress: {
|
||||
completed: progress.completed,
|
||||
total: progress.total
|
||||
},
|
||||
lastSubtask: currentSubtask
|
||||
? {
|
||||
id: currentSubtask.id,
|
||||
title: currentSubtask.title
|
||||
}
|
||||
: null,
|
||||
note: 'Branch and commits remain. Clean up manually if needed.'
|
||||
});
|
||||
} catch (error) {
|
||||
formatter.error((error as Error).message);
|
||||
if (mergedOptions.verbose) {
|
||||
console.error((error as Error).stack);
|
||||
}
|
||||
process.exit(1);
|
||||
}
|
||||
}
|
||||
}
|
||||
169
apps/cli/src/commands/autopilot/commit.command.ts
Normal file
@@ -0,0 +1,169 @@
|
||||
/**
|
||||
* @fileoverview Commit Command - Create commit with enhanced message generation
|
||||
*/
|
||||
|
||||
import { Command } from 'commander';
|
||||
import { WorkflowOrchestrator } from '@tm/core';
|
||||
import {
|
||||
AutopilotBaseOptions,
|
||||
hasWorkflowState,
|
||||
loadWorkflowState,
|
||||
createGitAdapter,
|
||||
createCommitMessageGenerator,
|
||||
OutputFormatter,
|
||||
saveWorkflowState
|
||||
} from './shared.js';
|
||||
|
||||
type CommitOptions = AutopilotBaseOptions;
|
||||
|
||||
/**
|
||||
* Commit Command - Create commit using enhanced message generator
|
||||
*/
|
||||
export class CommitCommand extends Command {
|
||||
constructor() {
|
||||
super('commit');
|
||||
|
||||
this.description('Create a commit for the completed GREEN phase').action(
|
||||
async (options: CommitOptions) => {
|
||||
await this.execute(options);
|
||||
}
|
||||
);
|
||||
}
|
||||
|
||||
private async execute(options: CommitOptions): Promise<void> {
|
||||
// Inherit parent options
|
||||
const parentOpts = this.parent?.opts() as AutopilotBaseOptions;
|
||||
const mergedOptions: CommitOptions = {
|
||||
...parentOpts,
|
||||
...options,
|
||||
projectRoot:
|
||||
options.projectRoot || parentOpts?.projectRoot || process.cwd()
|
||||
};
|
||||
|
||||
const formatter = new OutputFormatter(mergedOptions.json || false);
|
||||
|
||||
try {
|
||||
// Check for workflow state
|
||||
const hasState = await hasWorkflowState(mergedOptions.projectRoot!);
|
||||
if (!hasState) {
|
||||
formatter.error('No active workflow', {
|
||||
suggestion: 'Start a workflow with: autopilot start <taskId>'
|
||||
});
|
||||
process.exit(1);
|
||||
}
|
||||
|
||||
// Load state
|
||||
const state = await loadWorkflowState(mergedOptions.projectRoot!);
|
||||
if (!state) {
|
||||
formatter.error('Failed to load workflow state');
|
||||
process.exit(1);
|
||||
}
|
||||
|
||||
const orchestrator = new WorkflowOrchestrator(state.context);
|
||||
orchestrator.restoreState(state);
|
||||
orchestrator.enableAutoPersist(async (newState) => {
|
||||
await saveWorkflowState(mergedOptions.projectRoot!, newState);
|
||||
});
|
||||
|
||||
// Verify in COMMIT phase
|
||||
const tddPhase = orchestrator.getCurrentTDDPhase();
|
||||
if (tddPhase !== 'COMMIT') {
|
||||
formatter.error('Not in COMMIT phase', {
|
||||
currentPhase: tddPhase || orchestrator.getCurrentPhase(),
|
||||
suggestion: 'Complete RED and GREEN phases first'
|
||||
});
|
||||
process.exit(1);
|
||||
}
|
||||
|
||||
// Get current subtask
|
||||
const currentSubtask = orchestrator.getCurrentSubtask();
|
||||
if (!currentSubtask) {
|
||||
formatter.error('No current subtask');
|
||||
process.exit(1);
|
||||
}
|
||||
|
||||
// Initialize git adapter
|
||||
const gitAdapter = createGitAdapter(mergedOptions.projectRoot!);
|
||||
await gitAdapter.ensureGitRepository();
|
||||
|
||||
// Check for staged changes
|
||||
const hasStagedChanges = await gitAdapter.hasStagedChanges();
|
||||
if (!hasStagedChanges) {
|
||||
// Stage all changes
|
||||
formatter.info('No staged changes, staging all changes...');
|
||||
await gitAdapter.stageFiles(['.']);
|
||||
}
|
||||
|
||||
// Get changed files for scope detection
|
||||
const status = await gitAdapter.getStatus();
|
||||
const changedFiles = [...status.staged, ...status.modified];
|
||||
|
||||
// Generate commit message
|
||||
const messageGenerator = createCommitMessageGenerator();
|
||||
const testResults = state.context.lastTestResults;
|
||||
|
||||
const commitMessage = messageGenerator.generateMessage({
|
||||
type: 'feat',
|
||||
description: currentSubtask.title,
|
||||
changedFiles,
|
||||
taskId: state.context.taskId,
|
||||
phase: 'TDD',
|
||||
tag: (state.context.metadata.tag as string) || undefined,
|
||||
testsPassing: testResults?.passed,
|
||||
testsFailing: testResults?.failed,
|
||||
coveragePercent: undefined // Could be added if available
|
||||
});
|
||||
|
||||
// Create commit with metadata
|
||||
await gitAdapter.createCommit(commitMessage, {
|
||||
metadata: {
|
||||
taskId: state.context.taskId,
|
||||
subtaskId: currentSubtask.id,
|
||||
phase: 'COMMIT',
|
||||
tddCycle: 'complete'
|
||||
}
|
||||
});
|
||||
|
||||
// Get commit info
|
||||
const lastCommit = await gitAdapter.getLastCommit();
|
||||
|
||||
// Complete COMMIT phase (this marks subtask as completed)
|
||||
orchestrator.transition({ type: 'COMMIT_COMPLETE' });
|
||||
|
||||
// Check if should advance to next subtask
|
||||
const progress = orchestrator.getProgress();
|
||||
if (progress.current < progress.total) {
|
||||
orchestrator.transition({ type: 'SUBTASK_COMPLETE' });
|
||||
} else {
|
||||
// All subtasks complete
|
||||
orchestrator.transition({ type: 'ALL_SUBTASKS_COMPLETE' });
|
||||
}
|
||||
|
||||
// Output success
|
||||
formatter.success('Commit created', {
|
||||
commitHash: lastCommit.hash.substring(0, 7),
|
||||
message: commitMessage.split('\n')[0], // First line only
|
||||
subtask: {
|
||||
id: currentSubtask.id,
|
||||
title: currentSubtask.title,
|
||||
status: currentSubtask.status
|
||||
},
|
||||
progress: {
|
||||
completed: progress.completed,
|
||||
total: progress.total,
|
||||
percentage: progress.percentage
|
||||
},
|
||||
nextAction:
|
||||
progress.completed < progress.total
|
||||
? 'Start next subtask with RED phase'
|
||||
: 'All subtasks complete. Run: autopilot status'
|
||||
});
|
||||
} catch (error) {
|
||||
formatter.error((error as Error).message);
|
||||
if (mergedOptions.verbose) {
|
||||
console.error((error as Error).stack);
|
||||
}
|
||||
process.exit(1);
|
||||
}
|
||||
}
|
||||
}
|
||||
172
apps/cli/src/commands/autopilot/complete.command.ts
Normal file
@@ -0,0 +1,172 @@
|
||||
/**
|
||||
* @fileoverview Complete Command - Complete current TDD phase with validation
|
||||
*/
|
||||
|
||||
import { Command } from 'commander';
|
||||
import { WorkflowOrchestrator, TestResult } from '@tm/core';
|
||||
import {
|
||||
AutopilotBaseOptions,
|
||||
hasWorkflowState,
|
||||
loadWorkflowState,
|
||||
OutputFormatter
|
||||
} from './shared.js';
|
||||
|
||||
interface CompleteOptions extends AutopilotBaseOptions {
|
||||
results?: string;
|
||||
coverage?: string;
|
||||
}
|
||||
|
||||
/**
|
||||
* Complete Command - Mark current phase as complete with validation
|
||||
*/
|
||||
export class CompleteCommand extends Command {
|
||||
constructor() {
|
||||
super('complete');
|
||||
|
||||
this.description('Complete the current TDD phase with result validation')
|
||||
.option(
|
||||
'-r, --results <json>',
|
||||
'Test results JSON (with total, passed, failed, skipped)'
|
||||
)
|
||||
.option('-c, --coverage <percent>', 'Coverage percentage')
|
||||
.action(async (options: CompleteOptions) => {
|
||||
await this.execute(options);
|
||||
});
|
||||
}
|
||||
|
||||
private async execute(options: CompleteOptions): Promise<void> {
|
||||
// Inherit parent options
|
||||
const parentOpts = this.parent?.opts() as AutopilotBaseOptions;
|
||||
const mergedOptions: CompleteOptions = {
|
||||
...parentOpts,
|
||||
...options,
|
||||
projectRoot:
|
||||
options.projectRoot || parentOpts?.projectRoot || process.cwd()
|
||||
};
|
||||
|
||||
const formatter = new OutputFormatter(mergedOptions.json || false);
|
||||
|
||||
try {
|
||||
// Check for workflow state
|
||||
const hasState = await hasWorkflowState(mergedOptions.projectRoot!);
|
||||
if (!hasState) {
|
||||
formatter.error('No active workflow', {
|
||||
suggestion: 'Start a workflow with: autopilot start <taskId>'
|
||||
});
|
||||
process.exit(1);
|
||||
}
|
||||
|
||||
// Load state
|
||||
const state = await loadWorkflowState(mergedOptions.projectRoot!);
|
||||
if (!state) {
|
||||
formatter.error('Failed to load workflow state');
|
||||
process.exit(1);
|
||||
}
|
||||
|
||||
// Restore orchestrator with persistence
|
||||
const { saveWorkflowState } = await import('./shared.js');
|
||||
const orchestrator = new WorkflowOrchestrator(state.context);
|
||||
orchestrator.restoreState(state);
|
||||
orchestrator.enableAutoPersist(async (newState) => {
|
||||
await saveWorkflowState(mergedOptions.projectRoot!, newState);
|
||||
});
|
||||
|
||||
// Get current phase
|
||||
const tddPhase = orchestrator.getCurrentTDDPhase();
|
||||
const currentSubtask = orchestrator.getCurrentSubtask();
|
||||
|
||||
if (!tddPhase) {
|
||||
formatter.error('Not in a TDD phase', {
|
||||
phase: orchestrator.getCurrentPhase()
|
||||
});
|
||||
process.exit(1);
|
||||
}
|
||||
|
||||
// Validate based on phase
|
||||
if (tddPhase === 'RED' || tddPhase === 'GREEN') {
|
||||
if (!mergedOptions.results) {
|
||||
formatter.error('Test results required for RED/GREEN phase', {
|
||||
usage:
|
||||
'--results \'{"total":10,"passed":9,"failed":1,"skipped":0}\''
|
||||
});
|
||||
process.exit(1);
|
||||
}
|
||||
|
||||
// Parse test results
|
||||
let testResults: TestResult;
|
||||
try {
|
||||
const parsed = JSON.parse(mergedOptions.results);
|
||||
testResults = {
|
||||
total: parsed.total || 0,
|
||||
passed: parsed.passed || 0,
|
||||
failed: parsed.failed || 0,
|
||||
skipped: parsed.skipped || 0,
|
||||
phase: tddPhase
|
||||
};
|
||||
} catch (error) {
|
||||
formatter.error('Invalid test results JSON', {
|
||||
error: (error as Error).message
|
||||
});
|
||||
process.exit(1);
|
||||
}
|
||||
|
||||
// Validate RED phase requirements
|
||||
if (tddPhase === 'RED' && testResults.failed === 0) {
|
||||
formatter.error('RED phase validation failed', {
|
||||
reason: 'At least one test must be failing',
|
||||
actual: {
|
||||
passed: testResults.passed,
|
||||
failed: testResults.failed
|
||||
}
|
||||
});
|
||||
process.exit(1);
|
||||
}
|
||||
|
||||
// Validate GREEN phase requirements
|
||||
if (tddPhase === 'GREEN' && testResults.failed !== 0) {
|
||||
formatter.error('GREEN phase validation failed', {
|
||||
reason: 'All tests must pass',
|
||||
actual: {
|
||||
passed: testResults.passed,
|
||||
failed: testResults.failed
|
||||
}
|
||||
});
|
||||
process.exit(1);
|
||||
}
|
||||
|
||||
// Complete phase with test results
|
||||
if (tddPhase === 'RED') {
|
||||
orchestrator.transition({
|
||||
type: 'RED_PHASE_COMPLETE',
|
||||
testResults
|
||||
});
|
||||
formatter.success('RED phase completed', {
|
||||
nextPhase: 'GREEN',
|
||||
testResults,
|
||||
subtask: currentSubtask?.title
|
||||
});
|
||||
} else {
|
||||
orchestrator.transition({
|
||||
type: 'GREEN_PHASE_COMPLETE',
|
||||
testResults
|
||||
});
|
||||
formatter.success('GREEN phase completed', {
|
||||
nextPhase: 'COMMIT',
|
||||
testResults,
|
||||
subtask: currentSubtask?.title,
|
||||
suggestion: 'Run: autopilot commit'
|
||||
});
|
||||
}
|
||||
} else if (tddPhase === 'COMMIT') {
|
||||
formatter.error('Use "autopilot commit" to complete COMMIT phase');
|
||||
process.exit(1);
|
||||
}
|
||||
} catch (error) {
|
||||
formatter.error((error as Error).message);
|
||||
if (mergedOptions.verbose) {
|
||||
console.error((error as Error).stack);
|
||||
}
|
||||
process.exit(1);
|
||||
}
|
||||
}
|
||||
}
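The payload expected by --results mirrors the parsing above; the numbers here are purely illustrative. RED requires at least one failing test, while GREEN requires zero failures:

const greenResults = '{"total":12,"passed":12,"failed":0,"skipped":0}'; // valid for GREEN
const redResults = '{"total":12,"passed":11,"failed":1,"skipped":0}'; // valid for RED
// passed on the command line as: autopilot complete --results '<json>'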
|
||||
82
apps/cli/src/commands/autopilot/index.ts
Normal file
@@ -0,0 +1,82 @@
|
||||
/**
|
||||
* @fileoverview Autopilot CLI Commands for AI Agent Orchestration
|
||||
* Provides subcommands for starting, resuming, and advancing the TDD workflow
|
||||
* with JSON output for machine parsing.
|
||||
*/
|
||||
|
||||
import { Command } from 'commander';
|
||||
import { StartCommand } from './start.command.js';
|
||||
import { ResumeCommand } from './resume.command.js';
|
||||
import { NextCommand } from './next.command.js';
|
||||
import { CompleteCommand } from './complete.command.js';
|
||||
import { CommitCommand } from './commit.command.js';
|
||||
import { StatusCommand } from './status.command.js';
|
||||
import { AbortCommand } from './abort.command.js';
|
||||
|
||||
/**
|
||||
* Shared command options for all autopilot commands
|
||||
*/
|
||||
export interface AutopilotBaseOptions {
|
||||
json?: boolean;
|
||||
verbose?: boolean;
|
||||
projectRoot?: string;
|
||||
}
|
||||
|
||||
/**
|
||||
* AutopilotCommand with subcommands for TDD workflow orchestration
|
||||
*/
|
||||
export class AutopilotCommand extends Command {
|
||||
constructor() {
|
||||
super('autopilot');
|
||||
|
||||
// Configure main command
|
||||
this.description('AI agent orchestration for TDD workflow execution')
|
||||
.alias('ap')
|
||||
// Global options for all subcommands
|
||||
.option('--json', 'Output in JSON format for machine parsing')
|
||||
.option('-v, --verbose', 'Enable verbose output')
|
||||
.option(
|
||||
'-p, --project-root <path>',
|
||||
'Project root directory',
|
||||
process.cwd()
|
||||
);
|
||||
|
||||
// Register subcommands
|
||||
this.registerSubcommands();
|
||||
}
|
||||
|
||||
/**
|
||||
* Register all autopilot subcommands
|
||||
*/
|
||||
private registerSubcommands(): void {
|
||||
// Start new TDD workflow
|
||||
this.addCommand(new StartCommand());
|
||||
|
||||
// Resume existing workflow
|
||||
this.addCommand(new ResumeCommand());
|
||||
|
||||
// Get next action
|
||||
this.addCommand(new NextCommand());
|
||||
|
||||
// Complete current phase
|
||||
this.addCommand(new CompleteCommand());
|
||||
|
||||
// Create commit
|
||||
this.addCommand(new CommitCommand());
|
||||
|
||||
// Show status
|
||||
this.addCommand(new StatusCommand());
|
||||
|
||||
// Abort workflow
|
||||
this.addCommand(new AbortCommand());
|
||||
}
|
||||
|
||||
/**
|
||||
* Register this command on an existing program
|
||||
*/
|
||||
static register(program: Command): AutopilotCommand {
|
||||
const autopilotCommand = new AutopilotCommand();
|
||||
program.addCommand(autopilotCommand);
|
||||
return autopilotCommand;
|
||||
}
|
||||
}
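A minimal sketch of wiring the autopilot command group into an ESM CLI entry point; the program name and argv handling here are assumptions, not the actual bootstrap code:

import { Command } from 'commander';
import { AutopilotCommand } from './commands/autopilot/index.js';

const program = new Command('task-master');
AutopilotCommand.register(program);
await program.parseAsync(process.argv);
// e.g. task-master autopilot start 1.2 --json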
|
||||
164
apps/cli/src/commands/autopilot/next.command.ts
Normal file
@@ -0,0 +1,164 @@
|
||||
/**
|
||||
* @fileoverview Next Command - Get next action in TDD workflow
|
||||
*/
|
||||
|
||||
import { Command } from 'commander';
|
||||
import { WorkflowOrchestrator } from '@tm/core';
|
||||
import {
|
||||
AutopilotBaseOptions,
|
||||
hasWorkflowState,
|
||||
loadWorkflowState,
|
||||
OutputFormatter
|
||||
} from './shared.js';
|
||||
|
||||
type NextOptions = AutopilotBaseOptions;
|
||||
|
||||
/**
|
||||
* Next Command - Get next action details
|
||||
*/
|
||||
export class NextCommand extends Command {
|
||||
constructor() {
|
||||
super('next');
|
||||
|
||||
this.description(
|
||||
'Get the next action to perform in the TDD workflow'
|
||||
).action(async (options: NextOptions) => {
|
||||
await this.execute(options);
|
||||
});
|
||||
}
|
||||
|
||||
private async execute(options: NextOptions): Promise<void> {
|
||||
// Inherit parent options
|
||||
const parentOpts = this.parent?.opts() as AutopilotBaseOptions;
|
||||
const mergedOptions: NextOptions = {
|
||||
...parentOpts,
|
||||
...options,
|
||||
projectRoot:
|
||||
options.projectRoot || parentOpts?.projectRoot || process.cwd()
|
||||
};
|
||||
|
||||
const formatter = new OutputFormatter(mergedOptions.json || false);
|
||||
|
||||
try {
|
||||
// Check for workflow state
|
||||
const hasState = await hasWorkflowState(mergedOptions.projectRoot!);
|
||||
if (!hasState) {
|
||||
formatter.error('No active workflow', {
|
||||
suggestion: 'Start a workflow with: autopilot start <taskId>'
|
||||
});
|
||||
process.exit(1);
|
||||
}
|
||||
|
||||
// Load state
|
||||
const state = await loadWorkflowState(mergedOptions.projectRoot!);
|
||||
if (!state) {
|
||||
formatter.error('Failed to load workflow state');
|
||||
process.exit(1);
|
||||
}
|
||||
|
||||
// Restore orchestrator
|
||||
const orchestrator = new WorkflowOrchestrator(state.context);
|
||||
orchestrator.restoreState(state);
|
||||
|
||||
// Get current phase and subtask
|
||||
const phase = orchestrator.getCurrentPhase();
|
||||
const tddPhase = orchestrator.getCurrentTDDPhase();
|
||||
const currentSubtask = orchestrator.getCurrentSubtask();
|
||||
|
||||
// Determine next action based on phase
|
||||
let actionType: string;
|
||||
let actionDescription: string;
|
||||
let actionDetails: Record<string, unknown> = {};
|
||||
|
||||
if (phase === 'COMPLETE') {
|
||||
formatter.success('Workflow complete', {
|
||||
message: 'All subtasks have been completed',
|
||||
taskId: state.context.taskId
|
||||
});
|
||||
return;
|
||||
}
|
||||
|
||||
if (phase === 'SUBTASK_LOOP' && tddPhase) {
|
||||
switch (tddPhase) {
|
||||
case 'RED':
|
||||
actionType = 'generate_test';
|
||||
actionDescription = 'Write failing test for current subtask';
|
||||
actionDetails = {
|
||||
subtask: currentSubtask
|
||||
? {
|
||||
id: currentSubtask.id,
|
||||
title: currentSubtask.title,
|
||||
attempts: currentSubtask.attempts
|
||||
}
|
||||
: null,
|
||||
testCommand: 'npm test', // Could be customized based on config
|
||||
expectedOutcome: 'Test should fail'
|
||||
};
|
||||
break;
|
||||
|
||||
case 'GREEN':
|
||||
actionType = 'implement_code';
|
||||
actionDescription = 'Implement code to pass the failing test';
|
||||
actionDetails = {
|
||||
subtask: currentSubtask
|
||||
? {
|
||||
id: currentSubtask.id,
|
||||
title: currentSubtask.title,
|
||||
attempts: currentSubtask.attempts
|
||||
}
|
||||
: null,
|
||||
testCommand: 'npm test',
|
||||
expectedOutcome: 'All tests should pass',
|
||||
lastTestResults: state.context.lastTestResults
|
||||
};
|
||||
break;
|
||||
|
||||
case 'COMMIT':
|
||||
actionType = 'commit_changes';
|
||||
actionDescription = 'Commit the changes';
|
||||
actionDetails = {
|
||||
subtask: currentSubtask
|
||||
? {
|
||||
id: currentSubtask.id,
|
||||
title: currentSubtask.title,
|
||||
attempts: currentSubtask.attempts
|
||||
}
|
||||
: null,
|
||||
suggestion: 'Use: autopilot commit'
|
||||
};
|
||||
break;
|
||||
|
||||
default:
|
||||
actionType = 'unknown';
|
||||
actionDescription = 'Unknown TDD phase';
|
||||
}
|
||||
} else {
|
||||
actionType = 'workflow_phase';
|
||||
actionDescription = `Currently in ${phase} phase`;
|
||||
}
|
||||
|
||||
// Output next action
|
||||
const output = {
|
||||
action: actionType,
|
||||
description: actionDescription,
|
||||
phase,
|
||||
tddPhase,
|
||||
taskId: state.context.taskId,
|
||||
branchName: state.context.branchName,
|
||||
...actionDetails
|
||||
};
|
||||
|
||||
if (mergedOptions.json) {
|
||||
formatter.output(output);
|
||||
} else {
|
||||
formatter.success('Next action', output);
|
||||
}
|
||||
} catch (error) {
|
||||
formatter.error((error as Error).message);
|
||||
if (mergedOptions.verbose) {
|
||||
console.error((error as Error).stack);
|
||||
}
|
||||
process.exit(1);
|
||||
}
|
||||
}
|
||||
}
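With --json, the next action is printed as a single object assembled from the fields above; a representative RED-phase payload (illustrative values, not captured output) might look like:

const exampleNextAction = {
	action: 'generate_test',
	description: 'Write failing test for current subtask',
	phase: 'SUBTASK_LOOP',
	tddPhase: 'RED',
	taskId: '1',
	branchName: 'task-1-add-login-form',
	subtask: { id: 1, title: 'Add login form', attempts: 0 },
	testCommand: 'npm test',
	expectedOutcome: 'Test should fail'
};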
|
||||
111
apps/cli/src/commands/autopilot/resume.command.ts
Normal file
@@ -0,0 +1,111 @@
|
||||
/**
|
||||
* @fileoverview Resume Command - Restore and resume TDD workflow
|
||||
*/
|
||||
|
||||
import { Command } from 'commander';
|
||||
import { WorkflowOrchestrator } from '@tm/core';
|
||||
import {
|
||||
AutopilotBaseOptions,
|
||||
hasWorkflowState,
|
||||
loadWorkflowState,
|
||||
OutputFormatter
|
||||
} from './shared.js';
|
||||
|
||||
type ResumeOptions = AutopilotBaseOptions;
|
||||
|
||||
/**
|
||||
* Resume Command - Restore workflow from saved state
|
||||
*/
|
||||
export class ResumeCommand extends Command {
|
||||
constructor() {
|
||||
super('resume');
|
||||
|
||||
this.description('Resume a previously started TDD workflow').action(
|
||||
async (options: ResumeOptions) => {
|
||||
await this.execute(options);
|
||||
}
|
||||
);
|
||||
}
|
||||
|
||||
private async execute(options: ResumeOptions): Promise<void> {
|
||||
// Inherit parent options (autopilot command)
|
||||
const parentOpts = this.parent?.opts() as AutopilotBaseOptions;
|
||||
const mergedOptions: ResumeOptions = {
|
||||
...parentOpts,
|
||||
...options,
|
||||
projectRoot:
|
||||
options.projectRoot || parentOpts?.projectRoot || process.cwd()
|
||||
};
|
||||
|
||||
const formatter = new OutputFormatter(mergedOptions.json || false);
|
||||
|
||||
try {
|
||||
// Check for workflow state
|
||||
const hasState = await hasWorkflowState(mergedOptions.projectRoot!);
|
||||
if (!hasState) {
|
||||
formatter.error('No workflow state found', {
|
||||
suggestion: 'Start a new workflow with: autopilot start <taskId>'
|
||||
});
|
||||
process.exit(1);
|
||||
}
|
||||
|
||||
// Load state
|
||||
formatter.info('Loading workflow state...');
|
||||
const state = await loadWorkflowState(mergedOptions.projectRoot!);
|
||||
|
||||
if (!state) {
|
||||
formatter.error('Failed to load workflow state');
|
||||
process.exit(1);
|
||||
}
|
||||
|
||||
// Validate state can be resumed
|
||||
const orchestrator = new WorkflowOrchestrator(state.context);
|
||||
if (!orchestrator.canResumeFromState(state)) {
|
||||
formatter.error('Invalid workflow state', {
|
||||
suggestion:
|
||||
'State file may be corrupted. Consider starting a new workflow.'
|
||||
});
|
||||
process.exit(1);
|
||||
}
|
||||
|
||||
// Restore state
|
||||
orchestrator.restoreState(state);
|
||||
|
||||
// Re-enable auto-persistence
|
||||
const { saveWorkflowState } = await import('./shared.js');
|
||||
orchestrator.enableAutoPersist(async (newState) => {
|
||||
await saveWorkflowState(mergedOptions.projectRoot!, newState);
|
||||
});
|
||||
|
||||
// Get progress
|
||||
const progress = orchestrator.getProgress();
|
||||
const currentSubtask = orchestrator.getCurrentSubtask();
|
||||
|
||||
// Output success
|
||||
formatter.success('Workflow resumed', {
|
||||
taskId: state.context.taskId,
|
||||
phase: orchestrator.getCurrentPhase(),
|
||||
tddPhase: orchestrator.getCurrentTDDPhase(),
|
||||
branchName: state.context.branchName,
|
||||
progress: {
|
||||
completed: progress.completed,
|
||||
total: progress.total,
|
||||
percentage: progress.percentage
|
||||
},
|
||||
currentSubtask: currentSubtask
|
||||
? {
|
||||
id: currentSubtask.id,
|
||||
title: currentSubtask.title,
|
||||
attempts: currentSubtask.attempts
|
||||
}
|
||||
: null
|
||||
});
|
||||
} catch (error) {
|
||||
formatter.error((error as Error).message);
|
||||
if (mergedOptions.verbose) {
|
||||
console.error((error as Error).stack);
|
||||
}
|
||||
process.exit(1);
|
||||
}
|
||||
}
|
||||
}
|
||||
262
apps/cli/src/commands/autopilot/shared.ts
Normal file
@@ -0,0 +1,262 @@
|
||||
/**
|
||||
* @fileoverview Shared utilities for autopilot commands
|
||||
*/
|
||||
|
||||
import {
|
||||
WorkflowOrchestrator,
|
||||
WorkflowStateManager,
|
||||
GitAdapter,
|
||||
CommitMessageGenerator
|
||||
} from '@tm/core';
|
||||
import type { WorkflowState, WorkflowContext, SubtaskInfo } from '@tm/core';
|
||||
import chalk from 'chalk';
|
||||
|
||||
/**
|
||||
* Base options interface for all autopilot commands
|
||||
*/
|
||||
export interface AutopilotBaseOptions {
|
||||
projectRoot?: string;
|
||||
json?: boolean;
|
||||
verbose?: boolean;
|
||||
}
|
||||
|
||||
/**
|
||||
* Load workflow state from disk using WorkflowStateManager
|
||||
*/
|
||||
export async function loadWorkflowState(
|
||||
projectRoot: string
|
||||
): Promise<WorkflowState | null> {
|
||||
const stateManager = new WorkflowStateManager(projectRoot);
|
||||
|
||||
if (!(await stateManager.exists())) {
|
||||
return null;
|
||||
}
|
||||
|
||||
try {
|
||||
return await stateManager.load();
|
||||
} catch (error) {
|
||||
throw new Error(
|
||||
`Failed to load workflow state: ${(error as Error).message}`
|
||||
);
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Save workflow state to disk using WorkflowStateManager
|
||||
*/
|
||||
export async function saveWorkflowState(
|
||||
projectRoot: string,
|
||||
state: WorkflowState
|
||||
): Promise<void> {
|
||||
const stateManager = new WorkflowStateManager(projectRoot);
|
||||
|
||||
try {
|
||||
await stateManager.save(state);
|
||||
} catch (error) {
|
||||
throw new Error(
|
||||
`Failed to save workflow state: ${(error as Error).message}`
|
||||
);
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Delete workflow state from disk using WorkflowStateManager
|
||||
*/
|
||||
export async function deleteWorkflowState(projectRoot: string): Promise<void> {
|
||||
const stateManager = new WorkflowStateManager(projectRoot);
|
||||
await stateManager.delete();
|
||||
}
|
||||
|
||||
/**
|
||||
* Check if workflow state exists using WorkflowStateManager
|
||||
*/
|
||||
export async function hasWorkflowState(projectRoot: string): Promise<boolean> {
|
||||
const stateManager = new WorkflowStateManager(projectRoot);
|
||||
return await stateManager.exists();
|
||||
}
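Every subcommand follows the same restore pattern built on these helpers; a minimal sketch, assuming the imports at the top of this file and an existing state file on disk:

const projectRoot = process.cwd();
if (await hasWorkflowState(projectRoot)) {
	const state = await loadWorkflowState(projectRoot);
	if (state) {
		const orchestrator = new WorkflowOrchestrator(state.context);
		orchestrator.restoreState(state);
		orchestrator.enableAutoPersist(async (s) => saveWorkflowState(projectRoot, s));
	}
}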
|
||||
|
||||
/**
|
||||
* Initialize WorkflowOrchestrator with persistence
|
||||
*/
|
||||
export function createOrchestrator(
|
||||
context: WorkflowContext,
|
||||
projectRoot: string
|
||||
): WorkflowOrchestrator {
|
||||
const orchestrator = new WorkflowOrchestrator(context);
|
||||
const stateManager = new WorkflowStateManager(projectRoot);
|
||||
|
||||
// Enable auto-persistence
|
||||
orchestrator.enableAutoPersist(async (state: WorkflowState) => {
|
||||
await stateManager.save(state);
|
||||
});
|
||||
|
||||
return orchestrator;
|
||||
}
|
||||
|
||||
/**
|
||||
* Initialize GitAdapter for project
|
||||
*/
|
||||
export function createGitAdapter(projectRoot: string): GitAdapter {
|
||||
return new GitAdapter(projectRoot);
|
||||
}
|
||||
|
||||
/**
|
||||
* Initialize CommitMessageGenerator
|
||||
*/
|
||||
export function createCommitMessageGenerator(): CommitMessageGenerator {
|
||||
return new CommitMessageGenerator();
|
||||
}
|
||||
|
||||
/**
|
||||
* Output formatter for JSON and text modes
|
||||
*/
|
||||
export class OutputFormatter {
|
||||
constructor(private useJson: boolean) {}
|
||||
|
||||
/**
|
||||
* Output data in appropriate format
|
||||
*/
|
||||
output(data: Record<string, unknown>): void {
|
||||
if (this.useJson) {
|
||||
console.log(JSON.stringify(data, null, 2));
|
||||
} else {
|
||||
this.outputText(data);
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Output data in human-readable text format
|
||||
*/
|
||||
private outputText(data: Record<string, unknown>): void {
|
||||
for (const [key, value] of Object.entries(data)) {
|
||||
if (typeof value === 'object' && value !== null) {
|
||||
console.log(chalk.cyan(`${key}:`));
|
||||
this.outputObject(value as Record<string, unknown>, ' ');
|
||||
} else {
|
||||
console.log(chalk.white(`${key}: ${value}`));
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Output nested object with indentation
|
||||
*/
|
||||
private outputObject(obj: Record<string, unknown>, indent: string): void {
|
||||
for (const [key, value] of Object.entries(obj)) {
|
||||
if (typeof value === 'object' && value !== null) {
|
||||
console.log(chalk.cyan(`${indent}${key}:`));
|
||||
this.outputObject(value as Record<string, unknown>, indent + ' ');
|
||||
} else {
|
||||
console.log(chalk.gray(`${indent}${key}: ${value}`));
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Output error message
|
||||
*/
|
||||
error(message: string, details?: Record<string, unknown>): void {
|
||||
if (this.useJson) {
|
||||
console.error(
|
||||
JSON.stringify(
|
||||
{
|
||||
error: message,
|
||||
...details
|
||||
},
|
||||
null,
|
||||
2
|
||||
)
|
||||
);
|
||||
} else {
|
||||
console.error(chalk.red(`Error: ${message}`));
|
||||
if (details) {
|
||||
for (const [key, value] of Object.entries(details)) {
|
||||
console.error(chalk.gray(` ${key}: ${value}`));
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Output success message
|
||||
*/
|
||||
success(message: string, data?: Record<string, unknown>): void {
|
||||
if (this.useJson) {
|
||||
console.log(
|
||||
JSON.stringify(
|
||||
{
|
||||
success: true,
|
||||
message,
|
||||
...data
|
||||
},
|
||||
null,
|
||||
2
|
||||
)
|
||||
);
|
||||
} else {
|
||||
console.log(chalk.green(`✓ ${message}`));
|
||||
if (data) {
|
||||
this.output(data);
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Output warning message
|
||||
*/
|
||||
warning(message: string): void {
|
||||
if (this.useJson) {
|
||||
console.warn(
|
||||
JSON.stringify(
|
||||
{
|
||||
warning: message
|
||||
},
|
||||
null,
|
||||
2
|
||||
)
|
||||
);
|
||||
} else {
|
||||
console.warn(chalk.yellow(`⚠ ${message}`));
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Output info message
|
||||
*/
|
||||
info(message: string): void {
|
||||
if (this.useJson) {
|
||||
// Don't output info messages in JSON mode
|
||||
return;
|
||||
}
|
||||
console.log(chalk.blue(`ℹ ${message}`));
|
||||
}
|
||||
}
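A brief usage sketch of the formatter in JSON mode; the payload values are illustrative:

const formatter = new OutputFormatter(true);
formatter.success('Workflow resumed', { taskId: '42', phase: 'SUBTASK_LOOP' });
// Emits pretty-printed JSON: { "success": true, "message": "Workflow resumed", "taskId": "42", "phase": "SUBTASK_LOOP" }
formatter.info('Loading...'); // suppressed entirely in JSON mode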
|
||||
|
||||
/**
|
||||
* Validate task ID format
|
||||
*/
|
||||
export function validateTaskId(taskId: string): boolean {
|
||||
// Task ID should be one or more dot-separated numbers (e.g., "1" or "1.2")
|
||||
const pattern = /^\d+(\.\d+)*$/;
|
||||
return pattern.test(taskId);
|
||||
}
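A few illustrative inputs against the pattern above (note that it also accepts deeper IDs such as "1.2.3"):

validateTaskId('1'); // true
validateTaskId('1.2'); // true
validateTaskId('1.2.3'); // true
validateTaskId('1.a'); // false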
|
||||
|
||||
/**
|
||||
* Parse subtasks from task data
|
||||
*/
|
||||
export function parseSubtasks(
|
||||
task: any,
|
||||
maxAttempts: number = 3
|
||||
): SubtaskInfo[] {
|
||||
if (!task.subtasks || !Array.isArray(task.subtasks)) {
|
||||
return [];
|
||||
}
|
||||
|
||||
return task.subtasks.map((subtask: any) => ({
|
||||
id: subtask.id,
|
||||
title: subtask.title,
|
||||
status: subtask.status === 'done' ? 'completed' : 'pending',
|
||||
attempts: 0,
|
||||
maxAttempts
|
||||
}));
|
||||
}
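An illustrative call with a hypothetical task object:

const subtasks = parseSubtasks(
	{ subtasks: [{ id: 1, title: 'Write schema', status: 'done' }] },
	3
);
// -> [{ id: 1, title: 'Write schema', status: 'completed', attempts: 0, maxAttempts: 3 }]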
|
||||
168
apps/cli/src/commands/autopilot/start.command.ts
Normal file
@@ -0,0 +1,168 @@
|
||||
/**
|
||||
* @fileoverview Start Command - Initialize and start TDD workflow
|
||||
*/
|
||||
|
||||
import { Command } from 'commander';
|
||||
import { createTaskMasterCore, type WorkflowContext } from '@tm/core';
|
||||
import {
|
||||
AutopilotBaseOptions,
|
||||
hasWorkflowState,
|
||||
createOrchestrator,
|
||||
createGitAdapter,
|
||||
OutputFormatter,
|
||||
validateTaskId,
|
||||
parseSubtasks
|
||||
} from './shared.js';
|
||||
|
||||
interface StartOptions extends AutopilotBaseOptions {
|
||||
force?: boolean;
|
||||
maxAttempts?: string;
|
||||
}
|
||||
|
||||
/**
|
||||
* Start Command - Initialize new TDD workflow
|
||||
*/
|
||||
export class StartCommand extends Command {
|
||||
constructor() {
|
||||
super('start');
|
||||
|
||||
this.description('Initialize and start a new TDD workflow for a task')
|
||||
.argument('<taskId>', 'Task ID to start workflow for')
|
||||
.option('-f, --force', 'Force start even if workflow state exists')
|
||||
.option('--max-attempts <number>', 'Maximum attempts per subtask', '3')
|
||||
.action(async (taskId: string, options: StartOptions) => {
|
||||
await this.execute(taskId, options);
|
||||
});
|
||||
}
|
||||
|
||||
private async execute(taskId: string, options: StartOptions): Promise<void> {
|
||||
// Inherit parent options
|
||||
const parentOpts = this.parent?.opts() as AutopilotBaseOptions;
|
||||
const mergedOptions: StartOptions = {
|
||||
...parentOpts,
|
||||
...options,
|
||||
projectRoot:
|
||||
options.projectRoot || parentOpts?.projectRoot || process.cwd()
|
||||
};
|
||||
|
||||
const formatter = new OutputFormatter(mergedOptions.json || false);
|
||||
|
||||
try {
|
||||
// Validate task ID
|
||||
if (!validateTaskId(taskId)) {
|
||||
formatter.error('Invalid task ID format', {
|
||||
taskId,
|
||||
expected: 'Format: number or number.number (e.g., "1" or "1.2")'
|
||||
});
|
||||
process.exit(1);
|
||||
}
|
||||
|
||||
// Check for existing workflow state
|
||||
const hasState = await hasWorkflowState(mergedOptions.projectRoot!);
|
||||
if (hasState && !mergedOptions.force) {
|
||||
formatter.error(
|
||||
'Workflow state already exists. Use --force to overwrite or resume with "autopilot resume"'
|
||||
);
|
||||
process.exit(1);
|
||||
}
|
||||
|
||||
// Initialize Task Master Core
|
||||
const tmCore = await createTaskMasterCore({
|
||||
projectPath: mergedOptions.projectRoot!
|
||||
});
|
||||
|
||||
// Get current tag from ConfigManager
|
||||
const currentTag = tmCore.getActiveTag();
|
||||
|
||||
// Load task
|
||||
formatter.info(`Loading task ${taskId}...`);
|
||||
const { task } = await tmCore.getTaskWithSubtask(taskId);
|
||||
|
||||
if (!task) {
|
||||
formatter.error('Task not found', { taskId });
|
||||
await tmCore.close();
|
||||
process.exit(1);
|
||||
}
|
||||
|
||||
// Validate task has subtasks
|
||||
if (!task.subtasks || task.subtasks.length === 0) {
|
||||
formatter.error('Task has no subtasks. Expand task first.', {
|
||||
taskId,
|
||||
suggestion: `Run: task-master expand --id=${taskId}`
|
||||
});
|
||||
await tmCore.close();
|
||||
process.exit(1);
|
||||
}
|
||||
|
||||
// Initialize Git adapter
|
||||
const gitAdapter = createGitAdapter(mergedOptions.projectRoot!);
|
||||
await gitAdapter.ensureGitRepository();
|
||||
await gitAdapter.ensureCleanWorkingTree();
|
||||
|
||||
// Parse subtasks
|
||||
const maxAttempts = parseInt(mergedOptions.maxAttempts || '3', 10);
|
||||
const subtasks = parseSubtasks(task, maxAttempts);
|
||||
|
||||
// Create workflow context
|
||||
const context: WorkflowContext = {
|
||||
taskId: task.id,
|
||||
subtasks,
|
||||
currentSubtaskIndex: 0,
|
||||
errors: [],
|
||||
metadata: {
|
||||
startedAt: new Date().toISOString(),
|
||||
tags: task.tags || []
|
||||
}
|
||||
};
|
||||
|
||||
// Create orchestrator with persistence
|
||||
const orchestrator = createOrchestrator(
|
||||
context,
|
||||
mergedOptions.projectRoot!
|
||||
);
|
||||
|
||||
// Complete PREFLIGHT phase
|
||||
orchestrator.transition({ type: 'PREFLIGHT_COMPLETE' });
|
||||
|
||||
// Generate descriptive branch name
|
||||
const sanitizedTitle = task.title
|
||||
.toLowerCase()
|
||||
.replace(/[^a-z0-9]+/g, '-')
|
||||
.replace(/^-+|-+$/g, '')
|
||||
.substring(0, 50);
|
||||
const formattedTaskId = taskId.replace(/\./g, '-');
|
||||
const tagPrefix = currentTag ? `${currentTag}/` : '';
|
||||
const branchName = `${tagPrefix}task-${formattedTaskId}-${sanitizedTitle}`;
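// Illustrative only (hypothetical values): task 2.1 titled "Add OAuth2 Login Flow!"
// on tag "feature-auth" becomes: feature-auth/task-2-1-add-oauth2-login-flow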
|
||||
|
||||
// Create and checkout branch
|
||||
formatter.info(`Creating branch: ${branchName}`);
|
||||
await gitAdapter.createAndCheckoutBranch(branchName);
|
||||
|
||||
// Transition to SUBTASK_LOOP
|
||||
orchestrator.transition({
|
||||
type: 'BRANCH_CREATED',
|
||||
branchName
|
||||
});
|
||||
|
||||
// Output success
|
||||
formatter.success('TDD workflow started', {
|
||||
taskId: task.id,
|
||||
title: task.title,
|
||||
phase: orchestrator.getCurrentPhase(),
|
||||
tddPhase: orchestrator.getCurrentTDDPhase(),
|
||||
branchName,
|
||||
subtasks: subtasks.length,
|
||||
currentSubtask: subtasks[0]?.title
|
||||
});
|
||||
|
||||
// Clean up
|
||||
await tmCore.close();
|
||||
} catch (error) {
|
||||
formatter.error((error as Error).message);
|
||||
if (mergedOptions.verbose) {
|
||||
console.error((error as Error).stack);
|
||||
}
|
||||
process.exit(1);
|
||||
}
|
||||
}
|
||||
}
|
||||
114
apps/cli/src/commands/autopilot/status.command.ts
Normal file
@@ -0,0 +1,114 @@
|
||||
/**
|
||||
* @fileoverview Status Command - Show workflow progress
|
||||
*/
|
||||
|
||||
import { Command } from 'commander';
|
||||
import { WorkflowOrchestrator } from '@tm/core';
|
||||
import {
|
||||
AutopilotBaseOptions,
|
||||
hasWorkflowState,
|
||||
loadWorkflowState,
|
||||
OutputFormatter
|
||||
} from './shared.js';
|
||||
|
||||
type StatusOptions = AutopilotBaseOptions;
|
||||
|
||||
/**
|
||||
* Status Command - Show current workflow status
|
||||
*/
|
||||
export class StatusCommand extends Command {
|
||||
constructor() {
|
||||
super('status');
|
||||
|
||||
this.description('Show current TDD workflow status and progress').action(
|
||||
async (options: StatusOptions) => {
|
||||
await this.execute(options);
|
||||
}
|
||||
);
|
||||
}
|
||||
|
||||
private async execute(options: StatusOptions): Promise<void> {
|
||||
// Inherit parent options
|
||||
const parentOpts = this.parent?.opts() as AutopilotBaseOptions;
|
||||
const mergedOptions: StatusOptions = {
|
||||
...parentOpts,
|
||||
...options,
|
||||
projectRoot:
|
||||
options.projectRoot || parentOpts?.projectRoot || process.cwd()
|
||||
};
|
||||
|
||||
const formatter = new OutputFormatter(mergedOptions.json || false);
|
||||
|
||||
try {
|
||||
// Check for workflow state
|
||||
const hasState = await hasWorkflowState(mergedOptions.projectRoot!);
|
||||
if (!hasState) {
|
||||
formatter.error('No active workflow', {
|
||||
suggestion: 'Start a workflow with: autopilot start <taskId>'
|
||||
});
|
||||
process.exit(1);
|
||||
}
|
||||
|
||||
// Load state
|
||||
const state = await loadWorkflowState(mergedOptions.projectRoot!);
|
||||
if (!state) {
|
||||
formatter.error('Failed to load workflow state');
|
||||
process.exit(1);
|
||||
}
|
||||
|
||||
// Restore orchestrator
|
||||
const orchestrator = new WorkflowOrchestrator(state.context);
|
||||
orchestrator.restoreState(state);
|
||||
|
||||
// Get status information
|
||||
const phase = orchestrator.getCurrentPhase();
|
||||
const tddPhase = orchestrator.getCurrentTDDPhase();
|
||||
const progress = orchestrator.getProgress();
|
||||
const currentSubtask = orchestrator.getCurrentSubtask();
|
||||
const errors = state.context.errors ?? [];
|
||||
|
||||
// Build status output
|
||||
const status = {
|
||||
taskId: state.context.taskId,
|
||||
phase,
|
||||
tddPhase,
|
||||
branchName: state.context.branchName,
|
||||
progress: {
|
||||
completed: progress.completed,
|
||||
total: progress.total,
|
||||
current: progress.current,
|
||||
percentage: progress.percentage
|
||||
},
|
||||
currentSubtask: currentSubtask
|
||||
? {
|
||||
id: currentSubtask.id,
|
||||
title: currentSubtask.title,
|
||||
status: currentSubtask.status,
|
||||
attempts: currentSubtask.attempts,
|
||||
maxAttempts: currentSubtask.maxAttempts
|
||||
}
|
||||
: null,
|
||||
subtasks: state.context.subtasks.map((st) => ({
|
||||
id: st.id,
|
||||
title: st.title,
|
||||
status: st.status,
|
||||
attempts: st.attempts
|
||||
})),
|
||||
errors: errors.length > 0 ? errors : undefined,
|
||||
metadata: state.context.metadata
|
||||
};
|
||||
|
||||
if (mergedOptions.json) {
|
||||
formatter.output(status);
|
||||
} else {
|
||||
formatter.success('Workflow status', status);
|
||||
}
|
||||
} catch (error) {
|
||||
formatter.error((error as Error).message);
|
||||
if (mergedOptions.verbose) {
|
||||
console.error((error as Error).stack);
|
||||
}
|
||||
process.exit(1);
|
||||
}
|
||||
}
|
||||
}
|
||||
@@ -6,13 +6,11 @@
|
||||
import { Command } from 'commander';
|
||||
import chalk from 'chalk';
|
||||
import inquirer from 'inquirer';
|
||||
import search from '@inquirer/search';
|
||||
import ora, { Ora } from 'ora';
|
||||
import {
|
||||
AuthManager,
|
||||
AuthenticationError,
|
||||
type UserContext
|
||||
} from '@tm/core/auth';
|
||||
import { AuthManager, type UserContext } from '@tm/core/auth';
|
||||
import * as ui from '../utils/ui.js';
|
||||
import { displayError } from '../utils/error-handler.js';
|
||||
|
||||
/**
|
||||
* Result type from context command
|
||||
@@ -118,8 +116,7 @@ export class ContextCommand extends Command {
|
||||
const result = this.displayContext();
|
||||
this.setLastResult(result);
|
||||
} catch (error: any) {
|
||||
this.handleError(error);
|
||||
process.exit(1);
|
||||
displayError(error);
|
||||
}
|
||||
}
|
||||
|
||||
@@ -156,10 +153,14 @@ export class ContextCommand extends Command {
|
||||
|
||||
if (context.briefName || context.briefId) {
|
||||
console.log(chalk.green('\n✓ Brief'));
|
||||
if (context.briefName) {
|
||||
if (context.briefName && context.briefId) {
|
||||
const shortId = context.briefId.slice(0, 8);
|
||||
console.log(
|
||||
chalk.white(` ${context.briefName} `) + chalk.gray(`(${shortId})`)
|
||||
);
|
||||
} else if (context.briefName) {
|
||||
console.log(chalk.white(` ${context.briefName}`));
|
||||
}
|
||||
if (context.briefId) {
|
||||
} else if (context.briefId) {
|
||||
console.log(chalk.gray(` ID: ${context.briefId}`));
|
||||
}
|
||||
}
|
||||
@@ -211,8 +212,7 @@ export class ContextCommand extends Command {
|
||||
process.exit(1);
|
||||
}
|
||||
} catch (error: any) {
|
||||
this.handleError(error);
|
||||
process.exit(1);
|
||||
displayError(error);
|
||||
}
|
||||
}
|
||||
|
||||
@@ -250,9 +250,10 @@ export class ContextCommand extends Command {
|
||||
]);
|
||||
|
||||
// Update context
|
||||
await this.authManager.updateContext({
|
||||
this.authManager.updateContext({
|
||||
orgId: selectedOrg.id,
|
||||
orgName: selectedOrg.name,
|
||||
orgSlug: selectedOrg.slug,
|
||||
// Clear brief when changing org
|
||||
briefId: undefined,
|
||||
briefName: undefined
|
||||
@@ -299,8 +300,7 @@ export class ContextCommand extends Command {
|
||||
process.exit(1);
|
||||
}
|
||||
} catch (error: any) {
|
||||
this.handleError(error);
|
||||
process.exit(1);
|
||||
displayError(error);
|
||||
}
|
||||
}
|
||||
|
||||
@@ -324,26 +324,54 @@ export class ContextCommand extends Command {
|
||||
};
|
||||
}
|
||||
|
||||
// Prompt for selection
|
||||
const { selectedBrief } = await inquirer.prompt([
|
||||
{
|
||||
type: 'list',
|
||||
name: 'selectedBrief',
|
||||
message: 'Select a brief:',
|
||||
choices: [
|
||||
{ name: '(No brief - organization level)', value: null },
|
||||
...briefs.map((brief) => ({
|
||||
name: `Brief ${brief.id} (${new Date(brief.createdAt).toLocaleDateString()})`,
|
||||
value: brief
|
||||
}))
|
||||
]
|
||||
// Prompt for selection with search
|
||||
const selectedBrief = await search<(typeof briefs)[0] | null>({
|
||||
message: 'Search for a brief:',
|
||||
source: async (input) => {
|
||||
const searchTerm = input?.toLowerCase() || '';
|
||||
|
||||
// Static option for no brief
|
||||
const noBriefOption = {
|
||||
name: '(No brief - organization level)',
|
||||
value: null as any,
|
||||
description: 'Clear brief selection'
|
||||
};
|
||||
|
||||
// Filter and map brief options
|
||||
const briefOptions = briefs
|
||||
.filter((brief) => {
|
||||
if (!searchTerm) return true;
|
||||
|
||||
const title = brief.document?.title || '';
|
||||
const shortId = brief.id.slice(0, 8);
|
||||
|
||||
// Search by title first, then by UUID
|
||||
return (
|
||||
title.toLowerCase().includes(searchTerm) ||
|
||||
brief.id.toLowerCase().includes(searchTerm) ||
|
||||
shortId.toLowerCase().includes(searchTerm)
|
||||
);
|
||||
})
|
||||
.map((brief) => {
|
||||
const title =
|
||||
brief.document?.title || `Brief ${brief.id.slice(0, 8)}`;
|
||||
const shortId = brief.id.slice(0, 8);
|
||||
return {
|
||||
name: `${title} ${chalk.gray(`(${shortId})`)}`,
|
||||
value: brief
|
||||
};
|
||||
});
|
||||
|
||||
return [noBriefOption, ...briefOptions];
|
||||
}
|
||||
]);
|
||||
});
|
||||
|
||||
if (selectedBrief) {
|
||||
// Update context with brief
|
||||
const briefName = `Brief ${selectedBrief.id.slice(0, 8)}`;
|
||||
await this.authManager.updateContext({
|
||||
const briefName =
|
||||
selectedBrief.document?.title ||
|
||||
`Brief ${selectedBrief.id.slice(0, 8)}`;
|
||||
this.authManager.updateContext({
|
||||
briefId: selectedBrief.id,
|
||||
briefName: briefName
|
||||
});
|
||||
@@ -354,11 +382,11 @@ export class ContextCommand extends Command {
|
||||
success: true,
|
||||
action: 'select-brief',
|
||||
context: this.authManager.getContext() || undefined,
|
||||
message: `Selected brief: ${selectedBrief.name}`
|
||||
message: `Selected brief: ${selectedBrief.document?.title}`
|
||||
};
|
||||
} else {
|
||||
// Clear brief selection
|
||||
await this.authManager.updateContext({
|
||||
this.authManager.updateContext({
|
||||
briefId: undefined,
|
||||
briefName: undefined
|
||||
});
|
||||
@@ -396,8 +424,7 @@ export class ContextCommand extends Command {
|
||||
process.exit(1);
|
||||
}
|
||||
} catch (error: any) {
|
||||
this.handleError(error);
|
||||
process.exit(1);
|
||||
displayError(error);
|
||||
}
|
||||
}
|
||||
|
||||
@@ -443,8 +470,7 @@ export class ContextCommand extends Command {
|
||||
process.exit(1);
|
||||
}
|
||||
} catch (error: any) {
|
||||
this.handleError(error);
|
||||
process.exit(1);
|
||||
displayError(error);
|
||||
}
|
||||
}
|
||||
|
||||
@@ -468,7 +494,7 @@ export class ContextCommand extends Command {
|
||||
if (!briefId) {
|
||||
spinner.fail('Could not extract a brief ID from the provided input');
|
||||
ui.displayError(
|
||||
`Provide a valid brief ID or a Hamster brief URL, e.g. https://${process.env.TM_PUBLIC_BASE_DOMAIN}/home/hamster/briefs/<id>`
|
||||
`Provide a valid brief ID or a Hamster brief URL, e.g. https://${process.env.TM_BASE_DOMAIN || process.env.TM_PUBLIC_BASE_DOMAIN}/home/hamster/briefs/<id>`
|
||||
);
|
||||
process.exit(1);
|
||||
}
|
||||
@@ -480,20 +506,24 @@ export class ContextCommand extends Command {
|
||||
process.exit(1);
|
||||
}
|
||||
|
||||
// Fetch org to get a friendly name (optional)
|
||||
// Fetch org to get a friendly name and slug (optional)
|
||||
let orgName: string | undefined;
|
||||
let orgSlug: string | undefined;
|
||||
try {
|
||||
const org = await this.authManager.getOrganization(brief.accountId);
|
||||
orgName = org?.name;
|
||||
orgSlug = org?.slug;
|
||||
} catch {
|
||||
// Non-fatal if org lookup fails
|
||||
}
|
||||
|
||||
// Update context: set org and brief
|
||||
const briefName = `Brief ${brief.id.slice(0, 8)}`;
|
||||
await this.authManager.updateContext({
|
||||
const briefName =
|
||||
brief.document?.title || `Brief ${brief.id.slice(0, 8)}`;
|
||||
this.authManager.updateContext({
|
||||
orgId: brief.accountId,
|
||||
orgName,
|
||||
orgSlug,
|
||||
briefId: brief.id,
|
||||
briefName
|
||||
});
|
||||
@@ -515,8 +545,7 @@ export class ContextCommand extends Command {
|
||||
try {
|
||||
if (spinner?.isSpinning) spinner.stop();
|
||||
} catch {}
|
||||
this.handleError(error);
|
||||
process.exit(1);
|
||||
displayError(error);
|
||||
}
|
||||
}
|
||||
|
||||
@@ -613,7 +642,7 @@ export class ContextCommand extends Command {
|
||||
};
|
||||
}
|
||||
|
||||
await this.authManager.updateContext(context);
|
||||
this.authManager.updateContext(context);
|
||||
ui.displaySuccess('Context updated');
|
||||
|
||||
// Display what was set
|
||||
@@ -645,26 +674,6 @@ export class ContextCommand extends Command {
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Handle errors
|
||||
*/
|
||||
private handleError(error: any): void {
|
||||
if (error instanceof AuthenticationError) {
|
||||
console.error(chalk.red(`\n✗ ${error.message}`));
|
||||
|
||||
if (error.code === 'NOT_AUTHENTICATED') {
|
||||
ui.displayWarning('Please authenticate first: tm auth login');
|
||||
}
|
||||
} else {
|
||||
const msg = error?.message ?? String(error);
|
||||
console.error(chalk.red(`Error: ${msg}`));
|
||||
|
||||
if (error.stack && process.env.DEBUG) {
|
||||
console.error(chalk.gray(error.stack));
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Set the last result for programmatic access
|
||||
*/
|
||||
@@ -686,6 +695,53 @@ export class ContextCommand extends Command {
|
||||
return this.authManager.getContext();
|
||||
}
|
||||
|
||||
/**
|
||||
* Interactive context setup (for post-auth flow)
|
||||
* Prompts user to select org and brief
|
||||
*/
|
||||
async setupContextInteractive(): Promise<{
|
||||
success: boolean;
|
||||
orgSelected: boolean;
|
||||
briefSelected: boolean;
|
||||
}> {
|
||||
try {
|
||||
// Ask if user wants to set up workspace context
|
||||
const { setupContext } = await inquirer.prompt([
|
||||
{
|
||||
type: 'confirm',
|
||||
name: 'setupContext',
|
||||
message: 'Would you like to set up your workspace context now?',
|
||||
default: true
|
||||
}
|
||||
]);
|
||||
|
||||
if (!setupContext) {
|
||||
return { success: true, orgSelected: false, briefSelected: false };
|
||||
}
|
||||
|
||||
// Select organization
|
||||
const orgResult = await this.selectOrganization();
|
||||
if (!orgResult.success || !orgResult.context?.orgId) {
|
||||
return { success: false, orgSelected: false, briefSelected: false };
|
||||
}
|
||||
|
||||
// Select brief
|
||||
const briefResult = await this.selectBrief(orgResult.context.orgId);
|
||||
return {
|
||||
success: true,
|
||||
orgSelected: true,
|
||||
briefSelected: briefResult.success
|
||||
};
|
||||
} catch (error) {
|
||||
console.error(
|
||||
chalk.yellow(
|
||||
'\nContext setup skipped due to error. You can set it up later with "tm context"'
|
||||
)
|
||||
);
|
||||
return { success: false, orgSelected: false, briefSelected: false };
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Clean up resources
|
||||
*/
|
||||
|
||||
@@ -7,13 +7,10 @@ import { Command } from 'commander';
|
||||
import chalk from 'chalk';
|
||||
import inquirer from 'inquirer';
|
||||
import ora, { Ora } from 'ora';
|
||||
import {
|
||||
AuthManager,
|
||||
AuthenticationError,
|
||||
type UserContext
|
||||
} from '@tm/core/auth';
|
||||
import { AuthManager, type UserContext } from '@tm/core/auth';
|
||||
import { TaskMasterCore, type ExportResult } from '@tm/core';
|
||||
import * as ui from '../utils/ui.js';
|
||||
import { displayError } from '../utils/error-handler.js';
|
||||
|
||||
/**
|
||||
* Result type from export command
|
||||
@@ -103,7 +100,7 @@ export class ExportCommand extends Command {
|
||||
await this.initializeServices();
|
||||
|
||||
// Get current context
|
||||
const context = this.authManager.getContext();
|
||||
const context = await this.authManager.getContext();
|
||||
|
||||
// Determine org and brief IDs
|
||||
let orgId = options?.org || context?.orgId;
|
||||
@@ -197,8 +194,7 @@ export class ExportCommand extends Command {
|
||||
};
|
||||
} catch (error: any) {
|
||||
if (spinner?.isSpinning) spinner.fail('Export failed');
|
||||
this.handleError(error);
|
||||
process.exit(1);
|
||||
displayError(error);
|
||||
}
|
||||
}
|
||||
|
||||
@@ -334,26 +330,6 @@ export class ExportCommand extends Command {
|
||||
return confirmed;
|
||||
}
|
||||
|
||||
/**
|
||||
* Handle errors
|
||||
*/
|
||||
private handleError(error: any): void {
|
||||
if (error instanceof AuthenticationError) {
|
||||
console.error(chalk.red(`\n✗ ${error.message}`));
|
||||
|
||||
if (error.code === 'NOT_AUTHENTICATED') {
|
||||
ui.displayWarning('Please authenticate first: tm auth login');
|
||||
}
|
||||
} else {
|
||||
const msg = error?.message ?? String(error);
|
||||
console.error(chalk.red(`Error: ${msg}`));
|
||||
|
||||
if (error.stack && process.env.DEBUG) {
|
||||
console.error(chalk.gray(error.stack));
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Get the last export result (useful for testing)
|
||||
*/
|
||||
|
||||
@@ -17,8 +17,9 @@ import {
|
||||
} from '@tm/core';
|
||||
import type { StorageType } from '@tm/core/types';
|
||||
import * as ui from '../utils/ui.js';
|
||||
import { displayError } from '../utils/error-handler.js';
|
||||
import { displayCommandHeader } from '../utils/display-helpers.js';
|
||||
import {
|
||||
displayHeader,
|
||||
displayDashboards,
|
||||
calculateTaskStatistics,
|
||||
calculateSubtaskStatistics,
|
||||
@@ -106,14 +107,7 @@ export class ListTasksCommand extends Command {
|
||||
this.displayResults(result, options);
|
||||
}
|
||||
} catch (error: any) {
|
||||
const msg = error?.getSanitizedDetails?.() ?? {
|
||||
message: error?.message ?? String(error)
|
||||
};
|
||||
console.error(chalk.red(`Error: ${msg.message || 'Unexpected error'}`));
|
||||
if (error.stack && process.env.DEBUG) {
|
||||
console.error(chalk.gray(error.stack));
|
||||
}
|
||||
process.exit(1);
|
||||
displayError(error);
|
||||
}
|
||||
}
|
||||
|
||||
@@ -257,15 +251,12 @@ export class ListTasksCommand extends Command {
|
||||
* Display in text format with tables
|
||||
*/
|
||||
private displayText(data: ListTasksResult, withSubtasks?: boolean): void {
|
||||
const { tasks, tag } = data;
|
||||
const { tasks, tag, storageType } = data;
|
||||
|
||||
// Get file path for display
|
||||
const filePath = this.tmCore ? `.taskmaster/tasks/tasks.json` : undefined;
|
||||
|
||||
// Display header without banner (banner already shown by main CLI)
|
||||
displayHeader({
|
||||
// Display header using utility function
|
||||
displayCommandHeader(this.tmCore, {
|
||||
tag: tag || 'master',
|
||||
filePath: filePath
|
||||
storageType
|
||||
});
|
||||
|
||||
// No tasks message
|
||||
|
||||
248
apps/cli/src/commands/next.command.ts
Normal file
@@ -0,0 +1,248 @@
|
||||
/**
|
||||
* @fileoverview NextCommand using Commander's native class pattern
|
||||
* Extends Commander.Command for better integration with the framework
|
||||
*/
|
||||
|
||||
import path from 'node:path';
|
||||
import { Command } from 'commander';
|
||||
import chalk from 'chalk';
|
||||
import boxen from 'boxen';
|
||||
import { createTaskMasterCore, type Task, type TaskMasterCore } from '@tm/core';
|
||||
import type { StorageType } from '@tm/core/types';
|
||||
import { displayError } from '../utils/error-handler.js';
|
||||
import { displayTaskDetails } from '../ui/components/task-detail.component.js';
|
||||
import { displayCommandHeader } from '../utils/display-helpers.js';
|
||||
|
||||
/**
|
||||
* Options interface for the next command
|
||||
*/
|
||||
export interface NextCommandOptions {
|
||||
tag?: string;
|
||||
format?: 'text' | 'json';
|
||||
silent?: boolean;
|
||||
project?: string;
|
||||
}
|
||||
|
||||
/**
|
||||
* Result type from next command
|
||||
*/
|
||||
export interface NextTaskResult {
|
||||
task: Task | null;
|
||||
found: boolean;
|
||||
tag: string;
|
||||
storageType: Exclude<StorageType, 'auto'>;
|
||||
}
|
||||
|
||||
/**
|
||||
* NextCommand extending Commander's Command class
|
||||
* This is a thin presentation layer over @tm/core
|
||||
*/
|
||||
export class NextCommand extends Command {
|
||||
private tmCore?: TaskMasterCore;
|
||||
private lastResult?: NextTaskResult;
|
||||
|
||||
constructor(name?: string) {
|
||||
super(name || 'next');
|
||||
|
||||
// Configure the command
|
||||
this.description('Find the next available task to work on')
|
||||
.option('-t, --tag <tag>', 'Filter by tag')
|
||||
.option('-f, --format <format>', 'Output format (text, json)', 'text')
|
||||
.option('--silent', 'Suppress output (useful for programmatic usage)')
|
||||
.option('-p, --project <path>', 'Project root directory', process.cwd())
|
||||
.action(async (options: NextCommandOptions) => {
|
||||
await this.executeCommand(options);
|
||||
});
|
||||
}
|
||||
|
||||
/**
|
||||
* Execute the next command
|
||||
*/
|
||||
private async executeCommand(options: NextCommandOptions): Promise<void> {
|
||||
let hasError = false;
|
||||
try {
|
||||
// Validate options (throws on invalid options)
|
||||
this.validateOptions(options);
|
||||
|
||||
// Initialize tm-core
|
||||
await this.initializeCore(options.project || process.cwd());
|
||||
|
||||
// Get next task from core
|
||||
const result = await this.getNextTask(options);
|
||||
|
||||
// Store result for programmatic access
|
||||
this.setLastResult(result);
|
||||
|
||||
// Display results
|
||||
if (!options.silent) {
|
||||
this.displayResults(result, options);
|
||||
}
|
||||
} catch (error: any) {
|
||||
hasError = true;
|
||||
displayError(error, { skipExit: true });
|
||||
} finally {
|
||||
// Always clean up resources, even on error
|
||||
await this.cleanup();
|
||||
}
|
||||
|
||||
// Exit after cleanup completes
|
||||
if (hasError) {
|
||||
process.exit(1);
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Validate command options
|
||||
*/
|
||||
private validateOptions(options: NextCommandOptions): void {
|
||||
// Validate format
|
||||
if (options.format && !['text', 'json'].includes(options.format)) {
|
||||
throw new Error(
|
||||
`Invalid format: ${options.format}. Valid formats are: text, json`
|
||||
);
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Initialize TaskMasterCore
|
||||
*/
|
||||
private async initializeCore(projectRoot: string): Promise<void> {
|
||||
if (!this.tmCore) {
|
||||
const resolved = path.resolve(projectRoot);
|
||||
this.tmCore = await createTaskMasterCore({ projectPath: resolved });
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Get next task from tm-core
|
||||
*/
|
||||
private async getNextTask(
|
||||
options: NextCommandOptions
|
||||
): Promise<NextTaskResult> {
|
||||
if (!this.tmCore) {
|
||||
throw new Error('TaskMasterCore not initialized');
|
||||
}
|
||||
|
||||
// Call tm-core to get next task
|
||||
const task = await this.tmCore.getNextTask(options.tag);
|
||||
|
||||
// Get storage type and active tag
|
||||
const storageType = this.tmCore.getStorageType();
|
||||
if (storageType === 'auto') {
|
||||
throw new Error('Storage type must be resolved before use');
|
||||
}
|
||||
const activeTag = options.tag || this.tmCore.getActiveTag();
|
||||
|
||||
return {
|
||||
task,
|
||||
found: task !== null,
|
||||
tag: activeTag,
|
||||
storageType
|
||||
};
|
||||
}
|
||||
|
||||
/**
|
||||
* Display results based on format
|
||||
*/
|
||||
private displayResults(
|
||||
result: NextTaskResult,
|
||||
options: NextCommandOptions
|
||||
): void {
|
||||
const format = options.format || 'text';
|
||||
|
||||
switch (format) {
|
||||
case 'json':
|
||||
this.displayJson(result);
|
||||
break;
|
||||
|
||||
case 'text':
|
||||
default:
|
||||
this.displayText(result);
|
||||
break;
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Display in JSON format
|
||||
*/
|
||||
private displayJson(result: NextTaskResult): void {
|
||||
console.log(JSON.stringify(result, null, 2));
|
||||
}
|
||||
|
||||
/**
|
||||
* Display in text format
|
||||
*/
|
||||
private displayText(result: NextTaskResult): void {
|
||||
// Display header with storage info
|
||||
displayCommandHeader(this.tmCore, {
|
||||
tag: result.tag || 'master',
|
||||
storageType: result.storageType
|
||||
});
|
||||
|
||||
if (!result.found || !result.task) {
|
||||
// No next task available
|
||||
console.log(
|
||||
boxen(
|
||||
chalk.yellow(
|
||||
'No tasks available to work on. All tasks are either completed, blocked by dependencies, or in progress.'
|
||||
),
|
||||
{
|
||||
padding: 1,
|
||||
borderStyle: 'round',
|
||||
borderColor: 'yellow',
|
||||
title: '⚠ NO TASKS AVAILABLE ⚠',
|
||||
titleAlignment: 'center'
|
||||
}
|
||||
)
|
||||
);
|
||||
console.log(
|
||||
`\n${chalk.dim('Tip: Try')} ${chalk.cyan('task-master list --status pending')} ${chalk.dim('to see all pending tasks')}`
|
||||
);
|
||||
return;
|
||||
}
|
||||
|
||||
const task = result.task;
|
||||
|
||||
// Display the task details using the same component as 'show' command
|
||||
// with a custom header indicating this is the next task
|
||||
const customHeader = `Next Task: #${task.id} - ${task.title}`;
|
||||
displayTaskDetails(task, {
|
||||
customHeader,
|
||||
headerColor: 'green',
|
||||
showSuggestedActions: true
|
||||
});
|
||||
}
|
||||
|
||||
/**
|
||||
* Set the last result for programmatic access
|
||||
*/
|
||||
private setLastResult(result: NextTaskResult): void {
|
||||
this.lastResult = result;
|
||||
}
|
||||
|
||||
/**
|
||||
* Get the last result (for programmatic usage)
|
||||
*/
|
||||
getLastResult(): NextTaskResult | undefined {
|
||||
return this.lastResult;
|
||||
}
|
||||
|
||||
/**
|
||||
* Clean up resources
|
||||
*/
|
||||
async cleanup(): Promise<void> {
|
||||
if (this.tmCore) {
|
||||
await this.tmCore.close();
|
||||
this.tmCore = undefined;
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Register this command on an existing program
|
||||
*/
|
||||
static register(program: Command, name?: string): NextCommand {
|
||||
const nextCommand = new NextCommand(name);
|
||||
program.addCommand(nextCommand);
|
||||
return nextCommand;
|
||||
}
|
||||
}
|
||||
@@ -12,6 +12,7 @@ import {
|
||||
type TaskStatus
|
||||
} from '@tm/core';
|
||||
import type { StorageType } from '@tm/core/types';
|
||||
import { displayError } from '../utils/error-handler.js';
|
||||
|
||||
/**
|
||||
* Valid task status values for validation
|
||||
@@ -85,6 +86,7 @@ export class SetStatusCommand extends Command {
|
||||
private async executeCommand(
|
||||
options: SetStatusCommandOptions
|
||||
): Promise<void> {
|
||||
let hasError = false;
|
||||
try {
|
||||
// Validate required options
|
||||
if (!options.id) {
|
||||
@@ -135,16 +137,15 @@ export class SetStatusCommand extends Command {
|
||||
oldStatus: result.oldStatus,
|
||||
newStatus: result.newStatus
|
||||
});
|
||||
} catch (error) {
|
||||
const errorMessage =
|
||||
error instanceof Error ? error.message : String(error);
|
||||
|
||||
if (!options.silent) {
|
||||
console.error(
|
||||
chalk.red(`Failed to update task ${taskId}: ${errorMessage}`)
|
||||
);
|
||||
}
|
||||
} catch (error: any) {
|
||||
hasError = true;
|
||||
if (options.format === 'json') {
|
||||
const errorMessage = error?.getSanitizedDetails
|
||||
? error.getSanitizedDetails().message
|
||||
: error instanceof Error
|
||||
? error.message
|
||||
: String(error);
|
||||
|
||||
console.log(
|
||||
JSON.stringify({
|
||||
success: false,
|
||||
@@ -153,8 +154,13 @@ export class SetStatusCommand extends Command {
|
||||
timestamp: new Date().toISOString()
|
||||
})
|
||||
);
|
||||
} else if (!options.silent) {
|
||||
// Show which task failed with context
|
||||
console.error(chalk.red(`\nFailed to update task ${taskId}:`));
|
||||
displayError(error, { skipExit: true });
|
||||
}
|
||||
process.exit(1);
|
||||
// Don't exit here - let finally block clean up first
|
||||
break;
|
||||
}
|
||||
}
|
||||
|
||||
@@ -170,25 +176,26 @@ export class SetStatusCommand extends Command {
|
||||
|
||||
// Display results
|
||||
this.displayResults(this.lastResult, options);
|
||||
} catch (error) {
|
||||
const errorMessage =
|
||||
error instanceof Error ? error.message : 'Unknown error occurred';
|
||||
|
||||
if (!options.silent) {
|
||||
console.error(chalk.red(`Error: ${errorMessage}`));
|
||||
}
|
||||
|
||||
} catch (error: any) {
|
||||
hasError = true;
|
||||
if (options.format === 'json') {
|
||||
const errorMessage =
|
||||
error instanceof Error ? error.message : 'Unknown error occurred';
|
||||
console.log(JSON.stringify({ success: false, error: errorMessage }));
|
||||
} else if (!options.silent) {
|
||||
displayError(error, { skipExit: true });
|
||||
}
|
||||
|
||||
process.exit(1);
|
||||
} finally {
|
||||
// Clean up resources
|
||||
if (this.tmCore) {
|
||||
await this.tmCore.close();
|
||||
}
|
||||
}
|
||||
|
||||
// Exit after cleanup completes
|
||||
if (hasError) {
|
||||
process.exit(1);
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
|
||||
@@ -9,7 +9,9 @@ import boxen from 'boxen';
|
||||
import { createTaskMasterCore, type Task, type TaskMasterCore } from '@tm/core';
|
||||
import type { StorageType } from '@tm/core/types';
|
||||
import * as ui from '../utils/ui.js';
|
||||
import { displayError } from '../utils/error-handler.js';
|
||||
import { displayTaskDetails } from '../ui/components/task-detail.component.js';
|
||||
import { displayCommandHeader } from '../utils/display-helpers.js';
|
||||
|
||||
/**
|
||||
* Options interface for the show command
|
||||
@@ -112,14 +114,7 @@ export class ShowCommand extends Command {
|
||||
this.displayResults(result, options);
|
||||
}
|
||||
} catch (error: any) {
|
||||
const msg = error?.getSanitizedDetails?.() ?? {
|
||||
message: error?.message ?? String(error)
|
||||
};
|
||||
console.error(chalk.red(`Error: ${msg.message || 'Unexpected error'}`));
|
||||
if (error.stack && process.env.DEBUG) {
|
||||
console.error(chalk.gray(error.stack));
|
||||
}
|
||||
process.exit(1);
|
||||
displayError(error);
|
||||
}
|
||||
}
|
||||
|
||||
@@ -257,6 +252,15 @@ export class ShowCommand extends Command {
|
||||
return;
|
||||
}
|
||||
|
||||
// Display header with storage info
|
||||
const activeTag = this.tmCore?.getActiveTag() || 'master';
|
||||
displayCommandHeader(this.tmCore, {
|
||||
tag: activeTag,
|
||||
storageType: result.storageType
|
||||
});
|
||||
|
||||
console.log(); // Add spacing
|
||||
|
||||
// Use the global task details display function
|
||||
displayTaskDetails(result.task, {
|
||||
statusFilter: options.status,
|
||||
@@ -271,8 +275,12 @@ export class ShowCommand extends Command {
|
||||
result: ShowMultipleTasksResult,
|
||||
_options: ShowCommandOptions
|
||||
): void {
|
||||
// Header
|
||||
ui.displayBanner(`Tasks (${result.tasks.length} found)`);
|
||||
// Display header with storage info
|
||||
const activeTag = this.tmCore?.getActiveTag() || 'master';
|
||||
displayCommandHeader(this.tmCore, {
|
||||
tag: activeTag,
|
||||
storageType: result.storageType
|
||||
});
|
||||
|
||||
if (result.notFound.length > 0) {
|
||||
console.log(chalk.yellow(`\n⚠ Not found: ${result.notFound.join(', ')}`));
|
||||
@@ -291,8 +299,6 @@ export class ShowCommand extends Command {
|
||||
showDependencies: true
|
||||
})
|
||||
);
|
||||
|
||||
console.log(`\n${chalk.gray('Storage: ' + result.storageType)}`);
|
||||
}
|
||||
|
||||
/**
|
||||
|
||||
@@ -16,6 +16,7 @@ import {
|
||||
} from '@tm/core';
|
||||
import { displayTaskDetails } from '../ui/components/task-detail.component.js';
|
||||
import * as ui from '../utils/ui.js';
|
||||
import { displayError } from '../utils/error-handler.js';
|
||||
|
||||
/**
|
||||
* CLI-specific options interface for the start command
|
||||
@@ -160,8 +161,7 @@ export class StartCommand extends Command {
|
||||
if (spinner) {
|
||||
spinner.fail('Operation failed');
|
||||
}
|
||||
this.handleError(error);
|
||||
process.exit(1);
|
||||
displayError(error);
|
||||
}
|
||||
}
|
||||
|
||||
@@ -452,22 +452,6 @@ export class StartCommand extends Command {
|
||||
console.log(`\n${chalk.gray('Storage: ' + result.storageType)}`);
|
||||
}
|
||||
|
||||
/**
|
||||
* Handle general errors
|
||||
*/
|
||||
private handleError(error: any): void {
|
||||
const msg = error?.getSanitizedDetails?.() ?? {
|
||||
message: error?.message ?? String(error)
|
||||
};
|
||||
console.error(chalk.red(`Error: ${msg.message || 'Unexpected error'}`));
|
||||
|
||||
// Show stack trace in development mode or when DEBUG is set
|
||||
const isDevelopment = process.env.NODE_ENV !== 'production';
|
||||
if ((isDevelopment || process.env.DEBUG) && error.stack) {
|
||||
console.error(chalk.gray(error.stack));
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Set the last result for programmatic access
|
||||
*/
|
||||
|
||||
@@ -6,11 +6,13 @@
|
||||
// Commands
|
||||
export { ListTasksCommand } from './commands/list.command.js';
|
||||
export { ShowCommand } from './commands/show.command.js';
|
||||
export { NextCommand } from './commands/next.command.js';
|
||||
export { AuthCommand } from './commands/auth.command.js';
|
||||
export { ContextCommand } from './commands/context.command.js';
|
||||
export { StartCommand } from './commands/start.command.js';
|
||||
export { SetStatusCommand } from './commands/set-status.command.js';
|
||||
export { ExportCommand } from './commands/export.command.js';
|
||||
export { AutopilotCommand } from './commands/autopilot.command.js';
|
||||
|
||||
// Command Registry
|
||||
export {
|
||||
@@ -23,6 +25,9 @@ export {
|
||||
// UI utilities (for other commands to use)
|
||||
export * as ui from './utils/ui.js';
|
||||
|
||||
// Error handling utilities
|
||||
export { displayError, isDebugMode } from './utils/error-handler.js';
|
||||
|
||||
// Auto-update utilities
|
||||
export {
|
||||
checkForUpdate,
|
||||
|
||||
@@ -5,6 +5,16 @@

import chalk from 'chalk';

/**
 * Brief information for API storage
 */
export interface BriefInfo {
	briefId: string;
	briefName: string;
	orgSlug?: string;
	webAppUrl?: string;
}

/**
 * Header configuration options
 */
@@ -12,22 +22,50 @@ export interface HeaderOptions {
|
||||
title?: string;
|
||||
tag?: string;
|
||||
filePath?: string;
|
||||
storageType?: 'api' | 'file';
|
||||
briefInfo?: BriefInfo;
|
||||
}
|
||||
|
||||
/**
|
||||
* Display the Task Master header with project info
|
||||
*/
|
||||
export function displayHeader(options: HeaderOptions = {}): void {
|
||||
const { filePath, tag } = options;
|
||||
const { filePath, tag, storageType, briefInfo } = options;
|
||||
|
||||
// Display tag and file path info
|
||||
if (tag) {
|
||||
// Display different header based on storage type
|
||||
if (storageType === 'api' && briefInfo) {
|
||||
// API storage: Show brief information
|
||||
const briefDisplay = `🏷 Brief: ${chalk.cyan(briefInfo.briefName)} ${chalk.gray(`(${briefInfo.briefId})`)}`;
|
||||
console.log(briefDisplay);
|
||||
|
||||
// Construct and display the brief URL or ID
|
||||
if (briefInfo.webAppUrl && briefInfo.orgSlug) {
|
||||
const briefUrl = `${briefInfo.webAppUrl}/home/${briefInfo.orgSlug}/briefs/${briefInfo.briefId}/plan`;
|
||||
console.log(`Listing tasks from: ${chalk.dim(briefUrl)}`);
|
||||
} else if (briefInfo.webAppUrl) {
|
||||
// Show web app URL and brief ID if org slug is missing
|
||||
console.log(
|
||||
`Listing tasks from: ${chalk.dim(`${briefInfo.webAppUrl} (Brief: ${briefInfo.briefId})`)}`
|
||||
);
|
||||
console.log(
|
||||
chalk.yellow(
|
||||
`💡 Tip: Run ${chalk.cyan('tm context select')} to set your organization and see the full URL`
|
||||
)
|
||||
);
|
||||
} else {
|
||||
// Fallback: just show the brief ID if we can't get web app URL
|
||||
console.log(
|
||||
`Listing tasks from: ${chalk.dim(`API (Brief ID: ${briefInfo.briefId})`)}`
|
||||
);
|
||||
}
|
||||
} else if (tag) {
|
||||
// File storage: Show tag information
|
||||
let tagInfo = '';
|
||||
|
||||
if (tag && tag !== 'master') {
|
||||
tagInfo = `🏷 tag: ${chalk.cyan(tag)}`;
|
||||
tagInfo = `🏷 tag: ${chalk.cyan(tag)}`;
|
||||
} else {
|
||||
tagInfo = `🏷 tag: ${chalk.cyan('master')}`;
|
||||
tagInfo = `🏷 tag: ${chalk.cyan('master')}`;
|
||||
}
|
||||
|
||||
console.log(tagInfo);
|
||||
@@ -39,7 +77,5 @@ export function displayHeader(options: HeaderOptions = {}): void {
|
||||
: `${process.cwd()}/${filePath}`;
|
||||
console.log(`Listing tasks from: ${chalk.dim(absolutePath)}`);
|
||||
}
|
||||
|
||||
console.log(); // Empty line for spacing
|
||||
}
|
||||
}
|
||||
|
||||
@@ -6,7 +6,7 @@
|
||||
import chalk from 'chalk';
|
||||
import boxen from 'boxen';
|
||||
import type { Task } from '@tm/core/types';
|
||||
import { getComplexityWithColor } from '../../utils/ui.js';
|
||||
import { getComplexityWithColor, getBoxWidth } from '../../utils/ui.js';
|
||||
|
||||
/**
|
||||
* Next task display options
|
||||
@@ -113,7 +113,7 @@ export function displayRecommendedNextTask(
|
||||
borderColor: '#FFA500', // Orange color
|
||||
title: chalk.hex('#FFA500')('⚡ RECOMMENDED NEXT TASK ⚡'),
|
||||
titleAlignment: 'center',
|
||||
width: process.stdout.columns * 0.97,
|
||||
width: getBoxWidth(0.97),
|
||||
fullscreen: false
|
||||
})
|
||||
);
|
||||
|
||||
@@ -5,6 +5,7 @@
|
||||
|
||||
import chalk from 'chalk';
|
||||
import boxen from 'boxen';
|
||||
import { getBoxWidth } from '../../utils/ui.js';
|
||||
|
||||
/**
|
||||
* Display suggested next steps section
|
||||
@@ -24,7 +25,7 @@ export function displaySuggestedNextSteps(): void {
|
||||
margin: { top: 0, bottom: 1 },
|
||||
borderStyle: 'round',
|
||||
borderColor: 'gray',
|
||||
width: process.stdout.columns * 0.97
|
||||
width: getBoxWidth(0.97)
|
||||
}
|
||||
)
|
||||
);
|
||||
|
||||
75
apps/cli/src/utils/display-helpers.ts
Normal file
@@ -0,0 +1,75 @@
/**
 * @fileoverview Display helper utilities for commands
 * Provides DRY utilities for displaying headers and other command output
 */

import type { TaskMasterCore } from '@tm/core';
import type { StorageType } from '@tm/core/types';
import { displayHeader, type BriefInfo } from '../ui/index.js';

/**
 * Get web app base URL from environment
 */
function getWebAppUrl(): string | undefined {
	const baseDomain =
		process.env.TM_BASE_DOMAIN || process.env.TM_PUBLIC_BASE_DOMAIN;

	if (!baseDomain) {
		return undefined;
	}

	// If it already includes protocol, use as-is
	if (baseDomain.startsWith('http://') || baseDomain.startsWith('https://')) {
		return baseDomain;
	}

	// Otherwise, add protocol based on domain
	if (baseDomain.includes('localhost') || baseDomain.includes('127.0.0.1')) {
		return `http://${baseDomain}`;
	}

	return `https://${baseDomain}`;
}

/**
 * Display the command header with appropriate storage information
 * Handles both API and file storage displays
 */
export function displayCommandHeader(
	tmCore: TaskMasterCore | undefined,
	options: {
		tag?: string;
		storageType: Exclude<StorageType, 'auto'>;
	}
): void {
	const { tag, storageType } = options;

	// Get brief info if using API storage
	let briefInfo: BriefInfo | undefined;
	if (storageType === 'api' && tmCore) {
		const storageInfo = tmCore.getStorageDisplayInfo();
		if (storageInfo) {
			// Construct full brief info with web app URL
			briefInfo = {
				...storageInfo,
				webAppUrl: getWebAppUrl()
			};
		}
	}

	// Get file path for display (only for file storage)
	// Note: The file structure is fixed for file storage and won't change.
	// This is a display-only relative path, not used for actual file operations.
	const filePath =
		storageType === 'file' && tmCore
			? `.taskmaster/tasks/tasks.json`
			: undefined;

	// Display header
	displayHeader({
		tag: tag || 'master',
		filePath: filePath,
		storageType: storageType === 'api' ? 'api' : 'file',
		briefInfo: briefInfo
	});
}
60
apps/cli/src/utils/error-handler.ts
Normal file
@@ -0,0 +1,60 @@
/**
 * @fileoverview Centralized error handling utilities for CLI
 * Provides consistent error formatting and debug mode detection
 */

import chalk from 'chalk';

/**
 * Check if debug mode is enabled via environment variable
 * Only returns true when DEBUG is explicitly set to 'true' or '1'
 *
 * @returns True if debug mode is enabled
 */
export function isDebugMode(): boolean {
	return process.env.DEBUG === 'true' || process.env.DEBUG === '1';
}

/**
 * Display an error to the user with optional stack trace in debug mode
 * Handles both TaskMasterError instances and regular errors
 *
 * @param error - The error to display
 * @param options - Display options
 */
export function displayError(
	error: any,
	options: {
		/** Skip exit, useful when caller wants to handle exit */
		skipExit?: boolean;
		/** Force show stack trace regardless of debug mode */
		forceStack?: boolean;
	} = {}
): void {
	// Check if it's a TaskMasterError with sanitized details
	if (error?.getSanitizedDetails) {
		const sanitized = error.getSanitizedDetails();
		console.error(chalk.red(`\n${sanitized.message}`));

		// Show stack trace in debug mode or if forced
		if ((isDebugMode() || options.forceStack) && error.stack) {
			console.error(chalk.gray('\nStack trace:'));
			console.error(chalk.gray(error.stack));
		}
	} else {
		// For other errors, show the message
		const message = error?.message ?? String(error);
		console.error(chalk.red(`\nError: ${message}`));

		// Show stack trace in debug mode or if forced
		if ((isDebugMode() || options.forceStack) && error?.stack) {
			console.error(chalk.gray('\nStack trace:'));
			console.error(chalk.gray(error.stack));
		}
	}

	// Exit if not skipped
	if (!options.skipExit) {
		process.exit(1);
	}
}
158
apps/cli/src/utils/ui.spec.ts
Normal file
@@ -0,0 +1,158 @@
|
||||
/**
|
||||
* CLI UI utilities tests
|
||||
* Tests for apps/cli/src/utils/ui.ts
|
||||
*/
|
||||
|
||||
import { describe, it, expect, beforeEach, afterEach, vi } from 'vitest';
|
||||
import type { MockInstance } from 'vitest';
|
||||
import { getBoxWidth } from './ui.js';
|
||||
|
||||
describe('CLI UI Utilities', () => {
|
||||
describe('getBoxWidth', () => {
|
||||
let columnsSpy: MockInstance;
|
||||
let originalDescriptor: PropertyDescriptor | undefined;
|
||||
|
||||
beforeEach(() => {
|
||||
// Store original descriptor if it exists
|
||||
originalDescriptor = Object.getOwnPropertyDescriptor(
|
||||
process.stdout,
|
||||
'columns'
|
||||
);
|
||||
|
||||
// If columns doesn't exist or isn't a getter, define it as one
|
||||
if (!originalDescriptor || !originalDescriptor.get) {
|
||||
const currentValue = process.stdout.columns || 80;
|
||||
Object.defineProperty(process.stdout, 'columns', {
|
||||
get() {
|
||||
return currentValue;
|
||||
},
|
||||
configurable: true
|
||||
});
|
||||
}
|
||||
|
||||
// Now spy on the getter
|
||||
columnsSpy = vi.spyOn(process.stdout, 'columns', 'get');
|
||||
});
|
||||
|
||||
afterEach(() => {
|
||||
// Restore the spy
|
||||
columnsSpy.mockRestore();
|
||||
|
||||
// Restore original descriptor or delete the property
|
||||
if (originalDescriptor) {
|
||||
Object.defineProperty(process.stdout, 'columns', originalDescriptor);
|
||||
} else {
|
||||
delete (process.stdout as any).columns;
|
||||
}
|
||||
});
|
||||
|
||||
it('should calculate width as percentage of terminal width', () => {
|
||||
columnsSpy.mockReturnValue(100);
|
||||
const width = getBoxWidth(0.9, 40);
|
||||
expect(width).toBe(90);
|
||||
});
|
||||
|
||||
it('should use default percentage of 0.9 when not specified', () => {
|
||||
columnsSpy.mockReturnValue(100);
|
||||
const width = getBoxWidth();
|
||||
expect(width).toBe(90);
|
||||
});
|
||||
|
||||
it('should use default minimum width of 40 when not specified', () => {
|
||||
columnsSpy.mockReturnValue(30);
|
||||
const width = getBoxWidth();
|
||||
expect(width).toBe(40); // Should enforce minimum
|
||||
});
|
||||
|
||||
it('should enforce minimum width when terminal is too narrow', () => {
|
||||
columnsSpy.mockReturnValue(50);
|
||||
const width = getBoxWidth(0.9, 60);
|
||||
expect(width).toBe(60); // Should use minWidth instead of 45
|
||||
});
|
||||
|
||||
it('should handle undefined process.stdout.columns', () => {
|
||||
columnsSpy.mockReturnValue(undefined);
|
||||
const width = getBoxWidth(0.9, 40);
|
||||
// Should fall back to 80 columns: Math.floor(80 * 0.9) = 72
|
||||
expect(width).toBe(72);
|
||||
});
|
||||
|
||||
it('should handle custom percentage values', () => {
|
||||
columnsSpy.mockReturnValue(100);
|
||||
expect(getBoxWidth(0.95, 40)).toBe(95);
|
||||
expect(getBoxWidth(0.8, 40)).toBe(80);
|
||||
expect(getBoxWidth(0.5, 40)).toBe(50);
|
||||
});
|
||||
|
||||
it('should handle custom minimum width values', () => {
|
||||
columnsSpy.mockReturnValue(60);
|
||||
expect(getBoxWidth(0.9, 70)).toBe(70); // 60 * 0.9 = 54, but min is 70
|
||||
expect(getBoxWidth(0.9, 50)).toBe(54); // 60 * 0.9 = 54, min is 50
|
||||
});
|
||||
|
||||
it('should floor the calculated width', () => {
|
||||
columnsSpy.mockReturnValue(99);
|
||||
const width = getBoxWidth(0.9, 40);
|
||||
// 99 * 0.9 = 89.1, should floor to 89
|
||||
expect(width).toBe(89);
|
||||
});
|
||||
|
||||
it('should match warning box width calculation', () => {
|
||||
// Test the specific case from displayWarning()
|
||||
columnsSpy.mockReturnValue(80);
|
||||
const width = getBoxWidth(0.9, 40);
|
||||
expect(width).toBe(72);
|
||||
});
|
||||
|
||||
it('should match table width calculation', () => {
|
||||
// Test the specific case from createTaskTable()
|
||||
columnsSpy.mockReturnValue(111);
|
||||
const width = getBoxWidth(0.9, 100);
|
||||
// 111 * 0.9 = 99.9, floor to 99, but max(99, 100) = 100
|
||||
expect(width).toBe(100);
|
||||
});
|
||||
|
||||
it('should match recommended task box width calculation', () => {
|
||||
// Test the specific case from displayRecommendedNextTask()
|
||||
columnsSpy.mockReturnValue(120);
|
||||
const width = getBoxWidth(0.97, 40);
|
||||
// 120 * 0.97 = 116.4, floor to 116
|
||||
expect(width).toBe(116);
|
||||
});
|
||||
|
||||
it('should handle edge case of zero terminal width', () => {
|
||||
columnsSpy.mockReturnValue(0);
|
||||
const width = getBoxWidth(0.9, 40);
|
||||
// When columns is 0, it uses fallback of 80: Math.floor(80 * 0.9) = 72
|
||||
expect(width).toBe(72);
|
||||
});
|
||||
|
||||
it('should handle very large terminal widths', () => {
|
||||
columnsSpy.mockReturnValue(1000);
|
||||
const width = getBoxWidth(0.9, 40);
|
||||
expect(width).toBe(900);
|
||||
});
|
||||
|
||||
it('should handle very small percentages', () => {
|
||||
columnsSpy.mockReturnValue(100);
|
||||
const width = getBoxWidth(0.1, 5);
|
||||
// 100 * 0.1 = 10, which is greater than min 5
|
||||
expect(width).toBe(10);
|
||||
});
|
||||
|
||||
it('should handle percentage of 1.0 (100%)', () => {
|
||||
columnsSpy.mockReturnValue(80);
|
||||
const width = getBoxWidth(1.0, 40);
|
||||
expect(width).toBe(80);
|
||||
});
|
||||
|
||||
it('should consistently return same value for same inputs', () => {
|
||||
columnsSpy.mockReturnValue(100);
|
||||
const width1 = getBoxWidth(0.9, 40);
|
||||
const width2 = getBoxWidth(0.9, 40);
|
||||
const width3 = getBoxWidth(0.9, 40);
|
||||
expect(width1).toBe(width2);
|
||||
expect(width2).toBe(width3);
|
||||
});
|
||||
});
|
||||
});
|
||||
@@ -126,6 +126,20 @@ export function getComplexityWithScore(complexity: number | undefined): string {
	return color(`${complexity}/10 (${label})`);
}

/**
 * Calculate box width as percentage of terminal width
 * @param percentage - Percentage of terminal width to use (default: 0.9)
 * @param minWidth - Minimum width to enforce (default: 40)
 * @returns Calculated box width
 */
export function getBoxWidth(
	percentage: number = 0.9,
	minWidth: number = 40
): number {
	const terminalWidth = process.stdout.columns || 80;
	return Math.max(Math.floor(terminalWidth * percentage), minWidth);
}

/**
 * Truncate text to specified length
 */
@@ -176,6 +190,8 @@ export function displayBanner(title: string = 'Task Master'): void {
|
||||
* Display an error message (matches scripts/modules/ui.js style)
|
||||
*/
|
||||
export function displayError(message: string, details?: string): void {
|
||||
const boxWidth = getBoxWidth();
|
||||
|
||||
console.error(
|
||||
boxen(
|
||||
chalk.red.bold('X Error: ') +
|
||||
@@ -184,7 +200,8 @@ export function displayError(message: string, details?: string): void {
|
||||
{
|
||||
padding: 1,
|
||||
borderStyle: 'round',
|
||||
borderColor: 'red'
|
||||
borderColor: 'red',
|
||||
width: boxWidth
|
||||
}
|
||||
)
|
||||
);
|
||||
@@ -194,13 +211,16 @@ export function displayError(message: string, details?: string): void {
|
||||
* Display a success message
|
||||
*/
|
||||
export function displaySuccess(message: string): void {
|
||||
const boxWidth = getBoxWidth();
|
||||
|
||||
console.log(
|
||||
boxen(
|
||||
chalk.green.bold(String.fromCharCode(8730) + ' ') + chalk.white(message),
|
||||
{
|
||||
padding: 1,
|
||||
borderStyle: 'round',
|
||||
borderColor: 'green'
|
||||
borderColor: 'green',
|
||||
width: boxWidth
|
||||
}
|
||||
)
|
||||
);
|
||||
@@ -210,11 +230,14 @@ export function displaySuccess(message: string): void {
|
||||
* Display a warning message
|
||||
*/
|
||||
export function displayWarning(message: string): void {
|
||||
const boxWidth = getBoxWidth();
|
||||
|
||||
console.log(
|
||||
boxen(chalk.yellow.bold('⚠ ') + chalk.white(message), {
|
||||
padding: 1,
|
||||
borderStyle: 'round',
|
||||
borderColor: 'yellow'
|
||||
borderColor: 'yellow',
|
||||
width: boxWidth
|
||||
})
|
||||
);
|
||||
}
|
||||
@@ -223,11 +246,14 @@ export function displayWarning(message: string): void {
|
||||
* Display info message
|
||||
*/
|
||||
export function displayInfo(message: string): void {
|
||||
const boxWidth = getBoxWidth();
|
||||
|
||||
console.log(
|
||||
boxen(chalk.blue.bold('i ') + chalk.white(message), {
|
||||
padding: 1,
|
||||
borderStyle: 'round',
|
||||
borderColor: 'blue'
|
||||
borderColor: 'blue',
|
||||
width: boxWidth
|
||||
})
|
||||
);
|
||||
}
|
||||
@@ -282,23 +308,23 @@ export function createTaskTable(
|
||||
} = options || {};
|
||||
|
||||
// Calculate dynamic column widths based on terminal width
|
||||
const terminalWidth = process.stdout.columns * 0.9 || 100;
|
||||
const tableWidth = getBoxWidth(0.9, 100);
|
||||
// Adjust column widths to better match the original layout
|
||||
const baseColWidths = showComplexity
|
||||
? [
|
||||
Math.floor(terminalWidth * 0.1),
|
||||
Math.floor(terminalWidth * 0.4),
|
||||
Math.floor(terminalWidth * 0.15),
|
||||
Math.floor(terminalWidth * 0.1),
|
||||
Math.floor(terminalWidth * 0.2),
|
||||
Math.floor(terminalWidth * 0.1)
|
||||
Math.floor(tableWidth * 0.1),
|
||||
Math.floor(tableWidth * 0.4),
|
||||
Math.floor(tableWidth * 0.15),
|
||||
Math.floor(tableWidth * 0.1),
|
||||
Math.floor(tableWidth * 0.2),
|
||||
Math.floor(tableWidth * 0.1)
|
||||
] // ID, Title, Status, Priority, Dependencies, Complexity
|
||||
: [
|
||||
Math.floor(terminalWidth * 0.08),
|
||||
Math.floor(terminalWidth * 0.4),
|
||||
Math.floor(terminalWidth * 0.18),
|
||||
Math.floor(terminalWidth * 0.12),
|
||||
Math.floor(terminalWidth * 0.2)
|
||||
Math.floor(tableWidth * 0.08),
|
||||
Math.floor(tableWidth * 0.4),
|
||||
Math.floor(tableWidth * 0.18),
|
||||
Math.floor(tableWidth * 0.12),
|
||||
Math.floor(tableWidth * 0.2)
|
||||
]; // ID, Title, Status, Priority, Dependencies
|
||||
|
||||
const headers = [
|
||||
|
||||
540
apps/cli/tests/integration/commands/autopilot/workflow.test.ts
Normal file
@@ -0,0 +1,540 @@
|
||||
/**
|
||||
* @fileoverview Integration tests for autopilot workflow commands
|
||||
*/
|
||||
|
||||
import { describe, it, expect, beforeEach, afterEach, vi } from 'vitest';
|
||||
import type { WorkflowState } from '@tm/core';
|
||||
|
||||
// Track file system state in memory - must be in vi.hoisted() for mock access
|
||||
const {
|
||||
mockFileSystem,
|
||||
pathExistsFn,
|
||||
readJSONFn,
|
||||
writeJSONFn,
|
||||
ensureDirFn,
|
||||
removeFn
|
||||
} = vi.hoisted(() => {
|
||||
const mockFileSystem = new Map<string, string>();
|
||||
|
||||
return {
|
||||
mockFileSystem,
|
||||
pathExistsFn: vi.fn((path: string) =>
|
||||
Promise.resolve(mockFileSystem.has(path))
|
||||
),
|
||||
readJSONFn: vi.fn((path: string) => {
|
||||
const data = mockFileSystem.get(path);
|
||||
return data
|
||||
? Promise.resolve(JSON.parse(data))
|
||||
: Promise.reject(new Error('File not found'));
|
||||
}),
|
||||
writeJSONFn: vi.fn((path: string, data: any) => {
|
||||
mockFileSystem.set(path, JSON.stringify(data));
|
||||
return Promise.resolve();
|
||||
}),
|
||||
ensureDirFn: vi.fn(() => Promise.resolve()),
|
||||
removeFn: vi.fn((path: string) => {
|
||||
mockFileSystem.delete(path);
|
||||
return Promise.resolve();
|
||||
})
|
||||
};
|
||||
});
|
||||
|
||||
// Mock fs-extra before any imports
|
||||
vi.mock('fs-extra', () => ({
|
||||
default: {
|
||||
pathExists: pathExistsFn,
|
||||
readJSON: readJSONFn,
|
||||
writeJSON: writeJSONFn,
|
||||
ensureDir: ensureDirFn,
|
||||
remove: removeFn
|
||||
}
|
||||
}));
|
||||
|
||||
vi.mock('@tm/core', () => ({
|
||||
WorkflowOrchestrator: vi.fn().mockImplementation((context) => ({
|
||||
getCurrentPhase: vi.fn().mockReturnValue('SUBTASK_LOOP'),
|
||||
getCurrentTDDPhase: vi.fn().mockReturnValue('RED'),
|
||||
getContext: vi.fn().mockReturnValue(context),
|
||||
transition: vi.fn(),
|
||||
restoreState: vi.fn(),
|
||||
getState: vi.fn().mockReturnValue({ phase: 'SUBTASK_LOOP', context }),
|
||||
enableAutoPersist: vi.fn(),
|
||||
canResumeFromState: vi.fn().mockReturnValue(true),
|
||||
getCurrentSubtask: vi.fn().mockReturnValue({
|
||||
id: '1',
|
||||
title: 'Test Subtask',
|
||||
status: 'pending',
|
||||
attempts: 0
|
||||
}),
|
||||
getProgress: vi.fn().mockReturnValue({
|
||||
completed: 0,
|
||||
total: 3,
|
||||
current: 1,
|
||||
percentage: 0
|
||||
}),
|
||||
canProceed: vi.fn().mockReturnValue(false)
|
||||
})),
|
||||
GitAdapter: vi.fn().mockImplementation(() => ({
|
||||
ensureGitRepository: vi.fn().mockResolvedValue(undefined),
|
||||
ensureCleanWorkingTree: vi.fn().mockResolvedValue(undefined),
|
||||
createAndCheckoutBranch: vi.fn().mockResolvedValue(undefined),
|
||||
hasStagedChanges: vi.fn().mockResolvedValue(true),
|
||||
getStatus: vi.fn().mockResolvedValue({
|
||||
staged: ['file1.ts'],
|
||||
modified: ['file2.ts']
|
||||
}),
|
||||
createCommit: vi.fn().mockResolvedValue(undefined),
|
||||
getLastCommit: vi.fn().mockResolvedValue({
|
||||
hash: 'abc123def456',
|
||||
message: 'test commit'
|
||||
}),
|
||||
stageFiles: vi.fn().mockResolvedValue(undefined)
|
||||
})),
|
||||
CommitMessageGenerator: vi.fn().mockImplementation(() => ({
|
||||
generateMessage: vi.fn().mockReturnValue('feat: test commit message')
|
||||
})),
|
||||
createTaskMasterCore: vi.fn().mockResolvedValue({
|
||||
getTaskWithSubtask: vi.fn().mockResolvedValue({
|
||||
task: {
|
||||
id: '1',
|
||||
title: 'Test Task',
|
||||
subtasks: [
|
||||
{ id: '1', title: 'Subtask 1', status: 'pending' },
|
||||
{ id: '2', title: 'Subtask 2', status: 'pending' },
|
||||
{ id: '3', title: 'Subtask 3', status: 'pending' }
|
||||
],
|
||||
tag: 'test'
|
||||
}
|
||||
}),
|
||||
close: vi.fn().mockResolvedValue(undefined)
|
||||
})
|
||||
}));
|
||||
|
||||
// Import after mocks are set up
|
||||
import { Command } from 'commander';
|
||||
import { AutopilotCommand } from '../../../../src/commands/autopilot/index.js';
|
||||
|
||||
describe('Autopilot Workflow Integration Tests', () => {
|
||||
const projectRoot = '/test/project';
|
||||
let program: Command;
|
||||
|
||||
beforeEach(() => {
|
||||
mockFileSystem.clear();
|
||||
|
||||
// Clear mock call history
|
||||
pathExistsFn.mockClear();
|
||||
readJSONFn.mockClear();
|
||||
writeJSONFn.mockClear();
|
||||
ensureDirFn.mockClear();
|
||||
removeFn.mockClear();
|
||||
|
||||
program = new Command();
|
||||
AutopilotCommand.register(program);
|
||||
|
||||
// Use exitOverride to handle Commander exits in tests
|
||||
program.exitOverride();
|
||||
});
|
||||
|
||||
afterEach(() => {
|
||||
mockFileSystem.clear();
|
||||
vi.restoreAllMocks();
|
||||
});
|
||||
|
||||
describe('start command', () => {
|
||||
it('should initialize workflow and create branch', async () => {
|
||||
const consoleLogSpy = vi
|
||||
.spyOn(console, 'log')
|
||||
.mockImplementation(() => {});
|
||||
|
||||
await program.parseAsync([
|
||||
'node',
|
||||
'test',
|
||||
'autopilot',
|
||||
'start',
|
||||
'1',
|
||||
'--project-root',
|
||||
projectRoot,
|
||||
'--json'
|
||||
]);
|
||||
|
||||
// Verify writeJSON was called with state
|
||||
expect(writeJSONFn).toHaveBeenCalledWith(
|
||||
expect.stringContaining('workflow-state.json'),
|
||||
expect.objectContaining({
|
||||
phase: expect.any(String),
|
||||
context: expect.any(Object)
|
||||
}),
|
||||
expect.any(Object)
|
||||
);
|
||||
|
||||
consoleLogSpy.mockRestore();
|
||||
});
|
||||
|
||||
it('should reject invalid task ID', async () => {
|
||||
const consoleErrorSpy = vi
|
||||
.spyOn(console, 'error')
|
||||
.mockImplementation(() => {});
|
||||
|
||||
await expect(
|
||||
program.parseAsync([
|
||||
'node',
|
||||
'test',
|
||||
'autopilot',
|
||||
'start',
|
||||
'invalid',
|
||||
'--project-root',
|
||||
projectRoot,
|
||||
'--json'
|
||||
])
|
||||
).rejects.toMatchObject({ exitCode: 1 });
|
||||
|
||||
consoleErrorSpy.mockRestore();
|
||||
});
|
||||
|
||||
it('should reject starting when workflow exists without force', async () => {
|
||||
// Create existing state
|
||||
const mockState: WorkflowState = {
|
||||
phase: 'SUBTASK_LOOP',
|
||||
context: {
|
||||
taskId: '1',
|
||||
subtasks: [],
|
||||
currentSubtaskIndex: 0,
|
||||
errors: [],
|
||||
metadata: {}
|
||||
}
|
||||
};
|
||||
|
||||
mockFileSystem.set(
|
||||
`${projectRoot}/.taskmaster/workflow-state.json`,
|
||||
JSON.stringify(mockState)
|
||||
);
|
||||
|
||||
const consoleErrorSpy = vi
|
||||
.spyOn(console, 'error')
|
||||
.mockImplementation(() => {});
|
||||
|
||||
await expect(
|
||||
program.parseAsync([
|
||||
'node',
|
||||
'test',
|
||||
'autopilot',
|
||||
'start',
|
||||
'1',
|
||||
'--project-root',
|
||||
projectRoot,
|
||||
'--json'
|
||||
])
|
||||
).rejects.toMatchObject({ exitCode: 1 });
|
||||
|
||||
consoleErrorSpy.mockRestore();
|
||||
});
|
||||
});
|
||||
|
||||
describe('resume command', () => {
|
||||
beforeEach(() => {
|
||||
// Create saved state
|
||||
const mockState: WorkflowState = {
|
||||
phase: 'SUBTASK_LOOP',
|
||||
context: {
|
||||
taskId: '1',
|
||||
subtasks: [
|
||||
{
|
||||
id: '1',
|
||||
title: 'Test Subtask',
|
||||
status: 'pending',
|
||||
attempts: 0
|
||||
}
|
||||
],
|
||||
currentSubtaskIndex: 0,
|
||||
currentTDDPhase: 'RED',
|
||||
branchName: 'task-1',
|
||||
errors: [],
|
||||
metadata: {}
|
||||
}
|
||||
};
|
||||
|
||||
mockFileSystem.set(
|
||||
`${projectRoot}/.taskmaster/workflow-state.json`,
|
||||
JSON.stringify(mockState)
|
||||
);
|
||||
});
|
||||
|
||||
it('should restore workflow from saved state', async () => {
|
||||
const consoleLogSpy = vi
|
||||
.spyOn(console, 'log')
|
||||
.mockImplementation(() => {});
|
||||
|
||||
await program.parseAsync([
|
||||
'node',
|
||||
'test',
|
||||
'autopilot',
|
||||
'resume',
|
||||
'--project-root',
|
||||
projectRoot,
|
||||
'--json'
|
||||
]);
|
||||
|
||||
expect(consoleLogSpy).toHaveBeenCalled();
|
||||
const output = JSON.parse(consoleLogSpy.mock.calls[0][0]);
|
||||
expect(output.success).toBe(true);
|
||||
expect(output.taskId).toBe('1');
|
||||
|
||||
consoleLogSpy.mockRestore();
|
||||
});
|
||||
|
||||
it('should error when no state exists', async () => {
|
||||
mockFileSystem.clear();
|
||||
|
||||
const consoleErrorSpy = vi
|
||||
.spyOn(console, 'error')
|
||||
.mockImplementation(() => {});
|
||||
|
||||
await expect(
|
||||
program.parseAsync([
|
||||
'node',
|
||||
'test',
|
||||
'autopilot',
|
||||
'resume',
|
||||
'--project-root',
|
||||
projectRoot,
|
||||
'--json'
|
||||
])
|
||||
).rejects.toMatchObject({ exitCode: 1 });
|
||||
|
||||
consoleErrorSpy.mockRestore();
|
||||
});
|
||||
});
|
||||
|
||||
describe('next command', () => {
|
||||
beforeEach(() => {
|
||||
const mockState: WorkflowState = {
|
||||
phase: 'SUBTASK_LOOP',
|
||||
context: {
|
||||
taskId: '1',
|
||||
subtasks: [
|
||||
{
|
||||
id: '1',
|
||||
title: 'Test Subtask',
|
||||
status: 'pending',
|
||||
attempts: 0
|
||||
}
|
||||
],
|
||||
currentSubtaskIndex: 0,
|
||||
currentTDDPhase: 'RED',
|
||||
branchName: 'task-1',
|
||||
errors: [],
|
||||
metadata: {}
|
||||
}
|
||||
};
|
||||
|
||||
mockFileSystem.set(
|
||||
`${projectRoot}/.taskmaster/workflow-state.json`,
|
||||
JSON.stringify(mockState)
|
||||
);
|
||||
});
|
||||
|
||||
it('should return next action in JSON format', async () => {
|
||||
const consoleLogSpy = vi
|
||||
.spyOn(console, 'log')
|
||||
.mockImplementation(() => {});
|
||||
|
||||
await program.parseAsync([
|
||||
'node',
|
||||
'test',
|
||||
'autopilot',
|
||||
'next',
|
||||
'--project-root',
|
||||
projectRoot,
|
||||
'--json'
|
||||
]);
|
||||
|
||||
expect(consoleLogSpy).toHaveBeenCalled();
|
||||
const output = JSON.parse(consoleLogSpy.mock.calls[0][0]);
|
||||
expect(output.action).toBe('generate_test');
|
||||
expect(output.phase).toBe('SUBTASK_LOOP');
|
||||
expect(output.tddPhase).toBe('RED');
|
||||
|
||||
consoleLogSpy.mockRestore();
|
||||
});
|
||||
});
|
||||
|
||||
describe('status command', () => {
|
||||
beforeEach(() => {
|
||||
const mockState: WorkflowState = {
|
||||
phase: 'SUBTASK_LOOP',
|
||||
context: {
|
||||
taskId: '1',
|
||||
subtasks: [
|
||||
{ id: '1', title: 'Subtask 1', status: 'completed', attempts: 1 },
|
||||
{ id: '2', title: 'Subtask 2', status: 'pending', attempts: 0 },
|
||||
{ id: '3', title: 'Subtask 3', status: 'pending', attempts: 0 }
|
||||
],
|
||||
currentSubtaskIndex: 1,
|
||||
currentTDDPhase: 'RED',
|
||||
branchName: 'task-1',
|
||||
errors: [],
|
||||
metadata: {}
|
||||
}
|
||||
};
|
||||
|
||||
mockFileSystem.set(
|
||||
`${projectRoot}/.taskmaster/workflow-state.json`,
|
||||
JSON.stringify(mockState)
|
||||
);
|
||||
});
|
||||
|
||||
it('should display workflow progress', async () => {
|
||||
const consoleLogSpy = vi
|
||||
.spyOn(console, 'log')
|
||||
.mockImplementation(() => {});
|
||||
|
||||
await program.parseAsync([
|
||||
'node',
|
||||
'test',
|
||||
'autopilot',
|
||||
'status',
|
||||
'--project-root',
|
||||
projectRoot,
|
||||
'--json'
|
||||
]);
|
||||
|
||||
expect(consoleLogSpy).toHaveBeenCalled();
|
||||
const output = JSON.parse(consoleLogSpy.mock.calls[0][0]);
|
||||
expect(output.taskId).toBe('1');
|
||||
expect(output.phase).toBe('SUBTASK_LOOP');
|
||||
expect(output.progress).toBeDefined();
|
||||
expect(output.subtasks).toHaveLength(3);
|
||||
|
||||
consoleLogSpy.mockRestore();
|
||||
});
|
||||
});
|
||||
|
||||
describe('complete command', () => {
|
||||
beforeEach(() => {
|
||||
const mockState: WorkflowState = {
|
||||
phase: 'SUBTASK_LOOP',
|
||||
context: {
|
||||
taskId: '1',
|
||||
subtasks: [
|
||||
{
|
||||
id: '1',
|
||||
title: 'Test Subtask',
|
||||
status: 'in-progress',
|
||||
attempts: 0
|
||||
}
|
||||
],
|
||||
currentSubtaskIndex: 0,
|
||||
currentTDDPhase: 'RED',
|
||||
branchName: 'task-1',
|
||||
errors: [],
|
||||
metadata: {}
|
||||
}
|
||||
};
|
||||
|
||||
mockFileSystem.set(
|
||||
`${projectRoot}/.taskmaster/workflow-state.json`,
|
||||
JSON.stringify(mockState)
|
||||
);
|
||||
});
|
||||
|
||||
it('should validate RED phase has failures', async () => {
|
||||
const consoleErrorSpy = vi
|
||||
.spyOn(console, 'error')
|
||||
.mockImplementation(() => {});
|
||||
|
||||
await expect(
|
||||
program.parseAsync([
|
||||
'node',
|
||||
'test',
|
||||
'autopilot',
|
||||
'complete',
|
||||
'--project-root',
|
||||
projectRoot,
|
||||
'--results',
|
||||
'{"total":10,"passed":10,"failed":0,"skipped":0}',
|
||||
'--json'
|
||||
])
|
||||
).rejects.toMatchObject({ exitCode: 1 });
|
||||
|
||||
consoleErrorSpy.mockRestore();
|
||||
});
|
||||
|
||||
it('should complete RED phase with failures', async () => {
|
||||
const consoleLogSpy = vi
|
||||
.spyOn(console, 'log')
|
||||
.mockImplementation(() => {});
|
||||
|
||||
await program.parseAsync([
|
||||
'node',
|
||||
'test',
|
||||
'autopilot',
|
||||
'complete',
|
||||
'--project-root',
|
||||
projectRoot,
|
||||
'--results',
|
||||
'{"total":10,"passed":9,"failed":1,"skipped":0}',
|
||||
'--json'
|
||||
]);
|
||||
|
||||
expect(consoleLogSpy).toHaveBeenCalled();
|
||||
const output = JSON.parse(consoleLogSpy.mock.calls[0][0]);
|
||||
expect(output.success).toBe(true);
|
||||
expect(output.nextPhase).toBe('GREEN');
|
||||
|
||||
consoleLogSpy.mockRestore();
|
||||
});
|
||||
});
|
||||
|
||||
describe('abort command', () => {
|
||||
beforeEach(() => {
|
||||
const mockState: WorkflowState = {
|
||||
phase: 'SUBTASK_LOOP',
|
||||
context: {
|
||||
taskId: '1',
|
||||
subtasks: [
|
||||
{
|
||||
id: '1',
|
||||
title: 'Test Subtask',
|
||||
status: 'pending',
|
||||
attempts: 0
|
||||
}
|
||||
],
|
||||
currentSubtaskIndex: 0,
|
||||
currentTDDPhase: 'RED',
|
||||
branchName: 'task-1',
|
||||
errors: [],
|
||||
metadata: {}
|
||||
}
|
||||
};
|
||||
|
||||
mockFileSystem.set(
|
||||
`${projectRoot}/.taskmaster/workflow-state.json`,
|
||||
JSON.stringify(mockState)
|
||||
);
|
||||
});
|
||||
|
||||
it('should abort workflow and delete state', async () => {
|
||||
const consoleLogSpy = vi
|
||||
.spyOn(console, 'log')
|
||||
.mockImplementation(() => {});
|
||||
|
||||
await program.parseAsync([
|
||||
'node',
|
||||
'test',
|
||||
'autopilot',
|
||||
'abort',
|
||||
'--project-root',
|
||||
projectRoot,
|
||||
'--force',
|
||||
'--json'
|
||||
]);
|
||||
|
||||
// Verify remove was called
|
||||
expect(removeFn).toHaveBeenCalledWith(
|
||||
expect.stringContaining('workflow-state.json')
|
||||
);
|
||||
|
||||
consoleLogSpy.mockRestore();
|
||||
});
|
||||
});
|
||||
});
|
||||
202
apps/cli/tests/unit/commands/autopilot/shared.test.ts
Normal file
@@ -0,0 +1,202 @@
|
||||
/**
|
||||
* @fileoverview Unit tests for autopilot shared utilities
|
||||
*/
|
||||
|
||||
import { describe, it, expect, beforeEach, afterEach, vi } from 'vitest';
|
||||
import {
|
||||
validateTaskId,
|
||||
parseSubtasks,
|
||||
OutputFormatter
|
||||
} from '../../../../src/commands/autopilot/shared.js';
|
||||
|
||||
// Mock fs-extra
|
||||
vi.mock('fs-extra', () => ({
|
||||
default: {
|
||||
pathExists: vi.fn(),
|
||||
readJSON: vi.fn(),
|
||||
writeJSON: vi.fn(),
|
||||
ensureDir: vi.fn(),
|
||||
remove: vi.fn()
|
||||
},
|
||||
pathExists: vi.fn(),
|
||||
readJSON: vi.fn(),
|
||||
writeJSON: vi.fn(),
|
||||
ensureDir: vi.fn(),
|
||||
remove: vi.fn()
|
||||
}));
|
||||
|
||||
describe('Autopilot Shared Utilities', () => {
|
||||
const projectRoot = '/test/project';
|
||||
const statePath = `${projectRoot}/.taskmaster/workflow-state.json`;
|
||||
|
||||
beforeEach(() => {
|
||||
vi.clearAllMocks();
|
||||
});
|
||||
|
||||
afterEach(() => {
|
||||
vi.restoreAllMocks();
|
||||
});
|
||||
|
||||
describe('validateTaskId', () => {
|
||||
it('should validate simple task IDs', () => {
|
||||
expect(validateTaskId('1')).toBe(true);
|
||||
expect(validateTaskId('10')).toBe(true);
|
||||
expect(validateTaskId('999')).toBe(true);
|
||||
});
|
||||
|
||||
it('should validate subtask IDs', () => {
|
||||
expect(validateTaskId('1.1')).toBe(true);
|
||||
expect(validateTaskId('1.2')).toBe(true);
|
||||
expect(validateTaskId('10.5')).toBe(true);
|
||||
});
|
||||
|
||||
it('should validate nested subtask IDs', () => {
|
||||
expect(validateTaskId('1.1.1')).toBe(true);
|
||||
expect(validateTaskId('1.2.3')).toBe(true);
|
||||
});
|
||||
|
||||
it('should reject invalid formats', () => {
|
||||
expect(validateTaskId('')).toBe(false);
|
||||
expect(validateTaskId('abc')).toBe(false);
|
||||
expect(validateTaskId('1.')).toBe(false);
|
||||
expect(validateTaskId('.1')).toBe(false);
|
||||
expect(validateTaskId('1..2')).toBe(false);
|
||||
expect(validateTaskId('1.2.3.')).toBe(false);
|
||||
});
|
||||
});
|
||||
|
||||
describe('parseSubtasks', () => {
|
||||
it('should parse subtasks from task data', () => {
|
||||
const task = {
|
||||
id: '1',
|
||||
title: 'Test Task',
|
||||
subtasks: [
|
||||
{ id: '1', title: 'Subtask 1', status: 'pending' },
|
||||
{ id: '2', title: 'Subtask 2', status: 'done' },
|
||||
{ id: '3', title: 'Subtask 3', status: 'in-progress' }
|
||||
]
|
||||
};
|
||||
|
||||
const result = parseSubtasks(task, 5);
|
||||
|
||||
expect(result).toHaveLength(3);
|
||||
expect(result[0]).toEqual({
|
||||
id: '1',
|
||||
title: 'Subtask 1',
|
||||
status: 'pending',
|
||||
attempts: 0,
|
||||
maxAttempts: 5
|
||||
});
|
||||
expect(result[1]).toEqual({
|
||||
id: '2',
|
||||
title: 'Subtask 2',
|
||||
status: 'completed',
|
||||
attempts: 0,
|
||||
maxAttempts: 5
|
||||
});
|
||||
});
|
||||
|
||||
it('should return empty array for missing subtasks', () => {
|
||||
const task = { id: '1', title: 'Test Task' };
|
||||
expect(parseSubtasks(task)).toEqual([]);
|
||||
});
|
||||
|
||||
it('should use default maxAttempts', () => {
|
||||
const task = {
|
||||
subtasks: [{ id: '1', title: 'Subtask 1', status: 'pending' }]
|
||||
};
|
||||
|
||||
const result = parseSubtasks(task);
|
||||
expect(result[0].maxAttempts).toBe(3);
|
||||
});
|
||||
});
|
||||
|
||||
// State persistence tests omitted - covered in integration tests
|
||||
|
||||
describe('OutputFormatter', () => {
|
||||
let consoleLogSpy: any;
|
||||
let consoleErrorSpy: any;
|
||||
|
||||
beforeEach(() => {
|
||||
consoleLogSpy = vi.spyOn(console, 'log').mockImplementation(() => {});
|
||||
consoleErrorSpy = vi.spyOn(console, 'error').mockImplementation(() => {});
|
||||
});
|
||||
|
||||
afterEach(() => {
|
||||
consoleLogSpy.mockRestore();
|
||||
consoleErrorSpy.mockRestore();
|
||||
});
|
||||
|
||||
describe('JSON mode', () => {
|
||||
it('should output JSON for success', () => {
|
||||
const formatter = new OutputFormatter(true);
|
||||
formatter.success('Test message', { key: 'value' });
|
||||
|
||||
expect(consoleLogSpy).toHaveBeenCalled();
|
||||
const output = JSON.parse(consoleLogSpy.mock.calls[0][0]);
|
||||
expect(output.success).toBe(true);
|
||||
expect(output.message).toBe('Test message');
|
||||
expect(output.key).toBe('value');
|
||||
});
|
||||
|
||||
it('should output JSON for error', () => {
|
||||
const formatter = new OutputFormatter(true);
|
||||
formatter.error('Error message', { code: 'ERR001' });
|
||||
|
||||
expect(consoleErrorSpy).toHaveBeenCalled();
|
||||
const output = JSON.parse(consoleErrorSpy.mock.calls[0][0]);
|
||||
expect(output.error).toBe('Error message');
|
||||
expect(output.code).toBe('ERR001');
|
||||
});
|
||||
|
||||
it('should output JSON for data', () => {
|
||||
const formatter = new OutputFormatter(true);
|
||||
formatter.output({ test: 'data' });
|
||||
|
||||
expect(consoleLogSpy).toHaveBeenCalled();
|
||||
const output = JSON.parse(consoleLogSpy.mock.calls[0][0]);
|
||||
expect(output.test).toBe('data');
|
||||
});
|
||||
});
|
||||
|
||||
describe('Text mode', () => {
|
||||
it('should output formatted text for success', () => {
|
||||
const formatter = new OutputFormatter(false);
|
||||
formatter.success('Test message');
|
||||
|
||||
expect(consoleLogSpy).toHaveBeenCalledWith(
|
||||
expect.stringContaining('✓ Test message')
|
||||
);
|
||||
});
|
||||
|
||||
it('should output formatted text for error', () => {
|
||||
const formatter = new OutputFormatter(false);
|
||||
formatter.error('Error message');
|
||||
|
||||
expect(consoleErrorSpy).toHaveBeenCalledWith(
|
||||
expect.stringContaining('Error: Error message')
|
||||
);
|
||||
});
|
||||
|
||||
it('should output formatted text for warning', () => {
|
||||
const consoleWarnSpy = vi
|
||||
.spyOn(console, 'warn')
|
||||
.mockImplementation(() => {});
|
||||
const formatter = new OutputFormatter(false);
|
||||
formatter.warning('Warning message');
|
||||
|
||||
expect(consoleWarnSpy).toHaveBeenCalledWith(
|
||||
expect.stringContaining('⚠ Warning message')
|
||||
);
|
||||
consoleWarnSpy.mockRestore();
|
||||
});
|
||||
|
||||
it('should not output info in JSON mode', () => {
|
||||
const formatter = new OutputFormatter(true);
|
||||
formatter.info('Info message');
|
||||
|
||||
expect(consoleLogSpy).not.toHaveBeenCalled();
|
||||
});
|
||||
});
|
||||
});
|
||||
});
|
||||
25
apps/cli/vitest.config.ts
Normal file
@@ -0,0 +1,25 @@
|
||||
import { defineConfig } from 'vitest/config';
|
||||
|
||||
export default defineConfig({
|
||||
test: {
|
||||
globals: true,
|
||||
environment: 'node',
|
||||
include: ['tests/**/*.test.ts', 'tests/**/*.spec.ts'],
|
||||
coverage: {
|
||||
provider: 'v8',
|
||||
reporter: ['text', 'json', 'html'],
|
||||
include: ['src/**/*.ts'],
|
||||
exclude: [
|
||||
'node_modules/',
|
||||
'dist/',
|
||||
'tests/',
|
||||
'**/*.test.ts',
|
||||
'**/*.spec.ts',
|
||||
'**/*.d.ts',
|
||||
'**/mocks/**',
|
||||
'**/fixtures/**',
|
||||
'vitest.config.ts'
|
||||
]
|
||||
}
|
||||
}
|
||||
});
|
||||
@@ -1,5 +1,7 @@
|
||||
# docs
|
||||
|
||||
## 0.0.6
|
||||
|
||||
## 0.0.5
|
||||
|
||||
## 0.0.4
|
||||
|
||||
@@ -13,6 +13,126 @@ The MCP interface is built on top of the `fastmcp` library and registers a set o
|
||||
|
||||
Each tool is defined with a name, a description, and a set of parameters that are validated using the `zod` library. The `execute` function of each tool calls the corresponding core logic function from `scripts/modules/task-manager.js`.
|
||||
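As a rough illustration of that shape (the tool name and body below are hypothetical; the real registrations live in `mcp-server/src/tools/` and `apps/mcp/src/tools/`), a tool registration looks like this:

```typescript
import { z } from 'zod/v3'; // Draft-07 compatible schemas for MCP clients
import type { FastMCP } from 'fastmcp';

export function registerExampleTool(server: FastMCP) {
  server.addTool({
    name: 'example_tool',
    description: 'Illustrates the name / description / parameters / execute shape',
    parameters: z.object({
      projectRoot: z
        .string()
        .describe('Absolute path to the project root directory')
    }),
    execute: async (args, context) => {
      context.log.info(`Running example_tool in ${args.projectRoot}`);
      // A real tool calls the corresponding core logic function here,
      // then returns MCP text content describing the result.
      return {
        content: [{ type: 'text', text: 'example_tool completed' }]
      };
    }
  });
}
```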
|
||||
## Configurable Tool Loading
|
||||
|
||||
To optimize LLM context usage, you can control which Task Master MCP tools are loaded using the `TASK_MASTER_TOOLS` environment variable. This is particularly useful when working with LLMs that have context limits or when you only need a subset of tools.
|
||||
|
||||
### Configuration Modes
|
||||
|
||||
#### All Tools (Default)
|
||||
Loads all 36 available tools. Use when you need full Task Master functionality.
|
||||
|
||||
```json
|
||||
{
|
||||
"mcpServers": {
|
||||
"task-master-ai": {
|
||||
"command": "npx",
|
||||
"args": ["-y", "task-master-ai"],
|
||||
"env": {
|
||||
"TASK_MASTER_TOOLS": "all",
|
||||
"ANTHROPIC_API_KEY": "your_key_here"
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
If `TASK_MASTER_TOOLS` is not set, all tools are loaded by default.
|
||||
|
||||
#### Core Tools (Lean Mode)
|
||||
Loads only 7 essential tools for daily development. Ideal for minimal context usage.
|
||||
|
||||
**Core tools included:**
|
||||
- `get_tasks` - List all tasks
|
||||
- `next_task` - Find the next task to work on
|
||||
- `get_task` - Get detailed task information
|
||||
- `set_task_status` - Update task status
|
||||
- `update_subtask` - Add implementation notes
|
||||
- `parse_prd` - Generate tasks from PRD
|
||||
- `expand_task` - Break down tasks into subtasks
|
||||
|
||||
```json
|
||||
{
|
||||
"mcpServers": {
|
||||
"task-master-ai": {
|
||||
"command": "npx",
|
||||
"args": ["-y", "task-master-ai"],
|
||||
"env": {
|
||||
"TASK_MASTER_TOOLS": "core",
|
||||
"ANTHROPIC_API_KEY": "your_key_here"
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
You can also use `"lean"` as an alias for `"core"`.
|
||||
|
||||
#### Standard Tools
|
||||
Loads 15 commonly used tools. Balances functionality with context efficiency.
|
||||
|
||||
**Standard tools include all core tools plus:**
|
||||
- `initialize_project` - Set up new projects
|
||||
- `analyze_project_complexity` - Analyze task complexity
|
||||
- `expand_all` - Expand all eligible tasks
|
||||
- `add_subtask` - Add subtasks manually
|
||||
- `remove_task` - Remove tasks
|
||||
- `generate` - Generate task markdown files
|
||||
- `add_task` - Create new tasks
|
||||
- `complexity_report` - View complexity analysis
|
||||
|
||||
```json
|
||||
{
|
||||
"mcpServers": {
|
||||
"task-master-ai": {
|
||||
"command": "npx",
|
||||
"args": ["-y", "task-master-ai"],
|
||||
"env": {
|
||||
"TASK_MASTER_TOOLS": "standard",
|
||||
"ANTHROPIC_API_KEY": "your_key_here"
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
#### Custom Tool Selection
|
||||
Specify exactly which tools to load using a comma-separated list. Tool names are case-insensitive and support both underscores and hyphens.
|
||||
|
||||
```json
|
||||
{
|
||||
"mcpServers": {
|
||||
"task-master-ai": {
|
||||
"command": "npx",
|
||||
"args": ["-y", "task-master-ai"],
|
||||
"env": {
|
||||
"TASK_MASTER_TOOLS": "get_tasks,next_task,set_task_status,update_subtask",
|
||||
"ANTHROPIC_API_KEY": "your_key_here"
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
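As a sketch of how such a list could be interpreted (the helper below is illustrative, not the actual implementation), case and separator differences can be normalized before matching tool names:

```typescript
// Illustrative only: normalize a comma-separated TASK_MASTER_TOOLS value
// so that "Get-Tasks" and "get_tasks" select the same tool.
function parseToolSelection(value: string): string[] {
  return value
    .split(',')
    .map((name) => name.trim().toLowerCase().replace(/-/g, '_'))
    .filter((name) => name.length > 0);
}

// parseToolSelection('get_tasks, Next-Task, set_task_status')
// → ['get_tasks', 'next_task', 'set_task_status']
```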
|
||||
### Choosing the Right Configuration
|
||||
|
||||
- **Use `core`/`lean`**: When working with basic task management workflows or when context limits are strict
|
||||
- **Use `standard`**: For most development workflows that include task creation and analysis
|
||||
- **Use `all`**: When you need full functionality including tag management, dependencies, and advanced features
|
||||
- **Use custom list**: When you have specific tool requirements or want to experiment with minimal sets
|
||||
|
||||
### Verification
|
||||
|
||||
When the MCP server starts, it logs which tools were loaded:
|
||||
|
||||
```
|
||||
Task Master MCP Server starting...
|
||||
Tool mode configuration: standard
|
||||
Loading standard tools
|
||||
Registering 15 MCP tools (mode: standard)
|
||||
Successfully registered 15/15 tools
|
||||
```
|
||||
|
||||
## Tool Categories
|
||||
|
||||
The MCP tools can be categorized in the same way as the core functionalities:
|
||||
|
||||
235
apps/docs/command-reference.mdx
Normal file
@@ -0,0 +1,235 @@
|
||||
---
|
||||
title: "Task Master Commands"
|
||||
description: "A comprehensive reference of all available Task Master commands"
|
||||
---
|
||||
|
||||
<AccordionGroup>
|
||||
<Accordion title="Parse PRD">
|
||||
```bash
|
||||
# Parse a PRD file and generate tasks
|
||||
task-master parse-prd <prd-file.txt>
|
||||
|
||||
# Limit the number of tasks generated
|
||||
task-master parse-prd <prd-file.txt> --num-tasks=10
|
||||
```
|
||||
</Accordion>
|
||||
|
||||
<Accordion title="List Tasks">
|
||||
```bash
|
||||
# List all tasks
|
||||
task-master list
|
||||
|
||||
# List tasks with a specific status
|
||||
task-master list --status=<status>
|
||||
|
||||
# List tasks with subtasks
|
||||
task-master list --with-subtasks
|
||||
|
||||
# List tasks with a specific status and include subtasks
|
||||
task-master list --status=<status> --with-subtasks
|
||||
```
|
||||
</Accordion>
|
||||
|
||||
<Accordion title="Show Next Task">
|
||||
```bash
|
||||
# Show the next task to work on based on dependencies and status
|
||||
task-master next
|
||||
```
|
||||
</Accordion>
|
||||
|
||||
<Accordion title="Show Specific Task">
|
||||
```bash
|
||||
# Show details of a specific task
|
||||
task-master show <id>
|
||||
# or
|
||||
task-master show --id=<id>
|
||||
|
||||
# View a specific subtask (e.g., subtask 2 of task 1)
|
||||
task-master show 1.2
|
||||
```
|
||||
</Accordion>
|
||||
|
||||
<Accordion title="Update Tasks">
|
||||
```bash
|
||||
# Update tasks from a specific ID and provide context
|
||||
task-master update --from=<id> --prompt="<prompt>"
|
||||
```
|
||||
</Accordion>
|
||||
|
||||
<Accordion title="Update a Specific Task">
|
||||
```bash
|
||||
# Update a single task by ID with new information
|
||||
task-master update-task --id=<id> --prompt="<prompt>"
|
||||
|
||||
# Use research-backed updates with Perplexity AI
|
||||
task-master update-task --id=<id> --prompt="<prompt>" --research
|
||||
```
|
||||
</Accordion>
|
||||
|
||||
<Accordion title="Update a Subtask">
|
||||
```bash
|
||||
# Append additional information to a specific subtask
|
||||
task-master update-subtask --id=<parentId.subtaskId> --prompt="<prompt>"
|
||||
|
||||
# Example: Add details about API rate limiting to subtask 2 of task 5
|
||||
task-master update-subtask --id=5.2 --prompt="Add rate limiting of 100 requests per minute"
|
||||
|
||||
# Use research-backed updates with Perplexity AI
|
||||
task-master update-subtask --id=<parentId.subtaskId> --prompt="<prompt>" --research
|
||||
```
|
||||
|
||||
Unlike the `update-task` command, which replaces task information, the `update-subtask` command _appends_ new information to the existing subtask details, marking it with a timestamp. This is useful for iteratively enhancing subtasks while preserving the original content.
|
||||
</Accordion>
|
||||
|
||||
<Accordion title="Generate Task Files">
|
||||
```bash
|
||||
# Generate individual task files from tasks.json
|
||||
task-master generate
|
||||
```
|
||||
</Accordion>
|
||||
|
||||
<Accordion title="Set Task Status">
|
||||
```bash
|
||||
# Set status of a single task
|
||||
task-master set-status --id=<id> --status=<status>
|
||||
|
||||
# Set status for multiple tasks
|
||||
task-master set-status --id=1,2,3 --status=<status>
|
||||
|
||||
# Set status for subtasks
|
||||
task-master set-status --id=1.1,1.2 --status=<status>
|
||||
```
|
||||
|
||||
When marking a task as "done", all of its subtasks will automatically be marked as "done" as well.
|
||||
</Accordion>
|
||||
|
||||
<Accordion title="Expand Tasks">
|
||||
```bash
|
||||
# Expand a specific task with subtasks
|
||||
task-master expand --id=<id> --num=<number>
|
||||
|
||||
# Expand with additional context
|
||||
task-master expand --id=<id> --prompt="<context>"
|
||||
|
||||
# Expand all pending tasks
|
||||
task-master expand --all
|
||||
|
||||
# Force regeneration of subtasks for tasks that already have them
|
||||
task-master expand --all --force
|
||||
|
||||
# Research-backed subtask generation for a specific task
|
||||
task-master expand --id=<id> --research
|
||||
|
||||
# Research-backed generation for all tasks
|
||||
task-master expand --all --research
|
||||
```
|
||||
</Accordion>
|
||||
|
||||
<Accordion title="Clear Subtasks">
|
||||
```bash
|
||||
# Clear subtasks from a specific task
|
||||
task-master clear-subtasks --id=<id>
|
||||
|
||||
# Clear subtasks from multiple tasks
|
||||
task-master clear-subtasks --id=1,2,3
|
||||
|
||||
# Clear subtasks from all tasks
|
||||
task-master clear-subtasks --all
|
||||
```
|
||||
</Accordion>
|
||||
|
||||
<Accordion title="Analyze Task Complexity">
|
||||
```bash
|
||||
# Analyze complexity of all tasks
|
||||
task-master analyze-complexity
|
||||
|
||||
# Save report to a custom location
|
||||
task-master analyze-complexity --output=my-report.json
|
||||
|
||||
# Use a specific LLM model
|
||||
task-master analyze-complexity --model=claude-3-opus-20240229
|
||||
|
||||
# Set a custom complexity threshold (1-10)
|
||||
task-master analyze-complexity --threshold=6
|
||||
|
||||
# Use an alternative tasks file
|
||||
task-master analyze-complexity --file=custom-tasks.json
|
||||
|
||||
# Use Perplexity AI for research-backed complexity analysis
|
||||
task-master analyze-complexity --research
|
||||
```
|
||||
</Accordion>
|
||||
|
||||
<Accordion title="View Complexity Report">
|
||||
```bash
|
||||
# Display the task complexity analysis report
|
||||
task-master complexity-report
|
||||
|
||||
# View a report at a custom location
|
||||
task-master complexity-report --file=my-report.json
|
||||
```
|
||||
</Accordion>
|
||||
|
||||
<Accordion title="Managing Task Dependencies">
|
||||
```bash
|
||||
# Add a dependency to a task
|
||||
task-master add-dependency --id=<id> --depends-on=<id>
|
||||
|
||||
# Remove a dependency from a task
|
||||
task-master remove-dependency --id=<id> --depends-on=<id>
|
||||
|
||||
# Validate dependencies without fixing them
|
||||
task-master validate-dependencies
|
||||
|
||||
# Find and fix invalid dependencies automatically
|
||||
task-master fix-dependencies
|
||||
```
|
||||
</Accordion>
|
||||
|
||||
<Accordion title="Add a New Task">
|
||||
```bash
|
||||
# Add a new task using AI
|
||||
task-master add-task --prompt="Description of the new task"
|
||||
|
||||
# Add a task with dependencies
|
||||
task-master add-task --prompt="Description" --dependencies=1,2,3
|
||||
|
||||
# Add a task with priority
|
||||
task-master add-task --prompt="Description" --priority=high
|
||||
```
|
||||
</Accordion>
|
||||
|
||||
<Accordion title="Initialize a Project">
|
||||
```bash
|
||||
# Initialize a new project with Task Master structure
|
||||
task-master init
|
||||
```
|
||||
</Accordion>
|
||||
|
||||
<Accordion title="TDD Workflow (Autopilot)">
|
||||
```bash
|
||||
# Start autonomous TDD workflow for a task
|
||||
task-master autopilot start <taskId>
|
||||
|
||||
# Get next action with context
|
||||
task-master autopilot next
|
||||
|
||||
# Complete phase with test results
|
||||
task-master autopilot complete --results '{"total":N,"passed":N,"failed":N}'
|
||||
|
||||
# Commit changes
|
||||
task-master autopilot commit
|
||||
|
||||
# Check workflow status
|
||||
task-master autopilot status
|
||||
|
||||
# Resume interrupted workflow
|
||||
task-master autopilot resume
|
||||
|
||||
# Abort workflow
|
||||
task-master autopilot abort
|
||||
```
|
||||
|
||||
The TDD workflow enforces RED → GREEN → COMMIT cycles for each subtask. See [AI Agent Integration](/tdd-workflow/ai-agent-integration) for details.
|
||||
</Accordion>
|
||||
</AccordionGroup>
|
||||
94
apps/docs/configuration.mdx
Normal file
@@ -0,0 +1,94 @@
|
||||
---
|
||||
title: "Configuration"
|
||||
description: "Configure Task Master through environment variables in a .env file"
|
||||
---
|
||||
|
||||
## Required Configuration
|
||||
|
||||
<Note>
|
||||
Task Master requires an Anthropic API key to function. Add this to your `.env` file:
|
||||
|
||||
```bash
|
||||
ANTHROPIC_API_KEY=sk-ant-api03-your-api-key
|
||||
```
|
||||
|
||||
You can obtain an API key from the [Anthropic Console](https://console.anthropic.com/).
|
||||
</Note>
|
||||
|
||||
## Optional Configuration
|
||||
|
||||
| Variable | Default Value | Description | Example |
|
||||
| --- | --- | --- | --- |
|
||||
| `MODEL` | `"claude-3-7-sonnet-20250219"` | Claude model to use | `MODEL=claude-3-opus-20240229` |
|
||||
| `MAX_TOKENS` | `"4000"` | Maximum tokens for responses | `MAX_TOKENS=8000` |
|
||||
| `TEMPERATURE` | `"0.7"` | Temperature for model responses | `TEMPERATURE=0.5` |
|
||||
| `DEBUG` | `"false"` | Enable debug logging | `DEBUG=true` |
|
||||
| `LOG_LEVEL` | `"info"` | Console output level | `LOG_LEVEL=debug` |
|
||||
| `DEFAULT_SUBTASKS` | `"3"` | Default subtask count | `DEFAULT_SUBTASKS=5` |
|
||||
| `DEFAULT_PRIORITY` | `"medium"` | Default priority | `DEFAULT_PRIORITY=high` |
|
||||
| `PROJECT_NAME` | `"MCP SaaS MVP"` | Project name in metadata | `PROJECT_NAME=My Awesome Project` |
|
||||
| `PROJECT_VERSION` | `"1.0.0"` | Version in metadata | `PROJECT_VERSION=2.1.0` |
|
||||
| `PERPLEXITY_API_KEY` | - | For research-backed features | `PERPLEXITY_API_KEY=pplx-...` |
|
||||
| `PERPLEXITY_MODEL` | `"sonar-medium-online"` | Perplexity model | `PERPLEXITY_MODEL=sonar-large-online` |
|
||||
|
||||
## TDD Workflow Configuration
|
||||
|
||||
Additional options for the autonomous TDD workflow:
|
||||
|
||||
| Variable | Default | Description |
|
||||
| --- | --- | --- |
|
||||
| `TM_MAX_ATTEMPTS` | `3` | Max attempts per subtask before marking blocked |
|
||||
| `TM_AUTO_COMMIT` | `true` | Auto-commit after GREEN phase |
|
||||
| `TM_PROJECT_ROOT` | Current dir | Default project root |
|
||||
|
||||
## Example .env File
|
||||
|
||||
```
|
||||
# Required
|
||||
ANTHROPIC_API_KEY=sk-ant-api03-your-api-key
|
||||
|
||||
# Optional - Claude Configuration
|
||||
MODEL=claude-3-7-sonnet-20250219
|
||||
MAX_TOKENS=4000
|
||||
TEMPERATURE=0.7
|
||||
|
||||
# Optional - Perplexity API for Research
|
||||
PERPLEXITY_API_KEY=pplx-your-api-key
|
||||
PERPLEXITY_MODEL=sonar-medium-online
|
||||
|
||||
# Optional - Project Info
|
||||
PROJECT_NAME=My Project
|
||||
PROJECT_VERSION=1.0.0
|
||||
|
||||
# Optional - Application Configuration
|
||||
DEFAULT_SUBTASKS=3
|
||||
DEFAULT_PRIORITY=medium
|
||||
DEBUG=false
|
||||
LOG_LEVEL=info
|
||||
|
||||
# TDD Workflow
|
||||
TM_MAX_ATTEMPTS=3
|
||||
TM_AUTO_COMMIT=true
|
||||
```
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
### If `task-master init` doesn't respond:
|
||||
|
||||
Try running it with Node directly:
|
||||
|
||||
```bash
|
||||
node node_modules/claude-task-master/scripts/init.js
|
||||
```
|
||||
|
||||
Or clone the repository and run:
|
||||
|
||||
```bash
|
||||
git clone https://github.com/eyaltoledano/claude-task-master.git
|
||||
cd claude-task-master
|
||||
node scripts/init.js
|
||||
```
|
||||
|
||||
<Note>
|
||||
For advanced configuration options and detailed customization, see our [Advanced Configuration Guide] page.
|
||||
</Note>
|
||||
@@ -52,6 +52,13 @@
|
||||
"capabilities/cli-root-commands",
|
||||
"capabilities/task-structure"
|
||||
]
|
||||
},
|
||||
{
|
||||
"group": "TDD Workflow (Autopilot)",
|
||||
"pages": [
|
||||
"tdd-workflow/quickstart",
|
||||
"tdd-workflow/ai-agent-integration"
|
||||
]
|
||||
}
|
||||
]
|
||||
}
|
||||
|
||||
@@ -37,6 +37,25 @@ For MCP/Cursor usage: Configure keys in the env section of your .cursor/mcp.json
|
||||
}
|
||||
```
|
||||
|
||||
<Tip>
|
||||
**Optimize Context Usage**: You can control which Task Master MCP tools are loaded using the `TASK_MASTER_TOOLS` environment variable. This helps reduce LLM context usage by only loading the tools you need.
|
||||
|
||||
Options:
|
||||
- `all` (default) - All 36 tools
|
||||
- `standard` - 15 commonly used tools
|
||||
- `core` or `lean` - 7 essential tools
|
||||
|
||||
Example:
|
||||
```json
|
||||
"env": {
|
||||
"TASK_MASTER_TOOLS": "standard",
|
||||
"ANTHROPIC_API_KEY": "your_key_here"
|
||||
}
|
||||
```
|
||||
|
||||
See the [MCP Tools documentation](/capabilities/mcp#configurable-tool-loading) for details.
|
||||
</Tip>
|
||||
|
||||
### CLI Usage: `.env` File
|
||||
|
||||
Create a `.env` file in your project root and include the keys for the providers you plan to use:
|
||||
|
||||
@@ -1,6 +1,6 @@
|
||||
{
|
||||
"name": "docs",
|
||||
"version": "0.0.5",
|
||||
"version": "0.0.6",
|
||||
"private": true,
|
||||
"description": "Task Master documentation powered by Mintlify",
|
||||
"scripts": {
|
||||
|
||||
1013
apps/docs/tdd-workflow/ai-agent-integration.mdx
Normal file
File diff suppressed because it is too large
313
apps/docs/tdd-workflow/quickstart.mdx
Normal file
@@ -0,0 +1,313 @@
|
||||
---
|
||||
title: "TDD Workflow Quick Start"
|
||||
description: "Get started with TaskMaster's autonomous TDD workflow in 5 minutes"
|
||||
---
|
||||
|
||||
Get started with TaskMaster's autonomous TDD workflow in 5 minutes.
|
||||
|
||||
## Prerequisites
|
||||
|
||||
- TaskMaster initialized project (`tm init`)
|
||||
- Tasks with subtasks created (`tm parse-prd` or `tm expand`)
|
||||
- Git repository with clean working tree
|
||||
- Test framework installed (vitest, jest, mocha, etc.)
|
||||
|
||||
## 1. Start a Workflow
|
||||
|
||||
```bash
|
||||
tm autopilot start <taskId>
|
||||
```
|
||||
|
||||
Example:
|
||||
```bash
|
||||
$ tm autopilot start 7
|
||||
|
||||
✓ Workflow started for task 7
|
||||
✓ Created branch: task-7
|
||||
✓ Current phase: RED
|
||||
✓ Subtask 1/5: Implement start command
|
||||
→ Next action: Write a failing test
|
||||
```
|
||||
|
||||
## 2. The TDD Cycle
|
||||
|
||||
### RED Phase: Write Failing Test
|
||||
|
||||
```bash
|
||||
# Check what to do next
|
||||
$ tm autopilot next --json
|
||||
{
|
||||
"action": "generate_test",
|
||||
"currentSubtask": {
|
||||
"id": "1",
|
||||
"title": "Implement start command"
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
Write a test that fails:
|
||||
```typescript
|
||||
// tests/start.test.ts
|
||||
import { describe, it, expect } from 'vitest';
|
||||
import { StartCommand } from '../src/commands/start';
|
||||
|
||||
describe('StartCommand', () => {
|
||||
it('should initialize workflow', async () => {
|
||||
const command = new StartCommand();
|
||||
const result = await command.execute({ taskId: '7' });
|
||||
expect(result.success).toBe(true);
|
||||
});
|
||||
});
|
||||
```
|
||||
|
||||
Run tests:
|
||||
```bash
|
||||
$ npm test
|
||||
# ✗ 1 test failed
|
||||
```
|
||||
|
||||
Complete RED phase:
|
||||
```bash
|
||||
$ tm autopilot complete --results '{"total":1,"passed":0,"failed":1,"skipped":0}'
|
||||
|
||||
✓ RED phase complete
|
||||
✓ Current phase: GREEN
|
||||
→ Next action: Implement code to pass tests
|
||||
```
|
||||
|
||||
### GREEN Phase: Implement Feature
|
||||
|
||||
Write minimal code to pass:
|
||||
```typescript
|
||||
// src/commands/start.ts
|
||||
export class StartCommand {
|
||||
async execute(options: { taskId: string }) {
|
||||
return { success: true };
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
Run tests:
|
||||
```bash
|
||||
$ npm test
|
||||
# ✓ 1 test passed
|
||||
```
|
||||
|
||||
Complete GREEN phase:
|
||||
```bash
|
||||
$ tm autopilot complete --results '{"total":1,"passed":1,"failed":0,"skipped":0}'
|
||||
|
||||
✓ GREEN phase complete
|
||||
✓ Current phase: COMMIT
|
||||
→ Next action: Commit changes
|
||||
```
|
||||
|
||||
### COMMIT Phase: Save Progress
|
||||
|
||||
```bash
|
||||
$ tm autopilot commit
|
||||
|
||||
✓ Created commit: abc123
|
||||
✓ Message: feat(autopilot): implement start command (Task 7.1)
|
||||
✓ Advanced to subtask 2/5
|
||||
✓ Current phase: RED
|
||||
→ Next action: Write a failing test
|
||||
```
|
||||
|
||||
## 3. Continue for All Subtasks
|
||||
|
||||
Repeat the RED-GREEN-COMMIT cycle for each subtask until complete.
|
||||
|
||||
```bash
|
||||
# Check progress anytime
|
||||
$ tm autopilot status --json
|
||||
{
|
||||
"taskId": "7",
|
||||
"progress": {
|
||||
"completed": 1,
|
||||
"total": 5,
|
||||
"percentage": 20
|
||||
},
|
||||
"currentSubtask": {
|
||||
"id": "2",
|
||||
"title": "Implement resume command"
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## 4. Complete the Workflow
|
||||
|
||||
When all subtasks are done:
|
||||
|
||||
```bash
|
||||
$ tm autopilot status --json
|
||||
{
|
||||
"phase": "COMPLETE",
|
||||
"progress": {
|
||||
"completed": 5,
|
||||
"total": 5,
|
||||
"percentage": 100
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
Your branch `task-7` is ready for review/merge!
|
||||
|
||||
## Common Patterns
|
||||
|
||||
### Parse Test Output
|
||||
|
||||
Your test runner prints results in a human-readable format; convert them to the JSON shape expected by `--results`:
|
||||
|
||||
**Vitest:**
|
||||
```
|
||||
Tests 2 failed | 8 passed | 10 total
|
||||
```
|
||||
→ `{"total":10,"passed":8,"failed":2,"skipped":0}`
|
||||
|
||||
**Jest:**
|
||||
```
|
||||
Tests: 2 failed, 8 passed, 10 total
|
||||
```
|
||||
→ `{"total":10,"passed":8,"failed":2,"skipped":0}`
|
||||
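One way to do this (a minimal sketch; the helper name is not part of TaskMaster) is to pull the counts out of the summary line with a regular expression that handles both formats shown above:

```typescript
// Illustrative helper: turn a vitest/jest summary line into the
// JSON shape passed to `tm autopilot complete --results`.
interface TestResults {
  total: number;
  passed: number;
  failed: number;
  skipped: number;
}

function parseSummaryLine(line: string): TestResults {
  const grab = (label: string): number => {
    const match = line.match(new RegExp(`(\\d+)\\s+${label}`));
    return match ? Number(match[1]) : 0;
  };
  return {
    total: grab('total'),
    passed: grab('passed'),
    failed: grab('failed'),
    skipped: grab('skipped')
  };
}

// parseSummaryLine('Tests  2 failed | 8 passed | 10 total')
// → { total: 10, passed: 8, failed: 2, skipped: 0 }
```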
|
||||
### Handle Errors
|
||||
|
||||
**Problem:** RED phase won't complete - "no test failures"
|
||||
|
||||
**Solution:** Your test isn't testing new behavior. Make sure it fails:
|
||||
```typescript
|
||||
// Bad - test passes immediately
|
||||
it('should exist', () => {
|
||||
expect(StartCommand).toBeDefined(); // Always passes
|
||||
});
|
||||
|
||||
// Good - test fails until feature exists
|
||||
it('should initialize workflow', async () => {
|
||||
const result = await new StartCommand().execute({ taskId: '7' });
|
||||
expect(result.success).toBe(true); // Fails until execute() is implemented
|
||||
});
|
||||
```
|
||||
|
||||
**Problem:** GREEN phase won't complete - "tests still failing"
|
||||
|
||||
**Solution:** Fix your implementation until all tests pass:
|
||||
```bash
|
||||
# Run tests to see what's failing
|
||||
$ npm test
|
||||
|
||||
# Fix the issue
|
||||
$ vim src/commands/start.ts
|
||||
|
||||
# Verify tests pass
|
||||
$ npm test
|
||||
|
||||
# Try again
|
||||
$ tm autopilot complete --results '{"total":1,"passed":1,"failed":0,"skipped":0}'
|
||||
```
|
||||
|
||||
### Resume Interrupted Work
|
||||
|
||||
```bash
|
||||
# If you interrupted the workflow
|
||||
$ tm autopilot resume
|
||||
|
||||
✓ Workflow resumed
|
||||
✓ Task 7 - subtask 3/5
|
||||
✓ Current phase: GREEN
|
||||
→ Continue from where you left off
|
||||
```
|
||||
|
||||
## JSON Output Mode
|
||||
|
||||
All commands support `--json` for programmatic use:
|
||||
|
||||
```bash
|
||||
$ tm autopilot start 7 --json | jq .
|
||||
{
|
||||
"success": true,
|
||||
"taskId": "7",
|
||||
"branchName": "task-7",
|
||||
"phase": "SUBTASK_LOOP",
|
||||
"tddPhase": "RED",
|
||||
"progress": { ... },
|
||||
"currentSubtask": { ... },
|
||||
"nextAction": "generate_test"
|
||||
}
|
||||
```
|
||||
|
||||
Perfect for:
|
||||
- CI/CD integration
|
||||
- Custom tooling
|
||||
- Automated workflows
|
||||
- Progress monitoring
|
||||
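For example, a CI step or custom script can invoke the CLI and read the parsed output directly (a minimal Node.js sketch; error handling omitted, field names follow the status example above):

```typescript
import { execFile } from 'node:child_process';
import { promisify } from 'node:util';

const run = promisify(execFile);

// Query workflow progress from a script via `tm autopilot status --json`.
async function getAutopilotProgress(projectRoot: string) {
  const { stdout } = await run('tm', [
    'autopilot',
    'status',
    '--json',
    '--project-root',
    projectRoot
  ]);
  const status = JSON.parse(stdout);
  return status.progress; // { completed, total, percentage }
}
```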
|
||||
## MCP Integration
|
||||
|
||||
For AI agents (Claude Code, etc.), use MCP tools:
|
||||
|
||||
```typescript
|
||||
// Start workflow
|
||||
await mcp.call('autopilot_start', {
|
||||
taskId: '7',
|
||||
projectRoot: '/path/to/project'
|
||||
});
|
||||
|
||||
// Get next action
|
||||
const next = await mcp.call('autopilot_next', {
|
||||
projectRoot: '/path/to/project'
|
||||
});
|
||||
|
||||
// Complete phase
|
||||
await mcp.call('autopilot_complete_phase', {
|
||||
projectRoot: '/path/to/project',
|
||||
testResults: { total: 1, passed: 0, failed: 1, skipped: 0 }
|
||||
});
|
||||
|
||||
// Commit
|
||||
await mcp.call('autopilot_commit', {
|
||||
projectRoot: '/path/to/project'
|
||||
});
|
||||
```
|
||||
|
||||
See [AI Agent Integration Guide](./ai-agent-integration.mdx) for details.
|
||||
|
||||
## Cheat Sheet
|
||||
|
||||
```bash
|
||||
# Start
|
||||
tm autopilot start <taskId> # Initialize workflow
|
||||
|
||||
# Workflow Control
|
||||
tm autopilot next # What's next?
|
||||
tm autopilot status # Current state
|
||||
tm autopilot resume # Continue interrupted work
|
||||
tm autopilot abort # Cancel and cleanup
|
||||
|
||||
# TDD Cycle
|
||||
tm autopilot complete --results '{...}' # Advance phase
|
||||
tm autopilot commit # Save progress
|
||||
|
||||
# Options
|
||||
--json # Machine-readable output
|
||||
--project-root <path> # Specify project location
|
||||
--force # Override safety checks
|
||||
```
|
||||
|
||||
## Next Steps
|
||||
|
||||
- Read [AI Agent Integration Guide](./ai-agent-integration.mdx) for complete documentation
|
||||
- Check [Command Reference](/command-reference) for all options
|
||||
|
||||
## Tips
|
||||
|
||||
1. **Always let tests fail first** - That's the RED phase
|
||||
2. **Write minimal code** - Just enough to pass
|
||||
3. **Commit frequently** - After each subtask
|
||||
4. **Use --json** - Better for programmatic use
|
||||
5. **Check status often** - Know where you are
|
||||
6. **Trust the workflow** - It enforces TDD rules
|
||||
|
||||
---
|
||||
|
||||
**Ready to start?** Run `tm autopilot start <taskId>` and begin your TDD journey!
|
||||
@@ -1,5 +1,7 @@
|
||||
# Change Log
|
||||
|
||||
## 0.25.6
|
||||
|
||||
## 0.25.6-rc.0
|
||||
|
||||
### Patch Changes
|
||||
|
||||
@@ -3,7 +3,7 @@
|
||||
"private": true,
|
||||
"displayName": "TaskMaster",
|
||||
"description": "A visual Kanban board interface for TaskMaster projects in VS Code",
|
||||
"version": "0.25.6-rc.0",
|
||||
"version": "0.25.6",
|
||||
"publisher": "Hamster",
|
||||
"icon": "assets/icon.png",
|
||||
"engines": {
|
||||
@@ -239,9 +239,6 @@
|
||||
"watch:css": "npx @tailwindcss/cli -i ./src/webview/index.css -o ./dist/index.css --watch",
|
||||
"check-types": "tsc --noEmit"
|
||||
},
|
||||
"dependencies": {
|
||||
"task-master-ai": "0.29.0-rc.0"
|
||||
},
|
||||
"devDependencies": {
|
||||
"@dnd-kit/core": "^6.3.1",
|
||||
"@dnd-kit/modifiers": "^9.0.0",
|
||||
@@ -277,7 +274,8 @@
|
||||
"tailwind-merge": "^3.3.1",
|
||||
"tailwindcss": "4.1.11",
|
||||
"typescript": "^5.9.2",
|
||||
"@tm/core": "*"
|
||||
"@tm/core": "*",
|
||||
"task-master-ai": "0.30.0-rc.1"
|
||||
},
|
||||
"overrides": {
|
||||
"glob@<8": "^10.4.5",
|
||||
|
||||
54
apps/mcp/package.json
Normal file
@@ -0,0 +1,54 @@
|
||||
{
|
||||
"name": "@tm/mcp",
|
||||
"description": "Task Master MCP Tools - TypeScript MCP server tools for AI agent integration",
|
||||
"type": "module",
|
||||
"private": true,
|
||||
"version": "",
|
||||
"main": "./dist/index.js",
|
||||
"types": "./src/index.ts",
|
||||
"exports": {
|
||||
".": "./src/index.ts",
|
||||
"./tools/autopilot": "./src/tools/autopilot/index.ts"
|
||||
},
|
||||
"files": ["dist", "README.md"],
|
||||
"scripts": {
|
||||
"typecheck": "tsc --noEmit",
|
||||
"lint": "biome check src",
|
||||
"format": "biome format --write src",
|
||||
"test": "vitest run",
|
||||
"test:watch": "vitest",
|
||||
"test:coverage": "vitest run --coverage",
|
||||
"test:unit": "vitest run -t unit",
|
||||
"test:integration": "vitest run -t integration",
|
||||
"test:ci": "vitest run --coverage --reporter=dot"
|
||||
},
|
||||
"dependencies": {
|
||||
"@tm/core": "*",
|
||||
"zod": "^4.1.11",
|
||||
"fastmcp": "^3.19.2"
|
||||
},
|
||||
"devDependencies": {
|
||||
"@biomejs/biome": "^1.9.4",
|
||||
"@types/node": "^22.10.5",
|
||||
"typescript": "^5.9.2",
|
||||
"vitest": "^3.2.4"
|
||||
},
|
||||
"engines": {
|
||||
"node": ">=18.0.0"
|
||||
},
|
||||
"keywords": [
|
||||
"task-master",
|
||||
"mcp",
|
||||
"mcp-server",
|
||||
"ai-agent",
|
||||
"workflow",
|
||||
"tdd"
|
||||
],
|
||||
"author": "",
|
||||
"license": "MIT",
|
||||
"typesVersions": {
|
||||
"*": {
|
||||
"*": ["src/*"]
|
||||
}
|
||||
}
|
||||
}
|
||||
8
apps/mcp/src/index.ts
Normal file
@@ -0,0 +1,8 @@
|
||||
/**
|
||||
* @fileoverview Main entry point for @tm/mcp package
|
||||
* Exports all MCP tool registration functions
|
||||
*/
|
||||
|
||||
export * from './tools/autopilot/index.js';
|
||||
export * from './shared/utils.js';
|
||||
export * from './shared/types.js';
|
||||
36
apps/mcp/src/shared/types.ts
Normal file
@@ -0,0 +1,36 @@
|
||||
/**
|
||||
* Shared types for MCP tools
|
||||
*/
|
||||
|
||||
export interface MCPResponse<T = any> {
|
||||
success: boolean;
|
||||
data?: T;
|
||||
error?: {
|
||||
code: string;
|
||||
message: string;
|
||||
suggestion?: string;
|
||||
details?: any;
|
||||
};
|
||||
version?: {
|
||||
version: string;
|
||||
name: string;
|
||||
};
|
||||
tag?: {
|
||||
currentTag: string;
|
||||
availableTags: string[];
|
||||
};
|
||||
}
|
||||
|
||||
export interface MCPContext {
|
||||
log: {
|
||||
info: (message: string) => void;
|
||||
warn: (message: string) => void;
|
||||
error: (message: string) => void;
|
||||
debug: (message: string) => void;
|
||||
};
|
||||
session: any;
|
||||
}
|
||||
|
||||
export interface WithProjectRoot {
|
||||
projectRoot: string;
|
||||
}
|
||||
257
apps/mcp/src/shared/utils.ts
Normal file
@@ -0,0 +1,257 @@
|
||||
/**
|
||||
* Shared utilities for MCP tools
|
||||
*/
|
||||
|
||||
import type { ContentResult } from 'fastmcp';
|
||||
import path from 'node:path';
|
||||
import fs from 'node:fs';
|
||||
import packageJson from '../../../../package.json' with { type: 'json' };
|
||||
|
||||
/**
|
||||
* Get version information
|
||||
*/
|
||||
export function getVersionInfo() {
|
||||
return {
|
||||
version: packageJson.version || 'unknown',
|
||||
name: packageJson.name || 'task-master-ai'
|
||||
};
|
||||
}
|
||||
|
||||
/**
|
||||
* Get current tag for a project root
|
||||
*/
|
||||
export function getCurrentTag(projectRoot: string): string | null {
|
||||
try {
|
||||
// Try to read current tag from state.json
|
||||
const stateJsonPath = path.join(projectRoot, '.taskmaster', 'state.json');
|
||||
|
||||
if (fs.existsSync(stateJsonPath)) {
|
||||
const stateData = JSON.parse(fs.readFileSync(stateJsonPath, 'utf-8'));
|
||||
return stateData.currentTag || 'master';
|
||||
}
|
||||
|
||||
return 'master';
|
||||
} catch {
|
||||
return null;
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Handle API result with standardized error handling and response formatting
|
||||
* This provides a consistent response structure for all MCP tools
|
||||
*/
|
||||
export async function handleApiResult<T>(options: {
|
||||
result: { success: boolean; data?: T; error?: { message: string } };
|
||||
log?: any;
|
||||
errorPrefix?: string;
|
||||
projectRoot?: string;
|
||||
}): Promise<ContentResult> {
|
||||
const { result, log, errorPrefix = 'API error', projectRoot } = options;
|
||||
|
||||
// Get version info for every response
|
||||
const versionInfo = getVersionInfo();
|
||||
|
||||
// Get current tag if project root is provided
|
||||
const currentTag = projectRoot ? getCurrentTag(projectRoot) : null;
|
||||
|
||||
if (!result.success) {
|
||||
const errorMsg = result.error?.message || `Unknown ${errorPrefix}`;
|
||||
log?.error?.(`${errorPrefix}: ${errorMsg}`);
|
||||
|
||||
let errorText = `Error: ${errorMsg}\nVersion: ${versionInfo.version}\nName: ${versionInfo.name}`;
|
||||
|
||||
if (currentTag) {
|
||||
errorText += `\nCurrent Tag: ${currentTag}`;
|
||||
}
|
||||
|
||||
return {
|
||||
content: [
|
||||
{
|
||||
type: 'text',
|
||||
text: errorText
|
||||
}
|
||||
],
|
||||
isError: true
|
||||
};
|
||||
}
|
||||
|
||||
log?.info?.('Successfully completed operation');
|
||||
|
||||
// Create the response payload including version info and tag
|
||||
const responsePayload: any = {
|
||||
data: result.data,
|
||||
version: versionInfo
|
||||
};
|
||||
|
||||
// Add current tag if available
|
||||
if (currentTag) {
|
||||
responsePayload.tag = currentTag;
|
||||
}
|
||||
|
||||
return {
|
||||
content: [
|
||||
{
|
||||
type: 'text',
|
||||
text: JSON.stringify(responsePayload, null, 2)
|
||||
}
|
||||
]
|
||||
};
|
||||
}
|
||||
|
||||
/**
|
||||
* Normalize project root path (handles URI encoding, file:// protocol, Windows paths)
|
||||
*/
|
||||
export function normalizeProjectRoot(rawPath: string): string {
|
||||
if (!rawPath) return process.cwd();
|
||||
|
||||
try {
|
||||
let pathString = rawPath;
|
||||
|
||||
// Decode URI encoding
|
||||
try {
|
||||
pathString = decodeURIComponent(pathString);
|
||||
} catch {
|
||||
// If decoding fails, use as-is
|
||||
}
|
||||
|
||||
// Strip file:// prefix
|
||||
if (pathString.startsWith('file:///')) {
|
||||
pathString = pathString.slice(7);
|
||||
} else if (pathString.startsWith('file://')) {
|
||||
pathString = pathString.slice(7);
|
||||
}
|
||||
|
||||
// Handle Windows drive letter after stripping prefix (e.g., /C:/...)
|
||||
if (
|
||||
pathString.startsWith('/') &&
|
||||
/[A-Za-z]:/.test(pathString.substring(1, 3))
|
||||
) {
|
||||
pathString = pathString.substring(1);
|
||||
}
|
||||
|
||||
// Normalize backslashes to forward slashes
|
||||
pathString = pathString.replace(/\\/g, '/');
|
||||
|
||||
// Resolve to absolute path
|
||||
return path.resolve(pathString);
|
||||
} catch {
|
||||
return path.resolve(rawPath);
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Get project root from session object
|
||||
*/
|
||||
function getProjectRootFromSession(session: any): string | null {
|
||||
try {
|
||||
// Check primary location
|
||||
if (session?.roots?.[0]?.uri) {
|
||||
return normalizeProjectRoot(session.roots[0].uri);
|
||||
}
|
||||
// Check alternate location
|
||||
else if (session?.roots?.roots?.[0]?.uri) {
|
||||
return normalizeProjectRoot(session.roots.roots[0].uri);
|
||||
}
|
||||
return null;
|
||||
} catch {
|
||||
return null;
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Wrapper to normalize project root in args with proper precedence order
|
||||
*
|
||||
* PRECEDENCE ORDER:
|
||||
* 1. TASK_MASTER_PROJECT_ROOT environment variable (from process.env or session)
|
||||
* 2. args.projectRoot (explicitly provided)
|
||||
* 3. Session-based project root resolution
|
||||
* 4. Current directory fallback
|
||||
*/
|
||||
export function withNormalizedProjectRoot<T extends { projectRoot?: string }>(
|
||||
fn: (
|
||||
args: T & { projectRoot: string },
|
||||
context: any
|
||||
) => Promise<ContentResult>
|
||||
): (args: T, context: any) => Promise<ContentResult> {
|
||||
return async (args: T, context: any): Promise<ContentResult> => {
|
||||
const { log, session } = context;
|
||||
let normalizedRoot: string | null = null;
|
||||
let rootSource = 'unknown';
|
||||
|
||||
try {
|
||||
// 1. Check for TASK_MASTER_PROJECT_ROOT environment variable first
|
||||
if (process.env.TASK_MASTER_PROJECT_ROOT) {
|
||||
const envRoot = process.env.TASK_MASTER_PROJECT_ROOT;
|
||||
normalizedRoot = path.isAbsolute(envRoot)
|
||||
? envRoot
|
||||
: path.resolve(process.cwd(), envRoot);
|
||||
rootSource = 'TASK_MASTER_PROJECT_ROOT environment variable';
|
||||
log?.info?.(`Using project root from ${rootSource}: ${normalizedRoot}`);
|
||||
}
|
||||
// Also check session environment variables for TASK_MASTER_PROJECT_ROOT
|
||||
else if (session?.env?.TASK_MASTER_PROJECT_ROOT) {
|
||||
const envRoot = session.env.TASK_MASTER_PROJECT_ROOT;
|
||||
normalizedRoot = path.isAbsolute(envRoot)
|
||||
? envRoot
|
||||
: path.resolve(process.cwd(), envRoot);
|
||||
rootSource = 'TASK_MASTER_PROJECT_ROOT session environment variable';
|
||||
log?.info?.(`Using project root from ${rootSource}: ${normalizedRoot}`);
|
||||
}
|
||||
// 2. If no environment variable, try args.projectRoot
|
||||
else if (args.projectRoot) {
|
||||
normalizedRoot = normalizeProjectRoot(args.projectRoot);
|
||||
rootSource = 'args.projectRoot';
|
||||
log?.info?.(`Using project root from ${rootSource}: ${normalizedRoot}`);
|
||||
}
|
||||
// 3. If no args.projectRoot, try session-based resolution
|
||||
else {
|
||||
const sessionRoot = getProjectRootFromSession(session);
|
||||
if (sessionRoot) {
|
||||
normalizedRoot = sessionRoot;
|
||||
rootSource = 'session';
|
||||
log?.info?.(
|
||||
`Using project root from ${rootSource}: ${normalizedRoot}`
|
||||
);
|
||||
}
|
||||
}
|
||||
|
||||
if (!normalizedRoot) {
|
||||
log?.error?.(
|
||||
'Could not determine project root from environment, args, or session.'
|
||||
);
|
||||
return handleApiResult({
|
||||
result: {
|
||||
success: false,
|
||||
error: {
|
||||
message:
|
||||
'Could not determine project root. Please provide projectRoot argument or ensure TASK_MASTER_PROJECT_ROOT environment variable is set.'
|
||||
}
|
||||
}
|
||||
});
|
||||
}
|
||||
|
||||
// Inject the normalized root back into args
|
||||
const updatedArgs = { ...args, projectRoot: normalizedRoot } as T & {
|
||||
projectRoot: string;
|
||||
};
|
||||
|
||||
// Execute the original function with normalized root in args
|
||||
return await fn(updatedArgs, context);
|
||||
} catch (error: any) {
|
||||
log?.error?.(
|
||||
`Error within withNormalizedProjectRoot HOF (Normalized Root: ${normalizedRoot}): ${error.message}`
|
||||
);
|
||||
if (error.stack && log?.debug) {
|
||||
log.debug(error.stack);
|
||||
}
|
||||
return handleApiResult({
|
||||
result: {
|
||||
success: false,
|
||||
error: {
|
||||
message: `Operation failed: ${error.message}`
|
||||
}
|
||||
}
|
||||
});
|
||||
}
|
||||
};
|
||||
}
|
||||
70
apps/mcp/src/tools/README-ZOD-V3.md
Normal file
@@ -0,0 +1,70 @@
|
||||
# Why MCP Tools Use Zod v3
|
||||
|
||||
## Problem
|
||||
|
||||
- **FastMCP** uses `xsschema` to convert schemas → outputs JSON Schema **Draft 2020-12**
|
||||
- **MCP clients** (Augment IDE, gemini-cli, etc.) only support **Draft-07**
|
||||
- Using Zod v4 in tools causes "vendor undefined" errors and tool discovery failures
|
||||
|
||||
## Temporary Solution
|
||||
|
||||
All MCP tool files import from `zod/v3` instead of `zod`:
|
||||
|
||||
```typescript
|
||||
import { z } from 'zod/v3'; // ✅ Draft-07 compatible
|
||||
// NOT: import { z } from 'zod'; // ❌ Would use Draft 2020-12
|
||||
```
|
||||
|
||||
### Why This Works
|
||||
|
||||
- Zod v4 ships with v3 compatibility at `zod/v3`
|
||||
- FastMCP + zod-to-json-schema converts Zod v3 schemas → **Draft-07**
|
||||
- This ensures MCP clients can discover and use our tools
|
||||
|
||||
### What This Means
|
||||
|
||||
- ✅ **MCP tools** → use `zod/v3` (apps/mcp & mcp-server/src/tools)
|
||||
- ✅ **Rest of codebase** → uses `zod` (Zod v4)
|
||||
- ✅ **No conflicts** → they're from the same package, just different versions
|
||||
|
||||
## When Can We Remove This?
|
||||
|
||||
This workaround can be removed when **either**:
|
||||
|
||||
1. **FastMCP adds JSON Schema version configuration**
|
||||
- e.g., `new FastMCP({ jsonSchema: { target: 'draft-07' } })`
|
||||
- Tracking: https://github.com/punkpeye/fastmcp/issues/189
|
||||
|
||||
2. **MCP spec adds Draft 2020-12 support**
|
||||
- Unlikely in the short term
|
||||
|
||||
3. **xsschema adds version targeting**
|
||||
- Would allow FastMCP to use Draft-07
|
||||
|
||||
## How to Maintain
|
||||
|
||||
When adding new MCP tools:
|
||||
|
||||
```typescript
|
||||
// ✅ CORRECT
|
||||
import { z } from 'zod/v3';
|
||||
|
||||
export function registerMyTool(server: FastMCP) {
|
||||
server.addTool({
|
||||
name: 'my_tool',
|
||||
parameters: z.object({ ... }), // Will use Draft-07
|
||||
execute: async (args, context) => { ... }
|
||||
});
|
||||
}
|
||||
```
|
||||
|
||||
```typescript
|
||||
// ❌ WRONG - Will break MCP client compatibility
|
||||
import { z } from 'zod'; // Don't do this in apps/mcp/src/tools/
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
**Last Updated:** 2025-10-18
|
||||
**Affects:** All files in `apps/mcp/src/tools/`
|
||||
**See Also:** `mcp-server/src/tools/README-ZOD-V3.md` (same workaround)
|
||||
101
apps/mcp/src/tools/autopilot/abort.tool.ts
Normal file
@@ -0,0 +1,101 @@
|
||||
/**
|
||||
* @fileoverview autopilot-abort MCP tool
|
||||
* Abort a running TDD workflow and clean up state
|
||||
*/
|
||||
|
||||
// TEMPORARY: Using zod/v3 for Draft-07 JSON Schema compatibility with FastMCP's zod-to-json-schema
|
||||
// TODO: Revert to 'zod' when MCP spec issue is resolved (see PR #1323)
|
||||
import { z } from 'zod/v3';
|
||||
import {
|
||||
handleApiResult,
|
||||
withNormalizedProjectRoot
|
||||
} from '../../shared/utils.js';
|
||||
import type { MCPContext } from '../../shared/types.js';
|
||||
import { WorkflowService } from '@tm/core';
|
||||
import type { FastMCP } from 'fastmcp';
|
||||
|
||||
const AbortSchema = z.object({
|
||||
projectRoot: z
|
||||
.string()
|
||||
.describe('Absolute path to the project root directory')
|
||||
});
|
||||
|
||||
type AbortArgs = z.infer<typeof AbortSchema>;
|
||||
|
||||
/**
|
||||
* Register the autopilot_abort tool with the MCP server
|
||||
*/
|
||||
export function registerAutopilotAbortTool(server: FastMCP) {
|
||||
server.addTool({
|
||||
name: 'autopilot_abort',
|
||||
description:
|
||||
'Abort the current TDD workflow and clean up workflow state. This will remove the workflow state file but will NOT delete the git branch or any code changes.',
|
||||
parameters: AbortSchema,
|
||||
execute: withNormalizedProjectRoot(
|
||||
async (args: AbortArgs, context: MCPContext) => {
|
||||
const { projectRoot } = args;
|
||||
|
||||
try {
|
||||
context.log.info(`Aborting autopilot workflow in ${projectRoot}`);
|
||||
|
||||
const workflowService = new WorkflowService(projectRoot);
|
||||
|
||||
// Check if workflow exists
|
||||
const hasWorkflow = await workflowService.hasWorkflow();
|
||||
|
||||
if (!hasWorkflow) {
|
||||
context.log.warn('No active workflow to abort');
|
||||
return handleApiResult({
|
||||
result: {
|
||||
success: true,
|
||||
data: {
|
||||
message: 'No active workflow to abort',
|
||||
hadWorkflow: false
|
||||
}
|
||||
},
|
||||
log: context.log,
|
||||
projectRoot
|
||||
});
|
||||
}
|
||||
|
||||
// Get info before aborting
|
||||
await workflowService.resumeWorkflow();
|
||||
const status = workflowService.getStatus();
|
||||
|
||||
// Abort workflow
|
||||
await workflowService.abortWorkflow();
|
||||
|
||||
context.log.info('Workflow state deleted');
|
||||
|
||||
return handleApiResult({
|
||||
result: {
|
||||
success: true,
|
||||
data: {
|
||||
message: 'Workflow aborted',
|
||||
hadWorkflow: true,
|
||||
taskId: status.taskId,
|
||||
branchName: status.branchName,
|
||||
note: 'Git branch and code changes were preserved. You can manually clean them up if needed.'
|
||||
}
|
||||
},
|
||||
log: context.log,
|
||||
projectRoot
|
||||
});
|
||||
} catch (error: any) {
|
||||
context.log.error(`Error in autopilot-abort: ${error.message}`);
|
||||
if (error.stack) {
|
||||
context.log.debug(error.stack);
|
||||
}
|
||||
return handleApiResult({
|
||||
result: {
|
||||
success: false,
|
||||
error: { message: `Failed to abort workflow: ${error.message}` }
|
||||
},
|
||||
log: context.log,
|
||||
projectRoot
|
||||
});
|
||||
}
|
||||
}
|
||||
)
|
||||
});
|
||||
}
|
||||
242
apps/mcp/src/tools/autopilot/commit.tool.ts
Normal file
@@ -0,0 +1,242 @@
|
||||
/**
|
||||
* @fileoverview autopilot-commit MCP tool
|
||||
* Create a git commit with automatic staging and message generation
|
||||
*/
|
||||
|
||||
// TEMPORARY: Using zod/v3 for Draft-07 JSON Schema compatibility with FastMCP's zod-to-json-schema
|
||||
// TODO: Revert to 'zod' when MCP spec issue is resolved (see PR #1323)
|
||||
import { z } from 'zod/v3';
|
||||
import {
|
||||
handleApiResult,
|
||||
withNormalizedProjectRoot
|
||||
} from '../../shared/utils.js';
|
||||
import type { MCPContext } from '../../shared/types.js';
|
||||
import { WorkflowService, GitAdapter, CommitMessageGenerator } from '@tm/core';
|
||||
import type { FastMCP } from 'fastmcp';
|
||||
|
||||
const CommitSchema = z.object({
|
||||
projectRoot: z
|
||||
.string()
|
||||
.describe('Absolute path to the project root directory'),
|
||||
files: z
|
||||
.array(z.string())
|
||||
.optional()
|
||||
.describe(
|
||||
'Specific files to stage (relative to project root). If not provided, stages all changes.'
|
||||
),
|
||||
customMessage: z
|
||||
.string()
|
||||
.optional()
|
||||
.describe('Custom commit message to use instead of auto-generated message')
|
||||
});
|
||||
|
||||
type CommitArgs = z.infer<typeof CommitSchema>;
|
||||
|
||||
/**
|
||||
* Register the autopilot_commit tool with the MCP server
|
||||
*/
|
||||
export function registerAutopilotCommitTool(server: FastMCP) {
|
||||
server.addTool({
|
||||
name: 'autopilot_commit',
|
||||
description:
|
||||
'Create a git commit with automatic staging, message generation, and metadata embedding. Generates appropriate commit messages based on subtask context and TDD phase.',
|
||||
parameters: CommitSchema,
|
||||
execute: withNormalizedProjectRoot(
|
||||
async (args: CommitArgs, context: MCPContext) => {
|
||||
const { projectRoot, files, customMessage } = args;
|
||||
|
||||
try {
|
||||
context.log.info(`Creating commit for workflow in ${projectRoot}`);
|
||||
|
||||
const workflowService = new WorkflowService(projectRoot);
|
||||
|
||||
// Check if workflow exists
|
||||
if (!(await workflowService.hasWorkflow())) {
|
||||
return handleApiResult({
|
||||
result: {
|
||||
success: false,
|
||||
error: {
|
||||
message:
|
||||
'No active workflow found. Start a workflow with autopilot_start'
|
||||
}
|
||||
},
|
||||
log: context.log,
|
||||
projectRoot
|
||||
});
|
||||
}
|
||||
|
||||
// Resume workflow
|
||||
await workflowService.resumeWorkflow();
|
||||
const status = workflowService.getStatus();
|
||||
const workflowContext = workflowService.getContext();
|
||||
|
||||
// Verify we're in COMMIT phase
|
||||
if (status.tddPhase !== 'COMMIT') {
|
||||
context.log.warn(
|
||||
`Not in COMMIT phase (currently in ${status.tddPhase})`
|
||||
);
|
||||
return handleApiResult({
|
||||
result: {
|
||||
success: false,
|
||||
error: {
|
||||
message: `Cannot commit: currently in ${status.tddPhase} phase. Complete the ${status.tddPhase} phase first using autopilot_complete_phase`
|
||||
}
|
||||
},
|
||||
log: context.log,
|
||||
projectRoot
|
||||
});
|
||||
}
|
||||
|
||||
// Verify there's an active subtask
|
||||
if (!status.currentSubtask) {
|
||||
return handleApiResult({
|
||||
result: {
|
||||
success: false,
|
||||
error: { message: 'No active subtask to commit' }
|
||||
},
|
||||
log: context.log,
|
||||
projectRoot
|
||||
});
|
||||
}
|
||||
|
||||
// Initialize git adapter
|
||||
const gitAdapter = new GitAdapter(projectRoot);
|
||||
|
||||
// Stage files
|
||||
try {
|
||||
if (files && files.length > 0) {
|
||||
await gitAdapter.stageFiles(files);
|
||||
context.log.info(`Staged ${files.length} files`);
|
||||
} else {
|
||||
await gitAdapter.stageFiles(['.']);
|
||||
context.log.info('Staged all changes');
|
||||
}
|
||||
} catch (error: any) {
|
||||
context.log.error(`Failed to stage files: ${error.message}`);
|
||||
return handleApiResult({
|
||||
result: {
|
||||
success: false,
|
||||
error: { message: `Failed to stage files: ${error.message}` }
|
||||
},
|
||||
log: context.log,
|
||||
projectRoot
|
||||
});
|
||||
}
|
||||
|
||||
// Check if there are staged changes
|
||||
const hasStagedChanges = await gitAdapter.hasStagedChanges();
|
||||
if (!hasStagedChanges) {
|
||||
context.log.warn('No staged changes to commit');
|
||||
return handleApiResult({
|
||||
result: {
|
||||
success: false,
|
||||
error: {
|
||||
message:
|
||||
'No staged changes to commit. Make code changes before committing'
|
||||
}
|
||||
},
|
||||
log: context.log,
|
||||
projectRoot
|
||||
});
|
||||
}
|
||||
|
||||
// Get git status for message generation
|
||||
const gitStatus = await gitAdapter.getStatus();
|
||||
|
||||
// Generate commit message
|
||||
let commitMessage: string;
|
||||
if (customMessage) {
|
||||
commitMessage = customMessage;
|
||||
context.log.info('Using custom commit message');
|
||||
} else {
|
||||
const messageGenerator = new CommitMessageGenerator();
|
||||
|
||||
// Determine commit type based on phase and subtask
|
||||
// RED phase = test files, GREEN phase = implementation
|
||||
const type = status.tddPhase === 'COMMIT' ? 'feat' : 'test';
|
||||
|
||||
// Use subtask title as description
|
||||
const description = status.currentSubtask.title;
|
||||
|
||||
// Construct proper CommitMessageOptions
|
||||
const options = {
|
||||
type,
|
||||
description,
|
||||
changedFiles: gitStatus.staged,
|
||||
taskId: status.taskId,
|
||||
phase: status.tddPhase,
|
||||
testsPassing: workflowContext.lastTestResults?.passed,
|
||||
testsFailing: workflowContext.lastTestResults?.failed
|
||||
};
|
||||
|
||||
commitMessage = messageGenerator.generateMessage(options);
|
||||
context.log.info('Generated commit message automatically');
|
||||
}
|
||||
|
||||
// Create commit
|
||||
try {
|
||||
await gitAdapter.createCommit(commitMessage);
|
||||
context.log.info('Commit created successfully');
|
||||
} catch (error: any) {
|
||||
context.log.error(`Failed to create commit: ${error.message}`);
|
||||
return handleApiResult({
|
||||
result: {
|
||||
success: false,
|
||||
error: { message: `Failed to create commit: ${error.message}` }
|
||||
},
|
||||
log: context.log,
|
||||
projectRoot
|
||||
});
|
||||
}
|
||||
|
||||
// Get last commit info
|
||||
const lastCommit = await gitAdapter.getLastCommit();
|
||||
|
||||
// Complete COMMIT phase and advance workflow
|
||||
const newStatus = await workflowService.commit();
|
||||
|
||||
context.log.info(
|
||||
`Commit completed. Current phase: ${newStatus.tddPhase || newStatus.phase}`
|
||||
);
|
||||
|
||||
const isComplete = newStatus.phase === 'COMPLETE';
|
||||
|
||||
// Get next action with guidance
|
||||
const nextAction = workflowService.getNextAction();
|
||||
|
||||
return handleApiResult({
|
||||
result: {
|
||||
success: true,
|
||||
data: {
|
||||
message: isComplete
|
||||
? 'Workflow completed successfully'
|
||||
: 'Commit created and workflow advanced',
|
||||
commitSha: lastCommit.sha,
|
||||
commitMessage,
|
||||
...newStatus,
|
||||
isComplete,
|
||||
nextAction: nextAction.action,
|
||||
nextSteps: nextAction.nextSteps
|
||||
}
|
||||
},
|
||||
log: context.log,
|
||||
projectRoot
|
||||
});
|
||||
} catch (error: any) {
|
||||
context.log.error(`Error in autopilot-commit: ${error.message}`);
|
||||
if (error.stack) {
|
||||
context.log.debug(error.stack);
|
||||
}
|
||||
return handleApiResult({
|
||||
result: {
|
||||
success: false,
|
||||
error: { message: `Failed to commit: ${error.message}` }
|
||||
},
|
||||
log: context.log,
|
||||
projectRoot
|
||||
});
|
||||
}
|
||||
}
|
||||
)
|
||||
});
|
||||
}
|
||||
apps/mcp/src/tools/autopilot/complete.tool.ts (new file, 154 lines)
@@ -0,0 +1,154 @@
|
||||
/**
|
||||
* @fileoverview autopilot-complete MCP tool
|
||||
* Complete the current TDD phase with test result validation
|
||||
*/
|
||||
|
||||
// TEMPORARY: Using zod/v3 for Draft-07 JSON Schema compatibility with FastMCP's zod-to-json-schema
|
||||
// TODO: Revert to 'zod' when MCP spec issue is resolved (see PR #1323)
|
||||
import { z } from 'zod/v3';
|
||||
import {
|
||||
handleApiResult,
|
||||
withNormalizedProjectRoot
|
||||
} from '../../shared/utils.js';
|
||||
import type { MCPContext } from '../../shared/types.js';
|
||||
import { WorkflowService } from '@tm/core';
|
||||
import type { FastMCP } from 'fastmcp';
|
||||
|
||||
const CompletePhaseSchema = z.object({
|
||||
projectRoot: z
|
||||
.string()
|
||||
.describe('Absolute path to the project root directory'),
|
||||
testResults: z
|
||||
.object({
|
||||
total: z.number().describe('Total number of tests'),
|
||||
passed: z.number().describe('Number of passing tests'),
|
||||
failed: z.number().describe('Number of failing tests'),
|
||||
skipped: z.number().optional().describe('Number of skipped tests')
|
||||
})
|
||||
.describe('Test results from running the test suite')
|
||||
});
|
||||
|
||||
type CompletePhaseArgs = z.infer<typeof CompletePhaseSchema>;
|
||||
|
||||
/**
|
||||
* Register the autopilot_complete_phase tool with the MCP server
|
||||
*/
|
||||
export function registerAutopilotCompleteTool(server: FastMCP) {
|
||||
server.addTool({
|
||||
name: 'autopilot_complete_phase',
|
||||
description:
|
||||
'Complete the current TDD phase (RED, GREEN, or COMMIT) with test result validation. RED phase: expects failures (if 0 failures, feature is already implemented and subtask auto-completes). GREEN phase: expects all tests passing.',
|
||||
parameters: CompletePhaseSchema,
|
||||
execute: withNormalizedProjectRoot(
|
||||
async (args: CompletePhaseArgs, context: MCPContext) => {
|
||||
const { projectRoot, testResults } = args;
|
||||
|
||||
try {
|
||||
context.log.info(
|
||||
`Completing current phase in workflow for ${projectRoot}`
|
||||
);
|
||||
|
||||
const workflowService = new WorkflowService(projectRoot);
|
||||
|
||||
// Check if workflow exists
|
||||
if (!(await workflowService.hasWorkflow())) {
|
||||
return handleApiResult({
|
||||
result: {
|
||||
success: false,
|
||||
error: {
|
||||
message:
|
||||
'No active workflow found. Start a workflow with autopilot_start'
|
||||
}
|
||||
},
|
||||
log: context.log,
|
||||
projectRoot
|
||||
});
|
||||
}
|
||||
|
||||
// Resume workflow to get current state
|
||||
await workflowService.resumeWorkflow();
|
||||
const currentStatus = workflowService.getStatus();
|
||||
|
||||
// Validate that we're in a TDD phase (RED or GREEN)
|
||||
if (!currentStatus.tddPhase) {
|
||||
return handleApiResult({
|
||||
result: {
|
||||
success: false,
|
||||
error: {
|
||||
message: `Cannot complete phase: not in a TDD phase (current phase: ${currentStatus.phase})`
|
||||
}
|
||||
},
|
||||
log: context.log,
|
||||
projectRoot
|
||||
});
|
||||
}
|
||||
|
||||
// COMMIT phase completion is handled by autopilot_commit tool
|
||||
if (currentStatus.tddPhase === 'COMMIT') {
|
||||
return handleApiResult({
|
||||
result: {
|
||||
success: false,
|
||||
error: {
|
||||
message:
|
||||
'Cannot complete COMMIT phase with this tool. Use autopilot_commit instead'
|
||||
}
|
||||
},
|
||||
log: context.log,
|
||||
projectRoot
|
||||
});
|
||||
}
|
||||
|
||||
// Map TDD phase to TestResult phase (only RED or GREEN allowed)
|
||||
const phase = currentStatus.tddPhase as 'RED' | 'GREEN';
|
||||
|
||||
// Construct full TestResult with phase
|
||||
const fullTestResults = {
|
||||
total: testResults.total,
|
||||
passed: testResults.passed,
|
||||
failed: testResults.failed,
|
||||
skipped: testResults.skipped ?? 0,
|
||||
phase
|
||||
};
|
||||
|
||||
// Complete phase with test results
|
||||
const status = await workflowService.completePhase(fullTestResults);
|
||||
const nextAction = workflowService.getNextAction();
|
||||
|
||||
context.log.info(
|
||||
`Phase completed. New phase: ${status.tddPhase || status.phase}`
|
||||
);
|
||||
|
||||
return handleApiResult({
|
||||
result: {
|
||||
success: true,
|
||||
data: {
|
||||
message: `Phase completed. Transitioned to ${status.tddPhase || status.phase}`,
|
||||
...status,
|
||||
nextAction: nextAction.action,
|
||||
actionDescription: nextAction.description,
|
||||
nextSteps: nextAction.nextSteps
|
||||
}
|
||||
},
|
||||
log: context.log,
|
||||
projectRoot
|
||||
});
|
||||
} catch (error: any) {
|
||||
context.log.error(`Error in autopilot-complete: ${error.message}`);
|
||||
if (error.stack) {
|
||||
context.log.debug(error.stack);
|
||||
}
|
||||
return handleApiResult({
|
||||
result: {
|
||||
success: false,
|
||||
error: {
|
||||
message: `Failed to complete phase: ${error.message}`
|
||||
}
|
||||
},
|
||||
log: context.log,
|
||||
projectRoot
|
||||
});
|
||||
}
|
||||
}
|
||||
)
|
||||
});
|
||||
}
|
||||
apps/mcp/src/tools/autopilot/finalize.tool.ts (new file, 116 lines)
@@ -0,0 +1,116 @@
|
||||
/**
|
||||
* @fileoverview autopilot-finalize MCP tool
|
||||
* Finalize and complete the workflow with working tree validation
|
||||
*/
|
||||
|
||||
// TEMPORARY: Using zod/v3 for Draft-07 JSON Schema compatibility with FastMCP's zod-to-json-schema
|
||||
// TODO: Revert to 'zod' when MCP spec issue is resolved (see PR #1323)
|
||||
import { z } from 'zod/v3';
|
||||
import {
|
||||
handleApiResult,
|
||||
withNormalizedProjectRoot
|
||||
} from '../../shared/utils.js';
|
||||
import type { MCPContext } from '../../shared/types.js';
|
||||
import { WorkflowService } from '@tm/core';
|
||||
import type { FastMCP } from 'fastmcp';
|
||||
|
||||
const FinalizeSchema = z.object({
|
||||
projectRoot: z
|
||||
.string()
|
||||
.describe('Absolute path to the project root directory')
|
||||
});
|
||||
|
||||
type FinalizeArgs = z.infer<typeof FinalizeSchema>;
|
||||
|
||||
/**
|
||||
* Register the autopilot_finalize tool with the MCP server
|
||||
*/
|
||||
export function registerAutopilotFinalizeTool(server: FastMCP) {
|
||||
server.addTool({
|
||||
name: 'autopilot_finalize',
|
||||
description:
|
||||
'Finalize and complete the workflow. Validates that all changes are committed and working tree is clean before marking workflow as complete.',
|
||||
parameters: FinalizeSchema,
|
||||
execute: withNormalizedProjectRoot(
|
||||
async (args: FinalizeArgs, context: MCPContext) => {
|
||||
const { projectRoot } = args;
|
||||
|
||||
try {
|
||||
context.log.info(`Finalizing workflow in ${projectRoot}`);
|
||||
|
||||
const workflowService = new WorkflowService(projectRoot);
|
||||
|
||||
// Check if workflow exists
|
||||
if (!(await workflowService.hasWorkflow())) {
|
||||
return handleApiResult({
|
||||
result: {
|
||||
success: false,
|
||||
error: {
|
||||
message:
|
||||
'No active workflow found. Start a workflow with autopilot_start'
|
||||
}
|
||||
},
|
||||
log: context.log,
|
||||
projectRoot
|
||||
});
|
||||
}
|
||||
|
||||
// Resume workflow
|
||||
await workflowService.resumeWorkflow();
|
||||
const currentStatus = workflowService.getStatus();
|
||||
|
||||
// Verify we're in FINALIZE phase
|
||||
if (currentStatus.phase !== 'FINALIZE') {
|
||||
return handleApiResult({
|
||||
result: {
|
||||
success: false,
|
||||
error: {
|
||||
message: `Cannot finalize: workflow is in ${currentStatus.phase} phase. Complete all subtasks first.`
|
||||
}
|
||||
},
|
||||
log: context.log,
|
||||
projectRoot
|
||||
});
|
||||
}
|
||||
|
||||
// Finalize workflow (validates clean working tree)
|
||||
const newStatus = await workflowService.finalizeWorkflow();
|
||||
|
||||
context.log.info('Workflow finalized successfully');
|
||||
|
||||
// Get next action
|
||||
const nextAction = workflowService.getNextAction();
|
||||
|
||||
return handleApiResult({
|
||||
result: {
|
||||
success: true,
|
||||
data: {
|
||||
message: 'Workflow completed successfully',
|
||||
...newStatus,
|
||||
nextAction: nextAction.action,
|
||||
nextSteps: nextAction.nextSteps
|
||||
}
|
||||
},
|
||||
log: context.log,
|
||||
projectRoot
|
||||
});
|
||||
} catch (error: any) {
|
||||
context.log.error(`Error in autopilot-finalize: ${error.message}`);
|
||||
if (error.stack) {
|
||||
context.log.debug(error.stack);
|
||||
}
|
||||
return handleApiResult({
|
||||
result: {
|
||||
success: false,
|
||||
error: {
|
||||
message: `Failed to finalize workflow: ${error.message}`
|
||||
}
|
||||
},
|
||||
log: context.log,
|
||||
projectRoot
|
||||
});
|
||||
}
|
||||
}
|
||||
)
|
||||
});
|
||||
}
|
||||
apps/mcp/src/tools/autopilot/index.ts (new file, 13 lines)
@@ -0,0 +1,13 @@
|
||||
/**
|
||||
* @fileoverview Autopilot MCP tools index
|
||||
* Exports all autopilot tool registration functions
|
||||
*/
|
||||
|
||||
export { registerAutopilotStartTool } from './start.tool.js';
|
||||
export { registerAutopilotResumeTool } from './resume.tool.js';
|
||||
export { registerAutopilotNextTool } from './next.tool.js';
|
||||
export { registerAutopilotStatusTool } from './status.tool.js';
|
||||
export { registerAutopilotCompleteTool } from './complete.tool.js';
|
||||
export { registerAutopilotCommitTool } from './commit.tool.js';
|
||||
export { registerAutopilotFinalizeTool } from './finalize.tool.js';
|
||||
export { registerAutopilotAbortTool } from './abort.tool.js';
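
These exports are registered with the FastMCP server elsewhere in this diff. As a rough sketch of the order in which an MCP client is expected to drive them (the `callTool` helper below is hypothetical, standing in for whatever invocation mechanism your client exposes; the argument shapes follow the schemas in the surrounding tool files):

```typescript
// Hypothetical end-to-end autopilot run from a client's point of view.
// `callTool` is a stand-in for an MCP client's tool-invocation API.
async function runAutopilotExample(
	callTool: (name: string, args: Record<string, unknown>) => Promise<unknown>
) {
	const projectRoot = '/absolute/path/to/project';

	// 1. Start a workflow for a main task (creates the branch and state machine)
	await callTool('autopilot_start', { taskId: '1', projectRoot });

	// 2. Ask what to do next, then report test results for the RED phase
	await callTool('autopilot_next', { projectRoot });
	await callTool('autopilot_complete_phase', {
		projectRoot,
		testResults: { total: 5, passed: 0, failed: 5 } // RED expects failures
	});

	// 3. Implement, then report passing results for the GREEN phase
	await callTool('autopilot_complete_phase', {
		projectRoot,
		testResults: { total: 5, passed: 5, failed: 0 } // GREEN expects all passing
	});

	// 4. Commit the subtask's work (staging, message generation, workflow advance;
	//    an optional custom commit message is also supported, see commit.tool.ts)
	await callTool('autopilot_commit', { projectRoot });

	// ...repeat steps 2-4 per subtask; once the workflow reaches FINALIZE:
	await callTool('autopilot_finalize', { projectRoot });
}
```

`autopilot_status` and `autopilot_resume` can be called at any point to inspect or restore the saved workflow state.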
|
||||
apps/mcp/src/tools/autopilot/next.tool.ts (new file, 101 lines)
@@ -0,0 +1,101 @@
|
||||
/**
|
||||
* @fileoverview autopilot-next MCP tool
|
||||
* Get the next action to perform in the TDD workflow
|
||||
*/
|
||||
|
||||
// TEMPORARY: Using zod/v3 for Draft-07 JSON Schema compatibility with FastMCP's zod-to-json-schema
|
||||
// TODO: Revert to 'zod' when MCP spec issue is resolved (see PR #1323)
|
||||
import { z } from 'zod/v3';
|
||||
import {
|
||||
handleApiResult,
|
||||
withNormalizedProjectRoot
|
||||
} from '../../shared/utils.js';
|
||||
import type { MCPContext } from '../../shared/types.js';
|
||||
import { WorkflowService } from '@tm/core';
|
||||
import type { FastMCP } from 'fastmcp';
|
||||
|
||||
const NextActionSchema = z.object({
|
||||
projectRoot: z
|
||||
.string()
|
||||
.describe('Absolute path to the project root directory')
|
||||
});
|
||||
|
||||
type NextActionArgs = z.infer<typeof NextActionSchema>;
|
||||
|
||||
/**
|
||||
* Register the autopilot_next tool with the MCP server
|
||||
*/
|
||||
export function registerAutopilotNextTool(server: FastMCP) {
|
||||
server.addTool({
|
||||
name: 'autopilot_next',
|
||||
description:
|
||||
'Get the next action to perform in the TDD workflow. Returns detailed context about what needs to be done next, including the current phase, subtask, and expected actions.',
|
||||
parameters: NextActionSchema,
|
||||
execute: withNormalizedProjectRoot(
|
||||
async (args: NextActionArgs, context: MCPContext) => {
|
||||
const { projectRoot } = args;
|
||||
|
||||
try {
|
||||
context.log.info(
|
||||
`Getting next action for workflow in ${projectRoot}`
|
||||
);
|
||||
|
||||
const workflowService = new WorkflowService(projectRoot);
|
||||
|
||||
// Check if workflow exists
|
||||
if (!(await workflowService.hasWorkflow())) {
|
||||
return handleApiResult({
|
||||
result: {
|
||||
success: false,
|
||||
error: {
|
||||
message:
|
||||
'No active workflow found. Start a workflow with autopilot_start'
|
||||
}
|
||||
},
|
||||
log: context.log,
|
||||
projectRoot
|
||||
});
|
||||
}
|
||||
|
||||
// Resume to load state
|
||||
await workflowService.resumeWorkflow();
|
||||
|
||||
// Get next action
|
||||
const nextAction = workflowService.getNextAction();
|
||||
const status = workflowService.getStatus();
|
||||
|
||||
context.log.info(`Next action determined: ${nextAction.action}`);
|
||||
|
||||
return handleApiResult({
|
||||
result: {
|
||||
success: true,
|
||||
data: {
|
||||
action: nextAction.action,
|
||||
actionDescription: nextAction.description,
|
||||
...status,
|
||||
nextSteps: nextAction.nextSteps
|
||||
}
|
||||
},
|
||||
log: context.log,
|
||||
projectRoot
|
||||
});
|
||||
} catch (error: any) {
|
||||
context.log.error(`Error in autopilot-next: ${error.message}`);
|
||||
if (error.stack) {
|
||||
context.log.debug(error.stack);
|
||||
}
|
||||
return handleApiResult({
|
||||
result: {
|
||||
success: false,
|
||||
error: {
|
||||
message: `Failed to get next action: ${error.message}`
|
||||
}
|
||||
},
|
||||
log: context.log,
|
||||
projectRoot
|
||||
});
|
||||
}
|
||||
}
|
||||
)
|
||||
});
|
||||
}
|
||||
apps/mcp/src/tools/autopilot/resume.tool.ts (new file, 97 lines)
@@ -0,0 +1,97 @@
|
||||
/**
|
||||
* @fileoverview autopilot-resume MCP tool
|
||||
* Resume a previously started TDD workflow from saved state
|
||||
*/
|
||||
|
||||
// TEMPORARY: Using zod/v3 for Draft-07 JSON Schema compatibility with FastMCP's zod-to-json-schema
|
||||
// TODO: Revert to 'zod' when MCP spec issue is resolved (see PR #1323)
|
||||
import { z } from 'zod/v3';
|
||||
import {
|
||||
handleApiResult,
|
||||
withNormalizedProjectRoot
|
||||
} from '../../shared/utils.js';
|
||||
import type { MCPContext } from '../../shared/types.js';
|
||||
import { WorkflowService } from '@tm/core';
|
||||
import type { FastMCP } from 'fastmcp';
|
||||
|
||||
const ResumeWorkflowSchema = z.object({
|
||||
projectRoot: z
|
||||
.string()
|
||||
.describe('Absolute path to the project root directory')
|
||||
});
|
||||
|
||||
type ResumeWorkflowArgs = z.infer<typeof ResumeWorkflowSchema>;
|
||||
|
||||
/**
|
||||
* Register the autopilot_resume tool with the MCP server
|
||||
*/
|
||||
export function registerAutopilotResumeTool(server: FastMCP) {
|
||||
server.addTool({
|
||||
name: 'autopilot_resume',
|
||||
description:
|
||||
'Resume a previously started TDD workflow from saved state. Restores the workflow state machine and continues from where it left off.',
|
||||
parameters: ResumeWorkflowSchema,
|
||||
execute: withNormalizedProjectRoot(
|
||||
async (args: ResumeWorkflowArgs, context: MCPContext) => {
|
||||
const { projectRoot } = args;
|
||||
|
||||
try {
|
||||
context.log.info(`Resuming autopilot workflow in ${projectRoot}`);
|
||||
|
||||
const workflowService = new WorkflowService(projectRoot);
|
||||
|
||||
// Check if workflow exists
|
||||
if (!(await workflowService.hasWorkflow())) {
|
||||
return handleApiResult({
|
||||
result: {
|
||||
success: false,
|
||||
error: {
|
||||
message:
|
||||
'No workflow state found. Start a new workflow with autopilot_start'
|
||||
}
|
||||
},
|
||||
log: context.log,
|
||||
projectRoot
|
||||
});
|
||||
}
|
||||
|
||||
// Resume workflow
|
||||
const status = await workflowService.resumeWorkflow();
|
||||
const nextAction = workflowService.getNextAction();
|
||||
|
||||
context.log.info(
|
||||
`Workflow resumed successfully for task ${status.taskId}`
|
||||
);
|
||||
|
||||
return handleApiResult({
|
||||
result: {
|
||||
success: true,
|
||||
data: {
|
||||
message: 'Workflow resumed',
|
||||
...status,
|
||||
nextAction: nextAction.action,
|
||||
actionDescription: nextAction.description,
|
||||
nextSteps: nextAction.nextSteps
|
||||
}
|
||||
},
|
||||
log: context.log,
|
||||
projectRoot
|
||||
});
|
||||
} catch (error: any) {
|
||||
context.log.error(`Error in autopilot-resume: ${error.message}`);
|
||||
if (error.stack) {
|
||||
context.log.debug(error.stack);
|
||||
}
|
||||
return handleApiResult({
|
||||
result: {
|
||||
success: false,
|
||||
error: { message: `Failed to resume workflow: ${error.message}` }
|
||||
},
|
||||
log: context.log,
|
||||
projectRoot
|
||||
});
|
||||
}
|
||||
}
|
||||
)
|
||||
});
|
||||
}
|
||||
apps/mcp/src/tools/autopilot/start.tool.ts (new file, 199 lines)
@@ -0,0 +1,199 @@
|
||||
/**
|
||||
* @fileoverview autopilot-start MCP tool
|
||||
* Initialize and start a new TDD workflow for a task
|
||||
*/
|
||||
|
||||
// TEMPORARY: Using zod/v3 for Draft-07 JSON Schema compatibility with FastMCP's zod-to-json-schema
|
||||
// TODO: Revert to 'zod' when MCP spec issue is resolved (see PR #1323)
|
||||
import { z } from 'zod/v3';
|
||||
import {
|
||||
handleApiResult,
|
||||
withNormalizedProjectRoot
|
||||
} from '../../shared/utils.js';
|
||||
import type { MCPContext } from '../../shared/types.js';
|
||||
import { createTaskMasterCore } from '@tm/core';
|
||||
import { WorkflowService } from '@tm/core';
|
||||
import type { FastMCP } from 'fastmcp';
|
||||
|
||||
const StartWorkflowSchema = z.object({
|
||||
taskId: z
|
||||
.string()
|
||||
.describe(
|
||||
'Main task ID to start workflow for (e.g., "1", "2", "HAM-123"). Subtask IDs (e.g., "2.3", "1.1") are not allowed.'
|
||||
),
|
||||
projectRoot: z
|
||||
.string()
|
||||
.describe('Absolute path to the project root directory'),
|
||||
maxAttempts: z
|
||||
.number()
|
||||
.optional()
|
||||
.default(3)
|
||||
.describe('Maximum attempts per subtask (default: 3)'),
|
||||
force: z
|
||||
.boolean()
|
||||
.optional()
|
||||
.default(false)
|
||||
.describe('Force start even if workflow state exists')
|
||||
});
|
||||
|
||||
type StartWorkflowArgs = z.infer<typeof StartWorkflowSchema>;
|
||||
|
||||
/**
|
||||
* Check if a task ID is a main task (not a subtask)
|
||||
* Main tasks: "1", "2", "HAM-123", "PROJ-456"
|
||||
* Subtasks: "1.1", "2.3", "HAM-123.1"
|
||||
*/
|
||||
function isMainTaskId(taskId: string): boolean {
|
||||
// A main task has no dots in the ID after the optional prefix
|
||||
// Examples: "1" ✓, "HAM-123" ✓, "1.1" ✗, "HAM-123.1" ✗
|
||||
const parts = taskId.split('.');
|
||||
return parts.length === 1;
|
||||
}
|
||||
|
||||
/**
|
||||
* Register the autopilot_start tool with the MCP server
|
||||
*/
|
||||
export function registerAutopilotStartTool(server: FastMCP) {
|
||||
server.addTool({
|
||||
name: 'autopilot_start',
|
||||
description:
|
||||
'Initialize and start a new TDD workflow for a task. Creates a git branch and sets up the workflow state machine.',
|
||||
parameters: StartWorkflowSchema,
|
||||
execute: withNormalizedProjectRoot(
|
||||
async (args: StartWorkflowArgs, context: MCPContext) => {
|
||||
const { taskId, projectRoot, maxAttempts, force } = args;
|
||||
|
||||
try {
|
||||
context.log.info(
|
||||
`Starting autopilot workflow for task ${taskId} in ${projectRoot}`
|
||||
);
|
||||
|
||||
// Validate that taskId is a main task (not a subtask)
|
||||
if (!isMainTaskId(taskId)) {
|
||||
return handleApiResult({
|
||||
result: {
|
||||
success: false,
|
||||
error: {
|
||||
message: `Task ID "${taskId}" is a subtask. Autopilot workflows can only be started for main tasks (e.g., "1", "2", "HAM-123"). Please provide the parent task ID instead.`
|
||||
}
|
||||
},
|
||||
log: context.log,
|
||||
projectRoot
|
||||
});
|
||||
}
|
||||
|
||||
// Load task data and get current tag
|
||||
const core = await createTaskMasterCore({
|
||||
projectPath: projectRoot
|
||||
});
|
||||
|
||||
// Get current tag from ConfigManager
|
||||
const currentTag = core.getActiveTag();
|
||||
|
||||
const taskResult = await core.getTaskWithSubtask(taskId);
|
||||
|
||||
if (!taskResult || !taskResult.task) {
|
||||
await core.close();
|
||||
return handleApiResult({
|
||||
result: {
|
||||
success: false,
|
||||
error: { message: `Task ${taskId} not found` }
|
||||
},
|
||||
log: context.log,
|
||||
projectRoot
|
||||
});
|
||||
}
|
||||
|
||||
const task = taskResult.task;
|
||||
|
||||
// Validate task has subtasks
|
||||
if (!task.subtasks || task.subtasks.length === 0) {
|
||||
await core.close();
|
||||
return handleApiResult({
|
||||
result: {
|
||||
success: false,
|
||||
error: {
|
||||
message: `Task ${taskId} has no subtasks. Please use expand_task (with id="${taskId}") to create subtasks first. For improved results, consider running analyze_complexity before expanding the task.`
|
||||
}
|
||||
},
|
||||
log: context.log,
|
||||
projectRoot
|
||||
});
|
||||
}
|
||||
|
||||
// Initialize workflow service
|
||||
const workflowService = new WorkflowService(projectRoot);
|
||||
|
||||
// Check for existing workflow
|
||||
const hasWorkflow = await workflowService.hasWorkflow();
|
||||
if (hasWorkflow && !force) {
|
||||
context.log.warn('Workflow state already exists');
|
||||
return handleApiResult({
|
||||
result: {
|
||||
success: false,
|
||||
error: {
|
||||
message:
|
||||
'Workflow already in progress. Use force=true to override or resume the existing workflow. Suggestion: Use autopilot_resume to continue the existing workflow'
|
||||
}
|
||||
},
|
||||
log: context.log,
|
||||
projectRoot
|
||||
});
|
||||
}
|
||||
|
||||
// Start workflow
|
||||
const status = await workflowService.startWorkflow({
|
||||
taskId,
|
||||
taskTitle: task.title,
|
||||
subtasks: task.subtasks.map((st: any) => ({
|
||||
id: st.id,
|
||||
title: st.title,
|
||||
status: st.status,
|
||||
maxAttempts
|
||||
})),
|
||||
maxAttempts,
|
||||
force,
|
||||
tag: currentTag // Pass current tag for branch naming
|
||||
});
|
||||
|
||||
context.log.info(`Workflow started successfully for task ${taskId}`);
|
||||
|
||||
// Get next action with guidance from WorkflowService
|
||||
const nextAction = workflowService.getNextAction();
|
||||
|
||||
return handleApiResult({
|
||||
result: {
|
||||
success: true,
|
||||
data: {
|
||||
message: `Workflow started for task ${taskId}`,
|
||||
taskId,
|
||||
branchName: status.branchName,
|
||||
phase: status.phase,
|
||||
tddPhase: status.tddPhase,
|
||||
progress: status.progress,
|
||||
currentSubtask: status.currentSubtask,
|
||||
nextAction: nextAction.action,
|
||||
nextSteps: nextAction.nextSteps
|
||||
}
|
||||
},
|
||||
log: context.log,
|
||||
projectRoot
|
||||
});
|
||||
} catch (error: any) {
|
||||
context.log.error(`Error in autopilot-start: ${error.message}`);
|
||||
if (error.stack) {
|
||||
context.log.debug(error.stack);
|
||||
}
|
||||
return handleApiResult({
|
||||
result: {
|
||||
success: false,
|
||||
error: { message: `Failed to start workflow: ${error.message}` }
|
||||
},
|
||||
log: context.log,
|
||||
projectRoot
|
||||
});
|
||||
}
|
||||
}
|
||||
)
|
||||
});
|
||||
}
|
||||
apps/mcp/src/tools/autopilot/status.tool.ts (new file, 95 lines)
@@ -0,0 +1,95 @@
|
||||
/**
|
||||
* @fileoverview autopilot-status MCP tool
|
||||
* Get comprehensive workflow status and progress information
|
||||
*/
|
||||
|
||||
// TEMPORARY: Using zod/v3 for Draft-07 JSON Schema compatibility with FastMCP's zod-to-json-schema
|
||||
// TODO: Revert to 'zod' when MCP spec issue is resolved (see PR #1323)
|
||||
import { z } from 'zod/v3';
|
||||
import {
|
||||
handleApiResult,
|
||||
withNormalizedProjectRoot
|
||||
} from '../../shared/utils.js';
|
||||
import type { MCPContext } from '../../shared/types.js';
|
||||
import { WorkflowService } from '@tm/core';
|
||||
import type { FastMCP } from 'fastmcp';
|
||||
|
||||
const StatusSchema = z.object({
|
||||
projectRoot: z
|
||||
.string()
|
||||
.describe('Absolute path to the project root directory')
|
||||
});
|
||||
|
||||
type StatusArgs = z.infer<typeof StatusSchema>;
|
||||
|
||||
/**
|
||||
* Register the autopilot_status tool with the MCP server
|
||||
*/
|
||||
export function registerAutopilotStatusTool(server: FastMCP) {
|
||||
server.addTool({
|
||||
name: 'autopilot_status',
|
||||
description:
|
||||
'Get comprehensive workflow status including current phase, progress, subtask details, and activity history.',
|
||||
parameters: StatusSchema,
|
||||
execute: withNormalizedProjectRoot(
|
||||
async (args: StatusArgs, context: MCPContext) => {
|
||||
const { projectRoot } = args;
|
||||
|
||||
try {
|
||||
context.log.info(`Getting workflow status for ${projectRoot}`);
|
||||
|
||||
const workflowService = new WorkflowService(projectRoot);
|
||||
|
||||
// Check if workflow exists
|
||||
if (!(await workflowService.hasWorkflow())) {
|
||||
return handleApiResult({
|
||||
result: {
|
||||
success: false,
|
||||
error: {
|
||||
message:
|
||||
'No active workflow found. Start a workflow with autopilot_start'
|
||||
}
|
||||
},
|
||||
log: context.log,
|
||||
projectRoot
|
||||
});
|
||||
}
|
||||
|
||||
// Resume to load state
|
||||
await workflowService.resumeWorkflow();
|
||||
|
||||
// Get status
|
||||
const status = workflowService.getStatus();
|
||||
|
||||
context.log.info(
|
||||
`Workflow status retrieved for task ${status.taskId}`
|
||||
);
|
||||
|
||||
return handleApiResult({
|
||||
result: {
|
||||
success: true,
|
||||
data: status
|
||||
},
|
||||
log: context.log,
|
||||
projectRoot
|
||||
});
|
||||
} catch (error: any) {
|
||||
context.log.error(`Error in autopilot-status: ${error.message}`);
|
||||
if (error.stack) {
|
||||
context.log.debug(error.stack);
|
||||
}
|
||||
return handleApiResult({
|
||||
result: {
|
||||
success: false,
|
||||
error: {
|
||||
message: `Failed to get workflow status: ${error.message}`
|
||||
}
|
||||
},
|
||||
log: context.log,
|
||||
projectRoot
|
||||
});
|
||||
}
|
||||
}
|
||||
)
|
||||
});
|
||||
}
|
||||
apps/mcp/tsconfig.json (new file, 36 lines)
@@ -0,0 +1,36 @@
|
||||
{
|
||||
"compilerOptions": {
|
||||
"target": "ES2022",
|
||||
"module": "NodeNext",
|
||||
"lib": ["ES2022"],
|
||||
"declaration": true,
|
||||
"declarationMap": true,
|
||||
"sourceMap": true,
|
||||
"outDir": "./dist",
|
||||
"baseUrl": ".",
|
||||
"rootDir": "./src",
|
||||
"strict": true,
|
||||
"noImplicitAny": true,
|
||||
"strictNullChecks": true,
|
||||
"strictFunctionTypes": true,
|
||||
"strictBindCallApply": true,
|
||||
"strictPropertyInitialization": true,
|
||||
"noImplicitThis": true,
|
||||
"alwaysStrict": true,
|
||||
"noUnusedLocals": true,
|
||||
"noUnusedParameters": true,
|
||||
"noImplicitReturns": true,
|
||||
"noFallthroughCasesInSwitch": true,
|
||||
"esModuleInterop": true,
|
||||
"skipLibCheck": true,
|
||||
"forceConsistentCasingInFileNames": true,
|
||||
"moduleResolution": "NodeNext",
|
||||
"moduleDetection": "force",
|
||||
"types": ["node"],
|
||||
"resolveJsonModule": true,
|
||||
"isolatedModules": true,
|
||||
"allowImportingTsExtensions": false
|
||||
},
|
||||
"include": ["src/**/*"],
|
||||
"exclude": ["node_modules", "dist", "tests", "**/*.test.ts", "**/*.spec.ts"]
|
||||
}
|
||||
apps/mcp/vitest.config.ts (new file, 23 lines)
@@ -0,0 +1,23 @@
|
||||
import { defineConfig } from 'vitest/config';
|
||||
|
||||
export default defineConfig({
|
||||
test: {
|
||||
globals: true,
|
||||
environment: 'node',
|
||||
coverage: {
|
||||
provider: 'v8',
|
||||
reporter: ['text', 'json', 'html'],
|
||||
exclude: [
|
||||
'node_modules/',
|
||||
'dist/',
|
||||
'tests/',
|
||||
'**/*.test.ts',
|
||||
'**/*.spec.ts',
|
||||
'**/*.d.ts',
|
||||
'**/mocks/**',
|
||||
'**/fixtures/**',
|
||||
'vitest.config.ts'
|
||||
]
|
||||
}
|
||||
}
|
||||
});
|
||||
@@ -59,6 +59,76 @@ Taskmaster uses two primary methods for configuration:
|
||||
- **Migration:** Use `task-master migrate` to move this to `.taskmaster/config.json`.
|
||||
- **Deprecation:** While still supported, you'll see warnings encouraging migration to the new structure.
|
||||
|
||||
## MCP Tool Loading Configuration
|
||||
|
||||
### TASK_MASTER_TOOLS Environment Variable
|
||||
|
||||
The `TASK_MASTER_TOOLS` environment variable controls which tools are loaded by the Task Master MCP server. This allows you to optimize token usage based on your workflow needs.
|
||||
|
||||
> Note
|
||||
> Prefer setting `TASK_MASTER_TOOLS` in your MCP client's `env` block (e.g., `.cursor/mcp.json`) or in CI/deployment env. The `.env` file is reserved for API keys/endpoints; avoid persisting non-secret settings there.
|
||||
|
||||
#### Configuration Options
|
||||
|
||||
- **`all`** (default): Loads all 36 available tools (~21,000 tokens)
|
||||
- Best for: Users who need the complete feature set
|
||||
- Use when: Working with complex projects requiring all Task Master features
|
||||
- Backward compatibility: This is the default to maintain compatibility with existing installations
|
||||
|
||||
- **`standard`**: Loads 15 commonly used tools (~10,000 tokens, 50% reduction)
|
||||
- Best for: Regular task management workflows
|
||||
- Tools included: All core tools plus project initialization, complexity analysis, task generation, and more
|
||||
- Use when: You need a balanced set of features with reduced token usage
|
||||
|
||||
- **`core`** (or `lean`): Loads 7 essential tools (~5,000 tokens, 70% reduction)
|
||||
- Best for: Daily development with minimal token overhead
|
||||
- Tools included: `get_tasks`, `next_task`, `get_task`, `set_task_status`, `update_subtask`, `parse_prd`, `expand_task`
|
||||
- Use when: Working in large contexts where token usage is critical
|
||||
- Note: "lean" is an alias for "core" (same tools, token estimate and recommended use). You can refer to it as either "core" or "lean" when configuring.
|
||||
|
||||
- **Custom list**: Comma-separated list of specific tool names
|
||||
- Best for: Specialized workflows requiring specific tools
|
||||
- Example: `"get_tasks,next_task,set_task_status"`
|
||||
- Use when: You know exactly which tools you need
|
||||
|
||||
#### How to Configure
|
||||
|
||||
1. **In MCP configuration files** (`.cursor/mcp.json`, `.vscode/mcp.json`, etc.) - **Recommended**:
|
||||
|
||||
```jsonc
|
||||
{
|
||||
"mcpServers": {
|
||||
"task-master-ai": {
|
||||
"env": {
|
||||
"TASK_MASTER_TOOLS": "standard", // Set tool loading mode
|
||||
// API keys can still use .env for security
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
2. **Via Claude Code CLI**:
|
||||
|
||||
```bash
|
||||
claude mcp add task-master-ai --scope user \
|
||||
--env TASK_MASTER_TOOLS="core" \
|
||||
-- npx -y task-master-ai@latest
|
||||
```
|
||||
|
||||
3. **In CI/deployment environment variables**:
|
||||
```bash
|
||||
export TASK_MASTER_TOOLS="standard"
|
||||
node mcp-server/server.js
|
||||
```
|
||||
|
||||
#### Tool Loading Behavior
|
||||
|
||||
- When `TASK_MASTER_TOOLS` is unset or empty, the system defaults to `"all"`
|
||||
- Invalid tool names in a user-specified list are ignored (a warning is emitted for each)
|
||||
- If every tool name in a custom list is invalid, the system falls back to `"all"`
|
||||
- Tool names are case-insensitive (e.g., `"CORE"`, `"core"`, and `"Core"` are treated identically); a minimal sketch of this resolution logic follows below
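
As a rough illustration of these rules only: the identifiers below are made up for this sketch and do not mirror the actual implementation in `mcp-server/src/tools/index.js`, and the `standard` preset is merely stubbed.

```typescript
// Illustrative sketch of the resolution rules above; names are hypothetical.
const CORE_TOOLS = [
	'get_tasks',
	'next_task',
	'get_task',
	'set_task_status',
	'update_subtask',
	'parse_prd',
	'expand_task'
];

// The 'standard' preset adds further tools on top of the core set; the full
// 15-tool list is described in the docs above and not reproduced here.
const STANDARD_TOOLS = [...CORE_TOOLS /* , ...additional standard tools */];

function resolveToolMode(raw: string | undefined, allTools: string[]): string[] {
	const value = (raw ?? '').trim().toLowerCase();
	if (value === '' || value === 'all') return allTools; // unset/empty -> "all"
	if (value === 'core' || value === 'lean') return CORE_TOOLS; // "lean" aliases "core"
	if (value === 'standard') return STANDARD_TOOLS;
	// Anything else is treated as a comma-separated custom list (case-insensitive)
	const requested = value
		.split(',')
		.map((name) => name.trim())
		.filter(Boolean);
	const valid = requested.filter((name) => allTools.includes(name));
	for (const name of requested) {
		if (!allTools.includes(name)) {
			console.warn(`Ignoring unknown tool name: "${name}"`); // one warning per invalid name
		}
	}
	// If every requested name was invalid, fall back to loading all tools
	return valid.length > 0 ? valid : allTools;
}
```

The server then hands the resolved mode to tool registration, which is roughly what the `getToolsConfiguration()` and `registerTaskMasterTools()` changes later in this diff do.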
|
||||
|
||||
## Environment Variables (`.env` file or MCP `env` block - For API Keys Only)
|
||||
|
||||
- Used **exclusively** for sensitive API keys and specific endpoint URLs.
|
||||
@@ -223,10 +293,10 @@ node scripts/init.js
|
||||
```bash
|
||||
# Set MCP provider for main role
|
||||
task-master models set-main --provider mcp --model claude-3-5-sonnet-20241022
|
||||
|
||||
# Set MCP provider for research role
|
||||
task-master models set-research --provider mcp --model claude-3-opus-20240229
|
||||
|
||||
|
||||
# Verify configuration
|
||||
task-master models list
|
||||
```
|
||||
@@ -357,7 +427,7 @@ Azure OpenAI provides enterprise-grade OpenAI models through Microsoft's Azure c
|
||||
"temperature": 0.7
|
||||
},
|
||||
"fallback": {
|
||||
"provider": "azure",
|
||||
"modelId": "gpt-4o-mini",
|
||||
"maxTokens": 10000,
|
||||
"temperature": 0.7
|
||||
@@ -376,7 +446,7 @@ Azure OpenAI provides enterprise-grade OpenAI models through Microsoft's Azure c
|
||||
"models": {
|
||||
"main": {
|
||||
"provider": "azure",
|
||||
"modelId": "gpt-4o",
|
||||
"maxTokens": 16000,
|
||||
"temperature": 0.7,
|
||||
"baseURL": "https://your-resource-name.azure.com/openai/deployments"
|
||||
@@ -390,7 +460,7 @@ Azure OpenAI provides enterprise-grade OpenAI models through Microsoft's Azure c
|
||||
"fallback": {
|
||||
"provider": "azure",
|
||||
"modelId": "gpt-4o-mini",
|
||||
"maxTokens": 10000,
|
||||
"temperature": 0.7,
|
||||
"baseURL": "https://your-resource-name.azure.com/openai/deployments"
|
||||
}
|
||||
@@ -402,7 +472,7 @@ Azure OpenAI provides enterprise-grade OpenAI models through Microsoft's Azure c
|
||||
```bash
|
||||
# In .env file
|
||||
AZURE_OPENAI_API_KEY=your-azure-openai-api-key-here
|
||||
|
||||
|
||||
# Optional: Override endpoint for all Azure models
|
||||
AZURE_OPENAI_ENDPOINT=https://your-resource-name.azure.com/openai/deployments
|
||||
```
|
||||
|
||||
docs/contributor-docs/worktree-setup.md (new file, 420 lines)
@@ -0,0 +1,420 @@
|
||||
# Git Worktree Setup for Parallel Development
|
||||
|
||||
Simple git worktree setup for running multiple AI coding assistants in parallel.
|
||||
|
||||
## Why Worktrees?
|
||||
|
||||
Instead of Docker complexity, use git worktrees to create isolated working directories:
|
||||
|
||||
✅ **Editor Agnostic** - Works with Cursor, Windsurf, VS Code, Claude Code, etc.
|
||||
✅ **Simple** - No Docker, no containers, just git
|
||||
✅ **Fast** - Instant setup, shared git history
|
||||
✅ **Flexible** - Each worktree can be on a different branch
|
||||
✅ **Task Master Works** - Full access to `.taskmaster/` in each worktree
|
||||
|
||||
## Quick Start
|
||||
|
||||
### 1. Create a Worktree
|
||||
|
||||
```bash
|
||||
# Using current branch as base
|
||||
./scripts/create-worktree.sh
|
||||
|
||||
# Or specify a branch name
|
||||
./scripts/create-worktree.sh feature/my-feature
|
||||
```
|
||||
|
||||
This creates a worktree in `../claude-task-master-worktrees/<branch-name>/`
|
||||
|
||||
### 2. Open in Your Editor
|
||||
|
||||
```bash
|
||||
# Navigate to the worktree
|
||||
cd ../claude-task-master-worktrees/auto-main/ # (or whatever branch)
|
||||
|
||||
# Open with your preferred AI editor
|
||||
cursor . # Cursor
|
||||
code . # VS Code
|
||||
windsurf . # Windsurf
|
||||
claude # Claude Code CLI
|
||||
```
|
||||
|
||||
### 3. Work in Parallel
|
||||
|
||||
**Main directory** (where you are now):
|
||||
```bash
|
||||
# Keep working normally
|
||||
git checkout main
|
||||
cursor .
|
||||
```
|
||||
|
||||
**Worktree directory**:
|
||||
```bash
|
||||
cd ../claude-task-master-worktrees/auto-main/
|
||||
# Different files, different branch, same git repo
|
||||
claude
|
||||
```
|
||||
|
||||
## Usage Examples
|
||||
|
||||
### Example 1: Let Claude Work Autonomously
|
||||
|
||||
```bash
|
||||
# Create worktree
|
||||
./scripts/create-worktree.sh auto/taskmaster-work
|
||||
|
||||
# Navigate there
|
||||
cd ../claude-task-master-worktrees/auto-taskmaster-work/
|
||||
|
||||
# Start Claude
|
||||
claude
|
||||
|
||||
# In Claude session
|
||||
> Use task-master to get the next task and complete it
|
||||
```
|
||||
|
||||
**Meanwhile in your main directory:**
|
||||
```bash
|
||||
# You keep working normally
|
||||
cursor .
|
||||
# No conflicts!
|
||||
```
|
||||
|
||||
### Example 2: Multiple AI Assistants in Parallel
|
||||
|
||||
```bash
|
||||
# Create multiple worktrees
|
||||
./scripts/create-worktree.sh cursor/feature-a
|
||||
./scripts/create-worktree.sh claude/feature-b
|
||||
./scripts/create-worktree.sh windsurf/feature-c
|
||||
|
||||
# Terminal 1
|
||||
cd ../claude-task-master-worktrees/cursor-feature-a/
|
||||
cursor .
|
||||
|
||||
# Terminal 2
|
||||
cd ../claude-task-master-worktrees/claude-feature-b/
|
||||
claude
|
||||
|
||||
# Terminal 3
|
||||
cd ../claude-task-master-worktrees/windsurf-feature-c/
|
||||
windsurf .
|
||||
```
|
||||
|
||||
### Example 3: Test vs Implementation
|
||||
|
||||
```bash
|
||||
# Main directory: Write implementation
|
||||
cursor .
|
||||
|
||||
# Worktree: Have Claude write tests
|
||||
cd ../claude-task-master-worktrees/auto-main/
|
||||
claude -p "Write tests for the recent changes in the main branch"
|
||||
```
|
||||
|
||||
## How It Works
|
||||
|
||||
### Directory Structure
|
||||
|
||||
```
|
||||
/Volumes/Workspace/workspace/contrib/task-master/
|
||||
├── claude-task-master/ # Main directory (this one)
|
||||
│ ├── .git/ # Shared git repo
|
||||
│ ├── .taskmaster/ # Synced via git
|
||||
│ └── your code...
|
||||
│
|
||||
└── claude-task-master-worktrees/ # Worktrees directory
|
||||
├── auto-main/ # Worktree 1
|
||||
│ ├── .git -> (points to main .git)
|
||||
│ ├── .taskmaster/ # Same tasks, synced
|
||||
│ └── your code... (on branch auto/main)
|
||||
│
|
||||
└── feature-x/ # Worktree 2
|
||||
├── .git -> (points to main .git)
|
||||
├── .taskmaster/
|
||||
└── your code... (on branch feature/x)
|
||||
```
|
||||
|
||||
### Shared Git Repository
|
||||
|
||||
All worktrees share the same `.git`:
|
||||
- Commits in one worktree are immediately visible in others
|
||||
- Branches are shared
|
||||
- Git history is shared
|
||||
- Only the working files differ
|
||||
|
||||
## Task Master in Worktrees
|
||||
|
||||
Task Master works perfectly in worktrees:
|
||||
|
||||
```bash
|
||||
# In any worktree
|
||||
task-master list # Same tasks
|
||||
task-master next # Same task queue
|
||||
task-master show 1.2 # Same task data
|
||||
|
||||
# Changes are shared (if committed/pushed)
|
||||
```
|
||||
|
||||
### Recommended Workflow
|
||||
|
||||
Use **tags** to separate task contexts:
|
||||
|
||||
```bash
|
||||
# Main directory - use default tag
|
||||
task-master list
|
||||
|
||||
# Worktree 1 - use separate tag
|
||||
cd ../claude-task-master-worktrees/auto-main/
|
||||
task-master add-tag --name=claude-auto
|
||||
task-master use-tag --name=claude-auto
|
||||
task-master list # Shows claude-auto tasks only
|
||||
```
|
||||
|
||||
## Managing Worktrees
|
||||
|
||||
### List All Worktrees
|
||||
|
||||
```bash
|
||||
./scripts/list-worktrees.sh
|
||||
|
||||
# Or directly with git
|
||||
git worktree list
|
||||
```
|
||||
|
||||
### Remove a Worktree
|
||||
|
||||
```bash
|
||||
# Remove specific worktree
|
||||
git worktree remove ../claude-task-master-worktrees/auto-main/
|
||||
|
||||
# Or if there are uncommitted changes, force it
|
||||
git worktree remove --force ../claude-task-master-worktrees/auto-main/
|
||||
```
|
||||
|
||||
### Sync Changes Between Worktrees
|
||||
|
||||
Changes are automatically synced through git:
|
||||
|
||||
```bash
|
||||
# In worktree
|
||||
git add .
|
||||
git commit -m "feat: implement feature"
|
||||
git push
|
||||
|
||||
# In main directory
|
||||
git pull
|
||||
# Changes are now available
|
||||
```
|
||||
|
||||
## Common Workflows
|
||||
|
||||
### 1. Autonomous Claude with Task Master
|
||||
|
||||
**Setup:**
|
||||
```bash
|
||||
./scripts/create-worktree.sh auto/claude-work
|
||||
cd ../claude-task-master-worktrees/auto-claude-work/
|
||||
```
|
||||
|
||||
**Run:**
|
||||
```bash
|
||||
# Copy the autonomous script
|
||||
cp ../claude-task-master/run-autonomous-tasks.sh .
|
||||
|
||||
# Run Claude autonomously
|
||||
./run-autonomous-tasks.sh
|
||||
```
|
||||
|
||||
**Monitor from main directory:**
|
||||
```bash
|
||||
# In another terminal, in main directory
|
||||
watch -n 5 "task-master list"
|
||||
```
|
||||
|
||||
### 2. Code Review Workflow
|
||||
|
||||
**Main directory:**
|
||||
```bash
|
||||
# You write code
|
||||
cursor .
|
||||
git add .
|
||||
git commit -m "feat: new feature"
|
||||
```
|
||||
|
||||
**Worktree:**
|
||||
```bash
|
||||
cd ../claude-task-master-worktrees/auto-main/
|
||||
git pull
|
||||
|
||||
# Have Claude review
|
||||
claude -p "Review the latest commit and suggest improvements"
|
||||
```
|
||||
|
||||
### 3. Parallel Feature Development
|
||||
|
||||
**Worktree 1 (Backend):**
|
||||
```bash
|
||||
./scripts/create-worktree.sh backend/api
|
||||
cd ../claude-task-master-worktrees/backend-api/
|
||||
cursor .
|
||||
# Work on API
|
||||
```
|
||||
|
||||
**Worktree 2 (Frontend):**
|
||||
```bash
|
||||
./scripts/create-worktree.sh frontend/ui
|
||||
cd ../claude-task-master-worktrees/frontend-ui/
|
||||
windsurf .
|
||||
# Work on UI
|
||||
```
|
||||
|
||||
**Main directory:**
|
||||
```bash
|
||||
# Monitor and merge
|
||||
git log --all --graph --oneline
|
||||
```
|
||||
|
||||
## Tips
|
||||
|
||||
### 1. Branch Naming Convention
|
||||
|
||||
Use prefixes to organize:
|
||||
- `auto/*` - For autonomous AI work
|
||||
- `cursor/*` - For Cursor-specific features
|
||||
- `claude/*` - For Claude-specific features
|
||||
- `review/*` - For code review worktrees
|
||||
|
||||
### 2. Commit Often in Worktrees
|
||||
|
||||
Worktrees make it easy to try things:
|
||||
```bash
|
||||
# In worktree
|
||||
git commit -m "experiment: trying approach X"
|
||||
# If it doesn't work, just delete the worktree
|
||||
git worktree remove .
|
||||
```
|
||||
|
||||
### 3. Use Different npm Dependencies
|
||||
|
||||
Each worktree can have different `node_modules`:
|
||||
```bash
|
||||
# Main directory
|
||||
npm install
|
||||
|
||||
# Worktree (different dependencies)
|
||||
cd ../claude-task-master-worktrees/auto-main/
|
||||
npm install
|
||||
# Installs independently
|
||||
```
|
||||
|
||||
### 4. .env Files
|
||||
|
||||
Each worktree can have its own `.env`:
|
||||
```bash
|
||||
# Main directory
|
||||
echo "API_URL=http://localhost:3000" > .env
|
||||
|
||||
# Worktree
|
||||
cd ../claude-task-master-worktrees/auto-main/
|
||||
echo "API_URL=http://localhost:4000" > .env
|
||||
# Different config!
|
||||
```
|
||||
|
||||
## Cleanup
|
||||
|
||||
### Remove All Worktrees
|
||||
|
||||
```bash
|
||||
# List and manually remove
|
||||
./scripts/list-worktrees.sh
|
||||
|
||||
# Remove each one
|
||||
git worktree remove ../claude-task-master-worktrees/auto-main/
|
||||
git worktree remove ../claude-task-master-worktrees/feature-x/
|
||||
|
||||
# Or remove all at once (careful!)
|
||||
rm -rf ../claude-task-master-worktrees/
|
||||
git worktree prune # Clean up git's worktree metadata
|
||||
```
|
||||
|
||||
### Delete Branches After Merging
|
||||
|
||||
```bash
|
||||
# After merging/done with branches
|
||||
git branch -d auto/claude-work
|
||||
git push origin --delete auto/claude-work
|
||||
```
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
### "Cannot create worktree: already exists"
|
||||
|
||||
```bash
|
||||
# Remove the existing worktree first
|
||||
git worktree remove ../claude-task-master-worktrees/auto-main/
|
||||
```
|
||||
|
||||
### "Branch already checked out"
|
||||
|
||||
Git won't let you check out the same branch in multiple worktrees:
|
||||
```bash
|
||||
# Use a different branch name
|
||||
./scripts/create-worktree.sh auto/main-2
|
||||
```
|
||||
|
||||
### Changes Not Syncing
|
||||
|
||||
Worktrees don't auto-sync files. Use git:
|
||||
```bash
|
||||
# In worktree with changes
|
||||
git add .
|
||||
git commit -m "changes"
|
||||
git push
|
||||
|
||||
# In other worktree
|
||||
git pull
|
||||
```
|
||||
|
||||
### npm install Fails
|
||||
|
||||
Each worktree needs its own `node_modules`:
|
||||
```bash
|
||||
cd ../claude-task-master-worktrees/auto-main/
|
||||
npm install
|
||||
```
|
||||
|
||||
## Comparison to Docker
|
||||
|
||||
| Feature | Git Worktrees | Docker |
|
||||
|---------|---------------|--------|
|
||||
| Setup time | Instant | Minutes (build) |
|
||||
| Disk usage | Minimal (shared .git) | GBs per container |
|
||||
| Editor support | Native (any editor) | Limited (need special setup) |
|
||||
| File sync | Via git | Via volumes (can be slow) |
|
||||
| Resource usage | None (native) | RAM/CPU overhead |
|
||||
| Complexity | Simple (just git) | Complex (Dockerfile, compose, etc.) |
|
||||
| npm install | Per worktree | Per container |
|
||||
| AI editor support | ✅ All editors work | ⚠️ Need web-based or special config |
|
||||
|
||||
**TL;DR: Worktrees are simpler, faster, and more flexible for this use case.**
|
||||
|
||||
---
|
||||
|
||||
## Summary
|
||||
|
||||
```bash
|
||||
# 1. Create worktree
|
||||
./scripts/create-worktree.sh auto/claude-work
|
||||
|
||||
# 2. Open in AI editor
|
||||
cd ../claude-task-master-worktrees/auto-claude-work/
|
||||
cursor . # or claude, windsurf, code, etc.
|
||||
|
||||
# 3. Work in parallel
|
||||
# Main directory: You work
|
||||
# Worktree: AI works
|
||||
# No conflicts!
|
||||
```
|
||||
|
||||
**Simple, fast, editor-agnostic.** 🚀
|
||||
@@ -1,4 +1,4 @@
|
||||
-# Available Models as of October 5, 2025
+# Available Models as of October 18, 2025
|
||||
|
||||
## Main Models
|
||||
|
||||
@@ -8,8 +8,11 @@
|
||||
| anthropic | claude-opus-4-20250514 | 0.725 | 15 | 75 |
|
||||
| anthropic | claude-3-7-sonnet-20250219 | 0.623 | 3 | 15 |
|
||||
| anthropic | claude-3-5-sonnet-20241022 | 0.49 | 3 | 15 |
|
||||
| anthropic | claude-sonnet-4-5-20250929 | 0.73 | 3 | 15 |
|
||||
| anthropic | claude-haiku-4-5-20251001 | 0.45 | 1 | 5 |
|
||||
| claude-code | opus | 0.725 | 0 | 0 |
|
||||
| claude-code | sonnet | 0.727 | 0 | 0 |
|
||||
| claude-code | haiku | 0.45 | 0 | 0 |
|
||||
| codex-cli | gpt-5 | 0.749 | 0 | 0 |
|
||||
| codex-cli | gpt-5-codex | 0.749 | 0 | 0 |
|
||||
| mcp | mcp-sampling | — | 0 | 0 |
|
||||
@@ -102,6 +105,7 @@
|
||||
| ----------- | -------------------------------------------- | --------- | ---------- | ----------- |
|
||||
| claude-code | opus | 0.725 | 0 | 0 |
|
||||
| claude-code | sonnet | 0.727 | 0 | 0 |
|
||||
| claude-code | haiku | 0.45 | 0 | 0 |
|
||||
| codex-cli | gpt-5 | 0.749 | 0 | 0 |
|
||||
| codex-cli | gpt-5-codex | 0.749 | 0 | 0 |
|
||||
| mcp | mcp-sampling | — | 0 | 0 |
|
||||
@@ -142,8 +146,11 @@
|
||||
| anthropic | claude-opus-4-20250514 | 0.725 | 15 | 75 |
|
||||
| anthropic | claude-3-7-sonnet-20250219 | 0.623 | 3 | 15 |
|
||||
| anthropic | claude-3-5-sonnet-20241022 | 0.49 | 3 | 15 |
|
||||
| anthropic | claude-sonnet-4-5-20250929 | 0.73 | 3 | 15 |
|
||||
| anthropic | claude-haiku-4-5-20251001 | 0.45 | 1 | 5 |
|
||||
| claude-code | opus | 0.725 | 0 | 0 |
|
||||
| claude-code | sonnet | 0.727 | 0 | 0 |
|
||||
| claude-code | haiku | 0.45 | 0 | 0 |
|
||||
| codex-cli | gpt-5 | 0.749 | 0 | 0 |
|
||||
| codex-cli | gpt-5-codex | 0.749 | 0 | 0 |
|
||||
| mcp | mcp-sampling | — | 0 | 0 |
|
||||
|
||||
images/hamster-hiring.png (new binary file, 130 KiB; not shown)
@@ -7,6 +7,9 @@ import logger from './src/logger.js';
|
||||
// Load environment variables
|
||||
dotenv.config();
|
||||
|
||||
// Set MCP mode to silence tm-core console output
|
||||
process.env.TASK_MASTER_MCP = 'true';
|
||||
|
||||
/**
|
||||
* Start the MCP server
|
||||
*/
|
||||
|
||||
@@ -4,12 +4,14 @@ import dotenv from 'dotenv';
|
||||
import { fileURLToPath } from 'url';
|
||||
import fs from 'fs';
|
||||
import logger from './logger.js';
|
||||
-import { registerTaskMasterTools } from './tools/index.js';
+import {
+	registerTaskMasterTools,
+	getToolsConfiguration
+} from './tools/index.js';
|
||||
import ProviderRegistry from '../../src/provider-registry/index.js';
|
||||
import { MCPProvider } from './providers/mcp-provider.js';
|
||||
import packageJson from '../../package.json' with { type: 'json' };
|
||||
|
||||
// Load environment variables
|
||||
dotenv.config();
|
||||
|
||||
// Constants
|
||||
@@ -29,12 +31,10 @@ class TaskMasterMCPServer {
|
||||
this.server = new FastMCP(this.options);
|
||||
this.initialized = false;
|
||||
|
||||
// Bind methods
|
||||
this.init = this.init.bind(this);
|
||||
this.start = this.start.bind(this);
|
||||
this.stop = this.stop.bind(this);
|
||||
|
||||
// Setup logging
|
||||
this.logger = logger;
|
||||
}
|
||||
|
||||
@@ -44,8 +44,34 @@ class TaskMasterMCPServer {
|
||||
async init() {
|
||||
if (this.initialized) return;
|
||||
|
||||
-// Pass the manager instance to the tool registration function
-registerTaskMasterTools(this.server, this.asyncManager);
|
||||
const normalizedToolMode = getToolsConfiguration();
|
||||
|
||||
this.logger.info('Task Master MCP Server starting...');
|
||||
this.logger.info(`Tool mode configuration: ${normalizedToolMode}`);
|
||||
|
||||
const registrationResult = registerTaskMasterTools(
|
||||
this.server,
|
||||
normalizedToolMode
|
||||
);
|
||||
|
||||
this.logger.info(
|
||||
`Normalized tool mode: ${registrationResult.normalizedMode}`
|
||||
);
|
||||
this.logger.info(
|
||||
`Registered ${registrationResult.registeredTools.length} tools successfully`
|
||||
);
|
||||
|
||||
if (registrationResult.registeredTools.length > 0) {
|
||||
this.logger.debug(
|
||||
`Registered tools: ${registrationResult.registeredTools.join(', ')}`
|
||||
);
|
||||
}
|
||||
|
||||
if (registrationResult.failedTools.length > 0) {
|
||||
this.logger.warn(
|
||||
`Failed to register ${registrationResult.failedTools.length} tools: ${registrationResult.failedTools.join(', ')}`
|
||||
);
|
||||
}
|
||||
|
||||
this.initialized = true;
|
||||
|
||||
|
||||
@@ -78,7 +78,7 @@ function log(level, ...args) {
|
||||
// is responsible for directing logs correctly (e.g., to stderr)
|
||||
// during tool execution without upsetting the client connection.
|
||||
// Logs outside of tool execution (like startup) now go to stderr as well.
|
||||
-console.log(prefix, ...coloredArgs);
+console.error(prefix, ...coloredArgs);
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
mcp-server/src/tools/README-ZOD-V3.md (new file, 67 lines)
@@ -0,0 +1,67 @@
|
||||
# Why MCP Tools Use Zod v3
|
||||
|
||||
## Problem
|
||||
|
||||
- **FastMCP** uses `xsschema` to convert schemas → outputs JSON Schema **Draft 2020-12**
|
||||
- **MCP clients** (Augment IDE, gemini-cli, etc.) only support **Draft-07**
|
||||
- Using Zod v4 in tools causes "vendor undefined" errors and tool discovery failures
|
||||
|
||||
## Temporary Solution
|
||||
|
||||
All MCP tool files import from `zod/v3` instead of `zod`:
|
||||
|
||||
```javascript
|
||||
import { z } from 'zod/v3'; // ✅ Draft-07 compatible
|
||||
// NOT: import { z } from 'zod'; // ❌ Would use Draft 2020-12
|
||||
```
|
||||
|
||||
### Why This Works
|
||||
|
||||
- Zod v4 ships with v3 compatibility at `zod/v3`
|
||||
- FastMCP + zod-to-json-schema converts Zod v3 schemas → **Draft-07**
|
||||
- This ensures MCP clients can discover and use our tools
|
||||
|
||||
### What This Means
|
||||
|
||||
- ✅ **MCP tools** → use `zod/v3` (this directory)
|
||||
- ✅ **Rest of codebase** → uses `zod` (Zod v4)
|
||||
- ✅ **No conflicts** → they're from the same package, just different versions
|
||||
|
||||
## When Can We Remove This?
|
||||
|
||||
This workaround can be removed when **either**:
|
||||
|
||||
1. **FastMCP adds JSON Schema version configuration**
|
||||
- e.g., `new FastMCP({ jsonSchema: { target: 'draft-07' } })`
|
||||
- Tracking: https://github.com/punkpeye/fastmcp/issues/189
|
||||
|
||||
2. **MCP spec adds Draft 2020-12 support**
|
||||
- Unlikely in the short term
|
||||
|
||||
3. **xsschema adds version targeting**
|
||||
- Would allow FastMCP to use Draft-07
|
||||
|
||||
## How to Maintain
|
||||
|
||||
When adding new MCP tools:
|
||||
|
||||
```javascript
|
||||
// ✅ CORRECT
|
||||
import { z } from 'zod/v3';
|
||||
|
||||
server.addTool({
|
||||
name: 'my_tool',
|
||||
parameters: z.object({ ... }), // Will use Draft-07
|
||||
execute: async (args) => { ... }
|
||||
});
|
||||
```
|
||||
|
||||
```javascript
|
||||
// ❌ WRONG - Will break MCP client compatibility
|
||||
import { z } from 'zod'; // Don't do this in mcp-server/src/tools/
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
**Last Updated:** 2025-10-18
|
||||
**Affects:** All files in `mcp-server/src/tools/`
|
||||
Some files were not shown because too many files have changed in this diff.