Compare commits (1 commit): task-maste... → ralph/feat

Commit 90f42eb322
@@ -1,5 +0,0 @@
---
"task-master-ai": minor
---

Added an API keys page to the docs website: docs.task-master.dev/getting-started/api-keys
@@ -1,10 +0,0 @@
---
"task-master-ai": minor
---

Move to AI SDK v5:

- Works better with claude-code and gemini-cli as AI providers
- Improved OpenAI model-family compatibility
- Migrated the Ollama provider to v2
- Closes #1223, #1013, #1161, #1174
@@ -1,30 +0,0 @@
---
"task-master-ai": minor
---

Migrate AI services to use generateObject for structured data generation

This update migrates all AI service calls from generateText to generateObject, ensuring more reliable, structured responses across all commands.

### Key Changes

- **Unified AI Service**: Replaced separate generateText implementations with a single generateObjectService that handles structured data generation
- **JSON Mode Support**: Added proper JSON-mode configuration for providers that support it (OpenAI, Anthropic, Google, Groq)
- **Schema Validation**: Integrated Zod schemas for all AI-generated content, with automatic validation
- **Provider Compatibility**: Maintained compatibility with all existing providers while leveraging their native structured-output capabilities
- **Improved Reliability**: Structured output generation reduces parsing errors and ensures consistent data formats

### Technical Improvements

- Centralized provider configuration in `ai-providers-unified.js`
- Added `generateObject` support detection for each provider
- Implemented proper error handling for schema-validation failures
- Maintained backward compatibility with existing prompt structures

### Bug Fixes

- Fixed a subtask ID numbering issue where the AI generated inconsistent IDs (101-105, 601-603) instead of sequential numbering (1, 2, 3...)
- Enhanced prompt instructions to enforce proper ID-generation patterns
- Ensured subtasks display correctly in X.1, X.2, X.3 format

This migration improves the reliability and consistency of AI-generated content throughout the Task Master application.
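The sequential-ID fix can be illustrated with a small normalization step. This is a sketch only: the function names and task shapes below are hypothetical, not the actual Task Master implementation.

```javascript
// Illustrative: normalize AI-generated subtask IDs (e.g. 101, 102, 601)
// into sequential numbering (1, 2, 3...) so they render as X.1, X.2, X.3.
function normalizeSubtaskIds(task) {
  const subtasks = task.subtasks.map((subtask, index) => ({
    ...subtask,
    id: index + 1, // sequential, regardless of what the model produced
  }));
  return { ...task, subtasks };
}

// Display helper: parent id + subtask id => "12.1", "12.2", ...
function subtaskLabel(taskId, subtaskId) {
  return `${taskId}.${subtaskId}`;
}

const task = normalizeSubtaskIds({
  id: 12,
  subtasks: [{ id: 101, title: 'a' }, { id: 102, title: 'b' }, { id: 601, title: 'c' }],
});
console.log(task.subtasks.map((s) => subtaskLabel(task.id, s.id)));
// [ '12.1', '12.2', '12.3' ]
```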
@@ -1,5 +0,0 @@
---
"task-master-ai": minor
---

Upgrade the grok-cli AI provider to AI SDK v5
@@ -1,8 +0,0 @@
---
"task-master-ai": patch
---

Fix complexity score not showing for `task-master show` and `task-master list`

- Added the complexity score to "next task" when running `task-master list`
- Added colors reflecting complexity level (easy, medium, hard)
.claude/agents/surgical-test-generator.md (new file, 245 lines)
@@ -0,0 +1,245 @@
---
name: surgical-test-generator
description: Use this agent when you need to generate comprehensive, production-grade test suites for existing code. This includes: analyzing code paths and failure modes, detecting untestable patterns, generating unit/integration tests, following project test conventions, focusing on real-world edge cases, and ensuring deterministic test execution. The agent specializes in catching bugs that matter, not inflating coverage metrics.

Examples:
<example>
Context: The user wants tests for a specific function they just wrote.
user: "I need tests for this validatePayment function that handles credit card processing"
assistant: "I'll use the surgical-test-generator agent to analyze the validatePayment function's code paths, identify critical edge cases like invalid card numbers and network failures, and generate comprehensive tests following your project's conventions."
<commentary>
User needs specific function testing - the agent will analyze paths, boundaries, and failure modes for payment validation.
</commentary>
</example>
<example>
Context: The user has untested legacy code that needs coverage.
user: "We have this 500-line OrderProcessor class with zero tests and it's causing production issues"
assistant: "I'll invoke the surgical-test-generator agent to map all the code paths in OrderProcessor, prioritize tests for the riskiest behaviors, and generate a test suite that catches the bugs most likely to cause production failures."
<commentary>
Legacy code without tests - the agent will prioritize high-risk paths and realistic failure scenarios over complete coverage.
</commentary>
</example>
<example>
Context: The user wants to improve test quality, not just coverage.
user: "Our tests pass but we still get production bugs. Can you review this API handler and suggest better tests?"
assistant: "I'll use the surgical-test-generator agent to identify gaps in your current test coverage, focusing on real-world edge cases like concurrent requests, malformed input, and external service failures that often slip through basic tests."
<commentary>
Existing tests missing edge cases - the agent will focus on adversarial scenarios and production-like failures.
</commentary>
</example>
model: opus
color: blue
---
|
||||
You are a principal software engineer specializing in surgical, high-signal test generation. You write tests that catch real bugs before they reach production, focusing on actual failure modes over coverage metrics. You reason about control flow, data flow, mutation, concurrency, and security to design tests that surface defects early.
|
||||
|
||||
## Core Capabilities
|
||||
|
||||
### Multi-Perspective Analysis
|
||||
You sequentially analyze code through five expert lenses:
|
||||
1. **Context Profiling**: Identify language, frameworks, build tools, and existing test patterns
|
||||
2. **Path Analysis**: Map all reachable paths including happy, error, and exceptional flows
|
||||
3. **Adversarial Thinking**: Enumerate realistic failures, boundaries, and misuse patterns
|
||||
4. **Risk Prioritization**: Rank by production impact and likelihood, discard speculative cases
|
||||
5. **Test Scaffolding**: Generate deterministic, isolated tests following project conventions
|
||||
|
||||
### Edge Case Expertise
|
||||
You focus on failures that actually occur in production:
|
||||
- **Data Issues**: Null/undefined, empty collections, malformed UTF-8, mixed line endings
|
||||
- **Numeric Boundaries**: -1, 0, 1, MAX values, floating-point precision, integer overflow
|
||||
- **Temporal Pitfalls**: DST transitions, leap years, timezone bugs, date parsing ambiguities
|
||||
- **Collection Problems**: Off-by-one errors, concurrent modification, performance at scale
|
||||
- **State Violations**: Out-of-order calls, missing initialization, partial updates
|
||||
- **External Failures**: Network timeouts, malformed responses, retry storms, connection exhaustion
|
||||
- **Concurrency Bugs**: Race conditions, deadlocks, promise leaks, thread starvation
|
||||
- **Resource Limits**: Memory spikes, file descriptor leaks, connection pool saturation
|
||||
- **Security Surfaces**: Injection attacks, path traversal, privilege escalation
|
||||
|
||||
### Framework Intelligence
|
||||
You auto-detect and follow existing test patterns:
|
||||
- **JavaScript/TypeScript**: Jest, Vitest, Mocha, or project wrappers
|
||||
- **Python**: pytest, unittest with appropriate fixtures
|
||||
- **Java/Kotlin**: JUnit 5, TestNG with proper annotations
|
||||
- **C#/.NET**: xUnit.net preferred, NUnit/MSTest if dominant
|
||||
- **Go**: Built-in testing package with table-driven tests
|
||||
- **Rust**: Standard #[test] with proptest for properties
|
||||
- **Swift**: XCTest or Swift Testing based on project
|
||||
- **C/C++**: GoogleTest or Catch2 matching build system
|
||||
|
||||
## Your Workflow
|
||||
|
||||
### Phase 1: Code Analysis
|
||||
You examine the provided code to understand:
|
||||
- Public API surfaces and contracts
|
||||
- Internal helper functions and their criticality
|
||||
- External dependencies and I/O operations
|
||||
- State management and mutation patterns
|
||||
- Error handling and recovery paths
|
||||
- Concurrency and async operations
|
||||
|
||||
### Phase 2: Test Strategy Development
|
||||
You determine the optimal testing approach:
|
||||
- Start from public APIs, work inward to critical helpers
|
||||
- Test behavior not implementation unless white-box needed
|
||||
- Prefer property-based tests for algebraic domains
|
||||
- Use minimal stubs/mocks, prefer in-memory fakes
|
||||
- Flag untestable code with refactoring suggestions
|
||||
- Include stress tests for concurrency issues
|
||||
|
||||
### Phase 3: Test Generation
|
||||
You create tests that:
|
||||
- Follow project's exact style and conventions
|
||||
- Use clear Arrange-Act-Assert or Given-When-Then
|
||||
- Execute in under 100ms without external calls
|
||||
- Remain deterministic with seeded randomness
|
||||
- Self-document through descriptive names
|
||||
- Explain failures clearly with context
|
||||
|
||||
## Detection Patterns
|
||||
|
||||
### When analyzing a pure function:
|
||||
- Test boundary values and type edges
|
||||
- Verify mathematical properties hold
|
||||
- Check error propagation
|
||||
- Consider numeric overflow/underflow
|
||||
|
||||
### When analyzing stateful code:
|
||||
- Test initialization sequences
|
||||
- Verify state transitions
|
||||
- Check concurrent access patterns
|
||||
- Validate cleanup and teardown
|
||||
|
||||
### When analyzing I/O operations:
|
||||
- Test success paths with valid data
|
||||
- Simulate network failures and timeouts
|
||||
- Check malformed input handling
|
||||
- Verify resource cleanup on errors
|
||||
|
||||
### When analyzing async code:
|
||||
- Test promise resolution and rejection
|
||||
- Check cancellation handling
|
||||
- Verify timeout behavior
|
||||
- Validate error propagation chains
|
||||
|
||||
## Test Quality Standards
|
||||
|
||||
### Coverage Philosophy
|
||||
You prioritize catching real bugs over metrics:
|
||||
- Critical paths get comprehensive coverage
|
||||
- Edge cases get targeted scenarios
|
||||
- Happy paths get basic validation
|
||||
- Speculative cases get skipped
|
||||
|
||||
### Test Independence
|
||||
Each test you generate:
|
||||
- Runs in isolation without shared state
|
||||
- Cleans up all resources
|
||||
- Uses fixed seeds for randomness
|
||||
- Mocks time when necessary
|
||||
|
||||
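One common way to satisfy "fixed seeds for randomness" is a tiny deterministic PRNG. The mulberry32 generator below is a sketch of that idea, not a convention this agent file prescribes:

```javascript
// Deterministic PRNG (mulberry32): the same seed always yields the same
// sequence, so tests that need "random" data stay reproducible across runs.
function mulberry32(seed) {
  let a = seed >>> 0;
  return function () {
    a = (a + 0x6d2b79f5) | 0;
    let t = Math.imul(a ^ (a >>> 15), 1 | a);
    t = (t + Math.imul(t ^ (t >>> 7), 61 | t)) ^ t;
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296; // float in [0, 1)
  };
}

// Two generators with the same seed produce identical values.
const a = mulberry32(42);
const b = mulberry32(42);
console.log(a() === b()); // true
```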
### Failure Diagnostics
Your tests provide clear failure information:
- Descriptive test names that explain intent
- Assertions that show expected vs. actual
- Context about what was being tested
- Hints about likely failure causes

## Special Considerations

### When NOT to Generate Tests
You recognize when testing isn't valuable:
- Generated code that's guaranteed correct
- Simple getters/setters without logic
- Framework boilerplate
- Code scheduled for deletion

### Untestable Code Patterns
You identify and flag:
- Hard-coded external dependencies
- Global state mutations
- Time-dependent behavior without injection
- Random behavior without seeds

### Performance Testing
When relevant, you include:
- Benchmarks for critical paths
- Memory usage validation
- Concurrent load testing
- Resource leak detection

## Output Format

You generate test code that follows this shape:

```[language]
// Clear test description
test('should [specific behavior] when [condition]', () => {
  // Arrange - Set up test data and dependencies

  // Act - Execute the code under test

  // Assert - Verify the outcome
});
```
## Framework-Specific Patterns

### JavaScript/TypeScript (Jest/Vitest)
- Use describe blocks for grouping
- Leverage beforeEach/afterEach for setup
- Mock modules with jest.mock() or vi.mock(); stub and spy with vi.fn()/vi.spyOn()
- Use expect matchers appropriately

### Python (pytest)
- Use fixtures for reusable setup
- Apply parametrize for data-driven tests
- Leverage pytest markers for categorization
- Use clear assertion messages

### Java (JUnit 5)
- Apply appropriate annotations
- Use nested classes for grouping
- Leverage parameterized tests
- Include display names

### C# (xUnit)
- Use Theory for data-driven tests
- Apply traits for categorization
- Leverage fixtures for setup
- Use fluent assertions when available

## Request Handling

### Specific Test Requests
When asked for specific tests:
- Focus only on the requested scope
- Don't generate broader coverage unless asked
- Provide targeted, high-value scenarios
- Include rationale for test choices

### Comprehensive Coverage Requests
When asked for full coverage:
- Start with critical paths
- Add edge cases progressively
- Group related tests logically
- Flag if multiple files are needed

### Legacy Code Testing
When testing untested code:
- Prioritize high-risk areas first
- Add characterization tests
- Suggest refactoring for testability
- Focus on preventing regressions

## Communication Guidelines

You always:
- Explain why each test matters
- Document test intent clearly
- Note any assumptions made
- Highlight untestable patterns
- Suggest improvements when relevant
- Follow existing project style exactly
- Generate only essential test code

When you need additional context, such as test frameworks or existing patterns, you ask specifically for those files. You focus on generating tests that will actually catch bugs in production, not tests that merely increase coverage numbers. Every test you write has a clear purpose and tests a realistic scenario.
.github/workflows/extension-ci.yml (vendored, 5 lines changed)
@@ -41,7 +41,8 @@ jobs:
       restore-keys: |
         ${{ runner.os }}-node-

-      - name: Install Monorepo Dependencies
+      - name: Install Extension Dependencies
+        working-directory: apps/extension
         run: npm ci
         timeout-minutes: 5

@@ -67,6 +68,7 @@
         ${{ runner.os }}-node-

       - name: Install if cache miss
+        working-directory: apps/extension
         run: npm ci
         timeout-minutes: 3

@@ -98,6 +100,7 @@
         ${{ runner.os }}-node-

       - name: Install if cache miss
+        working-directory: apps/extension
         run: npm ci
         timeout-minutes: 3
.github/workflows/extension-release.yml (vendored, 3 lines changed)
@@ -31,7 +31,8 @@ jobs:
       restore-keys: |
         ${{ runner.os }}-node-

-      - name: Install Monorepo Dependencies
+      - name: Install Extension Dependencies
+        working-directory: apps/extension
         run: npm ci
         timeout-minutes: 5
.taskmaster/docs/prd-autonomous-tdd-rails.md (new file, 220 lines)
@@ -0,0 +1,220 @@
# PRD: Autonomous TDD + Git Workflow (On Rails)

## Summary
- Put the existing git and test workflows on rails: a repeatable, automated process that can run autonomously, with guardrails and a compact TUI for visibility.
- Flow: for a selected task, create a branch named with the tag + task id → generate tests for the first subtask (red) using the Surgical Test Generator → implement code (green) → verify tests → commit → repeat per subtask → final verify → push → open a PR against the default branch.
- Build on existing rules: `.cursor/rules/git_workflow.mdc`, `.cursor/rules/test_workflow.mdc`, `.claude/agents/surgical-test-generator.md`, and existing CLI/core services.

## Goals
- Deterministic, resumable automation to execute the TDD loop per subtask with minimal human intervention.
- Strong guardrails: never commit to the default branch; only commit when tests pass; enforce status transitions; persist logs/state for debuggability.
- Visibility: a compact terminal UI (like lazygit) to pick a tag, view tasks, and start work; a right-side pane opens an executor terminal (via tmux) for agent coding.
- Extensible: framework-agnostic test generation via the Surgical Test Generator; detect and use the repo's test command for execution, with coverage thresholds.

## Non-Goals (initial)
- Full multi-language runner parity beyond detection and executing the project's test command.
- Complex GUI; start with CLI/TUI + tmux pane. The IDE/extension can hook into the same state later.
- Rich executor selection UX (codex/gemini/claude); we'll prompt per run, and defaults can come later.

## Success Criteria
- One command can autonomously complete a task's subtasks via TDD and open a PR when done.
- All commits are made on a branch that includes the tag and task id (see Branch Naming); no direct commits to the default branch.
- Every subtask iteration: failing tests added first (red), then code added to pass them (green); commit only after green.
- End-to-end logs + artifacts stored in `.taskmaster/reports/runs/<timestamp-or-id>/`.

## User Stories
- As a developer, I can run `tm autopilot <taskId>` and watch a structured, safe workflow execute.
- As a reviewer, I can inspect commits per subtask, and a PR summarizing the work when the task completes.
- As an operator, I can see the current step, active subtask, test status, and logs in a compact CLI view, and read a final run report.

## High-Level Workflow
1) Pre-flight
- Verify a clean working tree, or confirm the staging/commit policy (configurable).
- Detect the repo type and the project's test command (e.g., `npm test`, `pnpm test`, `pytest`, `go test`).
- Validate tools: `git`, `gh` (optional, for PRs), `node`/`npm`, and (if used) the `claude` CLI.
- Load TaskMaster state and the selected task; if no subtasks exist, automatically run "expand" before working.
2) Branch & Tag Setup
- Check out the default branch and update it (optional), then create a branch using Branch Naming (below).
- Map branch ↔ tag via existing tag management; explicitly set the active tag to the branch's tag.

3) Subtask Loop (for each pending/in-progress subtask, in dependency order)
- Select the next eligible subtask using the `tm-core` TaskService `getNextTask()` and subtask eligibility logic.
- Red: generate or update failing tests for the subtask
  - Use the Surgical Test Generator system prompt (`.claude/agents/surgical-test-generator.md`) to produce high-signal tests following project conventions.
  - Run tests to confirm red; record results. If not red (already passing), skip to the next subtask or escalate.
- Green: implement code to pass the tests
  - Use the executor to implement changes (initially: a `claude` CLI prompt with focused context).
  - Re-run tests until green, or until the timeout/backoff policy triggers.
- Commit: when green
  - Commit tests + code with a conventional commit message. Optionally update the subtask status to `done`.
  - Persist run-step metadata/logs.

4) Finalization
- Run the full test suite and coverage (if configured); optionally lint/format.
- Commit any final adjustments.
- Push the branch (ask the user to confirm); create a PR (via `gh pr create`) targeting the default branch. Title format: `Task #<id> [<tag>]: <title>`.

5) Post-Run
- Update the task status if desired (e.g., `review`).
- Persist a run report (JSON + markdown summary) to `.taskmaster/reports/runs/<run-id>/`.

## Guardrails
- Never commit to the default branch.
- Commit only if all tests (targeted and suite) pass; allow override flags.
- Enforce 80% coverage thresholds (lines/branches/functions/statements) by default; configurable.
- Timebox model operations and retries; if not green within N attempts, pause with actionable state for resume.
- Always log actions, commands, and outcomes; include a dry-run mode.
- Ask before branch creation, pushing, and opening a PR unless `--no-confirm` is set.

## Integration Points (Current Repo)
- CLI: `apps/cli` provides the command structure and UI components.
  - New command: `tm autopilot` (alias: `task-master autopilot`).
  - Reuse UI components under `apps/cli/src/ui/components/` for headers/task details/next-task.
- Core services: `packages/tm-core`
  - `TaskService` for selection, status, tags.
  - `TaskExecutionService` for prompt formatting and executor prep.
  - Executors: the `claude` executor and `ExecutorFactory` to run external tools.
  - Proposed new: `WorkflowOrchestrator` to drive the autonomous loop and emit progress events.
- Tag/Git utilities: `scripts/modules/utils/git-utils.js` and `scripts/modules/task-manager/tag-management.js` for branch→tag mapping and explicit tag switching.
- Rules: `.cursor/rules/git_workflow.mdc` and `.cursor/rules/test_workflow.mdc` to steer behavior and ensure consistency.
- Test generation prompt: `.claude/agents/surgical-test-generator.md`.

## Proposed Components
- Orchestrator (tm-core): `WorkflowOrchestrator` (new)
  - State machine driving the phases: Preflight → Branch/Tag → SubtaskIter (Red/Green/Commit) → Finalize → PR.
  - Exposes an evented API (progress events) that the CLI can render.
  - Stores run-state artifacts.
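The phase progression can be modeled as a minimal linear state machine. The phase names come from the PRD; the API shape below is hypothetical:

```javascript
// Linear phase machine for the autopilot rails. SubtaskIter loops while
// eligible subtasks remain, then advances to Finalize.
const PHASES = ['Preflight', 'BranchTag', 'SubtaskIter', 'Finalize', 'PR', 'Done'];

function nextPhase(current, { subtasksRemaining = 0 } = {}) {
  if (current === 'SubtaskIter' && subtasksRemaining > 0) return 'SubtaskIter';
  const i = PHASES.indexOf(current);
  if (i === -1 || current === 'Done') throw new Error(`invalid phase: ${current}`);
  return PHASES[i + 1];
}
```

A resumable run would persist the current phase (plus the active subtask) as its checkpoint, which is what `--resume` would read back.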
- Test Runner Adapter
  - Detects and runs tests via the project's test command (e.g., `npm test`), with targeted runs where feasible.
  - API: runTargeted(files/pattern), runAll(), report summary (failures, duration, coverage); enforce the 80% threshold by default.
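The adapter's coverage gate could look like this sketch (function and field names are illustrative; only the 80% defaults come from the PRD):

```javascript
// Gate: every coverage metric must meet its threshold (80% by default).
// Returns the list of failing metrics so the orchestrator can report them;
// an empty list means the commit gate is open.
const DEFAULT_THRESHOLDS = { lines: 80, branches: 80, functions: 80, statements: 80 };

function coverageFailures(summary, thresholds = DEFAULT_THRESHOLDS) {
  return Object.entries(thresholds)
    .filter(([metric, min]) => (summary[metric] ?? 0) < min)
    .map(([metric, min]) => `${metric}: ${summary[metric] ?? 0}% < ${min}%`);
}
```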
- Git/PR Adapter
  - Encapsulates `git` operations: branch create/checkout, add/commit, push.
  - Optional `gh` integration to open the PR; falls back to printed instructions if `gh` is unavailable.
  - Confirmation gates for branch creation and pushes.
  - Adds commit footers and a unified trailer (`Refs: TM-<tag>-<id>[.<sub>]`) for robust mapping to tasks/subtasks.

- Prompt/Exec Adapter
  - Uses the existing executor service to call the selected coding assistant (initially `claude`) with tight prompts: task/subtask context, surgical tests first, then minimal code to green.

- Run State + Reporting
  - JSONL log of steps, timestamps, commands, and test results.
  - Markdown summary for the PR description and post-run artifact.

## CLI UX (MVP)
- Command: `tm autopilot [taskId]`
- Flags: `--dry-run`, `--no-push`, `--no-pr`, `--no-confirm`, `--force`, `--max-attempts <n>`, `--runner <auto|custom>`, `--commit-scope <scope>`
- Output: compact header (project, tag, branch), current phase, subtask line, last test summary, next actions.
- Resume: if interrupted, `tm autopilot --resume` picks up from the last checkpoint in run state.

### TUI with tmux (Linear Execution)
- Left pane: tag selector, task list (status/priority), start/expand shortcuts; "Start" triggers the next task or a selected task.
- Right pane: executor terminal (tmux split) that runs the coding agent (claude-code/codex). Autopilot can hand over to the right pane during green.
- MCP integration: use MCP tools for task queries/updates and for shell/test invocations where available.

## Prompts (Initial Direction)
- Red-phase prompt skeleton (tests):
  - Use `.claude/agents/surgical-test-generator.md` as the system prompt to generate high-signal failing tests tailored to the project's language and conventions. Keep the scope minimal and deterministic; no code changes yet.
- Green-phase prompt skeleton (code):
  - "Make the tests pass by changing the smallest amount of code, following project patterns. Only modify necessary files. Keep commits focused on this subtask."

## Configuration
- `.taskmaster/config.json` additions
  - `autopilot`: `{ enabled: true, requireCleanWorkingTree: true, commitTemplate: "{type}({scope}): {msg}", defaultCommitType: "feat" }`
  - `test`: `{ runner: "auto", coverageThresholds: { lines: 80, branches: 80, functions: 80, statements: 80 } }`
  - `git`: `{ branchPattern: "{tag}/task-{id}-{slug}", pr: { enabled: true, base: "default" }, commitFooters: { task: "Task-Id", subtask: "Subtask-Id", tag: "Tag" }, commitTrailer: "Refs: TM-{tag}-{id}{.sub?}" }`

## Decisions
- Use conventional commits plus footers and a unified trailer `Refs: TM-<tag>-<id>[.<sub>]` for all commits; the Git/PR adapter is responsible for injecting these.
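Building the unified trailer is mechanical; a sketch (the function name and argument shape are hypothetical, the trailer format itself comes from the PRD):

```javascript
// Build the unified trailer "Refs: TM-<tag>-<id>[.<sub>]" for a commit.
// The Git/PR adapter would append this to the commit message body.
function buildRefsTrailer({ tag, taskId, subtaskId }) {
  const suffix = subtaskId != null ? `.${subtaskId}` : '';
  return `Refs: TM-${tag}-${taskId}${suffix}`;
}

console.log(buildRefsTrailer({ tag: 'master', taskId: 12 }));
// Refs: TM-master-12
```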
## Risks and Mitigations
- Model hallucination/large diffs: restrict prompt scope; enforce minimal changes; show diff previews (optional) before commit.
- Flaky tests: allow retries; isolate targeted runs for speed, then run the full suite before commit.
- Environment variability: detect runners/tools; provide fallbacks and actionable errors.
- PR creation fails: still push and print manual commands; persist the PR body for reuse.

## Open Questions
1) Slugging rules for branch names: any length limits or normalization beyond `{slug}` token sanitization?
2) PR body standard sections beyond the run report (e.g., checklist, coverage table)?
3) Default executor prompt fine-tuning once codex/gemini integration is available.
4) Where to store persistent TUI state (pane layout, last selection) in `.taskmaster/state.json`?

## Branch Naming
- Include both the tag and the task id in the branch name to make lineage explicit.
- Default pattern: `<tag>/task-<id>[-slug]` (e.g., `master/task-12`, `tag-analytics/task-4-user-auth`).
- Configurable via `.taskmaster/config.json`: `git.branchPattern` supports the tokens `{tag}`, `{id}`, `{slug}`.
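Token substitution for `git.branchPattern` can be sketched as follows. Slug sanitization (lowercase, dash-separated) is an assumption here, since the PRD leaves slugging rules as an open question:

```javascript
// Render git.branchPattern tokens {tag}, {id}, {slug} into a branch name.
// Assumed sanitization: lowercase, non-alphanumerics collapsed to dashes,
// trailing dash dropped when no title/slug is available.
function renderBranchName(pattern, { tag, id, title = '' }) {
  const slug = title
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-')
    .replace(/^-+|-+$/g, '');
  return pattern
    .replace('{tag}', tag)
    .replace('{id}', String(id))
    .replace('{slug}', slug)
    .replace(/-+$/, '');
}

console.log(renderBranchName('{tag}/task-{id}-{slug}', { tag: 'master', id: 12 }));
// master/task-12
```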
## PR Base Branch
- Use the repository's default branch (detected via git) unless overridden.
- Title format: `Task #<id> [<tag>]: <title>`.

## RPG Mapping (Repository Planning Graph)

Functional nodes (capabilities):
- Autopilot Orchestration → drives the TDD loop and lifecycle
- Test Generation (Surgical) → produces failing tests from subtask context
- Test Execution + Coverage → runs the suite, enforces thresholds
- Git/Branch/PR Management → safe operations and PR creation
- TUI/Terminal Integration → interactive control and visibility via tmux
- MCP Integration → structured task/status/context operations

Structural nodes (code organization):
- `packages/tm-core`:
  - `services/workflow-orchestrator.ts` (new)
  - `services/test-runner-adapter.ts` (new)
  - `services/git-adapter.ts` (new)
  - existing: `task-service.ts`, `task-execution-service.ts`, `executors/*`
- `apps/cli`:
  - `src/commands/autopilot.command.ts` (new)
  - `src/ui/tui/` (new tmux/TUI helpers)
- `scripts/modules`:
  - reuse `utils/git-utils.js`, `task-manager/tag-management.js`
- `.claude/agents/`:
  - `surgical-test-generator.md`

Edges (data/control flow):
- Autopilot → Test Generation → Test Execution → Git Commit → loop
- Autopilot → Git Adapter (branch, tag, PR)
- Autopilot → TUI (event stream) → tmux pane control
- Autopilot → MCP tools for task/status updates
- Test Execution → Coverage gate → Autopilot decision

Topological traversal (implementation order):
1) Git/Test adapters (foundations)
2) Orchestrator skeleton + events
3) CLI `autopilot` command and dry-run
4) Surgical test-gen integration and execution gate
5) PR creation, run reports, resumability

## Phased Roadmap
- Phase 0: Spike
  - Implement the CLI skeleton `tm autopilot` with a dry-run showing planned steps from a real task + subtasks.
  - Detect the test runner (package.json) and git state; render a preflight report.

- Phase 1: Core Rails
  - Implement `WorkflowOrchestrator` in `tm-core` with an event stream; add the Git/Test adapters.
  - Support the subtask loop (red/green/commit) with framework-agnostic test generation and the detected test command; gate commits on passing tests and coverage.
  - Branch/tag mapping via the existing tag-management APIs.
  - Persist the run report under `.taskmaster/reports/runs/`.

- Phase 2: PR + Resumability
  - Add `gh` PR creation with a well-formed body built from the run report.
  - Introduce resumable checkpoints and a `--resume` flag.
  - Add coverage enforcement and an optional lint/format step.

- Phase 3: Extensibility + Guardrails
  - Add support for basic pytest/go test adapters.
  - Add safeguards: diff-preview mode, manual confirmation gates, aggressive minimal-change prompts.
  - Optional: a small TUI panel and an extension panel leveraging the same run-state file.

## References (Repo)
- Test Workflow: `.cursor/rules/test_workflow.mdc`
- Git Workflow: `.cursor/rules/git_workflow.mdc`
- CLI: `apps/cli/src/commands/start.command.ts`, `apps/cli/src/ui/components/*.ts`
- Core Services: `packages/tm-core/src/services/task-service.ts`, `task-execution-service.ts`
- Executors: `packages/tm-core/src/executors/*`
- Git Utilities: `scripts/modules/utils/git-utils.js`
- Tag Management: `scripts/modules/task-manager/tag-management.js`
- Surgical Test Generator: `.claude/agents/surgical-test-generator.md`
.taskmaster/docs/prd-rpg-user-stories.md (new file, 221 lines)
@@ -0,0 +1,221 @@
# PRD: RPG‑Based User Story Mode + Validation‑First Delivery

## Summary

- Introduce a “User Story Mode” where each Task is a user story and each Subtask is a concrete implementation step. Enable it via a config flag; when enabled, task generation and PRD parsing produce user‑story titles/details with acceptance criteria, while subtasks capture implementation details.
- Build a validation‑first delivery pipeline: derive tests from acceptance criteria (Surgical Test Generator) and wire TDD rails and Git/PR mapping so reviews focus on verification rather than code spelunking.
- Keep everything on rails: branch naming with tag + task id, commit/PR linkage to tasks/subtasks, coverage and test gates, and a lightweight TUI for fast execution.

## North‑Star Outcomes

- Humans stay in briefs/frontends; implementation runs quickly, often without opening the IDE.
- “Definition of Done” is expressed and enforced as tests; business logic is encoded in test and acceptance criteria.
- End‑to‑end linkage from brief → user story → subtasks → commits/PRs → delivery, with reproducible automation and minimal ceremony.

## Problem

- The bottleneck is validation and PR review, not code generation. Plans help, but the chokepoint is proving correctness, business conformance, and integration.
- The current test workflow is too Jest‑specific; we need framework‑agnostic generation and execution.
- We need consistent Git/TDD wiring so GitHub integrations can map work artifacts to tasks/subtasks without ambiguity.

## Solution Overview

- Add a configuration flag that switches to user story mode and adapts prompts/parsers.
- Expand tasks with explicit Acceptance Criteria and Test Criteria; drive the Surgical Test Generator to create failing tests first; wire autonomous TDD loops per subtask until green, then commit.
- Enforce coverage (80% default) and generate PRs that summarize the user story, acceptance criteria coverage, and test results; commits/PRs carry metadata linking back to tasks/subtasks.
- Provide a compact TUI (tmux) to pick a tag/tasks and launch an executor terminal, while the orchestrator runs the rails in the background.

---

## Configuration

- `.taskmaster/config.json` additions
  - `stories`: `{ enabled: true, storyLabel: "User Story", acceptanceKey: "Acceptance Criteria" }`
  - `autopilot`: `{ enabled: true, requireCleanWorkingTree: true }`
  - `test`: `{ runner: "auto", coverageThresholds: { lines: 80, branches: 80, functions: 80, statements: 80 } }`
  - `git`: `{ branchPattern: "{tag}/task-{id}-{slug}", pr: { enabled: true, base: "default" }, commitFooters: { task: "Task-Id", subtask: "Subtask-Id", tag: "Tag" } }`
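
Assembled in one place, the flag set above might look like the following (a sketch of the schema proposed in this PRD, not an existing config format; `configAdditions` is an illustrative name):

```typescript
// Sketch of the proposed `.taskmaster/config.json` additions (hypothetical schema).
const configAdditions = {
  stories: {
    enabled: true,
    storyLabel: 'User Story',
    acceptanceKey: 'Acceptance Criteria'
  },
  autopilot: { enabled: true, requireCleanWorkingTree: true },
  test: {
    runner: 'auto',
    coverageThresholds: { lines: 80, branches: 80, functions: 80, statements: 80 }
  },
  git: {
    branchPattern: '{tag}/task-{id}-{slug}',
    pr: { enabled: true, base: 'default' },
    commitFooters: { task: 'Task-Id', subtask: 'Subtask-Id', tag: 'Tag' }
  }
};
```
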

Behavior when `stories.enabled=true`:

- Task generation prompts and PRD parsers produce user‑story formatted titles and descriptions, include acceptance criteria blocks, and set `task.type = 'user_story'`.
- Subtasks remain implementation steps with concise technical goals.
- Expand ensures any missing acceptance criteria are synthesized (from brief/PRD content) before work starts.

---

## Data Model Changes

- Task fields (add):
  - `type: 'user_story' | 'technical'` (default `technical`)
  - `acceptanceCriteria: string[] | string` (structured or markdown)
  - `testCriteria: string[] | string` (optional; derived from acceptance criteria, i.e., what to validate)
- Subtask fields remain focused on implementation detail and the dependency graph.

Storage and UI remain backward compatible; the new fields are optional when `stories.enabled=false`.

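
As a sketch, the Task additions above might be typed like this (field names follow this PRD; the surrounding `Task` base type from `@tm/core/types` is assumed, not shown):

```typescript
// Hypothetical sketch of the added Task fields; names mirror the proposal above.
type TaskType = 'user_story' | 'technical';

interface StoryTaskFields {
  type?: TaskType; // default 'technical'
  acceptanceCriteria?: string[] | string; // structured or markdown
  testCriteria?: string[] | string; // derived from acceptance criteria
}

// Example: a story-mode task carrying acceptance criteria.
const storyFields: StoryTaskFields = {
  type: 'user_story',
  acceptanceCriteria: ['Guest can checkout with credit card']
};
```
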
### JSON Gherkin Representation (for stories)

Add an optional `gherkin` block to Tasks in story mode. Keep the Hybrid `acceptanceCriteria` as the human authoring surface; maintain a normalized JSON Gherkin alongside it for deterministic mapping.

```ts
interface GherkinFeature {
  id: string; // FEAT-<taskId>
  name: string; // mirrors the user story title
  description?: string;
  background?: { steps: Step[] };
  scenarios: Scenario[];
}

interface Scenario {
  id: string; // SC-<taskId>-<n> or derived from the AC id
  name: string;
  tags?: string[];
  steps: Step[]; // Given/When/Then/And/But
  examples?: Record<string, any>[];
}

interface Step {
  keyword: 'Given' | 'When' | 'Then' | 'And' | 'But';
  text: string;
}
```

Notes

- Derive `gherkin.scenarios` from `acceptanceCriteria` when the mapping is obvious; preserve both the raw markdown and the normalized items.
- Allow cross‑references between scenarios and AC items (e.g., `refs: ['AC-12-1']`).

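
As an illustrative sketch (not an existing API), mapping one structured AC item to a normalized scenario per the notes above could look like:

```typescript
interface Step { keyword: 'Given' | 'When' | 'Then' | 'And' | 'But'; text: string; }
interface Scenario { id: string; name: string; steps: Step[]; refs?: string[]; }

// Hypothetical mapping from a structured AC item to a normalized scenario,
// reusing the AC id for the scenario id and cross-referencing back via `refs`.
function scenarioFromAc(ac: {
  id: string; summary: string; given: string; when: string; then: string;
}): Scenario {
  return {
    id: ac.id.replace('AC-', 'SC-'),
    name: ac.summary,
    steps: [
      { keyword: 'Given', text: ac.given },
      { keyword: 'When', text: ac.when },
      { keyword: 'Then', text: ac.then }
    ],
    refs: [ac.id]
  };
}
```
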
---

## RPG Plan (Repository Planning Graph)

Functional Nodes (Capabilities)

- Brief Intake → parse briefs/PRDs and extract user stories (when enabled)
- User Story Generation → create task title/details as user stories + acceptance criteria
- JSON Gherkin Synthesis → produce Feature/Scenario structure from acceptance criteria
- Acceptance/Test Criteria Synthesis → convert acceptance criteria into concrete test criteria
- Surgical Test Generation → generate failing tests per subtask using `.claude/agents/surgical-test-generator.md`
- Implementation Planning → expand subtasks as atomic implementation steps with dependencies
- Autonomous Execution (Rails) → branch, red/green loop per subtask, commit when green
- Validation & Review Automation → coverage gates, PR body with user story + results, checklist
- GitHub Integration Mapping → branch naming, commit footers, PR linkage to tasks/subtasks
- TUI/Terminal Integration → tag/task selection in the left pane; executor terminal in the right pane via tmux

Structural Nodes (Code Organization)

- `packages/tm-core`
  - `services/workflow-orchestrator.ts` (new): drives rails, emits progress events
  - `services/story-mode.service.ts` (new): toggles prompts/parsers for user stories and acceptance criteria
  - `services/test-runner-adapter.ts` (new): detects/executes the project test command, collects coverage
  - `services/git-adapter.ts` (new): branch/commit/push, PR creation; applies commit footers
  - existing: `task-service.ts`, `task-execution-service.ts`, `executors/*`
- `apps/cli`
  - `src/commands/autopilot.command.ts` (new): orchestrates a full run; supports `--stories`
  - `src/ui/tui/` (new): tmux helpers and compact panes for selection and logs
- `scripts/modules`
  - reuse `utils/git-utils.js`, `task-manager/tag-management.js`, and PR template utilities
- `.cursor/rules`
  - update generation/parsing rules to emit user‑story format when enabled
- `.claude/agents`
  - existing: `surgical-test-generator.md` for the red phase

Edges (Dependencies / Data Flow)

- Brief Intake → User Story Generation → Acceptance/Test Criteria Synthesis → Implementation Planning → Autonomous Execution → Validation/PR
- Execution ↔ Test Runner (runAll/runTargeted, coverage) → back to Execution for decisions
- Git Adapter ← Execution (commits/branch) → PR creation (target default branch)
- TUI ↔ Orchestrator (event stream) → user confirmations for branch/push/PR
- MCP Tools ↔ Orchestrator for task/status/context updates

Topological Traversal (Build Order)

1) Config + data model changes (stories flag, acceptance fields, optional `gherkin`)
2) Rules/prompt updates for parsing/generation in story mode (emit AC Hybrid + JSON Gherkin)
3) Test Runner Adapter (framework‑agnostic execute + coverage)
4) Git Adapter (branch pattern `{tag}/task-{id}-{slug}`, commit footers/trailers, PR create)
5) Workflow Orchestrator wiring the red/green/commit loop with coverage gate and scenario iteration
6) Surgical Test Gen integration (red) from JSON Gherkin + AC; minimal‑change implementation prompts (green)
7) CLI autopilot (dry‑run → full run) and TUI (tmux panes)
8) PR template and review automation (user story, AC table with test links, scenarios, coverage)

---

## Git/TDD Wiring (Validation‑First)

- Branch naming: include tag + task id (e.g., `master/task-12-user-auth`) to disambiguate context.
- Commit footers (configurable):
  - `Task-Id: <id>`
  - `Subtask-Id: <id>.<sub>` when relevant
  - `Tag: <tag>`
  - Trailer: `Refs: TM-<tag>-<id>[.<sub>] SC:<scenarioId> AC:<acId>`
- Red/Green/Commit loop per subtask:
  - Red: synthesize failing tests from acceptance criteria (Surgical agent)
  - Green: minimal code to pass; re‑run the full suite
  - Commit when all tests pass and coverage ≥ 80%
- PR base: repository default branch. Title: `Task #<id> [<tag>]: <title>`.
- PR body sections: User Story, Acceptance Criteria, Subtask Summary, Test Results, Coverage Table, Linked Work Items (ids), Next Steps.

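
A minimal sketch of rendering the branch pattern and `Refs` trailer described above (function names are illustrative, not existing repo APIs):

```typescript
// Hypothetical helpers for the `{tag}/task-{id}-{slug}` pattern and Refs trailer.
function slugify(title: string): string {
  // Lowercase, collapse non-alphanumeric runs to '-', trim leading/trailing '-'.
  return title.toLowerCase().replace(/[^a-z0-9]+/g, '-').replace(/^-|-$/g, '');
}

function branchName(pattern: string, tag: string, id: number, title: string): string {
  return pattern
    .replace('{tag}', tag)
    .replace('{id}', String(id))
    .replace('{slug}', slugify(title));
}

function refsTrailer(tag: string, id: number, sub?: number): string {
  return `Refs: TM-${tag}-${id}${sub !== undefined ? `.${sub}` : ''}`;
}
```

For example, `branchName('{tag}/task-{id}-{slug}', 'master', 12, 'User Auth')` yields `master/task-12-user-auth`, matching the example branch name above.
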
---

## Prompts & Parsers (Story Mode)

- PRD/Brief Parser updates:
  - Extract user stories in “As a … I want … so that …” format when present.
  - Extract Acceptance Criteria as a bullet list; fill gaps with LLM synthesis from brief context.
  - Emit JSON Gherkin Feature/Scenarios; auto‑split Given/When/Then when feasible; otherwise store the text under `then` and refine later.
- Task Generation Prompt (story mode):
  - “Generate a task as a User Story with clear Acceptance Criteria. Do not include implementation details in the story; produce implementation subtasks separately.”
- Subtask Generation Prompt:
  - “Produce technical implementation steps to satisfy the acceptance criteria. Each subtask should be atomic and testable.”
- Test Generation (Red):
  - Use `.claude/agents/surgical-test-generator.md`; seed with JSON Gherkin + Acceptance/Test Criteria; favor determinism over maximum coverage.
  - Record produced test paths back into AC items and, optionally, scenario annotations.
- Implementation (Green):
  - Minimal diffs, follow existing patterns, keep commits scoped to the subtask.

---

## TUI (Linear, tmux‑based)

- Left: tag selector and task list (status/priority). Actions: Expand, Start (Next or Selected), Review.
- Right: executor terminal (claude‑code/codex) in a tmux split; orchestrator logs in a separate pane.
- Inline confirmations (branch create, push, PR) unless `--no-confirm` is passed.

---

## Migration & Backward Compatibility

- The `gherkin` block is optional; existing tasks remain valid.
- When `stories.enabled=true`, new tasks include AC Hybrid + `gherkin`; the upgrade path is a utility that synthesizes both from description/testStrategy/acceptanceCriteria.

---

## Risks & Mitigations

- Hallucinated acceptance criteria → always show criteria in the PR; allow quick amend and re‑run.
- Framework variance → the Test Runner Adapter detects and normalizes execution/coverage; fall back to the `test` script.
- Large diffs → prompt for minimal changes; allow a diff preview before commit.
- Flaky tests → retry policy; isolate targeted runs; require a passing full suite before commit.

---

## Acceptance Criteria Schema Options (for decision)

- Option A: Markdown only
  - Pros: simple to write/edit; good for humans
  - Cons: hard to map deterministically to tests; weak traceability; brittle diffs
- Option B: Structured array
  - Example: `{ id, summary, given, when, then, severity, tags }`
  - Pros: machine‑readable; strong linking to tests/coverage; easy to diff
  - Cons: heavier authoring; requires schema discipline
- Option C: Hybrid (recommended)
  - Store both: a normalized array of criterion objects and a preserved `raw` markdown block
  - Each criterion gets a stable `id` (e.g., `AC-<taskId>-<n>`) used in tests, commit trailers, and PR tables
  - Enables clean PR tables and deterministic coverage mapping while keeping human‑friendly text

Proposed default schema (hybrid):

```ts
acceptanceCriteria: {
  raw: `
- AC1: Guest can checkout with credit card
- AC2: Declined cards show error inline
`,
  items: [
    {
      id: 'AC-12-1',
      summary: 'Guest can checkout with credit card',
      given: 'a guest with items in cart',
      when: 'submits valid credit card',
      then: 'order is created and receipt emailed',
      severity: 'must',
      tags: ['checkout', 'payments'],
      tests: [] // filled by the orchestrator (file paths/test IDs)
    }
  ]
}
```

Decision: adopt the Hybrid default; allow Markdown‑only input and auto‑normalize.

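
The Markdown‑only → Hybrid auto‑normalization might be sketched as follows (a hypothetical helper, assuming one criterion per `- ` bullet and an optional `ACn:` prefix):

```typescript
interface AcItem {
  id: string;
  summary: string;
  tests: string[];
}

// Hypothetical normalizer: assigns stable AC-<taskId>-<n> ids in bullet order
// and preserves the raw markdown alongside the structured items.
function normalizeAcceptanceCriteria(
  raw: string,
  taskId: number
): { raw: string; items: AcItem[] } {
  const items = raw
    .split('\n')
    .map((line) => line.trim())
    .filter((line) => line.startsWith('- '))
    .map((line, i) => ({
      id: `AC-${taskId}-${i + 1}`,
      summary: line.slice(2).replace(/^AC\d+:\s*/, ''),
      tests: []
    }));
  return { raw, items };
}
```
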
## Decisions

- Adopt the Hybrid acceptance criteria schema by default; normalize Markdown to structured items with stable IDs `AC-<taskId>-<n>`.
- Use conventional commits plus footers and a unified trailer `Refs: TM-<tag>-<id>[.<sub>]` across commits and PRs for robust mapping.
@@ -1,12 +1,5 @@
# task-master-ai

## 0.27.3

### Patch Changes

- [#1254](https://github.com/eyaltoledano/claude-task-master/pull/1254) [`af53525`](https://github.com/eyaltoledano/claude-task-master/commit/af53525cbc660a595b67d4bb90d906911c71f45d) Thanks [@Crunchyman-ralph](https://github.com/Crunchyman-ralph)! - Fixed issue where `tm show` command could not find subtasks using dotted notation IDs (e.g., '8.1').
- The command now properly searches within parent task subtasks and returns the correct subtask information.

## 0.27.2

### Patch Changes
@@ -35,7 +35,7 @@
"@types/inquirer": "^9.0.3",
"@types/node": "^22.10.5",
"tsx": "^4.20.4",
"typescript": "^5.9.2",
"typescript": "^5.7.3",
"vitest": "^2.1.8"
},
"engines": {
@@ -281,14 +281,9 @@ export class ListTasksCommand extends Command {
const priorityBreakdown = getPriorityBreakdown(tasks);

// Find next task following the same logic as findNextTask
const nextTaskInfo = this.findNextTask(tasks);
const nextTask = this.findNextTask(tasks);

// Get the full task object with complexity data already included
const nextTask = nextTaskInfo
? tasks.find((t) => String(t.id) === String(nextTaskInfo.id))
: undefined;

// Display dashboard boxes (nextTask already has complexity from storage enrichment)
// Display dashboard boxes
displayDashboards(
taskStats,
subtaskStats,
@@ -308,16 +303,14 @@ export class ListTasksCommand extends Command {

// Display recommended next task section immediately after table
if (nextTask) {
const description = getTaskDescription(nextTask);
// Find the full task object to get description
const fullTask = tasks.find((t) => String(t.id) === String(nextTask.id));
const description = fullTask ? getTaskDescription(fullTask) : undefined;

displayRecommendedNextTask({
id: nextTask.id,
title: nextTask.title,
priority: nextTask.priority,
status: nextTask.status,
dependencies: nextTask.dependencies,
description,
complexity: nextTask.complexity as number | undefined
...nextTask,
status: 'pending', // Next task is typically pending
description
});
} else {
displayRecommendedNextTask(undefined);
@@ -6,7 +6,6 @@
import chalk from 'chalk';
import boxen from 'boxen';
import type { Task, TaskPriority } from '@tm/core/types';
import { getComplexityWithColor } from '../../utils/ui.js';

/**
* Statistics for task collection
@@ -480,7 +479,7 @@ export function displayDependencyDashboard(
? chalk.cyan(nextTask.dependencies.join(', '))
: chalk.gray('None')
}\n` +
`Complexity: ${nextTask?.complexity !== undefined ? getComplexityWithColor(nextTask.complexity) : chalk.gray('N/A')}`;
`Complexity: ${nextTask?.complexity || chalk.gray('N/A')}`;

return content;
}
@@ -6,7 +6,6 @@
import chalk from 'chalk';
import boxen from 'boxen';
import type { Task } from '@tm/core/types';
import { getComplexityWithColor } from '../../utils/ui.js';

/**
* Next task display options
@@ -18,7 +17,6 @@ export interface NextTaskDisplayOptions {
status?: string;
dependencies?: (string | number)[];
description?: string;
complexity?: number;
}

/**
@@ -84,11 +82,6 @@ export function displayRecommendedNextTask(
: chalk.cyan(task.dependencies.join(', '));
content.push(`Dependencies: ${depsDisplay}`);

// Complexity with color and label
if (typeof task.complexity === 'number') {
content.push(`Complexity: ${getComplexityWithColor(task.complexity)}`);
}

// Description if available
if (task.description) {
content.push('');
@@ -9,11 +9,7 @@ import Table from 'cli-table3';
import { marked, MarkedExtension } from 'marked';
import { markedTerminal } from 'marked-terminal';
import type { Task } from '@tm/core/types';
import {
getStatusWithColor,
getPriorityWithColor,
getComplexityWithColor
} from '../../utils/ui.js';
import { getStatusWithColor, getPriorityWithColor } from '../../utils/ui.js';

// Configure marked to use terminal renderer with subtle colors
marked.use(
@@ -112,9 +108,7 @@ export function displayTaskProperties(task: Task): void {
getStatusWithColor(task.status),
getPriorityWithColor(task.priority),
deps,
typeof task.complexity === 'number'
? getComplexityWithColor(task.complexity)
: chalk.gray('N/A'),
'N/A',
task.description || ''
].join('\n');
@@ -158,18 +158,10 @@ export function displayUpgradeNotification(
export async function performAutoUpdate(
latestVersion: string
): Promise<boolean> {
if (
process.env.TASKMASTER_SKIP_AUTO_UPDATE === '1' ||
process.env.CI ||
process.env.NODE_ENV === 'test'
) {
const reason =
process.env.TASKMASTER_SKIP_AUTO_UPDATE === '1'
? 'TASKMASTER_SKIP_AUTO_UPDATE=1'
: process.env.CI
? 'CI environment'
: 'NODE_ENV=test';
console.log(chalk.dim(`Skipping auto-update (${reason})`));
if (process.env.TASKMASTER_SKIP_AUTO_UPDATE === '1' || process.env.CI) {
console.log(
chalk.dim('Skipping auto-update (TASKMASTER_SKIP_AUTO_UPDATE/CI).')
);
return false;
}
const spinner = ora({
@@ -84,23 +84,7 @@ export function getPriorityWithColor(priority: TaskPriority): string {
}

/**
* Get complexity color and label based on score thresholds
*/
function getComplexityLevel(score: number): {
color: (text: string) => string;
label: string;
} {
if (score >= 7) {
return { color: chalk.hex('#CC0000'), label: 'High' };
} else if (score >= 4) {
return { color: chalk.hex('#FF8800'), label: 'Medium' };
} else {
return { color: chalk.green, label: 'Low' };
}
}

/**
* Get colored complexity display with dot indicator (simple format)
* Get colored complexity display
*/
export function getComplexityWithColor(complexity: number | string): string {
const score =
@@ -110,20 +94,13 @@ export function getComplexityWithColor(complexity: number | string): string {
return chalk.gray('N/A');
}

const { color } = getComplexityLevel(score);
return color(`● ${score}`);
}

/**
* Get colored complexity display with /10 format (for dashboards)
*/
export function getComplexityWithScore(complexity: number | undefined): string {
if (typeof complexity !== 'number') {
return chalk.gray('N/A');
if (score >= 8) {
return chalk.red.bold(`${score} (High)`);
} else if (score >= 5) {
return chalk.yellow(`${score} (Medium)`);
} else {
return chalk.green(`${score} (Low)`);
}

const { color, label } = getComplexityLevel(complexity);
return color(`${complexity}/10 (${label})`);
}

/**
@@ -346,12 +323,8 @@ export function createTaskTable(
}

if (showComplexity) {
// Show complexity score from report if available
if (typeof task.complexity === 'number') {
row.push(getComplexityWithColor(task.complexity));
} else {
row.push(chalk.gray('N/A'));
}
// Show N/A if no complexity score
row.push(chalk.gray('N/A'));
}

table.push(row);
@@ -32,7 +32,6 @@
"getting-started/quick-start/execute-quick"
]
},
"getting-started/api-keys",
"getting-started/faq",
"getting-started/contribute"
]
@@ -1,267 +0,0 @@
# API Keys Configuration

Task Master supports multiple AI providers through environment variables. This page lists all available API keys and their configuration requirements.

## Required API Keys

> **Note**: At least one required API key must be configured for Task Master to function.
>
> "Required: Yes" below means "required to use that specific provider," not "required globally." You only need at least one provider configured.

### ANTHROPIC_API_KEY (Recommended)
- **Provider**: Anthropic Claude models
- **Format**: `sk-ant-api03-...`
- **Required**: ✅ **Yes**
- **Models**: Claude 3.5 Sonnet, Claude 3 Haiku, Claude 3 Opus
- **Get Key**: [Anthropic Console](https://console.anthropic.com/)

```bash
ANTHROPIC_API_KEY="sk-ant-api03-your-key-here"
```

### PERPLEXITY_API_KEY (Highly Recommended for Research)
- **Provider**: Perplexity AI (research features)
- **Format**: `pplx-...`
- **Required**: ✅ **Yes**
- **Purpose**: Enables research-backed task expansions and updates
- **Models**: Perplexity Sonar models
- **Get Key**: [Perplexity API](https://www.perplexity.ai/settings/api)

```bash
PERPLEXITY_API_KEY="pplx-your-key-here"
```

### OPENAI_API_KEY
- **Provider**: OpenAI GPT models
- **Format**: `sk-proj-...` or `sk-...`
- **Required**: ✅ **Yes**
- **Models**: GPT-4, GPT-4 Turbo, GPT-3.5 Turbo, O1 models
- **Get Key**: [OpenAI Platform](https://platform.openai.com/api-keys)

```bash
OPENAI_API_KEY="sk-proj-your-key-here"
```

### GOOGLE_API_KEY
- **Provider**: Google Gemini models
- **Format**: Various formats
- **Required**: ✅ **Yes**
- **Models**: Gemini Pro, Gemini Flash, Gemini Ultra
- **Get Key**: [Google AI Studio](https://aistudio.google.com/app/apikey)
- **Alternative**: Use `GOOGLE_APPLICATION_CREDENTIALS` for a service account (Google Vertex)

```bash
GOOGLE_API_KEY="your-google-api-key-here"
```

### GROQ_API_KEY
- **Provider**: Groq (high-performance inference)
- **Required**: ✅ **Yes**
- **Models**: Llama models, Mixtral models (via Groq)
- **Get Key**: [Groq Console](https://console.groq.com/keys)

```bash
GROQ_API_KEY="your-groq-key-here"
```

### OPENROUTER_API_KEY
- **Provider**: OpenRouter (multiple-model access)
- **Required**: ✅ **Yes**
- **Models**: Access to various models through a single API
- **Get Key**: [OpenRouter](https://openrouter.ai/keys)

```bash
OPENROUTER_API_KEY="your-openrouter-key-here"
```

### AZURE_OPENAI_API_KEY
- **Provider**: Azure OpenAI Service
- **Required**: ✅ **Yes**
- **Requirements**: Also requires `AZURE_OPENAI_ENDPOINT` configuration
- **Models**: GPT models via Azure
- **Get Key**: [Azure Portal](https://portal.azure.com/)

```bash
AZURE_OPENAI_API_KEY="your-azure-key-here"
```

### XAI_API_KEY
- **Provider**: xAI (Grok) models
- **Required**: ✅ **Yes**
- **Models**: Grok models
- **Get Key**: [xAI Console](https://console.x.ai/)

```bash
XAI_API_KEY="your-xai-key-here"
```

## Optional API Keys

> **Note**: These API keys are optional; the providers work without them or use alternative authentication methods.

### AWS_ACCESS_KEY_ID (Bedrock)
- **Provider**: AWS Bedrock
- **Required**: ❌ **No** (uses the AWS credential chain)
- **Models**: Claude models via AWS Bedrock
- **Authentication**: Uses the AWS credential chain (profiles, IAM roles, etc.)
- **Get Key**: [AWS Console](https://console.aws.amazon.com/iam/)

```bash
# Optional - AWS credential chain is preferred
AWS_ACCESS_KEY_ID="your-aws-access-key"
AWS_SECRET_ACCESS_KEY="your-aws-secret-key"
```

### CLAUDE_CODE_API_KEY
- **Provider**: Claude Code CLI
- **Required**: ❌ **No** (uses OAuth tokens)
- **Purpose**: Integration with the local Claude Code CLI
- **Authentication**: Uses OAuth tokens; no API key needed

```bash
# Not typically needed
CLAUDE_CODE_API_KEY="not-usually-required"
```

### GEMINI_API_KEY
- **Provider**: Gemini CLI
- **Required**: ❌ **No** (uses OAuth authentication)
- **Purpose**: Integration with the Gemini CLI
- **Authentication**: Primarily OAuth via the CLI; an API key is optional

```bash
# Optional - OAuth via CLI is preferred
GEMINI_API_KEY="your-gemini-key-here"
```

### GROK_CLI_API_KEY
- **Provider**: Grok CLI
- **Required**: ❌ **No** (can use CLI config)
- **Purpose**: Integration with the Grok CLI
- **Authentication**: Can use the Grok CLI's own config file

```bash
# Optional - CLI config is preferred
GROK_CLI_API_KEY="your-grok-cli-key"
```

### OLLAMA_API_KEY
- **Provider**: Ollama (local/remote)
- **Required**: ❌ **No** (a local installation doesn't need a key)
- **Purpose**: For remote Ollama servers that require authentication
- **Note**: Only needed for remote servers with authentication; not needed for local Ollama installations

```bash
# Only needed for remote Ollama servers
OLLAMA_API_KEY="your-ollama-api-key-here"
```

### GITHUB_API_KEY
- **Provider**: GitHub (import/export features)
- **Format**: `ghp_...` or `github_pat_...`
- **Required**: ❌ **No** (for GitHub features only)
- **Purpose**: GitHub import/export features
- **Get Key**: [GitHub Settings](https://github.com/settings/tokens)

```bash
GITHUB_API_KEY="ghp_your-github-key-here"
```

## Configuration Methods

### Method 1: Environment File (.env)
Create a `.env` file in your project root:

```bash
# Copy from .env.example
cp .env.example .env

# Edit with your keys
vim .env
```

### Method 2: System Environment Variables
```bash
export ANTHROPIC_API_KEY="your-key-here"
export PERPLEXITY_API_KEY="your-key-here"
# ... other keys
```

### Method 3: MCP Server Configuration
For Claude Code integration, configure keys in `.mcp.json`:

```json
{
  "mcpServers": {
    "task-master-ai": {
      "command": "npx",
      "args": ["-y", "task-master-ai"],
      "env": {
        "ANTHROPIC_API_KEY": "your-key-here",
        "PERPLEXITY_API_KEY": "your-key-here",
        "OPENAI_API_KEY": "your-key-here"
      }
    }
  }
}
```

## Key Requirements

### Minimum Requirements
- **At least one** AI provider key is required
- **ANTHROPIC_API_KEY** is recommended as the primary provider
- **PERPLEXITY_API_KEY** is highly recommended for research features

### Provider-Specific Requirements
- **Azure OpenAI**: Requires both `AZURE_OPENAI_API_KEY` and `AZURE_OPENAI_ENDPOINT` configuration
- **Google Vertex**: Requires the `VERTEX_PROJECT_ID` and `VERTEX_LOCATION` environment variables
- **AWS Bedrock**: Uses the AWS credential chain (profiles, IAM roles, etc.) instead of API keys
- **Ollama**: Only needs an API key for remote servers with authentication
- **CLI Providers**: Gemini CLI, Grok CLI, and Claude Code use OAuth/CLI config instead of API keys

## Model Configuration

After setting up API keys, configure which models to use:

```bash
# Interactive model setup
task-master models --setup

# Set specific models
task-master models --set-main claude-3-5-sonnet-20241022
task-master models --set-research perplexity-llama-3.1-sonar-large-128k-online
task-master models --set-fallback gpt-4o-mini
```

## Security Best Practices

1. **Never commit API keys** to version control
2. **Use .env files** and add them to `.gitignore`
3. **Rotate keys regularly**, especially if compromised
4. **Use minimal permissions** for service accounts
5. **Monitor usage** to detect unauthorized access

## Troubleshooting

### Key Validation
```bash
# Check if keys are properly configured
task-master models

# Test a specific provider
task-master add-task --prompt="test task" --model=claude-3-5-sonnet-20241022
```

### Common Issues
- **Invalid key format**: Check the expected format for each provider
- **Insufficient permissions**: Ensure keys have the necessary API access
- **Rate limits**: Some providers have usage limits
- **Regional restrictions**: Some models may not be available in all regions

### Getting Help
If you encounter issues with API key configuration:
- Check the [FAQ](/getting-started/faq) for common solutions
- Join our [Discord community](https://discord.gg/fWJkU7rf) for support
- Report issues on [GitHub](https://github.com/eyaltoledano/claude-task-master/issues)
@@ -1,12 +1,5 @@
# Change Log

## 0.25.4

### Patch Changes

- Updated dependencies [[`af53525`](https://github.com/eyaltoledano/claude-task-master/commit/af53525cbc660a595b67d4bb90d906911c71f45d)]:
- task-master-ai@0.27.3

## 0.25.3

### Patch Changes
@@ -3,7 +3,7 @@
"private": true,
"displayName": "TaskMaster",
"description": "A visual Kanban board interface for TaskMaster projects in VS Code",
"version": "0.25.4",
"version": "0.25.3",
"publisher": "Hamster",
"icon": "assets/icon.png",
"engines": {
@@ -240,7 +240,7 @@
"check-types": "tsc --noEmit"
},
"dependencies": {
"task-master-ai": "*"
"task-master-ai": "0.27.2"
},
"devDependencies": {
"@dnd-kit/core": "^6.3.1",
@@ -276,8 +276,7 @@
"react-dom": "^19.0.0",
"tailwind-merge": "^3.3.1",
"tailwindcss": "4.1.11",
"typescript": "^5.9.2",
"@tm/core": "*"
"typescript": "^5.7.3"
},
"overrides": {
"glob@<8": "^10.4.5",
@@ -2,7 +2,7 @@
"name": "task-master-hamster",
"displayName": "Taskmaster AI",
"description": "A visual Kanban board interface for Taskmaster projects in VS Code",
"version": "0.25.3",
"version": "0.23.1",
"publisher": "Hamster",
"icon": "assets/icon.png",
"engines": {
@@ -5,6 +5,7 @@
"outDir": "out",
"lib": ["ES2022", "DOM"],
"sourceMap": true,
"rootDir": "src",
"strict": true /* enable all strict type-checking options */,
"moduleResolution": "Node",
"esModuleInterop": true,
@@ -20,10 +21,8 @@
"@/*": ["./src/*"],
"@/components/*": ["./src/components/*"],
"@/lib/*": ["./src/lib/*"],
"@tm/core": ["../../packages/tm-core/src/index.ts"],
"@tm/core/*": ["../../packages/tm-core/src/*"]
"@tm/core": ["../core/src"]
}
},
"include": ["src/**/*"],
"exclude": ["node_modules", ".vscode-test", "out", "dist"]
}
@@ -1,231 +0,0 @@
# TODO: Move to apps/docs inside our documentation website

# Claude Code Integration Guide

This guide covers how to use Task Master with the Claude Code AI SDK integration for enhanced AI-powered development workflows.

## Overview

The Claude Code integration allows Task Master to leverage the Claude Code CLI for AI operations without requiring direct API keys. The integration uses OAuth tokens managed by the Claude Code CLI itself.

## Authentication Setup

The Claude Code provider uses token authentication managed by the Claude Code CLI.

### Prerequisites

1. **Install Claude Code CLI** (if not already installed):

   ```bash
   # Installation method depends on your system
   # Follow the Claude Code documentation for installation
   ```

2. **Set up an OAuth token** using the Claude Code CLI:

   ```bash
   claude setup-token
   ```

   This command will:
   - Guide you through OAuth authentication
   - Store the token securely for CLI usage
   - Enable Task Master to use Claude Code without manual API key configuration

### Authentication Priority

Task Master attempts authentication in this order:

1. **Environment variable** (optional): `CLAUDE_CODE_OAUTH_TOKEN`
   - Useful for CI/CD environments or when you want to override the default token
   - Not required if you've set up the CLI token

2. **Claude Code CLI token** (recommended): the token managed by `claude setup-token`
   - Automatically used when available
   - Most convenient for local development

3. **Fallback**: an error is raised if neither is available
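The priority order above can be sketched as a small helper. This is an illustrative sketch only; the actual resolution logic lives inside Task Master, and `cliToken` here merely stands in for whatever token `claude setup-token` has stored:

```javascript
// Illustrative sketch of the token resolution order described above.
// How Task Master actually reads the CLI-managed token is an
// implementation detail; this only mirrors the documented priority.
function resolveClaudeCodeToken(env, cliToken) {
  if (env.CLAUDE_CODE_OAUTH_TOKEN) return env.CLAUDE_CODE_OAUTH_TOKEN; // 1. env var
  if (cliToken) return cliToken; // 2. CLI-managed token
  throw new Error('Claude Code authentication unavailable'); // 3. fallback
}
```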
## Configuration

### Basic Configuration

Add Claude Code to your Task Master configuration:

```javascript
// In your .taskmaster/config.json or via task-master models command
{
  "models": {
    "main": "claude-code:sonnet", // Use Claude Code with Sonnet
    "research": "perplexity-llama-3.1-sonar-large-128k-online",
    "fallback": "claude-code:opus" // Use Claude Code with Opus as fallback
  }
}
```

### Supported Models

- `claude-code:sonnet` - Claude 3.5 Sonnet via Claude Code CLI
- `claude-code:opus` - Claude 3 Opus via Claude Code CLI
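Model IDs above follow a `provider:model` notation. Splitting such an identifier can be sketched like this (an illustration of the notation only, not Task Master's actual parser):

```javascript
// Split a "provider:model" identifier such as "claude-code:sonnet".
// Hypothetical helper for illustration; the real parsing is internal.
function parseModelId(id) {
  const sep = id.indexOf(':');
  if (sep === -1) return { provider: null, model: id };
  return { provider: id.slice(0, sep), model: id.slice(sep + 1) };
}
```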
### Environment Variables (Optional)

While not required, you can optionally set:

```bash
export CLAUDE_CODE_OAUTH_TOKEN="your_oauth_token_here"
```

This is only needed in specific scenarios such as:

- CI/CD pipelines
- Docker containers
- When you want to use a different token than the CLI default

## Usage Examples

### Basic Task Operations

```bash
# Use Claude Code for task operations
task-master add-task --prompt="Implement user authentication system" --research
task-master expand --id=1 --research
task-master update-task --id=1.1 --prompt="Add JWT token validation"
```

### Model Configuration Commands

```bash
# Set Claude Code as the main model
task-master models --set-main claude-code:sonnet

# Use interactive setup
task-master models --setup
# Then select "claude-code" from the provider list
```
## Troubleshooting

### Common Issues

#### 1. "Claude Code CLI not available" Error

**Problem**: Task Master cannot connect to the Claude Code CLI.

**Solutions**:

- Ensure the Claude Code CLI is installed and in your PATH
- Run `claude setup-token` to configure authentication
- Verify the Claude Code CLI works: `claude --help`

#### 2. Authentication Failures

**Problem**: Token authentication is failing.

**Solutions**:

- Re-run `claude setup-token` to refresh your OAuth token
- Check whether your token has expired
- Verify the Claude Code CLI can authenticate: try a simple `claude` command

#### 3. Model Not Available

**Problem**: The specified Claude Code model is not supported.

**Solutions**:

- Use the supported models: `sonnet` or `opus`
- Check model availability: `task-master models --list`
- Verify your Claude Code CLI has access to the requested model

### Debug Steps

1. **Test the Claude Code CLI directly**:

   ```bash
   claude --help
   # Should show help without errors
   ```

2. **Test authentication**:

   ```bash
   claude setup-token --verify
   # Should confirm the token is valid
   ```

3. **Test the Task Master integration**:

   ```bash
   task-master models --test claude-code:sonnet
   # Should successfully connect and test the model
   ```

4. **Check logs**:
   - Task Master logs will show detailed error messages
   - Use the `--verbose` flag for more detailed output

### Environment-Specific Configuration

#### Docker/Containers

When running in Docker, you'll need to:

1. Install the Claude Code CLI in your container
2. Set up authentication via an environment variable:

   ```dockerfile
   ENV CLAUDE_CODE_OAUTH_TOKEN="your_token_here"
   ```

#### CI/CD Pipelines

For automated environments:

1. Set up a service account token or use environment variables
2. Ensure the Claude Code CLI is available in the pipeline environment
3. Configure authentication before running Task Master commands
## Integration with AI SDK

Task Master's Claude Code integration uses the official `ai-sdk-provider-claude-code` package, providing:

- **Streaming Support**: Real-time token streaming for interactive experiences
- **Full AI SDK Compatibility**: Works with generateText, streamText, and other AI SDK functions
- **Automatic Error Handling**: Graceful degradation when Claude Code is unavailable
- **Type Safety**: Full TypeScript support with proper type definitions

### Example AI SDK Usage

```javascript
import { generateText } from 'ai';
import { ClaudeCodeProvider } from './src/ai-providers/claude-code.js';

const provider = new ClaudeCodeProvider();
const client = provider.getClient();

const result = await generateText({
  model: client('sonnet'),
  messages: [{ role: 'user', content: 'Hello Claude!' }]
});

console.log(result.text);
```

## Security Notes

- OAuth tokens are managed securely by the Claude Code CLI
- No API keys need to be stored in your project files
- Tokens are automatically refreshed by the Claude Code CLI
- Environment variables should only be used in secure environments

## Getting Help

If you encounter issues:

1. Check the Claude Code CLI documentation
2. Verify your authentication setup with `claude setup-token --verify`
3. Review Task Master logs for detailed error messages
4. Open an issue with both Task Master and Claude Code version information
@@ -75,50 +75,13 @@ function generateExampleFromSchema(schema) {
      return result;

    case 'ZodString':
      // Check for min/max length constraints
      if (def.checks) {
        const minCheck = def.checks.find((c) => c.kind === 'min');
        const maxCheck = def.checks.find((c) => c.kind === 'max');
        if (minCheck && maxCheck) {
          return (
            '<string between ' +
            minCheck.value +
            '-' +
            maxCheck.value +
            ' characters>'
          );
        } else if (minCheck) {
          return '<string with at least ' + minCheck.value + ' characters>';
        } else if (maxCheck) {
          return '<string up to ' + maxCheck.value + ' characters>';
        }
      }
      return '<string>';
      return 'string';

    case 'ZodNumber':
      // Check for int, positive, min/max constraints
      if (def.checks) {
        const intCheck = def.checks.find((c) => c.kind === 'int');
        const minCheck = def.checks.find((c) => c.kind === 'min');
        const maxCheck = def.checks.find((c) => c.kind === 'max');

        if (intCheck && minCheck && minCheck.value > 0) {
          return '<positive integer>';
        } else if (intCheck) {
          return '<integer>';
        } else if (minCheck || maxCheck) {
          return (
            '<number' +
            (minCheck ? ' >= ' + minCheck.value : '') +
            (maxCheck ? ' <= ' + maxCheck.value : '') +
            '>'
          );
        }
      }
      return '<number>';
      return 0;

    case 'ZodBoolean':
      return '<boolean>';
      return false;

    case 'ZodArray':
      const elementExample = generateExampleFromSchema(def.type);
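The removed `ZodString` branch above can be exercised standalone with a simplified stand-in for Zod's internal `checks` array. This is an illustrative reconstruction for clarity, not the project's actual helper or Zod's real internals:

```javascript
// Simplified stand-in for the ZodString branch shown in the diff above.
// `checks` mimics the shape of Zod's internal def.checks entries; that
// shape is assumed here for illustration.
function describeStringSchema(checks = []) {
  const minCheck = checks.find((c) => c.kind === 'min');
  const maxCheck = checks.find((c) => c.kind === 'max');
  if (minCheck && maxCheck) {
    return '<string between ' + minCheck.value + '-' + maxCheck.value + ' characters>';
  } else if (minCheck) {
    return '<string with at least ' + minCheck.value + ' characters>';
  } else if (maxCheck) {
    return '<string up to ' + maxCheck.value + ' characters>';
  }
  return '<string>';
}
```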
package-lock.json (generated, 11721 lines changed): file diff suppressed because it is too large.

package.json (52 lines changed):
@@ -1,6 +1,6 @@
{
"name": "task-master-ai",
"version": "0.27.3",
"version": "0.27.2",
"description": "A task management system for ambitious AI-driven development that doesn't overwhelm and confuse Cursor.",
"main": "index.js",
"type": "module",
@@ -17,7 +17,7 @@
"turbo:build": "turbo build",
"turbo:typecheck": "turbo typecheck",
"build:build-config": "npm run build -w @tm/build-config",
"test": "cross-env NODE_ENV=test node --experimental-vm-modules node_modules/.bin/jest",
"test": "node --experimental-vm-modules node_modules/.bin/jest",
"test:unit": "node --experimental-vm-modules node_modules/.bin/jest --testPathPattern=unit",
"test:integration": "node --experimental-vm-modules node_modules/.bin/jest --testPathPattern=integration",
"test:fails": "node --experimental-vm-modules node_modules/.bin/jest --onlyFailures",
@@ -52,26 +52,23 @@
"author": "Eyal Toledano",
"license": "MIT WITH Commons-Clause",
"dependencies": {
"@ai-sdk/amazon-bedrock": "^3.0.23",
"@ai-sdk/anthropic": "^2.0.18",
"@ai-sdk/azure": "^2.0.34",
"@ai-sdk/google": "^2.0.16",
"@ai-sdk/google-vertex": "^3.0.29",
"@ai-sdk/groq": "^2.0.21",
"@ai-sdk/mistral": "^2.0.16",
"@ai-sdk/openai": "^2.0.34",
"@ai-sdk/perplexity": "^2.0.10",
"@ai-sdk/provider": "^2.0.0",
"@ai-sdk/provider-utils": "^3.0.10",
"@ai-sdk/xai": "^2.0.22",
"@aws-sdk/credential-providers": "^3.895.0",
"@ai-sdk/amazon-bedrock": "^2.2.9",
"@ai-sdk/anthropic": "^1.2.10",
"@ai-sdk/azure": "^1.3.17",
"@ai-sdk/google": "^1.2.13",
"@ai-sdk/google-vertex": "^2.2.23",
"@ai-sdk/groq": "^1.2.9",
"@ai-sdk/mistral": "^1.2.7",
"@ai-sdk/openai": "^1.3.20",
"@ai-sdk/perplexity": "^1.1.7",
"@ai-sdk/xai": "^1.2.15",
"@anthropic-ai/sdk": "^0.39.0",
"@aws-sdk/credential-providers": "^3.817.0",
"@inquirer/search": "^3.0.15",
"@openrouter/ai-sdk-provider": "^1.2.0",
"@openrouter/ai-sdk-provider": "^0.4.5",
"@streamparser/json": "^0.0.22",
"@supabase/supabase-js": "^2.57.4",
"ai": "^5.0.51",
"ai-sdk-provider-claude-code": "^1.1.4",
"ai-sdk-provider-gemini-cli": "^1.1.1",
"ai": "^4.3.10",
"ajv": "^8.17.1",
"ajv-formats": "^3.0.1",
"boxen": "^8.0.1",
@@ -81,7 +78,7 @@
"cli-table3": "^0.6.5",
"commander": "^12.1.0",
"cors": "^2.8.5",
"dotenv": "^16.6.1",
"dotenv": "^16.3.1",
"express": "^4.21.2",
"fastmcp": "^3.5.0",
"figlet": "^1.8.0",
@@ -96,14 +93,17 @@
"lru-cache": "^10.2.0",
"marked": "^15.0.12",
"marked-terminal": "^7.3.0",
"ollama-ai-provider-v2": "^1.3.1",
"ollama-ai-provider": "^1.2.0",
"openai": "^4.89.0",
"ora": "^8.2.0",
"uuid": "^11.1.0",
"zod": "^4.1.11"
"zod": "^3.23.8",
"zod-to-json-schema": "^3.24.5"
},
"optionalDependencies": {
"@anthropic-ai/claude-code": "^1.0.88",
"@biomejs/cli-linux-x64": "^1.9.4"
"@biomejs/cli-linux-x64": "^1.9.4",
"ai-sdk-provider-gemini-cli": "^0.1.3"
},
"engines": {
"node": ">=18.0.0"
@@ -127,12 +127,12 @@
"@changesets/changelog-github": "^0.5.1",
"@changesets/cli": "^2.28.1",
"@manypkg/cli": "^0.25.1",
"@tm/ai-sdk-provider-grok-cli": "*",
"@tm/cli": "*",
"@types/jest": "^29.5.14",
"@types/marked-terminal": "^6.1.1",
"concurrently": "^9.2.1",
"cross-env": "^10.0.0",
"dotenv-mono": "^1.5.1",
"execa": "^8.0.1",
"jest": "^29.7.0",
"jest-environment-node": "^29.7.0",
@@ -142,7 +142,7 @@
"ts-jest": "^29.4.2",
"tsdown": "^0.15.2",
"tsx": "^4.20.4",
"turbo": "2.5.6",
"typescript": "^5.9.2"
"turbo": "^2.5.6",
"typescript": "^5.7.3"
}
}
@@ -1,165 +0,0 @@
# AI SDK Provider for Grok CLI

A provider for the [AI SDK](https://sdk.vercel.ai) that integrates with [Grok CLI](https://docs.x.ai/api) for accessing xAI's Grok language models.

## Features

- ✅ **AI SDK v5 Compatible** - Full support for the latest AI SDK interfaces
- ✅ **Streaming & Non-streaming** - Both generation modes supported
- ✅ **Error Handling** - Comprehensive error handling with retry logic
- ✅ **Type Safety** - Full TypeScript support with proper type definitions
- ✅ **JSON Mode** - Automatic JSON extraction from responses
- ✅ **Abort Signals** - Proper cancellation support

## Installation

```bash
npm install @tm/ai-sdk-provider-grok-cli
# or
yarn add @tm/ai-sdk-provider-grok-cli
```

## Prerequisites

1. Install the Grok CLI:

   ```bash
   npm install -g grok-cli
   # or follow xAI's installation instructions
   ```

2. Set up authentication:

   ```bash
   export GROK_CLI_API_KEY="your-api-key"
   # or configure via the grok CLI: grok config set api-key your-key
   ```

## Usage

### Basic Usage

```typescript
import { grokCli } from '@tm/ai-sdk-provider-grok-cli';
import { generateText } from 'ai';

const result = await generateText({
  model: grokCli('grok-3-latest'),
  prompt: 'Write a haiku about TypeScript'
});

console.log(result.text);
```

### Streaming

```typescript
import { grokCli } from '@tm/ai-sdk-provider-grok-cli';
import { streamText } from 'ai';

const { textStream } = await streamText({
  model: grokCli('grok-4-latest'),
  prompt: 'Explain quantum computing'
});

for await (const delta of textStream) {
  process.stdout.write(delta);
}
```

### JSON Mode

```typescript
import { grokCli } from '@tm/ai-sdk-provider-grok-cli';
import { generateObject } from 'ai';
import { z } from 'zod';

const result = await generateObject({
  model: grokCli('grok-3-latest'),
  schema: z.object({
    name: z.string(),
    age: z.number(),
    hobbies: z.array(z.string())
  }),
  prompt: 'Generate a person profile'
});

console.log(result.object);
```

## Supported Models

- `grok-3-latest` - Grok 3 (latest version)
- `grok-4-latest` - Grok 4 (latest version)
- `grok-4` - Grok 4 (stable)
- Custom model strings are also supported

## Configuration

### Provider Settings

```typescript
import { createGrokCli } from '@tm/ai-sdk-provider-grok-cli';

const grok = createGrokCli({
  apiKey: 'your-api-key', // Optional if set via env/CLI
  timeout: 120000, // 2 minutes default
  workingDirectory: '/path/to/project', // Optional
  baseURL: 'https://api.x.ai' // Optional
});
```

### Model Settings

```typescript
const model = grok('grok-4-latest', {
  timeout: 300000, // 5 minutes for grok-4
  // Other CLI-specific settings
});
```

## Error Handling

The provider includes comprehensive error handling:

```typescript
import {
  isAuthenticationError,
  isTimeoutError,
  isInstallationError
} from '@tm/ai-sdk-provider-grok-cli';

try {
  const result = await generateText({
    model: grokCli('grok-4-latest'),
    prompt: 'Hello!'
  });
} catch (error) {
  if (isAuthenticationError(error)) {
    console.error('Authentication failed:', error.message);
  } else if (isTimeoutError(error)) {
    console.error('Request timed out:', error.message);
  } else if (isInstallationError(error)) {
    console.error('Grok CLI not installed or not found in PATH');
  }
}
```

## Development

```bash
# Install dependencies
npm install

# Start development mode (keep running during development)
npm run dev

# Type check
npm run typecheck

# Run tests (requires a build first)
NODE_ENV=production npm run build
npm test
```

**Important**: Always run `npm run dev` and keep it running during development. This ensures proper compilation and hot-reloading of TypeScript files.
@@ -1,35 +0,0 @@
{
  "name": "@tm/ai-sdk-provider-grok-cli",
  "private": true,
  "description": "AI SDK provider for Grok CLI integration",
  "type": "module",
  "types": "./src/index.ts",
  "main": "./dist/index.js",
  "exports": {
    ".": "./src/index.ts"
  },
  "scripts": {
    "test": "vitest run",
    "test:watch": "vitest",
    "test:ui": "vitest --ui",
    "typecheck": "tsc --noEmit"
  },
  "dependencies": {
    "@ai-sdk/provider": "^2.0.0",
    "@ai-sdk/provider-utils": "^3.0.10",
    "jsonc-parser": "^3.3.1"
  },
  "devDependencies": {
    "@types/node": "^22.18.6",
    "typescript": "^5.9.2",
    "vitest": "^3.2.4"
  },
  "engines": {
    "node": ">=18"
  },
  "keywords": ["ai", "grok", "x.ai", "cli", "language-model", "provider"],
  "files": ["dist/**/*", "README.md"],
  "publishConfig": {
    "access": "public"
  }
}
@@ -1,188 +0,0 @@
/**
 * Tests for error handling utilities
 */

import { APICallError, LoadAPIKeyError } from '@ai-sdk/provider';
import { describe, expect, it } from 'vitest';
import {
  createAPICallError,
  createAuthenticationError,
  createInstallationError,
  createTimeoutError,
  getErrorMetadata,
  isAuthenticationError,
  isInstallationError,
  isTimeoutError
} from './errors.js';

describe('createAPICallError', () => {
  it('should create APICallError with metadata', () => {
    const error = createAPICallError({
      message: 'Test error',
      code: 'TEST_ERROR',
      exitCode: 1,
      stderr: 'Error output',
      stdout: 'Success output',
      promptExcerpt: 'Test prompt',
      isRetryable: true
    });

    expect(error).toBeInstanceOf(APICallError);
    expect(error.message).toBe('Test error');
    expect(error.isRetryable).toBe(true);
    expect(error.url).toBe('grok-cli://command');
    expect(error.data).toEqual({
      code: 'TEST_ERROR',
      exitCode: 1,
      stderr: 'Error output',
      stdout: 'Success output',
      promptExcerpt: 'Test prompt'
    });
  });

  it('should create APICallError with minimal parameters', () => {
    const error = createAPICallError({
      message: 'Simple error'
    });

    expect(error).toBeInstanceOf(APICallError);
    expect(error.message).toBe('Simple error');
    expect(error.isRetryable).toBe(false);
  });
});

describe('createAuthenticationError', () => {
  it('should create LoadAPIKeyError with custom message', () => {
    const error = createAuthenticationError({
      message: 'Custom auth error'
    });

    expect(error).toBeInstanceOf(LoadAPIKeyError);
    expect(error.message).toBe('Custom auth error');
  });

  it('should create LoadAPIKeyError with default message', () => {
    const error = createAuthenticationError({});

    expect(error).toBeInstanceOf(LoadAPIKeyError);
    expect(error.message).toContain('Authentication failed');
  });
});

describe('createTimeoutError', () => {
  it('should create APICallError for timeout', () => {
    const error = createTimeoutError({
      message: 'Operation timed out',
      timeoutMs: 5000,
      promptExcerpt: 'Test prompt'
    });

    expect(error).toBeInstanceOf(APICallError);
    expect(error.message).toBe('Operation timed out');
    expect(error.isRetryable).toBe(true);
    expect(error.data).toEqual({
      code: 'TIMEOUT',
      promptExcerpt: 'Test prompt',
      timeoutMs: 5000
    });
  });
});

describe('createInstallationError', () => {
  it('should create APICallError for installation issues', () => {
    const error = createInstallationError({
      message: 'CLI not found'
    });

    expect(error).toBeInstanceOf(APICallError);
    expect(error.message).toBe('CLI not found');
    expect(error.isRetryable).toBe(false);
    expect(error.url).toBe('grok-cli://installation');
  });

  it('should create APICallError with default message', () => {
    const error = createInstallationError({});

    expect(error).toBeInstanceOf(APICallError);
    expect(error.message).toContain('Grok CLI is not installed');
  });
});

describe('isAuthenticationError', () => {
  it('should return true for LoadAPIKeyError', () => {
    const error = new LoadAPIKeyError({ message: 'Auth failed' });
    expect(isAuthenticationError(error)).toBe(true);
  });

  it('should return true for APICallError with 401 exit code', () => {
    const error = new APICallError({
      message: 'Unauthorized',
      data: { exitCode: 401 }
    });
    expect(isAuthenticationError(error)).toBe(true);
  });

  it('should return false for other errors', () => {
    const error = new Error('Generic error');
    expect(isAuthenticationError(error)).toBe(false);
  });
});

describe('isTimeoutError', () => {
  it('should return true for timeout APICallError', () => {
    const error = new APICallError({
      message: 'Timeout',
      data: { code: 'TIMEOUT' }
    });
    expect(isTimeoutError(error)).toBe(true);
  });

  it('should return false for other errors', () => {
    const error = new APICallError({ message: 'Other error' });
    expect(isTimeoutError(error)).toBe(false);
  });
});

describe('isInstallationError', () => {
  it('should return true for installation APICallError', () => {
    const error = new APICallError({
      message: 'Not installed',
      url: 'grok-cli://installation'
    });
    expect(isInstallationError(error)).toBe(true);
  });

  it('should return false for other errors', () => {
    const error = new APICallError({ message: 'Other error' });
    expect(isInstallationError(error)).toBe(false);
  });
});

describe('getErrorMetadata', () => {
  it('should return metadata from APICallError', () => {
    const metadata = {
      code: 'TEST_ERROR',
      exitCode: 1,
      stderr: 'Error output'
    };
    const error = new APICallError({
      message: 'Test error',
      data: metadata
    });

    const result = getErrorMetadata(error);
    expect(result).toEqual(metadata);
  });

  it('should return undefined for errors without metadata', () => {
    const error = new Error('Generic error');
    const result = getErrorMetadata(error);
    expect(result).toBeUndefined();
  });

  it('should return undefined for APICallError without data', () => {
    const error = new APICallError({ message: 'Test error' });
    const result = getErrorMetadata(error);
    expect(result).toBeUndefined();
  });
});
@@ -1,187 +0,0 @@
/**
 * Error handling utilities for Grok CLI provider
 */

import { APICallError, LoadAPIKeyError } from '@ai-sdk/provider';
import type { GrokCliErrorMetadata } from './types.js';

/**
 * Parameters for creating API call errors
 */
interface CreateAPICallErrorParams {
  /** Error message */
  message: string;
  /** Error code */
  code?: string;
  /** Process exit code */
  exitCode?: number;
  /** Standard error output */
  stderr?: string;
  /** Standard output */
  stdout?: string;
  /** Excerpt of the prompt */
  promptExcerpt?: string;
  /** Whether the error is retryable */
  isRetryable?: boolean;
}

/**
 * Parameters for creating authentication errors
 */
interface CreateAuthenticationErrorParams {
  /** Error message */
  message?: string;
}

/**
 * Parameters for creating timeout errors
 */
interface CreateTimeoutErrorParams {
  /** Error message */
  message: string;
  /** Excerpt of the prompt */
  promptExcerpt?: string;
  /** Timeout in milliseconds */
  timeoutMs: number;
}

/**
 * Parameters for creating installation errors
 */
interface CreateInstallationErrorParams {
  /** Error message */
  message?: string;
}

/**
 * Create an API call error with Grok CLI specific metadata
 */
export function createAPICallError({
  message,
  code,
  exitCode,
  stderr,
  stdout,
  promptExcerpt,
  isRetryable = false
}: CreateAPICallErrorParams): APICallError {
  const metadata: GrokCliErrorMetadata = {
    code,
    exitCode,
    stderr,
    stdout,
    promptExcerpt
  };

  return new APICallError({
    message,
    isRetryable,
    url: 'grok-cli://command',
    requestBodyValues: promptExcerpt ? { prompt: promptExcerpt } : undefined,
    data: metadata
  });
}

/**
 * Create an authentication error
 */
export function createAuthenticationError({
  message
}: CreateAuthenticationErrorParams): LoadAPIKeyError {
  return new LoadAPIKeyError({
    message:
      message ||
      'Authentication failed. Please ensure Grok CLI is properly configured with API key.'
  });
}

/**
 * Create a timeout error
 */
export function createTimeoutError({
  message,
  promptExcerpt,
  timeoutMs
}: CreateTimeoutErrorParams): APICallError {
  const metadata: GrokCliErrorMetadata & { timeoutMs: number } = {
    code: 'TIMEOUT',
    promptExcerpt,
    timeoutMs
  };

  return new APICallError({
    message,
    isRetryable: true,
    url: 'grok-cli://command',
    requestBodyValues: promptExcerpt ? { prompt: promptExcerpt } : undefined,
    data: metadata
  });
}

/**
 * Create a CLI installation error
 */
export function createInstallationError({
  message
}: CreateInstallationErrorParams): APICallError {
  return new APICallError({
    message:
      message ||
      'Grok CLI is not installed or not found in PATH. Please install with: npm install -g @vibe-kit/grok-cli',
    isRetryable: false,
    url: 'grok-cli://installation',
    requestBodyValues: undefined
  });
}

/**
 * Check if an error is an authentication error
 */
export function isAuthenticationError(
  error: unknown
): error is LoadAPIKeyError {
  if (error instanceof LoadAPIKeyError) return true;
  if (error instanceof APICallError) {
    const metadata = error.data as GrokCliErrorMetadata | undefined;
    if (!metadata) return false;
    return (
      metadata.exitCode === 401 ||
      metadata.code === 'AUTHENTICATION_ERROR' ||
      metadata.code === 'UNAUTHORIZED'
    );
  }
  return false;
}

/**
 * Check if an error is a timeout error
 */
export function isTimeoutError(error: unknown): error is APICallError {
  if (
    error instanceof APICallError &&
    (error.data as GrokCliErrorMetadata)?.code === 'TIMEOUT'
  )
    return true;
  return false;
}

/**
 * Check if an error is an installation error
 */
export function isInstallationError(error: unknown): error is APICallError {
  if (error instanceof APICallError && error.url === 'grok-cli://installation')
    return true;
  return false;
}

/**
 * Get error metadata from an error
 */
export function getErrorMetadata(
  error: unknown
): GrokCliErrorMetadata | undefined {
  if (error instanceof APICallError && error.data) {
    return error.data as GrokCliErrorMetadata;
  }
  return undefined;
}
@@ -1,121 +0,0 @@
|
||||
/**
|
||||
* Tests for Grok CLI provider
|
||||
*/
|
||||
|
||||
import { NoSuchModelError } from '@ai-sdk/provider';
|
||||
import { beforeEach, describe, expect, it, vi } from 'vitest';
|
||||
import { GrokCliLanguageModel } from './grok-cli-language-model.js';
|
||||
import { createGrokCli, grokCli } from './grok-cli-provider.js';
|
||||
|
||||
// Mock the GrokCliLanguageModel
|
||||
vi.mock('./grok-cli-language-model.js', () => ({
|
||||
GrokCliLanguageModel: vi.fn().mockImplementation((options) => ({
|
||||
modelId: options.id,
|
||||
settings: options.settings,
|
||||
provider: 'grok-cli'
|
||||
}))
|
||||
}));
|
||||
|
||||
describe('createGrokCli', () => {
|
||||
beforeEach(() => {
|
||||
vi.clearAllMocks();
|
||||
});
|
||||
|
||||
it('should create a provider with default settings', () => {
|
||||
const provider = createGrokCli();
|
||||
expect(typeof provider).toBe('function');
|
||||
expect(typeof provider.languageModel).toBe('function');
|
||||
expect(typeof provider.chat).toBe('function');
|
||||
expect(typeof provider.textEmbeddingModel).toBe('function');
|
||||
expect(typeof provider.imageModel).toBe('function');
|
||||
});
|
||||
|
||||
it('should create a provider with custom default settings', () => {
|
||||
const defaultSettings = {
|
||||
timeout: 5000,
|
||||
workingDirectory: '/custom/path'
|
||||
};
|
||||
const provider = createGrokCli({ defaultSettings });
|
||||
|
||||
const model = provider('grok-2-mini');
|
||||
|
||||
expect(GrokCliLanguageModel).toHaveBeenCalledWith({
|
||||
id: 'grok-2-mini',
|
||||
settings: defaultSettings
|
||||
});
|
||||
});
|
||||
|
||||
it('should create language models with merged settings', () => {
|
||||
const defaultSettings = { timeout: 5000 };
|
||||
const provider = createGrokCli({ defaultSettings });
|
||||
|
||||
const modelSettings = { apiKey: 'test-key' };
|
||||
const model = provider('grok-2', modelSettings);
|
||||
|
||||
expect(GrokCliLanguageModel).toHaveBeenCalledWith({
|
||||
id: 'grok-2',
|
||||
settings: { timeout: 5000, apiKey: 'test-key' }
|
||||
});
|
||||
});
|
||||
|
||||
it('should create models via languageModel method', () => {
|
||||
const provider = createGrokCli();
|
||||
const model = provider.languageModel('grok-2-mini', { timeout: 1000 });
|
||||
|
||||
expect(GrokCliLanguageModel).toHaveBeenCalledWith({
|
||||
id: 'grok-2-mini',
|
||||
settings: { timeout: 1000 }
|
||||
});
|
||||
});
|
||||
|
||||
it('should create models via chat method (alias)', () => {
|
||||
const provider = createGrokCli();
|
||||
const model = provider.chat('grok-2');
|
||||
|
||||
expect(GrokCliLanguageModel).toHaveBeenCalledWith({
|
||||
id: 'grok-2',
|
||||
settings: {}
|
||||
});
|
||||
});
|
||||
|
||||
it('should throw error when called with new keyword', () => {
|
||||
const provider = createGrokCli();
|
||||
expect(() => {
|
||||
// @ts-expect-error - intentionally testing invalid usage
|
||||
new provider('grok-2');
|
||||
}).toThrow(
|
||||
'The Grok CLI model function cannot be called with the new keyword.'
|
||||
);
|
||||
});
|
||||
|
||||
it('should throw NoSuchModelError for textEmbeddingModel', () => {
|
||||
const provider = createGrokCli();
|
||||
expect(() => {
|
||||
provider.textEmbeddingModel('test-model');
|
||||
}).toThrow(NoSuchModelError);
|
||||
});
|
||||
|
||||
it('should throw NoSuchModelError for imageModel', () => {
|
||||
const provider = createGrokCli();
|
||||
expect(() => {
|
||||
provider.imageModel('test-model');
|
||||
}).toThrow(NoSuchModelError);
|
||||
});
|
||||
});
|
||||
|
||||
describe('default grokCli provider', () => {
|
||||
it('should be a pre-configured provider instance', () => {
|
||||
expect(typeof grokCli).toBe('function');
|
||||
expect(typeof grokCli.languageModel).toBe('function');
|
||||
expect(typeof grokCli.chat).toBe('function');
|
||||
});
|
||||
|
||||
it('should create models with default configuration', () => {
|
||||
const model = grokCli('grok-2-mini');
|
||||
|
||||
expect(GrokCliLanguageModel).toHaveBeenCalledWith({
|
||||
id: 'grok-2-mini',
|
||||
settings: {}
|
||||
});
|
||||
});
|
||||
});
|
||||
@@ -1,108 +0,0 @@
|
||||
/**
|
||||
* Grok CLI provider implementation for AI SDK v5
|
||||
*/
|
||||
|
||||
import type { LanguageModelV2, ProviderV2 } from '@ai-sdk/provider';
|
||||
import { NoSuchModelError } from '@ai-sdk/provider';
|
||||
import { GrokCliLanguageModel } from './grok-cli-language-model.js';
|
||||
import type { GrokCliModelId, GrokCliSettings } from './types.js';
|
||||
|
||||
/**
|
||||
* Grok CLI provider interface that extends the AI SDK's ProviderV2
|
||||
*/
|
||||
export interface GrokCliProvider extends ProviderV2 {
|
||||
/**
|
||||
* Creates a language model instance for the specified model ID.
|
||||
* This is a shorthand for calling `languageModel()`.
|
||||
*/
|
||||
(modelId: GrokCliModelId, settings?: GrokCliSettings): LanguageModelV2;
|
||||
|
||||
/**
|
||||
* Creates a language model instance for text generation.
|
||||
*/
|
||||
languageModel(
|
||||
modelId: GrokCliModelId,
|
||||
settings?: GrokCliSettings
|
||||
): LanguageModelV2;
|
||||
|
||||
/**
|
||||
* Alias for `languageModel()` to maintain compatibility with AI SDK patterns.
|
||||
*/
|
||||
chat(modelId: GrokCliModelId, settings?: GrokCliSettings): LanguageModelV2;
|
||||
|
||||
textEmbeddingModel(modelId: string): never;
|
||||
imageModel(modelId: string): never;
|
||||
}
|
||||
|
||||
/**
|
||||
* Configuration options for creating a Grok CLI provider instance
|
||||
*/
|
||||
export interface GrokCliProviderSettings {
|
||||
/**
|
||||
* Default settings to use for all models created by this provider.
|
||||
* Individual model settings will override these defaults.
|
||||
*/
|
||||
defaultSettings?: GrokCliSettings;
|
||||
}
|
||||
|
||||
/**
|
||||
* Creates a Grok CLI provider instance with the specified configuration.
|
||||
* The provider can be used to create language models for interacting with Grok models.
|
||||
*/
|
||||
export function createGrokCli(
|
||||
options: GrokCliProviderSettings = {}
|
||||
): GrokCliProvider {
|
||||
const createModel = (
|
||||
modelId: GrokCliModelId,
|
||||
settings: GrokCliSettings = {}
|
||||
): LanguageModelV2 => {
|
||||
const mergedSettings = {
|
||||
...options.defaultSettings,
|
||||
...settings
|
||||
};
|
||||
|
||||
return new GrokCliLanguageModel({
|
||||
id: modelId,
|
||||
settings: mergedSettings
|
||||
});
|
||||
};
|
||||
|
||||
const provider = function (
|
||||
modelId: GrokCliModelId,
|
||||
settings?: GrokCliSettings
|
||||
) {
|
||||
if (new.target) {
|
||||
throw new Error(
|
||||
'The Grok CLI model function cannot be called with the new keyword.'
|
||||
);
|
||||
}
|
||||
|
||||
return createModel(modelId, settings);
|
||||
};
|
||||
|
||||
provider.languageModel = createModel;
|
||||
provider.chat = createModel; // Alias for languageModel
|
||||
|
||||
// Add textEmbeddingModel method that throws NoSuchModelError
|
||||
provider.textEmbeddingModel = (modelId: string) => {
|
||||
throw new NoSuchModelError({
|
||||
modelId,
|
||||
modelType: 'textEmbeddingModel'
|
||||
});
|
||||
};
|
||||
|
||||
provider.imageModel = (modelId: string) => {
|
||||
throw new NoSuchModelError({
|
||||
modelId,
|
||||
modelType: 'imageModel'
|
||||
});
|
||||
};
|
||||
|
||||
return provider as GrokCliProvider;
|
||||
}
|
||||
|
||||
/**
|
||||
* Default Grok CLI provider instance.
|
||||
* Pre-configured provider for quick usage without custom settings.
|
||||
*/
|
||||
export const grokCli = createGrokCli();
|
||||
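The settings-merge behavior that the tests above exercise comes from the spread order `{ ...options.defaultSettings, ...settings }` in `createModel`: per-model settings win over provider defaults. A self-contained sketch of just that precedence rule (the `Settings` interface and `mergeSettings` helper are illustrative, not part of the package):

```typescript
// Illustrative subset of GrokCliSettings.
interface Settings {
	timeout?: number;
	apiKey?: string;
}

// Later spread entries overwrite earlier ones, so per-model overrides
// take precedence over provider-level defaults.
function mergeSettings(defaults: Settings, overrides: Settings): Settings {
	return { ...defaults, ...overrides };
}

const merged = mergeSettings({ timeout: 5000 }, { apiKey: 'test-key' });
// → { timeout: 5000, apiKey: 'test-key' }
```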
@@ -1,64 +0,0 @@
|
||||
/**
|
||||
* Provider exports for creating and configuring Grok CLI instances.
|
||||
*/
|
||||
|
||||
/**
|
||||
* Creates a new Grok CLI provider instance and the default provider instance.
|
||||
*/
|
||||
export { createGrokCli, grokCli } from './grok-cli-provider.js';
|
||||
|
||||
/**
|
||||
* Type definitions for the Grok CLI provider.
|
||||
*/
|
||||
export type {
|
||||
GrokCliProvider,
|
||||
GrokCliProviderSettings
|
||||
} from './grok-cli-provider.js';
|
||||
|
||||
/**
|
||||
* Language model implementation for Grok CLI.
|
||||
* This class implements the AI SDK's LanguageModelV2 interface.
|
||||
*/
|
||||
export { GrokCliLanguageModel } from './grok-cli-language-model.js';
|
||||
|
||||
/**
|
||||
* Type definitions for Grok CLI language models.
|
||||
*/
|
||||
export type {
|
||||
GrokCliModelId,
|
||||
GrokCliLanguageModelOptions,
|
||||
GrokCliSettings,
|
||||
GrokCliMessage,
|
||||
GrokCliResponse,
|
||||
GrokCliErrorMetadata
|
||||
} from './types.js';
|
||||
|
||||
/**
|
||||
* Error handling utilities for Grok CLI.
|
||||
* These functions help create and identify specific error types.
|
||||
*/
|
||||
export {
|
||||
isAuthenticationError,
|
||||
isTimeoutError,
|
||||
isInstallationError,
|
||||
getErrorMetadata,
|
||||
createAPICallError,
|
||||
createAuthenticationError,
|
||||
createTimeoutError,
|
||||
createInstallationError
|
||||
} from './errors.js';
|
||||
|
||||
/**
|
||||
* Message conversion utilities for Grok CLI communication.
|
||||
*/
|
||||
export {
|
||||
convertToGrokCliMessages,
|
||||
convertFromGrokCliResponse,
|
||||
createPromptFromMessages,
|
||||
escapeShellArg
|
||||
} from './message-converter.js';
|
||||
|
||||
/**
|
||||
* JSON extraction utilities for parsing Grok responses.
|
||||
*/
|
||||
export { extractJson } from './json-extractor.js';
|
||||
@@ -1,81 +0,0 @@
|
||||
/**
|
||||
* Tests for JSON extraction utilities
|
||||
*/
|
||||
|
||||
import { describe, expect, it } from 'vitest';
|
||||
import { extractJson } from './json-extractor.js';
|
||||
|
||||
describe('extractJson', () => {
|
||||
it('should extract JSON from markdown code blocks', () => {
|
||||
const text = '```json\n{"name": "test", "value": 42}\n```';
|
||||
const result = extractJson(text);
|
||||
expect(JSON.parse(result)).toEqual({ name: 'test', value: 42 });
|
||||
});
|
||||
|
||||
it('should extract JSON from generic code blocks', () => {
|
||||
const text = '```\n{"name": "test", "value": 42}\n```';
|
||||
const result = extractJson(text);
|
||||
expect(JSON.parse(result)).toEqual({ name: 'test', value: 42 });
|
||||
});
|
||||
|
||||
it('should remove JavaScript variable declarations', () => {
|
||||
const text = 'const result = {"name": "test", "value": 42};';
|
||||
const result = extractJson(text);
|
||||
expect(JSON.parse(result)).toEqual({ name: 'test', value: 42 });
|
||||
});
|
||||
|
||||
it('should handle let variable declarations', () => {
|
||||
const text = 'let data = {"name": "test", "value": 42};';
|
||||
const result = extractJson(text);
|
||||
expect(JSON.parse(result)).toEqual({ name: 'test', value: 42 });
|
||||
});
|
||||
|
||||
it('should handle var variable declarations', () => {
|
||||
const text = 'var config = {"name": "test", "value": 42};';
|
||||
const result = extractJson(text);
|
||||
expect(JSON.parse(result)).toEqual({ name: 'test', value: 42 });
|
||||
});
|
||||
|
||||
it('should extract JSON arrays', () => {
|
||||
const text = '[{"name": "test1"}, {"name": "test2"}]';
|
||||
const result = extractJson(text);
|
||||
expect(JSON.parse(result)).toEqual([{ name: 'test1' }, { name: 'test2' }]);
|
||||
});
|
||||
|
||||
it('should convert JavaScript object literals to JSON', () => {
|
||||
const text = "{name: 'test', value: 42}";
|
||||
const result = extractJson(text);
|
||||
expect(JSON.parse(result)).toEqual({ name: 'test', value: 42 });
|
||||
});
|
||||
|
||||
it('should return valid JSON (canonical formatting)', () => {
|
||||
const text = '{"name": "test", "value": 42}';
|
||||
const result = extractJson(text);
|
||||
expect(JSON.parse(result)).toEqual({ name: 'test', value: 42 });
|
||||
});
|
||||
|
||||
it('should return original text when JSON parsing fails completely', () => {
|
||||
const text = 'This is not JSON at all';
|
||||
const result = extractJson(text);
|
||||
expect(result).toBe('This is not JSON at all');
|
||||
});
|
||||
|
||||
it('should handle complex nested objects', () => {
|
||||
const text =
|
||||
'```json\n{\n "user": {\n "name": "John",\n "age": 30\n },\n "items": [1, 2, 3]\n}\n```';
|
||||
const result = extractJson(text);
|
||||
expect(JSON.parse(result)).toEqual({
|
||||
user: {
|
||||
name: 'John',
|
||||
age: 30
|
||||
},
|
||||
items: [1, 2, 3]
|
||||
});
|
||||
});
|
||||
|
||||
it('should handle mixed quotes in object literals', () => {
|
||||
const text = `{name: "test", value: 'mixed quotes'}`;
|
||||
const result = extractJson(text);
|
||||
expect(JSON.parse(result)).toEqual({ name: 'test', value: 'mixed quotes' });
|
||||
});
|
||||
});
|
||||
@@ -1,132 +0,0 @@
|
||||
/**
|
||||
* Extract JSON from AI's response using a tolerant parser.
|
||||
*
|
||||
* The function removes common wrappers such as markdown fences or variable
|
||||
* declarations and then attempts to parse the remaining text with
|
||||
* `jsonc-parser`. If valid JSON (or JSONC) can be parsed, it is returned as a
|
||||
* string via `JSON.stringify`. Otherwise the original text is returned.
|
||||
*
|
||||
* @param text - Raw text which may contain JSON
|
||||
* @returns A valid JSON string if extraction succeeds, otherwise the original text
|
||||
*/
|
||||
import { parse, type ParseError } from 'jsonc-parser';
|
||||
|
||||
export function extractJson(text: string): string {
|
||||
let content = text.trim();
|
||||
|
||||
// Strip ```json or ``` fences
|
||||
const fenceMatch = /```(?:json)?\s*([\s\S]*?)\s*```/i.exec(content);
|
||||
if (fenceMatch) {
|
||||
content = fenceMatch[1];
|
||||
}
|
||||
|
||||
// Strip variable declarations like `const foo =` or `let foo =`
|
||||
const varMatch = /^\s*(?:const|let|var)\s+\w+\s*=\s*([\s\S]*)/i.exec(content);
|
||||
if (varMatch) {
|
||||
content = varMatch[1];
|
||||
// Remove trailing semicolon if present
|
||||
if (content.trim().endsWith(';')) {
|
||||
content = content.trim().slice(0, -1);
|
||||
}
|
||||
}
|
||||
|
||||
// Find the first opening bracket
|
||||
const firstObj = content.indexOf('{');
|
||||
const firstArr = content.indexOf('[');
|
||||
if (firstObj === -1 && firstArr === -1) {
|
||||
return text;
|
||||
}
|
||||
const start =
|
||||
firstArr === -1
|
||||
? firstObj
|
||||
: firstObj === -1
|
||||
? firstArr
|
||||
: Math.min(firstObj, firstArr);
|
||||
content = content.slice(start);
|
||||
|
||||
// Try to parse the entire string with jsonc-parser
|
||||
const tryParse = (value: string): string | undefined => {
|
||||
const errors: ParseError[] = [];
|
||||
try {
|
||||
const result = parse(value, errors, { allowTrailingComma: true });
|
||||
if (errors.length === 0) {
|
||||
return JSON.stringify(result, null, 2);
|
||||
}
|
||||
} catch {
|
||||
// ignore
|
||||
}
|
||||
return undefined;
|
||||
};
|
||||
|
||||
const parsed = tryParse(content);
|
||||
if (parsed !== undefined) {
|
||||
return parsed;
|
||||
}
|
||||
|
||||
// If parsing the full string failed, use a more efficient approach
|
||||
// to find valid JSON boundaries
|
||||
const openChar = content[0];
|
||||
const closeChar = openChar === '{' ? '}' : ']';
|
||||
|
||||
// Find all potential closing positions by tracking nesting depth
|
||||
const closingPositions: number[] = [];
|
||||
let depth = 0;
|
||||
let inString = false;
|
||||
let escapeNext = false;
|
||||
|
||||
for (let i = 0; i < content.length; i++) {
|
||||
const char = content[i];
|
||||
|
||||
if (escapeNext) {
|
||||
escapeNext = false;
|
||||
continue;
|
||||
}
|
||||
|
||||
if (char === '\\') {
|
||||
escapeNext = true;
|
||||
continue;
|
||||
}
|
||||
|
||||
if (char === '"' && !inString) {
|
||||
inString = true;
|
||||
continue;
|
||||
}
|
||||
|
||||
if (char === '"' && inString) {
|
||||
inString = false;
|
||||
continue;
|
||||
}
|
||||
|
||||
// Skip content inside strings
|
||||
if (inString) continue;
|
||||
|
||||
if (char === openChar) {
|
||||
depth++;
|
||||
} else if (char === closeChar) {
|
||||
depth--;
|
||||
if (depth === 0) {
|
||||
closingPositions.push(i + 1);
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
// Try parsing at each valid closing position, starting from the end
|
||||
for (let i = closingPositions.length - 1; i >= 0; i--) {
|
||||
const attempt = tryParse(content.slice(0, closingPositions[i]));
|
||||
if (attempt !== undefined) {
|
||||
return attempt;
|
||||
}
|
||||
}
|
||||
|
||||
// As a final fallback, try the original character-by-character approach
|
||||
// but only for the last 1000 characters to limit performance impact
|
||||
const searchStart = Math.max(0, content.length - 1000);
|
||||
for (let end = content.length - 1; end > searchStart; end--) {
|
||||
const attempt = tryParse(content.slice(0, end));
|
||||
if (attempt !== undefined) {
|
||||
return attempt;
|
||||
}
|
||||
}
|
||||
|
||||
return text;
|
||||
}
|
||||
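The recovery pipeline above (strip fences, strip declarations, then parse) can be sketched in a self-contained form. This simplified version, `extractJsonSketch`, is a hypothetical illustration only: it uses `JSON.parse` instead of the more tolerant `jsonc-parser` and omits the boundary-scanning fallbacks, so it accepts less input than the real `extractJson`:

```typescript
// Simplified sketch of the extraction pipeline: unwrap markdown fences,
// unwrap variable declarations, then parse strictly with JSON.parse.
function extractJsonSketch(text: string): string {
	let content = text.trim();

	// Strip ```json or ``` fences (same regex as the implementation above).
	const fence = /```(?:json)?\s*([\s\S]*?)\s*```/i.exec(content);
	if (fence) content = fence[1];

	// Strip `const/let/var foo = ...` wrappers and a trailing semicolon.
	const decl = /^\s*(?:const|let|var)\s+\w+\s*=\s*([\s\S]*)/i.exec(content);
	if (decl) content = decl[1].trim().replace(/;$/, '');

	try {
		// Re-serialize so callers always get canonical JSON.
		return JSON.stringify(JSON.parse(content), null, 2);
	} catch {
		return text; // fall back to the original input
	}
}

extractJsonSketch('const x = {"b": 2};'); // canonical JSON for {"b": 2}
extractJsonSketch('not json'); // "not json" (unchanged)
```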
@@ -1,163 +0,0 @@
|
||||
/**
|
||||
* Tests for message conversion utilities
|
||||
*/
|
||||
|
||||
import { describe, expect, it } from 'vitest';
|
||||
import {
|
||||
convertFromGrokCliResponse,
|
||||
convertToGrokCliMessages,
|
||||
createPromptFromMessages,
|
||||
escapeShellArg
|
||||
} from './message-converter.js';
|
||||
|
||||
describe('convertToGrokCliMessages', () => {
|
||||
it('should convert string content messages', () => {
|
||||
const messages = [
|
||||
{ role: 'user', content: 'Hello, world!' },
|
||||
{ role: 'assistant', content: 'Hi there!' }
|
||||
];
|
||||
|
||||
const result = convertToGrokCliMessages(messages);
|
||||
|
||||
expect(result).toEqual([
|
||||
{ role: 'user', content: 'Hello, world!' },
|
||||
{ role: 'assistant', content: 'Hi there!' }
|
||||
]);
|
||||
});
|
||||
|
||||
it('should convert array content messages', () => {
|
||||
const messages = [
|
||||
{
|
||||
role: 'user',
|
||||
content: [
|
||||
{ type: 'text', text: 'Hello' },
|
||||
{ type: 'text', text: 'World' }
|
||||
]
|
||||
}
|
||||
];
|
||||
|
||||
const result = convertToGrokCliMessages(messages);
|
||||
|
||||
expect(result).toEqual([{ role: 'user', content: 'Hello\nWorld' }]);
|
||||
});
|
||||
|
||||
it('should convert object content messages', () => {
|
||||
const messages = [
|
||||
{
|
||||
role: 'user',
|
||||
content: { text: 'Hello from object' }
|
||||
}
|
||||
];
|
||||
|
||||
const result = convertToGrokCliMessages(messages);
|
||||
|
||||
expect(result).toEqual([{ role: 'user', content: 'Hello from object' }]);
|
||||
});
|
||||
});
|
||||
|
||||
describe('convertFromGrokCliResponse', () => {
|
||||
it('should parse JSONL response format', () => {
|
||||
const responseText = `{"role": "assistant", "content": "Hello there!", "usage": {"prompt_tokens": 10, "completion_tokens": 5, "total_tokens": 15}}`;
|
||||
|
||||
const result = convertFromGrokCliResponse(responseText);
|
||||
|
||||
expect(result).toEqual({
|
||||
text: 'Hello there!',
|
||||
usage: {
|
||||
promptTokens: 10,
|
||||
completionTokens: 5,
|
||||
totalTokens: 15
|
||||
}
|
||||
});
|
||||
});
|
||||
|
||||
it('should handle multiple lines in JSONL format', () => {
|
||||
const responseText = `{"role": "user", "content": "Hello"}
|
||||
{"role": "assistant", "content": "Hi there!", "usage": {"prompt_tokens": 5, "completion_tokens": 3}}`;
|
||||
|
||||
const result = convertFromGrokCliResponse(responseText);
|
||||
|
||||
expect(result).toEqual({
|
||||
text: 'Hi there!',
|
||||
usage: {
|
||||
promptTokens: 5,
|
||||
completionTokens: 3,
|
||||
totalTokens: 0
|
||||
}
|
||||
});
|
||||
});
|
||||
|
||||
it('should fallback to raw text when parsing fails', () => {
|
||||
const responseText = 'Invalid JSON response';
|
||||
|
||||
const result = convertFromGrokCliResponse(responseText);
|
||||
|
||||
expect(result).toEqual({
|
||||
text: 'Invalid JSON response',
|
||||
usage: undefined
|
||||
});
|
||||
});
|
||||
});
|
||||
|
||||
describe('createPromptFromMessages', () => {
|
||||
it('should create formatted prompt from messages', () => {
|
||||
const messages = [
|
||||
{ role: 'system', content: 'You are a helpful assistant.' },
|
||||
{ role: 'user', content: 'What is 2+2?' },
|
||||
{ role: 'assistant', content: '2+2 equals 4.' }
|
||||
];
|
||||
|
||||
const result = createPromptFromMessages(messages);
|
||||
|
||||
expect(result).toBe(
|
||||
'System: You are a helpful assistant.\n\nUser: What is 2+2?\n\nAssistant: 2+2 equals 4.'
|
||||
);
|
||||
});
|
||||
|
||||
it('should handle custom role names', () => {
|
||||
const messages = [{ role: 'custom', content: 'Custom message' }];
|
||||
|
||||
const result = createPromptFromMessages(messages);
|
||||
|
||||
expect(result).toBe('custom: Custom message');
|
||||
});
|
||||
|
||||
it('should trim whitespace from message content', () => {
|
||||
const messages = [
|
||||
{ role: 'user', content: ' Hello with spaces ' },
|
||||
{ role: 'assistant', content: '\n\nResponse with newlines\n\n' }
|
||||
];
|
||||
|
||||
const result = createPromptFromMessages(messages);
|
||||
|
||||
expect(result).toBe(
|
||||
'User: Hello with spaces\n\nAssistant: Response with newlines'
|
||||
);
|
||||
});
|
||||
});
|
||||
|
||||
describe('escapeShellArg', () => {
|
||||
it('should escape single quotes', () => {
|
||||
const arg = "It's a test";
|
||||
const result = escapeShellArg(arg);
|
||||
expect(result).toBe("'It'\\''s a test'");
|
||||
});
|
||||
|
||||
it('should handle strings without special characters', () => {
|
||||
const arg = 'simple string';
|
||||
const result = escapeShellArg(arg);
|
||||
expect(result).toBe("'simple string'");
|
||||
});
|
||||
|
||||
it('should convert non-string values to strings', () => {
|
||||
const arg = 123;
|
||||
const result = escapeShellArg(arg);
|
||||
expect(result).toBe("'123'");
|
||||
});
|
||||
|
||||
it('should handle empty strings', () => {
|
||||
const arg = '';
|
||||
const result = escapeShellArg(arg);
|
||||
expect(result).toBe("''");
|
||||
});
|
||||
});
|
||||
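The `escapeShellArg` expectations above follow the standard POSIX single-quote escaping pattern. A minimal sketch consistent with those tests (`escapeShellArgSketch` is an illustrative stand-in, not the package's implementation):

```typescript
// POSIX-style single-quote escaping: wrap the value in single quotes and
// rewrite each embedded quote as '\'' (close the quoted span, emit an
// escaped quote, reopen the span). Non-strings are stringified first.
function escapeShellArgSketch(arg: unknown): string {
	return "'" + String(arg).replace(/'/g, "'\\''") + "'";
}

escapeShellArgSketch("It's a test"); // 'It'\''s a test'
escapeShellArgSketch(123); // '123'
```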
@@ -1,81 +0,0 @@
|
||||
/**
|
||||
* Type definitions for Grok CLI provider
|
||||
*/
|
||||
|
||||
/**
|
||||
* Settings for configuring Grok CLI behavior
|
||||
*/
|
||||
export interface GrokCliSettings {
|
||||
/** API key for Grok CLI */
|
||||
apiKey?: string;
|
||||
/** Base URL for Grok API */
|
||||
baseURL?: string;
|
||||
/** Default model to use */
|
||||
model?: string;
|
||||
/** Timeout in milliseconds */
|
||||
timeout?: number;
|
||||
/** Working directory for CLI commands */
|
||||
workingDirectory?: string;
|
||||
}
|
||||
|
||||
/**
|
||||
* Model identifiers supported by Grok CLI
|
||||
*/
|
||||
export type GrokCliModelId = string;
|
||||
|
||||
/**
|
||||
* Error metadata for Grok CLI operations
|
||||
*/
|
||||
export interface GrokCliErrorMetadata {
|
||||
/** Error code */
|
||||
code?: string;
|
||||
/** Process exit code */
|
||||
exitCode?: number;
|
||||
/** Standard error output */
|
||||
stderr?: string;
|
||||
/** Standard output */
|
||||
stdout?: string;
|
||||
/** Excerpt of the prompt that caused the error */
|
||||
promptExcerpt?: string;
|
||||
/** Timeout value in milliseconds */
|
||||
timeoutMs?: number;
|
||||
}
|
||||
|
||||
/**
|
||||
* Message format for Grok CLI communication
|
||||
*/
|
||||
export interface GrokCliMessage {
|
||||
/** Message role (user, assistant, system) */
|
||||
role: string;
|
||||
/** Message content */
|
||||
content: string;
|
||||
}
|
||||
|
||||
/**
|
||||
* Response format from Grok CLI
|
||||
*/
|
||||
export interface GrokCliResponse {
|
||||
/** Message role */
|
||||
role: string;
|
||||
/** Response content */
|
||||
content: string;
|
||||
/** Token usage information */
|
||||
usage?: {
|
||||
/** Input tokens used */
|
||||
prompt_tokens?: number;
|
||||
/** Output tokens used */
|
||||
completion_tokens?: number;
|
||||
/** Total tokens used */
|
||||
total_tokens?: number;
|
||||
};
|
||||
}
|
||||
|
||||
/**
|
||||
* Configuration options for Grok CLI language model
|
||||
*/
|
||||
export interface GrokCliLanguageModelOptions {
|
||||
/** Model identifier */
|
||||
id: GrokCliModelId;
|
||||
/** Model settings */
|
||||
settings?: GrokCliSettings;
|
||||
}
|
||||
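Note that `GrokCliResponse.usage` uses snake_case field names, while the converter tests earlier expect camelCase (`promptTokens`, and `totalTokens` defaulting to 0 when `total_tokens` is absent). A hedged sketch of that mapping, with `mapUsage` as a hypothetical helper name and the zero defaults inferred from the test expectations:

```typescript
// Usage shape as reported by Grok CLI (snake_case, all optional).
interface GrokCliUsageSketch {
	prompt_tokens?: number;
	completion_tokens?: number;
	total_tokens?: number;
}

// Map CLI-side snake_case counters to the camelCase shape the converter
// tests expect; missing counters default to 0 (an assumption consistent
// with totalTokens: 0 in the multi-line JSONL test).
function mapUsage(usage?: GrokCliUsageSketch) {
	if (!usage) return undefined;
	return {
		promptTokens: usage.prompt_tokens ?? 0,
		completionTokens: usage.completion_tokens ?? 0,
		totalTokens: usage.total_tokens ?? 0
	};
}

mapUsage({ prompt_tokens: 5, completion_tokens: 3 });
// → { promptTokens: 5, completionTokens: 3, totalTokens: 0 }
```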
@@ -1,36 +0,0 @@
|
||||
{
|
||||
"compilerOptions": {
|
||||
"target": "ES2022",
|
||||
"module": "ESNext",
|
||||
"lib": ["ES2022"],
|
||||
"declaration": true,
|
||||
"declarationMap": true,
|
||||
"sourceMap": true,
|
||||
"outDir": "./dist",
|
||||
"baseUrl": ".",
|
||||
"rootDir": "./src",
|
||||
"strict": true,
|
||||
"noImplicitAny": true,
|
||||
"strictNullChecks": true,
|
||||
"strictFunctionTypes": true,
|
||||
"strictBindCallApply": true,
|
||||
"strictPropertyInitialization": true,
|
||||
"noImplicitThis": true,
|
||||
"alwaysStrict": true,
|
||||
"noUnusedLocals": true,
|
||||
"noUnusedParameters": true,
|
||||
"noImplicitReturns": true,
|
||||
"noFallthroughCasesInSwitch": true,
|
||||
"esModuleInterop": true,
|
||||
"skipLibCheck": true,
|
||||
"forceConsistentCasingInFileNames": true,
|
||||
"moduleResolution": "bundler",
|
||||
"moduleDetection": "force",
|
||||
"types": ["node"],
|
||||
"resolveJsonModule": true,
|
||||
"isolatedModules": true,
|
||||
"allowImportingTsExtensions": false
|
||||
},
|
||||
"include": ["src/**/*"],
|
||||
"exclude": ["node_modules", "dist", "tests", "**/*.test.ts", "**/*.spec.ts"]
|
||||
}
|
||||
@@ -20,7 +20,8 @@
 		"typecheck": "tsc --noEmit"
 	},
 	"devDependencies": {
-		"typescript": "^5.9.2"
+		"dotenv-mono": "^1.5.1",
+		"typescript": "^5.7.3"
 	},
 	"dependencies": {
 		"tsup": "^8.5.0"
@@ -43,9 +43,9 @@ export const baseConfig: Partial<UserConfig> = {
 export function mergeConfig(
 	base: Partial<UserConfig>,
 	overrides: Partial<UserConfig>
-): UserConfig {
+): Partial<UserConfig> {
 	return {
 		...base,
 		...overrides
-	} as UserConfig;
+	};
 }
@@ -31,13 +31,21 @@
 	},
 	"dependencies": {
 		"@supabase/supabase-js": "^2.57.4",
-		"zod": "^4.1.11"
+		"zod": "^3.23.8"
 	},
 	"devDependencies": {
 		"@biomejs/biome": "^1.9.4",
 		"@tm/build-config": "*",
 		"@types/node": "^22.10.5",
-		"@vitest/coverage-v8": "^3.2.4",
-		"typescript": "^5.9.2",
-		"vitest": "^3.2.4"
+		"@vitest/coverage-v8": "^2.0.5",
+		"dotenv-mono": "^1.5.1",
+		"ts-node": "^10.9.2",
+		"tsup": "^8.5.0",
+		"typescript": "^5.7.3",
+		"vitest": "^2.1.8"
 	},
+	"engines": {
+		"node": ">=18.0.0"
+	},
+	"files": ["src", "README.md", "CHANGELOG.md"],
+	"keywords": ["task-management", "typescript", "ai", "prd", "parser"],
@@ -33,9 +33,6 @@ export class TaskEntity implements Task {
 	tags?: string[];
 	assignee?: string;
 	complexity?: Task['complexity'];
-	recommendedSubtasks?: number;
-	expansionPrompt?: string;
-	complexityReasoning?: string;

 	constructor(data: Task | (Omit<Task, 'id'> & { id: number | string })) {
 		this.validate(data);
@@ -65,9 +62,6 @@ export class TaskEntity implements Task {
 		this.tags = data.tags;
 		this.assignee = data.assignee;
 		this.complexity = data.complexity;
-		this.recommendedSubtasks = data.recommendedSubtasks;
-		this.expansionPrompt = data.expansionPrompt;
-		this.complexityReasoning = data.complexityReasoning;
 	}

 	/**
@@ -252,10 +246,7 @@ export class TaskEntity implements Task {
 			actualEffort: this.actualEffort,
 			tags: this.tags,
 			assignee: this.assignee,
-			complexity: this.complexity,
-			recommendedSubtasks: this.recommendedSubtasks,
-			expansionPrompt: this.expansionPrompt,
-			complexityReasoning: this.complexityReasoning
+			complexity: this.complexity
 		};
 	}
@@ -61,12 +61,3 @@ export { getLogger, createLogger, setGlobalLogger } from './logger/index.js';

 // Re-export executors
 export * from './executors/index.js';
-
-// Re-export reports
-export {
-	ComplexityReportManager,
-	type ComplexityReport,
-	type ComplexityReportMetadata,
-	type ComplexityAnalysis,
-	type TaskComplexityData
-} from './reports/index.js';
@@ -1,185 +0,0 @@
|
||||
/**
|
||||
* @fileoverview ComplexityReportManager - Handles loading and managing complexity analysis reports
|
||||
* Follows the same pattern as ConfigManager and AuthManager
|
||||
*/
|
||||
|
||||
import { promises as fs } from 'fs';
|
||||
import path from 'path';
|
||||
import type {
|
||||
ComplexityReport,
|
||||
ComplexityAnalysis,
|
||||
TaskComplexityData
|
||||
} from './types.js';
|
||||
import { getLogger } from '../logger/index.js';
|
||||
|
||||
const logger = getLogger('ComplexityReportManager');
|
||||
|
||||
/**
|
||||
* Manages complexity analysis reports
|
||||
* Handles loading, caching, and providing complexity data for tasks
|
||||
*/
|
||||
export class ComplexityReportManager {
|
||||
private projectRoot: string;
|
||||
private reportCache: Map<string, ComplexityReport> = new Map();
|
||||
|
||||
constructor(projectRoot: string) {
|
||||
this.projectRoot = projectRoot;
|
||||
}
|
||||
|
||||
/**
|
||||
* Get the path to the complexity report file for a given tag
|
||||
*/
|
||||
private getReportPath(tag?: string): string {
|
||||
const reportsDir = path.join(this.projectRoot, '.taskmaster', 'reports');
|
||||
const tagSuffix = tag && tag !== 'master' ? `_${tag}` : '';
|
||||
return path.join(reportsDir, `task-complexity-report${tagSuffix}.json`);
|
||||
}
|
||||
|
||||
/**
|
||||
* Load complexity report for a given tag
|
||||
* Results are cached to avoid repeated file reads
|
||||
*/
|
||||
async loadReport(tag?: string): Promise<ComplexityReport | null> {
|
||||
const resolvedTag = tag || 'master';
|
||||
const cacheKey = resolvedTag;
|
||||
|
||||
// Check cache first
|
||||
if (this.reportCache.has(cacheKey)) {
|
||||
return this.reportCache.get(cacheKey)!;
|
||||
}
|
||||
|
||||
const reportPath = this.getReportPath(tag);
|
||||
|
||||
try {
|
||||
// Check if file exists
|
||||
await fs.access(reportPath);
|
||||
|
||||
// Read and parse the report
|
||||
const content = await fs.readFile(reportPath, 'utf-8');
|
||||
const report = JSON.parse(content) as ComplexityReport;
|
||||
|
||||
// Validate basic structure
|
||||
if (!report.meta || !Array.isArray(report.complexityAnalysis)) {
|
||||
logger.warn(
|
||||
`Invalid complexity report structure at ${reportPath}, ignoring`
|
||||
);
|
||||
return null;
|
||||
}
|
||||
|
||||
// Cache the report
|
||||
this.reportCache.set(cacheKey, report);
|
||||
|
||||
logger.debug(
|
||||
`Loaded complexity report for tag '${resolvedTag}' with ${report.complexityAnalysis.length} analyses`
|
||||
);
|
||||
|
||||
return report;
|
||||
} catch (error: any) {
|
||||
if (error.code === 'ENOENT') {
|
||||
        // File doesn't exist - this is normal, not all projects have complexity reports
        logger.debug(`No complexity report found for tag '${resolvedTag}'`);
        return null;
      }

      // Other errors (parsing, permissions, etc.)
      logger.warn(
        `Failed to load complexity report for tag '${resolvedTag}': ${error.message}`
      );
      return null;
    }
  }

  /**
   * Get complexity data for a specific task ID
   */
  async getComplexityForTask(
    taskId: string | number,
    tag?: string
  ): Promise<TaskComplexityData | null> {
    const report = await this.loadReport(tag);
    if (!report) {
      return null;
    }

    // Find the analysis for this task
    const analysis = report.complexityAnalysis.find(
      (a) => String(a.taskId) === String(taskId)
    );

    if (!analysis) {
      return null;
    }

    // Convert to TaskComplexityData format
    return {
      complexityScore: analysis.complexityScore,
      recommendedSubtasks: analysis.recommendedSubtasks,
      expansionPrompt: analysis.expansionPrompt,
      complexityReasoning: analysis.complexityReasoning
    };
  }

  /**
   * Get complexity data for multiple tasks at once
   * More efficient than calling getComplexityForTask multiple times
   */
  async getComplexityForTasks(
    taskIds: (string | number)[],
    tag?: string
  ): Promise<Map<string, TaskComplexityData>> {
    const result = new Map<string, TaskComplexityData>();
    const report = await this.loadReport(tag);

    if (!report) {
      return result;
    }

    // Create a map for fast lookups
    const analysisMap = new Map<string, ComplexityAnalysis>();
    report.complexityAnalysis.forEach((analysis) => {
      analysisMap.set(String(analysis.taskId), analysis);
    });

    // Map each task ID to its complexity data
    taskIds.forEach((taskId) => {
      const analysis = analysisMap.get(String(taskId));
      if (analysis) {
        result.set(String(taskId), {
          complexityScore: analysis.complexityScore,
          recommendedSubtasks: analysis.recommendedSubtasks,
          expansionPrompt: analysis.expansionPrompt,
          complexityReasoning: analysis.complexityReasoning
        });
      }
    });

    return result;
  }

  /**
   * Clear the report cache
   * @param tag - Specific tag to clear, or undefined to clear all cached reports
   * Useful when reports are regenerated or modified externally
   */
  clearCache(tag?: string): void {
    if (tag) {
      this.reportCache.delete(tag);
    } else {
      // Clear all cached reports
      this.reportCache.clear();
    }
  }

  /**
   * Check if a complexity report exists for a tag
   */
  async hasReport(tag?: string): Promise<boolean> {
    const reportPath = this.getReportPath(tag);
    try {
      await fs.access(reportPath);
      return true;
    } catch {
      return false;
    }
  }
}
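The bulk-lookup pattern used by `getComplexityForTasks` above can be sketched standalone. The types are trimmed to a few fields and the sample data is illustrative, not from a real report:

```typescript
// Standalone sketch of the bulk-lookup pattern from getComplexityForTasks:
// index the analyses once, then resolve each requested ID in O(1).
interface AnalysisEntry {
  taskId: string | number;
  complexityScore: number;
  recommendedSubtasks: number;
}

function bulkComplexityLookup(
  analyses: AnalysisEntry[],
  taskIds: (string | number)[]
): Map<string, AnalysisEntry> {
  // Build the index once instead of scanning the array per requested task
  const index = new Map<string, AnalysisEntry>();
  for (const a of analyses) index.set(String(a.taskId), a);

  const result = new Map<string, AnalysisEntry>();
  for (const id of taskIds) {
    const hit = index.get(String(id)); // String() bridges numeric and string IDs
    if (hit) result.set(String(id), hit);
  }
  return result;
}

const analyses: AnalysisEntry[] = [
  { taskId: 1, complexityScore: 7, recommendedSubtasks: 4 },
  { taskId: '2', complexityScore: 3, recommendedSubtasks: 2 }
];
const found = bulkComplexityLookup(analyses, ['1', 2, 99]);
// found.size === 2; task 99 has no analysis and is simply absent
```

Keying everything through `String()` is what lets a report that stores `taskId: 1` match a caller that passes `'1'`, and vice versa.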
@@ -1,11 +0,0 @@
/**
 * @fileoverview Reports module exports
 */

export { ComplexityReportManager } from './complexity-report-manager.js';
export type {
  ComplexityReport,
  ComplexityReportMetadata,
  ComplexityAnalysis,
  TaskComplexityData
} from './types.js';
@@ -1,65 +0,0 @@
/**
 * @fileoverview Type definitions for complexity analysis reports
 */

/**
 * Analysis result for a single task
 */
export interface ComplexityAnalysis {
  /** Task ID being analyzed */
  taskId: string | number;
  /** Task title */
  taskTitle: string;
  /** Complexity score (1-10 scale) */
  complexityScore: number;
  /** Recommended number of subtasks */
  recommendedSubtasks: number;
  /** AI-generated prompt for task expansion */
  expansionPrompt: string;
  /** Reasoning behind the complexity assessment */
  complexityReasoning: string;
}

/**
 * Metadata about the complexity report
 */
export interface ComplexityReportMetadata {
  /** When the report was generated */
  generatedAt: string;
  /** Number of tasks analyzed in this run */
  tasksAnalyzed: number;
  /** Total number of tasks in the file */
  totalTasks?: number;
  /** Total analyses in the report (across all runs) */
  analysisCount?: number;
  /** Complexity threshold score used */
  thresholdScore: number;
  /** Project name */
  projectName?: string;
  /** Whether research mode was used */
  usedResearch: boolean;
}

/**
 * Complete complexity analysis report
 */
export interface ComplexityReport {
  /** Report metadata */
  meta: ComplexityReportMetadata;
  /** Array of complexity analyses */
  complexityAnalysis: ComplexityAnalysis[];
}

/**
 * Complexity data to be attached to a Task
 */
export interface TaskComplexityData {
  /** Complexity score (1-10 scale) */
  complexityScore?: number;
  /** Recommended number of subtasks */
  recommendedSubtasks?: number;
  /** AI-generated expansion prompt */
  expansionPrompt?: string;
  /** Reasoning behind the assessment */
  complexityReasoning?: string;
}
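For reference, a minimal object shaped like these interfaces, trimmed to the required fields; every value below is made up for illustration:

```typescript
// Shape mirrors ComplexityAnalysis above, trimmed to required fields;
// all concrete values are illustrative, not taken from a real project.
interface AnalysisLite {
  taskId: string | number;
  taskTitle: string;
  complexityScore: number;
  recommendedSubtasks: number;
  expansionPrompt: string;
  complexityReasoning: string;
}

const analyses: AnalysisLite[] = [
  {
    taskId: 1,
    taskTitle: 'Set up storage layer',
    complexityScore: 7,
    recommendedSubtasks: 4,
    expansionPrompt: 'Break the storage layer into file operations, path resolution, and format handling.',
    complexityReasoning: 'Touches several subsystems and needs its own error handling.'
  }
];

const sampleReport = {
  meta: {
    generatedAt: '2025-01-01T00:00:00.000Z',
    tasksAnalyzed: 1, // count for this run, per the metadata interface
    thresholdScore: 5,
    usedResearch: false
  },
  complexityAnalysis: analyses
};
```

Scores stay on the 1-10 scale documented in the interface comments, and `meta.tasksAnalyzed` tracks the number of analyses produced in the run.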
@@ -135,28 +135,15 @@ export class TaskService {
  }

  /**
   * Get a single task by ID - delegates to storage layer
   * Get a single task by ID
   */
  async getTask(taskId: string, tag?: string): Promise<Task | null> {
    // Use provided tag or get active tag
    const activeTag = tag || this.getActiveTag();
    const result = await this.getTaskList({
      tag,
      includeSubtasks: true
    });

    try {
      // Delegate to storage layer which handles the specific logic for tasks vs subtasks
      return await this.storage.loadTask(String(taskId), activeTag);
    } catch (error) {
      throw new TaskMasterError(
        `Failed to get task ${taskId}`,
        ERROR_CODES.STORAGE_ERROR,
        {
          operation: 'getTask',
          resource: 'task',
          taskId: String(taskId),
          tag: activeTag
        },
        error as Error
      );
    }
    return result.tasks.find((t) => t.id === taskId) || null;
  }

  /**
@@ -397,6 +384,16 @@ export class TaskService {
      }
    }

    // Complexity filter
    if (filter.complexity) {
      const complexities = Array.isArray(filter.complexity)
        ? filter.complexity
        : [filter.complexity];
      if (!task.complexity || !complexities.includes(task.complexity)) {
        return false;
      }
    }

    // Search filter
    if (filter.search) {
      const searchLower = filter.search.toLowerCase();

@@ -11,7 +11,6 @@ import type {
import { FormatHandler } from './format-handler.js';
import { FileOperations } from './file-operations.js';
import { PathResolver } from './path-resolver.js';
import { ComplexityReportManager } from '../../reports/complexity-report-manager.js';

/**
 * File-based storage implementation using a single tasks.json file with separated concerns
@@ -20,13 +19,11 @@ export class FileStorage implements IStorage {
  private formatHandler: FormatHandler;
  private fileOps: FileOperations;
  private pathResolver: PathResolver;
  private complexityManager: ComplexityReportManager;

  constructor(projectPath: string) {
    this.formatHandler = new FormatHandler();
    this.fileOps = new FileOperations();
    this.pathResolver = new PathResolver(projectPath);
    this.complexityManager = new ComplexityReportManager(projectPath);
  }

  /**
@@ -90,7 +87,6 @@ export class FileStorage implements IStorage {

  /**
   * Load tasks from the single tasks.json file for a specific tag
   * Enriches tasks with complexity data from the complexity report
   */
  async loadTasks(tag?: string): Promise<Task[]> {
    const filePath = this.pathResolver.getTasksPath();
@@ -98,10 +94,7 @@ export class FileStorage implements IStorage {

    try {
      const rawData = await this.fileOps.readJson(filePath);
      const tasks = this.formatHandler.extractTasks(rawData, resolvedTag);

      // Enrich tasks with complexity data
      return await this.enrichTasksWithComplexity(tasks, resolvedTag);
      return this.formatHandler.extractTasks(rawData, resolvedTag);
    } catch (error: any) {
      if (error.code === 'ENOENT') {
        return []; // File doesn't exist, return empty array
@@ -112,65 +105,9 @@ export class FileStorage implements IStorage {

  /**
   * Load a single task by ID from the tasks.json file
   * Handles both regular tasks and subtasks (with dotted notation like "1.2")
   */
  async loadTask(taskId: string, tag?: string): Promise<Task | null> {
    const tasks = await this.loadTasks(tag);

    // Check if this is a subtask (contains a dot)
    if (taskId.includes('.')) {
      const [parentId, subtaskId] = taskId.split('.');
      const parentTask = tasks.find((t) => String(t.id) === parentId);

      if (!parentTask || !parentTask.subtasks) {
        return null;
      }

      const subtask = parentTask.subtasks.find(
        (st) => String(st.id) === subtaskId
      );
      if (!subtask) {
        return null;
      }

      const toFullSubId = (maybeDotId: string | number): string => {
        const depId = String(maybeDotId);
        return depId.includes('.') ? depId : `${parentTask.id}.${depId}`;
      };
      const resolvedDependencies =
        subtask.dependencies?.map((dep) => toFullSubId(dep)) ?? [];

      // Return a Task-like object for the subtask with the full dotted ID
      // Following the same pattern as findTaskById in utils.js
      const subtaskResult = {
        ...subtask,
        id: taskId, // Use the full dotted ID
        title: subtask.title || `Subtask ${subtaskId}`,
        description: subtask.description || '',
        status: subtask.status || 'pending',
        priority: subtask.priority || parentTask.priority || 'medium',
        dependencies: resolvedDependencies,
        details: subtask.details || '',
        testStrategy: subtask.testStrategy || '',
        subtasks: [],
        tags: parentTask.tags || [],
        assignee: subtask.assignee || parentTask.assignee,
        complexity: subtask.complexity || parentTask.complexity,
        createdAt: subtask.createdAt || parentTask.createdAt,
        updatedAt: subtask.updatedAt || parentTask.updatedAt,
        // Add reference to parent task for context (like utils.js does)
        parentTask: {
          id: parentTask.id,
          title: parentTask.title,
          status: parentTask.status
        },
        isSubtask: true
      };

      return subtaskResult;
    }

    // Handle regular task lookup
    return tasks.find((task) => String(task.id) === String(taskId)) || null;
}
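The dotted-ID normalization inside `loadTask` above can be extracted as a standalone sketch. Here `toFullSubId` takes the parent ID as a parameter, which differs from the closure form in the diff:

```typescript
// Sketch of the dependency normalization from loadTask: a subtask dependency
// may be stored as a bare sibling ID ("2") or already dotted ("5.2"); both
// are resolved to the full "parentId.subtaskId" form.
function toFullSubId(parentId: string | number, maybeDotId: string | number): string {
  const depId = String(maybeDotId);
  return depId.includes('.') ? depId : `${parentId}.${depId}`;
}

const deps = [2, '5.1', '3'].map((d) => toFullSubId(5, d));
// → ['5.2', '5.1', '5.3']
```

Already-dotted dependencies pass through unchanged, so the mapping is idempotent and safe to apply to mixed data.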

@@ -603,46 +540,6 @@ export class FileStorage implements IStorage {

    await this.saveTasks(tasks, targetTag);
  }

  /**
   * Enrich tasks with complexity data from the complexity report
   * Private helper method called by loadTasks()
   */
  private async enrichTasksWithComplexity(
    tasks: Task[],
    tag: string
  ): Promise<Task[]> {
    // Get all task IDs for bulk lookup
    const taskIds = tasks.map((t) => t.id);

    // Load complexity data for all tasks at once (more efficient)
    const complexityMap = await this.complexityManager.getComplexityForTasks(
      taskIds,
      tag
    );

    // If no complexity data found, return tasks as-is
    if (complexityMap.size === 0) {
      return tasks;
    }

    // Enrich each task with its complexity data
    return tasks.map((task) => {
      const complexityData = complexityMap.get(String(task.id));
      if (!complexityData) {
        return task;
      }

      // Merge complexity data into the task
      return {
        ...task,
        complexity: complexityData.complexityScore,
        recommendedSubtasks: complexityData.recommendedSubtasks,
        expansionPrompt: complexityData.expansionPrompt,
        complexityReasoning: complexityData.complexityReasoning
      };
    });
  }
}
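The pass-through merge in `enrichTasksWithComplexity` can be sketched with simplified types; tasks with no entry in the map are returned untouched:

```typescript
// Minimal sketch of the enrichment merge: spread the original task, then
// overlay the complexity score only when the map has an entry for that ID.
interface TaskLite {
  id: number;
  title: string;
  complexity?: number;
}

function enrich(tasks: TaskLite[], scores: Map<string, number>): TaskLite[] {
  return tasks.map((task) => {
    const score = scores.get(String(task.id));
    // No complexity data: return the task object unchanged
    return score === undefined ? task : { ...task, complexity: score };
  });
}

const enriched = enrich(
  [{ id: 1, title: 'a' }, { id: 2, title: 'b' }],
  new Map([['1', 7]])
);
// enriched[0].complexity === 7; enriched[1] has no complexity field
```

Because the merge spreads into a new object, the original task objects are never mutated, which keeps the storage layer's cached data safe.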

// Export as default for convenience

@@ -82,7 +82,7 @@ export class StorageFactory {
        apiAccessToken: credentials.token,
        apiEndpoint:
          config.storage?.apiEndpoint ||
          process.env.TM_PUBLIC_BASE_DOMAIN ||
          process.env.HAMSTER_API_URL ||
          'https://tryhamster.com/api'
      };
      config.storage = nextStorage;
@@ -112,7 +112,7 @@ export class StorageFactory {
        apiAccessToken: credentials.token,
        apiEndpoint:
          config.storage?.apiEndpoint ||
          process.env.TM_PUBLIC_BASE_DOMAIN ||
          process.env.HAMSTER_API_URL ||
          'https://tryhamster.com/api'
      };
      config.storage = nextStorage;

@@ -72,13 +72,7 @@ export interface Task {
  actualEffort?: number;
  tags?: string[];
  assignee?: string;

  // Complexity analysis (from complexity report)
  // Can be either enum ('simple' | 'moderate' | 'complex' | 'very-complex') or numeric score (1-10)
  complexity?: TaskComplexity | number;
  recommendedSubtasks?: number;
  expansionPrompt?: string;
  complexityReasoning?: string;
  complexity?: TaskComplexity;
}

/**
@@ -151,6 +145,7 @@ export interface TaskFilter {
  hasSubtasks?: boolean;
  search?: string;
  assignee?: string;
  complexity?: TaskComplexity | TaskComplexity[];
}

/**

@@ -93,55 +93,31 @@ function _getProvider(providerName) {

// Helper function to get cost for a specific model
function _getCostForModel(providerName, modelId) {
  const DEFAULT_COST = {
    inputCost: 0,
    outputCost: 0,
    currency: 'USD',
    isUnknown: false
  };
  const DEFAULT_COST = { inputCost: 0, outputCost: 0, currency: 'USD' };

  if (!MODEL_MAP || !MODEL_MAP[providerName]) {
    log(
      'warn',
      `Provider "${providerName}" not found in MODEL_MAP. Cannot determine cost for model ${modelId}.`
    );
    return { ...DEFAULT_COST, isUnknown: true };
    return DEFAULT_COST;
  }

  const modelData = MODEL_MAP[providerName].find((m) => m.id === modelId);

  if (!modelData) {
  if (!modelData?.cost_per_1m_tokens) {
    log(
      'debug',
      `Model "${modelId}" not found under provider "${providerName}". Assuming unknown cost.`
      `Cost data not found for model "${modelId}" under provider "${providerName}". Assuming zero cost.`
    );
    return { ...DEFAULT_COST, isUnknown: true };
  }

  // Check if cost_per_1m_tokens is explicitly null (unknown pricing)
  if (modelData.cost_per_1m_tokens === null) {
    log(
      'debug',
      `Cost data is null for model "${modelId}" under provider "${providerName}". Pricing unknown.`
    );
    return { ...DEFAULT_COST, isUnknown: true };
  }

  // Check if cost_per_1m_tokens is missing/undefined (also unknown)
  if (modelData.cost_per_1m_tokens === undefined) {
    log(
      'debug',
      `Cost data not found for model "${modelId}" under provider "${providerName}". Pricing unknown.`
    );
    return { ...DEFAULT_COST, isUnknown: true };
    return DEFAULT_COST;
  }

  const costs = modelData.cost_per_1m_tokens;
  return {
    inputCost: costs.input || 0,
    outputCost: costs.output || 0,
    currency: costs.currency || 'USD',
    isUnknown: false
    currency: costs.currency || 'USD'
  };
}
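The rates returned by `_getCostForModel` are per one million tokens. This hunk does not show how `totalCost` is computed inside `logAiUsage`, so the formula below is an assumption inferred from the field names:

```typescript
// Assumed cost formula: token counts scaled to millions, multiplied by the
// per-1M rates returned by _getCostForModel. Not shown in the diff itself.
function computeCost(
  inputTokens: number,
  outputTokens: number,
  inputCostPer1M: number,
  outputCostPer1M: number
): number {
  return (
    (inputTokens / 1_000_000) * inputCostPer1M +
    (outputTokens / 1_000_000) * outputCostPer1M
  );
}

// Hypothetical rates: $3 per 1M input tokens, $15 per 1M output tokens
const total = computeCost(2_000_000, 500_000, 3, 15);
// 2 * 3 + 0.5 * 15 = 13.5
```

With the zero-cost `DEFAULT_COST` fallback above, an unpriced model silently contributes 0 to this total, which is exactly the ambiguity the removed `isUnknown` flag was meant to surface in telemetry.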

@@ -891,8 +867,8 @@ async function logAiUsage({
  const timestamp = new Date().toISOString();
  const totalTokens = (inputTokens || 0) + (outputTokens || 0);

  // Destructure currency along with costs and unknown flag
  const { inputCost, outputCost, currency, isUnknown } = _getCostForModel(
  // Destructure currency along with costs
  const { inputCost, outputCost, currency } = _getCostForModel(
    providerName,
    modelId
  );
@@ -914,8 +890,7 @@ async function logAiUsage({
    outputTokens: outputTokens || 0,
    totalTokens,
    totalCost,
    currency, // Add currency to the telemetry data
    isUnknownCost: isUnknown // Flag to indicate if pricing is unknown
    currency // Add currency to the telemetry data
  };

  if (getDebugFlag()) {

@@ -311,7 +311,6 @@ function validateClaudeCodeSettings(settings) {
  // Define the base settings schema without commandSpecific first
  const BaseSettingsSchema = z.object({
    pathToClaudeCodeExecutable: z.string().optional(),
    // Use number().int() for integer validation in Zod
    maxTurns: z.number().int().positive().optional(),
    customSystemPrompt: z.string().optional(),
    appendSystemPrompt: z.string().optional(),
@@ -327,22 +326,19 @@ function validateClaudeCodeSettings(settings) {
        type: z.enum(['stdio', 'sse']).optional(),
        command: z.string(),
        args: z.array(z.string()).optional(),
        env: z.record(z.string(), z.string()).optional(),
        url: z.url().optional(),
        headers: z.record(z.string(), z.string()).optional()
        env: z.record(z.string()).optional(),
        url: z.string().url().optional(),
        headers: z.record(z.string()).optional()
      })
    )
    .optional()
  });

  // Define CommandSpecificSchema using flexible keys, but restrict to known commands
  const CommandSpecificSchema = z
    .record(z.string(), BaseSettingsSchema)
    .refine(
      (obj) =>
        Object.keys(obj || {}).every((k) => AI_COMMAND_NAMES.includes(k)),
      { message: 'Invalid command name in commandSpecific' }
    );
  // Define CommandSpecificSchema using the base schema
  const CommandSpecificSchema = z.record(
    z.enum(AI_COMMAND_NAMES),
    BaseSettingsSchema
  );

  // Define the full settings schema with commandSpecific
  const SettingsSchema = BaseSettingsSchema.extend({

@@ -2,6 +2,7 @@ import path from 'path';
import chalk from 'chalk';
import boxen from 'boxen';
import Table from 'cli-table3';
import { z } from 'zod';
import Fuse from 'fuse.js'; // Import Fuse.js for advanced fuzzy search

import {
@@ -28,7 +29,6 @@ import { getDefaultPriority, hasCodebaseAnalysis } from '../config-manager.js';
import { getPromptManager } from '../prompt-manager.js';
import ContextGatherer from '../utils/contextGatherer.js';
import generateTaskFiles from './generate-task-files.js';
import { COMMAND_SCHEMAS } from '../../../src/schemas/registry.js';
import {
  TASK_PRIORITY_OPTIONS,
  DEFAULT_TASK_PRIORITY,
@@ -36,6 +36,26 @@ import {
  normalizeTaskPriority
} from '../../../src/constants/task-priority.js';

// Define Zod schema for the expected AI output object
const AiTaskDataSchema = z.object({
  title: z.string().describe('Clear, concise title for the task'),
  description: z
    .string()
    .describe('A one or two sentence description of the task'),
  details: z
    .string()
    .describe('In-depth implementation details, considerations, and guidance'),
  testStrategy: z
    .string()
    .describe('Detailed approach for verifying task completion'),
  dependencies: z
    .array(z.number())
    .nullable()
    .describe(
      'Array of task IDs that this task depends on (must be completed before this task can start)'
    )
});

/**
 * Get all tasks from all tags
 * @param {Object} rawData - The raw tagged data object
@@ -431,7 +451,7 @@ async function addTask(
        role: serviceRole,
        session: session,
        projectRoot: projectRoot,
        schema: COMMAND_SCHEMAS['add-task'],
        schema: AiTaskDataSchema,
        objectName: 'newTaskData',
        systemPrompt: systemPrompt,
        prompt: userPrompt,

@@ -11,8 +11,7 @@ import {
  displayAiUsageSummary
} from '../ui.js';

import { generateObjectService } from '../ai-services-unified.js';
import { COMMAND_SCHEMAS } from '../../../src/schemas/registry.js';
import { generateTextService } from '../ai-services-unified.js';

import {
  getDebugFlag,
@@ -30,6 +29,46 @@ import { ContextGatherer } from '../utils/contextGatherer.js';
import { FuzzyTaskSearch } from '../utils/fuzzyTaskSearch.js';
import { flattenTasksWithSubtasks } from '../utils.js';

/**
 * Generates the prompt for complexity analysis.
 * (Moved from ai-services.js and simplified)
 * @param {Object} tasksData - The tasks data object.
 * @param {string} [gatheredContext] - The gathered context for the analysis.
 * @returns {string} The generated prompt.
 */
function generateInternalComplexityAnalysisPrompt(
  tasksData,
  gatheredContext = ''
) {
  const tasksString = JSON.stringify(tasksData.tasks, null, 2);
  let prompt = `Analyze the following tasks to determine their complexity (1-10 scale) and recommend the number of subtasks for expansion. Provide a brief reasoning and an initial expansion prompt for each.

Tasks:
${tasksString}`;

  if (gatheredContext) {
    prompt += `\n\n# Project Context\n\n${gatheredContext}`;
  }

  prompt += `

Respond ONLY with a valid JSON array matching the schema:
[
  {
    "taskId": <number>,
    "taskTitle": "<string>",
    "complexityScore": <number 1-10>,
    "recommendedSubtasks": <number>,
    "expansionPrompt": "<string>",
    "reasoning": "<string>"
  },
  ...
]

Do not include any explanatory text, markdown formatting, or code block markers before or after the JSON array.`;
  return prompt;
}

/**
 * Analyzes task complexity and generates expansion recommendations
 * @param {Object} options Command options
@@ -407,14 +446,12 @@ async function analyzeTaskComplexity(options, context = {}) {
    try {
      const role = useResearch ? 'research' : 'main';

      aiServiceResponse = await generateObjectService({
      aiServiceResponse = await generateTextService({
        prompt,
        systemPrompt,
        role,
        session,
        projectRoot,
        schema: COMMAND_SCHEMAS['analyze-complexity'],
        objectName: 'complexityAnalysis',
        commandName: 'analyze-complexity',
        outputType: mcpLog ? 'mcp' : 'cli'
      });
@@ -426,15 +463,63 @@ async function analyzeTaskComplexity(options, context = {}) {
      if (outputFormat === 'text') {
        readline.clearLine(process.stdout, 0);
        readline.cursorTo(process.stdout, 0);
        console.log(chalk.green('AI service call complete.'));
        console.log(
          chalk.green('AI service call complete. Parsing response...')
        );
      }

      // With generateObject, we get structured data directly
      complexityAnalysis = aiServiceResponse.mainResult.complexityAnalysis;
      reportLog(
        `Received ${complexityAnalysis.length} complexity analyses from AI.`,
        'info'
      );
      reportLog('Parsing complexity analysis from text response...', 'info');
      try {
        let cleanedResponse = aiServiceResponse.mainResult;
        cleanedResponse = cleanedResponse.trim();

        const codeBlockMatch = cleanedResponse.match(
          /```(?:json)?\s*([\s\S]*?)\s*```/
        );
        if (codeBlockMatch) {
          cleanedResponse = codeBlockMatch[1].trim();
        } else {
          const firstBracket = cleanedResponse.indexOf('[');
          const lastBracket = cleanedResponse.lastIndexOf(']');
          if (firstBracket !== -1 && lastBracket > firstBracket) {
            cleanedResponse = cleanedResponse.substring(
              firstBracket,
              lastBracket + 1
            );
          } else {
            reportLog(
              'Warning: Response does not appear to be a JSON array.',
              'warn'
            );
          }
        }

        if (outputFormat === 'text' && getDebugFlag(session)) {
          console.log(chalk.gray('Attempting to parse cleaned JSON...'));
          console.log(chalk.gray('Cleaned response (first 100 chars):'));
          console.log(chalk.gray(cleanedResponse.substring(0, 100)));
          console.log(chalk.gray('Last 100 chars:'));
          console.log(
            chalk.gray(cleanedResponse.substring(cleanedResponse.length - 100))
          );
        }

        complexityAnalysis = JSON.parse(cleanedResponse);
      } catch (parseError) {
        if (loadingIndicator) stopLoadingIndicator(loadingIndicator);
        reportLog(
          `Error parsing complexity analysis JSON: ${parseError.message}`,
          'error'
        );
        if (outputFormat === 'text') {
          console.error(
            chalk.red(
              `Error parsing complexity analysis JSON: ${parseError.message}`
            )
          );
        }
        throw parseError;
}
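The fenced-block-then-bracket-slice fallback above can be sketched standalone. The fence string is built with `repeat()` only so this example does not break its own code fence:

```typescript
// Sketch of the two-stage extraction used when parsing the text response:
// prefer a fenced json block, otherwise slice from the first '[' to the
// last ']', and fall back to the trimmed input if neither applies.
const FENCE = '`'.repeat(3); // literal triple backtick, built indirectly
const fenceRe = new RegExp(FENCE + '(?:json)?\\s*([\\s\\S]*?)\\s*' + FENCE);

function extractJsonArray(raw: string): string {
  const trimmed = raw.trim();
  const match = trimmed.match(fenceRe);
  if (match) return match[1].trim();
  const first = trimmed.indexOf('[');
  const last = trimmed.lastIndexOf(']');
  return first !== -1 && last > first ? trimmed.substring(first, last + 1) : trimmed;
}

const fenced = extractJsonArray('Here is the result:\n' + FENCE + 'json\n[1, 2]\n' + FENCE);
const bare = extractJsonArray('Result: [3, 4] done.');
// both strings now parse cleanly with JSON.parse
```

The bracket slice is deliberately greedy (first `[` to last `]`) so that arrays containing nested brackets survive; it only misfires if the surrounding prose itself contains stray brackets.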

      const taskIds = tasksData.tasks.map((t) => t.id);
      const analysisTaskIds = complexityAnalysis.map((a) => a.taskId);

@@ -1,22 +1,22 @@
import fs from 'fs';
import path from 'path';
import { z } from 'zod';

import {
  getTagAwareFilePath,
  isSilentMode,
  log,
  readJSON,
  writeJSON
  writeJSON,
  isSilentMode,
  getTagAwareFilePath
} from '../utils.js';

import {
  displayAiUsageSummary,
  startLoadingIndicator,
  stopLoadingIndicator
  stopLoadingIndicator,
  displayAiUsageSummary
} from '../ui.js';

import { COMMAND_SCHEMAS } from '../../../src/schemas/registry.js';
import { generateObjectService } from '../ai-services-unified.js';
import { generateTextService } from '../ai-services-unified.js';

import {
  getDefaultSubtasks,
@@ -24,12 +24,265 @@ import {
  hasCodebaseAnalysis
} from '../config-manager.js';
import { getPromptManager } from '../prompt-manager.js';
import { findProjectRoot, flattenTasksWithSubtasks } from '../utils.js';
import generateTaskFiles from './generate-task-files.js';
import { COMPLEXITY_REPORT_FILE } from '../../../src/constants/paths.js';
import { ContextGatherer } from '../utils/contextGatherer.js';
import { FuzzyTaskSearch } from '../utils/fuzzyTaskSearch.js';
import { flattenTasksWithSubtasks, findProjectRoot } from '../utils.js';

// --- Zod Schemas (Keep from previous step) ---
const subtaskSchema = z
  .object({
    id: z
      .number()
      .int()
      .positive()
      .describe('Sequential subtask ID starting from 1'),
    title: z.string().min(5).describe('Clear, specific title for the subtask'),
    description: z
      .string()
      .min(10)
      .describe('Detailed description of the subtask'),
    dependencies: z
      .array(z.string())
      .describe(
        'Array of subtask dependencies within the same parent task. Use format ["parentTaskId.1", "parentTaskId.2"]. Subtasks can only depend on siblings, not external tasks.'
      ),
    details: z.string().min(20).describe('Implementation details and guidance'),
    status: z
      .string()
      .describe(
        'The current status of the subtask (should be pending initially)'
      ),
    testStrategy: z
      .string()
      .nullable()
      .describe('Approach for testing this subtask')
      .default('')
  })
  .strict();
const subtaskArraySchema = z.array(subtaskSchema);
const subtaskWrapperSchema = z.object({
  subtasks: subtaskArraySchema.describe('The array of generated subtasks.')
});
// --- End Zod Schemas ---

/**
 * Expand a task into subtasks using the unified AI service (generateObjectService).
 * Parse subtasks from AI's text response. Includes basic cleanup.
 * @param {string} text - Response text from AI.
 * @param {number} startId - Starting subtask ID expected.
 * @param {number} expectedCount - Expected number of subtasks.
 * @param {number} parentTaskId - Parent task ID for context.
 * @param {Object} logger - Logging object (mcpLog or console log).
 * @returns {Array} Parsed and potentially corrected subtasks array.
 * @throws {Error} If parsing fails or JSON is invalid/malformed.
 */
function parseSubtasksFromText(
  text,
  startId,
  expectedCount,
  parentTaskId,
  logger
) {
  if (typeof text !== 'string') {
    logger.error(
      `AI response text is not a string. Received type: ${typeof text}, Value: ${text}`
    );
    throw new Error('AI response text is not a string.');
  }

  if (!text || text.trim() === '') {
    throw new Error('AI response text is empty after trimming.');
  }

  const originalTrimmedResponse = text.trim(); // Store the original trimmed response
  let jsonToParse = originalTrimmedResponse; // Initialize jsonToParse with it

  logger.debug(
    `Original AI Response for parsing (full length: ${jsonToParse.length}): ${jsonToParse.substring(0, 1000)}...`
  );

  // --- Pre-emptive cleanup for known AI JSON issues ---
  // Fix for "dependencies": , or "dependencies":,
  if (jsonToParse.includes('"dependencies":')) {
    const malformedPattern = /"dependencies":\s*,/g;
    if (malformedPattern.test(jsonToParse)) {
      logger.warn('Attempting to fix malformed "dependencies": , issue.');
      jsonToParse = jsonToParse.replace(
        malformedPattern,
        '"dependencies": [],'
      );
      logger.debug(
        `JSON after fixing "dependencies": ${jsonToParse.substring(0, 500)}...`
      );
    }
  }
// --- End pre-emptive cleanup ---
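The `"dependencies": ,` repair above, demonstrated on a minimal malformed payload:

```typescript
// Sketch of the pre-emptive cleanup: an empty value slot after
// "dependencies": is replaced with an empty array so JSON.parse succeeds.
function fixEmptyDependencies(json: string): string {
  return json.replace(/"dependencies":\s*,/g, '"dependencies": [],');
}

const repaired = fixEmptyDependencies('{"id": 1, "dependencies": , "title": "x"}');
const parsed = JSON.parse(repaired);
// parsed.dependencies is now [] and the rest of the payload is intact
```

The `g` flag matters here: a multi-subtask response can contain the same malformed slot several times, and all of them must be repaired in one pass.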
|
||||
|
||||
let parsedObject;
|
||||
let primaryParseAttemptFailed = false;
|
||||
|
||||
// --- Attempt 1: Simple Parse (with optional Markdown cleanup) ---
|
||||
logger.debug('Attempting simple parse...');
|
||||
try {
|
||||
// Check for markdown code block
|
||||
const codeBlockMatch = jsonToParse.match(/```(?:json)?\s*([\s\S]*?)\s*```/);
|
||||
let contentToParseDirectly = jsonToParse;
|
||||
if (codeBlockMatch && codeBlockMatch[1]) {
|
||||
contentToParseDirectly = codeBlockMatch[1].trim();
|
||||
logger.debug('Simple parse: Extracted content from markdown code block.');
|
||||
} else {
|
||||
logger.debug(
|
||||
'Simple parse: No markdown code block found, using trimmed original.'
|
||||
);
|
||||
}
|
||||
|
||||
parsedObject = JSON.parse(contentToParseDirectly);
|
||||
logger.debug('Simple parse successful!');
|
||||
|
||||
// Quick check if it looks like our target object
|
||||
if (
|
||||
!parsedObject ||
|
||||
	typeof parsedObject !== 'object' ||
	!Array.isArray(parsedObject.subtasks)
) {
	logger.warn(
		'Simple parse succeeded, but result is not the expected {"subtasks": []} structure. Will proceed to advanced extraction.'
	);
	primaryParseAttemptFailed = true;
	parsedObject = null; // Reset parsedObject so we enter the advanced logic
}
// If it IS the correct structure, we'll skip advanced extraction.
} catch (e) {
	logger.warn(
		`Simple parse failed: ${e.message}. Proceeding to advanced extraction logic.`
	);
	primaryParseAttemptFailed = true;
	// jsonToParse is already originalTrimmedResponse if simple parse failed before modifying it for markdown
}

// --- Attempt 2: Advanced Extraction (if simple parse failed or produced wrong structure) ---
if (primaryParseAttemptFailed || !parsedObject) {
	// Ensure we try advanced if simple parse gave wrong structure
	logger.debug('Attempting advanced extraction logic...');
	// Reset jsonToParse to the original full trimmed response for advanced logic
	jsonToParse = originalTrimmedResponse;

	// (Insert the more complex extraction logic here - the one we worked on with:
	// - targetPattern = '{"subtasks":';
	// - careful brace counting for that targetPattern
	// - fallbacks to last '{' and '}' if targetPattern logic fails)
	// This block should ultimately set `jsonToParse` to the best candidate string.

	// Example snippet of that advanced logic's start:
	const targetPattern = '{"subtasks":';
	const patternStartIndex = jsonToParse.indexOf(targetPattern);

	if (patternStartIndex !== -1) {
		let openBraces = 0;
		let firstBraceFound = false;
		let extractedJsonBlock = '';
		// ... (loop for brace counting as before) ...
		// ... (if successful, jsonToParse = extractedJsonBlock) ...
		// ... (if that fails, fallbacks as before) ...
	} else {
		// ... (fallback to last '{' and '}' if targetPattern not found) ...
	}
	// End of advanced logic excerpt
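The brace counting elided in the comments above can be sketched as follows; `extractJsonBlock` is a hypothetical helper name for illustration, not code from this commit:

```javascript
// Hypothetical sketch of the brace-counting extraction the excerpt above elides.
// Scans forward from the target pattern and returns the first balanced {...} block.
// NOTE: naive counting; braces inside JSON string values would miscount.
function extractJsonBlock(text, targetPattern = '{"subtasks":') {
	const start = text.indexOf(targetPattern);
	if (start === -1) return null;
	let openBraces = 0;
	for (let i = start; i < text.length; i++) {
		if (text[i] === '{') openBraces++;
		else if (text[i] === '}') {
			openBraces--;
			if (openBraces === 0) return text.slice(start, i + 1); // balanced block
		}
	}
	return null; // braces never balanced; caller falls back to last '{' / '}'
}
```

If this returns `null`, the surrounding logic falls back to slicing between the first `{` and the last `}`.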

	logger.debug(
		`Advanced extraction: JSON string that will be parsed: ${jsonToParse.substring(0, 500)}...`
	);
	try {
		parsedObject = JSON.parse(jsonToParse);
		logger.debug('Advanced extraction parse successful!');
	} catch (parseError) {
		logger.error(
			`Advanced extraction: Failed to parse JSON object: ${parseError.message}`
		);
		logger.error(
			`Advanced extraction: Problematic JSON string for parse (first 500 chars): ${jsonToParse.substring(0, 500)}`
		);
		throw new Error(
			// Re-throw a more specific error if advanced also fails
			`Failed to parse JSON response object after both simple and advanced attempts: ${parseError.message}`
		);
	}
}

// --- Validation (applies to successfully parsedObject from either attempt) ---
if (
	!parsedObject ||
	typeof parsedObject !== 'object' ||
	!Array.isArray(parsedObject.subtasks)
) {
	logger.error(
		`Final parsed content is not an object or missing 'subtasks' array. Content: ${JSON.stringify(parsedObject).substring(0, 200)}`
	);
	throw new Error(
		'Parsed AI response is not a valid object containing a "subtasks" array after all attempts.'
	);
}
const parsedSubtasks = parsedObject.subtasks;

if (expectedCount && parsedSubtasks.length !== expectedCount) {
	logger.warn(
		`Expected ${expectedCount} subtasks, but parsed ${parsedSubtasks.length}.`
	);
}

let currentId = startId;
const validatedSubtasks = [];
const validationErrors = [];

for (const rawSubtask of parsedSubtasks) {
	const correctedSubtask = {
		...rawSubtask,
		id: currentId,
		dependencies: Array.isArray(rawSubtask.dependencies)
			? rawSubtask.dependencies.filter(
					(dep) =>
						typeof dep === 'string' && dep.startsWith(`${parentTaskId}.`)
				)
			: [],
		status: 'pending'
	};

	const result = subtaskSchema.safeParse(correctedSubtask);

	if (result.success) {
		validatedSubtasks.push(result.data);
	} else {
		logger.warn(
			`Subtask validation failed for raw data: ${JSON.stringify(rawSubtask).substring(0, 100)}...`
		);
		result.error.errors.forEach((err) => {
			const errorMessage = ` - Field '${err.path.join('.')}': ${err.message}`;
			logger.warn(errorMessage);
			validationErrors.push(`Subtask ${currentId}: ${errorMessage}`);
		});
	}
	currentId++;
}

if (validationErrors.length > 0) {
	logger.error(
		`Found ${validationErrors.length} validation errors in the generated subtasks.`
	);
	logger.warn('Proceeding with only the successfully validated subtasks.');
}

if (validatedSubtasks.length === 0 && parsedSubtasks.length > 0) {
	throw new Error(
		'AI response contained potential subtasks, but none passed validation.'
	);
}
return validatedSubtasks.slice(0, expectedCount || validatedSubtasks.length);
}

/**
 * Expand a task into subtasks using the unified AI service (generateTextService).
 * Appends new subtasks by default. Replaces existing subtasks if force=true.
 * Integrates complexity report to determine subtask count and prompt if available,
 * unless numSubtasks is explicitly provided.
@@ -197,10 +450,6 @@ async function expandTask(
	}

	// Determine prompt content AND system prompt
	// Calculate the next subtask ID to match current behavior:
	// - Start from the number of existing subtasks + 1
	// - This creates sequential IDs: 1, 2, 3, 4...
	// - Display format shows as parentTaskId.subtaskId (e.g., "1.1", "1.2", "2.1")
	const nextSubtaskId = (task.subtasks?.length || 0) + 1;

	// Load prompts using PromptManager
@@ -261,6 +510,7 @@ async function expandTask(
		hasCodebaseAnalysis: hasCodebaseAnalysisCapability,
		projectRoot: projectRoot || ''
	};

	let variantKey = 'default';
	if (expansionPromptText) {
		variantKey = 'complexity-report';
@@ -290,7 +540,7 @@ async function expandTask(
	);
	// --- End Complexity Report / Prompt Logic ---

	// --- AI Subtask Generation using generateObjectService ---
	// --- AI Subtask Generation using generateTextService ---
	let generatedSubtasks = [];
	let loadingIndicator = null;
	if (outputFormat === 'text') {
@@ -299,36 +549,48 @@ async function expandTask(
		);
	}

	let responseText = '';
	let aiServiceResponse = null;

	try {
		const role = useResearch ? 'research' : 'main';

		// Call generateObjectService with the determined prompts and telemetry params
		aiServiceResponse = await generateObjectService({
		// Call generateTextService with the determined prompts and telemetry params
		aiServiceResponse = await generateTextService({
			prompt: promptContent,
			systemPrompt: systemPrompt,
			role,
			session,
			projectRoot,
			schema: COMMAND_SCHEMAS['expand-task'],
			objectName: 'subtasks',
			commandName: 'expand-task',
			outputType: outputFormat
		});
		responseText = aiServiceResponse.mainResult;

		// With generateObject, we expect structured data – verify it before use
		const mainResult = aiServiceResponse?.mainResult;
		if (!mainResult || !Array.isArray(mainResult.subtasks)) {
			throw new Error('AI response did not include a valid subtasks array.');
		}
		generatedSubtasks = mainResult.subtasks;
		logger.info(`Received ${generatedSubtasks.length} subtasks from AI.`);
		// Parse Subtasks
		generatedSubtasks = parseSubtasksFromText(
			responseText,
			nextSubtaskId,
			finalSubtaskCount,
			task.id,
			logger
		);
		logger.info(
			`Successfully parsed ${generatedSubtasks.length} subtasks from AI response.`
		);
	} catch (error) {
		if (loadingIndicator) stopLoadingIndicator(loadingIndicator);
		logger.error(
			`Error during AI call or parsing for task ${taskId}: ${error.message}`, // Added task ID context
			'error'
		);
		// Log raw response in debug mode if parsing failed
		if (
			error.message.includes('Failed to parse valid subtasks') &&
			getDebugFlag(session)
		) {
			logger.error(`Raw AI Response that failed parsing:\n${responseText}`);
		}
		throw error;
	} finally {
		if (loadingIndicator) stopLoadingIndicator(loadingIndicator);

@@ -355,7 +355,7 @@ Ensure the JSON is valid and properly formatted.`;
const subtaskSchema = z.object({
	subtasks: z.array(
		z.object({
			id: z.int().positive(),
			id: z.number().int().positive(),
			title: z.string().min(5),
			description: z.string().min(10),
			dependencies: z.array(z.string()),
@@ -386,44 +386,14 @@ Ensure the JSON is valid and properly formatted.`;
			testStrategy: subtask.testStrategy || ''
		}));

	// Ensure new subtasks have unique sequential IDs after the preserved ones
	const maxPreservedId = preservedSubtasks.reduce(
		(max, st) => Math.max(max, st.id || 0),
		0
	);
	let nextId = maxPreservedId + 1;
	const idMapping = new Map();
	const normalizedGeneratedSubtasks = processedGeneratedSubtasks
		.map((st) => {
			const originalId = st.id;
			const newId = nextId++;
			idMapping.set(originalId, newId);
			return {
				...st,
				id: newId
			};
		})
		.map((st) => ({
			...st,
			dependencies: (st.dependencies || []).map((dep) => {
				if (typeof dep !== 'string' || !dep.startsWith(`${task.id}.`)) {
					return dep;
				}
				const [, siblingIdPart] = dep.split('.');
				const originalSiblingId = Number.parseInt(siblingIdPart, 10);
				const remappedSiblingId = idMapping.get(originalSiblingId);
				return remappedSiblingId ? `${task.id}.${remappedSiblingId}` : dep;
			})
		}));
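The two-pass remapping above (renumber first, then rewrite `parentId.siblingId` dependency strings through `idMapping`) behaves like this self-contained sketch, using illustrative data that is not from the diff (parent task 5, two preserved subtasks):

```javascript
// Sketch of the two-pass ID remapping: renumber generated subtasks after the
// preserved ones, then rewrite "parent.sibling" dependency strings to match.
const parentId = 5;
const preservedMaxId = 2; // two preserved subtasks already hold ids 1-2
const generated = [
	{ id: 1, title: 'a', dependencies: [] },
	{ id: 2, title: 'b', dependencies: ['5.1'] } // depends on generated sibling 1
];
const idMapping = new Map();
let nextId = preservedMaxId + 1;
const renumbered = generated
	.map((st) => {
		const newId = nextId++;
		idMapping.set(st.id, newId); // remember old -> new id
		return { ...st, id: newId };
	})
	.map((st) => ({
		...st,
		dependencies: st.dependencies.map((dep) => {
			if (typeof dep !== 'string' || !dep.startsWith(`${parentId}.`)) return dep;
			const sibling = Number.parseInt(dep.split('.')[1], 10);
			const remapped = idMapping.get(sibling);
			return remapped ? `${parentId}.${remapped}` : dep;
		})
	}));
```

The first pass fully populates `idMapping` before the second pass runs, so a dependency on a later sibling still remaps correctly.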

	// Update task with preserved subtasks + newly generated ones
	task.subtasks = [...preservedSubtasks, ...normalizedGeneratedSubtasks];
	task.subtasks = [...preservedSubtasks, ...processedGeneratedSubtasks];

	return {
		updatedTask: task,
		regenerated: true,
		preserved: preservedSubtasks.length,
		generated: normalizedGeneratedSubtasks.length
		generated: processedGeneratedSubtasks.length
	};
} catch (error) {
	log(

@@ -619,29 +619,9 @@ async function tags(
		headers.push(chalk.cyan.bold('Description'));
	}

	// Calculate dynamic column widths based on terminal width
	const terminalWidth = Math.max(process.stdout.columns || 120, 80);
	const usableWidth = Math.floor(terminalWidth * 0.95);

	let colWidths;
	if (showMetadata) {
		// With metadata: Tag Name, Tasks, Completed, Created, Description
		const widths = [0.25, 0.1, 0.12, 0.15, 0.38];
		colWidths = widths.map((w, i) =>
			Math.max(Math.floor(usableWidth * w), i === 0 ? 15 : 8)
		);
	} else {
		// Without metadata: Tag Name, Tasks, Completed
		const widths = [0.7, 0.15, 0.15];
		colWidths = widths.map((w, i) =>
			Math.max(Math.floor(usableWidth * w), i === 0 ? 20 : 10)
		);
	}
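For a concrete reading of the width math above, take the metadata layout with a fixed 120-column terminal (the real code reads `process.stdout.columns` with fallbacks):

```javascript
// Worked example of the dynamic column-width calculation, metadata layout.
const terminalWidth = 120; // fixed here for reproducibility
const usableWidth = Math.floor(terminalWidth * 0.95); // 114
const widths = [0.25, 0.1, 0.12, 0.15, 0.38]; // fractions per column
const colWidths = widths.map((w, i) =>
	Math.max(Math.floor(usableWidth * w), i === 0 ? 15 : 8) // per-column minimums
);
// colWidths is [28, 11, 13, 17, 43], totalling 112 of the 114 usable columns
```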

	const table = new Table({
		head: headers,
		colWidths: colWidths,
		wordWrap: true
		colWidths: showMetadata ? [20, 10, 12, 15, 50] : [25, 10, 12]
	});

	// Add rows

@@ -3,6 +3,7 @@ import path from 'path';
import chalk from 'chalk';
import boxen from 'boxen';
import Table from 'cli-table3';
import { z } from 'zod'; // Keep Zod for post-parse validation

import {
	log as consoleLog,
@@ -21,11 +22,7 @@ import {
	displayAiUsageSummary
} from '../ui.js';

import {
	generateTextService,
	generateObjectService
} from '../ai-services-unified.js';
import { COMMAND_SCHEMAS } from '../../../src/schemas/registry.js';
import { generateTextService } from '../ai-services-unified.js';
import {
	getDebugFlag,
	isApiKeySet,
@@ -35,6 +32,229 @@ import { getPromptManager } from '../prompt-manager.js';
import { ContextGatherer } from '../utils/contextGatherer.js';
import { FuzzyTaskSearch } from '../utils/fuzzyTaskSearch.js';

// Zod schema for post-parsing validation of the updated task object
const updatedTaskSchema = z
	.object({
		id: z.number().int(),
		title: z.string(), // Title should be preserved, but check it exists
		description: z.string(),
		status: z.string(),
		dependencies: z.array(z.union([z.number().int(), z.string()])),
		priority: z.string().nullable().default('medium'),
		details: z.string().nullable().default(''),
		testStrategy: z.string().nullable().default(''),
		subtasks: z
			.array(
				z.object({
					id: z
						.number()
						.int()
						.positive()
						.describe('Sequential subtask ID starting from 1'),
					title: z.string(),
					description: z.string(),
					status: z.string(),
					dependencies: z.array(z.number().int()).nullable().default([]),
					details: z.string().nullable().default(''),
					testStrategy: z.string().nullable().default('')
				})
			)
			.nullable()
			.default([])
	})
	.strip(); // Allows parsing even if AI adds extra fields, but validation focuses on schema

/**
 * Parses a single updated task object from AI's text response.
 * @param {string} text - Response text from AI.
 * @param {number} expectedTaskId - The ID of the task expected.
 * @param {Function | Object} logFn - Logging function or MCP logger.
 * @param {boolean} isMCP - Flag indicating MCP context.
 * @returns {Object} Parsed and validated task object.
 * @throws {Error} If parsing or validation fails.
 */
function parseUpdatedTaskFromText(text, expectedTaskId, logFn, isMCP) {
	// Report helper consistent with the established pattern
	const report = (level, ...args) => {
		if (isMCP) {
			if (typeof logFn[level] === 'function') logFn[level](...args);
			else logFn.info(...args);
		} else if (!isSilentMode()) {
			logFn(level, ...args);
		}
	};

	report(
		'info',
		'Attempting to parse updated task object from text response...'
	);
	if (!text || text.trim() === '')
		throw new Error('AI response text is empty.');

	let cleanedResponse = text.trim();
	const originalResponseForDebug = cleanedResponse;
	let parseMethodUsed = 'raw'; // Keep track of which method worked

	// --- NEW Step 1: Try extracting between {} first ---
	const firstBraceIndex = cleanedResponse.indexOf('{');
	const lastBraceIndex = cleanedResponse.lastIndexOf('}');
	let potentialJsonFromBraces = null;

	if (firstBraceIndex !== -1 && lastBraceIndex > firstBraceIndex) {
		potentialJsonFromBraces = cleanedResponse.substring(
			firstBraceIndex,
			lastBraceIndex + 1
		);
		if (potentialJsonFromBraces.length <= 2) {
			potentialJsonFromBraces = null; // Ignore empty braces {}
		}
	}

	// If {} extraction yielded something, try parsing it immediately
	if (potentialJsonFromBraces) {
		try {
			const testParse = JSON.parse(potentialJsonFromBraces);
			// It worked! Use this as the primary cleaned response.
			cleanedResponse = potentialJsonFromBraces;
			parseMethodUsed = 'braces';
		} catch (e) {
			report(
				'info',
				'Content between {} looked promising but failed initial parse. Proceeding to other methods.'
			);
			// Reset cleanedResponse to original if brace parsing failed
			cleanedResponse = originalResponseForDebug;
		}
	}

	// --- Step 2: If brace parsing didn't work or wasn't applicable, try code block extraction ---
	if (parseMethodUsed === 'raw') {
		const codeBlockMatch = cleanedResponse.match(
			/```(?:json|javascript)?\s*([\s\S]*?)\s*```/i
		);
		if (codeBlockMatch) {
			cleanedResponse = codeBlockMatch[1].trim();
			parseMethodUsed = 'codeblock';
			report('info', 'Extracted JSON content from Markdown code block.');
		} else {
			// --- Step 3: If code block failed, try stripping prefixes ---
			const commonPrefixes = [
				'json\n',
				'javascript\n'
				// ... other prefixes ...
			];
			let prefixFound = false;
			for (const prefix of commonPrefixes) {
				if (cleanedResponse.toLowerCase().startsWith(prefix)) {
					cleanedResponse = cleanedResponse.substring(prefix.length).trim();
					parseMethodUsed = 'prefix';
					report('info', `Stripped prefix: "${prefix.trim()}"`);
					prefixFound = true;
					break;
				}
			}
			if (!prefixFound) {
				report(
					'warn',
					'Response does not appear to contain {}, code block, or known prefix. Attempting raw parse.'
				);
			}
		}
	}

	// --- Step 4: Attempt final parse ---
	let parsedTask;
	try {
		parsedTask = JSON.parse(cleanedResponse);
	} catch (parseError) {
		report('error', `Failed to parse JSON object: ${parseError.message}`);
		report(
			'error',
			`Problematic JSON string (first 500 chars): ${cleanedResponse.substring(0, 500)}`
		);
		report(
			'error',
			`Original Raw Response (first 500 chars): ${originalResponseForDebug.substring(0, 500)}`
		);
		throw new Error(
			`Failed to parse JSON response object: ${parseError.message}`
		);
	}

	if (!parsedTask || typeof parsedTask !== 'object') {
		report(
			'error',
			`Parsed content is not an object. Type: ${typeof parsedTask}`
		);
		report(
			'error',
			`Parsed content sample: ${JSON.stringify(parsedTask).substring(0, 200)}`
		);
		throw new Error('Parsed AI response is not a valid JSON object.');
	}

	// Preprocess the task to ensure subtasks have proper structure
	const preprocessedTask = {
		...parsedTask,
		status: parsedTask.status || 'pending',
		dependencies: Array.isArray(parsedTask.dependencies)
			? parsedTask.dependencies
			: [],
		details:
			typeof parsedTask.details === 'string'
				? parsedTask.details
				: String(parsedTask.details || ''),
		testStrategy:
			typeof parsedTask.testStrategy === 'string'
				? parsedTask.testStrategy
				: String(parsedTask.testStrategy || ''),
		// Ensure subtasks is an array and each subtask has required fields
		subtasks: Array.isArray(parsedTask.subtasks)
			? parsedTask.subtasks.map((subtask) => ({
					...subtask,
					title: subtask.title || '',
					description: subtask.description || '',
					status: subtask.status || 'pending',
					dependencies: Array.isArray(subtask.dependencies)
						? subtask.dependencies
						: [],
					details:
						typeof subtask.details === 'string'
							? subtask.details
							: String(subtask.details || ''),
					testStrategy:
						typeof subtask.testStrategy === 'string'
							? subtask.testStrategy
							: String(subtask.testStrategy || '')
				}))
			: []
	};

	// Validate the parsed task object using Zod
	const validationResult = updatedTaskSchema.safeParse(preprocessedTask);
	if (!validationResult.success) {
		report('error', 'Parsed task object failed Zod validation.');
		validationResult.error.errors.forEach((err) => {
			report('error', ` - Field '${err.path.join('.')}': ${err.message}`);
		});
		throw new Error(
			`AI response failed task structure validation: ${validationResult.error.message}`
		);
	}

	// Final check: ensure ID matches expected ID (AI might hallucinate)
	if (validationResult.data.id !== expectedTaskId) {
		report(
			'warn',
			`AI returned task with ID ${validationResult.data.id}, but expected ${expectedTaskId}. Overwriting ID.`
		);
		validationResult.data.id = expectedTaskId; // Enforce correct ID
	}

	report('info', 'Successfully validated updated task structure.');
	return validationResult.data; // Return the validated task data
}
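The Step 2 code-block extraction in the function above works like this standalone sketch; the backtick fence is assembled at runtime only so the example can itself sit inside documentation:

```javascript
// Standalone illustration of the Step 2 regex: pull JSON out of a fenced
// json/javascript Markdown block before parsing.
const FENCE = '`'.repeat(3); // literal triple backtick
const codeBlockRe = new RegExp(
	FENCE + '(?:json|javascript)?\\s*([\\s\\S]*?)\\s*' + FENCE,
	'i'
);
const response =
	'Here is the task:\n' + FENCE + 'json\n{"id": 7, "title": "Demo"}\n' + FENCE;
const match = response.match(codeBlockRe);
const cleaned = match ? match[1].trim() : response.trim(); // fall back to raw text
const parsed = JSON.parse(cleaned);
```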

/**
 * Update a task by ID with new information using the unified AI service.
 * @param {string} tasksPath - Path to the tasks.json file
@@ -302,32 +522,15 @@ async function updateTaskById(

	try {
		const serviceRole = useResearch ? 'research' : 'main';

		if (appendMode) {
			// Append mode still uses generateTextService since it returns plain text
			aiServiceResponse = await generateTextService({
				role: serviceRole,
				session: session,
				projectRoot: projectRoot,
				systemPrompt: systemPrompt,
				prompt: userPrompt,
				commandName: 'update-task',
				outputType: isMCP ? 'mcp' : 'cli'
			});
		} else {
			// Full update mode uses generateObjectService for structured output
			aiServiceResponse = await generateObjectService({
				role: serviceRole,
				session: session,
				projectRoot: projectRoot,
				systemPrompt: systemPrompt,
				prompt: userPrompt,
				schema: COMMAND_SCHEMAS['update-task-by-id'],
				objectName: 'task',
				commandName: 'update-task',
				outputType: isMCP ? 'mcp' : 'cli'
			});
		}
		aiServiceResponse = await generateTextService({
			role: serviceRole,
			session: session,
			projectRoot: projectRoot,
			systemPrompt: systemPrompt,
			prompt: userPrompt,
			commandName: 'update-task',
			outputType: isMCP ? 'mcp' : 'cli'
		});

		if (loadingIndicator)
			stopLoadingIndicator(loadingIndicator, 'AI update complete.');
@@ -397,8 +600,13 @@ async function updateTaskById(
		};
	}

	// Full update mode: Use structured data directly
	const updatedTask = aiServiceResponse.mainResult.task;
	// Full update mode: Use mainResult (text) for parsing
	const updatedTask = parseUpdatedTaskFromText(
		aiServiceResponse.mainResult,
		taskId,
		logFn,
		isMCP
	);

	// --- Task Validation/Correction (Keep existing logic) ---
	if (!updatedTask || typeof updatedTask !== 'object')

@@ -2,6 +2,7 @@ import path from 'path';
import chalk from 'chalk';
import boxen from 'boxen';
import Table from 'cli-table3';
import { z } from 'zod'; // Keep Zod for post-parsing validation

import {
	log as consoleLog,
@@ -21,13 +22,258 @@ import {
import { getDebugFlag, hasCodebaseAnalysis } from '../config-manager.js';
import { getPromptManager } from '../prompt-manager.js';
import generateTaskFiles from './generate-task-files.js';
import { generateObjectService } from '../ai-services-unified.js';
import { COMMAND_SCHEMAS } from '../../../src/schemas/registry.js';
import { generateTextService } from '../ai-services-unified.js';
import { getModelConfiguration } from './models.js';
import { ContextGatherer } from '../utils/contextGatherer.js';
import { FuzzyTaskSearch } from '../utils/fuzzyTaskSearch.js';
import { flattenTasksWithSubtasks, findProjectRoot } from '../utils.js';

// Zod schema for validating the structure of tasks AFTER parsing
const updatedTaskSchema = z
	.object({
		id: z.number().int(),
		title: z.string(),
		description: z.string(),
		status: z.string(),
		dependencies: z.array(z.union([z.number().int(), z.string()])),
		priority: z.string().nullable(),
		details: z.string().nullable(),
		testStrategy: z.string().nullable(),
		subtasks: z.array(z.any()).nullable() // Keep subtasks flexible for now
	})
	.strip(); // Allow potential extra fields during parsing if needed, then validate structure

// Preprocessing schema that adds defaults before validation
const preprocessTaskSchema = z.preprocess((task) => {
	// Ensure task is an object
	if (typeof task !== 'object' || task === null) {
		return {};
	}

	// Return task with defaults for missing fields
	return {
		...task,
		// Add defaults for required fields if missing
		id: task.id ?? 0,
		title: task.title ?? 'Untitled Task',
		description: task.description ?? '',
		status: task.status ?? 'pending',
		dependencies: Array.isArray(task.dependencies) ? task.dependencies : [],
		// Optional fields - preserve undefined/null distinction
		priority: task.hasOwnProperty('priority') ? task.priority : null,
		details: task.hasOwnProperty('details') ? task.details : null,
		testStrategy: task.hasOwnProperty('testStrategy')
			? task.testStrategy
			: null,
		subtasks: Array.isArray(task.subtasks)
			? task.subtasks
			: task.subtasks === null
				? null
				: []
	};
}, updatedTaskSchema);

const updatedTaskArraySchema = z.array(updatedTaskSchema);
const preprocessedTaskArraySchema = z.array(preprocessTaskSchema);
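Stripped of Zod, the default-filling half of `preprocessTaskSchema` above reduces to a plain function; this dependency-free sketch mirrors the field defaults in the diff:

```javascript
// Dependency-free sketch of the preprocessing step: fill defaults for missing
// required fields, keep optional fields as null when absent. In the real code
// the result then flows into updatedTaskSchema for Zod validation.
function preprocessTask(task) {
	if (typeof task !== 'object' || task === null) return {};
	return {
		...task,
		id: task.id ?? 0,
		title: task.title ?? 'Untitled Task',
		description: task.description ?? '',
		status: task.status ?? 'pending',
		dependencies: Array.isArray(task.dependencies) ? task.dependencies : [],
		priority: Object.hasOwn(task, 'priority') ? task.priority : null,
		details: Object.hasOwn(task, 'details') ? task.details : null,
		testStrategy: Object.hasOwn(task, 'testStrategy') ? task.testStrategy : null,
		subtasks: Array.isArray(task.subtasks)
			? task.subtasks
			: task.subtasks === null
				? null
				: []
	};
}
```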
|
||||
|
||||
/**
|
||||
* Parses an array of task objects from AI's text response.
|
||||
* @param {string} text - Response text from AI.
|
||||
* @param {number} expectedCount - Expected number of tasks.
|
||||
* @param {Function | Object} logFn - The logging function or MCP log object.
|
||||
* @param {boolean} isMCP - Flag indicating if logFn is MCP logger.
|
||||
* @returns {Array} Parsed and validated tasks array.
|
||||
* @throws {Error} If parsing or validation fails.
|
||||
*/
|
||||
function parseUpdatedTasksFromText(text, expectedCount, logFn, isMCP) {
|
||||
const report = (level, ...args) => {
|
||||
if (isMCP) {
|
||||
if (typeof logFn[level] === 'function') logFn[level](...args);
|
||||
else logFn.info(...args);
|
||||
} else if (!isSilentMode()) {
|
||||
// Check silent mode for consoleLog
|
||||
consoleLog(level, ...args);
|
||||
}
|
||||
};
|
||||
|
||||
report(
|
||||
'info',
|
||||
'Attempting to parse updated tasks array from text response...'
|
||||
);
|
||||
if (!text || text.trim() === '')
|
||||
throw new Error('AI response text is empty.');
|
||||
|
||||
let cleanedResponse = text.trim();
|
||||
const originalResponseForDebug = cleanedResponse;
|
||||
let parseMethodUsed = 'raw'; // Track which method worked
|
||||
|
||||
// --- NEW Step 1: Try extracting between [] first ---
|
||||
const firstBracketIndex = cleanedResponse.indexOf('[');
|
||||
const lastBracketIndex = cleanedResponse.lastIndexOf(']');
|
||||
let potentialJsonFromArray = null;
|
||||
|
||||
if (firstBracketIndex !== -1 && lastBracketIndex > firstBracketIndex) {
|
||||
potentialJsonFromArray = cleanedResponse.substring(
|
||||
firstBracketIndex,
|
||||
lastBracketIndex + 1
|
||||
);
|
||||
// Basic check to ensure it's not just "[]" or malformed
|
||||
if (potentialJsonFromArray.length <= 2) {
|
||||
potentialJsonFromArray = null; // Ignore empty array
|
||||
}
|
||||
}
|
||||
|
||||
// If [] extraction yielded something, try parsing it immediately
|
||||
if (potentialJsonFromArray) {
|
||||
try {
|
||||
const testParse = JSON.parse(potentialJsonFromArray);
|
||||
// It worked! Use this as the primary cleaned response.
|
||||
cleanedResponse = potentialJsonFromArray;
|
||||
parseMethodUsed = 'brackets';
|
||||
} catch (e) {
|
||||
report(
|
||||
'info',
|
||||
'Content between [] looked promising but failed initial parse. Proceeding to other methods.'
|
||||
);
|
||||
// Reset cleanedResponse to original if bracket parsing failed
|
||||
cleanedResponse = originalResponseForDebug;
|
||||
}
|
||||
}
|
||||
|
||||
// --- Step 2: If bracket parsing didn't work or wasn't applicable, try code block extraction ---
|
||||
if (parseMethodUsed === 'raw') {
|
||||
// Only look for ```json blocks now
|
||||
const codeBlockMatch = cleanedResponse.match(
|
||||
/```json\s*([\s\S]*?)\s*```/i // Only match ```json
|
||||
);
|
||||
if (codeBlockMatch) {
|
||||
cleanedResponse = codeBlockMatch[1].trim();
|
||||
parseMethodUsed = 'codeblock';
|
||||
report('info', 'Extracted JSON content from JSON Markdown code block.');
|
||||
} else {
|
||||
report('info', 'No JSON code block found.');
|
||||
// --- Step 3: If code block failed, try stripping prefixes ---
|
||||
const commonPrefixes = [
|
||||
'json\n',
|
||||
'javascript\n', // Keep checking common prefixes just in case
|
||||
'python\n',
|
||||
'here are the updated tasks:',
|
||||
'here is the updated json:',
|
||||
'updated tasks:',
|
||||
'updated json:',
|
||||
'response:',
|
||||
'output:'
|
||||
];
|
||||
let prefixFound = false;
|
||||
for (const prefix of commonPrefixes) {
|
||||
if (cleanedResponse.toLowerCase().startsWith(prefix)) {
|
||||
cleanedResponse = cleanedResponse.substring(prefix.length).trim();
|
||||
parseMethodUsed = 'prefix';
|
||||
report('info', `Stripped prefix: "${prefix.trim()}"`);
|
||||
prefixFound = true;
|
||||
break;
|
||||
}
|
||||
}
|
||||
if (!prefixFound) {
|
||||
report(
|
||||
'warn',
|
||||
'Response does not appear to contain [], JSON code block, or known prefix. Attempting raw parse.'
|
||||
);
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
// --- Step 4: Attempt final parse ---
|
||||
let parsedTasks;
|
||||
try {
|
||||
parsedTasks = JSON.parse(cleanedResponse);
|
||||
} catch (parseError) {
|
||||
report('error', `Failed to parse JSON array: ${parseError.message}`);
|
||||
report(
|
||||
'error',
|
||||
`Extraction method used: ${parseMethodUsed}` // Log which method failed
|
||||
);
|
||||
report(
|
||||
'error',
|
||||
`Problematic JSON string (first 500 chars): ${cleanedResponse.substring(0, 500)}`
|
||||
);
|
||||
report(
|
||||
'error',
|
||||
`Original Raw Response (first 500 chars): ${originalResponseForDebug.substring(0, 500)}`
|
||||
);
|
||||
throw new Error(
|
||||
`Failed to parse JSON response array: ${parseError.message}`
|
||||
);
|
||||
}
|
||||
|
||||
// --- Step 5 & 6: Validate Array structure and Zod schema ---
|
||||
if (!Array.isArray(parsedTasks)) {
|
||||
report(
|
||||
'error',
|
||||
`Parsed content is not an array. Type: ${typeof parsedTasks}`
|
||||
);
|
||||
report(
|
||||
'error',
|
||||
`Parsed content sample: ${JSON.stringify(parsedTasks).substring(0, 200)}`
|
||||
);
|
||||
throw new Error('Parsed AI response is not a valid JSON array.');
|
||||
}
|
||||
|
||||
report('info', `Successfully parsed ${parsedTasks.length} potential tasks.`);
|
||||
if (expectedCount && parsedTasks.length !== expectedCount) {
|
||||
report(
|
||||
'warn',
|
||||
`Expected ${expectedCount} tasks, but parsed ${parsedTasks.length}.`
|
||||
);
|
||||
}
|
||||
|
||||
// Log missing fields for debugging before preprocessing
|
||||
let hasWarnings = false;
|
||||
parsedTasks.forEach((task, index) => {
|
||||
const missingFields = [];
|
||||
if (!task.hasOwnProperty('id')) missingFields.push('id');
|
||||
if (!task.hasOwnProperty('status')) missingFields.push('status');
|
||||
if (!task.hasOwnProperty('dependencies'))
|
||||
missingFields.push('dependencies');
|
||||
|
||||
if (missingFields.length > 0) {
|
||||
hasWarnings = true;
|
||||
report(
|
||||
'warn',
|
||||
`Task ${index} is missing fields: ${missingFields.join(', ')} - will use defaults`
|
||||
);
|
||||
}
|
||||
});
|
||||
|
||||
if (hasWarnings) {
|
||||
report(
|
||||
'warn',
|
||||
'Some tasks were missing required fields. Applying defaults...'
|
||||
);
|
||||
}
|
||||
|
||||
// Use the preprocessing schema to add defaults and validate
|
||||
const preprocessResult = preprocessedTaskArraySchema.safeParse(parsedTasks);
|
||||
|
||||
if (!preprocessResult.success) {
|
||||
// This should rarely happen now since preprocessing adds defaults
|
||||
report('error', 'Failed to validate task array even after preprocessing.');
|
||||
preprocessResult.error.errors.forEach((err) => {
|
||||
report('error', ` - Path '${err.path.join('.')}': ${err.message}`);
|
||||
});
|
||||
|
||||
throw new Error(
|
||||
`AI response failed validation: ${preprocessResult.error.message}`
|
||||
);
|
||||
}
|
||||
|
||||
report('info', 'Successfully validated and transformed task structure.');
|
||||
return preprocessResult.data.slice(
|
||||
0,
|
||||
expectedCount || preprocessResult.data.length
|
||||
);
|
||||
}
|
||||
|
||||
/**
 * Update tasks based on new context using the unified AI service.
 * @param {string} tasksPath - Path to the tasks.json file
@@ -212,15 +458,13 @@ async function updateTasks(
	// Determine role based on research flag
	const serviceRole = useResearch ? 'research' : 'main';

	// Call the unified AI service with generateObject
	aiServiceResponse = await generateObjectService({
	// Call the unified AI service
	aiServiceResponse = await generateTextService({
		role: serviceRole,
		session: session,
		projectRoot: projectRoot,
		systemPrompt: systemPrompt,
		prompt: userPrompt,
		schema: COMMAND_SCHEMAS['update-tasks'],
		objectName: 'tasks',
		commandName: 'update-tasks',
		outputType: isMCP ? 'mcp' : 'cli'
	});
@@ -228,8 +472,13 @@ async function updateTasks(
	if (loadingIndicator)
		stopLoadingIndicator(loadingIndicator, 'AI update complete.');

	// With generateObject, we get structured data directly
	const parsedUpdatedTasks = aiServiceResponse.mainResult.tasks;
	// Use the mainResult (text) for parsing
	const parsedUpdatedTasks = parseUpdatedTasksFromText(
		aiServiceResponse.mainResult,
		tasksToUpdate.length,
		logFn,
		isMCP
	);

	// --- Update Tasks Data (Updated writeJSON call) ---
	if (!Array.isArray(parsedUpdatedTasks)) {

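The hunk above swaps a structured `generateObject`-style result (where `mainResult` is already a parsed object) for a text result that must go through `parseUpdatedTasksFromText`. A tiny illustrative contrast of the two access patterns (this helper is hypothetical, and the text path is deliberately simplified to a bare `JSON.parse`):

```javascript
// Illustrative only: structured results carry parsed data directly,
// text results carry a string that still needs parsing.
function tasksFromResult(mainResult) {
	if (typeof mainResult === 'string') {
		return JSON.parse(mainResult).tasks; // text path (simplified parser)
	}
	return mainResult.tasks; // structured path
}
```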
@@ -2310,8 +2310,7 @@ function displayAiUsageSummary(telemetryData, outputType = 'cli') {
	outputTokens,
	totalTokens,
	totalCost,
	commandName,
	isUnknownCost
	commandName
} = telemetryData;

let summary = chalk.bold.blue('AI Usage Summary:') + '\n';
@@ -2321,10 +2320,7 @@ function displayAiUsageSummary(telemetryData, outputType = 'cli') {
	summary += chalk.gray(
		` Tokens: ${totalTokens} (Input: ${inputTokens}, Output: ${outputTokens})\n`
	);

	// Show "Unknown" if pricing data is not available, otherwise show the cost
	const costDisplay = isUnknownCost ? 'Unknown' : `$${totalCost.toFixed(6)}`;
	summary += chalk.gray(` Est. Cost: ${costDisplay}`);
	summary += chalk.gray(` Est. Cost: $${totalCost.toFixed(6)}`);

	console.log(
		boxen(summary, {

@@ -21,13 +21,6 @@ export class BaseAIProvider {

	// Each provider must set their name
	this.name = this.constructor.name;

	/**
	 * Whether this provider needs explicit schema in JSON mode
	 * Can be overridden by subclasses
	 * @type {boolean}
	 */
	this.needsExplicitJsonSchema = false;
}

/**
@@ -133,6 +126,16 @@ export class BaseAIProvider {
	throw new Error('getRequiredApiKeyName must be implemented by provider');
}

/**
 * Determines if a model requires max_completion_tokens instead of maxTokens
 * Can be overridden by providers to specify their model requirements
 * @param {string} modelId - The model ID to check
 * @returns {boolean} True if the model requires max_completion_tokens
 */
requiresMaxCompletionTokens(modelId) {
	return false; // Default behavior - most models use maxTokens
}

/**
 * Prepares token limit parameter based on model requirements
 * @param {string} modelId - The model ID
@@ -147,7 +150,11 @@ export class BaseAIProvider {
	// Ensure maxTokens is an integer
	const tokenValue = Math.floor(Number(maxTokens));

	return { maxOutputTokens: tokenValue };
	if (this.requiresMaxCompletionTokens(modelId)) {
		return { max_completion_tokens: tokenValue };
	} else {
		return { maxTokens: tokenValue };
	}
}

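The `prepareTokenParam` hunk above dispatches between `maxTokens` and `max_completion_tokens` through a per-provider override. A self-contained sketch of that dispatch pattern (class names and the `o1` model check are illustrative, not Task Master's actual classes):

```javascript
// Base class: default token parameter is maxTokens.
class BaseProviderSketch {
	requiresMaxCompletionTokens(modelId) {
		return false; // most models use maxTokens
	}

	prepareTokenParam(modelId, maxTokens) {
		if (maxTokens === undefined) return {};
		const tokenValue = Math.floor(Number(maxTokens)); // ensure integer
		return this.requiresMaxCompletionTokens(modelId)
			? { max_completion_tokens: tokenValue }
			: { maxTokens: tokenValue };
	}
}

// Hypothetical provider where "o1"-prefixed models need max_completion_tokens.
class OpenAISketch extends BaseProviderSketch {
	requiresMaxCompletionTokens(modelId) {
		return Boolean(modelId && modelId.startsWith('o1'));
	}
}
```

The returned object can then be spread into the request body, so the caller never needs to know which parameter name the model expects.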
/**
@@ -176,19 +183,12 @@ export class BaseAIProvider {
	`${this.name} generateText completed successfully for model: ${params.modelId}`
);

const inputTokens =
	result.usage?.inputTokens ?? result.usage?.promptTokens ?? 0;
const outputTokens =
	result.usage?.outputTokens ?? result.usage?.completionTokens ?? 0;
const totalTokens =
	result.usage?.totalTokens ?? inputTokens + outputTokens;

return {
	text: result.text,
	usage: {
		inputTokens,
		outputTokens,
		totalTokens
		inputTokens: result.usage?.promptTokens,
		outputTokens: result.usage?.completionTokens,
		totalTokens: result.usage?.totalTokens
	}
};
} catch (error) {
@@ -248,7 +248,7 @@ export class BaseAIProvider {
	messages: params.messages,
	schema: zodSchema(params.schema),
	mode: params.mode || 'auto',
	maxOutputTokens: params.maxTokens,
	maxTokens: params.maxTokens,
	temperature: params.temperature
});

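The usage normalization removed in the hunk above prefers the AI SDK v5 field names (`inputTokens`/`outputTokens`) and falls back to the v4 names (`promptTokens`/`completionTokens`), deriving `totalTokens` when absent. The same `??` chain, sketched standalone:

```javascript
// Normalize token usage across AI SDK v5 and v4 field names.
function normalizeUsage(usage = {}) {
	const inputTokens = usage.inputTokens ?? usage.promptTokens ?? 0;
	const outputTokens = usage.outputTokens ?? usage.completionTokens ?? 0;
	// Prefer a reported total; otherwise derive it from the two parts.
	const totalTokens = usage.totalTokens ?? inputTokens + outputTokens;
	return { inputTokens, outputTokens, totalTokens };
}
```

Because `??` only falls through on `null`/`undefined`, a legitimate `0` reported by the provider is preserved rather than being replaced by the fallback.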
@@ -286,15 +286,12 @@ export class BaseAIProvider {
	);

	const client = await this.getClient(params);

	const result = await generateObject({
		model: client(params.modelId),
		messages: params.messages,
		schema: params.schema,
		mode: this.needsExplicitJsonSchema ? 'json' : 'auto',
		schemaName: params.objectName,
		schemaDescription: `Generate a valid JSON object for ${params.objectName}`,
		maxTokens: params.maxTokens,
		schema: zodSchema(params.schema),
		mode: params.mode || 'auto',
		...this.prepareTokenParam(params.modelId, params.maxTokens),
		temperature: params.temperature
	});

@@ -303,26 +300,19 @@ export class BaseAIProvider {
	`${this.name} generateObject completed successfully for model: ${params.modelId}`
);

const inputTokens =
	result.usage?.inputTokens ?? result.usage?.promptTokens ?? 0;
const outputTokens =
	result.usage?.outputTokens ?? result.usage?.completionTokens ?? 0;
const totalTokens =
	result.usage?.totalTokens ?? inputTokens + outputTokens;

return {
	object: result.object,
	usage: {
		inputTokens,
		outputTokens,
		totalTokens
		inputTokens: result.usage?.promptTokens,
		outputTokens: result.usage?.completionTokens,
		totalTokens: result.usage?.totalTokens
	}
};
} catch (error) {
	// Check if this is a JSON parsing error that we can potentially fix
	if (
		NoObjectGeneratedError.isInstance(error) &&
		error.cause instanceof JSONParseError &&
		JSONParseError.isInstance(error.cause) &&
		error.cause.text
	) {
		log(

@@ -1,127 +1,54 @@
/**
 * src/ai-providers/claude-code.js
 *
 * Claude Code provider implementation using the ai-sdk-provider-claude-code package.
 * This provider uses the local Claude Code CLI with OAuth token authentication.
 *
 * Authentication:
 * - Uses CLAUDE_CODE_OAUTH_TOKEN managed by Claude Code CLI
 * - Token is set up via: claude setup-token
 * - No manual API key configuration required
 * Implementation for interacting with Claude models via Claude Code CLI
 * using a custom AI SDK implementation.
 */

import { createClaudeCode } from 'ai-sdk-provider-claude-code';
import { createClaudeCode } from './custom-sdk/claude-code/index.js';
import { BaseAIProvider } from './base-provider.js';
import { getClaudeCodeSettingsForCommand } from '../../scripts/modules/config-manager.js';
import { execSync } from 'child_process';
import { log } from '../../scripts/modules/utils.js';

let _claudeCliChecked = false;
let _claudeCliAvailable = null;

/**
 * Provider for Claude Code CLI integration via AI SDK
 *
 * Features:
 * - No API key required (uses local Claude Code CLI)
 * - Supports 'sonnet' and 'opus' models
 * - Command-specific configuration support
 */
export class ClaudeCodeProvider extends BaseAIProvider {
	constructor() {
		super();
		this.name = 'Claude Code';
		this.supportedModels = ['sonnet', 'opus'];
		// Claude Code requires explicit JSON schema mode
		this.needsExplicitJsonSchema = true;
	}

	/**
	 * @returns {string} The environment variable name for API key (not used)
	 */
	getRequiredApiKeyName() {
		return 'CLAUDE_CODE_API_KEY';
	}

	/**
	 * @returns {boolean} False - Claude Code doesn't require API keys
	 */
	isRequiredApiKey() {
		return false;
	}

	/**
	 * Optional CLI availability check for Claude Code
	 * @param {object} params - Parameters (ignored)
	 * Override validateAuth to skip API key validation for Claude Code
	 * @param {object} params - Parameters to validate
	 */
	validateAuth(params) {
		// Claude Code uses local CLI - perform lightweight availability check
		// This is optional validation that fails fast with actionable guidance
		if (
			process.env.NODE_ENV !== 'test' &&
			!_claudeCliChecked &&
			!process.env.CLAUDE_CODE_OAUTH_TOKEN
		) {
			try {
				execSync('claude --version', { stdio: 'pipe', timeout: 1000 });
				_claudeCliAvailable = true;
			} catch (error) {
				_claudeCliAvailable = false;
				log(
					'warn',
					'Claude Code CLI not detected. Install it with: npm install -g @anthropic-ai/claude-code'
				);
			} finally {
				_claudeCliChecked = true;
			}
		}
		// Claude Code doesn't require an API key
		// No validation needed
	}

	/**
	 * Creates a Claude Code client instance
	 * @param {object} params - Client parameters
	 * @param {string} [params.commandName] - Command name for settings lookup
	 * @returns {Function} Claude Code provider function
	 * @throws {Error} If Claude Code CLI is not available or client creation fails
	 * Creates and returns a Claude Code client instance.
	 * @param {object} params - Parameters for client initialization
	 * @param {string} [params.commandName] - Name of the command invoking the service
	 * @param {string} [params.baseURL] - Optional custom API endpoint (not used by Claude Code)
	 * @returns {Function} Claude Code client function
	 * @throws {Error} If initialization fails
	 */
	getClient(params = {}) {
	getClient(params) {
		try {
			const settings =
				getClaudeCodeSettingsForCommand(params.commandName) || {};

			// Claude Code doesn't use API keys or base URLs
			// Just return the provider factory
			return createClaudeCode({
				defaultSettings: settings
				defaultSettings: getClaudeCodeSettingsForCommand(params?.commandName)
			});
		} catch (error) {
			// Provide more helpful error message
			const msg = String(error?.message || '');
			const code = error?.code;
			if (code === 'ENOENT' || /claude/i.test(msg)) {
				const enhancedError = new Error(
					`Claude Code CLI not available. Please install Claude Code CLI first. Original error: ${error.message}`
				);
				enhancedError.cause = error;
				this.handleError('Claude Code CLI initialization', enhancedError);
			} else {
				this.handleError('client initialization', error);
			}
			this.handleError('client initialization', error);
		}
	}

	/**
	 * @returns {string[]} List of supported model IDs
	 */
	getSupportedModels() {
		return this.supportedModels;
	}

	/**
	 * Check if a model is supported
	 * @param {string} modelId - Model ID to check
	 * @returns {boolean} True if supported
	 */
	isModelSupported(modelId) {
		if (!modelId) return false;
		return this.supportedModels.includes(String(modelId).toLowerCase());
	}
}

126
src/ai-providers/custom-sdk/claude-code/errors.js
Normal file
@@ -0,0 +1,126 @@
|
||||
/**
 * @fileoverview Error handling utilities for Claude Code provider
 */

import { APICallError, LoadAPIKeyError } from '@ai-sdk/provider';

/**
 * @typedef {import('./types.js').ClaudeCodeErrorMetadata} ClaudeCodeErrorMetadata
 */

/**
 * Create an API call error with Claude Code specific metadata
 * @param {Object} params - Error parameters
 * @param {string} params.message - Error message
 * @param {string} [params.code] - Error code
 * @param {number} [params.exitCode] - Process exit code
 * @param {string} [params.stderr] - Standard error output
 * @param {string} [params.promptExcerpt] - Excerpt of the prompt
 * @param {boolean} [params.isRetryable=false] - Whether the error is retryable
 * @returns {APICallError}
 */
export function createAPICallError({
	message,
	code,
	exitCode,
	stderr,
	promptExcerpt,
	isRetryable = false
}) {
	/** @type {ClaudeCodeErrorMetadata} */
	const metadata = {
		code,
		exitCode,
		stderr,
		promptExcerpt
	};

	return new APICallError({
		message,
		isRetryable,
		url: 'claude-code-cli://command',
		requestBodyValues: promptExcerpt ? { prompt: promptExcerpt } : undefined,
		data: metadata
	});
}

/**
 * Create an authentication error
 * @param {Object} params - Error parameters
 * @param {string} params.message - Error message
 * @returns {LoadAPIKeyError}
 */
export function createAuthenticationError({ message }) {
	return new LoadAPIKeyError({
		message:
			message ||
			'Authentication failed. Please ensure Claude Code CLI is properly authenticated.'
	});
}

/**
 * Create a timeout error
 * @param {Object} params - Error parameters
 * @param {string} params.message - Error message
 * @param {string} [params.promptExcerpt] - Excerpt of the prompt
 * @param {number} params.timeoutMs - Timeout in milliseconds
 * @returns {APICallError}
 */
export function createTimeoutError({ message, promptExcerpt, timeoutMs }) {
	// Store timeoutMs in metadata for potential use by error handlers
	/** @type {ClaudeCodeErrorMetadata & { timeoutMs: number }} */
	const metadata = {
		code: 'TIMEOUT',
		promptExcerpt,
		timeoutMs
	};

	return new APICallError({
		message,
		isRetryable: true,
		url: 'claude-code-cli://command',
		requestBodyValues: promptExcerpt ? { prompt: promptExcerpt } : undefined,
		data: metadata
	});
}

/**
 * Check if an error is an authentication error
 * @param {unknown} error - Error to check
 * @returns {boolean}
 */
export function isAuthenticationError(error) {
	if (error instanceof LoadAPIKeyError) return true;
	if (
		error instanceof APICallError &&
		/** @type {ClaudeCodeErrorMetadata} */ (error.data)?.exitCode === 401
	)
		return true;
	return false;
}

/**
 * Check if an error is a timeout error
 * @param {unknown} error - Error to check
 * @returns {boolean}
 */
export function isTimeoutError(error) {
	if (
		error instanceof APICallError &&
		/** @type {ClaudeCodeErrorMetadata} */ (error.data)?.code === 'TIMEOUT'
	)
		return true;
	return false;
}

/**
 * Get error metadata from an error
 * @param {unknown} error - Error to extract metadata from
 * @returns {ClaudeCodeErrorMetadata|undefined}
 */
export function getErrorMetadata(error) {
	if (error instanceof APICallError && error.data) {
		return /** @type {ClaudeCodeErrorMetadata} */ (error.data);
	}
	return undefined;
}
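The error helpers above classify failures by attaching metadata (`code`, `exitCode`) to a common error class and inspecting it later. A dependency-free sketch of the same pattern, with a plain class standing in for `@ai-sdk/provider`'s `APICallError` (the `Sketch` names mark everything here as illustrative):

```javascript
// Plain-object stand-in for the APICallError class used above.
class APICallErrorSketch extends Error {
	constructor({ message, isRetryable = false, data }) {
		super(message);
		this.isRetryable = isRetryable;
		this.data = data; // provider-specific metadata travels with the error
	}
}

function createTimeoutErrorSketch({ message, timeoutMs }) {
	return new APICallErrorSketch({
		message,
		isRetryable: true, // timeouts are worth retrying
		data: { code: 'TIMEOUT', timeoutMs }
	});
}

// Classification reads the metadata, not the message string.
function isTimeoutErrorSketch(error) {
	return error instanceof APICallErrorSketch && error.data?.code === 'TIMEOUT';
}
```

Keying classification off structured metadata rather than message text keeps the checks stable when error wording changes.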
83
src/ai-providers/custom-sdk/claude-code/index.js
Normal file
@@ -0,0 +1,83 @@
|
||||
/**
 * @fileoverview Claude Code provider factory and exports
 */

import { NoSuchModelError } from '@ai-sdk/provider';
import { ClaudeCodeLanguageModel } from './language-model.js';

/**
 * @typedef {import('./types.js').ClaudeCodeSettings} ClaudeCodeSettings
 * @typedef {import('./types.js').ClaudeCodeModelId} ClaudeCodeModelId
 * @typedef {import('./types.js').ClaudeCodeProvider} ClaudeCodeProvider
 * @typedef {import('./types.js').ClaudeCodeProviderSettings} ClaudeCodeProviderSettings
 */

/**
 * Create a Claude Code provider using the official SDK
 * @param {ClaudeCodeProviderSettings} [options={}] - Provider configuration options
 * @returns {ClaudeCodeProvider} Claude Code provider instance
 */
export function createClaudeCode(options = {}) {
	/**
	 * Create a language model instance
	 * @param {ClaudeCodeModelId} modelId - Model ID
	 * @param {ClaudeCodeSettings} [settings={}] - Model settings
	 * @returns {ClaudeCodeLanguageModel}
	 */
	const createModel = (modelId, settings = {}) => {
		return new ClaudeCodeLanguageModel({
			id: modelId,
			settings: {
				...options.defaultSettings,
				...settings
			}
		});
	};

	/**
	 * Provider function
	 * @param {ClaudeCodeModelId} modelId - Model ID
	 * @param {ClaudeCodeSettings} [settings] - Model settings
	 * @returns {ClaudeCodeLanguageModel}
	 */
	const provider = function (modelId, settings) {
		if (new.target) {
			throw new Error(
				'The Claude Code model function cannot be called with the new keyword.'
			);
		}

		return createModel(modelId, settings);
	};

	provider.languageModel = createModel;
	provider.chat = createModel; // Alias for languageModel

	// Add textEmbeddingModel method that throws NoSuchModelError
	provider.textEmbeddingModel = (modelId) => {
		throw new NoSuchModelError({
			modelId,
			modelType: 'textEmbeddingModel'
		});
	};

	return /** @type {ClaudeCodeProvider} */ (provider);
}

/**
 * Default Claude Code provider instance
 */
export const claudeCode = createClaudeCode();

// Provider exports
export { ClaudeCodeLanguageModel } from './language-model.js';

// Error handling exports
export {
	isAuthenticationError,
	isTimeoutError,
	getErrorMetadata,
	createAPICallError,
	createAuthenticationError,
	createTimeoutError
} from './errors.js';
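The factory above builds a provider that is simultaneously a callable function and an object carrying method aliases, and it rejects `new` via `new.target`. A minimal runnable sketch of that callable-provider pattern, with trivial model objects in place of `ClaudeCodeLanguageModel` (names illustrative):

```javascript
// Minimal sketch of the callable-provider pattern used by AI SDK factories.
function createProviderSketch() {
	const createModel = (modelId, settings = {}) => ({ modelId, settings });

	// A function object: callable directly, but also carries methods.
	const provider = function (modelId, settings) {
		if (new.target) {
			// Constructed with `new` -> reject, like the factory above.
			throw new Error('The model function cannot be called with the new keyword.');
		}
		return createModel(modelId, settings);
	};

	provider.languageModel = createModel;
	provider.chat = createModel; // alias for languageModel

	return provider;
}
```

Usage: `const p = createProviderSketch(); p('sonnet')` and `p.languageModel('sonnet')` return the same shape, which is what lets callers pass either the provider or a bound method around interchangeably.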
59
src/ai-providers/custom-sdk/claude-code/json-extractor.js
Normal file
@@ -0,0 +1,59 @@
|
||||
/**
 * @fileoverview Extract JSON from Claude's response, handling markdown blocks and other formatting
 */

/**
 * Extract JSON from Claude's response
 * @param {string} text - The text to extract JSON from
 * @returns {string} - The extracted JSON string
 */
export function extractJson(text) {
	// Remove markdown code blocks if present
	let jsonText = text.trim();

	// Remove ```json blocks
	jsonText = jsonText.replace(/^```json\s*/gm, '');
	jsonText = jsonText.replace(/^```\s*/gm, '');
	jsonText = jsonText.replace(/```\s*$/gm, '');

	// Remove common TypeScript/JavaScript patterns
	jsonText = jsonText.replace(/^const\s+\w+\s*=\s*/, ''); // Remove "const varName = "
	jsonText = jsonText.replace(/^let\s+\w+\s*=\s*/, ''); // Remove "let varName = "
	jsonText = jsonText.replace(/^var\s+\w+\s*=\s*/, ''); // Remove "var varName = "
	jsonText = jsonText.replace(/;?\s*$/, ''); // Remove trailing semicolons

	// Try to extract JSON object or array
	const objectMatch = jsonText.match(/{[\s\S]*}/);
	const arrayMatch = jsonText.match(/\[[\s\S]*\]/);

	if (objectMatch) {
		jsonText = objectMatch[0];
	} else if (arrayMatch) {
		jsonText = arrayMatch[0];
	}

	// First try to parse as valid JSON
	try {
		JSON.parse(jsonText);
		return jsonText;
	} catch {
		// If it's not valid JSON, it might be a JavaScript object literal
		// Try to convert it to valid JSON
		try {
			// This is a simple conversion that handles basic cases
			// Replace unquoted keys with quoted keys
			const converted = jsonText
				.replace(/([{,]\s*)([a-zA-Z_$][a-zA-Z0-9_$]*)\s*:/g, '$1"$2":')
				// Replace single quotes with double quotes
				.replace(/'/g, '"');

			// Validate the converted JSON
			JSON.parse(converted);
			return converted;
		} catch {
			// If all else fails, return the original text
			// The AI SDK will handle the error appropriately
			return text;
		}
	}
}
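The extractor above strips markdown fences and then pulls the first `{...}` span out of the surrounding prose. A simplified, self-contained version of just those two steps (the object-literal repair pass is omitted; the function name marks it as a sketch):

```javascript
// Simplified version of the extraction above: strip markdown code fences,
// then pull the first {...} object out of whatever text remains.
function extractJsonSketch(text) {
	let jsonText = text
		.trim()
		.replace(/^```json\s*/gm, '') // opening ```json fence
		.replace(/^```\s*/gm, '') // bare ``` fences at line starts
		.replace(/```\s*$/gm, ''); // trailing fence
	const objectMatch = jsonText.match(/{[\s\S]*}/);
	if (objectMatch) jsonText = objectMatch[0];
	JSON.parse(jsonText); // throws if still not valid JSON
	return jsonText;
}
```

Note the greedy `{[\s\S]*}` match: it spans from the first `{` to the last `}`, which tolerates nested objects but assumes the response contains a single top-level JSON value.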
540
src/ai-providers/custom-sdk/claude-code/language-model.js
Normal file
@@ -0,0 +1,540 @@
|
||||
/**
 * @fileoverview Claude Code Language Model implementation
 */

import { NoSuchModelError } from '@ai-sdk/provider';
import { generateId } from '@ai-sdk/provider-utils';
import { convertToClaudeCodeMessages } from './message-converter.js';
import { extractJson } from './json-extractor.js';
import { createAPICallError, createAuthenticationError } from './errors.js';

let query;
let AbortError;

async function loadClaudeCodeModule() {
	if (!query || !AbortError) {
		try {
			const mod = await import('@anthropic-ai/claude-code');
			query = mod.query;
			AbortError = mod.AbortError;
		} catch (err) {
			throw new Error(
				"Claude Code SDK is not installed. Please install '@anthropic-ai/claude-code' to use the claude-code provider."
			);
		}
	}
}

/**
 * @typedef {import('./types.js').ClaudeCodeSettings} ClaudeCodeSettings
 * @typedef {import('./types.js').ClaudeCodeModelId} ClaudeCodeModelId
 * @typedef {import('./types.js').ClaudeCodeLanguageModelOptions} ClaudeCodeLanguageModelOptions
 */

const modelMap = {
	opus: 'opus',
	sonnet: 'sonnet'
};

export class ClaudeCodeLanguageModel {
	specificationVersion = 'v1';
	defaultObjectGenerationMode = 'json';
	supportsImageUrls = false;
	supportsStructuredOutputs = false;

	/** @type {ClaudeCodeModelId} */
	modelId;

	/** @type {ClaudeCodeSettings} */
	settings;

	/** @type {string|undefined} */
	sessionId;

	/**
	 * @param {ClaudeCodeLanguageModelOptions} options
	 */
	constructor(options) {
		this.modelId = options.id;
		this.settings = options.settings ?? {};

		// Validate model ID format
		if (
			!this.modelId ||
			typeof this.modelId !== 'string' ||
			this.modelId.trim() === ''
		) {
			throw new NoSuchModelError({
				modelId: this.modelId,
				modelType: 'languageModel'
			});
		}
	}

	get provider() {
		return 'claude-code';
	}

	/**
	 * Get the model name for Claude Code CLI
	 * @returns {string}
	 */
	getModel() {
		const mapped = modelMap[this.modelId];
		return mapped ?? this.modelId;
	}

	/**
	 * Generate unsupported parameter warnings
	 * @param {Object} options - Generation options
	 * @returns {Array} Warnings array
	 */
	generateUnsupportedWarnings(options) {
		const warnings = [];
		const unsupportedParams = [];

		// Check for unsupported parameters
		if (options.temperature !== undefined)
			unsupportedParams.push('temperature');
		if (options.maxTokens !== undefined) unsupportedParams.push('maxTokens');
		if (options.topP !== undefined) unsupportedParams.push('topP');
		if (options.topK !== undefined) unsupportedParams.push('topK');
		if (options.presencePenalty !== undefined)
			unsupportedParams.push('presencePenalty');
		if (options.frequencyPenalty !== undefined)
			unsupportedParams.push('frequencyPenalty');
		if (options.stopSequences !== undefined && options.stopSequences.length > 0)
			unsupportedParams.push('stopSequences');
		if (options.seed !== undefined) unsupportedParams.push('seed');

		if (unsupportedParams.length > 0) {
			// Add a warning for each unsupported parameter
			for (const param of unsupportedParams) {
				warnings.push({
					type: 'unsupported-setting',
					setting: param,
					details: `Claude Code CLI does not support the ${param} parameter. It will be ignored.`
				});
			}
		}

		return warnings;
	}

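The method above reports parameters the CLI cannot honor as structured warnings instead of failing the request. The same collection step can be sketched generically; the parameter list here is a shortened, illustrative subset of the one above:

```javascript
// Sketch of the warning collection above: a CLI-backed provider that
// ignores sampling parameters reports them rather than erroring out.
function collectUnsupportedWarnings(
	options,
	unsupported = ['temperature', 'maxTokens', 'topP'] // illustrative subset
) {
	return unsupported
		.filter((param) => options[param] !== undefined)
		.map((param) => ({
			type: 'unsupported-setting',
			setting: param,
			details: `The ${param} parameter is not supported and will be ignored.`
		}));
}
```

Returning warnings as data lets the caller surface them once per request while still completing the generation.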
/**
 * Generate text using Claude Code
 * @param {Object} options - Generation options
 * @returns {Promise<Object>}
 */
async doGenerate(options) {
	await loadClaudeCodeModule();
	const { messagesPrompt } = convertToClaudeCodeMessages(
		options.prompt,
		options.mode
	);

	const abortController = new AbortController();
	if (options.abortSignal) {
		options.abortSignal.addEventListener('abort', () =>
			abortController.abort()
		);
	}

	const queryOptions = {
		model: this.getModel(),
		abortController,
		resume: this.sessionId,
		pathToClaudeCodeExecutable: this.settings.pathToClaudeCodeExecutable,
		customSystemPrompt: this.settings.customSystemPrompt,
		appendSystemPrompt: this.settings.appendSystemPrompt,
		maxTurns: this.settings.maxTurns,
		maxThinkingTokens: this.settings.maxThinkingTokens,
		cwd: this.settings.cwd,
		executable: this.settings.executable,
		executableArgs: this.settings.executableArgs,
		permissionMode: this.settings.permissionMode,
		permissionPromptToolName: this.settings.permissionPromptToolName,
		continue: this.settings.continue,
		allowedTools: this.settings.allowedTools,
		disallowedTools: this.settings.disallowedTools,
		mcpServers: this.settings.mcpServers
	};

	let text = '';
	let usage = { promptTokens: 0, completionTokens: 0 };
	let finishReason = 'stop';
	let costUsd;
	let durationMs;
	let rawUsage;
	const warnings = this.generateUnsupportedWarnings(options);

	try {
		if (!query) {
			throw new Error(
				"Claude Code SDK is not installed. Please install '@anthropic-ai/claude-code' to use the claude-code provider."
			);
		}
		const response = query({
			prompt: messagesPrompt,
			options: queryOptions
		});

		for await (const message of response) {
			if (message.type === 'assistant') {
				text += message.message.content
					.map((c) => (c.type === 'text' ? c.text : ''))
					.join('');
			} else if (message.type === 'result') {
				this.sessionId = message.session_id;
				costUsd = message.total_cost_usd;
				durationMs = message.duration_ms;

				if ('usage' in message) {
					rawUsage = message.usage;
					usage = {
						promptTokens:
							(message.usage.cache_creation_input_tokens ?? 0) +
							(message.usage.cache_read_input_tokens ?? 0) +
							(message.usage.input_tokens ?? 0),
						completionTokens: message.usage.output_tokens ?? 0
					};
				}

				if (message.subtype === 'error_max_turns') {
					finishReason = 'length';
				} else if (message.subtype === 'error_during_execution') {
					finishReason = 'error';
				}
			} else if (message.type === 'system' && message.subtype === 'init') {
				this.sessionId = message.session_id;
			}
		}
	} catch (error) {
		// -------------------------------------------------------------
		// Work-around for Claude-Code CLI/SDK JSON truncation bug (#913)
		// -------------------------------------------------------------
		// If the SDK throws a JSON SyntaxError *but* we already hold some
		// buffered text, assume the response was truncated by the CLI.
		// We keep the accumulated text, mark the finish reason, push a
		// provider-warning and *skip* the normal error handling so Task
		// Master can continue processing.
		const isJsonTruncation =
			error instanceof SyntaxError &&
			/JSON/i.test(error.message || '') &&
			(error.message.includes('position') ||
				error.message.includes('Unexpected end'));
		if (isJsonTruncation && text && text.length > 0) {
			warnings.push({
				type: 'provider-warning',
				details:
					'Claude Code SDK emitted a JSON parse error but Task Master recovered buffered text (possible CLI truncation).'
			});
			finishReason = 'truncated';
			// Skip re-throwing: fall through so the caller receives usable data
		} else {
			if (AbortError && error instanceof AbortError) {
				throw options.abortSignal?.aborted
					? options.abortSignal.reason
					: error;
			}

			// Check for authentication errors
			if (
				error.message?.includes('not logged in') ||
				error.message?.includes('authentication') ||
				error.exitCode === 401
			) {
				throw createAuthenticationError({
					message:
						error.message ||
						'Authentication failed. Please ensure Claude Code CLI is properly authenticated.'
				});
			}

			// Wrap other errors with API call error
			throw createAPICallError({
				message: error.message || 'Claude Code CLI error',
				code: error.code,
				exitCode: error.exitCode,
				stderr: error.stderr,
				promptExcerpt: messagesPrompt.substring(0, 200),
				isRetryable: error.code === 'ENOENT' || error.code === 'ECONNREFUSED'
			});
		}
	}

	// Extract JSON if in object-json mode
	if (options.mode?.type === 'object-json' && text) {
		text = extractJson(text);
	}

	return {
		text: text || undefined,
		usage,
		finishReason,
		rawCall: {
			rawPrompt: messagesPrompt,
			rawSettings: queryOptions
		},
		warnings: warnings.length > 0 ? warnings : undefined,
		response: {
			id: generateId(),
			timestamp: new Date(),
			modelId: this.modelId
		},
		request: {
			body: messagesPrompt
		},
		providerMetadata: {
			'claude-code': {
				...(this.sessionId !== undefined && { sessionId: this.sessionId }),
				...(costUsd !== undefined && { costUsd }),
				...(durationMs !== undefined && { durationMs }),
				...(rawUsage !== undefined && { rawUsage })
			}
		}
	};
}

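The truncation workaround in `doGenerate` above hinges on one heuristic: a JSON `SyntaxError` is only treated as recoverable CLI truncation when partial text has already been buffered. That predicate, extracted into a standalone runnable sketch (the function name is illustrative):

```javascript
// Heuristic from the #913 workaround above: a JSON SyntaxError counts as
// recoverable truncation only when partial output was already buffered.
function isRecoverableTruncation(error, bufferedText) {
	const isJsonSyntaxError =
		error instanceof SyntaxError &&
		/JSON/i.test(error.message || '') &&
		(error.message.includes('position') ||
			error.message.includes('Unexpected end'));
	return isJsonSyntaxError && Boolean(bufferedText && bufferedText.length > 0);
}
```

With no buffered text the error cannot be papered over, so it falls through to the normal abort/authentication/API-call error handling.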
	/**
	 * Stream text using Claude Code
	 * @param {Object} options - Stream options
	 * @returns {Promise<Object>}
	 */
	async doStream(options) {
		await loadClaudeCodeModule();
		const { messagesPrompt } = convertToClaudeCodeMessages(
			options.prompt,
			options.mode
		);

		const abortController = new AbortController();
		if (options.abortSignal) {
			options.abortSignal.addEventListener('abort', () =>
				abortController.abort()
			);
		}

		const queryOptions = {
			model: this.getModel(),
			abortController,
			resume: this.sessionId,
			pathToClaudeCodeExecutable: this.settings.pathToClaudeCodeExecutable,
			customSystemPrompt: this.settings.customSystemPrompt,
			appendSystemPrompt: this.settings.appendSystemPrompt,
			maxTurns: this.settings.maxTurns,
			maxThinkingTokens: this.settings.maxThinkingTokens,
			cwd: this.settings.cwd,
			executable: this.settings.executable,
			executableArgs: this.settings.executableArgs,
			permissionMode: this.settings.permissionMode,
			permissionPromptToolName: this.settings.permissionPromptToolName,
			continue: this.settings.continue,
			allowedTools: this.settings.allowedTools,
			disallowedTools: this.settings.disallowedTools,
			mcpServers: this.settings.mcpServers
		};

		const warnings = this.generateUnsupportedWarnings(options);

		const stream = new ReadableStream({
			start: async (controller) => {
				// Declared outside the try block so the catch handler can still
				// read the buffered text and token usage when recovering.
				let usage = { promptTokens: 0, completionTokens: 0 };
				let accumulatedText = '';

				try {
					if (!query) {
						throw new Error(
							"Claude Code SDK is not installed. Please install '@anthropic-ai/claude-code' to use the claude-code provider."
						);
					}
					const response = query({
						prompt: messagesPrompt,
						options: queryOptions
					});

					for await (const message of response) {
						if (message.type === 'assistant') {
							const text = message.message.content
								.map((c) => (c.type === 'text' ? c.text : ''))
								.join('');

							if (text) {
								accumulatedText += text;

								// In object-json mode, we need to accumulate the full text
								// and extract JSON at the end, so don't stream individual deltas
								if (options.mode?.type !== 'object-json') {
									controller.enqueue({
										type: 'text-delta',
										textDelta: text
									});
								}
							}
						} else if (message.type === 'result') {
							let rawUsage;
							if ('usage' in message) {
								rawUsage = message.usage;
								usage = {
									promptTokens:
										(message.usage.cache_creation_input_tokens ?? 0) +
										(message.usage.cache_read_input_tokens ?? 0) +
										(message.usage.input_tokens ?? 0),
									completionTokens: message.usage.output_tokens ?? 0
								};
							}

							let finishReason = 'stop';
							if (message.subtype === 'error_max_turns') {
								finishReason = 'length';
							} else if (message.subtype === 'error_during_execution') {
								finishReason = 'error';
							}

							// Store session ID in the model instance
							this.sessionId = message.session_id;

							// In object-json mode, extract JSON and send the full text at once
							if (options.mode?.type === 'object-json' && accumulatedText) {
								const extractedJson = extractJson(accumulatedText);
								controller.enqueue({
									type: 'text-delta',
									textDelta: extractedJson
								});
							}

							controller.enqueue({
								type: 'finish',
								finishReason,
								usage,
								providerMetadata: {
									'claude-code': {
										sessionId: message.session_id,
										...(message.total_cost_usd !== undefined && {
											costUsd: message.total_cost_usd
										}),
										...(message.duration_ms !== undefined && {
											durationMs: message.duration_ms
										}),
										...(rawUsage !== undefined && { rawUsage })
									}
								}
							});
						} else if (
							message.type === 'system' &&
							message.subtype === 'init'
						) {
							// Store session ID for future use
							this.sessionId = message.session_id;

							// Emit response metadata when session is initialized
							controller.enqueue({
								type: 'response-metadata',
								id: message.session_id,
								timestamp: new Date(),
								modelId: this.modelId
							});
						}
					}

					controller.close();
				} catch (error) {
					// -------------------------------------------------------------
					// Work-around for Claude-Code CLI/SDK JSON truncation bug (#913)
					// -------------------------------------------------------------
					// If we hit the SDK JSON SyntaxError but have buffered text,
					// finalize the stream gracefully instead of emitting an error.
					const isJsonTruncation =
						error instanceof SyntaxError &&
						/JSON/i.test(error.message || '') &&
						(error.message.includes('position') ||
							error.message.includes('Unexpected end'));

					if (
						isJsonTruncation &&
						accumulatedText &&
						accumulatedText.length > 0
					) {
						// Prepare final text payload
						const finalText =
							options.mode?.type === 'object-json'
								? extractJson(accumulatedText)
								: accumulatedText;

						// Emit any remaining text
						controller.enqueue({
							type: 'text-delta',
							textDelta: finalText
						});

						// Emit finish with truncated reason and warning
						controller.enqueue({
							type: 'finish',
							finishReason: 'truncated',
							usage,
							providerMetadata: { 'claude-code': { truncated: true } },
							warnings: [
								{
									type: 'provider-warning',
									details:
										'Claude Code SDK JSON truncation detected; stream recovered.'
								}
							]
						});

						controller.close();
						return; // Skip normal error path
					}

					let errorToEmit;

					if (AbortError && error instanceof AbortError) {
						errorToEmit = options.abortSignal?.aborted
							? options.abortSignal.reason
							: error;
					} else if (
						error.message?.includes('not logged in') ||
						error.message?.includes('authentication') ||
						error.exitCode === 401
					) {
						errorToEmit = createAuthenticationError({
							message:
								error.message ||
								'Authentication failed. Please ensure Claude Code CLI is properly authenticated.'
						});
					} else {
						errorToEmit = createAPICallError({
							message: error.message || 'Claude Code CLI error',
							code: error.code,
							exitCode: error.exitCode,
							stderr: error.stderr,
							promptExcerpt: messagesPrompt.substring(0, 200),
							isRetryable:
								error.code === 'ENOENT' || error.code === 'ECONNREFUSED'
						});
					}

					// Emit error as a stream part
					controller.enqueue({
						type: 'error',
						error: errorToEmit
					});

					controller.close();
				}
			}
		});

		return {
			stream,
			rawCall: {
				rawPrompt: messagesPrompt,
				rawSettings: queryOptions
			},
			warnings: warnings.length > 0 ? warnings : undefined,
			request: {
				body: messagesPrompt
			}
		};
	}
}
src/ai-providers/custom-sdk/claude-code/message-converter.js (new file, 139 lines)
@@ -0,0 +1,139 @@
/**
 * @fileoverview Converts AI SDK prompt format to Claude Code message format
 */

/**
 * Convert AI SDK prompt to Claude Code messages format
 * @param {Array} prompt - AI SDK prompt array
 * @param {Object} [mode] - Generation mode
 * @param {string} mode.type - Mode type ('regular', 'object-json', 'object-tool')
 * @returns {{messagesPrompt: string, systemPrompt?: string}}
 */
export function convertToClaudeCodeMessages(prompt, mode) {
	const messages = [];
	let systemPrompt;

	for (const message of prompt) {
		switch (message.role) {
			case 'system':
				systemPrompt = message.content;
				break;

			case 'user':
				if (typeof message.content === 'string') {
					messages.push(message.content);
				} else {
					// Handle multi-part content
					const textParts = message.content
						.filter((part) => part.type === 'text')
						.map((part) => part.text)
						.join('\n');

					if (textParts) {
						messages.push(textParts);
					}

					// Note: Image parts are not supported by Claude Code CLI
					const imageParts = message.content.filter(
						(part) => part.type === 'image'
					);
					if (imageParts.length > 0) {
						console.warn(
							'Claude Code CLI does not support image inputs. Images will be ignored.'
						);
					}
				}
				break;

			case 'assistant':
				if (typeof message.content === 'string') {
					messages.push(`Assistant: ${message.content}`);
				} else {
					const textParts = message.content
						.filter((part) => part.type === 'text')
						.map((part) => part.text)
						.join('\n');

					if (textParts) {
						messages.push(`Assistant: ${textParts}`);
					}

					// Handle tool calls if present
					const toolCalls = message.content.filter(
						(part) => part.type === 'tool-call'
					);
					if (toolCalls.length > 0) {
						// For now, we'll just note that tool calls were made
						messages.push(`Assistant: [Tool calls made]`);
					}
				}
				break;

			case 'tool':
				// Tool results could be included in the conversation
				messages.push(
					`Tool Result (${message.content[0].toolName}): ${JSON.stringify(
						message.content[0].result
					)}`
				);
				break;
		}
	}

	// The SDK expects a single prompt string, so format the conversation
	// history and combine it with the system prompt.
	let finalPrompt = '';

	// Add system prompt at the beginning if present
	if (systemPrompt) {
		finalPrompt = systemPrompt;
	}

	if (messages.length === 0) {
		return { messagesPrompt: finalPrompt, systemPrompt };
	}

	// Format messages
	const formattedMessages = [];
	for (let i = 0; i < messages.length; i++) {
		const msg = messages[i];
		// Check if this is a user or assistant message based on content
		if (msg.startsWith('Assistant:') || msg.startsWith('Tool Result')) {
			formattedMessages.push(msg);
		} else {
			// User messages
			formattedMessages.push(`Human: ${msg}`);
		}
	}

	// Combine system prompt with messages
	if (finalPrompt) {
		finalPrompt = finalPrompt + '\n\n' + formattedMessages.join('\n\n');
	} else {
		finalPrompt = formattedMessages.join('\n\n');
	}

	// For JSON mode, add explicit instruction to ensure JSON output
	if (mode?.type === 'object-json') {
		// Make the JSON instruction even more explicit
		finalPrompt = `${finalPrompt}

CRITICAL INSTRUCTION: You MUST respond with ONLY valid JSON. Follow these rules EXACTLY:
1. Start your response with an opening brace {
2. End your response with a closing brace }
3. Do NOT include any text before the opening brace
4. Do NOT include any text after the closing brace
5. Do NOT use markdown code blocks or backticks
6. Do NOT include explanations or commentary
7. The ENTIRE response must be valid JSON that can be parsed with JSON.parse()

Begin your response with { and end with }`;
	}

	return {
		messagesPrompt: finalPrompt,
		systemPrompt
	};
}
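To illustrate the flattening scheme above, here is a minimal standalone sketch (not the module itself; it assumes string-only message content, whereas the real converter also handles multi-part content, images, and tool results):

```javascript
// Standalone sketch of the "Human:"/"Assistant:" prompt flattening.
// Assumes string-only message content (a simplification).
function flattenSketch(prompt) {
	const parts = [];
	let systemPrompt;
	for (const message of prompt) {
		if (message.role === 'system') systemPrompt = message.content;
		else if (message.role === 'assistant')
			parts.push(`Assistant: ${message.content}`);
		else if (message.role === 'user') parts.push(`Human: ${message.content}`);
	}
	// System prompt first, then the labeled turns, separated by blank lines
	return [systemPrompt, ...parts].filter(Boolean).join('\n\n');
}

console.log(
	flattenSketch([
		{ role: 'system', content: 'Be terse.' },
		{ role: 'user', content: 'List two colors.' },
		{ role: 'assistant', content: 'Red, blue.' }
	])
);
```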
src/ai-providers/custom-sdk/claude-code/types.js (new file, 73 lines)
@@ -0,0 +1,73 @@
/**
 * @fileoverview Type definitions for Claude Code AI SDK provider
 * These JSDoc types mirror the TypeScript interfaces from the original provider
 */

/**
 * Claude Code provider settings
 * @typedef {Object} ClaudeCodeSettings
 * @property {string} [pathToClaudeCodeExecutable='claude'] - Custom path to Claude Code CLI executable
 * @property {string} [customSystemPrompt] - Custom system prompt to use
 * @property {string} [appendSystemPrompt] - Append additional content to the system prompt
 * @property {number} [maxTurns] - Maximum number of turns for the conversation
 * @property {number} [maxThinkingTokens] - Maximum thinking tokens for the model
 * @property {string} [cwd] - Working directory for CLI operations
 * @property {'bun'|'deno'|'node'} [executable='node'] - JavaScript runtime to use
 * @property {string[]} [executableArgs] - Additional arguments for the JavaScript runtime
 * @property {'default'|'acceptEdits'|'bypassPermissions'|'plan'} [permissionMode='default'] - Permission mode for tool usage
 * @property {string} [permissionPromptToolName] - Custom tool name for permission prompts
 * @property {boolean} [continue] - Continue the most recent conversation
 * @property {string} [resume] - Resume a specific session by ID
 * @property {string[]} [allowedTools] - Tools to explicitly allow during execution (e.g., ['Read', 'LS', 'Bash(git log:*)'])
 * @property {string[]} [disallowedTools] - Tools to disallow during execution (e.g., ['Write', 'Edit', 'Bash(rm:*)'])
 * @property {Object.<string, MCPServerConfig>} [mcpServers] - MCP server configuration
 * @property {boolean} [verbose] - Enable verbose logging for debugging
 */

/**
 * MCP Server configuration
 * @typedef {Object} MCPServerConfig
 * @property {'stdio'|'sse'} [type='stdio'] - Server type
 * @property {string} [command] - Command to execute (required for stdio-type servers)
 * @property {string[]} [args] - Arguments for the command
 * @property {Object.<string, string>} [env] - Environment variables
 * @property {string} [url] - URL (required for SSE-type servers)
 * @property {Object.<string, string>} [headers] - Headers for SSE-type servers
 */

/**
 * Model ID type - either 'opus', 'sonnet', or any string
 * @typedef {'opus'|'sonnet'|string} ClaudeCodeModelId
 */

/**
 * Language model options
 * @typedef {Object} ClaudeCodeLanguageModelOptions
 * @property {ClaudeCodeModelId} id - The model ID
 * @property {ClaudeCodeSettings} [settings] - Optional settings
 */

/**
 * Error metadata for Claude Code errors
 * @typedef {Object} ClaudeCodeErrorMetadata
 * @property {string} [code] - Error code
 * @property {number} [exitCode] - Process exit code
 * @property {string} [stderr] - Standard error output
 * @property {string} [promptExcerpt] - Excerpt of the prompt that caused the error
 */

/**
 * Claude Code provider interface
 * @typedef {Object} ClaudeCodeProvider
 * @property {function(ClaudeCodeModelId, ClaudeCodeSettings=): Object} languageModel - Create a language model
 * @property {function(ClaudeCodeModelId, ClaudeCodeSettings=): Object} chat - Alias for languageModel
 * @property {function(string): never} textEmbeddingModel - Throws NoSuchModelError (not supported)
 */

/**
 * Claude Code provider settings
 * @typedef {Object} ClaudeCodeProviderSettings
 * @property {ClaudeCodeSettings} [defaultSettings] - Default settings to use for all models
 */

export {}; // This ensures the file is treated as a module
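For reference, an object matching the `ClaudeCodeSettings` typedefs above might look like this (all values are illustrative, not defaults from the provider):

```javascript
// Illustrative ClaudeCodeSettings-shaped object (example values only).
const settings = {
	pathToClaudeCodeExecutable: 'claude',
	maxTurns: 5,
	permissionMode: 'plan',
	allowedTools: ['Read', 'LS'],
	mcpServers: {
		// stdio-type server: 'command' is required, 'url' is not used
		localTools: {
			type: 'stdio',
			command: 'node',
			args: ['./mcp-server.js'], // hypothetical server script
			env: { LOG_LEVEL: 'info' }
		}
	}
};

console.log(Object.keys(settings.mcpServers));
```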
src/ai-providers/custom-sdk/grok-cli/errors.js (new file, 155 lines)
@@ -0,0 +1,155 @@
/**
 * @fileoverview Error handling utilities for Grok CLI provider
 */

import { APICallError, LoadAPIKeyError } from '@ai-sdk/provider';

/**
 * @typedef {import('./types.js').GrokCliErrorMetadata} GrokCliErrorMetadata
 */

/**
 * Create an API call error with Grok CLI specific metadata
 * @param {Object} params - Error parameters
 * @param {string} params.message - Error message
 * @param {string} [params.code] - Error code
 * @param {number} [params.exitCode] - Process exit code
 * @param {string} [params.stderr] - Standard error output
 * @param {string} [params.stdout] - Standard output
 * @param {string} [params.promptExcerpt] - Excerpt of the prompt
 * @param {boolean} [params.isRetryable=false] - Whether the error is retryable
 * @returns {APICallError}
 */
export function createAPICallError({
	message,
	code,
	exitCode,
	stderr,
	stdout,
	promptExcerpt,
	isRetryable = false
}) {
	/** @type {GrokCliErrorMetadata} */
	const metadata = {
		code,
		exitCode,
		stderr,
		stdout,
		promptExcerpt
	};

	return new APICallError({
		message,
		isRetryable,
		url: 'grok-cli://command',
		requestBodyValues: promptExcerpt ? { prompt: promptExcerpt } : undefined,
		data: metadata
	});
}

/**
 * Create an authentication error
 * @param {Object} params - Error parameters
 * @param {string} params.message - Error message
 * @returns {LoadAPIKeyError}
 */
export function createAuthenticationError({ message }) {
	return new LoadAPIKeyError({
		message:
			message ||
			'Authentication failed. Please ensure Grok CLI is properly configured with API key.'
	});
}

/**
 * Create a timeout error
 * @param {Object} params - Error parameters
 * @param {string} params.message - Error message
 * @param {string} [params.promptExcerpt] - Excerpt of the prompt
 * @param {number} params.timeoutMs - Timeout in milliseconds
 * @returns {APICallError}
 */
export function createTimeoutError({ message, promptExcerpt, timeoutMs }) {
	/** @type {GrokCliErrorMetadata & { timeoutMs: number }} */
	const metadata = {
		code: 'TIMEOUT',
		promptExcerpt,
		timeoutMs
	};

	return new APICallError({
		message,
		isRetryable: true,
		url: 'grok-cli://command',
		requestBodyValues: promptExcerpt ? { prompt: promptExcerpt } : undefined,
		data: metadata
	});
}

/**
 * Create a CLI installation error
 * @param {Object} params - Error parameters
 * @param {string} [params.message] - Error message
 * @returns {APICallError}
 */
export function createInstallationError({ message }) {
	return new APICallError({
		message:
			message ||
			'Grok CLI is not installed or not found in PATH. Please install with: npm install -g @vibe-kit/grok-cli',
		isRetryable: false,
		url: 'grok-cli://installation'
	});
}

/**
 * Check if an error is an authentication error
 * @param {unknown} error - Error to check
 * @returns {boolean}
 */
export function isAuthenticationError(error) {
	if (error instanceof LoadAPIKeyError) return true;
	if (
		error instanceof APICallError &&
		/** @type {GrokCliErrorMetadata} */ (error.data)?.exitCode === 401
	)
		return true;
	return false;
}

/**
 * Check if an error is a timeout error
 * @param {unknown} error - Error to check
 * @returns {boolean}
 */
export function isTimeoutError(error) {
	if (
		error instanceof APICallError &&
		/** @type {GrokCliErrorMetadata} */ (error.data)?.code === 'TIMEOUT'
	)
		return true;
	return false;
}

/**
 * Check if an error is an installation error
 * @param {unknown} error - Error to check
 * @returns {boolean}
 */
export function isInstallationError(error) {
	if (error instanceof APICallError && error.url === 'grok-cli://installation')
		return true;
	return false;
}

/**
 * Get error metadata from an error
 * @param {unknown} error - Error to extract metadata from
 * @returns {GrokCliErrorMetadata|undefined}
 */
export function getErrorMetadata(error) {
	if (error instanceof APICallError && error.data) {
		return /** @type {GrokCliErrorMetadata} */ (error.data);
	}
	return undefined;
}
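A caller-side sketch of how these helpers and guards compose: timeout errors carry `{ code: 'TIMEOUT' }` metadata and are marked retryable, so a retry loop can key off both. The error class below is a stand-in for `@ai-sdk/provider`'s `APICallError` so the sketch stays self-contained:

```javascript
// Stand-in for @ai-sdk/provider's APICallError (illustration only).
class APICallErrorStandIn extends Error {
	constructor({ message, isRetryable, data }) {
		super(message);
		this.isRetryable = isRetryable;
		this.data = data;
	}
}

// Mirrors createTimeoutError: TIMEOUT metadata plus isRetryable: true.
const err = new APICallErrorStandIn({
	message: 'Grok CLI timed out after 120000ms',
	isRetryable: true,
	data: { code: 'TIMEOUT', timeoutMs: 120000 }
});

// Mirrors isTimeoutError's check on error.data.code.
const shouldRetry =
	err instanceof APICallErrorStandIn && err.data?.code === 'TIMEOUT';
console.log(shouldRetry, err.isRetryable);
```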
src/ai-providers/custom-sdk/grok-cli/index.js (new file, 85 lines)
@@ -0,0 +1,85 @@
/**
 * @fileoverview Grok CLI provider factory and exports
 */

import { NoSuchModelError } from '@ai-sdk/provider';
import { GrokCliLanguageModel } from './language-model.js';

/**
 * @typedef {import('./types.js').GrokCliSettings} GrokCliSettings
 * @typedef {import('./types.js').GrokCliModelId} GrokCliModelId
 * @typedef {import('./types.js').GrokCliProvider} GrokCliProvider
 * @typedef {import('./types.js').GrokCliProviderSettings} GrokCliProviderSettings
 */

/**
 * Create a Grok CLI provider
 * @param {GrokCliProviderSettings} [options={}] - Provider configuration options
 * @returns {GrokCliProvider} Grok CLI provider instance
 */
export function createGrokCli(options = {}) {
	/**
	 * Create a language model instance
	 * @param {GrokCliModelId} modelId - Model ID
	 * @param {GrokCliSettings} [settings={}] - Model settings
	 * @returns {GrokCliLanguageModel}
	 */
	const createModel = (modelId, settings = {}) => {
		return new GrokCliLanguageModel({
			id: modelId,
			settings: {
				...options.defaultSettings,
				...settings
			}
		});
	};

	/**
	 * Provider function
	 * @param {GrokCliModelId} modelId - Model ID
	 * @param {GrokCliSettings} [settings] - Model settings
	 * @returns {GrokCliLanguageModel}
	 */
	const provider = function (modelId, settings) {
		if (new.target) {
			throw new Error(
				'The Grok CLI model function cannot be called with the new keyword.'
			);
		}

		return createModel(modelId, settings);
	};

	provider.languageModel = createModel;
	provider.chat = createModel; // Alias for languageModel

	// Add textEmbeddingModel method that throws NoSuchModelError
	provider.textEmbeddingModel = (modelId) => {
		throw new NoSuchModelError({
			modelId,
			modelType: 'textEmbeddingModel'
		});
	};

	return /** @type {GrokCliProvider} */ (provider);
}

/**
 * Default Grok CLI provider instance
 */
export const grokCli = createGrokCli();

// Provider exports
export { GrokCliLanguageModel } from './language-model.js';

// Error handling exports
export {
	isAuthenticationError,
	isTimeoutError,
	isInstallationError,
	getErrorMetadata,
	createAPICallError,
	createAuthenticationError,
	createTimeoutError,
	createInstallationError
} from './errors.js';
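The callable-provider pattern above (a plain function that also carries `languageModel`/`chat` methods, with a `new.target` guard) can be sketched independently like this; the model object returned here is a placeholder, not `GrokCliLanguageModel`:

```javascript
// Minimal sketch of the callable-provider factory pattern used above.
function createProviderSketch(options = {}) {
	// Placeholder model object; the real factory constructs a language model.
	const createModel = (modelId, settings = {}) => ({
		modelId,
		settings: { ...options.defaultSettings, ...settings }
	});

	const provider = function (modelId, settings) {
		if (new.target) {
			// Guard: the provider is callable, not constructable.
			throw new Error('The model function cannot be called with the new keyword.');
		}
		return createModel(modelId, settings);
	};

	provider.languageModel = createModel;
	provider.chat = createModel; // alias for languageModel

	return provider;
}

const p = createProviderSketch({ defaultSettings: { timeout: 60000 } });
console.log(p('example-model', { cwd: '/tmp' }).settings);
```

Per-call settings shallow-merge over the factory's `defaultSettings`, which is why the real `createModel` spreads `options.defaultSettings` first.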
src/ai-providers/custom-sdk/grok-cli/json-extractor.js (new file, 59 lines)
@@ -0,0 +1,59 @@
/**
 * @fileoverview Extract JSON from Grok's response, handling markdown blocks and other formatting
 */

/**
 * Extract JSON from Grok's response
 * @param {string} text - The text to extract JSON from
 * @returns {string} - The extracted JSON string
 */
export function extractJson(text) {
	// Remove markdown code blocks if present
	let jsonText = text.trim();

	// Remove ```json blocks
	jsonText = jsonText.replace(/^```json\s*/gm, '');
	jsonText = jsonText.replace(/^```\s*/gm, '');
	jsonText = jsonText.replace(/```\s*$/gm, '');

	// Remove common TypeScript/JavaScript patterns
	jsonText = jsonText.replace(/^const\s+\w+\s*=\s*/, ''); // Remove "const varName = "
	jsonText = jsonText.replace(/^let\s+\w+\s*=\s*/, ''); // Remove "let varName = "
	jsonText = jsonText.replace(/^var\s+\w+\s*=\s*/, ''); // Remove "var varName = "
	jsonText = jsonText.replace(/;?\s*$/, ''); // Remove trailing semicolons

	// Try to extract JSON object or array
	const objectMatch = jsonText.match(/{[\s\S]*}/);
	const arrayMatch = jsonText.match(/\[[\s\S]*\]/);

	if (objectMatch) {
		jsonText = objectMatch[0];
	} else if (arrayMatch) {
		jsonText = arrayMatch[0];
	}

	// First try to parse as valid JSON
	try {
		JSON.parse(jsonText);
		return jsonText;
	} catch {
		// If it's not valid JSON, it might be a JavaScript object literal
		// Try to convert it to valid JSON
		try {
			// This is a simple conversion that handles basic cases
			// Replace unquoted keys with quoted keys
			const converted = jsonText
				.replace(/([{,]\s*)([a-zA-Z_$][a-zA-Z0-9_$]*)\s*:/g, '$1"$2":')
				// Replace single quotes with double quotes
				.replace(/'/g, '"');

			// Validate the converted JSON
			JSON.parse(converted);
			return converted;
		} catch {
			// If all else fails, return the original text
			// The AI SDK will handle the error appropriately
			return text;
		}
	}
}
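As a quick illustration of the recovery steps above, here is a standalone sketch of the fence-stripping and object-matching passes (not the module itself; it omits the variable-declaration stripping and object-literal conversion):

```javascript
// Standalone sketch of the JSON extraction passes described above.
function extractJsonSketch(text) {
	let jsonText = text
		.trim()
		.replace(/^```json\s*/gm, '') // strip opening json-labeled fences
		.replace(/^```\s*/gm, '') // strip bare fence lines
		.replace(/```\s*$/gm, ''); // strip trailing fences

	// Prefer the first {...} object span if one is present
	const objectMatch = jsonText.match(/{[\s\S]*}/);
	if (objectMatch) jsonText = objectMatch[0];

	try {
		JSON.parse(jsonText); // validate before returning
		return jsonText;
	} catch {
		return text; // fall back to the original text on failure
	}
}

const fence = '`'.repeat(3);
console.log(extractJsonSketch(`${fence}json\n{"ok": true}\n${fence}`));
```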
@@ -1,51 +1,53 @@
|
||||
/**
|
||||
* Grok CLI Language Model implementation for AI SDK v5
|
||||
* @fileoverview Grok CLI Language Model implementation
|
||||
*/
|
||||
|
||||
import { spawn } from 'child_process';
|
||||
import { promises as fs } from 'fs';
|
||||
import { homedir } from 'os';
|
||||
import { join } from 'path';
|
||||
import type {
|
||||
LanguageModelV2,
|
||||
LanguageModelV2CallOptions,
|
||||
LanguageModelV2CallWarning
|
||||
} from '@ai-sdk/provider';
|
||||
import { NoSuchModelError } from '@ai-sdk/provider';
|
||||
import { generateId } from '@ai-sdk/provider-utils';
|
||||
|
||||
import {
|
||||
createPromptFromMessages,
|
||||
convertFromGrokCliResponse,
|
||||
escapeShellArg
|
||||
} from './message-converter.js';
|
||||
import { extractJson } from './json-extractor.js';
|
||||
import {
|
||||
createAPICallError,
|
||||
createAuthenticationError,
|
||||
createInstallationError,
|
||||
createTimeoutError
|
||||
} from './errors.js';
|
||||
import { extractJson } from './json-extractor.js';
|
||||
import {
|
||||
convertFromGrokCliResponse,
|
||||
createPromptFromMessages,
|
||||
escapeShellArg
|
||||
} from './message-converter.js';
|
||||
import type {
|
||||
GrokCliLanguageModelOptions,
|
||||
GrokCliModelId,
|
||||
GrokCliSettings
|
||||
} from './types.js';
|
||||
import { spawn } from 'child_process';
|
||||
import { promises as fs } from 'fs';
|
||||
import { join } from 'path';
|
||||
import { homedir } from 'os';
|
||||
|
||||
/**
|
||||
* Grok CLI Language Model implementation for AI SDK v5
|
||||
* @typedef {import('./types.js').GrokCliSettings} GrokCliSettings
|
||||
* @typedef {import('./types.js').GrokCliModelId} GrokCliModelId
|
||||
*/
|
||||
export class GrokCliLanguageModel implements LanguageModelV2 {
|
||||
readonly specificationVersion = 'v2' as const;
|
||||
readonly defaultObjectGenerationMode = 'json' as const;
|
||||
readonly supportsImageUrls = false;
|
||||
readonly supportsStructuredOutputs = false;
|
||||
readonly supportedUrls: Record<string, RegExp[]> = {};
|
||||
|
||||
readonly modelId: GrokCliModelId;
|
||||
readonly settings: GrokCliSettings;
|
||||
/**
|
||||
* @typedef {Object} GrokCliLanguageModelOptions
|
||||
* @property {GrokCliModelId} id - Model ID
|
||||
* @property {GrokCliSettings} [settings] - Model settings
|
||||
*/
|
||||
|
||||
constructor(options: GrokCliLanguageModelOptions) {
|
||||
export class GrokCliLanguageModel {
|
||||
specificationVersion = 'v1';
|
||||
defaultObjectGenerationMode = 'json';
|
||||
supportsImageUrls = false;
|
||||
supportsStructuredOutputs = false;
|
||||
|
||||
/** @type {GrokCliModelId} */
|
||||
modelId;
|
||||
|
||||
/** @type {GrokCliSettings} */
|
||||
settings;
|
||||
|
||||
/**
|
||||
* @param {GrokCliLanguageModelOptions} options
|
||||
*/
|
||||
constructor(options) {
|
||||
this.modelId = options.id;
|
||||
this.settings = options.settings ?? {};
|
||||
|
||||
@@ -62,14 +64,15 @@ export class GrokCliLanguageModel implements LanguageModelV2 {
|
||||
}
|
||||
}
|
||||
|
||||
get provider(): string {
|
||||
get provider() {
|
||||
return 'grok-cli';
|
||||
}
|
||||
|
||||
/**
|
||||
* Check if Grok CLI is installed and available
|
||||
* @returns {Promise<boolean>}
|
||||
*/
|
||||
private async checkGrokCliInstallation(): Promise<boolean> {
|
||||
async checkGrokCliInstallation() {
|
||||
return new Promise((resolve) => {
|
||||
const child = spawn('grok', ['--version'], {
|
||||
stdio: 'pipe'
|
||||
@@ -82,8 +85,9 @@ export class GrokCliLanguageModel implements LanguageModelV2 {
|
||||
|
||||
/**
|
||||
* Get API key from settings or environment
|
||||
* @returns {Promise<string|null>}
|
||||
*/
|
||||
private async getApiKey(): Promise<string | null> {
|
||||
async getApiKey() {
|
||||
// Check settings first
|
||||
if (this.settings.apiKey) {
|
||||
return this.settings.apiKey;
|
||||
@@ -107,32 +111,22 @@ export class GrokCliLanguageModel implements LanguageModelV2 {
|
||||
|
||||
/**
|
||||
* Execute Grok CLI command
|
||||
* @param {Array<string>} args - Command line arguments
|
||||
* @param {Object} options - Execution options
|
||||
* @returns {Promise<{stdout: string, stderr: string, exitCode: number}>}
|
||||
*/
|
||||
private async executeGrokCli(
|
||||
args: string[],
|
||||
options: { timeout?: number; apiKey?: string } = {}
|
||||
): Promise<{ stdout: string; stderr: string; exitCode: number }> {
|
||||
// Default timeout based on model type
|
||||
let defaultTimeout = 120000; // 2 minutes default
|
||||
if (this.modelId.includes('grok-4')) {
|
||||
defaultTimeout = 600000; // 10 minutes for grok-4 models (they seem to hang during setup)
|
||||
}
|
||||
|
||||
const timeout = options.timeout ?? this.settings.timeout ?? defaultTimeout;
|
||||
async executeGrokCli(args, options = {}) {
|
||||
const timeout = options.timeout || this.settings.timeout || 120000; // 2 minutes default
|
||||
|
||||
return new Promise((resolve, reject) => {
|
||||
const child = spawn('grok', args, {
|
||||
stdio: 'pipe',
|
||||
cwd: this.settings.workingDirectory || process.cwd(),
|
||||
env:
|
||||
options.apiKey === undefined
|
||||
? process.env
|
||||
: { ...process.env, GROK_CLI_API_KEY: options.apiKey }
|
||||
cwd: this.settings.workingDirectory || process.cwd()
|
||||
});
|
||||
|
||||
let stdout = '';
|
||||
let stderr = '';
|
||||
let timeoutId: NodeJS.Timeout | undefined;
|
||||
let timeoutId;
|
||||
|
||||
// Set up timeout
|
||||
if (timeout > 0) {
|
||||
@@ -148,26 +142,24 @@ export class GrokCliLanguageModel implements LanguageModelV2 {
}, timeout);
}

child.stdout?.on('data', (data) => {
const chunk = data.toString();
stdout += chunk;
child.stdout.on('data', (data) => {
stdout += data.toString();
});

child.stderr?.on('data', (data) => {
const chunk = data.toString();
stderr += chunk;
child.stderr.on('data', (data) => {
stderr += data.toString();
});

child.on('error', (error) => {
if (timeoutId) clearTimeout(timeoutId);

if ((error as any).code === 'ENOENT') {
if (error.code === 'ENOENT') {
reject(createInstallationError({}));
} else {
reject(
createAPICallError({
message: `Failed to execute Grok CLI: ${error.message}`,
code: (error as any).code,
code: error.code,
stderr: error.message,
isRetryable: false
})
@@ -188,18 +180,15 @@ export class GrokCliLanguageModel implements LanguageModelV2 {
}

/**
* Generate comprehensive warnings for unsupported parameters and validation issues
* Generate unsupported parameter warnings
* @param {Object} options - Generation options
* @returns {Array} Warnings array
*/
private generateAllWarnings(
options: LanguageModelV2CallOptions,
prompt: string
): LanguageModelV2CallWarning[] {
const warnings: LanguageModelV2CallWarning[] = [];
const unsupportedParams: string[] = [];
generateUnsupportedWarnings(options) {
const warnings = [];
const unsupportedParams = [];

// Check for unsupported parameters
if (options.temperature !== undefined)
unsupportedParams.push('temperature');
// Grok CLI supports some parameters but not all AI SDK parameters
if (options.topP !== undefined) unsupportedParams.push('topP');
if (options.topK !== undefined) unsupportedParams.push('topK');
if (options.presencePenalty !== undefined)
@@ -211,51 +200,24 @@ export class GrokCliLanguageModel implements LanguageModelV2 {
if (options.seed !== undefined) unsupportedParams.push('seed');

if (unsupportedParams.length > 0) {
// Add a warning for each unsupported parameter
for (const param of unsupportedParams) {
warnings.push({
type: 'unsupported-setting',
setting: param as
| 'temperature'
| 'topP'
| 'topK'
| 'presencePenalty'
| 'frequencyPenalty'
| 'stopSequences'
| 'seed',
setting: param,
details: `Grok CLI does not support the ${param} parameter. It will be ignored.`
});
}
}

// Add model validation warnings if needed
if (!this.modelId || this.modelId.trim() === '') {
warnings.push({
type: 'other',
message: 'Model ID is empty or invalid'
});
}

// Add prompt validation
if (!prompt || prompt.trim() === '') {
warnings.push({
type: 'other',
message: 'Prompt is empty'
});
}

return warnings;
}

/**
* Generate text using Grok CLI
* @param {Object} options - Generation options
* @returns {Promise<Object>}
*/
async doGenerate(options: LanguageModelV2CallOptions) {
// Handle abort signal early
if (options.abortSignal?.aborted) {
throw options.abortSignal.reason || new Error('Request aborted');
}

async doGenerate(options) {
// Check CLI installation
const isInstalled = await this.checkGrokCliInstallation();
if (!isInstalled) {
@@ -272,7 +234,7 @@ export class GrokCliLanguageModel implements LanguageModelV2 {
}

const prompt = createPromptFromMessages(options.prompt);
const warnings = this.generateAllWarnings(options, prompt);
const warnings = this.generateUnsupportedWarnings(options);

// Build command arguments
const args = ['--prompt', escapeShellArg(prompt)];
@@ -282,11 +244,10 @@ export class GrokCliLanguageModel implements LanguageModelV2 {
args.push('--model', this.modelId);
}

// Skip API key parameter if it's likely already configured to avoid hanging
// The CLI seems to hang when trying to save API keys for grok-4 models
// if (apiKey) {
// args.push('--api-key', apiKey);
// }
// Add API key if available
if (apiKey) {
args.push('--api-key', apiKey);
}

// Add base URL if provided in settings
if (this.settings.baseURL) {
@@ -299,7 +260,9 @@ export class GrokCliLanguageModel implements LanguageModelV2 {
}

try {
const result = await this.executeGrokCli(args, { apiKey });
const result = await this.executeGrokCli(args, {
timeout: this.settings.timeout
});

if (result.exitCode !== 0) {
// Handle authentication errors
@@ -327,37 +290,19 @@ export class GrokCliLanguageModel implements LanguageModelV2 {
let text = response.text || '';

// Extract JSON if in object-json mode
const isObjectJson = (
o: unknown
): o is { mode: { type: 'object-json' } } =>
!!o &&
typeof o === 'object' &&
'mode' in o &&
(o as any).mode?.type === 'object-json';
if (isObjectJson(options) && text) {
if (options.mode?.type === 'object-json' && text) {
text = extractJson(text);
}

return {
content: [
{
type: 'text' as const,
text: text || ''
}
],
usage: response.usage
? {
inputTokens: response.usage.promptTokens,
outputTokens: response.usage.completionTokens,
totalTokens: response.usage.totalTokens
}
: { inputTokens: 0, outputTokens: 0, totalTokens: 0 },
finishReason: 'stop' as const,
text: text || undefined,
usage: response.usage || { promptTokens: 0, completionTokens: 0 },
finishReason: 'stop',
rawCall: {
rawPrompt: prompt,
rawSettings: args
},
warnings: warnings,
warnings: warnings.length > 0 ? warnings : undefined,
response: {
id: generateId(),
timestamp: new Date(),
@@ -369,23 +314,20 @@ export class GrokCliLanguageModel implements LanguageModelV2 {
providerMetadata: {
'grok-cli': {
exitCode: result.exitCode,
...(result.stderr && { stderr: result.stderr })
stderr: result.stderr || undefined
}
}
};
} catch (error) {
// Re-throw our custom errors
if (
(error as any).name === 'APICallError' ||
(error as any).name === 'LoadAPIKeyError'
) {
if (error.name === 'APICallError' || error.name === 'LoadAPIKeyError') {
throw error;
}

// Wrap other errors
throw createAPICallError({
message: `Grok CLI execution failed: ${(error as Error).message}`,
code: (error as any).code,
message: `Grok CLI execution failed: ${error.message}`,
code: error.code,
promptExcerpt: prompt.substring(0, 200),
isRetryable: false
});
@@ -396,39 +338,15 @@ export class GrokCliLanguageModel implements LanguageModelV2 {
* Stream text using Grok CLI
* Note: Grok CLI doesn't natively support streaming, so this simulates streaming
* by generating the full response and then streaming it in chunks
* @param {Object} options - Stream options
* @returns {Promise<Object>}
*/
async doStream(options: LanguageModelV2CallOptions) {
const prompt = createPromptFromMessages(options.prompt);
const warnings = this.generateAllWarnings(options, prompt);
async doStream(options) {
const warnings = this.generateUnsupportedWarnings(options);

const stream = new ReadableStream({
start: async (controller) => {
let abortListener: (() => void) | undefined;

try {
// Handle abort signal
if (options.abortSignal?.aborted) {
throw options.abortSignal.reason || new Error('Request aborted');
}

// Set up abort listener
if (options.abortSignal) {
abortListener = () => {
controller.enqueue({
type: 'error',
error:
options.abortSignal?.reason || new Error('Request aborted')
});
controller.close();
};
options.abortSignal.addEventListener('abort', abortListener, {
once: true
});
}

// Emit stream-start with warnings
controller.enqueue({ type: 'stream-start', warnings });

// Generate the full response first
const result = await this.doGenerate(options);

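The hunk above registers an abort listener with `{ once: true }` and (in the later `finally` block) removes it so a long-lived signal doesn't accumulate dead listeners. A self-contained sketch of that lifecycle using Node's `AbortController` (the `onAbort`/`events` names are illustrative):

```javascript
// Register with { once: true }, then remove in a finally block;
// removeEventListener is safe even if the listener already fired.
const controller = new AbortController();
const events = [];
const onAbort = () => events.push('aborted');

try {
  controller.signal.addEventListener('abort', onAbort, { once: true });
  controller.abort(new Error('user cancelled'));
} finally {
  controller.signal.removeEventListener('abort', onAbort);
}

console.log(events); // ['aborted']
console.log(controller.signal.reason instanceof Error); // true
```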
@@ -441,48 +359,20 @@ export class GrokCliLanguageModel implements LanguageModelV2 {
});

// Simulate streaming by chunking the text
const content = result.content || [];
const text =
content.length > 0 && content[0].type === 'text'
? content[0].text
: '';
const text = result.text || '';
const chunkSize = 50; // Characters per chunk
let textPartId: string | undefined;

// Emit text-start if we have content
if (text.length > 0) {
textPartId = generateId();
controller.enqueue({
type: 'text-start',
id: textPartId
});
}

for (let i = 0; i < text.length; i += chunkSize) {
// Check for abort during streaming
if (options.abortSignal?.aborted) {
throw options.abortSignal.reason || new Error('Request aborted');
}

const chunk = text.slice(i, i + chunkSize);
controller.enqueue({
type: 'text-delta',
id: textPartId!,
delta: chunk
textDelta: chunk
});

// Add small delay to simulate streaming
await new Promise((resolve) => setTimeout(resolve, 20));
}

// Close text part if opened
if (textPartId) {
controller.enqueue({
type: 'text-end',
id: textPartId
});
}

// Emit finish event
controller.enqueue({
type: 'finish',
@@ -498,22 +388,19 @@ export class GrokCliLanguageModel implements LanguageModelV2 {
error
});
controller.close();
} finally {
// Clean up abort listener
if (options.abortSignal && abortListener) {
options.abortSignal.removeEventListener('abort', abortListener);
}
}
},
cancel: () => {
// Clean up if stream is cancelled
}
});

return {
stream,
rawCall: {
rawPrompt: createPromptFromMessages(options.prompt),
rawSettings: {}
},
warnings: warnings.length > 0 ? warnings : undefined,
request: {
body: prompt
body: createPromptFromMessages(options.prompt)
}
};
}
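The simulated streaming in `doStream` slices the full response into fixed-size 50-character deltas. An illustrative sketch of that chunking loop (`chunkText` is a made-up helper name, not from the diff):

```javascript
// Slice the full response into fixed-size deltas, as the simulated
// stream above does; the last chunk may be shorter.
function chunkText(text, chunkSize = 50) {
  const deltas = [];
  for (let i = 0; i < text.length; i += chunkSize) {
    deltas.push(text.slice(i, i + chunkSize));
  }
  return deltas;
}

const deltas = chunkText('a'.repeat(120), 50);
console.log(deltas.map((d) => d.length)); // [ 50, 50, 20 ]
console.log(deltas.join('').length); // 120 — chunking is lossless
```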
@@ -1,28 +1,17 @@
/**
* Message format conversion utilities for Grok CLI provider
* @fileoverview Message format conversion utilities for Grok CLI provider
*/

import type { GrokCliMessage, GrokCliResponse } from './types.js';

/**
* AI SDK message type (simplified interface)
* @typedef {import('./types.js').GrokCliMessage} GrokCliMessage
*/
interface AISDKMessage {
role: string;
content:
| string
| Array<{ type: string; text?: string }>
| { text?: string; [key: string]: unknown };
}

/**
* Convert AI SDK messages to Grok CLI compatible format
* @param messages - AI SDK message array
* @returns Grok CLI compatible messages
* @param {Array<Object>} messages - AI SDK message array
* @returns {Array<GrokCliMessage>} Grok CLI compatible messages
*/
export function convertToGrokCliMessages(
messages: AISDKMessage[]
): GrokCliMessage[] {
export function convertToGrokCliMessages(messages) {
return messages.map((message) => {
// Handle different message content types
let content = '';
@@ -33,7 +22,7 @@ export function convertToGrokCliMessages(
// Handle multi-part content (text and images)
content = message.content
.filter((part) => part.type === 'text')
.map((part) => part.text || '')
.map((part) => part.text)
.join('\n');
} else if (message.content && typeof message.content === 'object') {
// Handle object content
@@ -49,17 +38,10 @@ export function convertToGrokCliMessages(

/**
* Convert Grok CLI response to AI SDK format
* @param responseText - Raw response text from Grok CLI (JSONL format)
* @returns AI SDK compatible response object
* @param {string} responseText - Raw response text from Grok CLI (JSONL format)
* @returns {Object} AI SDK compatible response object
*/
export function convertFromGrokCliResponse(responseText: string): {
text: string;
usage?: {
promptTokens: number;
completionTokens: number;
totalTokens: number;
};
} {
export function convertFromGrokCliResponse(responseText) {
try {
// Grok CLI outputs JSONL format - each line is a separate JSON message
const lines = responseText
@@ -68,10 +50,10 @@ export function convertFromGrokCliResponse(responseText: string): {
.filter((line) => line.trim());

// Parse each line as JSON and find assistant messages
const messages: GrokCliResponse[] = [];
const messages = [];
for (const line of lines) {
try {
const message = JSON.parse(line) as GrokCliResponse;
const message = JSON.parse(line);
messages.push(message);
} catch (parseError) {
// Skip invalid JSON lines
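`convertFromGrokCliResponse` treats the CLI output as JSONL: each line is parsed independently and invalid lines are skipped rather than failing the whole response. A self-contained sketch of that handling (`parseJsonl` is an illustrative name):

```javascript
// Split on newlines, parse each non-empty line, and silently skip
// lines that aren't valid JSON (partial writes, log noise, etc.).
function parseJsonl(responseText) {
  const messages = [];
  const lines = responseText.split('\n').filter((line) => line.trim());
  for (const line of lines) {
    try {
      messages.push(JSON.parse(line));
    } catch {
      // Skip invalid JSON lines
    }
  }
  return messages;
}

const raw = '{"role":"assistant","content":"hi"}\nnot-json\n{"role":"user","content":"ok"}\n';
console.log(parseJsonl(raw).length); // 2
```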
@@ -113,10 +95,10 @@ export function convertFromGrokCliResponse(responseText: string): {

/**
* Create a prompt string for Grok CLI from messages
* @param messages - AI SDK message array
* @returns Formatted prompt string
* @param {Array<Object>} messages - AI SDK message array
* @returns {string} Formatted prompt string
*/
export function createPromptFromMessages(messages: AISDKMessage[]): string {
export function createPromptFromMessages(messages) {
const grokMessages = convertToGrokCliMessages(messages);

// Create a conversation-style prompt
@@ -140,14 +122,14 @@ export function createPromptFromMessages(messages: AISDKMessage[]): string {

/**
* Escape shell arguments for safe CLI execution
* @param arg - Argument to escape
* @returns Shell-escaped argument
* @param {string} arg - Argument to escape
* @returns {string} Shell-escaped argument
*/
export function escapeShellArg(arg: string | unknown): string {
export function escapeShellArg(arg) {
if (typeof arg !== 'string') {
arg = String(arg);
}

// Replace single quotes with '\''
return "'" + (arg as string).replace(/'/g, "'\\''") + "'";
return "'" + arg.replace(/'/g, "'\\''") + "'";
}
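The `'\''` replacement in `escapeShellArg` is the standard POSIX-shell idiom for embedding a single quote in a single-quoted string: close the quote, emit a literal `'`, reopen the quote. The JS version from the diff, runnable on its own:

```javascript
// Wrap in single quotes and rewrite each embedded quote as '\''
// (close quote, escaped literal quote, reopen quote).
function escapeShellArg(arg) {
  if (typeof arg !== 'string') {
    arg = String(arg);
  }
  return "'" + arg.replace(/'/g, "'\\''") + "'";
}

console.log(escapeShellArg("it's done")); // 'it'\''s done'
console.log(escapeShellArg(42)); // '42'
```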
56
src/ai-providers/custom-sdk/grok-cli/types.js
Normal file
@@ -0,0 +1,56 @@
/**
* @fileoverview Type definitions for Grok CLI provider
*/

/**
* @typedef {Object} GrokCliSettings
* @property {string} [apiKey] - API key for Grok CLI
* @property {string} [baseURL] - Base URL for Grok API
* @property {string} [model] - Default model to use
* @property {number} [timeout] - Timeout in milliseconds
* @property {string} [workingDirectory] - Working directory for CLI commands
*/

/**
* @typedef {string} GrokCliModelId
* Model identifiers supported by Grok CLI
*/

/**
* @typedef {Object} GrokCliErrorMetadata
* @property {string} [code] - Error code
* @property {number} [exitCode] - Process exit code
* @property {string} [stderr] - Standard error output
* @property {string} [stdout] - Standard output
* @property {string} [promptExcerpt] - Excerpt of the prompt that caused the error
* @property {number} [timeoutMs] - Timeout value in milliseconds
*/

/**
* @typedef {Function} GrokCliProvider
* @property {Function} languageModel - Create a language model
* @property {Function} chat - Alias for languageModel
* @property {Function} textEmbeddingModel - Text embedding model (throws error)
*/

/**
* @typedef {Object} GrokCliProviderSettings
* @property {GrokCliSettings} [defaultSettings] - Default settings for all models
*/

/**
* @typedef {Object} GrokCliMessage
* @property {string} role - Message role (user, assistant, system)
* @property {string} content - Message content
*/

/**
* @typedef {Object} GrokCliResponse
* @property {string} content - Response content
* @property {Object} [usage] - Token usage information
* @property {number} [usage.prompt_tokens] - Input tokens used
* @property {number} [usage.completion_tokens] - Output tokens used
* @property {number} [usage.total_tokens] - Total tokens used
*/

export {};
@@ -9,14 +9,26 @@ import { generateObject, generateText, streamText } from 'ai';
import { parse } from 'jsonc-parser';
import { BaseAIProvider } from './base-provider.js';
import { log } from '../../scripts/modules/utils.js';
import { createGeminiProvider } from 'ai-sdk-provider-gemini-cli';

let createGeminiProvider;

async function loadGeminiCliModule() {
if (!createGeminiProvider) {
try {
const mod = await import('ai-sdk-provider-gemini-cli');
createGeminiProvider = mod.createGeminiProvider;
} catch (err) {
throw new Error(
"Gemini CLI SDK is not installed. Please install 'ai-sdk-provider-gemini-cli' to use the gemini-cli provider."
);
}
}
}

export class GeminiCliProvider extends BaseAIProvider {
constructor() {
super();
this.name = 'Gemini CLI';
// Gemini CLI requires explicit JSON schema mode
this.needsExplicitJsonSchema = true;
}

/**
@@ -42,6 +54,8 @@ export class GeminiCliProvider extends BaseAIProvider {
*/
async getClient(params) {
try {
// Load the Gemini CLI module dynamically
await loadGeminiCliModule();
// Primary use case: Use existing gemini CLI authentication
// Secondary use case: Direct API key (for compatibility)
let authOptions = {};
@@ -427,7 +441,7 @@ Generate ${subtaskCount} subtasks based on the original task context. Return ONL
model: client(params.modelId),
system: systemPrompt,
messages: messages,
maxOutputTokens: params.maxTokens,
maxTokens: params.maxTokens,
temperature: params.temperature
});

@@ -531,7 +545,7 @@ Generate ${subtaskCount} subtasks based on the original task context. Return ONL
model: client(params.modelId),
system: systemPrompt,
messages: messages,
maxOutputTokens: params.maxTokens,
maxTokens: params.maxTokens,
temperature: params.temperature
});

@@ -589,8 +603,8 @@ Generate ${subtaskCount} subtasks based on the original task context. Return ONL
system: systemPrompt,
messages: messages,
schema: params.schema,
mode: this.needsExplicitJsonSchema ? 'json' : 'auto',
maxOutputTokens: params.maxTokens,
mode: 'json', // Use json mode instead of auto for Gemini
maxTokens: params.maxTokens,
temperature: params.temperature
});


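The Gemini CLI hunk replaces a static import with a cached dynamic import, so the optional `ai-sdk-provider-gemini-cli` dependency only needs to be installed when the provider is actually used. A sketch of that lazy-load-and-cache pattern; here the built-in `node:path` stands in for the optional module so the example is self-contained:

```javascript
// Load a module on first use, cache it, and convert a missing optional
// dependency into a clear install-time error message.
let cachedModule;

async function loadOnce(specifier) {
  if (!cachedModule) {
    try {
      cachedModule = await import(specifier);
    } catch {
      throw new Error(
        `${specifier} is not installed. Install it to use this provider.`
      );
    }
  }
  return cachedModule;
}

(async () => {
  const mod = await loadOnce('node:path');
  console.log(typeof mod.join); // 'function'
})();
```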
@@ -3,7 +3,7 @@
* AI provider implementation for Grok models using Grok CLI.
*/

import { createGrokCli } from '@tm/ai-sdk-provider-grok-cli';
import { createGrokCli } from './custom-sdk/grok-cli/index.js';
import { BaseAIProvider } from './base-provider.js';
import { getGrokCliSettingsForCommand } from '../../scripts/modules/config-manager.js';

@@ -11,8 +11,6 @@ export class GrokCliProvider extends BaseAIProvider {
constructor() {
super();
this.name = 'Grok CLI';
// Grok CLI requires explicit JSON schema mode
this.needsExplicitJsonSchema = true;
}

/**

@@ -1,9 +1,9 @@
/**
* ollama.js
* AI provider implementation for Ollama models using the ollama-ai-provider-v2 package.
* AI provider implementation for Ollama models using the ollama-ai-provider package.
*/

import { createOllama } from 'ollama-ai-provider-v2';
import { createOllama } from 'ollama-ai-provider';
import { BaseAIProvider } from './base-provider.js';

export class OllamaAIProvider extends BaseAIProvider {

@@ -20,6 +20,16 @@ export class OpenAIProvider extends BaseAIProvider {
return 'OPENAI_API_KEY';
}

/**
* Determines if a model requires max_completion_tokens instead of maxTokens
* GPT-5 models require max_completion_tokens parameter
* @param {string} modelId - The model ID to check
* @returns {boolean} True if the model requires max_completion_tokens
*/
requiresMaxCompletionTokens(modelId) {
return modelId && modelId.startsWith('gpt-5');
}

/**
* Creates and returns an OpenAI client instance.
* @param {object} params - Parameters for client initialization

@@ -44,8 +44,8 @@
},
"prompts": {
"default": {
"system": "You are an expert software architect and project manager analyzing task complexity. Your analysis should consider implementation effort, technical challenges, dependencies, and testing requirements.\n\nIMPORTANT: For each task, provide an analysis object with ALL of the following fields:\n- taskId: The ID of the task being analyzed (positive integer)\n- taskTitle: The title of the task\n- complexityScore: A score from 1-10 indicating complexity\n- recommendedSubtasks: Number of subtasks recommended (non-negative integer; 0 if no expansion needed)\n- expansionPrompt: A prompt to guide subtask generation\n- reasoning: Your reasoning for the complexity score",
"user": "{{#if hasCodebaseAnalysis}}## IMPORTANT: Codebase Analysis Required\n\nYou have access to powerful codebase analysis tools. Before analyzing task complexity:\n\n1. Use the Glob tool to explore the project structure and understand the codebase size\n2. Use the Grep tool to search for existing implementations related to each task\n3. Use the Read tool to examine key files that would be affected by these tasks\n4. Understand the current implementation state, patterns used, and technical debt\n\nBased on your codebase analysis:\n- Assess complexity based on ACTUAL code that needs to be modified/created\n- Consider existing abstractions and patterns that could simplify implementation\n- Identify tasks that require refactoring vs. greenfield development\n- Factor in dependencies between existing code and new features\n- Provide more accurate subtask recommendations based on real code structure\n\nProject Root: {{projectRoot}}\n\n{{/if}}Analyze the following tasks to determine their complexity (1-10 scale) and recommend the number of subtasks for expansion. Provide a brief reasoning and an initial expansion prompt for each.{{#if useResearch}} Consider current best practices, common implementation patterns, and industry standards in your analysis.{{/if}}\n\nTasks:\n{{{json tasks}}}\n{{#if gatheredContext}}\n\n# Project Context\n\n{{gatheredContext}}\n{{/if}}\n"
"system": "You are an expert software architect and project manager analyzing task complexity. Respond only with the requested valid JSON array.",
"user": "{{#if hasCodebaseAnalysis}}## IMPORTANT: Codebase Analysis Required\n\nYou have access to powerful codebase analysis tools. Before analyzing task complexity:\n\n1. Use the Glob tool to explore the project structure and understand the codebase size\n2. Use the Grep tool to search for existing implementations related to each task\n3. Use the Read tool to examine key files that would be affected by these tasks\n4. Understand the current implementation state, patterns used, and technical debt\n\nBased on your codebase analysis:\n- Assess complexity based on ACTUAL code that needs to be modified/created\n- Consider existing abstractions and patterns that could simplify implementation\n- Identify tasks that require refactoring vs. greenfield development\n- Factor in dependencies between existing code and new features\n- Provide more accurate subtask recommendations based on real code structure\n\nProject Root: {{projectRoot}}\n\n{{/if}}Analyze the following tasks to determine their complexity (1-10 scale) and recommend the number of subtasks for expansion. Provide a brief reasoning and an initial expansion prompt for each.{{#if useResearch}} Consider current best practices, common implementation patterns, and industry standards in your analysis.{{/if}}\n\nTasks:\n{{{json tasks}}}\n{{#if gatheredContext}}\n\n# Project Context\n\n{{gatheredContext}}\n{{/if}}\n\nRespond ONLY with a valid JSON array matching the schema:\n[\n {\n \"taskId\": <number>,\n \"taskTitle\": \"<string>\",\n \"complexityScore\": <number 1-10>,\n \"recommendedSubtasks\": <number>,\n \"expansionPrompt\": \"<string>\",\n \"reasoning\": \"<string>\"\n },\n ...\n]\n\nDo not include any explanatory text, markdown formatting, or code block markers before or after the JSON array."
}
}
}
@@ -68,18 +68,17 @@
"prompts": {
"complexity-report": {
"condition": "expansionPrompt",
"system": "You are an AI assistant helping with task breakdown. Generate {{#if (gt subtaskCount 0)}}exactly {{subtaskCount}}{{else}}an appropriate number of{{/if}} subtasks based on the provided prompt and context.\n\nIMPORTANT: Each subtask must include ALL of the following fields:\n- id: MUST be sequential integers starting EXACTLY from {{nextSubtaskId}}. First subtask id={{nextSubtaskId}}, second id={{nextSubtaskId}}+1, etc. DO NOT use any other numbering pattern!\n- title: A clear, actionable title (5-200 characters)\n- description: A detailed description (minimum 10 characters)\n- dependencies: An array of task IDs this subtask depends on (can be empty [])\n- details: Implementation details (minimum 20 characters)\n- status: Must be \"pending\" for new subtasks\n- testStrategy: Testing approach (can be null)",
"user": "Break down the following task:\n\nParent Task:\nID: {{task.id}}\nTitle: {{task.title}}\nDescription: {{task.description}}\nCurrent details: {{#if task.details}}{{task.details}}{{else}}None{{/if}}\n\n{{expansionPrompt}}{{#if additionalContext}}\n\n{{additionalContext}}{{/if}}{{#if complexityReasoningContext}}\n\n{{complexityReasoningContext}}{{/if}}{{#if gatheredContext}}\n\n# Project Context\n\n{{gatheredContext}}{{/if}}\n\nGenerate {{#if (gt subtaskCount 0)}}exactly {{subtaskCount}}{{else}}an appropriate number of{{/if}} subtasks. CRITICAL: Use sequential IDs starting from {{nextSubtaskId}} (first={{nextSubtaskId}}, second={{nextSubtaskId}}+1, etc.)."
"system": "You are an AI assistant helping with task breakdown. Generate {{#if (gt subtaskCount 0)}}exactly {{subtaskCount}}{{else}}an appropriate number of{{/if}} subtasks based on the provided prompt and context.\nRespond ONLY with a valid JSON object containing a single key \"subtasks\" whose value is an array of the generated subtask objects.\nEach subtask object in the array must have keys: \"id\", \"title\", \"description\", \"dependencies\", \"details\", \"status\".\nEnsure the 'id' starts from {{nextSubtaskId}} and is sequential.\nFor 'dependencies', use the full subtask ID format: \"{{task.id}}.1\", \"{{task.id}}.2\", etc. Only reference subtasks within this same task.\nEnsure 'status' is 'pending'.\nDo not include any other text or explanation.",
"user": "Break down the following task based on the analysis prompt:\n\nParent Task:\nID: {{task.id}}\nTitle: {{task.title}}\nDescription: {{task.description}}\nCurrent details: {{#if task.details}}{{task.details}}{{else}}None{{/if}}\n\nExpansion Guidance:\n{{expansionPrompt}}{{#if additionalContext}}\n\n{{additionalContext}}{{/if}}{{#if complexityReasoningContext}}\n\n{{complexityReasoningContext}}{{/if}}{{#if gatheredContext}}\n\n# Project Context\n\n{{gatheredContext}}{{/if}}\n\nGenerate {{#if (gt subtaskCount 0)}}exactly {{subtaskCount}}{{else}}an appropriate number of{{/if}} subtasks with sequential IDs starting from {{nextSubtaskId}}."
},
"research": {
"condition": "useResearch === true && !expansionPrompt",
"system": "You are an AI assistant with research capabilities analyzing and breaking down software development tasks.\n\nIMPORTANT: Each subtask must include ALL of the following fields:\n- id: MUST be sequential integers starting EXACTLY from {{nextSubtaskId}}. First subtask id={{nextSubtaskId}}, second id={{nextSubtaskId}}+1, etc. DO NOT use any other numbering pattern!\n- title: A clear, actionable title (5-200 characters)\n- description: A detailed description (minimum 10 characters)\n- dependencies: An array of task IDs this subtask depends on (can be empty [])\n- details: Implementation details (minimum 20 characters)\n- status: Must be \"pending\" for new subtasks\n- testStrategy: Testing approach (can be null)",
"user": "{{#if hasCodebaseAnalysis}}## IMPORTANT: Codebase Analysis Required\n\nYou have access to powerful codebase analysis tools. Before generating subtasks:\n\n1. Use the Glob tool to explore relevant files for this task (e.g., \"**/*.js\", \"src/**/*.ts\")\n2. Use the Grep tool to search for existing implementations related to this task\n3. Use the Read tool to examine files that would be affected by this task\n4. Understand the current implementation state and patterns used\n\nBased on your analysis:\n- Identify existing code that relates to this task\n- Understand patterns and conventions to follow\n- Generate subtasks that integrate smoothly with existing code\n- Ensure subtasks are specific and actionable based on the actual codebase\n\nProject Root: {{projectRoot}}\n\n{{/if}}Analyze the following task and break it down into {{#if (gt subtaskCount 0)}}exactly {{subtaskCount}}{{else}}an appropriate number of{{/if}} specific subtasks. Each subtask should be actionable and well-defined.\n\nParent Task:\nID: {{task.id}}\nTitle: {{task.title}}\nDescription: {{task.description}}\nCurrent details: {{#if task.details}}{{task.details}}{{else}}None{{/if}}{{#if additionalContext}}\nConsider this context: {{additionalContext}}{{/if}}{{#if complexityReasoningContext}}\nComplexity Analysis Reasoning: {{complexityReasoningContext}}{{/if}}{{#if gatheredContext}}\n\n# Project Context\n\n{{gatheredContext}}{{/if}}\n\nCRITICAL: You MUST use sequential IDs starting from {{nextSubtaskId}}. The first subtask MUST have id={{nextSubtaskId}}, the second MUST have id={{nextSubtaskId}}+1, and so on. Do NOT use parent task ID in subtask numbering!"
"system": "You are an AI assistant that responds ONLY with valid JSON objects as requested. The object should contain a 'subtasks' array.",
"user": "{{#if hasCodebaseAnalysis}}## IMPORTANT: Codebase Analysis Required\n\nYou have access to powerful codebase analysis tools. Before generating subtasks:\n\n1. Use the Glob tool to explore relevant files for this task (e.g., \"**/*.js\", \"src/**/*.ts\")\n2. Use the Grep tool to search for existing implementations related to this task\n3. Use the Read tool to examine files that would be affected by this task\n4. Understand the current implementation state and patterns used\n\nBased on your analysis:\n- Identify existing code that relates to this task\n- Understand patterns and conventions to follow\n- Generate subtasks that integrate smoothly with existing code\n- Ensure subtasks are specific and actionable based on the actual codebase\n\nProject Root: {{projectRoot}}\n\n{{/if}}Analyze the following task and break it down into {{#if (gt subtaskCount 0)}}exactly {{subtaskCount}}{{else}}an appropriate number of{{/if}} specific subtasks using your research capabilities. Assign sequential IDs starting from {{nextSubtaskId}}.\n\nParent Task:\nID: {{task.id}}\nTitle: {{task.title}}\nDescription: {{task.description}}\nCurrent details: {{#if task.details}}{{task.details}}{{else}}None{{/if}}{{#if additionalContext}}\nConsider this context: {{additionalContext}}{{/if}}{{#if complexityReasoningContext}}\nComplexity Analysis Reasoning: {{complexityReasoningContext}}{{/if}}{{#if gatheredContext}}\n\n# Project Context\n\n{{gatheredContext}}{{/if}}\n\nCRITICAL: Respond ONLY with a valid JSON object containing a single key \"subtasks\". The value must be an array of the generated subtasks, strictly matching this structure:\n\n{\n \"subtasks\": [\n {\n \"id\": <number>, // Sequential ID starting from {{nextSubtaskId}}\n \"title\": \"<string>\",\n \"description\": \"<string>\",\n \"dependencies\": [\"<string>\"], // Use full subtask IDs like [\"{{task.id}}.1\", \"{{task.id}}.2\"]. If no dependencies, use an empty array [].\n \"details\": \"<string>\",\n \"testStrategy\": \"<string>\" // Optional\n },\n // ... (repeat for {{#if (gt subtaskCount 0)}}{{subtaskCount}}{{else}}appropriate number of{{/if}} subtasks)\n ]\n}\n\nImportant: For the 'dependencies' field, if a subtask has no dependencies, you MUST use an empty array, for example: \"dependencies\": []. Do not use null or omit the field.\n\nDo not include ANY explanatory text, markdown, or code block markers. Just the JSON object."
},
"default": {
"system": "You are an AI assistant helping with task breakdown for software development. Break down high-level tasks into specific, actionable subtasks that can be implemented sequentially.\n\nIMPORTANT: Each subtask must include ALL of the following fields:\n- id: MUST be sequential integers starting EXACTLY from {{nextSubtaskId}}. First subtask id={{nextSubtaskId}}, second id={{nextSubtaskId}}+1, etc. DO NOT use any other numbering pattern!\n- title: A clear, actionable title (5-200 characters)\n- description: A detailed description (minimum 10 characters)\n- dependencies: An array of task IDs this subtask depends on (can be empty [])\n- details: Implementation details (minimum 20 characters)\n- status: Must be \"pending\" for new subtasks\n- testStrategy: Testing approach (can be null)",
"user": "{{#if hasCodebaseAnalysis}}## IMPORTANT: Codebase Analysis Required\n\nYou have access to powerful codebase analysis tools. Before generating subtasks:\n\n1. Use the Glob tool to explore relevant files for this task (e.g., \"**/*.js\", \"src/**/*.ts\")\n2. Use the Grep tool to search for existing implementations related to this task\n3. Use the Read tool to examine files that would be affected by this task\n4. Understand the current implementation state and patterns used\n\nBased on your analysis:\n- Identify existing code that relates to this task\n- Understand patterns and conventions to follow\n- Generate subtasks that integrate smoothly with existing code\n- Ensure subtasks are specific and actionable based on the actual codebase\n\nProject Root: {{projectRoot}}\n\n{{/if}}Break down this task into {{#if (gt subtaskCount 0)}}exactly {{subtaskCount}}{{else}}an appropriate number of{{/if}} specific subtasks:\n\nTask ID: {{task.id}}\nTitle: {{task.title}}\nDescription: {{task.description}}\nCurrent details: {{#if task.details}}{{task.details}}{{else}}None{{/if}}{{#if additionalContext}}\nAdditional context: {{additionalContext}}{{/if}}{{#if complexityReasoningContext}}\nComplexity Analysis Reasoning: {{complexityReasoningContext}}{{/if}}{{#if gatheredContext}}\n\n# Project Context\n\n{{gatheredContext}}{{/if}}\n\nCRITICAL: You MUST use sequential IDs starting from {{nextSubtaskId}}. The first subtask MUST have id={{nextSubtaskId}}, the second MUST have id={{nextSubtaskId}}+1, and so on. Do NOT use parent task ID in subtask numbering!"
"system": "You are an AI assistant helping with task breakdown for software development.\nYou need to break down a high-level task into {{#if (gt subtaskCount 0)}}{{subtaskCount}}{{else}}an appropriate number of{{/if}} specific subtasks that can be implemented one by one.\n\nSubtasks should:\n1. Be specific and actionable implementation steps\n2. Follow a logical sequence\n3. Each handle a distinct part of the parent task\n4. Include clear guidance on implementation approach\n5. Have appropriate dependency chains between subtasks (using full subtask IDs)\n6. Collectively cover all aspects of the parent task\n\nFor each subtask, provide:\n- id: Sequential integer starting from the provided nextSubtaskId\n- title: Clear, specific title\n- description: Detailed description\n- dependencies: Array of prerequisite subtask IDs using full format like [\"{{task.id}}.1\", \"{{task.id}}.2\"]\n- details: Implementation details, the output should be in string\n- testStrategy: Optional testing approach\n\nRespond ONLY with a valid JSON object containing a single key \"subtasks\" whose value is an array matching the structure described. Do not include any explanatory text, markdown formatting, or code block markers.",
"user": "{{#if hasCodebaseAnalysis}}## IMPORTANT: Codebase Analysis Required\n\nYou have access to powerful codebase analysis tools. Before generating subtasks:\n\n1. Use the Glob tool to explore relevant files for this task (e.g., \"**/*.js\", \"src/**/*.ts\")\n2. Use the Grep tool to search for existing implementations related to this task\n3. Use the Read tool to examine files that would be affected by this task\n4. Understand the current implementation state and patterns used\n\nBased on your analysis:\n- Identify existing code that relates to this task\n- Understand patterns and conventions to follow\n- Generate subtasks that integrate smoothly with existing code\n- Ensure subtasks are specific and actionable based on the actual codebase\n\nProject Root: {{projectRoot}}\n\n{{/if}}Break down this task into {{#if (gt subtaskCount 0)}}exactly {{subtaskCount}}{{else}}an appropriate number of{{/if}} specific subtasks:\n\nTask ID: {{task.id}}\nTitle: {{task.title}}\nDescription: {{task.description}}\nCurrent details: {{#if task.details}}{{task.details}}{{else}}None{{/if}}{{#if additionalContext}}\nAdditional context: {{additionalContext}}{{/if}}{{#if complexityReasoningContext}}\nComplexity Analysis Reasoning: {{complexityReasoningContext}}{{/if}}{{#if gatheredContext}}\n\n# Project Context\n\n{{gatheredContext}}{{/if}}\n\nReturn ONLY the JSON object containing the \"subtasks\" array, matching this structure:\n\n{\n \"subtasks\": [\n {\n \"id\": {{nextSubtaskId}}, // First subtask ID\n \"title\": \"Specific subtask title\",\n \"description\": \"Detailed description\",\n \"dependencies\": [], // e.g., [\"{{task.id}}.1\", \"{{task.id}}.2\"] for dependencies. Use empty array [] if no dependencies\n \"details\": \"Implementation guidance\",\n \"testStrategy\": \"Optional testing approach\"\n },\n // ... (repeat for {{#if (gt subtaskCount 0)}}a total of {{subtaskCount}}{{else}}an appropriate number of{{/if}} subtasks with sequential IDs)\n ]\n}"
}
}
}
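The expand-task prompts above all pin the model to the same output contract: a single `subtasks` key whose array uses sequential numeric IDs starting at `nextSubtaskId`, with `dependencies` always an array (never null or omitted). A minimal sketch of checking that contract after parsing a response — field names come from the prompt text, but this helper is illustrative, not part of the task-master-ai codebase:

```javascript
// Illustrative validator for the "subtasks" contract the prompts above demand.
// Not the project's actual schema validation (which uses Zod); a sketch only.
function validateSubtasksResponse(response, nextSubtaskId) {
  if (!response || !Array.isArray(response.subtasks)) return false;
  return response.subtasks.every((subtask, i) =>
    subtask.id === nextSubtaskId + i &&   // sequential numeric IDs from nextSubtaskId
    typeof subtask.title === "string" &&
    typeof subtask.description === "string" &&
    Array.isArray(subtask.dependencies)   // empty array allowed; null is not
  );
}

// A conforming response with nextSubtaskId = 3:
console.log(validateSubtasksResponse(
  { subtasks: [{ id: 3, title: "Add schema", description: "Define the schema", dependencies: [] }] },
  3
)); // true
```

A response that used string IDs (`"3"`) or parent-prefixed numbering would fail this check, which is exactly the failure mode the prompt's CRITICAL instructions try to prevent.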
@@ -56,8 +56,8 @@
},
"prompts": {
"default": {
"system": "You are an AI assistant specialized in analyzing Product Requirements Documents (PRDs) and generating a structured, logically ordered, dependency-aware and sequenced list of development tasks in JSON format.{{#if research}}\nBefore breaking down the PRD into tasks, you will:\n1. Research and analyze the latest technologies, libraries, frameworks, and best practices that would be appropriate for this project\n2. Identify any potential technical challenges, security concerns, or scalability issues not explicitly mentioned in the PRD without discarding any explicit requirements or going overboard with complexity -- always aim to provide the most direct path to implementation, avoiding over-engineering or roundabout approaches\n3. Consider current industry standards and evolving trends relevant to this project (this step aims to solve LLM hallucinations and out of date information due to training data cutoff dates)\n4. Evaluate alternative implementation approaches and recommend the most efficient path\n5. Include specific library versions, helpful APIs, and concrete implementation guidance based on your research\n6. Always aim to provide the most direct path to implementation, avoiding over-engineering or roundabout approaches\n\nYour task breakdown should incorporate this research, resulting in more detailed implementation guidance, more accurate dependency mapping, and more precise technology recommendations than would be possible from the PRD text alone, while maintaining all explicit requirements and best practices and all details and nuances of the PRD.{{/if}}\n\nAnalyze the provided PRD content and generate {{#if (gt numTasks 0)}}approximately {{numTasks}}{{else}}an appropriate number of{{/if}} top-level development tasks. If the complexity or the level of detail of the PRD is high, generate more tasks relative to the complexity of the PRD\nEach task should represent a logical unit of work needed to implement the requirements and focus on the most direct and effective way to implement the requirements without unnecessary complexity or overengineering. Include pseudo-code, implementation details, and test strategy for each task. Find the most up to date information to implement each task.\nAssign sequential IDs starting from {{nextId}}. Infer title, description, details, and test strategy for each task based *only* on the PRD content.\nSet status to 'pending', dependencies to an empty array [], and priority to '{{defaultTaskPriority}}' initially for all tasks.\nGenerate a response containing a single key \"tasks\", where the value is an array of task objects adhering to the provided schema.\n\nEach task should follow this JSON structure:\n{\n\t\"id\": number,\n\t\"title\": string,\n\t\"description\": string,\n\t\"status\": \"pending\",\n\t\"dependencies\": number[] (IDs of tasks this depends on),\n\t\"priority\": \"high\" | \"medium\" | \"low\",\n\t\"details\": string (implementation details),\n\t\"testStrategy\": string (validation approach)\n}\n\nGuidelines:\n1. {{#if (gt numTasks 0)}}Unless complexity warrants otherwise{{else}}Depending on the complexity{{/if}}, create {{#if (gt numTasks 0)}}exactly {{numTasks}}{{else}}an appropriate number of{{/if}} tasks, numbered sequentially starting from {{nextId}}\n2. Each task should be atomic and focused on a single responsibility following the most up to date best practices and standards\n3. Order tasks logically - consider dependencies and implementation sequence\n4. Early tasks should focus on setup, core functionality first, then advanced features\n5. Include clear validation/testing approach for each task\n6. Set appropriate dependency IDs (a task can only depend on tasks with lower IDs, potentially including existing tasks with IDs less than {{nextId}} if applicable)\n7. Assign priority (high/medium/low) based on criticality and dependency order\n8. Include detailed implementation guidance in the \"details\" field{{#if research}}, with specific libraries and version recommendations based on your research{{/if}}\n9. If the PRD contains specific requirements for libraries, database schemas, frameworks, tech stacks, or any other implementation details, STRICTLY ADHERE to these requirements in your task breakdown and do not discard them under any circumstance\n10. Focus on filling in any gaps left by the PRD or areas that aren't fully specified, while preserving all explicit requirements\n11. Always aim to provide the most direct path to implementation, avoiding over-engineering or roundabout approaches{{#if research}}\n12. For each task, include specific, actionable guidance based on current industry standards and best practices discovered through research{{/if}}",
"user": "{{#if hasCodebaseAnalysis}}## IMPORTANT: Codebase Analysis Required\n\nYou have access to powerful codebase analysis tools. Before generating tasks:\n\n1. Use the Glob tool to explore the project structure (e.g., \"**/*.js\", \"**/*.json\", \"**/README.md\")\n2. Use the Grep tool to search for existing implementations, patterns, and technologies\n3. Use the Read tool to examine key files like package.json, README.md, and main entry points\n4. Analyze the current state of implementation to understand what already exists\n\nBased on your analysis:\n- Identify what components/features are already implemented\n- Understand the technology stack, frameworks, and patterns in use\n- Generate tasks that build upon the existing codebase rather than duplicating work\n- Ensure tasks align with the project's current architecture and conventions\n\nProject Root: {{projectRoot}}\n\n{{/if}}Here's the Product Requirements Document (PRD) to break down into {{#if (gt numTasks 0)}}approximately {{numTasks}}{{else}}an appropriate number of{{/if}} tasks, starting IDs from {{nextId}}:{{#if research}}\n\nRemember to thoroughly research current best practices and technologies before task breakdown to provide specific, actionable implementation details.{{/if}}\n\n{{prdContent}}\n\nIMPORTANT: Your response must be a JSON object with a single property named \"tasks\" containing an array of task objects. Do NOT include metadata or any other properties."
"system": "You are an AI assistant specialized in analyzing Product Requirements Documents (PRDs) and generating a structured, logically ordered, dependency-aware and sequenced list of development tasks in JSON format.{{#if research}}\nBefore breaking down the PRD into tasks, you will:\n1. Research and analyze the latest technologies, libraries, frameworks, and best practices that would be appropriate for this project\n2. Identify any potential technical challenges, security concerns, or scalability issues not explicitly mentioned in the PRD without discarding any explicit requirements or going overboard with complexity -- always aim to provide the most direct path to implementation, avoiding over-engineering or roundabout approaches\n3. Consider current industry standards and evolving trends relevant to this project (this step aims to solve LLM hallucinations and out of date information due to training data cutoff dates)\n4. Evaluate alternative implementation approaches and recommend the most efficient path\n5. Include specific library versions, helpful APIs, and concrete implementation guidance based on your research\n6. Always aim to provide the most direct path to implementation, avoiding over-engineering or roundabout approaches\n\nYour task breakdown should incorporate this research, resulting in more detailed implementation guidance, more accurate dependency mapping, and more precise technology recommendations than would be possible from the PRD text alone, while maintaining all explicit requirements and best practices and all details and nuances of the PRD.{{/if}}\n\nAnalyze the provided PRD content and generate {{#if (gt numTasks 0)}}approximately {{numTasks}}{{else}}an appropriate number of{{/if}} top-level development tasks. If the complexity or the level of detail of the PRD is high, generate more tasks relative to the complexity of the PRD\nEach task should represent a logical unit of work needed to implement the requirements and focus on the most direct and effective way to implement the requirements without unnecessary complexity or overengineering. Include pseudo-code, implementation details, and test strategy for each task. Find the most up to date information to implement each task.\nAssign sequential IDs starting from {{nextId}}. Infer title, description, details, and test strategy for each task based *only* on the PRD content.\nSet status to 'pending', dependencies to an empty array [], and priority to '{{defaultTaskPriority}}' initially for all tasks.\nRespond ONLY with a valid JSON object containing a single key \"tasks\", where the value is an array of task objects adhering to the provided Zod schema. Do not include any explanation or markdown formatting.\n\nEach task should follow this JSON structure:\n{\n\t\"id\": number,\n\t\"title\": string,\n\t\"description\": string,\n\t\"status\": \"pending\",\n\t\"dependencies\": number[] (IDs of tasks this depends on),\n\t\"priority\": \"high\" | \"medium\" | \"low\",\n\t\"details\": string (implementation details),\n\t\"testStrategy\": string (validation approach)\n}\n\nGuidelines:\n1. {{#if (gt numTasks 0)}}Unless complexity warrants otherwise{{else}}Depending on the complexity{{/if}}, create {{#if (gt numTasks 0)}}exactly {{numTasks}}{{else}}an appropriate number of{{/if}} tasks, numbered sequentially starting from {{nextId}}\n2. Each task should be atomic and focused on a single responsibility following the most up to date best practices and standards\n3. Order tasks logically - consider dependencies and implementation sequence\n4. Early tasks should focus on setup, core functionality first, then advanced features\n5. Include clear validation/testing approach for each task\n6. Set appropriate dependency IDs (a task can only depend on tasks with lower IDs, potentially including existing tasks with IDs less than {{nextId}} if applicable)\n7. Assign priority (high/medium/low) based on criticality and dependency order\n8. Include detailed implementation guidance in the \"details\" field{{#if research}}, with specific libraries and version recommendations based on your research{{/if}}\n9. If the PRD contains specific requirements for libraries, database schemas, frameworks, tech stacks, or any other implementation details, STRICTLY ADHERE to these requirements in your task breakdown and do not discard them under any circumstance\n10. Focus on filling in any gaps left by the PRD or areas that aren't fully specified, while preserving all explicit requirements\n11. Always aim to provide the most direct path to implementation, avoiding over-engineering or roundabout approaches{{#if research}}\n12. For each task, include specific, actionable guidance based on current industry standards and best practices discovered through research{{/if}}",
"user": "{{#if hasCodebaseAnalysis}}## IMPORTANT: Codebase Analysis Required\n\nYou have access to powerful codebase analysis tools. Before generating tasks:\n\n1. Use the Glob tool to explore the project structure (e.g., \"**/*.js\", \"**/*.json\", \"**/README.md\")\n2. Use the Grep tool to search for existing implementations, patterns, and technologies\n3. Use the Read tool to examine key files like package.json, README.md, and main entry points\n4. Analyze the current state of implementation to understand what already exists\n\nBased on your analysis:\n- Identify what components/features are already implemented\n- Understand the technology stack, frameworks, and patterns in use\n- Generate tasks that build upon the existing codebase rather than duplicating work\n- Ensure tasks align with the project's current architecture and conventions\n\nProject Root: {{projectRoot}}\n\n{{/if}}Here's the Product Requirements Document (PRD) to break down into {{#if (gt numTasks 0)}}approximately {{numTasks}}{{else}}an appropriate number of{{/if}} tasks, starting IDs from {{nextId}}:{{#if research}}\n\nRemember to thoroughly research current best practices and technologies before task breakdown to provide specific, actionable implementation details.{{/if}}\n\n{{prdContent}}\n\n\n\t\tReturn your response in this format:\n{\n \"tasks\": [\n {\n \"id\": 1,\n \"title\": \"Setup Project Repository\",\n \"description\": \"...\",\n ...\n },\n ...\n ],\n \"metadata\": {\n \"projectName\": \"PRD Implementation\",\n \"totalTasks\": {{#if (gt numTasks 0)}}{{numTasks}}{{else}}{number of tasks}{{/if}},\n \"sourceFile\": \"{{prdPath}}\",\n \"generatedAt\": \"YYYY-MM-DD\"\n }\n}"
}
}
}
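Both versions of the parse-prd prompt lean on the same Handlebars-style conditional, `{{#if (gt numTasks 0)}}approximately {{numTasks}}{{else}}an appropriate number of{{/if}}`, to switch between a pinned task count and a free choice. A hypothetical sketch of how that clause resolves — the project renders these templates with its own engine, and this standalone function only mirrors the effect of the `gt` helper:

```javascript
// Mirrors {{#if (gt numTasks 0)}}approximately {{numTasks}}{{else}}an appropriate number of{{/if}}.
// Hypothetical helper for illustration; not the project's template engine.
function renderTaskCountClause(numTasks) {
  return numTasks > 0
    ? `approximately ${numTasks}`   // pinned-count branch
    : "an appropriate number of";   // free-choice branch (0 or unset count)
}

console.log(renderTaskCountClause(12)); // "approximately 12"
console.log(renderTaskCountClause(0));  // "an appropriate number of"
```

Note the `gt` comparison rather than plain truthiness: a `numTasks` of 0 deliberately falls through to the else branch, letting the model pick its own count.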
@@ -59,13 +59,13 @@
},
"prompts": {
"default": {
"system": "You are an AI assistant helping to update a software development task based on new context.{{#if useResearch}} You have access to current best practices and latest technical information to provide research-backed updates.{{/if}}\nYou will be given a task and a prompt describing changes or new implementation details.\nYour job is to update the task to reflect these changes, while preserving its basic structure.\n\nGuidelines:\n1. VERY IMPORTANT: NEVER change the title of the task - keep it exactly as is\n2. Maintain the same ID, status, and dependencies unless specifically mentioned in the prompt{{#if useResearch}}\n3. Research and update the description, details, and test strategy with current best practices\n4. Include specific versions, libraries, and approaches that are current and well-tested{{/if}}{{#if (not useResearch)}}\n3. Update the description, details, and test strategy to reflect the new information\n4. Do not change anything unnecessarily - just adapt what needs to change based on the prompt{{/if}}\n5. Return the complete updated task\n6. VERY IMPORTANT: Preserve all subtasks marked as \"done\" or \"completed\" - do not modify their content\n7. For tasks with completed subtasks, build upon what has already been done rather than rewriting everything\n8. If an existing completed subtask needs to be changed/undone based on the new context, DO NOT modify it directly\n9. Instead, add a new subtask that clearly indicates what needs to be changed or replaced\n10. Use the existence of completed subtasks as an opportunity to make new subtasks more specific and targeted\n11. Ensure any new subtasks have unique IDs that don't conflict with existing ones\n12. CRITICAL: For subtask IDs, use ONLY numeric values (1, 2, 3, etc.) NOT strings (\"1\", \"2\", \"3\")\n13. CRITICAL: Subtask IDs should start from 1 and increment sequentially (1, 2, 3...) - do NOT use parent task ID as prefix{{#if useResearch}}\n14. Include links to documentation or resources where helpful\n15. Focus on practical, implementable solutions using current technologies{{/if}}\n\nThe changes described in the prompt should be thoughtfully applied to make the task more accurate and actionable.",
"user": "{{#if hasCodebaseAnalysis}}## IMPORTANT: Codebase Analysis Required\n\nYou have access to powerful codebase analysis tools. Before updating the task:\n\n1. Use the Glob tool to explore the project structure (e.g., \"**/*.js\", \"**/*.json\", \"**/README.md\")\n2. Use the Grep tool to search for existing implementations, patterns, and technologies\n3. Use the Read tool to examine relevant files and understand current implementation\n4. Analyze how the task changes relate to the existing codebase\n\nBased on your analysis:\n- Update task details to reference specific files, functions, or patterns from the codebase\n- Ensure implementation details align with the project's current architecture\n- Include specific code examples or file references where appropriate\n- Consider how changes impact existing components\n\nProject Root: {{projectRoot}}\n\n{{/if}}Here is the task to update{{#if useResearch}} with research-backed information{{/if}}:\n{{{taskJson}}}\n\nPlease {{#if useResearch}}research and {{/if}}update this task based on the following {{#if useResearch}}context:\n{{updatePrompt}}\n\nIncorporate current best practices, latest stable versions, and proven approaches.{{/if}}{{#if (not useResearch)}}new context:\n{{updatePrompt}}{{/if}}\n\nIMPORTANT: {{#if useResearch}}Preserve any subtasks marked as \"done\" or \"completed\".{{/if}}{{#if (not useResearch)}}In the task JSON above, any subtasks with \"status\": \"done\" or \"status\": \"completed\" should be preserved exactly as is. Build your changes around these completed items.{{/if}}\n{{#if gatheredContext}}\n\n# Project Context\n\n{{gatheredContext}}\n{{/if}}\n\nReturn the complete updated task{{#if useResearch}} with research-backed improvements{{/if}}.\n\nIMPORTANT: Your response must be a JSON object with a single property named \"task\" containing the updated task object."
"system": "You are an AI assistant helping to update a software development task based on new context.{{#if useResearch}} You have access to current best practices and latest technical information to provide research-backed updates.{{/if}}\nYou will be given a task and a prompt describing changes or new implementation details.\nYour job is to update the task to reflect these changes, while preserving its basic structure.\n\nGuidelines:\n1. VERY IMPORTANT: NEVER change the title of the task - keep it exactly as is\n2. Maintain the same ID, status, and dependencies unless specifically mentioned in the prompt{{#if useResearch}}\n3. Research and update the description, details, and test strategy with current best practices\n4. Include specific versions, libraries, and approaches that are current and well-tested{{/if}}{{#if (not useResearch)}}\n3. Update the description, details, and test strategy to reflect the new information\n4. Do not change anything unnecessarily - just adapt what needs to change based on the prompt{{/if}}\n5. Return a complete valid JSON object representing the updated task\n6. VERY IMPORTANT: Preserve all subtasks marked as \"done\" or \"completed\" - do not modify their content\n7. For tasks with completed subtasks, build upon what has already been done rather than rewriting everything\n8. If an existing completed subtask needs to be changed/undone based on the new context, DO NOT modify it directly\n9. Instead, add a new subtask that clearly indicates what needs to be changed or replaced\n10. Use the existence of completed subtasks as an opportunity to make new subtasks more specific and targeted\n11. Ensure any new subtasks have unique IDs that don't conflict with existing ones\n12. CRITICAL: For subtask IDs, use ONLY numeric values (1, 2, 3, etc.) NOT strings (\"1\", \"2\", \"3\")\n13. CRITICAL: Subtask IDs should start from 1 and increment sequentially (1, 2, 3...) - do NOT use parent task ID as prefix{{#if useResearch}}\n14. Include links to documentation or resources where helpful\n15. Focus on practical, implementable solutions using current technologies{{/if}}\n\nThe changes described in the prompt should be thoughtfully applied to make the task more accurate and actionable.",
"user": "{{#if hasCodebaseAnalysis}}## IMPORTANT: Codebase Analysis Required\n\nYou have access to powerful codebase analysis tools. Before updating the task:\n\n1. Use the Glob tool to explore the project structure (e.g., \"**/*.js\", \"**/*.json\", \"**/README.md\")\n2. Use the Grep tool to search for existing implementations, patterns, and technologies\n3. Use the Read tool to examine relevant files and understand current implementation\n4. Analyze how the task changes relate to the existing codebase\n\nBased on your analysis:\n- Update task details to reference specific files, functions, or patterns from the codebase\n- Ensure implementation details align with the project's current architecture\n- Include specific code examples or file references where appropriate\n- Consider how changes impact existing components\n\nProject Root: {{projectRoot}}\n\n{{/if}}Here is the task to update{{#if useResearch}} with research-backed information{{/if}}:\n{{{taskJson}}}\n\nPlease {{#if useResearch}}research and {{/if}}update this task based on the following {{#if useResearch}}context:\n{{updatePrompt}}\n\nIncorporate current best practices, latest stable versions, and proven approaches.{{/if}}{{#if (not useResearch)}}new context:\n{{updatePrompt}}{{/if}}\n\nIMPORTANT: {{#if useResearch}}Preserve any subtasks marked as \"done\" or \"completed\".{{/if}}{{#if (not useResearch)}}In the task JSON above, any subtasks with \"status\": \"done\" or \"status\": \"completed\" should be preserved exactly as is. Build your changes around these completed items.{{/if}}\n{{#if gatheredContext}}\n\n# Project Context\n\n{{gatheredContext}}\n{{/if}}\n\nReturn only the updated task as a valid JSON object{{#if useResearch}} with research-backed improvements{{/if}}."
},
"append": {
"condition": "appendMode === true",
"system": "You are an AI assistant helping to append additional information to a software development task. You will be provided with the task's existing details, context, and a user request string.\n\nYour Goal: Based *only* on the user's request and all the provided context (including existing details if relevant to the request), GENERATE the new text content that should be added to the task's details.\nFocus *only* on generating the substance of the update.\n\nOutput Requirements:\n1. Return *only* the newly generated text content as a plain string. Do NOT return a JSON object or any other structured data.\n2. Your string response should NOT include any of the task's original details, unless the user's request explicitly asks to rephrase, summarize, or directly modify existing text.\n3. Do NOT include any timestamps, XML-like tags, markdown, or any other special formatting in your string response.\n4. Ensure the generated text is concise yet complete for the update based on the user request. Avoid conversational fillers or explanations about what you are doing (e.g., do not start with \"Okay, here's the update...\").",
"user": "{{#if hasCodebaseAnalysis}}## IMPORTANT: Codebase Analysis Required\n\nYou have access to powerful codebase analysis tools. Before generating the task update:\n\n1. Use the Glob tool to explore the project structure (e.g., \"**/*.js\", \"**/*.json\", \"**/README.md\")\n2. Use the Grep tool to search for existing implementations, patterns, and technologies\n3. Use the Read tool to examine relevant files and understand current implementation\n4. Analyze the current codebase to inform your update\n\nBased on your analysis:\n- Include specific file references, code patterns, or implementation details\n- Ensure suggestions align with the project's current architecture\n- Reference existing components or patterns when relevant\n\nProject Root: {{projectRoot}}\n\n{{/if}}Task Context:\n\nTask: {{{json task}}}\nCurrent Task Details (for context only):\n{{currentDetails}}\n\nUser Request: \"{{updatePrompt}}\"\n\nBased on the User Request and all the Task Context (including current task details provided above), what is the new information or text that should be appended to this task's details? Return this new text as a plain string.\n{{#if gatheredContext}}\n\n# Additional Project Context\n\n{{gatheredContext}}\n{{/if}}"
"user": "{{#if hasCodebaseAnalysis}}## IMPORTANT: Codebase Analysis Required\n\nYou have access to powerful codebase analysis tools. Before generating the task update:\n\n1. Use the Glob tool to explore the project structure (e.g., \"**/*.js\", \"**/*.json\", \"**/README.md\")\n2. Use the Grep tool to search for existing implementations, patterns, and technologies\n3. Use the Read tool to examine relevant files and understand current implementation\n4. Analyze the current codebase to inform your update\n\nBased on your analysis:\n- Include specific file references, code patterns, or implementation details\n- Ensure suggestions align with the project's current architecture\n- Reference existing components or patterns when relevant\n\nProject Root: {{projectRoot}}\n\n{{/if}}Task Context:\n\nTask: {{{json task}}}\nCurrent Task Details (for context only):\n{{currentDetails}}\n\nUser Request: \"{{updatePrompt}}\"\n\nBased on the User Request and all the Task Context (including current task details provided above), what is the new information or text that should be appended to this task's details? Return ONLY this new text as a plain string.\n{{#if gatheredContext}}\n\n# Additional Project Context\n\n{{gatheredContext}}\n{{/if}}"
}
}
}
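Several rules in the update-task prompts above amount to one merge invariant: subtasks already marked "done" or "completed" survive verbatim, and the model's rewrite may only add or alter the rest. A sketch of that invariant as a post-processing step — a hypothetical helper, not the project's actual merge logic:

```javascript
// Enforces the "preserve completed subtasks" rule from the update prompts:
// done/completed subtasks are kept exactly as-is, and any model rewrite of
// them is discarded. Illustrative only; not the project's implementation.
function mergePreservingCompleted(originalSubtasks, updatedSubtasks) {
  const isDone = (s) => s.status === "done" || s.status === "completed";
  const preserved = originalSubtasks.filter(isDone);
  const preservedIds = new Set(preserved.map((s) => s.id));
  // Keep the model's version only for subtasks that were not completed.
  const rest = updatedSubtasks.filter((s) => !preservedIds.has(s.id));
  return [...preserved, ...rest];
}
```

With a guard like this downstream, a model response that rewrote a completed subtask would simply have that rewrite dropped, so the prompt-level instruction is defense in depth rather than the only line of protection.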
@@ -43,8 +43,8 @@
},
"prompts": {
"default": {
"system": "You are an AI assistant helping to update software development tasks based on new context.\nYou will be given a set of tasks and a prompt describing changes or new implementation details.\nYour job is to update the tasks to reflect these changes, while preserving their basic structure.\n\nGuidelines:\n1. Maintain the same IDs, statuses, and dependencies unless specifically mentioned in the prompt\n2. Update titles, descriptions, details, and test strategies to reflect the new information\n3. Do not change anything unnecessarily - just adapt what needs to change based on the prompt\n4. Return ALL the tasks in order, not just the modified ones\n5. VERY IMPORTANT: Preserve all subtasks marked as \"done\" or \"completed\" - do not modify their content\n6. For tasks with completed subtasks, build upon what has already been done rather than rewriting everything\n7. If an existing completed subtask needs to be changed/undone based on the new context, DO NOT modify it directly\n8. Instead, add a new subtask that clearly indicates what needs to be changed or replaced\n9. Use the existence of completed subtasks as an opportunity to make new subtasks more specific and targeted",
|
||||
"user": "{{#if hasCodebaseAnalysis}}## IMPORTANT: Codebase Analysis Required\n\nYou have access to powerful codebase analysis tools. Before updating tasks:\n\n1. Use the Glob tool to explore the project structure (e.g., \"**/*.js\", \"**/*.json\", \"**/README.md\")\n2. Use the Grep tool to search for existing implementations, patterns, and technologies\n3. Use the Read tool to examine relevant files and understand current implementation\n4. Analyze how the new changes relate to the existing codebase\n\nBased on your analysis:\n- Update task details to reference specific files, functions, or patterns from the codebase\n- Ensure implementation details align with the project's current architecture\n- Include specific code examples or file references where appropriate\n- Consider how changes impact existing components\n\nProject Root: {{projectRoot}}\n\n{{/if}}Here are the tasks to update:\n{{{json tasks}}}\n\nPlease update these tasks based on the following new context:\n{{updatePrompt}}\n\nIMPORTANT: In the tasks above, any subtasks with \"status\": \"done\" or \"status\": \"completed\" should be preserved exactly as is. Build your changes around these completed items.{{#if projectContext}}\n\n# Project Context\n\n{{projectContext}}{{/if}}\n\nIMPORTANT: Your response must be a JSON object with a single property named \"tasks\" containing the updated array of tasks."
|
||||
"system": "You are an AI assistant helping to update software development tasks based on new context.\nYou will be given a set of tasks and a prompt describing changes or new implementation details.\nYour job is to update the tasks to reflect these changes, while preserving their basic structure.\n\nCRITICAL RULES:\n1. Return ONLY a JSON array - no explanations, no markdown, no additional text before or after\n2. Each task MUST have ALL fields from the original (do not omit any fields)\n3. Maintain the same IDs, statuses, and dependencies unless specifically mentioned in the prompt\n4. Update titles, descriptions, details, and test strategies to reflect the new information\n5. Do not change anything unnecessarily - just adapt what needs to change based on the prompt\n6. You should return ALL the tasks in order, not just the modified ones\n7. Return a complete valid JSON array with all tasks\n8. VERY IMPORTANT: Preserve all subtasks marked as \"done\" or \"completed\" - do not modify their content\n9. For tasks with completed subtasks, build upon what has already been done rather than rewriting everything\n10. If an existing completed subtask needs to be changed/undone based on the new context, DO NOT modify it directly\n11. Instead, add a new subtask that clearly indicates what needs to be changed or replaced\n12. Use the existence of completed subtasks as an opportunity to make new subtasks more specific and targeted\n\nThe changes described in the prompt should be applied to ALL tasks in the list.",
|
||||
"user": "{{#if hasCodebaseAnalysis}}## IMPORTANT: Codebase Analysis Required\n\nYou have access to powerful codebase analysis tools. Before updating tasks:\n\n1. Use the Glob tool to explore the project structure (e.g., \"**/*.js\", \"**/*.json\", \"**/README.md\")\n2. Use the Grep tool to search for existing implementations, patterns, and technologies\n3. Use the Read tool to examine relevant files and understand current implementation\n4. Analyze how the new changes relate to the existing codebase\n\nBased on your analysis:\n- Update task details to reference specific files, functions, or patterns from the codebase\n- Ensure implementation details align with the project's current architecture\n- Include specific code examples or file references where appropriate\n- Consider how changes impact existing components\n\nProject Root: {{projectRoot}}\n\n{{/if}}Here are the tasks to update:\n{{{json tasks}}}\n\nPlease update these tasks based on the following new context:\n{{updatePrompt}}\n\nIMPORTANT: In the tasks JSON above, any subtasks with \"status\": \"done\" or \"status\": \"completed\" should be preserved exactly as is. 
Build your changes around these completed items.{{#if projectContext}}\n\n# Project Context\n\n{{projectContext}}{{/if}}\n\nRequired JSON structure for EACH task (ALL fields MUST be present):\n{\n \"id\": <number>,\n \"title\": <string>,\n \"description\": <string>,\n \"status\": <string>,\n \"dependencies\": <array>,\n \"priority\": <string or null>,\n \"details\": <string or null>,\n \"testStrategy\": <string or null>,\n \"subtasks\": <array or null>\n}\n\nReturn a valid JSON array containing ALL the tasks with ALL their fields:\n- id (number) - preserve existing value\n- title (string)\n- description (string)\n- status (string) - preserve existing value unless explicitly changing\n- dependencies (array) - preserve existing value unless explicitly changing\n- priority (string or null)\n- details (string or null)\n- testStrategy (string or null)\n- subtasks (array or null)\n\nReturn ONLY the JSON array now:"
|
||||
}
|
||||
}
|
||||
}
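The `{{#if …}}` blocks and `{{variable}}` placeholders in the prompt templates above are filled in at render time. As a rough illustration of those substitution semantics only — a hand-rolled sketch, not the template engine the project actually uses (triple-stache `{{{json …}}}` helpers are not handled here):

```javascript
// Minimal illustrative renderer for {{#if name}}...{{/if}} and {{name}}
// placeholders. Conditionals are resolved first, then simple substitutions.
function renderTemplate(template, vars) {
	// Keep a conditional block's body only when the named variable is truthy.
	let out = template.replace(
		/\{\{#if (\w+)\}\}([\s\S]*?)\{\{\/if\}\}/g,
		(_, name, body) => (vars[name] ? body : '')
	);
	// Then substitute simple {{name}} placeholders with their values.
	out = out.replace(/\{\{(\w+)\}\}/g, (_, name) =>
		vars[name] !== undefined ? String(vars[name]) : ''
	);
	return out;
}

const tpl = 'Hello {{name}}.{{#if extra}} Extra: {{extra}}.{{/if}}';
console.log(renderTemplate(tpl, { name: 'world' })); // → "Hello world."
console.log(renderTemplate(tpl, { name: 'world', extra: 'ctx' })); // → "Hello world. Extra: ctx."
```

The two-pass order matters: dropping or keeping whole conditional blocks first means placeholders inside a dropped block never leak into the output.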

@@ -1,21 +0,0 @@
import { z } from 'zod';

// Schema that matches the inline AiTaskDataSchema from add-task.js
export const AddTaskResponseSchema = z.object({
	title: z.string().describe('Clear, concise title for the task'),
	description: z
		.string()
		.describe('A one or two sentence description of the task'),
	details: z
		.string()
		.describe('In-depth implementation details, considerations, and guidance'),
	testStrategy: z
		.string()
		.describe('Detailed approach for verifying task completion'),
	dependencies: z
		.array(z.number())
		.nullable()
		.describe(
			'Array of task IDs that this task depends on (must be completed before this task can start)'
		)
});

@@ -1,14 +0,0 @@
import { z } from 'zod';

export const ComplexityAnalysisItemSchema = z.object({
	taskId: z.number().int().positive(),
	taskTitle: z.string(),
	complexityScore: z.number().min(1).max(10),
	recommendedSubtasks: z.number().int().nonnegative(),
	expansionPrompt: z.string(),
	reasoning: z.string()
});

export const ComplexityAnalysisResponseSchema = z.object({
	complexityAnalysis: z.array(ComplexityAnalysisItemSchema)
});

@@ -1,35 +0,0 @@
import { z } from 'zod';

// Base schemas that will be reused across commands
export const TaskStatusSchema = z.enum([
	'pending',
	'in-progress',
	'blocked',
	'done',
	'cancelled',
	'deferred'
]);

export const BaseTaskSchema = z.object({
	id: z.number().int().positive(),
	title: z.string().min(1).max(200),
	description: z.string().min(1),
	status: TaskStatusSchema,
	dependencies: z.array(z.union([z.number().int(), z.string()])).default([]),
	priority: z
		.enum(['low', 'medium', 'high', 'critical'])
		.nullable()
		.default(null),
	details: z.string().nullable().default(null),
	testStrategy: z.string().nullable().default(null)
});

export const SubtaskSchema = z.object({
	id: z.number().int().positive(),
	title: z.string().min(5).max(200),
	description: z.string().min(10),
	dependencies: z.array(z.number().int()).default([]),
	details: z.string().min(20),
	status: z.enum(['pending', 'done', 'completed']).default('pending'),
	testStrategy: z.string().nullable().default(null)
});
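The `.default()` and `.nullable()` chains in these base schemas mean a task that parses successfully always carries every field. A plain-JavaScript stand-in for that normalization (zod itself is not assumed installed here; note zod's `.default()` fires on `undefined`, which `??` only approximates, since `??` also replaces explicit `null`):

```javascript
// Illustrative equivalent of the defaults BaseTaskSchema applies after parsing.
function normalizeTask(raw) {
	return {
		id: raw.id,
		title: raw.title,
		description: raw.description,
		status: raw.status,
		dependencies: raw.dependencies ?? [], // .default([])
		priority: raw.priority ?? null, // .nullable().default(null)
		details: raw.details ?? null,
		testStrategy: raw.testStrategy ?? null
	};
}

const t = normalizeTask({ id: 1, title: 'A', description: 'B', status: 'pending' });
console.log(t.dependencies.length, t.priority); // → 0 null
```

Downstream code can then rely on `task.dependencies` always being an array and the optional fields always being present, even when the AI response omitted them.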

@@ -1,6 +0,0 @@
import { z } from 'zod';
import { SubtaskSchema } from './base-schemas.js';

export const ExpandTaskResponseSchema = z.object({
	subtasks: z.array(SubtaskSchema)
});

@@ -1,18 +0,0 @@
import { z } from 'zod';

// Schema for a single task from PRD parsing
const PRDSingleTaskSchema = z.object({
	id: z.number().int().positive(),
	title: z.string().min(1),
	description: z.string().min(1),
	details: z.string().nullable(),
	testStrategy: z.string().nullable(),
	priority: z.enum(['high', 'medium', 'low']).nullable(),
	dependencies: z.array(z.number().int().positive()).nullable(),
	status: z.string().nullable()
});

// Schema for the AI response - only expects tasks array since metadata is generated by the code
export const ParsePRDResponseSchema = z.object({
	tasks: z.array(PRDSingleTaskSchema)
});

@@ -1,27 +0,0 @@
import { AddTaskResponseSchema } from './add-task.js';
import { ComplexityAnalysisResponseSchema } from './analyze-complexity.js';
import { ExpandTaskResponseSchema } from './expand-task.js';
import { ParsePRDResponseSchema } from './parse-prd.js';
import { UpdateSubtaskResponseSchema } from './update-subtask.js';
import { UpdateTaskResponseSchema } from './update-task.js';
import { UpdateTasksResponseSchema } from './update-tasks.js';

export const COMMAND_SCHEMAS = {
	'update-tasks': UpdateTasksResponseSchema,
	'expand-task': ExpandTaskResponseSchema,
	'analyze-complexity': ComplexityAnalysisResponseSchema,
	'update-subtask-by-id': UpdateSubtaskResponseSchema,
	'update-task-by-id': UpdateTaskResponseSchema,
	'add-task': AddTaskResponseSchema,
	'parse-prd': ParsePRDResponseSchema
};

// Export individual schemas for direct access
export * from './update-tasks.js';
export * from './expand-task.js';
export * from './analyze-complexity.js';
export * from './update-subtask.js';
export * from './update-task.js';
export * from './add-task.js';
export * from './parse-prd.js';
export * from './base-schemas.js';
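The `COMMAND_SCHEMAS` map above is a command-name-to-schema registry: the caller looks up the schema for the current command and validates the AI response against it. The dispatch pattern looks roughly like this sketch — stand-in boolean validators replace the zod schemas, and `validateResponse` is a hypothetical helper, not an export of this module:

```javascript
// Stand-in registry mirroring the COMMAND_SCHEMAS lookup pattern.
const COMMAND_VALIDATORS = {
	'add-task': (o) =>
		typeof o.title === 'string' && typeof o.description === 'string',
	'expand-task': (o) => Array.isArray(o.subtasks)
};

// Look up the validator for a command and apply it to the AI response.
function validateResponse(commandName, response) {
	const validate = COMMAND_VALIDATORS[commandName];
	if (!validate) {
		throw new Error(`No schema registered for command: ${commandName}`);
	}
	return validate(response);
}

console.log(validateResponse('add-task', { title: 'T', description: 'D' })); // → true
console.log(validateResponse('expand-task', { subtasks: [] })); // → true
```

Centralizing the lookup this way means adding a command only requires registering one new entry, and an unregistered command fails loudly instead of skipping validation.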

@@ -1,6 +0,0 @@
import { z } from 'zod';
import { SubtaskSchema } from './base-schemas.js';

export const UpdateSubtaskResponseSchema = z.object({
	subtask: SubtaskSchema
});

@@ -1,6 +0,0 @@
import { z } from 'zod';
import { UpdatedTaskSchema } from './update-tasks.js';

export const UpdateTaskResponseSchema = z.object({
	task: UpdatedTaskSchema
});

@@ -1,10 +0,0 @@
import { z } from 'zod';
import { BaseTaskSchema, SubtaskSchema } from './base-schemas.js';

export const UpdatedTaskSchema = BaseTaskSchema.extend({
	subtasks: z.array(SubtaskSchema).nullable().default(null)
});

export const UpdateTasksResponseSchema = z.object({
	tasks: z.array(UpdatedTaskSchema)
});

@@ -1,52 +0,0 @@
import { jest } from '@jest/globals';

// Mock AI SDK functions at the top level
jest.unstable_mockModule('ai', () => ({
	generateObject: jest.fn(),
	generateText: jest.fn(),
	streamText: jest.fn(),
	streamObject: jest.fn(),
	zodSchema: jest.fn(),
	JSONParseError: class JSONParseError extends Error {},
	NoObjectGeneratedError: class NoObjectGeneratedError extends Error {}
}));

// Mock CLI failure scenario
jest.unstable_mockModule('ai-sdk-provider-claude-code', () => ({
	createClaudeCode: jest.fn(() => {
		throw new Error('Claude Code CLI not found');
	})
}));

// Import the provider after mocking
const { ClaudeCodeProvider } = await import(
	'../../src/ai-providers/claude-code.js'
);

describe('Claude Code Error Handling', () => {
	beforeEach(() => {
		jest.clearAllMocks();
	});

	it('should throw a CLI-not-available error (with or without commandName)', () => {
		const provider = new ClaudeCodeProvider();
		expect(() => provider.getClient()).toThrow(
			/Claude Code CLI not available/i
		);
		expect(() => provider.getClient({ commandName: 'test' })).toThrow(
			/Claude Code CLI not available/i
		);
	});

	it('should still support basic provider functionality', () => {
		const provider = new ClaudeCodeProvider();

		// These should work even if CLI is not available
		expect(provider.name).toBe('Claude Code');
		expect(provider.getSupportedModels()).toEqual(['sonnet', 'opus']);
		expect(provider.isModelSupported('sonnet')).toBe(true);
		expect(provider.isModelSupported('haiku')).toBe(false);
		expect(provider.isRequiredApiKey()).toBe(false);
		expect(() => provider.validateAuth()).not.toThrow();
	});
});
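The error-handling tests above exercise a lazy optional-dependency pattern: the heavy SDK import is deferred until first use and the failure is cached, so the provider can still be constructed (and report its name, supported models, and so on) when the package is missing. A minimal stand-alone sketch of that pattern — `makeLazyLoader` and `loadImpl` are illustrative names, with `loadImpl` standing in for the dynamic `import()`:

```javascript
// Build a loader that attempts the import once and caches the result,
// translating the raw module-not-found failure into a friendly message.
function makeLazyLoader(loadImpl) {
	let cached = null; // cache the promise so the import runs at most once
	return function load() {
		if (!cached) {
			cached = Promise.resolve()
				.then(loadImpl)
				.catch(() => {
					throw new Error(
						'SDK is not installed. Please install the optional package to use this provider.'
					);
				});
		}
		return cached;
	};
}

let attempts = 0;
const load = makeLazyLoader(() => {
	attempts += 1;
	throw new Error('Cannot find module'); // simulate the package being absent
});

// Both calls reject with the friendly message, but the import runs only once.
load().catch((e) => console.log(e.message));
load().catch(() => console.log('attempts:', attempts));
```

Caching the rejected promise is what makes the "should only attempt to load package once" behavior cheap: every later call observes the same cached failure instead of retrying the import.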

@@ -1,128 +1,95 @@
import { jest } from '@jest/globals';

// Mock AI SDK functions at the top level
const generateText = jest.fn();
const streamText = jest.fn();

jest.unstable_mockModule('ai', () => ({
	generateObject: jest.fn(),
	generateText,
	streamText,
	streamObject: jest.fn(),
	zodSchema: jest.fn(),
	JSONParseError: class JSONParseError extends Error {},
	NoObjectGeneratedError: class NoObjectGeneratedError extends Error {}
// Mock the base provider to avoid circular dependencies
jest.unstable_mockModule('../../src/ai-providers/base-provider.js', () => ({
	BaseAIProvider: class {
		constructor() {
			this.name = 'Base Provider';
		}
		handleError(context, error) {
			throw error;
		}
	}
}));

// Mock successful provider creation for all tests
const mockProvider = jest.fn((modelId) => ({
	id: modelId,
	doGenerate: jest.fn(),
	doStream: jest.fn()
}));
mockProvider.languageModel = jest.fn((id, settings) => ({ id, settings }));
mockProvider.chat = mockProvider.languageModel;
// Mock the claude-code SDK to simulate it not being installed
jest.unstable_mockModule('@anthropic-ai/claude-code', () => {
	throw new Error("Cannot find module '@anthropic-ai/claude-code'");
});

jest.unstable_mockModule('ai-sdk-provider-claude-code', () => ({
	createClaudeCode: jest.fn(() => mockProvider)
}));

// Import the provider after mocking
// Import after mocking
const { ClaudeCodeProvider } = await import(
	'../../src/ai-providers/claude-code.js'
);

describe('Claude Code Integration (Optional)', () => {
	beforeEach(() => {
		jest.clearAllMocks();
	});

	it('should create a working provider instance', () => {
		const provider = new ClaudeCodeProvider();
		expect(provider.name).toBe('Claude Code');
		expect(provider.getSupportedModels()).toEqual(['sonnet', 'opus']);
	});

	it('should support model validation', () => {
		const provider = new ClaudeCodeProvider();
		expect(provider.isModelSupported('sonnet')).toBe(true);
		expect(provider.isModelSupported('opus')).toBe(true);
		expect(provider.isModelSupported('haiku')).toBe(false);
		expect(provider.isModelSupported('unknown')).toBe(false);
	});

	it('should create a client successfully', () => {
		const provider = new ClaudeCodeProvider();
		const client = provider.getClient();

		expect(client).toBeDefined();
		expect(typeof client).toBe('function');
		expect(client.languageModel).toBeDefined();
		expect(client.chat).toBeDefined();
		expect(client.chat).toBe(client.languageModel);
	});

	it('should pass command-specific settings to client', async () => {
		const provider = new ClaudeCodeProvider();
		const client = provider.getClient({ commandName: 'test-command' });

		expect(client).toBeDefined();
		expect(typeof client).toBe('function');
		const { createClaudeCode } = await import('ai-sdk-provider-claude-code');
		expect(createClaudeCode).toHaveBeenCalledTimes(1);
	});

	it('should handle AI SDK generateText integration', async () => {
		const provider = new ClaudeCodeProvider();
		const client = provider.getClient();

		// Mock successful generation
		generateText.mockResolvedValueOnce({
			text: 'Hello from Claude Code!',
			usage: { totalTokens: 10 }
describe('Claude Code Optional Dependency Integration', () => {
	describe('when @anthropic-ai/claude-code is not installed', () => {
		it('should allow provider instantiation', () => {
			// Provider should instantiate without error
			const provider = new ClaudeCodeProvider();
			expect(provider).toBeDefined();
			expect(provider.name).toBe('Claude Code');
		});

		const result = await generateText({
			model: client('sonnet'),
			messages: [{ role: 'user', content: 'Hello' }]
		it('should allow client creation', () => {
			const provider = new ClaudeCodeProvider();
			// Client creation should work
			const client = provider.getClient({});
			expect(client).toBeDefined();
			expect(typeof client).toBe('function');
		});

		expect(result.text).toBe('Hello from Claude Code!');
		expect(generateText).toHaveBeenCalledWith({
			model: expect.any(Object),
			messages: [{ role: 'user', content: 'Hello' }]
		it('should fail with clear error when trying to use the model', async () => {
			const provider = new ClaudeCodeProvider();
			const client = provider.getClient({});
			const model = client('opus');

			// The actual usage should fail with the lazy loading error
			await expect(
				model.doGenerate({
					prompt: [{ role: 'user', content: 'Hello' }],
					mode: { type: 'regular' }
				})
			).rejects.toThrow(
				"Claude Code SDK is not installed. Please install '@anthropic-ai/claude-code' to use the claude-code provider."
			);
		});

		it('should provide helpful error message for streaming', async () => {
			const provider = new ClaudeCodeProvider();
			const client = provider.getClient({});
			const model = client('sonnet');

			await expect(
				model.doStream({
					prompt: [{ role: 'user', content: 'Hello' }],
					mode: { type: 'regular' }
				})
			).rejects.toThrow(
				"Claude Code SDK is not installed. Please install '@anthropic-ai/claude-code' to use the claude-code provider."
			);
		});
	});

	it('should handle AI SDK streamText integration', async () => {
		const provider = new ClaudeCodeProvider();
		const client = provider.getClient();

		// Mock successful streaming
		const mockStream = {
			textStream: (async function* () {
				yield 'Streamed response';
			})()
		};
		streamText.mockResolvedValueOnce(mockStream);

		const streamResult = await streamText({
			model: client('sonnet'),
			messages: [{ role: 'user', content: 'Stream test' }]
	describe('provider behavior', () => {
		it('should not require API key', () => {
			const provider = new ClaudeCodeProvider();
			// Should not throw
			expect(() => provider.validateAuth()).not.toThrow();
			expect(() => provider.validateAuth({ apiKey: null })).not.toThrow();
		});

		expect(streamResult.textStream).toBeDefined();
		expect(streamText).toHaveBeenCalledWith({
			model: expect.any(Object),
			messages: [{ role: 'user', content: 'Stream test' }]
		});
	});
	it('should work with ai-services-unified when provider is configured', async () => {
		// This tests that the provider can be selected but will fail appropriately
		// when the actual model is used
		const provider = new ClaudeCodeProvider();
		expect(provider).toBeDefined();

		it('should not require authentication validation', () => {
			const provider = new ClaudeCodeProvider();
			expect(provider.isRequiredApiKey()).toBe(false);
			expect(() => provider.validateAuth()).not.toThrow();
			expect(() => provider.validateAuth({})).not.toThrow();
			expect(() => provider.validateAuth({ commandName: 'test' })).not.toThrow();
		// In real usage, ai-services-unified would:
		// 1. Get the provider instance (works)
		// 2. Call provider.getClient() (works)
		// 3. Create a model (works)
		// 4. Try to generate (fails with clear error)
		});
	});
});

@@ -330,7 +330,7 @@ describe('Complex Cross-Tag Scenarios', () => {

	describe('Large Task Set Performance', () => {
		it('should handle large task sets efficiently', () => {
			// Create a large task set (50 tasks)
			// Create a large task set (100 tasks)
			const largeTaskSet = {
				master: {
					tasks: [],
@@ -348,8 +348,8 @@ describe('Complex Cross-Tag Scenarios', () => {
				}
			};

			// Add 25 tasks to master with dependencies
			for (let i = 1; i <= 25; i++) {
			// Add 50 tasks to master with dependencies
			for (let i = 1; i <= 50; i++) {
				largeTaskSet.master.tasks.push({
					id: i,
					title: `Task ${i}`,
@@ -359,8 +359,8 @@ describe('Complex Cross-Tag Scenarios', () => {
				});
			}

			// Add 25 tasks to in-progress (ensure no ID conflict with master)
			for (let i = 26; i <= 50; i++) {
			// Add 50 tasks to in-progress
			for (let i = 51; i <= 100; i++) {
				largeTaskSet['in-progress'].tasks.push({
					id: i,
					title: `Task ${i}`,
@@ -371,32 +371,21 @@ describe('Complex Cross-Tag Scenarios', () => {
			}

			fs.writeFileSync(tasksPath, JSON.stringify(largeTaskSet, null, 2));
			// Execute move; correctness is validated below (no timing assertion)
			// Should complete within reasonable time
			const timeout = process.env.CI ? 11000 : 6000;
			const startTime = Date.now();
			execSync(
				`node ${binPath} move --from=25 --from-tag=master --to-tag=in-progress --with-dependencies`,
				`node ${binPath} move --from=50 --from-tag=master --to-tag=in-progress --with-dependencies`,
				{ stdio: 'pipe' }
			);
			const endTime = Date.now();
			expect(endTime - startTime).toBeLessThan(timeout);

			// Verify the move was successful
			const tasksAfter = JSON.parse(fs.readFileSync(tasksPath, 'utf8'));

			// Verify all tasks in the dependency chain were moved
			for (let i = 1; i <= 25; i++) {
				expect(tasksAfter.master.tasks.find((t) => t.id === i)).toBeUndefined();
				expect(
					tasksAfter['in-progress'].tasks.find((t) => t.id === i)
				).toBeDefined();
			}

			// Verify in-progress still has its original tasks (26-50)
			for (let i = 26; i <= 50; i++) {
				expect(
					tasksAfter['in-progress'].tasks.find((t) => t.id === i)
				).toBeDefined();
			}

			// Final count check
			expect(tasksAfter['in-progress'].tasks).toHaveLength(50); // 25 moved + 25 original
			expect(
				tasksAfter['in-progress'].tasks.find((t) => t.id === 50)
			).toBeDefined();
		});
	});

@@ -1,20 +1,21 @@
import { jest } from '@jest/globals';

// Mock the ai-sdk-provider-claude-code package
jest.unstable_mockModule('ai-sdk-provider-claude-code', () => ({
	createClaudeCode: jest.fn(() => {
		const provider = (modelId, settings) => ({
			// Minimal mock language model surface
			id: modelId,
			settings,
			doGenerate: jest.fn(() => ({ text: 'ok', usage: {} })),
			doStream: jest.fn(() => ({ stream: true }))
		});
		provider.languageModel = jest.fn((id, settings) => ({ id, settings }));
		provider.chat = provider.languageModel;
		return provider;
// Mock the claude-code SDK module
jest.unstable_mockModule(
	'../../../src/ai-providers/custom-sdk/claude-code/index.js',
	() => ({
		createClaudeCode: jest.fn(() => {
			const provider = (modelId, settings) => ({
				// Mock language model
				id: modelId,
				settings
			});
			provider.languageModel = jest.fn((id, settings) => ({ id, settings }));
			provider.chat = provider.languageModel;
			return provider;
		})
	})
}));
);

// Mock the base provider
jest.unstable_mockModule('../../../src/ai-providers/base-provider.js', () => ({
@@ -73,14 +74,15 @@ describe('ClaudeCodeProvider', () => {
		expect(typeof client).toBe('function');
	});

	it('should create client without parameters', () => {
		const client = provider.getClient();
	it('should create client without API key or base URL', () => {
		const client = provider.getClient({});
		expect(client).toBeDefined();
	});

	it('should handle commandName parameter', () => {
	it('should handle params even though they are not used', () => {
		const client = provider.getClient({
			commandName: 'test-command'
			baseURL: 'https://example.com',
			apiKey: 'unused-key'
		});
		expect(client).toBeDefined();
	});
@@ -93,24 +95,12 @@ describe('ClaudeCodeProvider', () => {
		});
	});

	describe('model support', () => {
		it('should return supported models', () => {
			const models = provider.getSupportedModels();
			expect(models).toEqual(['sonnet', 'opus']);
		});

		it('should check if model is supported', () => {
			expect(provider.isModelSupported('sonnet')).toBe(true);
			expect(provider.isModelSupported('opus')).toBe(true);
			expect(provider.isModelSupported('haiku')).toBe(false);
			expect(provider.isModelSupported('unknown')).toBe(false);
		});
	});

	describe('error handling', () => {
		it('should handle client initialization errors', async () => {
			// Force an error by making createClaudeCode throw
			const { createClaudeCode } = await import('ai-sdk-provider-claude-code');
			const { createClaudeCode } = await import(
				'../../../src/ai-providers/custom-sdk/claude-code/index.js'
			);
			createClaudeCode.mockImplementationOnce(() => {
				throw new Error('Mock initialization error');
			});
@@ -0,0 +1,237 @@
import { jest } from '@jest/globals';

// Mock modules before importing
jest.unstable_mockModule('@ai-sdk/provider', () => ({
	NoSuchModelError: class NoSuchModelError extends Error {
		constructor({ modelId, modelType }) {
			super(`No such model: ${modelId}`);
			this.modelId = modelId;
			this.modelType = modelType;
		}
	}
}));

jest.unstable_mockModule('@ai-sdk/provider-utils', () => ({
	generateId: jest.fn(() => 'test-id-123')
}));

jest.unstable_mockModule(
	'../../../../../src/ai-providers/custom-sdk/claude-code/message-converter.js',
	() => ({
		convertToClaudeCodeMessages: jest.fn((prompt) => ({
			messagesPrompt: 'converted-prompt',
			systemPrompt: 'system'
		}))
	})
);

jest.unstable_mockModule(
	'../../../../../src/ai-providers/custom-sdk/claude-code/json-extractor.js',
	() => ({
		extractJson: jest.fn((text) => text)
	})
);

jest.unstable_mockModule(
	'../../../../../src/ai-providers/custom-sdk/claude-code/errors.js',
	() => ({
		createAPICallError: jest.fn((opts) => new Error(opts.message)),
		createAuthenticationError: jest.fn((opts) => new Error(opts.message))
	})
);

// This mock will be controlled by tests
let mockClaudeCodeModule = null;
jest.unstable_mockModule('@anthropic-ai/claude-code', () => {
	if (mockClaudeCodeModule) {
		return mockClaudeCodeModule;
	}
	throw new Error("Cannot find module '@anthropic-ai/claude-code'");
});

// Import the module under test
const { ClaudeCodeLanguageModel } = await import(
	'../../../../../src/ai-providers/custom-sdk/claude-code/language-model.js'
);

describe('ClaudeCodeLanguageModel', () => {
	beforeEach(() => {
		jest.clearAllMocks();
		// Reset the module mock
		mockClaudeCodeModule = null;
		// Clear module cache to ensure fresh imports
		jest.resetModules();
	});

	describe('constructor', () => {
		it('should initialize with valid model ID', () => {
			const model = new ClaudeCodeLanguageModel({
				id: 'opus',
				settings: { maxTurns: 5 }
			});

			expect(model.modelId).toBe('opus');
			expect(model.settings).toEqual({ maxTurns: 5 });
			expect(model.provider).toBe('claude-code');
		});

		it('should throw NoSuchModelError for invalid model ID', async () => {
			expect(
				() =>
					new ClaudeCodeLanguageModel({
						id: '',
						settings: {}
					})
			).toThrow('No such model: ');

			expect(
				() =>
					new ClaudeCodeLanguageModel({
						id: null,
						settings: {}
					})
			).toThrow('No such model: null');
		});
	});

	describe('lazy loading of @anthropic-ai/claude-code', () => {
		it('should throw error when package is not installed', async () => {
			// Keep mockClaudeCodeModule as null to simulate missing package
			const model = new ClaudeCodeLanguageModel({
				id: 'opus',
				settings: {}
			});

			await expect(
				model.doGenerate({
					prompt: [{ role: 'user', content: 'test' }],
					mode: { type: 'regular' }
				})
			).rejects.toThrow(
				"Claude Code SDK is not installed. Please install '@anthropic-ai/claude-code' to use the claude-code provider."
			);
		});

		it('should load package successfully when available', async () => {
			// Mock successful package load
			const mockQuery = jest.fn(async function* () {
				yield {
					type: 'assistant',
					message: { content: [{ type: 'text', text: 'Hello' }] }
				};
				yield {
					type: 'result',
					subtype: 'done',
					usage: { output_tokens: 10, input_tokens: 5 }
				};
			});

			mockClaudeCodeModule = {
				query: mockQuery,
				AbortError: class AbortError extends Error {}
			};

			// Need to re-import to get fresh module with mocks
			jest.resetModules();
			const { ClaudeCodeLanguageModel: FreshModel } = await import(
				'../../../../../src/ai-providers/custom-sdk/claude-code/language-model.js'
			);

			const model = new FreshModel({
				id: 'opus',
				settings: {}
			});

			const result = await model.doGenerate({
				prompt: [{ role: 'user', content: 'test' }],
				mode: { type: 'regular' }
			});

			expect(result.text).toBe('Hello');
			expect(mockQuery).toHaveBeenCalled();
		});

		it('should only attempt to load package once', async () => {
			// Get a fresh import to ensure clean state
			jest.resetModules();
			const { ClaudeCodeLanguageModel: TestModel } = await import(
				'../../../../../src/ai-providers/custom-sdk/claude-code/language-model.js'
			);

			const model = new TestModel({
				id: 'opus',
				settings: {}
			});

			// First call should throw
			await expect(
				model.doGenerate({
					prompt: [{ role: 'user', content: 'test' }],
					mode: { type: 'regular' }
				})
			).rejects.toThrow('Claude Code SDK is not installed');

			// Second call should also throw without trying to load again
			await expect(
				model.doGenerate({
					prompt: [{ role: 'user', content: 'test' }],
					mode: { type: 'regular' }
				})
			).rejects.toThrow('Claude Code SDK is not installed');
		});
	});

	describe('generateUnsupportedWarnings', () => {
		it('should generate warnings for unsupported parameters', () => {
			const model = new ClaudeCodeLanguageModel({
				id: 'opus',
				settings: {}
			});

			const warnings = model.generateUnsupportedWarnings({
				temperature: 0.7,
				maxTokens: 1000,
				topP: 0.9,
				seed: 42
			});

			expect(warnings).toHaveLength(4);
			expect(warnings[0]).toEqual({
				type: 'unsupported-setting',
				setting: 'temperature',
				details:
					'Claude Code CLI does not support the temperature parameter. It will be ignored.'
			});
		});

		it('should return empty array when no unsupported parameters', () => {
			const model = new ClaudeCodeLanguageModel({
				id: 'opus',
				settings: {}
			});

			const warnings = model.generateUnsupportedWarnings({});
			expect(warnings).toEqual([]);
		});
	});

	describe('getModel', () => {
		it('should map model IDs correctly', () => {
			const model = new ClaudeCodeLanguageModel({
				id: 'opus',
				settings: {}
			});

			expect(model.getModel()).toBe('opus');
		});

		it('should return unmapped model IDs as-is', () => {
			const model = new ClaudeCodeLanguageModel({
				id: 'custom-model',
				settings: {}
			});

			expect(model.getModel()).toBe('custom-model');
		});
	});
});
||||
@@ -524,7 +524,7 @@ describe('GeminiCliProvider', () => {
 				}),
 				system: 'You are a helpful assistant',
 				messages: [{ role: 'user', content: 'Hello' }],
-				maxTokens: 100,
+				maxOutputTokens: 100,
 				temperature: 0.7
 			});
 			expect(result.text).toBe('Hello! How can I help you?');
@@ -550,7 +550,7 @@ describe('GeminiCliProvider', () => {
 				}),
 				system: undefined,
 				messages: [{ role: 'user', content: 'Hello' }],
-				maxTokens: 100,
+				maxOutputTokens: 100,
 				temperature: 0.7
 			});
 		});
@@ -570,7 +570,7 @@ describe('GeminiCliProvider', () => {
 				}),
 				system: 'You are a helpful assistant',
 				messages: [{ role: 'user', content: 'Hello' }],
-				maxTokens: 100,
+				maxOutputTokens: 100,
 				temperature: 0.7
 			});
 			expect(result).toBe(mockStream);
@@ -609,7 +609,7 @@ describe('GeminiCliProvider', () => {
 				messages: [{ role: 'user', content: 'Hello' }],
 				schema: mockObjectParams.schema,
 				mode: 'json',
-				maxTokens: 100,
+				maxOutputTokens: 100,
 				temperature: 0.7
 			});
 			expect(result.object).toEqual({ result: 'success' });