Files
claude-task-master/tests/e2e/run_e2e.sh
Ralph Khreish b40139ca05 Release 0.18.0 (#840)
* Update SWE scores (#657)

* docs: Auto-update and format models.md

* feat: Flexible brand rules management (#460)

* chore(docs): update docs and rules related to model management.

* feat(ai): Add OpenRouter AI provider support

Integrates the OpenRouter AI provider using the Vercel AI SDK adapter (@openrouter/ai-sdk-provider). This allows users to configure and utilize models available through the OpenRouter platform.

- Added src/ai-providers/openrouter.js with standard Vercel AI SDK wrapper functions (generateText, streamText, generateObject).

- Updated ai-services-unified.js to include the OpenRouter provider in the PROVIDER_FUNCTIONS map and API key resolution logic.

- Verified config-manager.js handles OpenRouter API key checks correctly.

- Users can configure OpenRouter models via .taskmasterconfig using the task-master models command or MCP models tool. Requires OPENROUTER_API_KEY.

- Enhanced error handling in ai-services-unified.js to provide clearer messages when generateObjectService fails due to lack of underlying tool support in the selected model/provider endpoint.

* feat(cli): Add --status/-s filter flag to show command and get-task MCP tool

Implements the ability to filter subtasks displayed by the `task-master show <id>` command using the `--status` (or `-s`) flag. This is also available in the MCP context.

- Modified `commands.js` to add the `--status` option to the `show` command definition.

- Updated `utils.js` (`findTaskById`) to handle the filtering logic and return original subtask counts/arrays when filtering.

- Updated `ui.js` (`displayTaskById`) to use the filtered subtasks for the table, display a summary line when filtering, and use the original subtask list for the progress bar calculation.

- Updated MCP `get_task` tool and `showTaskDirect` function to accept and pass the `status` parameter.

- Added changeset entry.

* fix(tasks): Improve next task logic to be subtask-aware

* fix(tasks): Enable removing multiple tasks/subtasks via comma-separated IDs

- Refactors the core `removeTask` function (`task-manager/remove-task.js`) to accept and iterate over comma-separated task/subtask IDs.

- Updates dependency cleanup and file regeneration logic to run once after processing all specified IDs.

- Adjusts the `remove-task` CLI command (`commands.js`) description and confirmation prompt to handle multiple IDs correctly.

- Fixes a bug in the CLI confirmation prompt where task/subtask titles were not being displayed correctly.

- Updates the `remove_task` MCP tool description to reflect the new multi-ID capability.

This addresses the previously known issue where only the first ID in a comma-separated list was processed.

Closes #140

* Update README.md (#342)

* Update Discord badge (#337)

* refactor(init): Improve robustness and dependencies; Update template deps for AI SDKs; Silence npm install in MCP; Improve conditional model setup logic; Refactor init.js flags; Tweak Getting Started text; Fix MCP server launch command; Update default model in config template

* Refactor: Improve MCP logging, update E2E & tests

Refactors MCP server logging and updates testing infrastructure.

- MCP Server:

  - Replaced manual logger wrappers with centralized `createLogWrapper` utility.

  - Updated direct function calls to use `{ session, mcpLog }` context.

  - Removed deprecated `model` parameter from analyze, expand-all, expand-task tools.

  - Adjusted MCP tool import paths and parameter descriptions.

- Documentation:

  - Modified `docs/configuration.md`.

  - Modified `docs/tutorial.md`.

- Testing:

  - E2E Script (`run_e2e.sh`):

    - Removed `set -e`.

    - Added LLM analysis function (`analyze_log_with_llm`) & integration.

    - Adjusted test run directory creation timing.

    - Added debug echo statements.

  - Deleted Unit Tests: Removed `ai-client-factory.test.js`, `ai-client-utils.test.js`, `ai-services.test.js`.

  - Modified Fixtures: Updated `scripts/task-complexity-report.json`.

- Dev Scripts:

  - Modified `scripts/dev.js`.

* chore(tests): Passes tests for merge candidate
- Adjusted the interactive model default choice to be 'no change' instead of 'cancel setup'
- E2E script has been perfected and works as designed provided there are all provider API keys .env in the root
- Fixes the entire test suite to make sure it passes with the new architecture.
- Fixes dependency command to properly show there is a validation failure if there is one.
- Refactored config-manager.test.js mocking strategy and fixed assertions to read the real supported-models.json
- Fixed rule-transformer.test.js assertion syntax and transformation logic adjusting replacement for search which was too broad.
- Skip unstable tests in utils.test.js (log, readJSON, writeJSON error paths) due to SIGABRT crash. These tests trigger a native crash (SIGABRT), likely stemming from a conflict between internal chalk usage within the functions and Jest's test environment, possibly related to ESM module handling.

* chore(wtf): removes chai. not sure how that even made it in here. also removes duplicate test in scripts/.

* fix: ensure API key detection properly reads .env in MCP context

Problem:
- Task Master model configuration wasn't properly checking for API keys in the project's .env file when running through MCP
- The isApiKeySet function was only checking session.env and process.env but not inspecting the .env file directly
- This caused incorrect API key status reporting in MCP tools even when keys were properly set in .env

Solution:
- Modified resolveEnvVariable function in utils.js to properly read from .env file at projectRoot
- Updated isApiKeySet to correctly pass projectRoot to resolveEnvVariable
- Enhanced the key detection logic to have consistent behavior between CLI and MCP contexts
- Maintains the correct precedence: session.env → .env file → process.env

Testing:
- Verified working correctly with both MCP and CLI tools
- API keys properly detected in .env file in both contexts
- Deleted .cursor/mcp.json to confirm introspection of .env as fallback works

* fix(update): pass projectRoot through update command flow

Modified ai-services-unified.js, update.js tool, and update-tasks.js direct function to correctly pass projectRoot. This enables the .env file API key fallback mechanism for the update command when running via MCP, ensuring consistent key resolution with the CLI context.

* fix(analyze-complexity): pass projectRoot through analyze-complexity flow

Modified analyze-task-complexity.js core function, direct function, and analyze.js tool to correctly pass projectRoot. Fixed import error in tools/index.js. Added debug logging to _resolveApiKey in ai-services-unified.js. This enables the .env API key fallback for analyze_project_complexity.

* fix(add-task): pass projectRoot and fix logging/refs

Modified add-task core, direct function, and tool to pass projectRoot for .env API key fallback. Fixed logFn reference error and removed deprecated reportProgress call in core addTask function. Verified working.

* fix(parse-prd): pass projectRoot and fix schema/logging

Modified parse-prd core, direct function, and tool to pass projectRoot for .env API key fallback. Corrected Zod schema used in generateObjectService call. Fixed logFn reference error in core parsePRD. Updated unit test mock for utils.js.

* fix(update-task): pass projectRoot and adjust parsing

Modified update-task-by-id core, direct function, and tool to pass projectRoot. Reverted parsing logic in core function to prioritize `{...}` extraction, resolving parsing errors. Fixed ReferenceError by correctly destructuring projectRoot.

* fix(update-subtask): pass projectRoot and allow updating done subtasks

Modified update-subtask-by-id core, direct function, and tool to pass projectRoot for .env API key fallback. Removed check preventing appending details to completed subtasks.

* fix(mcp, expand): pass projectRoot through expand/expand-all flows

Problem: expand_task & expand_all MCP tools failed with .env keys due to missing projectRoot propagation for API key resolution. Also fixed a ReferenceError: wasSilent is not defined in expandTaskDirect.

Solution: Modified core logic, direct functions, and MCP tools for expand-task and expand-all to correctly destructure projectRoot from arguments and pass it down through the context object to the AI service call (generateTextService). Fixed wasSilent scope in expandTaskDirect.

Verification: Tested expand_task successfully in MCP using .env keys. Reviewed expand_all flow for correct projectRoot propagation.

* chore: prettier

* fix(expand-all): add projectRoot to expandAllTasksDirect invokation.

* fix(update-tasks): Improve AI response parsing for 'update' command

Refactors the JSON array parsing logic within
in .

The previous logic primarily relied on extracting content from markdown
code blocks (json or javascript), which proved brittle when the AI
response included comments or non-JSON text within the block, leading to
parsing errors for the  command.

This change modifies the parsing strategy to first attempt extracting
content directly between the outermost '[' and ']' brackets. This is
more robust as it targets the expected array structure directly. If
bracket extraction fails, it falls back to looking for a strict json
code block, then prefix stripping, before attempting a raw parse.

This approach aligns with the successful parsing strategy used for
single-object responses in  and resolves the
parsing errors previously observed with the  command.

* refactor(mcp): introduce withNormalizedProjectRoot HOF for path normalization

Added HOF to mcp tools utils to normalize projectRoot from args/session. Refactored get-task tool to use HOF. Updated relevant documentation.

* refactor(mcp): apply withNormalizedProjectRoot HOF to update tool

Problem: The  MCP tool previously handled project root acquisition and path resolution within its  method, leading to potential inconsistencies and repetition.

Solution: Refactored the  tool () to utilize the new  Higher-Order Function (HOF) from .

Specific Changes:
- Imported  HOF.
- Updated the Zod schema for the  parameter to be optional, as the HOF handles deriving it from the session if not provided.
- Wrapped the entire  function body with the  HOF.
- Removed the manual call to  from within the  function body.
- Destructured the  from the  object received by the wrapped  function, ensuring it's the normalized path provided by the HOF.
- Used the normalized  variable when calling  and when passing arguments to .

This change standardizes project root handling for the  tool, simplifies its  method, and ensures consistent path normalization. This serves as the pattern for refactoring other MCP tools.

* fix: apply to all tools withNormalizedProjectRoot to fix projectRoot issues for linux and windows

* fix: add rest of tools that need wrapper

* chore: cleanup tools to stop using rootFolder and remove unused imports

* chore: more cleanup

* refactor: Improve update-subtask, consolidate utils, update config

This commit introduces several improvements and refactorings across MCP tools, core logic, and configuration.

**Major Changes:**

1.  **Refactor updateSubtaskById:**
    - Switched from generateTextService to generateObjectService for structured AI responses, using a Zod schema (subtaskSchema) for validation.
    - Revised prompts to have the AI generate relevant content based on user request and context (parent/sibling tasks), while explicitly preventing AI from handling timestamp/tag formatting.
    - Implemented **local timestamp generation (new Date().toISOString()) and formatting** (using <info added on ...> tags) within the function *after* receiving the AI response. This ensures reliable and correctly formatted details are appended.
    - Corrected logic to append only the locally formatted, AI-generated content block to the existing subtask.details.

2.  **Consolidate MCP Utilities:**
    - Moved/consolidated the withNormalizedProjectRoot HOF into mcp-server/src/tools/utils.js.
    - Updated MCP tools (like update-subtask.js) to import withNormalizedProjectRoot from the new location.

3.  **Refactor Project Initialization:**
    - Deleted the redundant mcp-server/src/core/direct-functions/initialize-project-direct.js file.
    - Updated mcp-server/src/core/task-master-core.js to import initializeProjectDirect from its correct location (./direct-functions/initialize-project.js).

**Other Changes:**

-   Updated .taskmasterconfig fallback model to claude-3-7-sonnet-20250219.
-   Clarified model cost representation in the models tool description (taskmaster.mdc and mcp-server/src/tools/models.js).

* fix: displayBanner logging when silentMode is active (#385)

* fix: improve error handling, test options, and model configuration

- Enhance error validation in parse-prd.js and update-tasks.js
- Fix bug where mcpLog was incorrectly passed as logWrapper
- Improve error messages and response formatting
- Add --skip-verification flag to E2E tests
- Update MCP server config that ships with init to match new API key structure
- Fix task force/append handling in parse-prd command
- Increase column width in update-tasks display

* chore: fixes parse prd to show loading indicator in cli.

* fix(parse-prd): suggested fix for mcpLog was incorrect. reverting to my previously working code.

* chore(init): No longer ships readme with task-master init (commented out for now). No longer looking for task-master-mcp, instead checked for task-master-ai - this should prevent the init sequence from needlessly adding another mcp server with task-master-mcp to the mpc.json which a ton of people probably ran into.

* chore: restores 3.7 sonnet as the main role.

* fix(add/remove-dependency): dependency mcp tools were failing due to hard-coded tasks path in generate task files.

* chore: removes tasks json backup that was temporarily created.

* fix(next): adjusts mcp tool response to correctly return the next task/subtask. Also adds nextSteps to the next task response.

* chore: prettier

* chore: readme typos

* fix(config): restores sonnet 3.7 as default main role.

* Version Packages

* hotfix: move production package to "dependencies" (#399)

* Version Packages

* Fix: issues with 0.13.0 not working (#402)

* Exit prerelease mode and version packages

* hotfix: move production package to "dependencies"

* Enter prerelease mode and version packages

* Enter prerelease mode and version packages

* chore: cleanup

* chore: improve pre.json and add pre-release workflow

* chore: fix package.json

* chore: cleanup

* chore: improve pre-release workflow

* chore: allow github actions to commit

* extract fileMap and conversionConfig into brand profile

* extract into brand profile

* add windsurf profile

* add remove brand rules function

* fix regex

* add rules command to add/remove rules for a specific brand

* fix post processing for roo

* allow multiples

* add cursor profile

* update test for new structure

* move rules to assets

* use assets/rules for rules files

* use standardized setupMCP function

* fix formatting

* fix formatting

* add logging

* fix escapes

* default to cursor

* allow init with certain rulesets; no more .windsurfrules

* update docs

* update log msg

* fix formatting

* keep mdc extension for cursor

* don't rewrite .mdc to .md inside the files

* fix roo init (add modes)

* fix cursor init (don't use roo transformation by default)

* use more generic function names

* update docs

* fix formatting

* update function names

* add changeset

* add rules to mcp initialize project

* register tool with mcp server

* update docs

* add integration test

* fix cursor initialization

* rule selection

* fix formatting

* fix MCP - remove yes flag

* add import

* update roo tests

* add/update tests

* remove test

* add rules command test

* update MCP responses, centralize rules profiles & helpers

* fix logging and MCP response messages

* fix formatting

* incorrect test

* fix tests

* update fileMap

* fix file extension transformations

* fix formatting

* add rules command test

* test already covered

* fix formatting

* move renaming logic into profiles

* make sure dir is deleted (DS_Store)

* add confirmation for rules removal

* add force flag for rules remove

* use force flag for test

* remove yes parameter

* fix formatting

* import brand profiles from rule-transformer.js

* update comment

* add interactive rules setup

* optimize

* only copy rules specifically listed in fileMap

* update comment

* add cline profile

* add brandDir to remove ambiguity and support Cline

* specify whether to create mcp config and filename

* add mcpConfigName value for parh

* fix formatting

* remove rules just for this repository - only include rules to be distributed

* update error message

* update "brand rules" to "rules"

* update to minor

* remove comment

* remove comments

* move to /src/utils

* optimize imports

* move rules-setup.js to /src/utils

* move rule-transformer.js to /src/utils

* move confirmation to /src/ui/confirm.js

* default to all rules

* use profile js for mcp config settings

* only run rules interactive setup if not provided via command line

* update comments

* initialize with all brands if nothing specified

* update var name

* clean up

* enumerate brands for brand rules

* update instructions

* add test to check for brand profiles

* fix quotes

* update semantics and terminology from 'brand rules' to 'rules profiles'

* fix formatting

* fix formatting

* update function name and remove copying of cursor rules, now handled by rules transformer

* update comment

* rename to mcp-config-setup.js

* use enums for rules actions

* add aggregate reporting for rules add command

* add missing log message

* use simpler path

* use base profile with modifications for each brand

* use displayName and don't select any defaults in setup

* add confirmation if removing ALL rules profiles, and add --force flag on rules remove

* Use profile-detection instead of rules-detection

* add newline at end of mcp config

* add proper formatting for mcp.json

* update rules

* update rules

* update rules

* add checks for other rules and other profile folder items before removing

* update confirmation for rules remove

* update docs

* update changeset

* fix for filepath at bottom of rule

* Update cline profile and add test; adjust other rules tests

* update changeset

* update changeset

* clarify init for all profiles if not specified

* update rule text

* revert text

* use "rule profiles" instead of "rules profiles"

* use standard tool mappings for windsurf

* add Trae support

* update changeset

* update wording

* update to 'rule profile'

* remove unneeded exports to optimize loc

* combine to /src/utils/profiles.js; add codex and claude code profiles

* rename function and add boxen

* add claude and codex integration tests

* organize tests into profiles folder

* mock fs for transformer tests

* update UI

* add cline and trae integration tests

* update test

* update function name

* update formatting

* Update change set with new profiles

* move profile integration tests to subdirectory

* properly create temp directories in /tmp folder

* fix formatting

* use taskmaster subfolder for the 2 TM rules

* update wording

* ensure subdirectory exists

* update rules from next

* update from next

* update taskmaster rule

* add details on new rules command and init

* fix mcp init

* fix MCP path to assets

* remove duplication

* remove duplication

* MCP server path fixes for rules command

* fix for CLI roo rules add/remove

* update tests

* fix formatting

* fix pattern for interactive rule profiles setup

* restore comments

* restore comments

* restore comments

* remove unused import, fix quotes

* add missing integration tests

* add VS Code profile and tests

* update docs and rules to include vscode profile

* add rules subdirectory support per-profile

* move profiles to /src

* fix formatting

* rename to remove ambiguity

* use --setup for rules interactive setup

* Fix Cursor deeplink installation with copy-paste instructions (#723)

* change roo boomerang to orchestrator; update tests that don't use modes

* fix newline

* chore: cleanup

---------

Co-authored-by: Eyal Toledano <eyal@microangel.so>
Co-authored-by: Yuval <yuvalbl@users.noreply.github.com>
Co-authored-by: Marijn van der Werf <marijn.vanderwerf@gmail.com>
Co-authored-by: Eyal Toledano <eutait@gmail.com>
Co-authored-by: Ralph Khreish <35776126+Crunchyman-ralph@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* fix: providers config for azure, bedrock, and vertex (#822)

* fix: providers config for azure, bedrock, and vertex

* chore: improve changelog

* chore: fix CI

* fix: switch to ESM export to avoid mixed format (#633)

* fix: switch to ESM export to avoid mixed format

The CLI entrypoint was using `module.exports` alongside ESM `import` statements,
resulting in an invalid mixed module format. Replaced the CommonJS export with
a proper ESM `export` to maintain consistency and prevent module resolution issues.

* chore: add changeset

---------

Co-authored-by: Ralph Khreish <35776126+Crunchyman-ralph@users.noreply.github.com>

* fix: Fix external provider support (#726)

* fix(bedrock): improve AWS credential handling and add model definitions (#826)

* fix(bedrock): improve AWS credential handling and add model definitions

- Change error to warning when AWS credentials are missing in environment
- Allow fallback to system configuration (aws config files or instance profiles)
- Remove hardcoded region and profile parameters in Bedrock client
- Add Claude 3.7 Sonnet and DeepSeek R1 model definitions for Bedrock
- Update config manager to properly handle Bedrock provider

* chore: cleanup and format and small refactor

---------

Co-authored-by: Ray Krueger <raykrueger@gmail.com>

* docs: Auto-update and format models.md

* Version Packages

* chore: fix package.json

* Fix/expand command tag corruption (#827)

* fix(expand): Fix tag corruption in expand command - Fix tag parameter passing through MCP expand-task flow - Add tag parameter to direct function and tool registration - Fix contextGatherer method name from _buildDependencyContext to _buildDependencyGraphs - Add comprehensive test coverage for tag handling in expand-task - Ensures tagged task structure is preserved during expansion - Prevents corruption when tag is undefined. Fixes expand command causing tag corruption in tagged task lists. All existing tests pass and new test coverage added.

* test(e2e): Add comprehensive tag-aware expand testing to verify tag corruption fix - Add new test section for feature-expand tag creation and testing - Verify tag preservation during expand, force expand, and expand --all operations - Test that master tag remains intact and feature-expand tag receives subtasks correctly - Fix file path references to use correct .taskmaster/tasks/tasks.json location - Fix config file check to use .taskmaster/config.json instead of .taskmasterconfig - All tag corruption verification tests pass successfully in E2E test

* fix(changeset): Update E2E test improvements changeset to properly reflect tag corruption fix verification

* chore(changeset): combine duplicate changesets for expand tag corruption fix

Merge eighty-breads-wonder.md into bright-llamas-enter.md to consolidate
the expand command fix and its comprehensive E2E testing enhancements
into a single changeset entry.

* Delete .changeset/eighty-breads-wonder.md

* Version Packages

* chore: fix package.json

* fix(expand): Enhance context handling in expandAllTasks function
- Added `tag` to context destructuring for better context management.
- Updated `readJSON` call to include `contextTag` for improved data integrity.
- Ensured the correct tag is passed during task expansion to prevent tag corruption.

---------

Co-authored-by: Parththipan Thaniperumkarunai <parththipan.thaniperumkarunai@milkmonkey.de>
Co-authored-by: Parthy <52548018+mm-parthy@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* Add pyproject.toml as project root marker (#804)

* feat: Add pyproject.toml as project root marker - Added 'pyproject.toml' to the project markers array in findProjectRoot() - Enables Task Master to recognize Python projects using pyproject.toml - Improves project root detection for modern Python development workflows - Maintains compatibility with existing Node.js and Git-based detection

* chore: add changeset

---------

Co-authored-by: Ralph Khreish <35776126+Crunchyman-ralph@users.noreply.github.com>

* feat: add Claude Code provider support

Implements Claude Code as a new AI provider that uses the Claude Code CLI
without requiring API keys. This enables users to leverage Claude models
through their local Claude Code installation.

Key changes:
- Add complete AI SDK v1 implementation for Claude Code provider
  - Custom SDK with streaming/non-streaming support
  - Session management for conversation continuity
  - JSON extraction for object generation mode
  - Support for advanced settings (maxTurns, allowedTools, etc.)

- Integrate Claude Code into Task Master's provider system
  - Update ai-services-unified.js to handle keyless authentication
  - Add provider to supported-models.json with opus/sonnet models
  - Ensure correct maxTokens values are applied (opus: 32000, sonnet: 64000)

- Fix maxTokens configuration issue
  - Add max_tokens property to getAvailableModels() output
  - Update setModel() to properly handle claude-code models
  - Create update-config-tokens.js utility for init process

- Add comprehensive documentation
  - User guide with configuration examples
  - Advanced settings explanation and future integration options

The implementation maintains full backward compatibility with existing
providers while adding seamless Claude Code support to all Task Master
commands.

* fix(docs): correct invalid commands in claude-code usage examples

- Remove non-existent 'do', 'estimate', and 'analyze' commands
- Replace with actual Task Master commands: next, show, set-status
- Use correct syntax for parse-prd and analyze-complexity

* feat: make @anthropic-ai/claude-code an optional dependency

This change makes the Claude Code SDK package optional, preventing installation failures for users who don't need Claude Code functionality.

Changes:
- Added @anthropic-ai/claude-code to optionalDependencies in package.json
- Implemented lazy loading in language-model.js to only import the SDK when actually used
- Updated documentation to explain the optional installation requirement
- Applied formatting fixes to ensure code consistency

Benefits:
- Users without Claude Code subscriptions don't need to install the dependency
- Reduces package size for users who don't use Claude Code
- Prevents installation failures if the package is unavailable
- Provides clear error messages when the package is needed but not installed

The implementation uses dynamic imports to load the SDK only when doGenerate() or doStream() is called, ensuring the provider can be instantiated without the package present.

* test: add comprehensive tests for ClaudeCodeProvider

Addresses code review feedback about missing automated tests for the ClaudeCodeProvider.

## Changes

- Added unit tests for ClaudeCodeProvider class covering constructor, validateAuth, and getClient methods
- Added unit tests for ClaudeCodeLanguageModel testing lazy loading behavior and error handling
- Added integration tests verifying optional dependency behavior when @anthropic-ai/claude-code is not installed

## Test Coverage

1. **Unit Tests**:
   - ClaudeCodeProvider: Basic functionality, no API key requirement, client creation
   - ClaudeCodeLanguageModel: Model initialization, lazy loading, error messages, warning generation

2. **Integration Tests**:
   - Optional dependency behavior when package is not installed
   - Clear error messages for users about missing package
   - Provider instantiation works but usage fails gracefully

All tests pass and provide comprehensive coverage for the claude-code provider implementation.

* revert: remove maxTokens update functionality from init

This functionality was out of scope for the Claude Code provider PR.
The automatic updating of maxTokens values in config.json during
initialization is a general improvement that should be in a separate PR.

Additionally, Claude Code ignores maxTokens and temperature parameters
anyway, making this change irrelevant for the Claude Code integration.

Removed:
- scripts/modules/update-config-tokens.js
- Import and usage in scripts/init.js

* docs: add Claude Code support information to README

- Added Claude Code to the list of supported providers in Requirements section
- Noted that Claude Code requires no API key but needs Claude Code CLI
- Added example of configuring claude-code/sonnet model
- Created dedicated Claude Code Support section with key information
- Added link to detailed Claude Code setup documentation

This ensures users are aware of the Claude Code option as a no-API-key
alternative for using Claude models.

* style: apply biome formatting to test files

* fix(models): add missing --claude-code flag to models command

The models command was missing the --claude-code provider flag, preventing users from setting Claude Code models via CLI. While the backend already supported claude-code as a provider hint, there was no command-line flag to trigger it.

Changes:
- Added --claude-code option to models command alongside existing provider flags
- Updated provider flags validation to include claudeCode option
- Added claude-code to providerHint logic for all three model roles (main, research, fallback)
- Updated error message to include --claude-code in list of mutually exclusive flags
- Added example usage in help text

This allows users to properly set Claude Code models using commands like:
  task-master models --set-main sonnet --claude-code
  task-master models --set-main opus --claude-code

Without this flag, users would get "Model ID not found" errors when trying to set claude-code models, as the system couldn't determine the correct provider for generic model names like "sonnet" or "opus".

* chore: add changeset for Claude Code provider feature

* docs: Auto-update and format models.md

* readme: add troubleshooting note for MCP tools not working

* Feature/compatibleapisupport (#830)

* add compatible platform api support

* Adjust the code according to the suggestions

* Fully revised as requested: restored all required checks, improved compatibility, and converted all comments to English.

* feat: Add support for compatible API endpoints via baseURL

* chore: Add changeset for compatible API support

* chore: cleanup

* chore: improve changeset

* fix: package-lock.json

* fix: package-lock.json

---------

Co-authored-by: He-Xun <1226807142@qq.com>

* Rename Roo Code "Boomerang" role to "Orchestrator" (#831)

* feat: Enhanced project initialization with Git worktree detection (#743)

* Fix Cursor deeplink installation with copy-paste instructions (#723)

* detect git worktree

* add changeset

* add aliases and git flags

* add changeset

* rename and update test

* add store tasks in git functionality

* update changeset

* fix newline

* remove unused import

* update command wording

* update command option text

* fix: update task by id (#834)

* store tasks in git by default (#835)

* Call rules interactive setup during init (#833)

* chore: rc version bump

* feat: Claude Code slash commands for Task Master (#774)

* Fix Cursor deeplink installation with copy-paste instructions (#723)

* fix: expand-task (#755)

* docs: Update o3 model price (#751)

* docs: Auto-update and format models.md

* docs: Auto-update and format models.md

* feat: Add Claude Code task master commands

Adds Task Master slash commands for Claude Code under /project:tm/ namespace

---------

Co-authored-by: Joe Danziger <joe@ticc.net>
Co-authored-by: Ralph Khreish <35776126+Crunchyman-ralph@users.noreply.github.com>
Co-authored-by: Volodymyr Zahorniak <7808206+zahorniak@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: neno-is-ooo <204701868+neno-is-ooo@users.noreply.github.com>

* feat: make more compatible with "o" family models (#839)

* docs: Auto-update and format models.md

* docs: Add comprehensive Azure OpenAI configuration documentation (#837)

* docs: Add comprehensive Azure OpenAI configuration documentation

- Add detailed Azure OpenAI configuration section with prerequisites, authentication, and setup options
- Include both global and per-model baseURL configuration examples
- Add comprehensive troubleshooting guide for common Azure OpenAI issues
- Update environment variables section with Azure OpenAI examples
- Add Azure OpenAI models to all model tables (Main, Research, Fallback)
- Include prominent Azure configuration example in main documentation
- Fix azureBaseURL format to use correct Azure OpenAI endpoint structure

Addresses common Azure OpenAI setup challenges and provides clear guidance for new users.

* refactor: Move Azure models from docs/models.md to scripts/modules/supported-models.json

- Remove Azure model entries from documentation tables
- Add Azure provider section to supported-models.json with gpt-4o, gpt-4o-mini, and gpt-4-1
- Maintain consistency with existing model configuration structure

* docs: Auto-update and format models.md

* Version Packages

* chore: format fix

---------

Co-authored-by: Riccardo (Ricky) Esclapon <32306488+ries9112@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Joe Danziger <joe@ticc.net>
Co-authored-by: Eyal Toledano <eyal@microangel.so>
Co-authored-by: Yuval <yuvalbl@users.noreply.github.com>
Co-authored-by: Marijn van der Werf <marijn.vanderwerf@gmail.com>
Co-authored-by: Eyal Toledano <eutait@gmail.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Nathan Marley <nathan@glowberrylabs.com>
Co-authored-by: Ray Krueger <raykrueger@gmail.com>
Co-authored-by: Parththipan Thaniperumkarunai <parththipan.thaniperumkarunai@milkmonkey.de>
Co-authored-by: Parthy <52548018+mm-parthy@users.noreply.github.com>
Co-authored-by: ejones40 <ethan.jones@fortyau.com>
Co-authored-by: Ben Vargas <ben@vargas.com>
Co-authored-by: V4G4X <34249137+V4G4X@users.noreply.github.com>
Co-authored-by: He-Xun <1226807142@qq.com>
Co-authored-by: neno <github@meaning.systems>
Co-authored-by: Volodymyr Zahorniak <7808206+zahorniak@users.noreply.github.com>
Co-authored-by: neno-is-ooo <204701868+neno-is-ooo@users.noreply.github.com>
Co-authored-by: Jitesh Thakur <56656484+Jitha-afk@users.noreply.github.com>
2025-06-21 13:54:17 -07:00

966 lines
40 KiB
Bash
Executable File

#!/bin/bash
# Treat unset variables as an error when substituting.
set -u
# Prevent errors in pipelines from being masked.
set -o pipefail
# --- Default Settings ---
run_verification_test=true
# --- Argument Parsing ---
# Simple loop to check for the skip flag
# Note: This needs to happen *before* the main block piped to tee
# if we want the decision logged early. Or handle args inside.
# Let's handle it before for clarity.
processed_args=()
while [[ $# -gt 0 ]]; do
case "$1" in
--skip-verification)
run_verification_test=false
echo "[INFO] Argument '--skip-verification' detected. Fallback verification will be skipped."
shift # Consume the flag
;;
--analyze-log)
# Keep the analyze-log flag handling separate for now
# It exits early, so doesn't conflict with the main run flags
processed_args+=("$1")
if [[ $# -gt 1 ]]; then
processed_args+=("$2")
shift 2
else
shift 1
fi
;;
*)
# Unknown argument, pass it along or handle error
# For now, just pass it along in case --analyze-log needs it later
processed_args+=("$1")
shift
;;
esac
done
# Restore processed arguments ONLY if the array is not empty
if [ ${#processed_args[@]} -gt 0 ]; then
set -- "${processed_args[@]}"
fi
# --- Configuration ---
# Assumes script is run from the project root (claude-task-master)
TASKMASTER_SOURCE_DIR="." # Current directory is the source
# Base directory for test runs, relative to project root
BASE_TEST_DIR="$TASKMASTER_SOURCE_DIR/tests/e2e/_runs"
# Log directory, relative to project root
LOG_DIR="$TASKMASTER_SOURCE_DIR/tests/e2e/log"
# Path to the sample PRD, relative to project root
SAMPLE_PRD_SOURCE="$TASKMASTER_SOURCE_DIR/tests/fixtures/sample-prd.txt"
# Path to the main .env file in the source directory
MAIN_ENV_FILE="$TASKMASTER_SOURCE_DIR/.env"
# ---
# <<< Source the helper script >>>
# shellcheck source=tests/e2e/e2e_helpers.sh
source "$TASKMASTER_SOURCE_DIR/tests/e2e/e2e_helpers.sh"
# ==========================================
# >>> Global Helper Functions Defined in run_e2e.sh <<<
# --- Helper Functions (Define globally before export) ---
_format_duration() {
local total_seconds=$1
local minutes=$((total_seconds / 60))
local seconds=$((total_seconds % 60))
printf "%dm%02ds" "$minutes" "$seconds"
}
# Note: This relies on 'overall_start_time' being set globally before the function is called
_get_elapsed_time_for_log() {
local current_time
current_time=$(date +%s)
# Use overall_start_time here, as start_time_for_helpers might not be relevant globally
local elapsed_seconds
elapsed_seconds=$((current_time - overall_start_time))
_format_duration "$elapsed_seconds"
}
log_info() {
echo "[INFO] [$(_get_elapsed_time_for_log)] $(date +"%Y-%m-%d %H:%M:%S") $1"
}
log_success() {
echo "[SUCCESS] [$(_get_elapsed_time_for_log)] $(date +"%Y-%m-%d %H:%M:%S") $1"
}
log_error() {
echo "[ERROR] [$(_get_elapsed_time_for_log)] $(date +"%Y-%m-%d %H:%M:%S") $1" >&2
}
log_step() {
test_step_count=$((test_step_count + 1))
echo ""
echo "============================================="
echo " STEP ${test_step_count}: [$(_get_elapsed_time_for_log)] $(date +"%Y-%m-%d %H:%M:%S") $1"
echo "============================================="
}
# ==========================================
# <<< Export helper functions for subshells >>>
export -f log_info log_success log_error log_step _format_duration _get_elapsed_time_for_log extract_and_sum_cost
# --- Argument Parsing for Analysis-Only Mode ---
# This remains the same, as it exits early if matched
if [ "$#" -ge 1 ] && [ "$1" == "--analyze-log" ]; then
LOG_TO_ANALYZE=""
# Check if a log file path was provided as the second argument
if [ "$#" -ge 2 ] && [ -n "$2" ]; then
LOG_TO_ANALYZE="$2"
echo "[INFO] Using specified log file for analysis: $LOG_TO_ANALYZE"
else
echo "[INFO] Log file not specified. Attempting to find the latest log..."
# Find the latest log file in the LOG_DIR
# Ensure LOG_DIR is absolute for ls to work correctly regardless of PWD
ABS_LOG_DIR="$(cd "$TASKMASTER_SOURCE_DIR/$LOG_DIR" && pwd)"
LATEST_LOG=$(ls -t "$ABS_LOG_DIR"/e2e_run_*.log 2>/dev/null | head -n 1)
if [ -z "$LATEST_LOG" ]; then
echo "[ERROR] No log files found matching 'e2e_run_*.log' in $ABS_LOG_DIR. Cannot analyze." >&2
exit 1
fi
LOG_TO_ANALYZE="$LATEST_LOG"
echo "[INFO] Found latest log file: $LOG_TO_ANALYZE"
fi
# Ensure the log path is absolute (it should be if found by ls, but double-check)
if [[ "$LOG_TO_ANALYZE" != /* ]]; then
LOG_TO_ANALYZE="$(pwd)/$LOG_TO_ANALYZE" # Fallback if relative path somehow occurred
fi
echo "[INFO] Running in analysis-only mode for log: $LOG_TO_ANALYZE"
# --- Derive TEST_RUN_DIR from log file path ---
# Extract timestamp like YYYYMMDD_HHMMSS from e2e_run_YYYYMMDD_HHMMSS.log
log_basename=$(basename "$LOG_TO_ANALYZE")
# Ensure the sed command matches the .log suffix correctly
timestamp_match=$(echo "$log_basename" | sed -n 's/^e2e_run_\([0-9]\{8\}_[0-9]\{6\}\)\.log$/\1/p')
if [ -z "$timestamp_match" ]; then
echo "[ERROR] Could not extract timestamp from log file name: $log_basename" >&2
echo "[ERROR] Expected format: e2e_run_YYYYMMDD_HHMMSS.log" >&2
exit 1
fi
# Construct the expected run directory path relative to project root
EXPECTED_RUN_DIR="$TASKMASTER_SOURCE_DIR/tests/e2e/_runs/run_$timestamp_match"
# Make it absolute
EXPECTED_RUN_DIR_ABS="$(cd "$TASKMASTER_SOURCE_DIR" && pwd)/tests/e2e/_runs/run_$timestamp_match"
if [ ! -d "$EXPECTED_RUN_DIR_ABS" ]; then
echo "[ERROR] Corresponding test run directory not found: $EXPECTED_RUN_DIR_ABS" >&2
exit 1
fi
# Save original dir before changing
ORIGINAL_DIR=$(pwd)
echo "[INFO] Changing directory to $EXPECTED_RUN_DIR_ABS for analysis context..."
cd "$EXPECTED_RUN_DIR_ABS"
# Call the analysis function (sourced from helpers)
echo "[INFO] Calling analyze_log_with_llm function..."
analyze_log_with_llm "$LOG_TO_ANALYZE" "$(cd "$ORIGINAL_DIR/$TASKMASTER_SOURCE_DIR" && pwd)" # Pass absolute project root
ANALYSIS_EXIT_CODE=$?
# Return to original directory
cd "$ORIGINAL_DIR"
exit $ANALYSIS_EXIT_CODE
fi
# --- End Analysis-Only Mode Logic ---
# --- Normal Execution Starts Here (if not in analysis-only mode) ---
# --- Test State Variables ---
# Note: These are mainly for step numbering within the log now, not for final summary
test_step_count=0
start_time_for_helpers=0 # Separate start time for helper functions inside the pipe
total_e2e_cost="0.0" # Initialize total E2E cost
# ---
# --- Log File Setup ---
# Create the log directory if it doesn't exist
mkdir -p "$LOG_DIR"
# Define timestamped log file path
TIMESTAMP=$(date +"%Y%m%d_%H%M%S")
# <<< Use pwd to create an absolute path AND add .log extension >>>
LOG_FILE="$(pwd)/$LOG_DIR/e2e_run_${TIMESTAMP}.log"
# Define and create the test run directory *before* the main pipe
mkdir -p "$BASE_TEST_DIR" # Ensure base exists first
TEST_RUN_DIR="$BASE_TEST_DIR/run_$TIMESTAMP"
mkdir -p "$TEST_RUN_DIR"
# Echo starting message to the original terminal BEFORE the main piped block
echo "Starting E2E test. Output will be shown here and saved to: $LOG_FILE"
echo "Running from directory: $(pwd)"
echo "--- Starting E2E Run ---" # Separator before piped output starts
# Record start time for overall duration *before* the pipe
overall_start_time=$(date +%s)
# <<< DEFINE ORIGINAL_DIR GLOBALLY HERE >>>
ORIGINAL_DIR=$(pwd)
# ==========================================
# >>> MOVE FUNCTION DEFINITION HERE <<<
# --- Helper Functions (Define globally) ---
_format_duration() {
local total_seconds=$1
local minutes=$((total_seconds / 60))
local seconds=$((total_seconds % 60))
printf "%dm%02ds" "$minutes" "$seconds"
}
# Note: This relies on 'overall_start_time' being set globally before the function is called
_get_elapsed_time_for_log() {
local current_time=$(date +%s)
# Use overall_start_time here, as start_time_for_helpers might not be relevant globally
local elapsed_seconds=$((current_time - overall_start_time))
_format_duration "$elapsed_seconds"
}
log_info() {
echo "[INFO] [$(_get_elapsed_time_for_log)] $(date +"%Y-%m-%d %H:%M:%S") $1"
}
log_success() {
echo "[SUCCESS] [$(_get_elapsed_time_for_log)] $(date +"%Y-%m-%d %H:%M:%S") $1"
}
log_error() {
echo "[ERROR] [$(_get_elapsed_time_for_log)] $(date +"%Y-%m-%d %H:%M:%S") $1" >&2
}
log_step() {
test_step_count=$((test_step_count + 1))
echo ""
echo "============================================="
echo " STEP ${test_step_count}: [$(_get_elapsed_time_for_log)] $(date +"%Y-%m-%d %H:%M:%S") $1"
echo "============================================="
}
# ==========================================
# --- Main Execution Block (Piped to tee) ---
# Wrap the main part of the script in braces and pipe its output (stdout and stderr) to tee
{
# Note: Helper functions are now defined globally above,
# but we still need start_time_for_helpers if any logging functions
# called *inside* this block depend on it. If not, it can be removed.
start_time_for_helpers=$(date +%s) # Keep if needed by helpers called inside this block
# Log the verification decision
if [ "$run_verification_test" = true ]; then
log_info "Fallback verification test will be run as part of this E2E test."
else
log_info "Fallback verification test will be SKIPPED (--skip-verification flag detected)."
fi
# --- Dependency Checks ---
log_step "Checking for dependencies (jq, bc)"
if ! command -v jq &> /dev/null; then
log_error "Dependency 'jq' is not installed or not found in PATH. Please install jq (e.g., 'brew install jq' or 'sudo apt-get install jq')."
exit 1
fi
if ! command -v bc &> /dev/null; then
log_error "Dependency 'bc' not installed (for cost calculation). Please install bc (e.g., 'brew install bc' or 'sudo apt-get install bc')."
exit 1
fi
log_success "Dependencies 'jq' and 'bc' found."
# --- Test Setup (Output to tee) ---
log_step "Setting up test environment"
log_step "Creating global npm link for task-master-ai"
if npm link; then
log_success "Global link created/updated."
else
log_error "Failed to run 'npm link'. Check permissions or output for details."
exit 1
fi
log_info "Ensured base test directory exists: $BASE_TEST_DIR"
log_info "Using test run directory (created earlier): $TEST_RUN_DIR"
# Check if source .env file exists
if [ ! -f "$MAIN_ENV_FILE" ]; then
log_error "Source .env file not found at $MAIN_ENV_FILE. Cannot proceed with API-dependent tests."
exit 1
fi
log_info "Source .env file found at $MAIN_ENV_FILE."
# Check if sample PRD exists
if [ ! -f "$SAMPLE_PRD_SOURCE" ]; then
log_error "Sample PRD not found at $SAMPLE_PRD_SOURCE. Please check path."
exit 1
fi
log_info "Copying sample PRD to test directory..."
cp "$SAMPLE_PRD_SOURCE" "$TEST_RUN_DIR/prd.txt"
if [ ! -f "$TEST_RUN_DIR/prd.txt" ]; then
log_error "Failed to copy sample PRD to $TEST_RUN_DIR."
exit 1
fi
log_success "Sample PRD copied."
# ORIGINAL_DIR=$(pwd) # Save original dir # <<< REMOVED FROM HERE
cd "$TEST_RUN_DIR"
log_info "Changed directory to $(pwd)"
# === Copy .env file BEFORE init ===
log_step "Copying source .env file for API keys"
if cp "$ORIGINAL_DIR/.env" ".env"; then
log_success ".env file copied successfully."
else
log_error "Failed to copy .env file from $ORIGINAL_DIR/.env"
exit 1
fi
# ========================================
# --- Test Execution (Output to tee) ---
log_step "Linking task-master-ai package locally"
npm link task-master-ai
log_success "Package linked locally."
log_step "Initializing Task Master project (non-interactive)"
task-master init -y --name="E2E Test $TIMESTAMP" --description="Automated E2E test run"
if [ ! -f ".taskmaster/config.json" ]; then
log_error "Initialization failed: .taskmaster/config.json not found."
exit 1
fi
log_success "Project initialized."
log_step "Parsing PRD"
cmd_output_prd=$(task-master parse-prd ./prd.txt --force 2>&1)
exit_status_prd=$?
echo "$cmd_output_prd"
extract_and_sum_cost "$cmd_output_prd"
if [ $exit_status_prd -ne 0 ] || [ ! -s ".taskmaster/tasks/tasks.json" ]; then
log_error "Parsing PRD failed: .taskmaster/tasks/tasks.json not found or is empty. Exit status: $exit_status_prd"
exit 1
else
log_success "PRD parsed successfully."
fi
log_step "Expanding Task 1 (to ensure subtask 1.1 exists)"
cmd_output_analyze=$(task-master analyze-complexity --research --output complexity_results.json 2>&1)
exit_status_analyze=$?
echo "$cmd_output_analyze"
extract_and_sum_cost "$cmd_output_analyze"
if [ $exit_status_analyze -ne 0 ] || [ ! -f "complexity_results.json" ]; then
log_error "Complexity analysis failed: complexity_results.json not found. Exit status: $exit_status_analyze"
exit 1
else
log_success "Complexity analysis saved to complexity_results.json"
fi
log_step "Generating complexity report"
task-master complexity-report --file complexity_results.json > complexity_report_formatted.log
log_success "Formatted complexity report saved to complexity_report_formatted.log"
log_step "Expanding Task 1 (assuming it exists)"
cmd_output_expand1=$(task-master expand --id=1 2>&1)
exit_status_expand1=$?
echo "$cmd_output_expand1"
extract_and_sum_cost "$cmd_output_expand1"
if [ $exit_status_expand1 -ne 0 ]; then
log_error "Expanding Task 1 failed. Exit status: $exit_status_expand1"
else
log_success "Attempted to expand Task 1."
fi
log_step "Setting status for Subtask 1.1 (assuming it exists)"
task-master set-status --id=1.1 --status=done
log_success "Attempted to set status for Subtask 1.1 to 'done'."
log_step "Listing tasks again (after changes)"
task-master list --with-subtasks > task_list_after_changes.log
log_success "Task list after changes saved to task_list_after_changes.log"
# === Start New Test Section: Tag-Aware Expand Testing ===
log_step "Creating additional tag for expand testing"
task-master add-tag feature-expand --description="Tag for testing expand command with tag preservation"
log_success "Created feature-expand tag."
log_step "Adding task to feature-expand tag"
task-master add-task --tag=feature-expand --prompt="Test task for tag-aware expansion" --priority=medium
# Get the new task ID dynamically
new_expand_task_id=$(jq -r '.["feature-expand"].tasks[-1].id' .taskmaster/tasks/tasks.json)
log_success "Added task $new_expand_task_id to feature-expand tag."
log_step "Verifying tags exist before expand test"
task-master tags > tags_before_expand.log
tag_count_before=$(jq 'keys | length' .taskmaster/tasks/tasks.json)
log_success "Tag count before expand: $tag_count_before"
log_step "Expanding task in feature-expand tag (testing tag corruption fix)"
cmd_output_expand_tagged=$(task-master expand --tag=feature-expand --id="$new_expand_task_id" 2>&1)
exit_status_expand_tagged=$?
echo "$cmd_output_expand_tagged"
extract_and_sum_cost "$cmd_output_expand_tagged"
if [ $exit_status_expand_tagged -ne 0 ]; then
log_error "Tagged expand failed. Exit status: $exit_status_expand_tagged"
else
log_success "Tagged expand completed."
fi
log_step "Verifying tag preservation after expand"
task-master tags > tags_after_expand.log
tag_count_after=$(jq 'keys | length' .taskmaster/tasks/tasks.json)
if [ "$tag_count_before" -eq "$tag_count_after" ]; then
log_success "Tag count preserved: $tag_count_after (no corruption detected)"
else
log_error "Tag corruption detected! Before: $tag_count_before, After: $tag_count_after"
fi
log_step "Verifying master tag still exists and has tasks"
master_task_count=$(jq -r '.master.tasks | length' .taskmaster/tasks/tasks.json 2>/dev/null || echo "0")
if [ "$master_task_count" -gt "0" ]; then
log_success "Master tag preserved with $master_task_count tasks"
else
log_error "Master tag corrupted or empty after tagged expand"
fi
log_step "Verifying feature-expand tag has expanded subtasks"
expanded_subtask_count=$(jq -r ".\"feature-expand\".tasks[] | select(.id == $new_expand_task_id) | .subtasks | length" .taskmaster/tasks/tasks.json 2>/dev/null || echo "0")
if [ "$expanded_subtask_count" -gt "0" ]; then
log_success "Expand successful: $expanded_subtask_count subtasks created in feature-expand tag"
else
log_error "Expand failed: No subtasks found in feature-expand tag"
fi
log_step "Testing force expand with tag preservation"
cmd_output_force_expand=$(task-master expand --tag=feature-expand --id="$new_expand_task_id" --force 2>&1)
exit_status_force_expand=$?
echo "$cmd_output_force_expand"
extract_and_sum_cost "$cmd_output_force_expand"
# Verify tags still preserved after force expand
tag_count_after_force=$(jq 'keys | length' .taskmaster/tasks/tasks.json)
if [ "$tag_count_before" -eq "$tag_count_after_force" ]; then
log_success "Force expand preserved all tags"
else
log_error "Force expand caused tag corruption"
fi
log_step "Testing expand --all with tag preservation"
# Add another task to feature-expand for expand-all testing
task-master add-task --tag=feature-expand --prompt="Second task for expand-all testing" --priority=low
second_expand_task_id=$(jq -r '.["feature-expand"].tasks[-1].id' .taskmaster/tasks/tasks.json)
cmd_output_expand_all=$(task-master expand --tag=feature-expand --all 2>&1)
exit_status_expand_all=$?
echo "$cmd_output_expand_all"
extract_and_sum_cost "$cmd_output_expand_all"
# Verify tags preserved after expand-all
tag_count_after_all=$(jq 'keys | length' .taskmaster/tasks/tasks.json)
if [ "$tag_count_before" -eq "$tag_count_after_all" ]; then
log_success "Expand --all preserved all tags"
else
log_error "Expand --all caused tag corruption"
fi
log_success "Completed expand --all tag preservation test."
# === End New Test Section: Tag-Aware Expand Testing ===
# === Test Model Commands ===
log_step "Checking initial model configuration"
task-master models > models_initial_config.log
log_success "Initial model config saved to models_initial_config.log"
log_step "Setting main model"
task-master models --set-main claude-3-7-sonnet-20250219
log_success "Set main model."
log_step "Setting research model"
task-master models --set-research sonar-pro
log_success "Set research model."
log_step "Setting fallback model"
task-master models --set-fallback claude-3-5-sonnet-20241022
log_success "Set fallback model."
log_step "Checking final model configuration"
task-master models > models_final_config.log
log_success "Final model config saved to models_final_config.log"
log_step "Resetting main model to default (Claude Sonnet) before provider tests"
task-master models --set-main claude-3-7-sonnet-20250219
log_success "Main model reset to claude-3-7-sonnet-20250219."
# === End Model Commands Test ===
# === Fallback Model generateObjectService Verification ===
if [ "$run_verification_test" = true ]; then
log_step "Starting Fallback Model (generateObjectService) Verification (Calls separate script)"
verification_script_path="$ORIGINAL_DIR/tests/e2e/run_fallback_verification.sh"
if [ -x "$verification_script_path" ]; then
log_info "--- Executing Fallback Verification Script: $verification_script_path ---"
verification_output=$("$verification_script_path" "$(pwd)" 2>&1)
verification_exit_code=$?
echo "$verification_output"
extract_and_sum_cost "$verification_output"
log_info "--- Finished Fallback Verification Script Execution (Exit Code: $verification_exit_code) ---"
# Log success/failure based on captured exit code
if [ $verification_exit_code -eq 0 ]; then
log_success "Fallback verification script reported success."
else
log_error "Fallback verification script reported FAILURE (Exit Code: $verification_exit_code)."
fi
else
log_error "Fallback verification script not found or not executable at $verification_script_path. Skipping verification."
fi
else
log_info "Skipping Fallback Verification test as requested by flag."
fi
# === END Verification Section ===
# === Multi-Provider Add-Task Test (Keep as is) ===
log_step "Starting Multi-Provider Add-Task Test Sequence"
# Define providers, models, and flags
# Array order matters: providers[i] corresponds to models[i] and flags[i]
declare -a providers=("anthropic" "openai" "google" "perplexity" "xai" "openrouter")
declare -a models=(
"claude-3-7-sonnet-20250219"
"gpt-4o"
"gemini-2.5-pro-preview-05-06"
"sonar-pro" # Note: This is research-only, add-task might fail if not using research model
"grok-3"
"anthropic/claude-3.7-sonnet" # OpenRouter uses Claude 3.7
)
# Flags: Add provider-specific flags here, e.g., --openrouter. Use empty string if none.
declare -a flags=("" "" "" "" "" "--openrouter")
# Consistent prompt for all providers
add_task_prompt="Create a task to implement user authentication using OAuth 2.0 with Google as the provider. Include steps for registering the app, handling the callback, and storing user sessions."
log_info "Using consistent prompt for add-task tests: \"$add_task_prompt\""
echo "--- Multi-Provider Add Task Summary ---" > provider_add_task_summary.log # Initialize summary log
for i in "${!providers[@]}"; do
provider="${providers[$i]}"
model="${models[$i]}"
flag="${flags[$i]}"
log_step "Testing Add-Task with Provider: $provider (Model: $model)"
# 1. Set the main model for this provider
log_info "Setting main model to $model for $provider ${flag:+using flag $flag}..."
set_model_cmd="task-master models --set-main \"$model\" $flag"
echo "Executing: $set_model_cmd"
if eval $set_model_cmd; then
log_success "Successfully set main model for $provider."
else
log_error "Failed to set main model for $provider. Skipping add-task for this provider."
# Optionally save failure info here if needed for LLM analysis
echo "Provider $provider set-main FAILED" >> provider_add_task_summary.log
continue # Skip to the next provider
fi
# 2. Run add-task
log_info "Running add-task with prompt..."
add_task_output_file="add_task_raw_output_${provider}_${model//\//_}.log" # Sanitize ID
# Run add-task and capture ALL output (stdout & stderr) to a file AND a variable
add_task_cmd_output=$(task-master add-task --prompt "$add_task_prompt" 2>&1 | tee "$add_task_output_file")
add_task_exit_code=${PIPESTATUS[0]}
# 3. Check for success and extract task ID
new_task_id=""
extract_and_sum_cost "$add_task_cmd_output"
if [ $add_task_exit_code -eq 0 ] && (echo "$add_task_cmd_output" | grep -q "✓ Added new task #" || echo "$add_task_cmd_output" | grep -q "✅ New task created successfully:" || echo "$add_task_cmd_output" | grep -q "Task [0-9]\+ Created Successfully"); then
new_task_id=$(echo "$add_task_cmd_output" | grep -o -E "(Task |#)[0-9.]+" | grep -o -E "[0-9.]+" | head -n 1)
if [ -n "$new_task_id" ]; then
log_success "Add-task succeeded for $provider. New task ID: $new_task_id"
echo "Provider $provider add-task SUCCESS (ID: $new_task_id)" >> provider_add_task_summary.log
else
# Succeeded but couldn't parse ID - treat as warning/anomaly
log_error "Add-task command succeeded for $provider, but failed to extract task ID from output."
echo "Provider $provider add-task SUCCESS (ID extraction FAILED)" >> provider_add_task_summary.log
new_task_id="UNKNOWN_ID_EXTRACTION_FAILED"
fi
else
log_error "Add-task command failed for $provider (Exit Code: $add_task_exit_code). See $add_task_output_file for details."
echo "Provider $provider add-task FAILED (Exit Code: $add_task_exit_code)" >> provider_add_task_summary.log
new_task_id="FAILED"
fi
# 4. Run task show if ID was obtained (even if extraction failed, use placeholder)
if [ "$new_task_id" != "FAILED" ] && [ "$new_task_id" != "UNKNOWN_ID_EXTRACTION_FAILED" ]; then
log_info "Running task show for new task ID: $new_task_id"
show_output_file="add_task_show_output_${provider}_id_${new_task_id}.log"
if task-master show "$new_task_id" > "$show_output_file"; then
log_success "Task show output saved to $show_output_file"
else
log_error "task show command failed for ID $new_task_id. Check log."
# Still keep the file, it might contain error output
fi
elif [ "$new_task_id" == "UNKNOWN_ID_EXTRACTION_FAILED" ]; then
log_info "Skipping task show for $provider due to ID extraction failure."
else
log_info "Skipping task show for $provider due to add-task failure."
fi
done # End of provider loop
log_step "Finished Multi-Provider Add-Task Test Sequence"
echo "Provider add-task summary log available at: provider_add_task_summary.log"
# === End Multi-Provider Add-Task Test ===
log_step "Listing tasks again (after multi-add)"
task-master list --with-subtasks > task_list_after_multi_add.log
log_success "Task list after multi-add saved to task_list_after_multi_add.log"
# === Resume Core Task Commands Test ===
log_step "Listing tasks (for core tests)"
task-master list > task_list_core_test_start.log
log_success "Core test initial task list saved."
log_step "Getting next task"
task-master next > next_task_core_test.log
log_success "Core test next task saved."
log_step "Showing Task 1 details"
task-master show 1 > task_1_details_core_test.log
log_success "Task 1 details saved."
log_step "Adding dependency (Task 2 depends on Task 1)"
task-master add-dependency --id=2 --depends-on=1
log_success "Added dependency 2->1."
log_step "Validating dependencies (after add)"
task-master validate-dependencies > validate_dependencies_after_add_core.log
log_success "Dependency validation after add saved."
log_step "Removing dependency (Task 2 depends on Task 1)"
task-master remove-dependency --id=2 --depends-on=1
log_success "Removed dependency 2->1."
log_step "Fixing dependencies (should be no-op now)"
task-master fix-dependencies > fix_dependencies_output_core.log
log_success "Fix dependencies attempted."
# === Start New Test Section: Validate/Fix Bad Dependencies ===
log_step "Intentionally adding non-existent dependency (1 -> 999)"
task-master add-dependency --id=1 --depends-on=999 || log_error "Failed to add non-existent dependency (unexpected)"
# Don't exit even if the above fails, the goal is to test validation
log_success "Attempted to add dependency 1 -> 999."
log_step "Validating dependencies (expecting non-existent error)"
task-master validate-dependencies > validate_deps_non_existent.log 2>&1 || true # Allow command to fail without exiting script
if grep -q "Non-existent dependency ID: 999" validate_deps_non_existent.log; then
log_success "Validation correctly identified non-existent dependency 999."
else
log_error "Validation DID NOT report non-existent dependency 999 as expected. Check validate_deps_non_existent.log"
fi
log_step "Fixing dependencies (should remove 1 -> 999)"
task-master fix-dependencies > fix_deps_after_non_existent.log
log_success "Attempted to fix dependencies."
log_step "Validating dependencies (after fix)"
task-master validate-dependencies > validate_deps_after_fix_non_existent.log 2>&1 || true # Allow potential failure
if grep -q "Non-existent dependency ID: 999" validate_deps_after_fix_non_existent.log; then
log_error "Validation STILL reports non-existent dependency 999 after fix. Check logs."
else
log_success "Validation shows non-existent dependency 999 was removed."
fi
log_step "Intentionally adding circular dependency (4 -> 5 -> 4)"
task-master add-dependency --id=4 --depends-on=5 || log_error "Failed to add dependency 4->5"
task-master add-dependency --id=5 --depends-on=4 || log_error "Failed to add dependency 5->4"
log_success "Attempted to add dependencies 4 -> 5 and 5 -> 4."
log_step "Validating dependencies (expecting circular error)"
task-master validate-dependencies > validate_deps_circular.log 2>&1 || true # Allow command to fail
# Note: Adjust the grep pattern based on the EXACT error message from validate-dependencies
if grep -q -E "Circular dependency detected involving task IDs: (4, 5|5, 4)" validate_deps_circular.log; then
log_success "Validation correctly identified circular dependency between 4 and 5."
else
log_error "Validation DID NOT report circular dependency 4<->5 as expected. Check validate_deps_circular.log"
fi
log_step "Fixing dependencies (should remove one side of 4 <-> 5)"
task-master fix-dependencies > fix_deps_after_circular.log
log_success "Attempted to fix dependencies."
log_step "Validating dependencies (after fix circular)"
task-master validate-dependencies > validate_deps_after_fix_circular.log 2>&1 || true # Allow potential failure
if grep -q -E "Circular dependency detected involving task IDs: (4, 5|5, 4)" validate_deps_after_fix_circular.log; then
log_error "Validation STILL reports circular dependency 4<->5 after fix. Check logs."
else
log_success "Validation shows circular dependency 4<->5 was resolved."
fi
# === End New Test Section ===
# Find the next available task ID dynamically instead of hardcoding 11, 12
# Assuming tasks are added sequentially and we didn't remove any core tasks yet
last_task_id=$(jq '[.master.tasks[].id] | max' .taskmaster/tasks/tasks.json)
manual_task_id=$((last_task_id + 1))
ai_task_id=$((manual_task_id + 1))
log_step "Adding Task $manual_task_id (Manual)"
task-master add-task --title="Manual E2E Task" --description="Add basic health check endpoint" --priority=low --dependencies=3 # Depends on backend setup
log_success "Added Task $manual_task_id manually."
log_step "Adding Task $ai_task_id (AI)"
cmd_output_add_ai=$(task-master add-task --prompt="Implement basic UI styling using CSS variables for colors and spacing" --priority=medium --dependencies=1 2>&1)
exit_status_add_ai=$?
echo "$cmd_output_add_ai"
extract_and_sum_cost "$cmd_output_add_ai"
if [ $exit_status_add_ai -ne 0 ]; then
log_error "Adding AI Task $ai_task_id failed. Exit status: $exit_status_add_ai"
else
log_success "Added Task $ai_task_id via AI prompt."
fi
log_step "Updating Task 3 (update-task AI)"
cmd_output_update_task3=$(task-master update-task --id=3 --prompt="Update backend server setup: Ensure CORS is configured to allow requests from the frontend origin." 2>&1)
exit_status_update_task3=$?
echo "$cmd_output_update_task3"
extract_and_sum_cost "$cmd_output_update_task3"
if [ $exit_status_update_task3 -ne 0 ]; then
log_error "Updating Task 3 failed. Exit status: $exit_status_update_task3"
else
log_success "Attempted update for Task 3."
fi
log_step "Updating Tasks from Task 5 (update AI)"
cmd_output_update_from5=$(task-master update --from=5 --prompt="Refactor the backend storage module to use a simple JSON file (storage.json) instead of an in-memory object for persistence. Update relevant tasks." 2>&1)
exit_status_update_from5=$?
echo "$cmd_output_update_from5"
extract_and_sum_cost "$cmd_output_update_from5"
if [ $exit_status_update_from5 -ne 0 ]; then
log_error "Updating from Task 5 failed. Exit status: $exit_status_update_from5"
else
log_success "Attempted update from Task 5 onwards."
fi
log_step "Expanding Task 8 (AI)"
cmd_output_expand8=$(task-master expand --id=8 2>&1)
exit_status_expand8=$?
echo "$cmd_output_expand8"
extract_and_sum_cost "$cmd_output_expand8"
if [ $exit_status_expand8 -ne 0 ]; then
log_error "Expanding Task 8 failed. Exit status: $exit_status_expand8"
else
log_success "Attempted to expand Task 8."
fi
log_step "Updating Subtask 8.1 (update-subtask AI)"
cmd_output_update_subtask81=$(task-master update-subtask --id=8.1 --prompt="Implementation note: Remember to handle potential API errors and display a user-friendly message." 2>&1)
exit_status_update_subtask81=$?
echo "$cmd_output_update_subtask81"
extract_and_sum_cost "$cmd_output_update_subtask81"
if [ $exit_status_update_subtask81 -ne 0 ]; then
log_error "Updating Subtask 8.1 failed. Exit status: $exit_status_update_subtask81"
else
log_success "Attempted update for Subtask 8.1."
fi
# Add a couple more subtasks for multi-remove test
log_step 'Adding subtasks to Task 2 (for multi-remove test)'
task-master add-subtask --parent=2 --title="Subtask 2.1 for removal"
task-master add-subtask --parent=2 --title="Subtask 2.2 for removal"
log_success "Added subtasks 2.1 and 2.2."
log_step "Removing Subtasks 2.1 and 2.2 (multi-ID)"
task-master remove-subtask --id=2.1,2.2
log_success "Removed subtasks 2.1 and 2.2."
log_step "Setting status for Task 1 to done"
task-master set-status --id=1 --status=done
log_success "Set status for Task 1 to done."
log_step "Getting next task (after status change)"
task-master next > next_task_after_change_core.log
log_success "Next task after change saved."
# === Start New Test Section: List Filtering ===
log_step "Listing tasks filtered by status 'done'"
task-master list --status=done > task_list_status_done.log
log_success "Filtered list saved to task_list_status_done.log (Manual/LLM check recommended)"
# Optional assertion: Check if Task 1 ID exists and Task 2 ID does NOT
# if grep -q "^1\." task_list_status_done.log && ! grep -q "^2\." task_list_status_done.log; then
# log_success "Basic check passed: Task 1 found, Task 2 not found in 'done' list."
# else
# log_error "Basic check failed for list --status=done."
# fi
# === End New Test Section ===
log_step "Clearing subtasks from Task 8"
task-master clear-subtasks --id=8
log_success "Attempted to clear subtasks from Task 8."
log_step "Removing Tasks $manual_task_id and $ai_task_id (multi-ID)"
# Remove the tasks we added earlier
task-master remove-task --id="$manual_task_id,$ai_task_id" -y
log_success "Removed tasks $manual_task_id and $ai_task_id."
# === Start New Test Section: Subtasks & Dependencies ===
log_step "Expanding Task 2 (to ensure multiple tasks have subtasks)"
task-master expand --id=2 # Expand task 2: Backend setup
log_success "Attempted to expand Task 2."
log_step "Listing tasks with subtasks (Before Clear All)"
task-master list --with-subtasks > task_list_before_clear_all.log
log_success "Task list before clear-all saved."
log_step "Clearing ALL subtasks"
task-master clear-subtasks --all
log_success "Attempted to clear all subtasks."
log_step "Listing tasks with subtasks (After Clear All)"
task-master list --with-subtasks > task_list_after_clear_all.log
log_success "Task list after clear-all saved. (Manual/LLM check recommended to verify subtasks removed)"
log_step "Expanding Task 3 again (to have subtasks for next test)"
task-master expand --id=3
log_success "Attempted to expand Task 3."
# Verify 3.1 exists
if ! jq -e '.master.tasks[] | select(.id == 3) | .subtasks[] | select(.id == 1)' .taskmaster/tasks/tasks.json > /dev/null; then
log_error "Subtask 3.1 not found in tasks.json after expanding Task 3."
exit 1
fi
log_step "Adding dependency: Task 4 depends on Subtask 3.1"
task-master add-dependency --id=4 --depends-on=3.1
log_success "Added dependency 4 -> 3.1."
log_step "Showing Task 4 details (after adding subtask dependency)"
task-master show 4 > task_4_details_after_dep_add.log
log_success "Task 4 details saved. (Manual/LLM check recommended for dependency [3.1])"
log_step "Removing dependency: Task 4 depends on Subtask 3.1"
task-master remove-dependency --id=4 --depends-on=3.1
log_success "Removed dependency 4 -> 3.1."
log_step "Showing Task 4 details (after removing subtask dependency)"
task-master show 4 > task_4_details_after_dep_remove.log
log_success "Task 4 details saved. (Manual/LLM check recommended to verify dependency removed)"
# === End New Test Section ===
log_step "Generating task files (final)"
task-master generate
log_success "Generated task files."
# === End Core Task Commands Test ===
# === AI Commands (Re-test some after changes) ===
log_step "Analyzing complexity (AI with Research - Final Check)"
cmd_output_analyze_final=$(task-master analyze-complexity --research --output complexity_results_final.json 2>&1)
exit_status_analyze_final=$?
echo "$cmd_output_analyze_final"
extract_and_sum_cost "$cmd_output_analyze_final"
if [ $exit_status_analyze_final -ne 0 ] || [ ! -f "complexity_results_final.json" ]; then
log_error "Final Complexity analysis failed. Exit status: $exit_status_analyze_final. File found: $(test -f complexity_results_final.json && echo true || echo false)"
exit 1 # Critical for subsequent report step
else
log_success "Final Complexity analysis command executed and file created."
fi
log_step "Generating complexity report (Non-AI - Final Check)"
task-master complexity-report --file complexity_results_final.json > complexity_report_formatted_final.log
log_success "Final Formatted complexity report saved."
# === End AI Commands Re-test ===
log_step "Listing tasks again (final)"
task-master list --with-subtasks > task_list_final.log
log_success "Final task list saved to task_list_final.log"
# --- Test Completion (Output to tee) ---
log_step "E2E Test Steps Completed"
echo ""
ABS_TEST_RUN_DIR="$(pwd)"
echo "Test artifacts and logs are located in: $ABS_TEST_RUN_DIR"
echo "Key artifact files (within above dir):"
ls -1 # List files in the current directory
echo ""
echo "Full script log also available at: $LOG_FILE (relative to project root)"
# Optional: cd back to original directory
# cd "$ORIGINAL_DIR"
# End of the main execution block brace
} 2>&1 | tee "$LOG_FILE"
# --- Final Terminal Message ---
EXIT_CODE=${PIPESTATUS[0]}
overall_end_time=$(date +%s)
total_elapsed_seconds=$((overall_end_time - overall_start_time))
# Format total duration
total_minutes=$((total_elapsed_seconds / 60))
total_sec_rem=$((total_elapsed_seconds % 60))
formatted_total_time=$(printf "%dm%02ds" "$total_minutes" "$total_sec_rem")
# Count steps and successes from the log file *after* the pipe finishes
# Use grep -c for counting lines matching the pattern
# Corrected pattern to match ' STEP X:' format
final_step_count=$(grep -c '^[[:space:]]\+STEP [0-9]\+:' "$LOG_FILE" || true)
final_success_count=$(grep -c '\[SUCCESS\]' "$LOG_FILE" || true) # Count lines containing [SUCCESS]
echo "--- E2E Run Summary ---"
echo "Log File: $LOG_FILE"
echo "Total Elapsed Time: ${formatted_total_time}"
echo "Total Steps Executed: ${final_step_count}" # Use count from log
if [ $EXIT_CODE -eq 0 ]; then
echo "Status: SUCCESS"
# Use counts from log file
echo "Successful Steps: ${final_success_count}/${final_step_count}"
else
echo "Status: FAILED"
# Use count from log file for total steps attempted
echo "Failure likely occurred during/after Step: ${final_step_count}"
# Use count from log file for successes before failure
echo "Successful Steps Before Failure: ${final_success_count}"
echo "Please check the log file '$LOG_FILE' for error details."
fi
echo "-------------------------"
# --- Attempt LLM Analysis ---
# Run this *after* the main execution block and tee pipe finish writing the log file
if [ -d "$TEST_RUN_DIR" ]; then
# Define absolute path to source dir if not already defined (though it should be by setup)
TASKMASTER_SOURCE_DIR_ABS=${TASKMASTER_SOURCE_DIR_ABS:-$(cd "$ORIGINAL_DIR/$TASKMASTER_SOURCE_DIR" && pwd)}
cd "$TEST_RUN_DIR"
# Pass the absolute source directory path
analyze_log_with_llm "$LOG_FILE" "$TASKMASTER_SOURCE_DIR_ABS"
ANALYSIS_EXIT_CODE=$? # Capture the exit code of the analysis function
# Optional: cd back again if needed
cd "$ORIGINAL_DIR" # Ensure we change back to the original directory
else
formatted_duration_for_error=$(_format_duration "$total_elapsed_seconds")
echo "[ERROR] [$formatted_duration_for_error] $(date +"%Y-%m-%d %H:%M:%S") Test run directory $TEST_RUN_DIR not found. Cannot perform LLM analysis." >&2
fi
# Final cost formatting
formatted_total_e2e_cost=$(printf "%.6f" "$total_e2e_cost")
echo "Total E2E AI Cost: $formatted_total_e2e_cost USD"
exit $EXIT_CODE