fix merge conflicts to prep for merge with branch next

- Enhance E2E testing and LLM analysis report and: - Add --analyze-log flag to run_e2e.sh to re-run LLM analysis on existing logs. - Add test:e2e and analyze-log scripts to package.json for easier execution. - Correct display errors and dependency validation output: - Update chalk usage in add-task.js to use bracket notation (chalk[color]) compatible with v5, resolving 'chalk.keyword is not a function' error. - Modify fix-dependencies command output to show red failure box with issue count instead of green success box when validation fails. - Refactor interactive model setup: - Verify inclusion of 'No change' option during interactive model setup flow (task-master models --setup). - Update model definitions: - Add max_tokens field for gpt-4o in supported-models.json. - Remove unused scripts: - Delete prepare-package.js and rule-transformer.test.js. Release candidate
2025-04-29 01:54:42 -04:00
parent 4cf7e8a74a e69a47d382
commit d2f761c652
48 changed files with 4744 additions and 1237 deletions
--- a/tests/e2e/e2e_helpers.sh
+++ b/tests/e2e/e2e_helpers.sh
@@ -0,0 +1,162 @@
+#!/bin/bash
+
+# --- LLM Analysis Helper Function ---
+# This function should be sourced by the main E2E script or test scripts.
+# It requires curl and jq to be installed.
+# It expects the project root path to be passed as the second argument.
+
+analyze_log_with_llm() {
+  local log_file="$1"
+  local project_root="$2" # Expect project root as the second argument
+
+  if [ -z "$project_root" ]; then
+      echo "[HELPER_ERROR] Project root argument is missing. Skipping LLM analysis." >&2
+      return 1
+  fi
+
+  local env_file="${project_root}/.env" # Path to .env in project root
+
+  local provider_summary_log="provider_add_task_summary.log" # File summarizing provider test outcomes
+  local api_key=""
+  # !!! IMPORTANT: Replace with your actual Claude API endpoint if different !!!
+  local api_endpoint="https://api.anthropic.com/v1/messages"
+  # !!! IMPORTANT: Ensure this matches the variable name in your .env file !!!
+  local api_key_name="ANTHROPIC_API_KEY"
+
+  echo "" # Add a newline before analysis starts
+
+  # Check for jq and curl
+  if ! command -v jq &> /dev/null; then
+    echo "[HELPER_ERROR] LLM Analysis requires 'jq'. Skipping analysis." >&2
+    return 1
+  fi
+  if ! command -v curl &> /dev/null; then
+    echo "[HELPER_ERROR] LLM Analysis requires 'curl'. Skipping analysis." >&2
+    return 1
+  fi
+
+  # Check for API Key in the PROJECT ROOT's .env file
+  if [ -f "$env_file" ]; then
+    # Original assignment - Reading from project root .env
+    api_key=$(grep "^${api_key_name}=" "$env_file" | sed -e "s/^${api_key_name}=//" -e 's/^[[:space:]"]*//' -e 's/[[:space:]"]*$//')
+  fi
+
+  if [ -z "$api_key" ]; then
+    echo "[HELPER_ERROR] ${api_key_name} not found or empty in project root .env file ($env_file). Skipping LLM analysis." >&2 # Updated error message
+    return 1
+  fi
+
+  # Log file path is passed as argument, need to ensure it exists relative to where the script *calling* this function is, OR use absolute path.
+  # Assuming absolute path or path relative to the initial PWD for simplicity here.
+  # The calling script passes the correct path relative to the original PWD.
+  if [ ! -f "$log_file" ]; then
+    echo "[HELPER_ERROR] Log file not found: $log_file (PWD: $(pwd)). Check path passed to function. Skipping LLM analysis." >&2 # Updated error
+    return 1
+  fi
+
+  local log_content
+  # Read entire file, handle potential errors
+  log_content=$(cat "$log_file") || {
+    echo "[HELPER_ERROR] Failed to read log file: $log_file. Skipping LLM analysis." >&2
+    return 1
+  }
+
+  # Prepare the prompt using a quoted heredoc for literal interpretation
+  read -r -d '' prompt_template <<'EOF'
+Analyze the following E2E test log for the task-master tool. The log contains output from various 'task-master' commands executed sequentially.
+
+Your goal is to:
+1. Verify if the key E2E steps completed successfully based on the log messages (e.g., init, parse PRD, list tasks, analyze complexity, expand task, set status, manage models, add/remove dependencies, add/update/remove tasks/subtasks, generate files).
+2. **Specifically analyze the Multi-Provider Add-Task Test Sequence:**
+   a. Identify which providers were tested for `add-task`. Look for log steps like "Testing Add-Task with Provider: ..." and the summary log 'provider_add_task_summary.log'.
+   b. For each tested provider, determine if `add-task` succeeded or failed. Note the created task ID if successful.
+   c. Review the corresponding `add_task_show_output_<provider>_id_<id>.log` file (if created) for each successful `add-task` execution.
+   d. **Compare the quality and completeness** of the task generated by each successful provider based on their `show` output. Assign a score (e.g., 1-10, 10 being best) based on relevance to the prompt, detail level, and correctness.
+   e. Note any providers where `add-task` failed or where the task ID could not be extracted.
+3. Identify any general explicit "[ERROR]" messages or stack traces throughout the *entire* log.
+4. Identify any potential warnings or unusual output that might indicate a problem even if not marked as an explicit error.
+5. Provide an overall assessment of the test run's health based *only* on the log content.
+
+Return your analysis **strictly** in the following JSON format. Do not include any text outside of the JSON structure:
+
+{
+  "overall_status": "Success|Failure|Warning",
+  "verified_steps": [ "Initialization", "PRD Parsing", /* ...other general steps observed... */ ],
+  "provider_add_task_comparison": {
+     "prompt_used": "... (extract from log if possible or state 'standard auth prompt') ...",
+     "provider_results": {
+       "anthropic": { "status": "Success|Failure|ID_Extraction_Failed|Set_Model_Failed", "task_id": "...", "score": "X/10 | N/A", "notes": "..." },
+       "openai": { "status": "Success|Failure|...", "task_id": "...", "score": "X/10 | N/A", "notes": "..." },
+       /* ... include all tested providers ... */
+     },
+     "comparison_summary": "Brief overall comparison of generated tasks..."
+   },
+  "detected_issues": [ { "severity": "Error|Warning|Anomaly", "description": "...", "log_context": "[Optional, short snippet from log near the issue]" } ],
+  "llm_summary_points": [ "Overall summary point 1", "Provider comparison highlight", "Any major issues noted" ]
+}
+
+Here is the main log content:
+
+%s
+EOF
+# Note: The final %s is a placeholder for printf later
+
+  local full_prompt
+  # Use printf to substitute the log content into the %s placeholder
+  if ! printf -v full_prompt "$prompt_template" "$log_content"; then
+    echo "[HELPER_ERROR] Failed to format prompt using printf." >&2
+    # It's unlikely printf itself fails, but good practice
+    return 1
+  fi
+
+  # Construct the JSON payload for Claude Messages API
+  local payload
+  payload=$(jq -n --arg prompt "$full_prompt" '{
+    "model": "claude-3-haiku-20240307",  # Using Haiku for faster/cheaper testing
+    "max_tokens": 3072, # Increased slightly
+    "messages": [
+      {"role": "user", "content": $prompt}
+    ]
+    # "temperature": 0.0 # Optional: Lower temperature for more deterministic JSON output
+  }') || {
+      echo "[HELPER_ERROR] Failed to create JSON payload using jq." >&2
+      return 1
+  }
+
+  local response_raw response_http_code response_body
+  # Capture body and HTTP status code separately
+  response_raw=$(curl -s -w "\nHTTP_STATUS_CODE:%{http_code}" -X POST "$api_endpoint" \
+       -H "Content-Type: application/json" \
+       -H "x-api-key: $api_key" \
+       -H "anthropic-version: 2023-06-01" \
+       --data "$payload")
+
+  # Extract status code and body
+  response_http_code=$(echo "$response_raw" | grep '^HTTP_STATUS_CODE:' | sed 's/HTTP_STATUS_CODE://')
+  response_body=$(echo "$response_raw" | sed '$d') # Remove last line (status code)
+
+  if [ "$response_http_code" != "200" ]; then
+      echo "[HELPER_ERROR] LLM API call failed with HTTP status $response_http_code." >&2
+      echo "[HELPER_ERROR] Response Body: $response_body" >&2
+      return 1
+  fi
+
+  if [ -z "$response_body" ]; then
+      echo "[HELPER_ERROR] LLM API call returned empty response body." >&2
+      return 1
+  fi
+
+  # Pipe the raw response body directly to the Node.js parser script
+  if echo "$response_body" | node "${project_root}/tests/e2e/parse_llm_output.cjs" "$log_file"; then
+      echo "[HELPER_SUCCESS] LLM analysis parsed and printed successfully by Node.js script."
+      return 0 # Success
+  else
+      local node_exit_code=$?
+      echo "[HELPER_ERROR] Node.js parsing script failed with exit code ${node_exit_code}."
+      echo "[HELPER_ERROR] Raw API response body (first 500 chars): $(echo "$response_body" | head -c 500)"
+      return 1 # Failure
+  fi
+}
+
+# Export the function so it might be available to subshells if sourced
+export -f analyze_log_with_llm 
--- a/tests/e2e/parse_llm_output.cjs
+++ b/tests/e2e/parse_llm_output.cjs
@@ -0,0 +1,266 @@
+#!/usr/bin/env node
+
+// Note: We will use dynamic import() inside the async callback due to project being type: module
+
+const readline = require('readline');
+const path = require('path'); // Import path module
+
+let inputData = '';
+
+const rl = readline.createInterface({
+	input: process.stdin,
+	output: process.stdout,
+	terminal: false
+});
+
+rl.on('line', (line) => {
+	inputData += line;
+});
+
+// Make the callback async to allow await for dynamic imports
+rl.on('close', async () => {
+	let chalk, boxen, Table;
+	try {
+		// Dynamically import libraries
+		chalk = (await import('chalk')).default;
+		boxen = (await import('boxen')).default;
+		Table = (await import('cli-table3')).default;
+
+		// 1. Parse the initial API response body
+		const apiResponse = JSON.parse(inputData);
+
+		// 2. Extract the text content containing the nested JSON
+		// Robust check for content structure
+		const textContent = apiResponse?.content?.[0]?.text;
+		if (!textContent) {
+			console.error(
+				chalk.red(
+					"Error: Could not find '.content[0].text' in the API response JSON."
+				)
+			);
+			process.exit(1);
+		}
+
+		// 3. Find the start of the actual JSON block
+		const jsonStart = textContent.indexOf('{');
+		const jsonEnd = textContent.lastIndexOf('}');
+
+		if (jsonStart === -1 || jsonEnd === -1 || jsonEnd < jsonStart) {
+			console.error(
+				chalk.red(
+					'Error: Could not find JSON block starting with { and ending with } in the extracted text content.'
+				)
+			);
+			process.exit(1);
+		}
+		const jsonString = textContent.substring(jsonStart, jsonEnd + 1);
+
+		// 4. Parse the extracted JSON string
+		let reportData;
+		try {
+			reportData = JSON.parse(jsonString);
+		} catch (parseError) {
+			console.error(
+				chalk.red('Error: Failed to parse the extracted JSON block.')
+			);
+			console.error(chalk.red('Parse Error:'), parseError.message);
+			process.exit(1);
+		}
+
+		// Ensure reportData is an object
+		if (typeof reportData !== 'object' || reportData === null) {
+			console.error(
+				chalk.red('Error: Parsed report data is not a valid object.')
+			);
+			process.exit(1);
+		}
+
+		// --- Get Log File Path and Format Timestamp ---
+		const logFilePath = process.argv[2]; // Get the log file path argument
+		let formattedTime = 'Unknown';
+		if (logFilePath) {
+			const logBasename = path.basename(logFilePath);
+			const timestampMatch = logBasename.match(/e2e_run_(\d{8}_\d{6})\.log$/);
+			if (timestampMatch && timestampMatch[1]) {
+				const ts = timestampMatch[1]; // YYYYMMDD_HHMMSS
+				// Format into YYYY-MM-DD HH:MM:SS
+				formattedTime = `${ts.substring(0, 4)}-${ts.substring(4, 6)}-${ts.substring(6, 8)} ${ts.substring(9, 11)}:${ts.substring(11, 13)}:${ts.substring(13, 15)}`;
+			}
+		}
+		// --------------------------------------------
+
+		// 5. Generate CLI Report (with defensive checks)
+		console.log(
+			'\n' +
+				chalk.cyan.bold(
+					boxen(
+						`TASKMASTER E2E Log Analysis Report\nRun Time: ${chalk.yellow(formattedTime)}`, // Display formatted time
+						{
+							padding: 1,
+							borderStyle: 'double',
+							borderColor: 'cyan',
+							textAlign: 'center' // Center align title
+						}
+					)
+				) +
+				'\n'
+		);
+
+		// Overall Status
+		let statusColor = chalk.white;
+		const overallStatus = reportData.overall_status || 'Unknown'; // Default if missing
+		if (overallStatus === 'Success') statusColor = chalk.green.bold;
+		if (overallStatus === 'Warning') statusColor = chalk.yellow.bold;
+		if (overallStatus === 'Failure') statusColor = chalk.red.bold;
+		console.log(
+			boxen(`Overall Status: ${statusColor(overallStatus)}`, {
+				padding: { left: 1, right: 1 },
+				margin: { bottom: 1 },
+				borderColor: 'blue'
+			})
+		);
+
+		// LLM Summary Points
+		console.log(chalk.blue.bold('📋 Summary Points:'));
+		if (
+			Array.isArray(reportData.llm_summary_points) &&
+			reportData.llm_summary_points.length > 0
+		) {
+			reportData.llm_summary_points.forEach((point) => {
+				console.log(chalk.white(`  - ${point || 'N/A'}`)); // Handle null/undefined points
+			});
+		} else {
+			console.log(chalk.gray('  No summary points provided.'));
+		}
+		console.log();
+
+		// Verified Steps
+		console.log(chalk.green.bold('✅ Verified Steps:'));
+		if (
+			Array.isArray(reportData.verified_steps) &&
+			reportData.verified_steps.length > 0
+		) {
+			reportData.verified_steps.forEach((step) => {
+				console.log(chalk.green(`  - ${step || 'N/A'}`)); // Handle null/undefined steps
+			});
+		} else {
+			console.log(chalk.gray('  No verified steps listed.'));
+		}
+		console.log();
+
+		// Provider Add-Task Comparison
+		console.log(chalk.magenta.bold('🔄 Provider Add-Task Comparison:'));
+		const comp = reportData.provider_add_task_comparison;
+		if (typeof comp === 'object' && comp !== null) {
+			console.log(
+				chalk.white(`  Prompt Used: ${comp.prompt_used || 'Not specified'}`)
+			);
+			console.log();
+
+			if (
+				typeof comp.provider_results === 'object' &&
+				comp.provider_results !== null &&
+				Object.keys(comp.provider_results).length > 0
+			) {
+				const providerTable = new Table({
+					head: ['Provider', 'Status', 'Task ID', 'Score', 'Notes'].map((h) =>
+						chalk.magenta.bold(h)
+					),
+					colWidths: [15, 18, 10, 12, 45],
+					style: { head: [], border: [] },
+					wordWrap: true
+				});
+
+				for (const provider in comp.provider_results) {
+					const result = comp.provider_results[provider] || {}; // Default to empty object if provider result is null/undefined
+					const status = result.status || 'Unknown';
+					const isSuccess = status === 'Success';
+					const statusIcon = isSuccess ? chalk.green('✅') : chalk.red('❌');
+					const statusText = isSuccess
+						? chalk.green(status)
+						: chalk.red(status);
+					providerTable.push([
+						chalk.white(provider),
+						`${statusIcon} ${statusText}`,
+						chalk.white(result.task_id || 'N/A'),
+						chalk.white(result.score || 'N/A'),
+						chalk.dim(result.notes || 'N/A')
+					]);
+				}
+				console.log(providerTable.toString());
+				console.log();
+			} else {
+				console.log(chalk.gray('  No provider results available.'));
+				console.log();
+			}
+			console.log(chalk.white.bold(`  Comparison Summary:`));
+			console.log(chalk.white(`  ${comp.comparison_summary || 'N/A'}`));
+		} else {
+			console.log(chalk.gray('  Provider comparison data not found.'));
+		}
+		console.log();
+
+		// Detected Issues
+		console.log(chalk.red.bold('🚨 Detected Issues:'));
+		if (
+			Array.isArray(reportData.detected_issues) &&
+			reportData.detected_issues.length > 0
+		) {
+			reportData.detected_issues.forEach((issue, index) => {
+				if (typeof issue !== 'object' || issue === null) return; // Skip invalid issue entries
+
+				const severity = issue.severity || 'Unknown';
+				let boxColor = 'blue';
+				let icon = 'ℹ️';
+				if (severity === 'Error') {
+					boxColor = 'red';
+					icon = '❌';
+				}
+				if (severity === 'Warning') {
+					boxColor = 'yellow';
+					icon = '⚠️';
+				}
+
+				let issueContent = `${chalk.bold('Description:')} ${chalk.white(issue.description || 'N/A')}`;
+				// Only add log context if it exists and is not empty
+				if (issue.log_context && String(issue.log_context).trim()) {
+					issueContent += `\n${chalk.bold('Log Context:')} \n${chalk.dim(String(issue.log_context).trim())}`;
+				}
+
+				console.log(
+					boxen(issueContent, {
+						title: `${icon} Issue ${index + 1}: [${severity}]`,
+						padding: 1,
+						margin: { top: 1, bottom: 0 },
+						borderColor: boxColor,
+						borderStyle: 'round'
+					})
+				);
+			});
+			console.log(); // Add final newline if issues exist
+		} else {
+			console.log(chalk.green('  No specific issues detected by the LLM.'));
+		}
+		console.log();
+
+		console.log(chalk.cyan.bold('========================================'));
+		console.log(chalk.cyan.bold('          End of LLM Report'));
+		console.log(chalk.cyan.bold('========================================\n'));
+	} catch (error) {
+		// Ensure chalk is available for error reporting, provide fallback
+		const errorChalk = chalk || { red: (t) => t, yellow: (t) => t };
+		console.error(
+			errorChalk.red('Error processing LLM response:'),
+			error.message
+		);
+		// Avoid printing potentially huge inputData here unless necessary for debugging
+		// console.error(errorChalk.yellow('Raw input data (first 500 chars):'), inputData.substring(0, 500));
+		process.exit(1);
+	}
+});
+
+// Handle potential errors during stdin reading
+process.stdin.on('error', (err) => {
+	console.error('Error reading standard input:', err);
+	process.exit(1);
+});
--- a/tests/e2e/run_e2e.sh
+++ b/tests/e2e/run_e2e.sh
@@ -18,6 +18,58 @@ SAMPLE_PRD_SOURCE="$TASKMASTER_SOURCE_DIR/tests/fixtures/sample-prd.txt"
 MAIN_ENV_FILE="$TASKMASTER_SOURCE_DIR/.env"
 # ---

+# <<< Source the helper script >>>
+source "$TASKMASTER_SOURCE_DIR/tests/e2e/e2e_helpers.sh"
+
+# --- Argument Parsing for Analysis-Only Mode ---
+if [ "$#" -ge 2 ] && [ "$1" == "--analyze-log" ]; then
+  LOG_TO_ANALYZE="$2"
+  # Ensure the log path is absolute
+  if [[ "$LOG_TO_ANALYZE" != /* ]]; then
+    LOG_TO_ANALYZE="$(pwd)/$LOG_TO_ANALYZE"
+  fi
+  echo "[INFO] Running in analysis-only mode for log: $LOG_TO_ANALYZE"
+
+  # --- Derive TEST_RUN_DIR from log file path --- 
+  # Extract timestamp like YYYYMMDD_HHMMSS from e2e_run_YYYYMMDD_HHMMSS.log
+  log_basename=$(basename "$LOG_TO_ANALYZE")
+  timestamp_match=$(echo "$log_basename" | sed -n 's/^e2e_run_\([0-9]\{8\}_[0-9]\{6\}\).log$/\1/p')
+
+  if [ -z "$timestamp_match" ]; then
+    echo "[ERROR] Could not extract timestamp from log file name: $log_basename" >&2
+    echo "[ERROR] Expected format: e2e_run_YYYYMMDD_HHMMSS.log" >&2
+    exit 1
+  fi
+
+  # Construct the expected run directory path relative to project root
+  EXPECTED_RUN_DIR="$TASKMASTER_SOURCE_DIR/tests/e2e/_runs/run_$timestamp_match"
+  # Make it absolute
+  EXPECTED_RUN_DIR_ABS="$(cd "$TASKMASTER_SOURCE_DIR" && pwd)/tests/e2e/_runs/run_$timestamp_match"
+
+  if [ ! -d "$EXPECTED_RUN_DIR_ABS" ]; then
+    echo "[ERROR] Corresponding test run directory not found: $EXPECTED_RUN_DIR_ABS" >&2
+    exit 1
+  fi
+
+  # Save original dir before changing
+  ORIGINAL_DIR=$(pwd)
+  
+  echo "[INFO] Changing directory to $EXPECTED_RUN_DIR_ABS for analysis context..."
+  cd "$EXPECTED_RUN_DIR_ABS"
+
+  # Call the analysis function (sourced from helpers)
+  echo "[INFO] Calling analyze_log_with_llm function..."
+  analyze_log_with_llm "$LOG_TO_ANALYZE" "$(cd "$ORIGINAL_DIR/$TASKMASTER_SOURCE_DIR" && pwd)" # Pass absolute project root
+  ANALYSIS_EXIT_CODE=$?
+
+  # Return to original directory
+  cd "$ORIGINAL_DIR"
+  exit $ANALYSIS_EXIT_CODE
+fi
+# --- End Analysis-Only Mode Logic ---
+
+# --- Normal Execution Starts Here (if not in analysis-only mode) ---
+
 # --- Test State Variables ---
 # Note: These are mainly for step numbering within the log now, not for final summary
 test_step_count=0
@@ -29,7 +81,8 @@ start_time_for_helpers=0 # Separate start time for helper functions inside the p
 mkdir -p "$LOG_DIR"
 # Define timestamped log file path
 TIMESTAMP=$(date +"%Y%m%d_%H%M%S")
-LOG_FILE="$LOG_DIR/e2e_run_$TIMESTAMP.log"
+# <<< Use pwd to create an absolute path >>>
+LOG_FILE="$(pwd)/$LOG_DIR/e2e_run_$TIMESTAMP"

 # Define and create the test run directory *before* the main pipe
 mkdir -p "$BASE_TEST_DIR" # Ensure base exists first
@@ -44,167 +97,53 @@ echo "--- Starting E2E Run ---" # Separator before piped output starts
 # Record start time for overall duration *before* the pipe
 overall_start_time=$(date +%s)

+# ==========================================
+# >>> MOVE FUNCTION DEFINITION HERE <<<
+# --- Helper Functions (Define globally) ---
+_format_duration() {
+  local total_seconds=$1
+  local minutes=$((total_seconds / 60))
+  local seconds=$((total_seconds % 60))
+  printf "%dm%02ds" "$minutes" "$seconds"
+}
+
+# Note: This relies on 'overall_start_time' being set globally before the function is called
+_get_elapsed_time_for_log() {
+  local current_time=$(date +%s)
+  # Use overall_start_time here, as start_time_for_helpers might not be relevant globally
+  local elapsed_seconds=$((current_time - overall_start_time))
+  _format_duration "$elapsed_seconds"
+}
+
+log_info() {
+  echo "[INFO] [$(_get_elapsed_time_for_log)] $(date +"%Y-%m-%d %H:%M:%S") $1"
+}
+
+log_success() {
+  echo "[SUCCESS] [$(_get_elapsed_time_for_log)] $(date +"%Y-%m-%d %H:%M:%S") $1"
+}
+
+log_error() {
+  echo "[ERROR] [$(_get_elapsed_time_for_log)] $(date +"%Y-%m-%d %H:%M:%S") $1" >&2
+}
+
+log_step() {
+  test_step_count=$((test_step_count + 1))
+  echo ""
+  echo "============================================="
+  echo "  STEP ${test_step_count}: [$(_get_elapsed_time_for_log)] $(date +"%Y-%m-%d %H:%M:%S") $1"
+  echo "============================================="
+}
+
+# ==========================================
+
 # --- Main Execution Block (Piped to tee) ---
 # Wrap the main part of the script in braces and pipe its output (stdout and stderr) to tee
 {
-  # Record start time for helper functions *inside* the pipe
-  start_time_for_helpers=$(date +%s)
-
-  # --- Helper Functions (Output will now go to tee -> terminal & log file) ---
-  _format_duration() {
-    local total_seconds=$1
-    local minutes=$((total_seconds / 60))
-    local seconds=$((total_seconds % 60))
-    printf "%dm%02ds" "$minutes" "$seconds"
-  }
-
-  _get_elapsed_time_for_log() {
-    local current_time=$(date +%s)
-    local elapsed_seconds=$((current_time - start_time_for_helpers))
-    _format_duration "$elapsed_seconds"
-  }
-
-  log_info() {
-    echo "[INFO] [$(_get_elapsed_time_for_log)] $(date +"%Y-%m-%d %H:%M:%S") $1"
-  }
-
-  log_success() {
-    # We no longer increment success_step_count here for the final summary
-    echo "[SUCCESS] [$(_get_elapsed_time_for_log)] $(date +"%Y-%m-%d %H:%M:%S") $1"
-  }
-
-  log_error() {
-    # Output errors to stderr, which gets merged and sent to tee
-    echo "[ERROR] [$(_get_elapsed_time_for_log)] $(date +"%Y-%m-%d %H:%M:%S") $1" >&2
-  }
-
-  log_step() {
-    test_step_count=$((test_step_count + 1))
-    echo ""
-    echo "============================================="
-    echo "  STEP ${test_step_count}: [$(_get_elapsed_time_for_log)] $(date +"%Y-%m-%d %H:%M:%S") $1"
-    echo "============================================="
-  }
-
-  analyze_log_with_llm() {
-    local log_file="$1"
-    local provider_summary_log="provider_add_task_summary.log" # File summarizing provider test outcomes
-    local api_key=""
-    local api_endpoint="https://api.anthropic.com/v1/messages"
-    local api_key_name="CLAUDE_API_KEY"
-
-    echo "" # Add a newline before analysis starts
-    log_info "Attempting LLM analysis of log: $log_file"
-
-    # Check for jq and curl
-    if ! command -v jq &> /dev/null; then
-      log_error "LLM Analysis requires 'jq'. Skipping analysis."
-      return 1
-    fi
-    if ! command -v curl &> /dev/null; then
-      log_error "LLM Analysis requires 'curl'. Skipping analysis."
-      return 1
-    fi
-
-    # Check for API Key in the TEST_RUN_DIR/.env (copied earlier)
-    if [ -f ".env" ]; then
-      # Using grep and sed for better handling of potential quotes/spaces
-      api_key=$(grep "^${api_key_name}=" .env | sed -e "s/^${api_key_name}=//" -e 's/^[[:space:]"]*//' -e 's/[[:space:]"]*$//')
-    fi
-
-    if [ -z "$api_key" ]; then
-      log_error "${api_key_name} not found or empty in .env file in the test run directory ($(pwd)/.env). Skipping LLM analysis."
-      return 1
-    fi
-
-    if [ ! -f "$log_file" ]; then
-      log_error "Log file not found: $log_file. Skipping LLM analysis."
-      return 1
-    fi
-
-    log_info "Reading log file content..."
-    local log_content
-    # Read entire file, handle potential errors
-    log_content=$(cat "$log_file") || {
-      log_error "Failed to read log file: $log_file. Skipping LLM analysis."
-      return 1
-    }
-
-    # Prepare the prompt
-    # Using printf with %s for the log content is generally safer than direct variable expansion
-    local prompt_template='Analyze the following E2E test log for the task-master tool. The log contains output from various '\''task-master'\'' commands executed sequentially.\n\nYour goal is to:\n1. Verify if the key E2E steps completed successfully based on the log messages (e.g., init, parse PRD, list tasks, analyze complexity, expand task, set status, manage models, add/remove dependencies, add/update/remove tasks/subtasks, generate files).\n2. **Specifically analyze the Multi-Provider Add-Task Test Sequence:**\n   a. Identify which providers were tested for `add-task`. Look for log steps like "Testing Add-Task with Provider: ..." and the summary log `'"$provider_summary_log"'`.\n   b. For each tested provider, determine if `add-task` succeeded or failed. Note the created task ID if successful.\n   c. Review the corresponding `add_task_show_output_<provider>_id_<id>.log` file (if created) for each successful `add-task` execution.\n   d. **Compare the quality and completeness** of the task generated by each successful provider based on their `show` output. Assign a score (e.g., 1-10, 10 being best) based on relevance to the prompt, detail level, and correctness.\n   e. Note any providers where `add-task` failed or where the task ID could not be extracted.\n3. Identify any general explicit "[ERROR]" messages or stack traces throughout the *entire* log.\n4. Identify any potential warnings or unusual output that might indicate a problem even if not marked as an explicit error.\n5. Provide an overall assessment of the test run'\''s health based *only* on the log content.\n\nReturn your analysis **strictly** in the following JSON format. Do not include any text outside of the JSON structure:\n\n{\n  "overall_status": "Success|Failure|Warning",\n  "verified_steps": [ "Initialization", "PRD Parsing", /* ...other general steps observed... */ ],\n  "provider_add_task_comparison": {\n     "prompt_used": "... (extract from log if possible or state 'standard auth prompt') ...",\n     "provider_results": {\n       "anthropic": { "status": "Success|Failure|ID_Extraction_Failed|Set_Model_Failed", "task_id": "...", "score": "X/10 | N/A", "notes": "..." },\n       "openai": { "status": "Success|Failure|...", "task_id": "...", "score": "X/10 | N/A", "notes": "..." },\n       /* ... include all tested providers ... */\n     },\n     "comparison_summary": "Brief overall comparison of generated tasks..."\n   },\n  "detected_issues": [ { "severity": "Error|Warning|Anomaly", "description": "...", "log_context": "[Optional, short snippet from log near the issue]" } ],\n  "llm_summary_points": [ "Overall summary point 1", "Provider comparison highlight", "Any major issues noted" ]\n}\n\nHere is the main log content:\n\n%s'
-
-    local full_prompt
-    printf -v full_prompt "$prompt_template" "$log_content"
-
-    # Construct the JSON payload for Claude Messages API
-    # Using jq for robust JSON construction
-    local payload
-    payload=$(jq -n --arg prompt "$full_prompt" '{
-      "model": "claude-3-7-sonnet-20250219", 
-      "max_tokens": 10000,
-      "messages": [
-        {"role": "user", "content": $prompt}
-      ],
-      "temperature": 0.0
-    }') || {
-        log_error "Failed to create JSON payload using jq."
-        return 1
-    }
-
-    log_info "Sending request to LLM API endpoint: $api_endpoint ..."
-    local response_raw response_http_code response_body
-    # Capture body and HTTP status code separately
-    response_raw=$(curl -s -w "\nHTTP_STATUS_CODE:%{http_code}" -X POST "$api_endpoint" \
-         -H "Content-Type: application/json" \
-         -H "x-api-key: $api_key" \
-         -H "anthropic-version: 2023-06-01" \
-         --data "$payload")
-
-    # Extract status code and body
-    response_http_code=$(echo "$response_raw" | grep '^HTTP_STATUS_CODE:' | sed 's/HTTP_STATUS_CODE://')
-    response_body=$(echo "$response_raw" | sed '$d') # Remove last line (status code)
-
-    if [ "$response_http_code" != "200" ]; then
-        log_error "LLM API call failed with HTTP status $response_http_code."
-        log_error "Response Body: $response_body"
-        return 1
-    fi
-
-    if [ -z "$response_body" ]; then
-        log_error "LLM API call returned empty response body."
-        return 1
-    fi
-
-    log_info "Received LLM response (HTTP 200). Parsing analysis JSON..."
-
-    # Extract the analysis JSON string from the API response (adjust jq path if needed)
-    local analysis_json_string
-    analysis_json_string=$(echo "$response_body" | jq -r '.content[0].text' 2>/dev/null) # Assumes Messages API structure
-
-    if [ -z "$analysis_json_string" ]; then
-        log_error "Failed to extract 'content[0].text' from LLM response JSON."
-        log_error "Full API response body: $response_body"
-        return 1
-    fi
-
-    # Validate and pretty-print the extracted JSON
-    if ! echo "$analysis_json_string" | jq -e . > /dev/null 2>&1; then
-        log_error "Extracted content from LLM is not valid JSON."
-        log_error "Raw extracted content: $analysis_json_string"
-        return 1
-    fi
-
-    log_success "LLM analysis completed successfully."
-    echo ""
-    echo "--- LLM Analysis ---"
-    # Pretty print the JSON analysis
-    echo "$analysis_json_string" | jq '.'
-    echo "--------------------"
-
-    return 0
-  }
-  # ---
+  # Note: Helper functions are now defined globally above,
+  # but we still need start_time_for_helpers if any logging functions
+  # called *inside* this block depend on it. If not, it can be removed.
+  start_time_for_helpers=$(date +%s) # Keep if needed by helpers called inside this block

  # --- Test Setup (Output to tee) ---
  log_step "Setting up test environment"
@@ -264,8 +203,8 @@ overall_start_time=$(date +%s)

  log_step "Initializing Task Master project (non-interactive)"
  task-master init -y --name="E2E Test $TIMESTAMP" --description="Automated E2E test run"
-  if [ ! -f ".taskmasterconfig" ] || [ ! -f "package.json" ]; then
-    log_error "Initialization failed: .taskmasterconfig or package.json not found."
+  if [ ! -f ".taskmasterconfig" ]; then
+    log_error "Initialization failed: .taskmasterconfig not found."
    exit 1
  fi
  log_success "Project initialized."
@@ -385,9 +324,9 @@ overall_start_time=$(date +%s)

    # 3. Check for success and extract task ID
    new_task_id=""
-    if [ $add_task_exit_code -eq 0 ] && echo "$add_task_cmd_output" | grep -q "Successfully added task with ID:"; then
+    if [ $add_task_exit_code -eq 0 ] && echo "$add_task_cmd_output" | grep -q "✓ Added new task #"; then
      # Attempt to extract the ID (adjust grep/sed/awk as needed based on actual output format)
-      new_task_id=$(echo "$add_task_cmd_output" | grep "Successfully added task with ID:" | sed 's/.*Successfully added task with ID: \([0-9.]\+\).*/\1/')
+      new_task_id=$(echo "$add_task_cmd_output" | grep "✓ Added new task #" | sed 's/.*✓ Added new task #\([0-9.]\+\).*/\1/')
      if [ -n "$new_task_id" ]; then
        log_success "Add-task succeeded for $provider. New task ID: $new_task_id"
        echo "Provider $provider add-task SUCCESS (ID: $new_task_id)" >> provider_add_task_summary.log
@@ -458,6 +397,68 @@ overall_start_time=$(date +%s)
  task-master fix-dependencies > fix_dependencies_output.log
  log_success "Fix dependencies attempted."

+  # === Start New Test Section: Validate/Fix Bad Dependencies ===
+
+  log_step "Intentionally adding non-existent dependency (1 -> 999)"
+  task-master add-dependency --id=1 --depends-on=999 || log_error "Failed to add non-existent dependency (unexpected)"
+  # Don't exit even if the above fails, the goal is to test validation
+  log_success "Attempted to add dependency 1 -> 999."
+
+  log_step "Validating dependencies (expecting non-existent error)"
+  task-master validate-dependencies > validate_deps_non_existent.log 2>&1 || true # Allow command to fail without exiting script
+  if grep -q "Non-existent dependency ID: 999" validate_deps_non_existent.log; then
+      log_success "Validation correctly identified non-existent dependency 999."
+  else
+      log_error "Validation DID NOT report non-existent dependency 999 as expected. Check validate_deps_non_existent.log"
+      # Consider exiting here if this check fails, as it indicates a validation logic problem
+      # exit 1
+  fi
+
+  log_step "Fixing dependencies (should remove 1 -> 999)"
+  task-master fix-dependencies > fix_deps_after_non_existent.log
+  log_success "Attempted to fix dependencies."
+
+  log_step "Validating dependencies (after fix)"
+  task-master validate-dependencies > validate_deps_after_fix_non_existent.log 2>&1 || true # Allow potential failure
+  if grep -q "Non-existent dependency ID: 999" validate_deps_after_fix_non_existent.log; then
+      log_error "Validation STILL reports non-existent dependency 999 after fix. Check logs."
+      # exit 1
+  else
+      log_success "Validation shows non-existent dependency 999 was removed."
+  fi
+
+
+  log_step "Intentionally adding circular dependency (4 -> 5 -> 4)"
+  task-master add-dependency --id=4 --depends-on=5 || log_error "Failed to add dependency 4->5"
+  task-master add-dependency --id=5 --depends-on=4 || log_error "Failed to add dependency 5->4"
+  log_success "Attempted to add dependencies 4 -> 5 and 5 -> 4."
+
+
+  log_step "Validating dependencies (expecting circular error)"
+  task-master validate-dependencies > validate_deps_circular.log 2>&1 || true # Allow command to fail
+  # Note: Adjust the grep pattern based on the EXACT error message from validate-dependencies
+  if grep -q -E "Circular dependency detected involving task IDs: (4, 5|5, 4)" validate_deps_circular.log; then
+      log_success "Validation correctly identified circular dependency between 4 and 5."
+  else
+      log_error "Validation DID NOT report circular dependency 4<->5 as expected. Check validate_deps_circular.log"
+      # exit 1
+  fi
+
+  log_step "Fixing dependencies (should remove one side of 4 <-> 5)"
+  task-master fix-dependencies > fix_deps_after_circular.log
+  log_success "Attempted to fix dependencies."
+
+  log_step "Validating dependencies (after fix circular)"
+  task-master validate-dependencies > validate_deps_after_fix_circular.log 2>&1 || true # Allow potential failure
+  if grep -q -E "Circular dependency detected involving task IDs: (4, 5|5, 4)" validate_deps_after_fix_circular.log; then
+      log_error "Validation STILL reports circular dependency 4<->5 after fix. Check logs."
+      # exit 1
+  else
+      log_success "Validation shows circular dependency 4<->5 was resolved."
+  fi
+
+  # === End New Test Section ===
+
  log_step "Adding Task 11 (Manual)"
  task-master add-task --title="Manual E2E Task" --description="Add basic health check endpoint" --priority=low --dependencies=3 # Depends on backend setup
  # Assuming the new task gets ID 11 (adjust if PRD parsing changes)
@@ -485,7 +486,7 @@ overall_start_time=$(date +%s)
  log_success "Attempted update for Subtask 8.1."

  # Add a couple more subtasks for multi-remove test
-  log_step "Adding subtasks to Task 2 (for multi-remove test)"
+  log_step 'Adding subtasks to Task 2 (for multi-remove test)'
  task-master add-subtask --parent=2 --title="Subtask 2.1 for removal"
  task-master add-subtask --parent=2 --title="Subtask 2.2 for removal"
  log_success "Added subtasks 2.1 and 2.2."
@@ -502,6 +503,18 @@ overall_start_time=$(date +%s)
  task-master next > next_task_after_change.log
  log_success "Next task after change saved to next_task_after_change.log"

+  # === Start New Test Section: List Filtering ===
+  log_step "Listing tasks filtered by status 'done'"
+  task-master list --status=done > task_list_status_done.log
+  log_success "Filtered list saved to task_list_status_done.log (Manual/LLM check recommended)"
+  # Optional assertion: Check if Task 1 ID exists and Task 2 ID does NOT
+  # if grep -q "^1\." task_list_status_done.log && ! grep -q "^2\." task_list_status_done.log; then
+  #    log_success "Basic check passed: Task 1 found, Task 2 not found in 'done' list."
+  # else
+  #    log_error "Basic check failed for list --status=done."
+  # fi
+  # === End New Test Section ===
+
  log_step "Clearing subtasks from Task 8"
  task-master clear-subtasks --id=8
  log_success "Attempted to clear subtasks from Task 8."
@@ -511,6 +524,46 @@ overall_start_time=$(date +%s)
  task-master remove-task --id=11,12 -y
  log_success "Removed tasks 11 and 12."

+  # === Start New Test Section: Subtasks & Dependencies ===
+
+  log_step "Expanding Task 2 (to ensure multiple tasks have subtasks)"
+  task-master expand --id=2 # Expand task 2: Backend setup
+  log_success "Attempted to expand Task 2."
+
+  log_step "Listing tasks with subtasks (Before Clear All)"
+  task-master list --with-subtasks > task_list_before_clear_all.log
+  log_success "Task list before clear-all saved."
+
+  log_step "Clearing ALL subtasks"
+  task-master clear-subtasks --all
+  log_success "Attempted to clear all subtasks."
+
+  log_step "Listing tasks with subtasks (After Clear All)"
+  task-master list --with-subtasks > task_list_after_clear_all.log
+  log_success "Task list after clear-all saved. (Manual/LLM check recommended to verify subtasks removed)"
+
+  log_step "Expanding Task 1 again (to have subtasks for next test)"
+  task-master expand --id=1
+  log_success "Attempted to expand Task 1 again."
+
+  log_step "Adding dependency: Task 3 depends on Subtask 1.1"
+  task-master add-dependency --id=3 --depends-on=1.1
+  log_success "Added dependency 3 -> 1.1."
+
+  log_step "Showing Task 3 details (after adding subtask dependency)"
+  task-master show 3 > task_3_details_after_dep_add.log
+  log_success "Task 3 details saved. (Manual/LLM check recommended for dependency [1.1])"
+
+  log_step "Removing dependency: Task 3 depends on Subtask 1.1"
+  task-master remove-dependency --id=3 --depends-on=1.1
+  log_success "Removed dependency 3 -> 1.1."
+
+  log_step "Showing Task 3 details (after removing subtask dependency)"
+  task-master show 3 > task_3_details_after_dep_remove.log
+  log_success "Task 3 details saved. (Manual/LLM check recommended to verify dependency removed)"
+
+  # === End New Test Section ===
+
  log_step "Generating task files (final)"
  task-master generate
  log_success "Generated task files."
@@ -601,25 +654,16 @@ fi
 echo "-------------------------"

 # --- Attempt LLM Analysis ---
-echo "DEBUG: Entering LLM Analysis section..."
 # Run this *after* the main execution block and tee pipe finish writing the log file
-# It will read the completed log file and append its output to the terminal (and the log via subsequent writes if tee is still active, though it shouldn't be)
-# Change directory back into the test run dir where .env is located
 if [ -d "$TEST_RUN_DIR" ]; then
-  echo "DEBUG: Found TEST_RUN_DIR: $TEST_RUN_DIR. Attempting cd..."
  cd "$TEST_RUN_DIR"
-  echo "DEBUG: Changed directory to $(pwd). Calling analyze_log_with_llm..."
-  analyze_log_with_llm "$LOG_FILE"
-  echo "DEBUG: analyze_log_with_llm function call finished."
-  # Optional: cd back again if needed, though script is ending
+  analyze_log_with_llm "$LOG_FILE" "$TASKMASTER_SOURCE_DIR"
+  ANALYSIS_EXIT_CODE=$? # Capture the exit code of the analysis function
+  # Optional: cd back again if needed
  # cd "$ORIGINAL_DIR"
 else
-  # Use log_error format even outside the pipe for consistency
-  current_time_for_error=$(date +%s)
-  elapsed_seconds_for_error=$((current_time_for_error - overall_start_time)) # Use overall start time
-  formatted_duration_for_error=$(_format_duration "$elapsed_seconds_for_error")
+  formatted_duration_for_error=$(_format_duration "$total_elapsed_seconds")
  echo "[ERROR] [$formatted_duration_for_error] $(date +"%Y-%m-%d %H:%M:%S") Test run directory $TEST_RUN_DIR not found. Cannot perform LLM analysis." >&2
 fi

-echo "DEBUG: Reached end of script before final exit."
-exit $EXIT_CODE # Exit with the status of the main script block
+exit $EXIT_CODE
--- a/tests/e2e/test_llm_analysis.sh
+++ b/tests/e2e/test_llm_analysis.sh
@@ -0,0 +1,71 @@
+#!/bin/bash
+
+# Script to test the LLM analysis function independently
+
+# Exit on error
+set -u
+set -o pipefail
+
+# Source the helper functions
+HELPER_SCRIPT="tests/e2e/e2e_helpers.sh"
+if [ -f "$HELPER_SCRIPT" ]; then
+  source "$HELPER_SCRIPT"
+  echo "[INFO] Sourced helper script: $HELPER_SCRIPT"
+else
+  echo "[ERROR] Helper script not found at $HELPER_SCRIPT. Exiting." >&2
+  exit 1
+fi
+
+# --- Configuration ---
+# Get the absolute path to the project root (assuming this script is run from the root)
+PROJECT_ROOT="$(pwd)"
+
+# --- Argument Parsing ---
+if [ "$#" -ne 2 ]; then
+  echo "Usage: $0 <path_to_log_file> <path_to_test_run_directory>" >&2
+  echo "Example: $0 tests/e2e/log/e2e_run_YYYYMMDD_HHMMSS.log tests/e2e/_runs/run_YYYYMMDD_HHMMSS" >&2
+  exit 1
+fi
+
+LOG_FILE_REL="$1"     # Relative path from project root
+TEST_RUN_DIR_REL="$2" # Relative path from project root
+
+# Construct absolute paths
+LOG_FILE_ABS="$PROJECT_ROOT/$LOG_FILE_REL"
+TEST_RUN_DIR_ABS="$PROJECT_ROOT/$TEST_RUN_DIR_REL"
+
+# --- Validation ---
+if [ ! -f "$LOG_FILE_ABS" ]; then
+  echo "[ERROR] Log file not found: $LOG_FILE_ABS" >&2
+  exit 1
+fi
+
+if [ ! -d "$TEST_RUN_DIR_ABS" ]; then
+  echo "[ERROR] Test run directory not found: $TEST_RUN_DIR_ABS" >&2
+  exit 1
+fi
+
+if [ ! -f "$TEST_RUN_DIR_ABS/.env" ]; then
+  echo "[ERROR] .env file not found in test run directory: $TEST_RUN_DIR_ABS/.env" >&2
+  exit 1
+fi
+
+
+# --- Execution ---
+echo "[INFO] Changing directory to test run directory: $TEST_RUN_DIR_ABS"
+cd "$TEST_RUN_DIR_ABS" || { echo "[ERROR] Failed to cd into $TEST_RUN_DIR_ABS"; exit 1; }
+
+echo "[INFO] Current directory: $(pwd)"
+echo "[INFO] Calling analyze_log_with_llm function with log file: $LOG_FILE_ABS"
+
+# Call the function (sourced earlier)
+analyze_log_with_llm "$LOG_FILE_ABS"
+ANALYSIS_EXIT_CODE=$?
+
+echo "[INFO] analyze_log_with_llm finished with exit code: $ANALYSIS_EXIT_CODE"
+
+# Optional: cd back to original directory
+# echo "[INFO] Changing back to project root: $PROJECT_ROOT"
+# cd "$PROJECT_ROOT"
+
+exit $ANALYSIS_EXIT_CODE