refactor: Improve update-subtask, consolidate utils, update config

This commit introduces several improvements and refactorings across MCP tools, core logic, and configuration.

**Major Changes:**

1.  **Refactor updateSubtaskById:**
    - Switched from generateTextService to generateObjectService for structured AI responses, using a Zod schema (subtaskSchema) for validation.
    - Revised prompts so the AI generates relevant content based on the user request and context (parent/sibling tasks), while explicitly instructing it not to handle timestamp or tag formatting.
    - Implemented **local timestamp generation (new Date().toISOString()) and formatting** (using <info added on ...> tags) within the function *after* receiving the AI response. This ensures reliable and correctly formatted details are appended.
    - Corrected logic to append only the locally formatted, AI-generated content block to the existing subtask.details.
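
    A condensed sketch of the new flow (the helper name is hypothetical and the schema abridged; the real function in scripts/modules/task-manager/update-subtask-by-id.js also passes session/projectRoot, builds parent/sibling context into the prompts, and handles loading indicators and logging — see the diff below):

    ```js
    import { z } from 'zod';
    import { generateObjectService } from '../ai-services-unified.js';

    // Abridged subtaskSchema; the full schema also covers description, status,
    // dependencies, priority, and testStrategy.
    const subtaskSchema = z.object({
    	id: z.number().int().positive(),
    	title: z.string(),
    	details: z.string().optional()
    });

    // Hypothetical helper condensing the new updateSubtaskById flow.
    async function generateAndAppendDetails(subtask, userPrompt, systemPrompt, role) {
    	// Ask the AI for a structured subtask object; only its 'details' text is used.
    	const parsedAIResponse = await generateObjectService({
    		prompt: userPrompt,
    		systemPrompt,
    		schema: subtaskSchema,
    		objectName: 'updatedSubtask',
    		role,
    		maxRetries: 2
    	});

    	const generatedContent = (parsedAIResponse?.details || '').trim();
    	if (!generatedContent) return subtask; // nothing to append; details stay unchanged

    	// Timestamp and <info added on ...> wrapping happen locally, AFTER the AI call.
    	const timestamp = new Date().toISOString();
    	const formattedBlock = `<info added on ${timestamp}>\n${generatedContent}\n</info added on ${timestamp}>`;
    	subtask.details = (subtask.details ? subtask.details + '\n' : '') + formattedBlock;
    	return subtask;
    }
    ```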

2.  **Consolidate MCP Utilities:**
    - Moved/consolidated the withNormalizedProjectRoot HOF into mcp-server/src/tools/utils.js.
    - Updated MCP tools (like update-subtask.js) to import withNormalizedProjectRoot from the new location.
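
    The intended wiring, sketched against the new import location (tool name and handler body are illustrative and abridged; exact signatures follow the surrounding tool files):

    ```js
    // mcp-server/src/tools/update-subtask.js (abridged sketch)
    import {
    	handleApiResult,
    	createErrorResponse,
    	withNormalizedProjectRoot
    } from './utils.js';
    import { updateSubtaskByIdDirect } from '../core/task-master-core.js';

    export function registerUpdateSubtaskTool(server) {
    	server.addTool({
    		name: 'update_subtask',
    		// ...description and zod parameters omitted...
    		execute: withNormalizedProjectRoot(async (args, { log, session }) => {
    			try {
    				// args.projectRoot has already been normalized by the HOF before this runs
    				const result = await updateSubtaskByIdDirect(args, log, { session });
    				return handleApiResult(result, log);
    			} catch (error) {
    				return createErrorResponse(error.message);
    			}
    		})
    	});
    }
    ```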

3.  **Refactor Project Initialization:**
    - Deleted the redundant mcp-server/src/core/direct-functions/initialize-project-direct.js file.
    - Updated mcp-server/src/core/task-master-core.js to import initializeProjectDirect from its correct location (./direct-functions/initialize-project.js).
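
    With the redundant file gone, the remaining direct function trusts the projectRoot the tool layer provides and keeps only a basic sanity check. A self-contained sketch of that check (resolveTargetDirectory is a hypothetical wrapper; the real code inlines this inside initializeProjectDirect, as the diff below shows):

    ```js
    import os from 'os';

    function resolveTargetDirectory(args) {
    	const homeDir = os.homedir();
    	// TRUST the projectRoot passed from the tool layer via args;
    	// the HOF already normalized it from args or the session.
    	const targetDirectory = args.projectRoot;

    	if (
    		!targetDirectory ||
    		typeof targetDirectory !== 'string' ||
    		targetDirectory === '/' ||
    		targetDirectory === homeDir
    	) {
    		return {
    			ok: false,
    			error: {
    				code: 'INVALID_TARGET_DIRECTORY',
    				message: `Cannot initialize project: Invalid target directory '${targetDirectory}' received.`
    			}
    		};
    	}
    	return { ok: true, targetDirectory };
    }
    ```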

**Other Changes:**

-   Updated .taskmasterconfig fallback model to claude-3-7-sonnet-20250219.
-   Clarified model cost representation in the models tool description (taskmaster.mdc and mcp-server/src/tools/models.js).
Author: Eyal Toledano
Date: 2025-05-02 17:48:59 -04:00
parent 33559e368c
commit d63964a10e
13 changed files with 711 additions and 232 deletions

View File

@@ -0,0 +1,8 @@
---
'task-master-ai': patch
---
Improved update-subtask
- Now it has context about the parent task details
- It also has context about the subtask before it and the subtask after it (if they exist)
- Not passing all subtasks to stay token efficient

View File

@@ -79,6 +79,7 @@ This document provides a detailed reference for interacting with Taskmaster, cov
* **Usage (CLI):** Run without flags to view current configuration and available models. Use set flags to update specific roles. Use `--setup` for guided configuration, including custom models. To set a custom model via flags, use `--set-<role>=<model_id>` along with either `--ollama` or `--openrouter`.
* **Notes:** Configuration is stored in `.taskmasterconfig` in the project root. This command/tool modifies that file. Use `listAvailableModels` or `task-master models` to see internally supported models. OpenRouter custom models are validated against their live API. Ollama custom models are not validated live.
* **API note:** API keys for selected AI providers (based on their model) need to exist in the mcp.json file to be accessible in MCP context. The API keys must be present in the local .env file for the CLI to be able to read them.
+ * **Model costs:** The costs in supported models are expressed in dollars. An input/output value of 3 is $3.00. A value of 0.8 is $0.80.
* **Warning:** DO NOT MANUALLY EDIT THE .taskmasterconfig FILE. Use the included commands either in the MCP or CLI format as needed. Always prioritize MCP tools when available and use the CLI as a fallback.
---

View File

@@ -14,7 +14,7 @@
},
"fallback": {
"provider": "anthropic",
- "modelId": "claude-3-5-sonnet-20241022",
+ "modelId": "claude-3-7-sonnet-20250219",
"maxTokens": 120000,
"temperature": 0.2
}

View File

@@ -4,7 +4,6 @@ import {
disableSilentMode
// isSilentMode // Not used directly here
} from '../../../../scripts/modules/utils.js';
- import { getProjectRootFromSession } from '../../tools/utils.js'; // Adjust path if necessary
import os from 'os'; // Import os module for home directory check
/**
@@ -16,60 +15,32 @@ import os from 'os'; // Import os module for home directory check
 * @returns {Promise<{success: boolean, data?: any, error?: {code: string, message: string}}>} - Standard result object.
 */
export async function initializeProjectDirect(args, log, context = {}) {
- const { session } = context;
+ const { session } = context; // Keep session if core logic needs it
const homeDir = os.homedir();
- let targetDirectory = null;
- log.info(
- `CONTEXT received in direct function: ${context ? JSON.stringify(Object.keys(context)) : 'MISSING or Falsy'}`
- );
- log.info(
- `SESSION extracted in direct function: ${session ? 'Exists' : 'MISSING or Falsy'}`
- );
log.info(`Args received in direct function: ${JSON.stringify(args)}`);
// --- Determine Target Directory ---
- // 1. Prioritize projectRoot passed directly in args
+ // TRUST the projectRoot passed from the tool layer via args
- // Ensure it's not null, '/', or the home directory
+ // The HOF in the tool layer already normalized and validated it came from a reliable source (args or session)
- if (
+ const targetDirectory = args.projectRoot;
- args.projectRoot &&
- args.projectRoot !== '/' &&
- args.projectRoot !== homeDir
- ) {
- log.info(`Using projectRoot directly from args: ${args.projectRoot}`);
- targetDirectory = args.projectRoot;
- } else {
- // 2. If args.projectRoot is missing or invalid, THEN try session (as a fallback)
- log.warn(
- `args.projectRoot ('${args.projectRoot}') is missing or invalid. Attempting to derive from session.`
- );
- const sessionDerivedPath = getProjectRootFromSession(session, log);
- // Validate the session-derived path as well
- if (
- sessionDerivedPath &&
- sessionDerivedPath !== '/' &&
- sessionDerivedPath !== homeDir
- ) {
- log.info(
- `Using project root derived from session: ${sessionDerivedPath}`
- );
- targetDirectory = sessionDerivedPath;
- } else {
- log.error(
- `Could not determine a valid project root. args.projectRoot='${args.projectRoot}', sessionDerivedPath='${sessionDerivedPath}'`
- );
- }
- }
- // 3. Validate the final targetDirectory
+ // --- Validate the targetDirectory (basic sanity checks) ---
- if (!targetDirectory) {
+ if (
- // This error now covers cases where neither args.projectRoot nor session provided a valid path
+ !targetDirectory ||
+ typeof targetDirectory !== 'string' || // Ensure it's a string
+ targetDirectory === '/' ||
+ targetDirectory === homeDir
+ ) {
+ log.error(
+ `Invalid target directory received from tool layer: '${targetDirectory}'`
+ );
return {
success: false,
error: {
code: 'INVALID_TARGET_DIRECTORY',
- message: `Cannot initialize project: Could not determine a valid target directory. Please ensure a workspace/folder is open or specify projectRoot.`,
+ message: `Cannot initialize project: Invalid target directory '${targetDirectory}' received. Please ensure a valid workspace/folder is open or specified.`,
- details: `Attempted args.projectRoot: ${args.projectRoot}`
+ details: `Received args.projectRoot: ${args.projectRoot}` // Show what was received
},
fromCache: false
};
@@ -86,11 +57,12 @@ export async function initializeProjectDirect(args, log, context = {}) {
log.info(
`Temporarily changing CWD to ${targetDirectory} for initialization.`
);
- process.chdir(targetDirectory); // Change CWD to the *validated* targetDirectory
+ process.chdir(targetDirectory); // Change CWD to the HOF-provided root
- enableSilentMode(); // Enable silent mode BEFORE calling the core function
+ enableSilentMode();
try {
- // Always force yes: true when called via MCP to avoid interactive prompts
+ // Construct options ONLY from the relevant flags in args
+ // The core initializeProject operates in the current CWD, which we just set
const options = {
aliases: args.addAliases,
skipInstall: args.skipInstall,
@@ -100,12 +72,11 @@ export async function initializeProjectDirect(args, log, context = {}) {
log.info(`Initializing project with options: ${JSON.stringify(options)}`);
const result = await initializeProject(options); // Call core logic
- // Format success result for handleApiResult
resultData = {
message: 'Project initialized successfully.',
next_step:
'Now that the project is initialized, the next step is to create the tasks by parsing a PRD. This will create the tasks folder and the initial task files (tasks folder will be created when parse-prd is run). The parse-prd tool will require a prd.txt file as input (typically found in the project root directory, scripts/ directory). You can create a prd.txt file by asking the user about their idea, and then using the scripts/example_prd.txt file as a template to genrate a prd.txt file in scripts/. You may skip all of this if the user already has a prd.txt file. You can THEN use the parse-prd tool to create the tasks. So: step 1 after initialization is to create a prd.txt file in scripts/prd.txt or confirm the user already has one. Step 2 is to use the parse-prd tool to create the tasks. Do not bother looking for tasks after initialization, just use the parse-prd tool to create the tasks after creating a prd.txt from which to parse the tasks. You do NOT need to reinitialize the project to parse-prd.',
- ...result // Include details returned by initializeProject
+ ...result
};
success = true;
log.info(
@@ -120,12 +91,11 @@ export async function initializeProjectDirect(args, log, context = {}) {
};
success = false;
} finally {
- disableSilentMode(); // ALWAYS disable silent mode in finally
+ disableSilentMode();
log.info(`Restoring original CWD: ${originalCwd}`);
- process.chdir(originalCwd); // Change back to original CWD
+ process.chdir(originalCwd);
}
- // Return in format expected by handleApiResult
if (success) {
return { success: true, data: resultData, fromCache: false };
} else {

View File

@@ -28,7 +28,7 @@ import { fixDependenciesDirect } from './direct-functions/fix-dependencies.js';
import { complexityReportDirect } from './direct-functions/complexity-report.js';
import { addDependencyDirect } from './direct-functions/add-dependency.js';
import { removeTaskDirect } from './direct-functions/remove-task.js';
- import { initializeProjectDirect } from './direct-functions/initialize-project-direct.js';
+ import { initializeProjectDirect } from './direct-functions/initialize-project.js';
import { modelsDirect } from './direct-functions/models.js';
// Re-export utility functions

View File

@@ -42,7 +42,9 @@ export function registerModelsTool(server) {
listAvailableModels: z
.boolean()
.optional()
- .describe('List all available models not currently in use.'),
+ .describe(
+ 'List all available models not currently in use. Input/output costs values are in dollars (3 is $3.00).'
+ ),
projectRoot: z
.string()
.optional()

View File

@@ -4,11 +4,13 @@
 */
import { z } from 'zod';
- import { handleApiResult, createErrorResponse } from './utils.js';
+ import {
+ handleApiResult,
+ createErrorResponse,
+ withNormalizedProjectRoot
+ } from './utils.js';
import { updateSubtaskByIdDirect } from '../core/task-master-core.js';
import { findTasksJsonPath } from '../core/utils/path-utils.js';
- import path from 'path';
- import { withNormalizedProjectRoot } from '../core/utils/project-utils.js';
/**
 * Register the update-subtask tool with the MCP server

View File

@@ -13,20 +13,6 @@
"cost_per_1m_tokens": { "input": 3.0, "output": 15.0 }, "cost_per_1m_tokens": { "input": 3.0, "output": 15.0 },
"allowed_roles": ["main", "fallback"], "allowed_roles": ["main", "fallback"],
"max_tokens": 64000 "max_tokens": 64000
},
{
"id": "claude-3-5-haiku-20241022",
"swe_score": 0.406,
"cost_per_1m_tokens": { "input": 0.8, "output": 4.0 },
"allowed_roles": ["main", "fallback"],
"max_tokens": 64000
},
{
"id": "claude-3-opus-20240229",
"swe_score": 0,
"cost_per_1m_tokens": { "input": 15, "output": 75 },
"allowed_roles": ["main", "fallback"],
"max_tokens": 64000
} }
], ],
"openai": [ "openai": [
@@ -41,7 +27,7 @@
"id": "o1", "id": "o1",
"swe_score": 0.489, "swe_score": 0.489,
"cost_per_1m_tokens": { "input": 15.0, "output": 60.0 }, "cost_per_1m_tokens": { "input": 15.0, "output": 60.0 },
"allowed_roles": ["main", "fallback"] "allowed_roles": ["main"]
}, },
{ {
"id": "o3", "id": "o3",
@@ -53,7 +39,7 @@
"id": "o3-mini", "id": "o3-mini",
"swe_score": 0.493, "swe_score": 0.493,
"cost_per_1m_tokens": { "input": 1.1, "output": 4.4 }, "cost_per_1m_tokens": { "input": 1.1, "output": 4.4 },
"allowed_roles": ["main", "fallback"], "allowed_roles": ["main"],
"max_tokens": 100000 "max_tokens": 100000
}, },
{ {
@@ -66,49 +52,49 @@
"id": "o1-mini", "id": "o1-mini",
"swe_score": 0.4, "swe_score": 0.4,
"cost_per_1m_tokens": { "input": 1.1, "output": 4.4 }, "cost_per_1m_tokens": { "input": 1.1, "output": 4.4 },
"allowed_roles": ["main", "fallback"] "allowed_roles": ["main"]
}, },
{ {
"id": "o1-pro", "id": "o1-pro",
"swe_score": 0, "swe_score": 0,
"cost_per_1m_tokens": { "input": 150.0, "output": 600.0 }, "cost_per_1m_tokens": { "input": 150.0, "output": 600.0 },
"allowed_roles": ["main", "fallback"] "allowed_roles": ["main"]
}, },
{ {
"id": "gpt-4-5-preview", "id": "gpt-4-5-preview",
"swe_score": 0.38, "swe_score": 0.38,
"cost_per_1m_tokens": { "input": 75.0, "output": 150.0 }, "cost_per_1m_tokens": { "input": 75.0, "output": 150.0 },
"allowed_roles": ["main", "fallback"] "allowed_roles": ["main"]
}, },
{ {
"id": "gpt-4-1-mini", "id": "gpt-4-1-mini",
"swe_score": 0, "swe_score": 0,
"cost_per_1m_tokens": { "input": 0.4, "output": 1.6 }, "cost_per_1m_tokens": { "input": 0.4, "output": 1.6 },
"allowed_roles": ["main", "fallback"] "allowed_roles": ["main"]
}, },
{ {
"id": "gpt-4-1-nano", "id": "gpt-4-1-nano",
"swe_score": 0, "swe_score": 0,
"cost_per_1m_tokens": { "input": 0.1, "output": 0.4 }, "cost_per_1m_tokens": { "input": 0.1, "output": 0.4 },
"allowed_roles": ["main", "fallback"] "allowed_roles": ["main"]
}, },
{ {
"id": "gpt-4o-mini", "id": "gpt-4o-mini",
"swe_score": 0.3, "swe_score": 0.3,
"cost_per_1m_tokens": { "input": 0.15, "output": 0.6 }, "cost_per_1m_tokens": { "input": 0.15, "output": 0.6 },
"allowed_roles": ["main", "fallback"] "allowed_roles": ["main"]
}, },
{ {
"id": "gpt-4o-search-preview", "id": "gpt-4o-search-preview",
"swe_score": 0.33, "swe_score": 0.33,
"cost_per_1m_tokens": { "input": 2.5, "output": 10.0 }, "cost_per_1m_tokens": { "input": 2.5, "output": 10.0 },
"allowed_roles": ["main", "fallback", "research"] "allowed_roles": ["research"]
}, },
{ {
"id": "gpt-4o-mini-search-preview", "id": "gpt-4o-mini-search-preview",
"swe_score": 0.3, "swe_score": 0.3,
"cost_per_1m_tokens": { "input": 0.15, "output": 0.6 }, "cost_per_1m_tokens": { "input": 0.15, "output": 0.6 },
"allowed_roles": ["main", "fallback", "research"] "allowed_roles": ["research"]
} }
], ],
"google": [ "google": [
@@ -189,14 +175,6 @@
"allowed_roles": ["main", "fallback", "research"], "allowed_roles": ["main", "fallback", "research"],
"max_tokens": 131072 "max_tokens": 131072
}, },
{
"id": "grok-3-mini",
"name": "Grok 3 Mini",
"swe_score": 0,
"cost_per_1m_tokens": { "input": 0.3, "output": 0.5 },
"allowed_roles": ["main", "fallback", "research"],
"max_tokens": 131072
},
{ {
"id": "grok-3-fast", "id": "grok-3-fast",
"name": "Grok 3 Fast", "name": "Grok 3 Fast",
@@ -204,13 +182,6 @@
"cost_per_1m_tokens": { "input": 5, "output": 25 }, "cost_per_1m_tokens": { "input": 5, "output": 25 },
"allowed_roles": ["main", "fallback", "research"], "allowed_roles": ["main", "fallback", "research"],
"max_tokens": 131072 "max_tokens": 131072
},
{
"id": "grok-3-mini-fast",
"swe_score": 0,
"cost_per_1m_tokens": { "input": 0.6, "output": 4 },
"allowed_roles": ["main", "fallback", "research"],
"max_tokens": 131072
} }
], ],
"ollama": [ "ollama": [
@@ -283,7 +254,7 @@
"id": "deepseek/deepseek-chat-v3-0324", "id": "deepseek/deepseek-chat-v3-0324",
"swe_score": 0, "swe_score": 0,
"cost_per_1m_tokens": { "input": 0.27, "output": 1.1 }, "cost_per_1m_tokens": { "input": 0.27, "output": 1.1 },
"allowed_roles": ["main", "fallback"], "allowed_roles": ["main"],
"max_tokens": 64000 "max_tokens": 64000
}, },
{ {
@@ -312,14 +283,14 @@
"id": "google/gemini-2.5-flash-preview", "id": "google/gemini-2.5-flash-preview",
"swe_score": 0, "swe_score": 0,
"cost_per_1m_tokens": { "input": 0.15, "output": 0.6 }, "cost_per_1m_tokens": { "input": 0.15, "output": 0.6 },
"allowed_roles": ["main", "fallback"], "allowed_roles": ["main"],
"max_tokens": 65535 "max_tokens": 65535
}, },
{ {
"id": "google/gemini-2.5-flash-preview:thinking", "id": "google/gemini-2.5-flash-preview:thinking",
"swe_score": 0, "swe_score": 0,
"cost_per_1m_tokens": { "input": 0.15, "output": 3.5 }, "cost_per_1m_tokens": { "input": 0.15, "output": 3.5 },
"allowed_roles": ["main", "fallback"], "allowed_roles": ["main"],
"max_tokens": 65535 "max_tokens": 65535
}, },
{ {

View File

@@ -3,6 +3,7 @@ import path from 'path';
import chalk from 'chalk';
import boxen from 'boxen';
import Table from 'cli-table3';
+ import { z } from 'zod';
import {
getStatusWithColor,
@@ -16,7 +17,10 @@ import {
truncate,
isSilentMode
} from '../utils.js';
- import { generateTextService } from '../ai-services-unified.js';
+ import {
+ generateObjectService,
+ generateTextService
+ } from '../ai-services-unified.js';
import { getDebugFlag } from '../config-manager.js';
import generateTaskFiles from './generate-task-files.js';
@@ -131,6 +135,17 @@ async function updateSubtaskById(
const subtask = parentTask.subtasks[subtaskIndex];
+ const subtaskSchema = z.object({
+ id: z.number().int().positive(),
+ title: z.string(),
+ description: z.string().optional(),
+ status: z.string(),
+ dependencies: z.array(z.union([z.string(), z.number()])).optional(),
+ priority: z.string().optional(),
+ details: z.string().optional(),
+ testStrategy: z.string().optional()
+ });
// Only show UI elements for text output (CLI)
if (outputFormat === 'text') {
// Show the subtask that will be updated
@@ -168,101 +183,155 @@
);
}
- let additionalInformation = '';
+ let parsedAIResponse;
try {
- // Build Prompts
+ // --- GET PARENT & SIBLING CONTEXT ---
- const systemPrompt = `You are an AI assistant helping to update a software development subtask. Your goal is to APPEND new information to the existing details, not replace them. Add a timestamp.
+ const parentContext = {
+ id: parentTask.id,
+ title: parentTask.title
+ // Avoid sending full parent description/details unless necessary
+ };
+ const prevSubtask =
+ subtaskIndex > 0
+ ? {
+ id: `${parentTask.id}.${parentTask.subtasks[subtaskIndex - 1].id}`,
+ title: parentTask.subtasks[subtaskIndex - 1].title,
+ status: parentTask.subtasks[subtaskIndex - 1].status
+ }
+ : null;
+ const nextSubtask =
+ subtaskIndex < parentTask.subtasks.length - 1
+ ? {
+ id: `${parentTask.id}.${parentTask.subtasks[subtaskIndex + 1].id}`,
+ title: parentTask.subtasks[subtaskIndex + 1].title,
+ status: parentTask.subtasks[subtaskIndex + 1].status
+ }
+ : null;
+ const contextString = `
+ Parent Task: ${JSON.stringify(parentContext)}
+ ${prevSubtask ? `Previous Subtask: ${JSON.stringify(prevSubtask)}` : ''}
+ ${nextSubtask ? `Next Subtask: ${JSON.stringify(nextSubtask)}` : ''}
+ `;
+ const systemPrompt = `You are an AI assistant updating a parent task's subtask. This subtask will be part of a larger parent task and will be used to direct AI agents to complete the subtask. Your goal is to GENERATE new, relevant information based on the user's request (which may be high-level, mid-level or low-level) and APPEND it to the existing subtask 'details' field, wrapped in specific XML-like tags with an ISO 8601 timestamp. Intelligently determine the level of detail to include based on the user's request. Some requests are meant simply to update the subtask with some mid-implementation details, while others are meant to update the subtask with a detailed plan or strategy.
+ Context Provided:
+ - The current subtask object.
+ - Basic info about the parent task (ID, title).
+ - Basic info about the immediately preceding subtask (ID, title, status), if it exists.
+ - Basic info about the immediately succeeding subtask (ID, title, status), if it exists.
+ - A user request string.
Guidelines:
- 1. Identify the existing 'details' field in the subtask JSON.
+ 1. Analyze the user request considering the provided subtask details AND the context of the parent and sibling tasks.
- 2. Create a new timestamp string in the format: '[YYYY-MM-DD HH:MM:SS]'.
+ 2. GENERATE new, relevant text content that should be added to the 'details' field. Focus *only* on the substance of the update based on the user request and context. Do NOT add timestamps or any special formatting yourself. Avoid over-engineering the details, provide .
- 3. Append the new timestamp and the information from the user prompt to the *end* of the existing 'details' field.
+ 3. Update the 'details' field in the subtask object with the GENERATED text content. It's okay if this overwrites previous details in the object you return, as the calling code will handle the final appending.
- 4. Ensure the final 'details' field is a single, coherent string with the new information added.
+ 4. Return the *entire* updated subtask object (with your generated content in the 'details' field) as a valid JSON object conforming to the provided schema. Do NOT return explanations or markdown formatting.`;
- 5. Return the *entire* subtask object as a valid JSON, including the updated 'details' field and all other original fields (id, title, status, dependencies, etc.).`;
const subtaskDataString = JSON.stringify(subtask, null, 2);
- const userPrompt = `Here is the subtask to update:\n${subtaskDataString}\n\nPlease APPEND the following information to the 'details' field, preceded by a timestamp:\n${prompt}\n\nReturn only the updated subtask as a single, valid JSON object.`;
+ // Updated user prompt including context
+ const userPrompt = `Task Context:\n${contextString}\nCurrent Subtask:\n${subtaskDataString}\n\nUser Request: "${prompt}"\n\nPlease GENERATE new, relevant text content for the 'details' field based on the user request and the provided context. Return the entire updated subtask object as a valid JSON object matching the schema, with the newly generated text placed in the 'details' field.`;
+ // --- END UPDATED PROMPTS ---
- // Call Unified AI Service
+ // Call Unified AI Service using generateObjectService
const role = useResearch ? 'research' : 'main';
- report('info', `Using AI service with role: ${role}`);
+ report('info', `Using AI object service with role: ${role}`);
- const responseText = await generateTextService({
+ parsedAIResponse = await generateObjectService({
prompt: userPrompt,
systemPrompt: systemPrompt,
+ schema: subtaskSchema,
+ objectName: 'updatedSubtask',
role,
session,
- projectRoot
+ projectRoot,
+ maxRetries: 2
});
- report('success', 'Successfully received text response from AI service');
+ report(
+ 'success',
+ 'Successfully received object response from AI service'
+ );
if (outputFormat === 'text' && loadingIndicator) {
- // Stop indicator immediately since generateText is blocking
stopLoadingIndicator(loadingIndicator);
loadingIndicator = null;
}
- // Assign the result directly (generateTextService returns the text string)
+ if (!parsedAIResponse || typeof parsedAIResponse !== 'object') {
- additionalInformation = responseText ? responseText.trim() : '';
+ throw new Error('AI did not return a valid object.');
- if (!additionalInformation) {
- throw new Error('AI returned empty response.'); // Changed error message slightly
}
report(
- // Corrected log message to reflect generateText
'success',
- `Successfully generated text using AI role: ${role}.`
+ `Successfully generated object using AI role: ${role}.`
);
} catch (aiError) {
report('error', `AI service call failed: ${aiError.message}`);
+ if (outputFormat === 'text' && loadingIndicator) {
+ stopLoadingIndicator(loadingIndicator); // Ensure stop on error
+ loadingIndicator = null;
+ }
throw aiError;
- } // Removed the inner finally block as streamingInterval is gone
+ }
- const currentDate = new Date();
+ // --- TIMESTAMP & FORMATTING LOGIC (Handled Locally) ---
+ // Extract only the generated content from the AI's response details field.
+ const generatedContent = parsedAIResponse.details || ''; // Default to empty string
- // Format the additional information with timestamp
+ if (generatedContent.trim()) {
- const formattedInformation = `\n\n<info added on ${currentDate.toISOString()}>\n${additionalInformation}\n</info added on ${currentDate.toISOString()}>`;
+ // Generate timestamp locally
+ const timestamp = new Date().toISOString(); // <<< Local Timestamp
+ // Format the content with XML-like tags and timestamp LOCALLY
+ const formattedBlock = `<info added on ${timestamp}>\n${generatedContent.trim()}\n</info added on ${timestamp}>`; // <<< Local Formatting
+ // Append the formatted block to the *original* subtask details
+ subtask.details =
+ (subtask.details ? subtask.details + '\n' : '') + formattedBlock; // <<< Local Appending
+ report(
+ 'info',
+ 'Appended timestamped, formatted block with AI-generated content to subtask.details.'
+ );
+ } else {
+ report(
+ 'warn',
+ 'AI response object did not contain generated content in the "details" field. Original details remain unchanged.'
+ );
+ }
+ // --- END TIMESTAMP & FORMATTING LOGIC ---
+ // Get a reference to the subtask *after* its details have been updated
+ const updatedSubtask = parentTask.subtasks[subtaskIndex]; // subtask === updatedSubtask now
+ report('info', 'Updated subtask details locally after AI generation.');
+ // --- END UPDATE SUBTASK ---
// Only show debug info for text output (CLI)
if (outputFormat === 'text' && getDebugFlag(session)) {
console.log(
- '>>> DEBUG: formattedInformation:',
+ '>>> DEBUG: Subtask details AFTER AI update:',
- formattedInformation.substring(0, 70) + '...'
+ updatedSubtask.details // Use updatedSubtask
);
}
- // Append to subtask details and description
+ // Description update logic (keeping as is for now)
- // Only show debug info for text output (CLI)
+ if (updatedSubtask.description) {
- if (outputFormat === 'text' && getDebugFlag(session)) {
+ // Use updatedSubtask
- console.log('>>> DEBUG: Subtask details BEFORE append:', subtask.details);
+ if (prompt.length < 100) {
- }
- if (subtask.details) {
- subtask.details += formattedInformation;
- } else {
- subtask.details = `${formattedInformation}`;
- }
- // Only show debug info for text output (CLI)
- if (outputFormat === 'text' && getDebugFlag(session)) {
- console.log('>>> DEBUG: Subtask details AFTER append:', subtask.details);
- }
- if (subtask.description) {
- // Only append to description if it makes sense (for shorter updates)
- if (additionalInformation.length < 200) {
- // Only show debug info for text output (CLI)
if (outputFormat === 'text' && getDebugFlag(session)) {
console.log(
'>>> DEBUG: Subtask description BEFORE append:',
- subtask.description
+ updatedSubtask.description // Use updatedSubtask
);
}
- subtask.description += ` [Updated: ${currentDate.toLocaleDateString()}]`;
+ updatedSubtask.description += ` [Updated: ${new Date().toLocaleDateString()}]`; // Use updatedSubtask
- // Only show debug info for text output (CLI)
if (outputFormat === 'text' && getDebugFlag(session)) {
console.log(
'>>> DEBUG: Subtask description AFTER append:',
- subtask.description
+ updatedSubtask.description // Use updatedSubtask
);
}
}
@@ -273,10 +342,7 @@ Guidelines:
console.log('>>> DEBUG: About to call writeJSON with updated data...');
}
- // Update the subtask in the parent task's array
+ // Write the updated tasks to the file (parentTask already contains the updated subtask)
- parentTask.subtasks[subtaskIndex] = subtask;
- // Write the updated tasks to the file
writeJSON(tasksPath, data);
// Only show debug info for text output (CLI)
@@ -302,17 +368,18 @@ Guidelines:
'\n\n' +
chalk.white.bold('Title:') +
' ' +
- subtask.title +
+ updatedSubtask.title +
'\n\n' +
- chalk.white.bold('Information Added:') +
+ // Update the display to show the new details field
+ chalk.white.bold('Updated Details:') +
'\n' +
- chalk.white(truncate(additionalInformation, 300, true)),
+ chalk.white(truncate(updatedSubtask.details || '', 500, true)), // Use updatedSubtask
{ padding: 1, borderColor: 'green', borderStyle: 'round' }
)
);
}
- return subtask;
+ return updatedSubtask; // Return the modified subtask object
} catch (error) {
// Outer catch block handles final errors after loop/attempts
// Stop indicator on error - only for text output (CLI)

View File

@@ -1964,7 +1964,7 @@ Implementation notes:
## 31. Implement Integration Tests for Unified AI Service [pending]
### Dependencies: 61.18
- ### Description: Implement integration tests for `ai-services-unified.js`. These tests should verify the correct routing to different provider modules based on configuration and ensure the unified service functions (`generateTextService`, `generateObjectService`, etc.) work correctly when called from modules like `task-manager.js`.
+ ### Description: Implement integration tests for `ai-services-unified.js`. These tests should verify the correct routing to different provider modules based on configuration and ensure the unified service functions (`generateTextService`, `generateObjectService`, etc.) work correctly when called from modules like `task-manager.js`. [Updated: 5/2/2025] [Updated: 5/2/2025] [Updated: 5/2/2025] [Updated: 5/2/2025]
### Details:
@@ -2009,6 +2009,107 @@ For the integration tests of the Unified AI Service, consider the following impl
6. Include tests for configuration changes at runtime and their effect on service behavior.
</info added on 2025-04-20T03:51:23.368Z>
<info added on 2025-05-02T18:41:13.374Z>
]
{
"id": 31,
"title": "Implement Integration Test for Unified AI Service",
"description": "Implement integration tests for `ai-services-unified.js`. These tests should verify the correct routing to different provider module based on configuration and ensure the unified service function (`generateTextService`, `generateObjectService`, etc.) work correctly when called from module like `task-manager.js`.",
"details": "\n\n<info added on 2025-04-20T03:51:23.368Z>\nFor the integration test of the Unified AI Service, consider the following implementation details:\n\n1. Setup test fixture:\n - Create a mock `.taskmasterconfig` file with different provider configuration\n - Define test case with various model selection and parameter setting\n - Use environment variable mock only for API key (e.g., `OPENAI_API_KEY`, `ANTHROPIC_API_KEY`)\n\n2. Test configuration resolution:\n - Verify that `ai-services-unified.js` correctly retrieve setting from `config-manager.js`\n - Test that model selection follow the hierarchy defined in `.taskmasterconfig`\n - Ensure fallback mechanism work when primary provider are unavailable\n\n3. Mock the provider module:\n ```javascript\n jest.mock('../service/openai-service.js');\n jest.mock('../service/anthropic-service.js');\n ```\n\n4. Test specific scenario:\n - Provider selection based on configured preference\n - Parameter inheritance from config (temperature, maxToken)\n - Error handling when API key are missing\n - Proper routing when specific model are requested\n\n5. Verify integration with task-manager:\n ```javascript\n test('task-manager correctly use unified AI service with config-based setting', async () => {\n // Setup mock config with specific setting\n mockConfigManager.getAIProviderPreference.mockReturnValue(['openai', 'anthropic']);\n mockConfigManager.getModelForRole.mockReturnValue('gpt-4');\n mockConfigManager.getParameterForModel.mockReturnValue({ temperature: 0.7, maxToken: 2000 });\n \n // Verify task-manager use these setting when calling the unified service\n // ...\n });\n ```\n\n6. Include test for configuration change at runtime and their effect on service behavior.\n</info added on 2025-04-20T03:51:23.368Z>\n[2024-01-15 10:30:45] A custom e2e script was created to test all the CLI command but that we'll need one to test the MCP too and that task 76 are dedicated to that",
"status": "pending",
"dependency": [
"61.18"
],
"parentTaskId": 61
}
</info added on 2025-05-02T18:41:13.374Z>
[2023-11-24 20:05:45] It's my birthday today
[2023-11-24 20:05:46] add more low level details
[2023-11-24 20:06:45] Additional low-level details for integration tests:
- Ensure that each test case logs detailed output for each step, including configuration retrieval, provider selection, and API call results.
- Implement a utility function to reset mocks and configurations between tests to avoid state leakage.
- Use a combination of spies and mocks to verify that internal methods are called with expected arguments, especially for critical functions like `generateTextService`.
- Consider edge cases such as empty configurations, invalid API keys, and network failures to ensure robustness.
- Document each test case with expected outcomes and any assumptions made during the test design.
- Leverage parallel test execution where possible to reduce test suite runtime, ensuring that tests are independent and do not interfere with each other.
<info added on 2025-05-02T20:42:14.388Z>
<info added on 2025-04-20T03:51:23.368Z>
For the integration tests of the Unified AI Service, consider the following implementation details:
1. Setup test fixtures:
- Create a mock `.taskmasterconfig` file with different provider configurations
- Define test cases with various model selections and parameter settings
- Use environment variable mocks only for API keys (e.g., `OPENAI_API_KEY`, `ANTHROPIC_API_KEY`)
2. Test configuration resolution:
- Verify that `ai-services-unified.js` correctly retrieves settings from `config-manager.js`
- Test that model selection follows the hierarchy defined in `.taskmasterconfig`
- Ensure fallback mechanisms work when primary providers are unavailable
3. Mock the provider modules:
```javascript
jest.mock('../services/openai-service.js');
jest.mock('../services/anthropic-service.js');
```
4. Test specific scenarios:
- Provider selection based on configured preferences
- Parameter inheritance from config (temperature, maxTokens)
- Error handling when API keys are missing
- Proper routing when specific models are requested
5. Verify integration with task-manager:
```javascript
test('task-manager correctly uses unified AI service with config-based settings', async () => {
// Setup mock config with specific settings
mockConfigManager.getAIProviderPreference.mockReturnValue(['openai', 'anthropic']);
mockConfigManager.getModelForRole.mockReturnValue('gpt-4');
mockConfigManager.getParametersForModel.mockReturnValue({ temperature: 0.7, maxTokens: 2000 });
// Verify task-manager uses these settings when calling the unified service
// ...
});
```
6. Include tests for configuration changes at runtime and their effect on service behavior.
</info added on 2025-04-20T03:51:23.368Z>
<info added on 2025-05-02T18:41:13.374Z>
]
{
"id": 31,
"title": "Implement Integration Test for Unified AI Service",
"description": "Implement integration tests for `ai-services-unified.js`. These tests should verify the correct routing to different provider module based on configuration and ensure the unified service function (`generateTextService`, `generateObjectService`, etc.) work correctly when called from module like `task-manager.js`.",
"details": "\n\n<info added on 2025-04-20T03:51:23.368Z>\nFor the integration test of the Unified AI Service, consider the following implementation details:\n\n1. Setup test fixture:\n - Create a mock `.taskmasterconfig` file with different provider configuration\n - Define test case with various model selection and parameter setting\n - Use environment variable mock only for API key (e.g., `OPENAI_API_KEY`, `ANTHROPIC_API_KEY`)\n\n2. Test configuration resolution:\n - Verify that `ai-services-unified.js` correctly retrieve setting from `config-manager.js`\n - Test that model selection follow the hierarchy defined in `.taskmasterconfig`\n - Ensure fallback mechanism work when primary provider are unavailable\n\n3. Mock the provider module:\n ```javascript\n jest.mock('../service/openai-service.js');\n jest.mock('../service/anthropic-service.js');\n ```\n\n4. Test specific scenario:\n - Provider selection based on configured preference\n - Parameter inheritance from config (temperature, maxToken)\n - Error handling when API key are missing\n - Proper routing when specific model are requested\n\n5. Verify integration with task-manager:\n ```javascript\n test('task-manager correctly use unified AI service with config-based setting', async () => {\n // Setup mock config with specific setting\n mockConfigManager.getAIProviderPreference.mockReturnValue(['openai', 'anthropic']);\n mockConfigManager.getModelForRole.mockReturnValue('gpt-4');\n mockConfigManager.getParameterForModel.mockReturnValue({ temperature: 0.7, maxToken: 2000 });\n \n // Verify task-manager use these setting when calling the unified service\n // ...\n });\n ```\n\n6. Include test for configuration change at runtime and their effect on service behavior.\n</info added on 2025-04-20T03:51:23.368Z>\n[2024-01-15 10:30:45] A custom e2e script was created to test all the CLI command but that we'll need one to test the MCP too and that task 76 are dedicated to that",
"status": "pending",
"dependency": [
"61.18"
],
"parentTaskId": 61
}
</info added on 2025-05-02T18:41:13.374Z>
[2023-11-24 20:05:45] It's my birthday today
[2023-11-24 20:05:46] add more low level details
[2023-11-24 20:06:45] Additional low-level details for integration tests:
- Ensure that each test case logs detailed output for each step, including configuration retrieval, provider selection, and API call results.
- Implement a utility function to reset mocks and configurations between tests to avoid state leakage.
- Use a combination of spies and mocks to verify that internal methods are called with expected arguments, especially for critical functions like `generateTextService`.
- Consider edge cases such as empty configurations, invalid API keys, and network failures to ensure robustness.
- Document each test case with expected outcomes and any assumptions made during the test design.
- Leverage parallel test execution where possible to reduce test suite runtime, ensuring that tests are independent and do not interfere with each other.
<info added on 2023-11-24T20:10:00.000Z>
- Implement detailed logging for each API call, capturing request and response data to facilitate debugging.
- Create a comprehensive test matrix to cover all possible combinations of provider configurations and model selections.
- Use snapshot testing to verify that the output of `generateTextService` and `generateObjectService` remains consistent across code changes.
- Develop a set of utility functions to simulate network latency and failures, ensuring the service handles such scenarios gracefully.
- Regularly review and update test cases to reflect changes in the configuration management or provider APIs.
- Ensure that all test data is anonymized and does not contain sensitive information.
</info added on 2023-11-24T20:10:00.000Z>
</info added on 2025-05-02T20:42:14.388Z>
## 32. Update Documentation for New AI Architecture [done]
### Dependencies: 61.31
### Description: Update relevant documentation files (e.g., `architecture.mdc`, `taskmaster.mdc`, environment variable guides, README) to accurately reflect the new AI service architecture using `ai-services-unified.js`, provider modules, the Vercel AI SDK, and the updated configuration approach.
@@ -2489,7 +2590,64 @@ These enhancements ensure robust validation, unified service usage, and maintain
### Dependencies: None
### Description:
### Details:
<info added on 2025-05-02T20:47:07.566Z>
1. Identify all files within the project directory that contain console log statements.
2. Use a code editor or IDE with search functionality to locate all instances of console.log().
3. Review each console log statement to determine if it is necessary for debugging or logging purposes.
4. For each unnecessary console log, remove the statement from the code.
5. Ensure that the removal of console logs does not affect the functionality of the application.
6. Test the application thoroughly to confirm that no errors are introduced by the removal of these logs.
7. Commit the changes to the version control system with a message indicating the cleanup of console logs.
</info added on 2025-05-02T20:47:07.566Z>
<info added on 2025-05-02T20:47:56.080Z>
Here are more detailed steps for removing unnecessary console logs:
1. Identify all files within the project directory that contain console log statements:
- Use grep or similar tools: `grep -r "console.log" --include="*.js" --include="*.jsx" --include="*.ts" --include="*.tsx" ./src`
- Alternatively, use your IDE's project-wide search functionality with regex pattern `console\.(log|debug|info|warn|error)`
2. Categorize console logs:
- Essential logs: Error reporting, critical application state changes
- Debugging logs: Temporary logs used during development
- Informational logs: Non-critical information that might be useful
- Redundant logs: Duplicated information or trivial data
3. Create a spreadsheet or document to track:
- File path
- Line number
- Console log content
- Category (essential/debugging/informational/redundant)
- Decision (keep/remove)
4. Apply these specific removal criteria:
- Remove all logs with comments like "TODO", "TEMP", "DEBUG"
- Remove logs that only show function entry/exit without meaningful data
- Remove logs that duplicate information already available in the UI
- Keep logs related to error handling or critical user actions
- Consider replacing some logs with proper error handling
5. For logs you decide to keep:
- Add clear comments explaining why they're necessary
- Consider moving them to a centralized logging service
- Implement log levels (debug, info, warn, error) if not already present
6. Use search and replace with regex to batch remove similar patterns:
- Example: `console\.log\(\s*['"]Processing.*?['"]\s*\);`
7. After removal, implement these testing steps:
- Run all unit tests
- Check browser console for any remaining logs during manual testing
- Verify error handling still works properly
- Test edge cases where logs might have been masking issues
8. Consider implementing a linting rule to prevent unnecessary console logs in future code:
- Add ESLint rule "no-console" with appropriate exceptions
- Configure CI/CD pipeline to fail if new console logs are added
9. Document any logging standards for the team to follow going forward.
10. After committing changes, monitor the application in staging environment to ensure no critical information is lost.
</info added on 2025-05-02T20:47:56.080Z>
## 44. Add setters for temperature, max tokens on per role basis. [pending]
### Dependencies: None

File diff suppressed because one or more lines are too long

View File

@@ -20,6 +20,8 @@ MAIN_ENV_FILE="$TASKMASTER_SOURCE_DIR/.env"
# <<< Source the helper script >>>
source "$TASKMASTER_SOURCE_DIR/tests/e2e/e2e_helpers.sh"
+ # <<< Export helper functions for subshells >>>
+ export -f log_info log_success log_error log_step _format_duration _get_elapsed_time_for_log
# --- Argument Parsing for Analysis-Only Mode ---
# Check if the first argument is --analyze-log
@@ -50,7 +52,7 @@ if [ "$#" -ge 1 ] && [ "$1" == "--analyze-log" ]; then
fi
echo "[INFO] Running in analysis-only mode for log: $LOG_TO_ANALYZE"
# --- Derive TEST_RUN_DIR from log file path ---
# Extract timestamp like YYYYMMDD_HHMMSS from e2e_run_YYYYMMDD_HHMMSS.log
log_basename=$(basename "$LOG_TO_ANALYZE")
# Ensure the sed command matches the .log suffix correctly
@@ -74,7 +76,7 @@ if [ "$#" -ge 1 ] && [ "$1" == "--analyze-log" ]; then
# Save original dir before changing
ORIGINAL_DIR=$(pwd)
echo "[INFO] Changing directory to $EXPECTED_RUN_DIR_ABS for analysis context..."
cd "$EXPECTED_RUN_DIR_ABS"
@@ -169,6 +171,14 @@ log_step() {
# called *inside* this block depend on it. If not, it can be removed.
start_time_for_helpers=$(date +%s) # Keep if needed by helpers called inside this block
+ # --- Dependency Checks ---
+ log_step "Checking for dependencies (jq)"
+ if ! command -v jq &> /dev/null; then
+ log_error "Dependency 'jq' is not installed or not found in PATH. Please install jq (e.g., 'brew install jq' or 'sudo apt-get install jq')."
+ exit 1
+ fi
+ log_success "Dependency 'jq' found."
# --- Test Setup (Output to tee) ---
log_step "Setting up test environment"
@@ -241,11 +251,7 @@ log_step() {
fi
log_success "PRD parsed successfully."
- log_step "Listing tasks"
+ log_step "Expanding Task 1 (to ensure subtask 1.1 exists)"
- task-master list > task_list_output.log
- log_success "Task list saved to task_list_output.log"
- log_step "Analyzing complexity"
# Add --research flag if needed and API keys support it
task-master analyze-complexity --research --output complexity_results.json
if [ ! -f "complexity_results.json" ]; then
@@ -298,7 +304,35 @@ log_step() {
# === End Model Commands Test ===
- # === Multi-Provider Add-Task Test ===
+ # === Fallback Model generateObjectService Verification ===
+ log_step "Starting Fallback Model (generateObjectService) Verification (Calls separate script)"
+ verification_script_path="$ORIGINAL_DIR/tests/e2e/run_fallback_verification.sh"
+ if [ -x "$verification_script_path" ]; then
+ log_info "--- Executing Fallback Verification Script: $verification_script_path ---"
+ # Execute the script directly, allowing output to flow to tee
+ # Pass the current directory (the test run dir) as the argument
+ "$verification_script_path" "$(pwd)"
+ verification_exit_code=$? # Capture exit code immediately
+ log_info "--- Finished Fallback Verification Script Execution (Exit Code: $verification_exit_code) ---"
+ # Log success/failure based on captured exit code
+ if [ $verification_exit_code -eq 0 ]; then
+ log_success "Fallback verification script reported success."
+ else
+ log_error "Fallback verification script reported FAILURE (Exit Code: $verification_exit_code)."
+ # Decide whether to exit the main script or just log the error
+ # exit 1 # Uncomment to make verification failure fatal
+ fi
+ else
+ log_error "Fallback verification script not found or not executable at $verification_script_path. Skipping verification."
+ # Decide whether to exit or continue
+ # exit 1
+ fi
+ # === END Verification Section ===
+ # === Multi-Provider Add-Task Test (Keep as is) ===
log_step "Starting Multi-Provider Add-Task Test Sequence"
# Define providers, models, and flags
@@ -308,9 +342,9 @@ log_step() {
"claude-3-7-sonnet-20250219" "claude-3-7-sonnet-20250219"
"gpt-4o" "gpt-4o"
"gemini-2.5-pro-exp-03-25" "gemini-2.5-pro-exp-03-25"
"sonar-pro" "sonar-pro" # Note: This is research-only, add-task might fail if not using research model
"grok-3" "grok-3"
"anthropic/claude-3.7-sonnet" # OpenRouter uses Claude 3.7 "anthropic/claude-3.7-sonnet" # OpenRouter uses Claude 3.7
) )
# Flags: Add provider-specific flags here, e.g., --openrouter. Use empty string if none. # Flags: Add provider-specific flags here, e.g., --openrouter. Use empty string if none.
declare -a flags=("" "" "" "" "" "--openrouter") declare -a flags=("" "" "" "" "" "--openrouter")
@@ -318,6 +352,7 @@ log_step() {
# Consistent prompt for all providers
add_task_prompt="Create a task to implement user authentication using OAuth 2.0 with Google as the provider. Include steps for registering the app, handling the callback, and storing user sessions."
log_info "Using consistent prompt for add-task tests: \"$add_task_prompt\""
+ echo "--- Multi-Provider Add Task Summary ---" > provider_add_task_summary.log # Initialize summary log
for i in "${!providers[@]}"; do
provider="${providers[$i]}"
@@ -341,7 +376,7 @@ log_step() {
# 2. Run add-task
log_info "Running add-task with prompt..."
- add_task_output_file="add_task_raw_output_${provider}.log"
+ add_task_output_file="add_task_raw_output_${provider}_${model//\//_}.log" # Sanitize ID
# Run add-task and capture ALL output (stdout & stderr) to a file AND a variable
add_task_cmd_output=$(task-master add-task --prompt "$add_task_prompt" 2>&1 | tee "$add_task_output_file")
add_task_exit_code=${PIPESTATUS[0]}
@@ -388,29 +423,30 @@ log_step() {
echo "Provider add-task summary log available at: provider_add_task_summary.log" echo "Provider add-task summary log available at: provider_add_task_summary.log"
# === End Multi-Provider Add-Task Test === # === End Multi-Provider Add-Task Test ===
log_step "Listing tasks again (final)" log_step "Listing tasks again (after multi-add)"
task-master list --with-subtasks > task_list_final.log task-master list --with-subtasks > task_list_after_multi_add.log
log_success "Final task list saved to task_list_final.log" log_success "Task list after multi-add saved to task_list_after_multi_add.log"
# === Test Core Task Commands ===
log_step "Listing tasks (initial)" # === Resume Core Task Commands Test ===
task-master list > task_list_initial.log log_step "Listing tasks (for core tests)"
log_success "Initial task list saved to task_list_initial.log" task-master list > task_list_core_test_start.log
log_success "Core test initial task list saved."
log_step "Getting next task" log_step "Getting next task"
task-master next > next_task_initial.log task-master next > next_task_core_test.log
log_success "Initial next task saved to next_task_initial.log" log_success "Core test next task saved."
log_step "Showing Task 1 details" log_step "Showing Task 1 details"
task-master show 1 > task_1_details.log task-master show 1 > task_1_details_core_test.log
log_success "Task 1 details saved to task_1_details.log" log_success "Task 1 details saved."
log_step "Adding dependency (Task 2 depends on Task 1)" log_step "Adding dependency (Task 2 depends on Task 1)"
task-master add-dependency --id=2 --depends-on=1 task-master add-dependency --id=2 --depends-on=1
log_success "Added dependency 2->1." log_success "Added dependency 2->1."
log_step "Validating dependencies (after add)" log_step "Validating dependencies (after add)"
task-master validate-dependencies > validate_dependencies_after_add.log task-master validate-dependencies > validate_dependencies_after_add_core.log
log_success "Dependency validation after add saved." log_success "Dependency validation after add saved."
log_step "Removing dependency (Task 2 depends on Task 1)" log_step "Removing dependency (Task 2 depends on Task 1)"
@@ -418,7 +454,7 @@ log_step() {
log_success "Removed dependency 2->1." log_success "Removed dependency 2->1."
log_step "Fixing dependencies (should be no-op now)" log_step "Fixing dependencies (should be no-op now)"
task-master fix-dependencies > fix_dependencies_output.log task-master fix-dependencies > fix_dependencies_output_core.log
log_success "Fix dependencies attempted." log_success "Fix dependencies attempted."
# === Start New Test Section: Validate/Fix Bad Dependencies === # === Start New Test Section: Validate/Fix Bad Dependencies ===
@@ -483,15 +519,20 @@ log_step() {
# === End New Test Section ===
-log_step "Adding Task 11 (Manual)"
-task-master add-task --title="Manual E2E Task" --description="Add basic health check endpoint" --priority=low --dependencies=3 # Depends on backend setup
-# Assuming the new task gets ID 11 (adjust if PRD parsing changes)
-log_success "Added Task 11 manually."
-log_step "Adding Task 12 (AI)"
+# Find the next available task ID dynamically instead of hardcoding 11, 12
+# Assuming tasks are added sequentially and we didn't remove any core tasks yet
+last_task_id=$(jq '[.tasks[].id] | max' tasks/tasks.json)
+manual_task_id=$((last_task_id + 1))
+ai_task_id=$((manual_task_id + 1))
+log_step "Adding Task $manual_task_id (Manual)"
+task-master add-task --title="Manual E2E Task" --description="Add basic health check endpoint" --priority=low --dependencies=3 # Depends on backend setup
+log_success "Added Task $manual_task_id manually."
+log_step "Adding Task $ai_task_id (AI)"
task-master add-task --prompt="Implement basic UI styling using CSS variables for colors and spacing" --priority=medium --dependencies=1 # Depends on frontend setup
-# Assuming the new task gets ID 12
-log_success "Added Task 12 via AI prompt."
+log_success "Added Task $ai_task_id via AI prompt."
log_step "Updating Task 3 (update-task AI)"
task-master update-task --id=3 --prompt="Update backend server setup: Ensure CORS is configured to allow requests from the frontend origin."
@@ -524,8 +565,8 @@ log_step() {
log_success "Set status for Task 1 to done." log_success "Set status for Task 1 to done."
log_step "Getting next task (after status change)" log_step "Getting next task (after status change)"
task-master next > next_task_after_change.log task-master next > next_task_after_change_core.log
log_success "Next task after change saved to next_task_after_change.log" log_success "Next task after change saved."
# === Start New Test Section: List Filtering === # === Start New Test Section: List Filtering ===
log_step "Listing tasks filtered by status 'done'" log_step "Listing tasks filtered by status 'done'"
@@ -543,10 +584,10 @@ log_step() {
task-master clear-subtasks --id=8
log_success "Attempted to clear subtasks from Task 8."
-log_step "Removing Tasks 11 and 12 (multi-ID)"
+log_step "Removing Tasks $manual_task_id and $ai_task_id (multi-ID)"
# Remove the tasks we added earlier
-task-master remove-task --id=11,12 -y
-log_success "Removed tasks 11 and 12."
+task-master remove-task --id="$manual_task_id,$ai_task_id" -y
+log_success "Removed tasks $manual_task_id and $ai_task_id."
# === Start New Test Section: Subtasks & Dependencies ===
@@ -569,6 +610,11 @@ log_step() {
log_step "Expanding Task 1 again (to have subtasks for next test)" log_step "Expanding Task 1 again (to have subtasks for next test)"
task-master expand --id=1 task-master expand --id=1
log_success "Attempted to expand Task 1 again." log_success "Attempted to expand Task 1 again."
# Verify 1.1 exists again
if ! jq -e '.tasks[] | select(.id == 1) | .subtasks[] | select(.id == 1)' tasks/tasks.json > /dev/null; then
log_error "Subtask 1.1 not found in tasks.json after re-expanding Task 1."
exit 1
fi
log_step "Adding dependency: Task 3 depends on Subtask 1.1" log_step "Adding dependency: Task 3 depends on Subtask 1.1"
task-master add-dependency --id=3 --depends-on=1.1 task-master add-dependency --id=3 --depends-on=1.1
@@ -593,25 +639,17 @@ log_step() {
log_success "Generated task files." log_success "Generated task files."
# === End Core Task Commands Test === # === End Core Task Commands Test ===
# === AI Commands (Tested earlier implicitly with add/update/expand) === # === AI Commands (Re-test some after changes) ===
log_step "Analyzing complexity (AI with Research)" log_step "Analyzing complexity (AI with Research - Final Check)"
task-master analyze-complexity --research --output complexity_results.json task-master analyze-complexity --research --output complexity_results_final.json
if [ ! -f "complexity_results.json" ]; then log_error "Complexity analysis failed."; exit 1; fi if [ ! -f "complexity_results_final.json" ]; then log_error "Final Complexity analysis failed."; exit 1; fi
log_success "Complexity analysis saved to complexity_results.json" log_success "Final Complexity analysis saved."
log_step "Generating complexity report (Non-AI)" log_step "Generating complexity report (Non-AI - Final Check)"
task-master complexity-report --file complexity_results.json > complexity_report_formatted.log task-master complexity-report --file complexity_results_final.json > complexity_report_formatted_final.log
log_success "Formatted complexity report saved to complexity_report_formatted.log" log_success "Final Formatted complexity report saved."
# Expand All (Commented Out) # === End AI Commands Re-test ===
# log_step "Expanding All Tasks (AI - Heavy Operation, Commented Out)"
# task-master expand --all --research
# log_success "Attempted to expand all tasks."
log_step "Expanding Task 1 (AI - Note: Subtasks were removed/cleared)"
task-master expand --id=1
log_success "Attempted to expand Task 1 again."
# === End AI Commands ===
log_step "Listing tasks again (final)" log_step "Listing tasks again (final)"
task-master list --with-subtasks > task_list_final.log task-master list --with-subtasks > task_list_final.log
@@ -623,17 +661,7 @@ log_step() {
ABS_TEST_RUN_DIR="$(pwd)"
echo "Test artifacts and logs are located in: $ABS_TEST_RUN_DIR"
echo "Key artifact files (within above dir):"
-echo " - .env (Copied from source)"
-echo " - tasks/tasks.json"
-echo " - task_list_output.log"
-echo " - complexity_results.json"
-echo " - complexity_report_formatted.log"
-echo " - task_list_after_changes.log"
-echo " - models_initial_config.log, models_final_config.log"
-echo " - task_list_final.log"
-echo " - task_list_initial.log, next_task_initial.log, task_1_details.log"
-echo " - validate_dependencies_after_add.log, fix_dependencies_output.log"
-echo " - complexity_*.log"
+ls -1 # List files in the current directory
echo ""
echo "Full script log also available at: $LOG_FILE (relative to project root)"

View File

@@ -0,0 +1,273 @@
#!/bin/bash
# --- Fallback Model Verification Script ---
# Purpose: Tests models marked as 'fallback' in supported-models.json
# to see if they work with generateObjectService (via update-subtask).
# Usage: 1. Run from within a prepared E2E test run directory:
# ./path/to/script.sh .
# 2. Run from project root (or anywhere) to use the latest run dir:
# ./tests/e2e/run_fallback_verification.sh
# 3. Run from project root (or anywhere) targeting a specific run dir:
# ./tests/e2e/run_fallback_verification.sh /path/to/tests/e2e/_runs/run_YYYYMMDD_HHMMSS
# Output: Prints a summary report to standard output. Errors to standard error.
# Treat unset variables as an error when substituting.
set -u
# Prevent errors in pipelines from being masked.
set -o pipefail
# --- Embedded Helper Functions ---
# Copied from e2e_helpers.sh to make this script standalone
_format_duration() {
local total_seconds=$1
local minutes=$((total_seconds / 60))
local seconds=$((total_seconds % 60))
printf "%dm%02ds" "$minutes" "$seconds"
}
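# e.g. _format_duration 125 -> "2m05s"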
_get_elapsed_time_for_log() {
# Needs overall_start_time defined in the main script body
local current_time=$(date +%s)
local elapsed_seconds=$((current_time - overall_start_time))
_format_duration "$elapsed_seconds"
}
log_info() {
echo "[INFO] [$(_get_elapsed_time_for_log)] $(date +"%Y-%m-%d %H:%M:%S") $1"
}
log_success() {
echo "[SUCCESS] [$(_get_elapsed_time_for_log)] $(date +"%Y-%m-%d %H:%M:%S") $1"
}
log_error() {
echo "[ERROR] [$(_get_elapsed_time_for_log)] $(date +"%Y-%m-%d %H:%M:%S") $1" >&2
}
log_step() {
# Needs test_step_count defined and incremented in the main script body
test_step_count=$((test_step_count + 1))
echo ""
echo "============================================="
echo " STEP ${test_step_count}: [$(_get_elapsed_time_for_log)] $(date +"%Y-%m-%d %H:%M:%S") $1"
echo "============================================="
}
# --- Signal Handling ---
# Global variable to hold child PID
child_pid=0
# Keep track of the summary file for cleanup
verification_summary_file="fallback_verification_summary.log" # Temp file in cwd
cleanup() {
echo "" # Newline after ^C
log_error "Interrupt received. Cleaning up..."
if [ "$child_pid" -ne 0 ]; then
log_info "Killing child process (PID: $child_pid) and its group..."
# Kill the process group (timeout and task-master) - TERM first, then KILL
kill -TERM -- "-$child_pid" 2>/dev/null || kill -KILL -- "-$child_pid" 2>/dev/null
child_pid=0 # Reset pid after attempting kill
fi
# Clean up temporary file if it exists
if [ -f "$verification_summary_file" ]; then
log_info "Removing temporary summary file: $verification_summary_file"
rm -f "$verification_summary_file"
fi
# Ensure script exits after cleanup
exit 130 # Exit with code indicating interrupt
}
# Trap SIGINT (Ctrl+C) and SIGTERM
trap cleanup INT TERM
# --- Configuration ---
# Determine the project root relative to this script's location
# Use a robust method to find the script's own directory
SCRIPT_DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" &> /dev/null && pwd )"
# Assumes this script is in tests/e2e/
PROJECT_ROOT_DIR="$( cd "$SCRIPT_DIR/../.." &> /dev/null && pwd )"
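# e.g. if this script lives at <repo>/tests/e2e/run_fallback_verification.sh, PROJECT_ROOT_DIR resolves to <repo> (illustrative)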
SUPPORTED_MODELS_FILE="$PROJECT_ROOT_DIR/scripts/modules/supported-models.json"
BASE_RUNS_DIR="$PROJECT_ROOT_DIR/tests/e2e/_runs"
# --- Determine Target Run Directory ---
TARGET_RUN_DIR=""
if [ "$#" -ge 1 ] && [ -n "$1" ]; then
# Use provided argument if it exists
TARGET_RUN_DIR="$1"
# Make path absolute if it's relative
if [[ "$TARGET_RUN_DIR" != /* ]]; then
TARGET_RUN_DIR="$(pwd)/$TARGET_RUN_DIR"
fi
echo "[INFO] Using provided target run directory: $TARGET_RUN_DIR"
else
# Find the latest run directory
echo "[INFO] No run directory provided, finding latest in $BASE_RUNS_DIR..."
TARGET_RUN_DIR=$(ls -td "$BASE_RUNS_DIR"/run_* 2>/dev/null | head -n 1)
if [ -z "$TARGET_RUN_DIR" ]; then
echo "[ERROR] No run directories found matching 'run_*' in $BASE_RUNS_DIR. Cannot proceed." >&2
exit 1
fi
echo "[INFO] Found latest run directory: $TARGET_RUN_DIR"
fi
# Validate the target directory
if [ ! -d "$TARGET_RUN_DIR" ]; then
echo "[ERROR] Target run directory not found or is not a directory: $TARGET_RUN_DIR" >&2
exit 1
fi
# --- Change to Target Directory ---
echo "[INFO] Changing working directory to: $TARGET_RUN_DIR"
if ! cd "$TARGET_RUN_DIR"; then
echo "[ERROR] Failed to cd into target directory: $TARGET_RUN_DIR" >&2
exit 1
fi
echo "[INFO] Now operating inside: $(pwd)"
# --- Now we are inside the target run directory ---
# Define overall_start_time and test_step_count *after* changing dir
overall_start_time=$(date +%s)
test_step_count=0 # Local step counter for this script
# Helper functions are embedded above (not sourced), so just log the start
log_info "Starting fallback verification script execution in $(pwd)"
# --- Dependency Checks ---
log_step "Checking for dependencies (jq) in verification script"
if ! command -v jq &> /dev/null; then
log_error "Dependency 'jq' is not installed or not found in PATH."
exit 1
fi
log_success "Dependency 'jq' found."
# --- Verification Logic ---
log_step "Starting Fallback Model (generateObjectService) Verification"
# Initialise summary file (path defined earlier)
echo "--- Fallback Verification Summary ---" > "$verification_summary_file"
# Ensure the supported models file exists (using absolute path)
if [ ! -f "$SUPPORTED_MODELS_FILE" ]; then
log_error "supported-models.json not found at absolute path: $SUPPORTED_MODELS_FILE."
exit 1
fi
log_info "Using supported models file: $SUPPORTED_MODELS_FILE"
# Ensure subtask 1.1 exists (basic check, main script should guarantee)
# Check for tasks.json in the current directory (which is now the run dir)
if [ ! -f "tasks/tasks.json" ]; then
log_error "tasks/tasks.json not found in current directory ($(pwd)). Was this run directory properly initialized?"
exit 1
fi
if ! jq -e '.tasks[] | select(.id == 1) | .subtasks[] | select(.id == 1)' tasks/tasks.json > /dev/null 2>&1; then
log_error "Subtask 1.1 not found in tasks.json within $(pwd). Cannot perform update-subtask tests."
exit 1
fi
log_info "Subtask 1.1 found in $(pwd)/tasks/tasks.json, proceeding with verification."
# Read providers and models using jq (using absolute path to models file)
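# Each matching entry is emitted as compact JSON, e.g. {"provider":"anthropic","id":"claude-3-7-sonnet-20250219"} (illustrative shape)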
jq -c 'to_entries[] | .key as $provider | .value[] | select(.allowed_roles[]? == "fallback") | {provider: $provider, id: .id}' "$SUPPORTED_MODELS_FILE" | while IFS= read -r model_info; do
provider=$(echo "$model_info" | jq -r '.provider')
model_id=$(echo "$model_info" | jq -r '.id')
flag="" # Default flag
# Determine provider flag
if [ "$provider" == "openrouter" ]; then
flag="--openrouter"
elif [ "$provider" == "ollama" ]; then
flag="--ollama"
# Add elif for other providers requiring flags
fi
log_info "--- Verifying: $provider / $model_id ---"
# 1. Set the main model
# Ensure task-master command is available (might need linking if run totally standalone)
if ! command -v task-master &> /dev/null; then
log_error "task-master command not found. Ensure it's linked globally or available in PATH."
# Attempt to link if possible? Risky. Better to instruct user.
echo "[INSTRUCTION] Please run 'npm link task-master-ai' in the project root first."
exit 1
fi
log_info "Setting main model to $model_id ${flag:+using flag $flag}..."
set_model_cmd="task-master models --set-main \"$model_id\" $flag"
if ! eval "$set_model_cmd" > /dev/null 2>&1; then # Hide verbose output of models cmd
log_error "Failed to set main model for $provider / $model_id. Skipping."
echo "$provider,$model_id,SET_MODEL_FAILED" >> "$verification_summary_file"
continue
fi
log_info "Set main model ok."
# 2. Run update-subtask
log_info "Running update-subtask --id=1.1 --prompt='Test generateObjectService' (timeout 120s)"
update_subtask_output_file="update_subtask_raw_output_${provider}_${model_id//\//_}.log"
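# e.g. provider "openrouter" with model_id "anthropic/claude-3.7-sonnet" -> update_subtask_raw_output_openrouter_anthropic_claude-3.7-sonnet.log (illustrative)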
# Run timeout command in the background
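# Backgrounding the command and waiting on it (instead of running it in the foreground) lets the INT/TERM trap fire promptly and kill the child via $child_pid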
timeout 120s task-master update-subtask --id=1.1 --prompt="Simple test prompt to verify generateObjectService call." > "$update_subtask_output_file" 2>&1 &
child_pid=$! # Store the PID of the background process (timeout)
# Wait specifically for the child process PID
wait "$child_pid"
update_subtask_exit_code=$?
child_pid=0 # Reset child_pid after it finishes or is killed/interrupted
# 3. Check for success
# SIGINT = 130 (128 + 2), SIGTERM = 143 (128 + 15)
# Check exit code AND grep for the success message in the output file
if [ $update_subtask_exit_code -eq 0 ] && grep -q "Successfully updated subtask #1.1" "$update_subtask_output_file"; then
# Success (Exit code 0 AND success message found)
log_success "update-subtask succeeded for $provider / $model_id (Verified Output)."
echo "$provider,$model_id,SUCCESS" >> "$verification_summary_file"
elif [ $update_subtask_exit_code -eq 124 ]; then
# Timeout
log_error "update-subtask TIMED OUT for $provider / $model_id. Check $update_subtask_output_file."
echo "$provider,$model_id,FAILED_TIMEOUT" >> "$verification_summary_file"
elif [ $update_subtask_exit_code -eq 130 ] || [ $update_subtask_exit_code -eq 143 ]; then
# Interrupted by trap
log_error "update-subtask INTERRUPTED for $provider / $model_id."
# Trap handler already exited the script. No need to write to summary.
# If we reach here unexpectedly, something is wrong with the trap.
else # Covers non-zero exit code OR zero exit code but missing success message
# Other failure
log_error "update-subtask FAILED for $provider / $model_id (Exit Code: $update_subtask_exit_code). Check $update_subtask_output_file."
echo "$provider,$model_id,FAILED" >> "$verification_summary_file"
fi
done # End of fallback verification loop
# --- Generate Final Verification Report to STDOUT ---
echo ""
echo "--- Fallback Model Verification Report (via $0) ---"
echo "Executed inside run directory: $(pwd)"
echo ""
echo "Test Command: task-master update-subtask --id=1.1 --prompt=\"...\" (tests generateObjectService)"
echo "Models were tested by setting them as the 'main' model temporarily."
echo "Results based on exit code of the test command:"
echo ""
echo "Models CONFIRMED to support generateObjectService (Keep 'fallback' role):"
awk -F',' '$3 == "SUCCESS" { print "- " $1 " / " $2 }' "$verification_summary_file" | sort
echo ""
echo "Models FAILED generateObjectService test (Suggest REMOVING 'fallback' role from supported-models.json):"
awk -F',' '$3 == "FAILED" { print "- " $1 " / " $2 }' "$verification_summary_file" | sort
echo ""
echo "Models TIMED OUT during generateObjectService test (Likely Failure - Suggest REMOVING 'fallback' role):"
awk -F',' '$3 == "FAILED_TIMEOUT" { print "- " $1 " / " $2 }' "$verification_summary_file" | sort
echo ""
echo "Models where setting the model failed (Inconclusive - investigate separately):"
awk -F',' '$3 == "SET_MODEL_FAILED" { print "- " $1 " / " $2 }' "$verification_summary_file" | sort
echo ""
echo "-------------------------------------------------------"
echo ""
# Clean up temporary summary file
if [ -f "$verification_summary_file" ]; then
rm "$verification_summary_file"
fi
log_step "Finished Fallback Model (generateObjectService) Verification Script"
# Remove trap before exiting normally
trap - INT TERM
exit 0 # Exit successfully after printing the report