fix: enhance task expansion with multiple improvements
This commit resolves several issues with the task expansion system to ensure higher quality subtasks and better synchronization:

1. Task File Generation
   - Add automatic regeneration of task files after expanding tasks
   - Ensure individual task text files stay in sync with tasks.json
   - Avoids manual regeneration steps after task expansion

2. Perplexity API Integration
   - Fix 'researchPrompt is not defined' error in Perplexity integration
   - Add specialized research-oriented prompt template
   - Improve system message for better context and instruction
   - Better fallback to Claude when Perplexity unavailable

3. Subtask Parsing Improvements
   - Enhance regex pattern to handle more formatting variations
   - Implement multiple parsing strategies for different response formats:
     * Improved section detection with flexible headings
     * Added support for numbered and bulleted lists
     * Implemented heuristic-based title and description extraction
   - Create more meaningful dummy subtasks with relevant titles and descriptions instead of generic placeholders
   - Ensure minimal descriptions are always provided

4. Quality Verification and Retry System
   - Add post-expansion verification to identify low-quality subtask sets
   - Detect tasks with too many generic/placeholder subtasks
   - Implement interactive retry mechanism with enhanced prompts
   - Use adjusted settings for retries (research mode, subtask count)
   - Clear existing subtasks before retry to prevent duplicates
   - Provide detailed reporting of verification and retry process

These changes significantly improve the quality of generated subtasks and reduce the need for manual intervention when subtask generation produces suboptimal results.
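The parsing changes in (3) amount to a cascade of strategies with a heuristic fallback. A rough sketch of that approach is below; the function name and regexes are illustrative only and are not the actual code in scripts/dev.js.

```js
// Illustrative sketch of multi-strategy subtask parsing (hypothetical helper,
// not the real implementation in scripts/dev.js).
function parseSubtasksFromText(text, expectedCount) {
  // Strategy 1: numbered lists such as "1. Title: description" or "2) Title - description"
  const numbered = [...text.matchAll(/^\s*\d+[.)]\s+(.+)$/gm)].map((m) => m[1]);
  // Strategy 2: bulleted lists such as "- Title: description" or "* Title"
  const bulleted = [...text.matchAll(/^\s*[-*]\s+(.+)$/gm)].map((m) => m[1]);
  const lines = numbered.length >= bulleted.length ? numbered : bulleted;

  const subtasks = lines.slice(0, expectedCount).map((line, i) => {
    // Heuristic: split on the first ":" or " - " to separate title from description
    const [title, ...rest] = line.split(/:\s+|\s+-\s+/);
    return {
      id: i + 1,
      title: title.trim(),
      description: rest.join(' ').trim() || `Implement: ${title.trim()}`,
      status: 'pending',
      dependencies: [],
    };
  });

  // Fallback: pad with descriptive placeholders rather than generic "Subtask N" stubs
  while (subtasks.length < expectedCount) {
    const n = subtasks.length + 1;
    subtasks.push({
      id: n,
      title: `Step ${n} of the parent task`,
      description: 'Placeholder created because the AI response could not be fully parsed; refine manually.',
      status: 'pending',
      dependencies: [],
    });
  }
  return subtasks;
}
```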
@@ -1,267 +1,152 @@
|
||||
---
|
||||
description: guide the Cursor Agent in using the meta-development script (scripts/dev.js). It also defines the overall workflow for reading, updating, and generating tasks during AI-driven development.
|
||||
description: Guide for using meta-development script (scripts/dev.js) to manage task-driven development workflows
|
||||
globs: **/*
|
||||
alwaysApply: true
|
||||
---
|
||||
rules:
|
||||
- name: "Meta Development Workflow for Cursor Agent"
|
||||
description: >
|
||||
Provides comprehensive guidelines on how the agent (Cursor) should coordinate
|
||||
with the meta task script in scripts/dev.js. The agent will call
|
||||
these commands at various points in the coding process to keep
|
||||
tasks.json up to date and maintain a single source of truth for development tasks.
|
||||
triggers:
|
||||
# Potential triggers or states in Cursor where these rules apply.
|
||||
# You may list relevant event names, e.g., "onTaskCompletion" or "onUserCommand"
|
||||
- always
|
||||
steps:
|
||||
- "**Initial Setup**: If starting a new project with a PRD document, run `node scripts/dev.js parse-prd --input=<prd-file.txt>` to generate the initial tasks.json file. This will create a structured task list with IDs, titles, descriptions, dependencies, priorities, and test strategies."
|
||||
|
||||
- "**Task Discovery**: When a coding session begins, call `node scripts/dev.js list` to see the current tasks, their status, and IDs. This provides a quick overview of all tasks and their current states (pending, done, deferred)."
|
||||
- **Development Workflow Process**
|
||||
- Start new projects by running `node scripts/dev.js parse-prd --input=<prd-file.txt>` to generate initial tasks.json
|
||||
- Begin coding sessions with `node scripts/dev.js list` to see current tasks, status, and IDs
|
||||
- Analyze task complexity with `node scripts/dev.js analyze-complexity --research` before breaking down tasks
|
||||
- Select tasks based on dependencies (all marked 'done'), priority level, and ID order
|
||||
- Clarify tasks by checking task files in tasks/ directory or asking for user input
|
||||
- Break down complex tasks using `node scripts/dev.js expand --id=<id>` with appropriate flags
|
||||
- Implement code following task details, dependencies, and project standards
|
||||
- Verify tasks according to test strategies before marking as complete
|
||||
- Mark completed tasks with `node scripts/dev.js set-status --id=<id> --status=done`
|
||||
- Update dependent tasks when implementation differs from original plan
|
||||
- Generate task files with `node scripts/dev.js generate` after updating tasks.json
|
||||
- Respect dependency chains and task priorities when selecting work
|
||||
- Report progress regularly using the list command
|
||||
|
||||
- "**Task Selection**: Select the next pending task based on these criteria:
|
||||
1. Dependencies: Only select tasks whose dependencies are marked as 'done'
|
||||
2. Priority: Choose higher priority tasks first ('high' > 'medium' > 'low')
|
||||
3. ID order: When priorities are equal, select the task with the lowest ID
|
||||
If multiple tasks are eligible, present options to the user for selection."
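A minimal sketch of these selection rules, assuming tasks.json has been loaded into a `tasks` array (the `findNextTask` helper is hypothetical, not something dev.js exports):

```js
// Hypothetical illustration of the selection criteria above.
function findNextTask(tasks) {
  const done = new Set(tasks.filter((t) => t.status === 'done').map((t) => t.id));
  const rank = { high: 0, medium: 1, low: 2 };
  const eligible = tasks.filter(
    (t) => t.status === 'pending' && (t.dependencies || []).every((d) => done.has(d))
  );
  eligible.sort((a, b) => (rank[a.priority] ?? 1) - (rank[b.priority] ?? 1) || a.id - b.id);
  return eligible[0]; // undefined when nothing is eligible; ask the user if several tie
}
```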
|
||||
- **Task Complexity Analysis**
|
||||
- Run `node scripts/dev.js analyze-complexity --research` for comprehensive analysis
|
||||
- Review complexity report in scripts/task-complexity-report.json
|
||||
- Focus on tasks with highest complexity scores (8-10) for detailed breakdown
|
||||
- Use analysis results to determine appropriate subtask allocation
|
||||
- Note that reports are automatically used by the expand command
|
||||
|
||||
- "**Task Clarification**: If a task description is unclear or lacks detail:
|
||||
1. Check if a corresponding task file exists in the tasks/ directory (e.g., task_001.txt)
|
||||
2. If more information is needed, ask the user for clarification
|
||||
3. If architectural changes have occurred, run `node scripts/dev.js update --from=<id> --prompt=\"<new architectural context>\"` to update the task and all subsequent tasks"
|
||||
- **Task Breakdown Process**
|
||||
- For tasks with complexity analysis, use `node scripts/dev.js expand --id=<id>`
|
||||
- Otherwise use `node scripts/dev.js expand --id=<id> --subtasks=<number>`
|
||||
- Add `--research` flag to leverage Perplexity AI for research-backed expansion
|
||||
- Use `--prompt="<context>"` to provide additional context when needed
|
||||
- Review and adjust generated subtasks as necessary
|
||||
- Use `--all` flag to expand multiple pending tasks at once
|
||||
|
||||
- "**Task Breakdown**: For complex tasks that need to be broken down into smaller steps:
|
||||
1. Use `node scripts/dev.js expand --id=<id> --subtasks=<number>` to generate detailed subtasks
|
||||
2. Optionally provide additional context with `--prompt=\"<context>\"` to guide subtask generation
|
||||
3. Review the generated subtasks and adjust if necessary
|
||||
4. For multiple tasks, use `--all` flag to expand all pending tasks that don't have subtasks"
|
||||
- **Implementation Drift Handling**
|
||||
- When implementation differs significantly from planned approach
|
||||
- When future tasks need modification due to current implementation choices
|
||||
- When new dependencies or requirements emerge
|
||||
- Call `node scripts/dev.js update --from=<futureTaskId> --prompt="<explanation>"` to update tasks.json
|
||||
|
||||
- "**Task Implementation**: Implement the code necessary for the chosen task. Follow these guidelines:
|
||||
1. Reference the task's 'details' section for implementation specifics
|
||||
2. Consider dependencies on previous tasks when implementing
|
||||
3. Follow the project's coding standards and patterns
|
||||
4. Create appropriate tests based on the task's 'testStrategy' field"
|
||||
- **Task Status Management**
|
||||
- Use 'pending' for tasks ready to be worked on
|
||||
- Use 'done' for completed and verified tasks
|
||||
- Use 'deferred' for postponed tasks
|
||||
- Add custom status values as needed for project-specific workflows
|
||||
|
||||
- "**Task Verification**: Before marking a task as done, verify it according to:
|
||||
1. The task's specified 'testStrategy'
|
||||
2. Any automated tests in the codebase
|
||||
3. Manual verification if required
|
||||
4. Code quality standards (linting, formatting, etc.)"
|
||||
- **Task File Format Reference**
|
||||
```
|
||||
# Task ID: <id>
|
||||
# Title: <title>
|
||||
# Status: <status>
|
||||
# Dependencies: <comma-separated list of dependency IDs>
|
||||
# Priority: <priority>
|
||||
# Description: <brief description>
|
||||
# Details:
|
||||
<detailed implementation notes>
|
||||
|
||||
- "**Task Completion**: When a task is completed and verified, run `node scripts/dev.js set-status --id=<id> --status=done` to mark it as done in tasks.json. This ensures the task tracking remains accurate."
|
||||
# Test Strategy:
|
||||
<verification approach>
|
||||
```
|
||||
|
||||
- "**Implementation Drift Handling**: If during implementation, you discover that:
|
||||
1. The current approach differs significantly from what was planned
|
||||
2. Future tasks need to be modified due to current implementation choices
|
||||
3. New dependencies or requirements have emerged
|
||||
- **Command Reference: parse-prd**
|
||||
- Syntax: `node scripts/dev.js parse-prd --input=<prd-file.txt>`
|
||||
- Description: Parses a PRD document and generates a tasks.json file with structured tasks
|
||||
- Parameters:
|
||||
- `--input=<file>`: Path to the PRD text file (default: sample-prd.txt)
|
||||
- Example: `node scripts/dev.js parse-prd --input=requirements.txt`
|
||||
- Notes: Will overwrite existing tasks.json file. Use with caution.
|
||||
|
||||
Then call `node scripts/dev.js update --from=<futureTaskId> --prompt=\"Detailed explanation of architectural or implementation changes...\"` to rewrite or re-scope subsequent tasks in tasks.json."
|
||||
- **Command Reference: update**
|
||||
- Syntax: `node scripts/dev.js update --from=<id> --prompt="<prompt>"`
|
||||
- Description: Updates tasks with ID >= specified ID based on the provided prompt
|
||||
- Parameters:
|
||||
- `--from=<id>`: Task ID from which to start updating (required)
|
||||
- `--prompt="<text>"`: Explanation of changes or new context (required)
|
||||
- Example: `node scripts/dev.js update --from=4 --prompt="Now we are using Express instead of Fastify."`
|
||||
- Notes: Only updates tasks not marked as 'done'. Completed tasks remain unchanged.
|
||||
|
||||
- "**Task File Generation**: After any updates to tasks.json (status changes, task updates), run `node scripts/dev.js generate` to regenerate the individual task_XXX.txt files in the tasks/ folder. This ensures that task files are always in sync with tasks.json."
|
||||
- **Command Reference: generate**
|
||||
- Syntax: `node scripts/dev.js generate`
|
||||
- Description: Generates individual task files in tasks/ directory based on tasks.json
|
||||
- Parameters: None
|
||||
- Example: `node scripts/dev.js generate`
|
||||
- Notes: Overwrites existing task files. Creates tasks/ directory if needed.
|
||||
|
||||
- "**Task Status Management**: Use appropriate status values when updating tasks:
|
||||
1. 'pending': Tasks that are ready to be worked on
|
||||
2. 'done': Tasks that have been completed and verified
|
||||
3. 'deferred': Tasks that have been postponed to a later time
|
||||
4. Any other custom status that might be relevant to the project"
|
||||
- **Command Reference: set-status**
|
||||
- Syntax: `node scripts/dev.js set-status --id=<id> --status=<status>`
|
||||
- Description: Updates the status of a specific task in tasks.json
|
||||
- Parameters:
|
||||
- `--id=<id>`: ID of the task to update (required)
|
||||
- `--status=<status>`: New status value (required)
|
||||
- Example: `node scripts/dev.js set-status --id=3 --status=done`
|
||||
- Notes: Common values are 'done', 'pending', and 'deferred', but any string is accepted.
|
||||
|
||||
- "**Dependency Management**: When selecting tasks, always respect the dependency chain:
|
||||
1. Never start a task whose dependencies are not marked as 'done'
|
||||
2. If a dependency task is deferred, consider whether dependent tasks should also be deferred
|
||||
3. If dependency relationships change during development, update tasks.json accordingly"
|
||||
- **Command Reference: list**
|
||||
- Syntax: `node scripts/dev.js list`
|
||||
- Description: Lists all tasks in tasks.json with IDs, titles, and status
|
||||
- Parameters: None
|
||||
- Example: `node scripts/dev.js list`
|
||||
- Notes: Provides quick overview of project progress. Use at start of sessions.
|
||||
|
||||
- "**Progress Reporting**: Periodically (at the beginning of sessions or after completing significant tasks), run `node scripts/dev.js list` to provide the user with an updated view of project progress."
|
||||
- **Command Reference: expand**
|
||||
- Syntax: `node scripts/dev.js expand --id=<id> [--num=<number>] [--research] [--prompt="<context>"]`
|
||||
- Description: Expands a task with subtasks for detailed implementation
|
||||
- Parameters:
|
||||
- `--id=<id>`: ID of task to expand (required unless using --all)
|
||||
- `--all`: Expand all pending tasks, prioritized by complexity
|
||||
- `--num=<number>`: Number of subtasks to generate (default: from complexity report)
|
||||
- `--research`: Use Perplexity AI for research-backed generation
|
||||
- `--prompt="<text>"`: Additional context for subtask generation
|
||||
- `--force`: Regenerate subtasks even for tasks that already have them
|
||||
- Example: `node scripts/dev.js expand --id=3 --num=5 --research --prompt="Focus on security aspects"`
|
||||
- Notes: Uses complexity report recommendations if available.
|
||||
|
||||
- "**Task File Format**: When reading task files, understand they follow this structure:
|
||||
```
|
||||
# Task ID: <id>
|
||||
# Title: <title>
|
||||
# Status: <status>
|
||||
# Dependencies: <comma-separated list of dependency IDs>
|
||||
# Priority: <priority>
|
||||
# Description: <brief description>
|
||||
# Details:
|
||||
<detailed implementation notes>
|
||||
- **Command Reference: analyze-complexity**
|
||||
- Syntax: `node scripts/dev.js analyze-complexity [options]`
|
||||
- Description: Analyzes task complexity and generates expansion recommendations
|
||||
- Parameters:
|
||||
- `--output=<file>, -o`: Output file path (default: scripts/task-complexity-report.json)
|
||||
- `--model=<model>, -m`: Override LLM model to use
|
||||
- `--threshold=<number>, -t`: Minimum score for expansion recommendation (default: 5)
|
||||
- `--file=<path>, -f`: Use alternative tasks.json file
|
||||
- `--research, -r`: Use Perplexity AI for research-backed analysis
|
||||
- Example: `node scripts/dev.js analyze-complexity --research`
|
||||
- Notes: Report includes complexity scores, recommended subtasks, and tailored prompts.
|
||||
|
||||
# Test Strategy:
|
||||
<verification approach>
|
||||
```"
|
||||
- **Task Structure Fields** (an example entry is shown after this list)
|
||||
- **id**: Unique identifier for the task (Example: `1`)
|
||||
- **title**: Brief, descriptive title (Example: `"Initialize Repo"`)
|
||||
- **description**: Concise summary of what the task involves (Example: `"Create a new repository, set up initial structure."`)
|
||||
- **status**: Current state of the task (Example: `"pending"`, `"done"`, `"deferred"`)
|
||||
- **dependencies**: IDs of prerequisite tasks (Example: `[1, 2]`)
|
||||
- **priority**: Importance level (Example: `"high"`, `"medium"`, `"low"`)
|
||||
- **details**: In-depth implementation instructions (Example: `"Use GitHub client ID/secret, handle callback, set session token."`)
|
||||
- **testStrategy**: Verification approach (Example: `"Deploy and call endpoint to confirm 'Hello World' response."`)
|
||||
- **subtasks**: List of smaller, more specific tasks (Example: `[{"id": 1, "title": "Configure OAuth", ...}]`)
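Putting these fields together, a single illustrative entry in tasks.json might look like this (values are examples only):

```json
{
  "id": 12,
  "title": "Add GitHub OAuth Login",
  "description": "Implement GitHub OAuth sign-in and account creation.",
  "status": "pending",
  "dependencies": [1, 2],
  "priority": "high",
  "details": "Use GitHub client ID/secret, handle callback, set session token.",
  "testStrategy": "Log in with a test GitHub account and confirm a session token is set.",
  "subtasks": [
    {
      "id": 1,
      "title": "Configure OAuth",
      "description": "Register the OAuth app and store credentials.",
      "status": "pending",
      "dependencies": [],
      "acceptanceCriteria": "Client ID/secret available via environment variables."
    }
  ]
}
```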
|
||||
|
||||
- "**Continuous Workflow**: Repeat this process until all tasks relevant to the current development phase are completed. Always maintain tasks.json as the single source of truth for development progress."
|
||||
|
||||
- name: "Meta-Development Script Command Reference"
|
||||
description: >
|
||||
Detailed reference for all commands available in the scripts/dev.js meta-development script.
|
||||
This helps the agent understand the full capabilities of the script and use it effectively.
|
||||
triggers:
|
||||
- always
|
||||
commands:
|
||||
- name: "parse-prd"
|
||||
syntax: "node scripts/dev.js parse-prd --input=<prd-file.txt>"
|
||||
description: "Parses a PRD document and generates a tasks.json file with structured tasks. This initializes the task tracking system."
|
||||
parameters:
|
||||
- "--input=<file>: Path to the PRD text file (default: sample-prd.txt)"
|
||||
example: "node scripts/dev.js parse-prd --input=requirements.txt"
|
||||
notes: "This will overwrite any existing tasks.json file. Use with caution on established projects."
|
||||
|
||||
- name: "update"
|
||||
syntax: "node scripts/dev.js update --from=<id> --prompt=\"<prompt>\""
|
||||
description: "Updates tasks with ID >= the specified ID based on the provided prompt. Useful for handling implementation drift or architectural changes."
|
||||
parameters:
|
||||
- "--from=<id>: The task ID from which to start updating (required)"
|
||||
- "--prompt=\"<text>\": The prompt explaining the changes or new context (required)"
|
||||
example: "node scripts/dev.js update --from=4 --prompt=\"Now we are using Express instead of Fastify.\""
|
||||
notes: "Only updates tasks that aren't marked as 'done'. Completed tasks remain unchanged."
|
||||
|
||||
- name: "generate"
|
||||
syntax: "node scripts/dev.js generate"
|
||||
description: "Generates individual task files in the tasks/ directory based on the current state of tasks.json."
|
||||
parameters: "None"
|
||||
example: "node scripts/dev.js generate"
|
||||
notes: "Overwrites existing task files. Creates the tasks/ directory if it doesn't exist."
|
||||
|
||||
- name: "set-status"
|
||||
syntax: "node scripts/dev.js set-status --id=<id> --status=<status>"
|
||||
description: "Updates the status of a specific task in tasks.json."
|
||||
parameters:
|
||||
- "--id=<id>: The ID of the task to update (required)"
|
||||
- "--status=<status>: The new status (e.g., 'done', 'pending', 'deferred') (required)"
|
||||
example: "node scripts/dev.js set-status --id=3 --status=done"
|
||||
notes: "Common status values are 'done', 'pending', and 'deferred', but any string is accepted."
|
||||
|
||||
- name: "list"
|
||||
syntax: "node scripts/dev.js list"
|
||||
description: "Lists all tasks in tasks.json with their IDs, titles, and current status."
|
||||
parameters: "None"
|
||||
example: "node scripts/dev.js list"
|
||||
notes: "Provides a quick overview of project progress. Use this at the start of coding sessions."
|
||||
|
||||
- name: "expand"
|
||||
syntax: "node scripts/dev.js expand --id=<id> [--subtasks=<number>] [--prompt=\"<context>\"]"
|
||||
description: "Expands a task with subtasks for more detailed implementation. Can also expand all tasks with the --all flag."
|
||||
parameters:
|
||||
- "--id=<id>: The ID of the task to expand (required unless using --all)"
|
||||
- "--all: Expand all pending tasks that don't have subtasks"
|
||||
- "--subtasks=<number>: Number of subtasks to generate (default: 3)"
|
||||
- "--prompt=\"<text>\": Additional context to guide subtask generation"
|
||||
- "--force: When used with --all, regenerates subtasks even for tasks that already have them"
|
||||
example: "node scripts/dev.js expand --id=3 --subtasks=5 --prompt=\"Focus on security aspects\""
|
||||
notes: "Tasks marked as 'done' or 'completed' are always skipped. By default, tasks that already have subtasks are skipped unless --force is used."
|
||||
|
||||
- name: "Task Structure Reference"
|
||||
description: >
|
||||
Details the structure of tasks in tasks.json to help the agent understand
|
||||
and work with the task data effectively.
|
||||
triggers:
|
||||
- always
|
||||
task_fields:
|
||||
- name: "id"
|
||||
type: "number"
|
||||
description: "Unique identifier for the task. Used in commands and for tracking dependencies."
|
||||
example: "1"
|
||||
|
||||
- name: "title"
|
||||
type: "string"
|
||||
description: "Brief, descriptive title of the task."
|
||||
example: "Initialize Repo"
|
||||
|
||||
- name: "description"
|
||||
type: "string"
|
||||
description: "Concise description of what the task involves."
|
||||
example: "Create a new repository, set up initial structure."
|
||||
|
||||
- name: "status"
|
||||
type: "string"
|
||||
description: "Current state of the task. Common values: 'pending', 'done', 'deferred'."
|
||||
example: "pending"
|
||||
|
||||
- name: "dependencies"
|
||||
type: "array of numbers"
|
||||
description: "IDs of tasks that must be completed before this task can be started."
|
||||
example: "[1, 2]"
|
||||
|
||||
- name: "priority"
|
||||
type: "string"
|
||||
description: "Importance level of the task. Common values: 'high', 'medium', 'low'."
|
||||
example: "high"
|
||||
|
||||
- name: "details"
|
||||
type: "string"
|
||||
description: "In-depth instructions, references, or context for implementing the task."
|
||||
example: "Use GitHub client ID/secret, handle callback, set session token."
|
||||
|
||||
- name: "testStrategy"
|
||||
type: "string"
|
||||
description: "Approach for verifying the task has been completed correctly."
|
||||
example: "Deploy and call endpoint to confirm 'Hello World' response."
|
||||
|
||||
- name: "subtasks"
|
||||
type: "array of objects"
|
||||
description: "List of smaller, more specific tasks that make up the main task."
|
||||
example: "[{\"id\": 1, \"title\": \"Configure OAuth\", \"description\": \"...\", \"status\": \"pending\", \"dependencies\": [], \"acceptanceCriteria\": \"...\"}]"
|
||||
|
||||
- name: "Environment Variables Reference"
|
||||
description: >
|
||||
Details the environment variables that can be used to configure the dev.js script.
|
||||
These variables should be set in a .env file at the root of the project.
|
||||
triggers:
|
||||
- always
|
||||
variables:
|
||||
- name: "ANTHROPIC_API_KEY"
|
||||
required: true
|
||||
description: "Your Anthropic API key for Claude. Required for task generation and expansion."
|
||||
example: "ANTHROPIC_API_KEY=sk-ant-api03-..."
|
||||
|
||||
- name: "MODEL"
|
||||
required: false
|
||||
default: "claude-3-7-sonnet-20250219"
|
||||
description: "Specify which Claude model to use for task generation and expansion."
|
||||
example: "MODEL=claude-3-opus-20240229"
|
||||
|
||||
- name: "MAX_TOKENS"
|
||||
required: false
|
||||
default: "4000"
|
||||
description: "Maximum tokens for model responses. Higher values allow for more detailed task generation."
|
||||
example: "MAX_TOKENS=8000"
|
||||
|
||||
- name: "TEMPERATURE"
|
||||
required: false
|
||||
default: "0.7"
|
||||
description: "Temperature for model responses. Higher values (0.0-1.0) increase creativity but may reduce consistency."
|
||||
example: "TEMPERATURE=0.5"
|
||||
|
||||
- name: "DEBUG"
|
||||
required: false
|
||||
default: "false"
|
||||
description: "Enable debug logging. When true, detailed logs are written to dev-debug.log."
|
||||
example: "DEBUG=true"
|
||||
|
||||
- name: "LOG_LEVEL"
|
||||
required: false
|
||||
default: "info"
|
||||
description: "Log level for console output. Options: debug, info, warn, error."
|
||||
example: "LOG_LEVEL=debug"
|
||||
|
||||
- name: "DEFAULT_SUBTASKS"
|
||||
required: false
|
||||
default: "3"
|
||||
description: "Default number of subtasks when expanding a task."
|
||||
example: "DEFAULT_SUBTASKS=5"
|
||||
|
||||
- name: "DEFAULT_PRIORITY"
|
||||
required: false
|
||||
default: "medium"
|
||||
description: "Default priority for generated tasks. Options: high, medium, low."
|
||||
example: "DEFAULT_PRIORITY=high"
|
||||
|
||||
- name: "PROJECT_NAME"
|
||||
required: false
|
||||
default: "MCP SaaS MVP"
|
||||
description: "Override default project name in tasks.json metadata."
|
||||
example: "PROJECT_NAME=My Awesome Project"
|
||||
|
||||
- name: "PROJECT_VERSION"
|
||||
required: false
|
||||
default: "1.0.0"
|
||||
description: "Override default version in tasks.json metadata."
|
||||
example: "PROJECT_VERSION=2.1.0"
|
||||
- **Environment Variables Configuration**
|
||||
- **ANTHROPIC_API_KEY** (Required): Your Anthropic API key for Claude (Example: `ANTHROPIC_API_KEY=sk-ant-api03-...`)
|
||||
- **MODEL** (Default: `"claude-3-7-sonnet-20250219"`): Claude model to use (Example: `MODEL=claude-3-opus-20240229`)
|
||||
- **MAX_TOKENS** (Default: `"4000"`): Maximum tokens for responses (Example: `MAX_TOKENS=8000`)
|
||||
- **TEMPERATURE** (Default: `"0.7"`): Temperature for model responses (Example: `TEMPERATURE=0.5`)
|
||||
- **DEBUG** (Default: `"false"`): Enable debug logging (Example: `DEBUG=true`)
|
||||
- **LOG_LEVEL** (Default: `"info"`): Console output level (Example: `LOG_LEVEL=debug`)
|
||||
- **DEFAULT_SUBTASKS** (Default: `"3"`): Default subtask count (Example: `DEFAULT_SUBTASKS=5`)
|
||||
- **DEFAULT_PRIORITY** (Default: `"medium"`): Default priority (Example: `DEFAULT_PRIORITY=high`)
|
||||
- **PROJECT_NAME** (Default: `"MCP SaaS MVP"`): Project name in metadata (Example: `PROJECT_NAME=My Awesome Project`)
|
||||
- **PROJECT_VERSION** (Default: `"1.0.0"`): Version in metadata (Example: `PROJECT_VERSION=2.1.0`)
|
||||
- **PERPLEXITY_API_KEY**: For research-backed features (Example: `PERPLEXITY_API_KEY=pplx-...`)
|
||||
- **PERPLEXITY_MODEL** (Default: `"sonar-medium-online"`): Perplexity model (Example: `PERPLEXITY_MODEL=sonar-large-online`)
|
||||
|
||||
@@ -4,16 +4,16 @@ PERPLEXITY_API_KEY=your_perplexity_api_key_here # Format: pplx-...
|
||||
|
||||
# Model Configuration
|
||||
MODEL=claude-3-7-sonnet-20250219 # Recommended models: claude-3-7-sonnet-20250219, claude-3-opus-20240229
|
||||
PERPLEXITY_MODEL=sonar-small-online # Perplexity model for research-backed subtasks
|
||||
MAX_TOKENS=4000 # Maximum tokens for model responses
|
||||
TEMPERATURE=0.7 # Temperature for model responses (0.0-1.0)
|
||||
PERPLEXITY_MODEL=sonar-pro # Perplexity model for research-backed subtasks
|
||||
MAX_TOKENS=64000 # Maximum tokens for model responses
|
||||
TEMPERATURE=0.4 # Temperature for model responses (0.0-1.0)
|
||||
|
||||
# Logging Configuration
|
||||
DEBUG=false # Enable debug logging (true/false)
|
||||
LOG_LEVEL=info # Log level (debug, info, warn, error)
|
||||
|
||||
# Task Generation Settings
|
||||
DEFAULT_SUBTASKS=3 # Default number of subtasks when expanding
|
||||
DEFAULT_SUBTASKS=4 # Default number of subtasks when expanding
|
||||
DEFAULT_PRIORITY=medium # Default priority for generated tasks (high, medium, low)
|
||||
|
||||
# Project Metadata (Optional)
|
||||
|
||||
@@ -361,8 +361,80 @@ Please mark it as complete and tell me what I should work on next.
|
||||
|
||||
## Documentation
|
||||
|
||||
For more detailed documentation on the scripts, see the [scripts/README.md](scripts/README.md) file in your initialized project.
|
||||
For more detailed documentation on the scripts and command-line options, see the [scripts/README.md](scripts/README.md) file in your initialized project.
|
||||
|
||||
## License
|
||||
|
||||
MIT
|
||||
|
||||
### Analyzing Task Complexity
|
||||
|
||||
To analyze the complexity of tasks and automatically generate expansion recommendations:
|
||||
|
||||
```bash
|
||||
npm run dev -- analyze-complexity
|
||||
```
|
||||
|
||||
This command:
|
||||
- Analyzes each task using AI to assess its complexity
|
||||
- Recommends optimal number of subtasks based on configured DEFAULT_SUBTASKS
|
||||
- Generates tailored prompts for expanding each task
|
||||
- Creates a comprehensive JSON report with ready-to-use commands
|
||||
- Saves the report to scripts/task-complexity-report.json by default
|
||||
|
||||
Options:
|
||||
```bash
|
||||
# Save report to a custom location
|
||||
npm run dev -- analyze-complexity --output=my-report.json
|
||||
|
||||
# Use a specific LLM model
|
||||
npm run dev -- analyze-complexity --model=claude-3-opus-20240229
|
||||
|
||||
# Set a custom complexity threshold (1-10)
|
||||
npm run dev -- analyze-complexity --threshold=6
|
||||
|
||||
# Use an alternative tasks file
|
||||
npm run dev -- analyze-complexity --file=custom-tasks.json
|
||||
|
||||
# Use Perplexity AI for research-backed complexity analysis
|
||||
npm run dev -- analyze-complexity --research
|
||||
```
|
||||
|
||||
The generated report contains:
|
||||
- Complexity analysis for each task (scored 1-10)
|
||||
- Recommended number of subtasks based on complexity
|
||||
- AI-generated expansion prompts customized for each task
|
||||
- Ready-to-run expansion commands directly within each task analysis
|
||||
|
||||
### Smart Task Expansion
|
||||
|
||||
The `expand` command now automatically checks for and uses the complexity report:
|
||||
|
||||
```bash
|
||||
# Expand a task, using complexity report recommendations if available
|
||||
npm run dev -- expand --id=8
|
||||
|
||||
# Expand all tasks, prioritizing by complexity score if a report exists
|
||||
npm run dev -- expand --all
|
||||
```
|
||||
|
||||
When a complexity report exists (a sketch of how these settings combine follows this list):
|
||||
- Tasks are automatically expanded using the recommended subtask count and prompts
|
||||
- When expanding all tasks, they're processed in order of complexity (highest first)
|
||||
- Research-backed generation is preserved from the complexity analysis
|
||||
- You can still override recommendations with explicit command-line options
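As a rough sketch of how recommendations and explicit flags might be combined (hypothetical helper, not the actual dev.js code):

```js
// Illustrative only: merge complexity-report recommendations with CLI options.
function getExpansionSettings(taskId, report, cli) {
  const rec = report?.complexityAnalysis?.find((a) => a.taskId === taskId);
  return {
    numSubtasks: cli.num ?? rec?.recommendedSubtasks ?? Number(process.env.DEFAULT_SUBTASKS || 3),
    prompt: cli.prompt ?? rec?.expansionPrompt ?? '',
    useResearch: cli.research ?? Boolean(report?.meta?.usedResearch),
  };
}
```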
|
||||
|
||||
Example workflow:
|
||||
```bash
|
||||
# Generate the complexity analysis report with research capabilities
|
||||
npm run dev -- analyze-complexity --research
|
||||
|
||||
# Review the report in scripts/task-complexity-report.json
|
||||
|
||||
# Expand tasks using the optimized recommendations
|
||||
npm run dev -- expand --id=8
|
||||
# or expand all tasks
|
||||
npm run dev -- expand --all
|
||||
```
|
||||
|
||||
This integration ensures that task expansion is informed by thorough complexity analysis, resulting in better subtask organization and more efficient development.
|
||||
|
||||
assets/scripts_README.md | 181 lines (new file)
@@ -0,0 +1,181 @@
|
||||
# Meta-Development Script
|
||||
|
||||
This folder contains a **meta-development script** (`dev.js`) and related utilities that manage tasks for an AI-driven or traditional software development workflow. The script revolves around a `tasks.json` file, which holds an up-to-date list of development tasks.
|
||||
|
||||
## Overview
|
||||
|
||||
In an AI-driven development process—particularly with tools like [Cursor](https://www.cursor.so/)—it's beneficial to have a **single source of truth** for tasks. This script allows you to:
|
||||
|
||||
1. **Parse** a PRD or requirements document (`.txt`) to initialize a set of tasks (`tasks.json`).
|
||||
2. **List** all existing tasks (IDs, statuses, titles).
|
||||
3. **Update** tasks to accommodate new prompts or architecture changes (useful if you discover "implementation drift").
|
||||
4. **Generate** individual task files (e.g., `task_001.txt`) for easy reference or to feed into an AI coding workflow.
|
||||
5. **Set task status**—mark tasks as `done`, `pending`, or `deferred` based on progress.
|
||||
6. **Expand** tasks with subtasks—break down complex tasks into smaller, more manageable subtasks.
|
||||
7. **Research-backed subtask generation**—use Perplexity AI to generate more informed and contextually relevant subtasks.
|
||||
|
||||
## Configuration
|
||||
|
||||
The script can be configured through environment variables in a `.env` file at the root of the project:
|
||||
|
||||
### Required Configuration
|
||||
- `ANTHROPIC_API_KEY`: Your Anthropic API key for Claude
|
||||
|
||||
### Optional Configuration
|
||||
- `MODEL`: Specify which Claude model to use (default: "claude-3-7-sonnet-20250219")
|
||||
- `MAX_TOKENS`: Maximum tokens for model responses (default: 4000)
|
||||
- `TEMPERATURE`: Temperature for model responses (default: 0.7)
|
||||
- `PERPLEXITY_API_KEY`: Your Perplexity API key for research-backed subtask generation
|
||||
- `PERPLEXITY_MODEL`: Specify which Perplexity model to use (default: "sonar-medium-online")
|
||||
- `DEBUG`: Enable debug logging (default: false)
|
||||
- `LOG_LEVEL`: Log level - debug, info, warn, error (default: info)
|
||||
- `DEFAULT_SUBTASKS`: Default number of subtasks when expanding (default: 3)
|
||||
- `DEFAULT_PRIORITY`: Default priority for generated tasks (default: medium)
|
||||
- `PROJECT_NAME`: Override default project name in tasks.json
|
||||
- `PROJECT_VERSION`: Override default version in tasks.json
|
||||
|
||||
## How It Works
|
||||
|
||||
1. **`tasks.json`** (see the example sketch below):
|
||||
- A JSON file at the project root containing an array of tasks (each with `id`, `title`, `description`, `status`, etc.).
|
||||
- The `meta` field can store additional info like the project's name, version, or reference to the PRD.
|
||||
- Tasks can have `subtasks` for more detailed implementation steps.
|
||||
|
||||
2. **Script Commands**
|
||||
You can run the script via:
|
||||
|
||||
```bash
|
||||
node scripts/dev.js [command] [options]
|
||||
```
|
||||
|
||||
Available commands:
|
||||
|
||||
- `parse-prd`: Generate tasks from a PRD document
|
||||
- `list`: Display all tasks with their status
|
||||
- `update`: Update tasks based on new information
|
||||
- `generate`: Create individual task files
|
||||
- `set-status`: Change a task's status
|
||||
- `expand`: Add subtasks to a task or all tasks
|
||||
|
||||
Run `node scripts/dev.js` without arguments to see detailed usage information.
|
||||
|
||||
## Listing Tasks
|
||||
|
||||
The `list` command allows you to view all tasks and their status:
|
||||
|
||||
```bash
|
||||
# List all tasks
|
||||
node scripts/dev.js list
|
||||
|
||||
# List tasks with a specific status
|
||||
node scripts/dev.js list --status=pending
|
||||
|
||||
# List tasks and include their subtasks
|
||||
node scripts/dev.js list --with-subtasks
|
||||
|
||||
# List tasks with a specific status and include their subtasks
|
||||
node scripts/dev.js list --status=pending --with-subtasks
|
||||
```
|
||||
|
||||
## Updating Tasks
|
||||
|
||||
The `update` command allows you to update tasks based on new information or implementation changes:
|
||||
|
||||
```bash
|
||||
# Update tasks starting from ID 4 with a new prompt
|
||||
node scripts/dev.js update --from=4 --prompt="Refactor tasks from ID 4 onward to use Express instead of Fastify"
|
||||
|
||||
# Update all tasks (default from=1)
|
||||
node scripts/dev.js update --prompt="Add authentication to all relevant tasks"
|
||||
|
||||
# Specify a different tasks file
|
||||
node scripts/dev.js update --file=custom-tasks.json --from=5 --prompt="Change database from MongoDB to PostgreSQL"
|
||||
```
|
||||
|
||||
Notes:
|
||||
- The `--prompt` parameter is required and should explain the changes or new context
|
||||
- Only tasks that aren't marked as 'done' will be updated
|
||||
- Tasks with ID >= the specified --from value will be updated
|
||||
|
||||
## Setting Task Status
|
||||
|
||||
The `set-status` command allows you to change a task's status:
|
||||
|
||||
```bash
|
||||
# Mark a task as done
|
||||
node scripts/dev.js set-status --id=3 --status=done
|
||||
|
||||
# Mark a task as pending
|
||||
node scripts/dev.js set-status --id=4 --status=pending
|
||||
|
||||
# Mark a specific subtask as done
|
||||
node scripts/dev.js set-status --id=3.1 --status=done
|
||||
|
||||
# Mark multiple tasks at once
|
||||
node scripts/dev.js set-status --id=1,2,3 --status=done
|
||||
```
|
||||
|
||||
Notes:
|
||||
- When marking a parent task as "done", all of its subtasks will automatically be marked as "done" as well
|
||||
- Common status values are 'done', 'pending', and 'deferred', but any string is accepted
|
||||
- You can specify multiple task IDs by separating them with commas
|
||||
- Subtask IDs are specified using the format `parentId.subtaskId` (e.g., `3.1`)
|
||||
|
||||
## Expanding Tasks
|
||||
|
||||
The `expand` command allows you to break down tasks into subtasks for more detailed implementation:
|
||||
|
||||
```bash
|
||||
# Expand a specific task with 3 subtasks (default)
|
||||
node scripts/dev.js expand --id=3
|
||||
|
||||
# Expand a specific task with 5 subtasks
|
||||
node scripts/dev.js expand --id=3 --num=5
|
||||
|
||||
# Expand a task with additional context
|
||||
node scripts/dev.js expand --id=3 --prompt="Focus on security aspects"
|
||||
|
||||
# Expand all pending tasks that don't have subtasks
|
||||
node scripts/dev.js expand --all
|
||||
|
||||
# Force regeneration of subtasks for all pending tasks
|
||||
node scripts/dev.js expand --all --force
|
||||
|
||||
# Use Perplexity AI for research-backed subtask generation
|
||||
node scripts/dev.js expand --id=3 --research
|
||||
|
||||
# Use Perplexity AI for research-backed generation on all pending tasks
|
||||
node scripts/dev.js expand --all --research
|
||||
```
|
||||
|
||||
Notes:
|
||||
- Tasks marked as 'done' or 'completed' are always skipped
|
||||
- By default, tasks that already have subtasks are skipped unless `--force` is used
|
||||
- Subtasks include title, description, dependencies, and acceptance criteria
|
||||
- The `--research` flag uses Perplexity AI to generate more informed and contextually relevant subtasks
|
||||
- If Perplexity API is unavailable, the script will fall back to using Anthropic's Claude
|
||||
|
||||
## AI Integration
|
||||
|
||||
The script integrates with two AI services:
|
||||
|
||||
1. **Anthropic Claude**: Used for parsing PRDs, generating tasks, and creating subtasks.
|
||||
2. **Perplexity AI**: Used for research-backed subtask generation when the `--research` flag is specified.
|
||||
|
||||
The Perplexity integration uses the OpenAI client to connect to Perplexity's API, which provides enhanced research capabilities for generating more informed subtasks. If the Perplexity API is unavailable or encounters an error, the script will automatically fall back to using Anthropic's Claude.
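A condensed sketch of that flow, assuming the `openai` and `@anthropic-ai/sdk` packages already listed in package.json (the function name and prompt text are illustrative, not the exact dev.js code):

```js
import OpenAI from 'openai';
import Anthropic from '@anthropic-ai/sdk';

// Perplexity exposes an OpenAI-compatible API, so the OpenAI client is pointed at it.
const perplexity = new OpenAI({
  apiKey: process.env.PERPLEXITY_API_KEY,
  baseURL: 'https://api.perplexity.ai',
});
const anthropic = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY });

async function generateResearchBackedSubtasks(researchPrompt) {
  try {
    const res = await perplexity.chat.completions.create({
      model: process.env.PERPLEXITY_MODEL || 'sonar-medium-online',
      messages: [
        { role: 'system', content: 'You are a research assistant breaking a development task into subtasks.' },
        { role: 'user', content: researchPrompt },
      ],
    });
    return res.choices[0].message.content;
  } catch (err) {
    // Perplexity unavailable or errored: fall back to Claude.
    const res = await anthropic.messages.create({
      model: process.env.MODEL || 'claude-3-7-sonnet-20250219',
      max_tokens: Number(process.env.MAX_TOKENS || 4000),
      messages: [{ role: 'user', content: researchPrompt }],
    });
    return res.content[0].text;
  }
}
```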
|
||||
|
||||
To use the Perplexity integration:
|
||||
1. Obtain a Perplexity API key
|
||||
2. Add `PERPLEXITY_API_KEY` to your `.env` file
|
||||
3. Optionally specify `PERPLEXITY_MODEL` in your `.env` file (default: "sonar-medium-online")
|
||||
4. Use the `--research` flag with the `expand` command
|
||||
|
||||
## Logging
|
||||
|
||||
The script supports different logging levels controlled by the `LOG_LEVEL` environment variable:
|
||||
- `debug`: Detailed information, typically useful for troubleshooting
|
||||
- `info`: Confirmation that things are working as expected (default)
|
||||
- `warn`: Warning messages that don't prevent execution
|
||||
- `error`: Error messages that might prevent execution
|
||||
|
||||
When `DEBUG=true` is set, debug logs are also written to a `dev-debug.log` file in the project root.
|
||||
package-lock.json | 16 lines (generated)
@@ -1,19 +1,20 @@
|
||||
{
|
||||
"name": "claude-task-master",
|
||||
"version": "1.3.1",
|
||||
"version": "1.4.6",
|
||||
"lockfileVersion": 3,
|
||||
"requires": true,
|
||||
"packages": {
|
||||
"": {
|
||||
"name": "claude-task-master",
|
||||
"version": "1.3.1",
|
||||
"version": "1.4.6",
|
||||
"license": "MIT",
|
||||
"dependencies": {
|
||||
"@anthropic-ai/sdk": "^0.39.0",
|
||||
"chalk": "^4.1.2",
|
||||
"commander": "^11.1.0",
|
||||
"dotenv": "^16.3.1",
|
||||
"openai": "^4.86.1"
|
||||
"openai": "^4.86.1",
|
||||
"punycode": "^2.3.1"
|
||||
},
|
||||
"bin": {
|
||||
"claude-task-init": "scripts/init.js"
|
||||
@@ -521,6 +522,15 @@
|
||||
}
|
||||
}
|
||||
},
|
||||
"node_modules/punycode": {
|
||||
"version": "2.3.1",
|
||||
"resolved": "https://registry.npmjs.org/punycode/-/punycode-2.3.1.tgz",
|
||||
"integrity": "sha512-vYt7UD1U9Wg6138shLtLOvdAu+8DsC/ilFtEVHcH+wydcSpNE20AfSOduf6MkRFahL5FY7X1oU7nKVZFtfq8Fg==",
|
||||
"license": "MIT",
|
||||
"engines": {
|
||||
"node": ">=6"
|
||||
}
|
||||
},
|
||||
"node_modules/supports-color": {
|
||||
"version": "7.2.0",
|
||||
"resolved": "https://registry.npmjs.org/supports-color/-/supports-color-7.2.0.tgz",
|
||||
|
||||
@@ -1,6 +1,6 @@
|
||||
{
|
||||
"name": "claude-task-master",
|
||||
"version": "1.4.2",
|
||||
"version": "1.4.6",
|
||||
"description": "A task management system for AI-driven development with Claude",
|
||||
"main": "index.js",
|
||||
"type": "module",
|
||||
@@ -29,7 +29,8 @@
|
||||
"chalk": "^4.1.2",
|
||||
"commander": "^11.1.0",
|
||||
"dotenv": "^16.3.1",
|
||||
"openai": "^4.86.1"
|
||||
"openai": "^4.86.1",
|
||||
"punycode": "^2.3.1"
|
||||
},
|
||||
"engines": {
|
||||
"node": ">=14.0.0"
|
||||
|
||||
@@ -179,3 +179,79 @@ The script supports different logging levels controlled by the `LOG_LEVEL` envir
|
||||
- `error`: Error messages that might prevent execution
|
||||
|
||||
When `DEBUG=true` is set, debug logs are also written to a `dev-debug.log` file in the project root.
|
||||
|
||||
## Analyzing Task Complexity
|
||||
|
||||
The `analyze-complexity` command allows you to automatically assess task complexity and generate expansion recommendations:
|
||||
|
||||
```bash
|
||||
# Analyze all tasks and generate expansion recommendations
|
||||
node scripts/dev.js analyze-complexity
|
||||
|
||||
# Specify a custom output file
|
||||
node scripts/dev.js analyze-complexity --output=custom-report.json
|
||||
|
||||
# Override the model used for analysis
|
||||
node scripts/dev.js analyze-complexity --model=claude-3-opus-20240229
|
||||
|
||||
# Set a custom complexity threshold (1-10)
|
||||
node scripts/dev.js analyze-complexity --threshold=6
|
||||
|
||||
# Use Perplexity AI for research-backed complexity analysis
|
||||
node scripts/dev.js analyze-complexity --research
|
||||
```
|
||||
|
||||
Notes:
|
||||
- The command uses Claude to analyze each task's complexity (or Perplexity with --research flag)
|
||||
- Tasks are scored on a scale of 1-10
|
||||
- Each task receives a recommended number of subtasks based on DEFAULT_SUBTASKS configuration
|
||||
- The default output path is `scripts/task-complexity-report.json`
|
||||
- Each task in the analysis includes a ready-to-use `expansionCommand` that can be copied directly to the terminal or executed programmatically
|
||||
- Tasks with complexity scores below the threshold (default: 5) may not need expansion
|
||||
- The research flag provides more contextual and informed complexity assessments
|
||||
|
||||
### Integration with Expand Command
|
||||
|
||||
The `expand` command automatically checks for and uses complexity analysis if available:
|
||||
|
||||
```bash
|
||||
# Expand a task, using complexity report recommendations if available
|
||||
node scripts/dev.js expand --id=8
|
||||
|
||||
# Expand all tasks, prioritizing by complexity score if a report exists
|
||||
node scripts/dev.js expand --all
|
||||
|
||||
# Override recommendations with explicit values
|
||||
node scripts/dev.js expand --id=8 --num=5 --prompt="Custom prompt"
|
||||
```
|
||||
|
||||
When a complexity report exists:
|
||||
- The `expand` command will use the recommended subtask count from the report (unless overridden)
|
||||
- It will use the tailored expansion prompt from the report (unless a custom prompt is provided)
|
||||
- When using `--all`, tasks are sorted by complexity score (highest first)
|
||||
- The `--research` flag is preserved from the complexity analysis to expansion
|
||||
|
||||
The output report structure is:
|
||||
```json
|
||||
{
|
||||
"meta": {
|
||||
"generatedAt": "2023-06-15T12:34:56.789Z",
|
||||
"tasksAnalyzed": 20,
|
||||
"thresholdScore": 5,
|
||||
"projectName": "Your Project Name",
|
||||
"usedResearch": true
|
||||
},
|
||||
"complexityAnalysis": [
|
||||
{
|
||||
"taskId": 8,
|
||||
"taskTitle": "Develop Implementation Drift Handling",
|
||||
"complexityScore": 9.5,
|
||||
"recommendedSubtasks": 6,
|
||||
"expansionPrompt": "Create subtasks that handle detecting...",
|
||||
"reasoning": "This task requires sophisticated logic...",
|
||||
"expansionCommand": "node scripts/dev.js expand --id=8 --num=6 --prompt=\"Create subtasks...\" --research"
|
||||
},
|
||||
// More tasks sorted by complexity score (highest first)
|
||||
]
|
||||
}
|
||||
```
|
||||
scripts/dev.js | 1048 lines (diff suppressed because it is too large)
@@ -13,7 +13,8 @@
|
||||
- User personas
|
||||
- Key user flows
|
||||
- UI/UX considerations]
|
||||
|
||||
</context>
|
||||
<PRD>
|
||||
# Technical Architecture
|
||||
[Outline the technical implementation details:
|
||||
- System components
|
||||
@@ -25,23 +26,22 @@
|
||||
[Break down the development process into phases:
|
||||
- MVP requirements
|
||||
- Future enhancements
|
||||
- Timeline estimates]
|
||||
- Do not think about timelines whatsoever -- all that matters is scope and detailing exactly what needs to be built in each phase so it can later be cut up into tasks]
|
||||
|
||||
# Success Metrics
|
||||
[Define how success will be measured:
|
||||
- Key performance indicators
|
||||
- User adoption metrics
|
||||
- Business goals]
|
||||
# Logical Dependency Chain
|
||||
[Define the logical order of development:
|
||||
- Which features need to be built first (foundation)
|
||||
- Getting as quickly as possible to a usable/visible front end that works
- Properly pacing and scoping each feature so it is atomic but can also be built upon and improved as development progresses]
|
||||
|
||||
# Risks and Mitigations
|
||||
[Identify potential risks and how they'll be addressed:
|
||||
- Technical challenges
|
||||
- Market risks
|
||||
- Figuring out the MVP that we can build upon
|
||||
- Resource constraints]
|
||||
|
||||
# Appendix
|
||||
[Include any additional information:
|
||||
- Research findings
|
||||
- Competitive analysis
|
||||
- Technical specifications]
|
||||
</context>
|
||||
</PRD>
|
||||
@@ -81,7 +81,7 @@ function copyTemplateFile(templateName, targetPath, replacements = {}) {
|
||||
sourcePath = path.join(__dirname, 'dev.js');
|
||||
break;
|
||||
case 'scripts_README.md':
|
||||
sourcePath = path.join(__dirname, 'README.md');
|
||||
sourcePath = path.join(__dirname, '..', 'assets', 'scripts_README.md');
|
||||
break;
|
||||
case 'dev_workflow.mdc':
|
||||
sourcePath = path.join(__dirname, '..', '.cursor', 'rules', 'dev_workflow.mdc');
|
||||
@@ -269,13 +269,24 @@ function createProjectStructure(projectName, projectDescription, projectVersion,
|
||||
log('warn', 'Git not available, skipping repository initialization');
|
||||
}
|
||||
|
||||
// Run npm install automatically
|
||||
log('info', 'Installing dependencies...');
|
||||
try {
|
||||
execSync('npm install', { stdio: 'inherit', cwd: targetDir });
|
||||
log('info', `${COLORS.green}Dependencies installed successfully!${COLORS.reset}`);
|
||||
} catch (error) {
|
||||
log('error', 'Failed to install dependencies:', error.message);
|
||||
log('error', 'Please run npm install manually');
|
||||
}
|
||||
|
||||
log('info', `${COLORS.green}${COLORS.bright}Project initialized successfully!${COLORS.reset}`);
|
||||
log('info', '');
|
||||
log('info', 'Next steps:');
|
||||
log('info', '1. Run `npm install` to install dependencies');
|
||||
log('info', '2. Create a .env file with your ANTHROPIC_API_KEY (see .env.example)');
|
||||
log('info', '3. Add your PRD to the project');
|
||||
log('info', '4. Run `npm run parse-prd -- --input=<your-prd-file.txt>` to generate tasks');
|
||||
log('info', '1. Create a .env file with your ANTHROPIC_API_KEY (see .env.example)');
|
||||
log('info', '2. Add your PRD.txt to the /scripts directory');
|
||||
log('info', '3. Ask Cursor Agent to parse your PRD.txt and generate tasks');
|
||||
log('info', '└── You can also manually run `npm run parse-prd -- --input=<your-prd-file.txt>` to generate tasks');
|
||||
log('info', '4. Review the README.md file to learn how to use other commands via Cursor Agent.');
|
||||
log('info', '');
|
||||
}
|
||||
|
||||
|
||||
scripts/prd.txt | 838 lines
@@ -1,350 +1,528 @@
|
||||
<context>
|
||||
# Overview
|
||||
The MCP SaaS is a **hosted Model Context Protocol (MCP) server platform** that lets users spin up and customize MCP servers on demand. MCP is an open standard that provides a “USB-C port for AI” – a unified way to connect AI assistants to various data sources and tools ([Introduction - Model Context Protocol](https://modelcontextprotocol.io/introduction#:~:text=MCP%20is%20an%20open%20protocol,different%20data%20sources%20and%20tools)). Instead of running connectors locally or building custom integrations for each data source, developers can use MCP to expose data through standard servers and let AI applications (MCP clients) connect to them ([Introducing the Model Context Protocol \ Anthropic](https://www.anthropic.com/news/model-context-protocol#:~:text=The%20Model%20Context%20Protocol%20is,that%20connect%20to%20these%20servers)). Our service extends this concept by hosting these MCP servers in the cloud and offering a web interface for configuration. The value proposition is that **developers and teams can easily integrate their own data and tools with AI models (like Claude or IDE-based agents) without managing infrastructure** or writing boilerplate code.
|
||||
|
||||
**Key Differentiators:** This platform distinguishes itself from the existing open-source MCP servers by focusing on ease of use, hosting, and expanded capabilities:
|
||||
|
||||
- **No Self-Hosting Required:** Open-source MCP implementations typically run on a user’s local machine or server ([Introducing the Model Context Protocol \ Anthropic](https://www.anthropic.com/news/model-context-protocol#:~:text=Claude,to%20the%20Claude%20Desktop%20app)). Our service eliminates the need to set up or maintain servers – deployment is handled automatically on a global cloud platform. This means even non-dev users or those in restricted IT environments can use MCP tools remotely, **enabling cloud access** to what were previously local-only connectors.
|
||||
- **Easy Configuration & Customization:** Instead of cloning repos and running command-line tools, users get a **friendly dashboard** to select from a library of MCP tools and configure them with a few clicks. This lowers the barrier to entry and speeds up integration.
|
||||
- **Multiple Tools in One Server:** With open-source MCP, each server typically provides one capability or data source ([Introduction - Model Context Protocol](https://modelcontextprotocol.io/introduction#:~:text=At%20its%20core%2C%20MCP%20follows,can%20connect%20to%20multiple%20servers)). Our hosted solution will allow users to **combine multiple tools on a single MCP server instance** (if desired) by simply toggling them on/off. This creates a composite toolset accessible via one endpoint, which is unique compared to the one-tool-per-server model.
|
||||
- **Premium Tool Library:** While Anthropic has open-sourced many connectors (Google Drive, Slack, GitHub, Git, Postgres, Puppeteer, etc. ([Introducing the Model Context Protocol \ Anthropic](https://www.anthropic.com/news/model-context-protocol#:~:text=Claude%203,GitHub%2C%20Git%2C%20Postgres%2C%20and%20Puppeteer))), our service curates and extends this library with **premium tools** not readily available elsewhere. These could include connectors to enterprise apps, enhanced versions of open tools with additional features, or brand-new integrations developed in-house. Subscribers get immediate access to these tools without hunting through GitHub repos.
|
||||
- **Built-in Analytics and Management:** The platform provides **usage metrics, logs, and monitoring** out-of-the-box – capabilities that are not present in basic open-source MCP servers. Users can track how often their tools are called and monitor performance via the dashboard, helping them manage usage and debug issues.
|
||||
- **Integration Ready Outputs:** Instead of figuring out how to run an MCP server and connect it, users receive ready-to-use endpoints (an `npx` command, JSON configuration snippet, or direct SSE URL) that plug into AI clients with minimal effort. This streamlines the process of hooking up the MCP server to Claude, Cursor IDE, or other MCP-compatible AI tools.
|
||||
|
||||
By addressing the above, the service makes **MCP accessible as a hassle-free cloud service**. This drives our core value: **“Your custom AI tools, one click away”** – enabling rapid setup and integration of context-providing tools for AI assistants, while the platform handles the heavy lifting of hosting, scaling, and maintenance.
|
||||
|
||||
([What is Model Context Protocol?](https://portkey.ai/blog/model-context-protocol-for-llm-appls)) *Figure: General MCP architecture. An “MCP Host” (e.g. Claude, Cursor IDE, or another AI tool) can connect via the MCP protocol to one or more **MCP servers**, each interfacing with a specific resource or service. In a typical setup, MCP servers might run on *your local machine* to expose local files, databases, or APIs ([Introduction - Model Context Protocol](https://modelcontextprotocol.io/introduction#:~:text=At%20its%20core%2C%20MCP%20follows,can%20connect%20to%20multiple%20servers)). Our product moves this into the cloud – hosting those MCP servers for you – so the AI assistant can reach your tools from anywhere via a secure internet endpoint.*
|
||||
|
||||
# Core Features
|
||||
|
||||
**User Authentication (GitHub OAuth):** The platform will use GitHub OAuth for sign-in and sign-up. Users can log in with their GitHub credentials, streamlining onboarding for developers. OAuth ensures we don’t handle raw passwords and can easily fetch basic profile info (username, email) to create the user account. Upon first login, a new user profile is created in our system linked to their GitHub ID. This also sets the stage for future integrations (e.g. pulling GitHub repos as data sources, or verifying student/hobby status via GitHub). The PRD priority is to implement GitHub OAuth, but the system will be designed to allow adding other OAuth providers later (e.g. Google, Microsoft) for broader enterprise appeal.
|
||||
|
||||
**Dashboard for Selecting and Configuring Tools:** A core part of the user experience is a **web dashboard** where users can create and manage their hosted MCP servers. Key elements of the dashboard include:
|
||||
|
||||
- **MCP Server List:** A home screen showing all MCP instances the user has created, with status (running/stopped), name, and key details (number of tools, last active time, etc.). From here, users can click “Create New Server.”
|
||||
- **Tool Library Browser:** When creating or editing an MCP server, users are presented with a **catalog of available tools** (the library of MCP connectors). Each tool listing includes a name, description, and possibly an icon or category. Users can search or filter (e.g. by “file system”, “database”, “API integration”, etc.). For MVP, this library is curated (initially we’ll include popular connectors like file access, GitHub, Slack, databases, web browser automation, etc.).
|
||||
- **Add/Remove Tools UI:** Users can add a tool to their MCP instance by selecting it from the library. Upon adding, the tool might require configuration – for example, the Slack tool might need an API token, or the Google Drive tool might need OAuth credentials. The dashboard will provide a form for each tool’s required settings (with field validations and help text). Users can also remove tools from the config with a click (which will update the server on redeploy).
|
||||
- **Configuration Management:** In addition to tool-specific settings, the server itself may have configurations: a name, a description, or global environment variables that multiple tools might use. The dashboard allows editing these. For example, if multiple tools need an API key for the same service, the user could set it once globally.
|
||||
- **One-Click Deployment:** A prominent **“Deploy” or “Save & Deploy” button** will provision or update the hosted MCP server with the selected tools. This triggers the backend automation (see Technical Architecture) to either create a new Cloudflare Worker or update an existing one with the new tool set. Feedback (like a loading spinner and status messages) is provided during deployment. In MVP, deployments should complete within a few seconds. After deployment, the dashboard will show the server’s **connection info (endpoints and commands)** for the user to integrate.
|
||||
|
||||
**Add/Remove Tools from a Hosted MCP Instance:** Users are not locked into their initial choices; they can modify their MCP server’s capabilities post-creation. This feature means:
|
||||
|
||||
- **Dynamic Tool Management:** From the dashboard, selecting an existing MCP instance allows users to see which tools are currently enabled. They can add new ones or remove some and then re-deploy. The backend will handle updating the running instance (which may involve restarting it with a new config). This dynamic configurability encourages experimentation – users can start with a minimal toolset and grow it over time.
- **Hot Reload vs. Restart:** (For MVP, a full restart on config change is acceptable.) In future iterations, we might support hot-swapping tools without downtime. For now, after a user updates the tool list and redeploys, the platform will restart that MCP server instance with the new configuration. Any connected AI clients may need to reconnect (we will document this).
- **Versioning (Future Consideration):** The system will keep track of tool versions or last-updated time. In MVP, we assume using latest stable versions of each tool library. Later, we might let users pin a specific version of a tool connector or roll back changes if a new tool config causes issues.
**Automated Deployment on Cloudflare Workers AI:** The hosting backbone of the product is **Cloudflare Workers** (with Workers AI capabilities if needed). Each MCP server instance is essentially a serverless function (or set of functions) running on Cloudflare’s global network, close to users. Key requirements and behaviors:
- **Deployment Automation:** When the user hits deploy, our backend uses Cloudflare’s API to either upload a Cloudflare Worker script or create a new instance of a pre-built worker with configuration. The Worker contains the logic for the selected MCP tools. Cloudflare’s environment runs the code in a serverless manner – we benefit from automatic scaling, low-latency global access, and not having to manage VM or container infrastructure.
- **Isolated Instances:** Each MCP server runs in isolation (sandboxed by Cloudflare’s architecture per script/instance). This ensures that one user’s server (and data/API keys) isn’t accessible to another. We leverage Workers **Namespaces or environment bindings** to pass each instance its config securely. For example, if a user’s MCP server includes a database password or API token, that will be stored as an encrypted secret and bound to their Worker instance only.
- **Workers AI Compatibility:** While our primary use-case is running connector logic (which might just be network calls or file I/O), using Cloudflare **Workers AI** means we have the option to also execute ML models at the edge if needed. This isn’t a core MVP feature, but it’s a forward-looking choice – e.g., if a tool involves vector embeddings or running a small model, it could leverage Workers AI’s GPU support. For now, **the focus is on connectors**; we simply ensure the platform can deploy to the Workers runtime environment successfully. (Cloudflare Workers provides the needed compute and networking for MCP servers just like running them locally, but in a serverless way.)
- **Scaling and Performance:** Because Workers scale automatically and are **pay-as-you-go ([Workers AI: serverless GPU-powered inference on Cloudflare’s global network](https://blog.cloudflare.com/workers-ai/#:~:text=That%27s%20why%20we%20are%20excited,and%20it%27s%20built%20from%20the))**, each MCP server can handle multiple concurrent requests or SSE streams without manual intervention. The service should impose sensible limits (through pricing tiers) but not require the user to worry about load – if their usage grows, Cloudflare will seamlessly handle more requests up to our set quotas. This is a major advantage over self-hosting, where the user would need to deploy to a server or cloud instance themselves.
**Output Formats for Integration:** Once a user’s MCP server is deployed, the platform provides **multiple ways to integrate it with AI tools**. Different users have different workflows, so we support:
- **NPX Command:** An `npx` command is provided for users who want a quick CLI invocation. For example, after deployment the dashboard might show a command like `npx mcp-client@latest --server-url https://<user>.ourservice.dev --api-key $KEY`. Running this command locally would launch an MCP client that connects to the hosted server (perhaps using the official MCP client SDK under the hood). This is useful for tools or environments that can run local commands (for instance, if Claude’s desktop app or another IDE expects a local process, the npx script can act as a local proxy to the remote server). It also serves as a quick test: users can run the npx command in a terminal to verify their server responds as expected.
- **JSON Configuration:** For developer tools like Cursor IDE or other IDEs that support MCP, we provide a JSON config snippet that the user can drop into their settings. This JSON includes details such as the server name, transport type (`sse` or `stdio`), and the endpoint URL. For example:
```json
{
  "name": "My MCP Server",
  "transport": "sse",
  "url": "https://<user>.ourservice.dev/sse",
  "api_key": "<KEY>"
}
```
A format like this can be placed in Cursor’s `.cursor/mcp.json` or in an application’s config file to inform the client about the custom tool. We will document how to use this JSON for various AI clients. Providing this ready-made configuration saves users from manually typing details, reducing integration friction.
- **SSE Endpoint URL:** For direct integration (especially with Claude or any system that allows a URL), we give the **Secure SSE URL** of the hosted MCP server. For example: `https://<instance-id>.ourservice.dev/sse?key=<secret>`. This endpoint implements the MCP protocol over Server-Sent Events – the standard way to connect remote MCP servers ([Cursor – Model Context Protocol](https://docs.cursor.com/context/model-context-protocol#:~:text=Cursor%20implements%20an%20MCP%20client%2C,transports)) ([Cursor – Model Context Protocol](https://docs.cursor.com/context/model-context-protocol#:~:text=assuming%20it%20is%20running%20locally,8765)). A user can input this URL into an interface like “Add MCP Server” in Cursor (choosing SSE transport) or in future Claude interfaces that accept remote URLs. The SSE endpoint streams events and data between the AI assistant and the server in real-time. In MVP, we’ll support SSE (which covers most remote use-cases); STDIO is mainly for local, so we won’t need to support a remote stdio beyond the npx local proxy.
In all cases, **authentication and security** are considered. The npx command and SSE URL include a secret API key (or use an auth header) so that only the rightful user (and their AI client) can access the MCP server. The JSON config will mention how to include the API key as well. This multi-format output ensures that whether the user is technical or not, and whichever AI tool they use, they have a straightforward way to plug in their new MCP server.
**User Analytics & Monitoring:** To help users understand and control their usage, the platform will include analytics features:
- **Dashboard Metrics:** For each MCP server instance, the dashboard will display key metrics such as number of API calls made (this could be overall requests or broken down by tool), data transferred, and active time. For example, a user might see “**Calls this month: 850**” and “**Active connections: 0 (idle)**” for an instance. This information updates periodically (possibly with a refresh button or live updates via WebSocket).
- **Usage Graphs:** A simple chart (e.g. line graph) could show usage over time – e.g. daily request count in the last 30 days – especially for paid tiers where usage limits apply. MVP can use a basic library to plot calls per day. If real-time plotting is too much, at least a summary count and last active timestamp will be provided.
- **Logs (MVP Limited):** While a fully featured log viewer might be a later addition, the MVP will capture basic event logs for each server (e.g. “Tool X called with query Y,” “Error: failed to fetch from API Z”). In the dashboard, users can view recent log entries or errors to debug their tools’ behavior. We might limit the log history to last N events or last 24 hours for performance.
- **Alerts and Notifications:** (Future) The system can send an email or dashboard alert if the user approaches their usage limits or if an instance encounters repeated errors. This is not required at launch, but designing the analytics with hooks for alerts in mind will help scalability.
All analytics are accessible through the secure dashboard. Internally, we’ll collect this data via our API gateway or within the Worker (for lightweight metrics) and store it in a database or analytics service. The goal is to provide transparency so users can **monitor their MCP servers’ health and usage**, making the product feel reliable and professional for production use.
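
As a rough illustration of the lightweight metrics collection described above, each handled request could produce one small usage record that the dashboard later aggregates; the record shape and helper below are assumptions, not a committed design.

```typescript
// Hypothetical per-request usage record emitted by the Worker; aggregating these
// per user/instance yields the dashboard metrics, graphs, and basic log view.
interface UsageEvent {
  instanceId: string;   // which MCP server handled the call
  tool: string;         // e.g. "slack"
  timestamp: number;    // epoch milliseconds
  ok: boolean;          // success vs. error
  durationMs: number;
}

// Events could be buffered and flushed in batches so bursts of tool calls
// do not become one database write per request.
function summarize(events: UsageEvent[]): { calls: number; errors: number } {
  return {
    calls: events.length,
    errors: events.filter((e) => !e.ok).length,
  };
}
```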
# Pricing Strategy
Our pricing model will be **freemium with tiered subscriptions**, designed to accommodate individual tinkerers up to enterprise teams. We will have **Free, Paid, and High-Tier (Enterprise)** plans, with multiple dimensions for scaling revenue as users grow. Key pricing dimensions include: the number of tools enabled, usage volume (API calls), number of active instances, and advanced enterprise needs.
**Free Tier (Developer Hobby Plan):** This tier lowers the barrier to entry and encourages viral adoption, while imposing limits that naturally lead serious users to upgrade. Features and limits of the Free plan:
- **Limited Tools per Server:** A free user can enable a small number of tools on any single MCP server (e.g. *up to 2 tools* per instance). This allows trying out a couple of integrations (for example, connecting to a local filesystem and one API), but for richer servers with many capabilities, an upgrade is required.
- **1 Active MCP Instance:** The free tier might allow only one active server at a time (or possibly 1 concurrent and up to 2 total configured, to let users experiment). This ensures heavy users who need multiple separate agents (e.g. different projects or contexts) will consider paying.
- **Usage Cap:** We will include a generous but limited number of API calls or events per month (for instance, *5,000 calls per month* free). This is enough for small projects and testing, but if the user starts relying on the service in earnest (e.g. daily use with an AI coding assistant), they’ll likely hit the limit. We can also rate-limit the free usage (like X calls/minute) to prevent abuse.
- **Community Support:** Free users have access to documentation and community forums for support. Direct support response or SLA is not guaranteed at this tier.
- **No Cost (Freemium):** As the name implies, this tier is $0. It’s aimed at students, hobbyists, or professionals prototyping an idea. By offering meaningful functionality for free, we hope to drive adoption and word-of-mouth (e.g. developers sharing the tool with colleagues, or writing blog posts about using it).
**Paid Tier (Pro / Team Plan):** The paid tier will likely have a fixed monthly subscription (for example, **$X per month** for Pro) which unlocks higher limits and possibly additional features. We also consider usage-based billing for overages. Key attributes:
- **More Tools per Server:** A higher allowance on how many tools can be combined in one MCP instance (e.g. *up to 5 or 10 tools* on Pro). This encourages users to build powerful composite connectors on one endpoint, which is valuable for complex use cases.
- **Multiple Instances:** Pro users can run more simultaneous MCP servers – for instance, *up to 3 active instances*. This is useful if a small team has different projects (one MCP server for codebase access, another for a database, etc.) or if one user wants to separate concerns.
- **Increased API Call Quota:** The monthly call limit is higher (e.g. *100,000 calls per month included*). If a user exceeds the included calls, we may charge an overage fee per 1,000 calls (or suggest upgrading to a higher plan). We will also lift any strict rate limits, allowing bursty usage as long as it stays within monthly allotment.
- **Premium Tools Access:** Certain “premium” connectors (especially those that might incur cost or require special maintenance) could be reserved for paid plans. For example, an integration to a proprietary enterprise software or a high-compute tool might only be available to paying users. Pro users get access to the full library of standard tools and these premium ones. (We must be transparent about which those are in the library UI with a lock icon or similar.)
- **Analytics & Support:** Paid users get more advanced analytics (longer log retention, more detailed usage breakdown) and priority support (email support with 24-48h response). While the MVP might treat all users the same initially, the plan is to eventually offer better support to subscribers. Possibly, Pro users could also share access with a small team (e.g. invite 2-3 collaborators to view the dashboard or manage servers – though team features might be an enterprise feature later).
- **Revenue Expansion:** The Pro plan not only has a subscription fee but also provides avenues for expansion revenue. If a team uses significantly more than the included quota, we’ll have clear overage pricing. Additionally, if they need more instances or tools beyond the plan limits, we might allow purchasing add-ons (e.g. “Extra 2 instances for $Y” or “additional 50k calls”). Initially, however, we’ll keep it simple with just the base Pro plan and encourage upgrade to Enterprise for big needs.
**High-Tier / Enterprise Plan:** For organizations with large scale or specific requirements, we will offer an Enterprise tier (likely custom-priced, sales-driven). This caters to companies that might use the service in production for many users or across departments. Features might include:
- **Unlimited or Negotiated Limits:** Enterprise customers could have *custom limits* (e.g. they might need 10+ MCP instances, or >1 million calls per month). We would negotiate a contract that fits their usage, possibly with volume discounts. Essentially, this tier removes the friction of quotas – the service can scale to the organization’s needs with pricing scaled accordingly.
- **Enterprise-Only Features:** This could include single sign-on (SSO) integration for their team, dedicated account management, and the ability to create sub-accounts or team roles (e.g. admin, developer, viewer roles in the dashboard). We might also allow **on-premise or virtual private deployment** for enterprises that have compliance restrictions – e.g. deploying our MCP hosting stack into their cloud, or offering a region-specific deployment if they need data residency. (These are future possibilities once the core product is stable.)
- **Security and SLA:** Enterprise plan would come with a **custom SLA (uptime guarantee)** and priority support (perhaps 24/7 support or a dedicated support engineer contact). Security features like audit logs, encryption options, and compliance (HIPAA, GDPR assurances, etc.) would be packaged here to satisfy enterprise IT requirements.
- **Pricing Model:** Likely a yearly contract or monthly minimum, plus usage. For example, an enterprise might pay a base fee (to cover up to X usage) and then tiered pricing for overages or additional instances beyond that. Because multiple dimensions are in play (tools, instances, calls), we’ll remain flexible. One approach is to have an **enterprise platform fee** that unlocks everything (unlimited tools, many instances), and then purely usage-based billing for API calls beyond a certain threshold. This way, enterprise customers essentially pay for what they use, but with guarantees of service and support.
- **Potential Add-On Services:** We could upsell services like custom tool development (our team building an MCP connector that the enterprise specifically needs), training sessions, or integration assistance as part of a professional services package.
**Revenue Expansion Considerations:** We have multiple axes to grow revenue per user: if a user needs more tools per server, more servers, or more throughput, they can either move to the next tier or pay for add-ons. Over time, we might introduce à la carte add-ons even for Pro (like “enable 2 more tools on your instance for $5/mo” if that proves viable). However, the initial strategy is to keep plans straightforward to avoid analysis paralysis. The free tier drives adoption, the Pro tier converts power users and small teams with a predictable monthly price, and the Enterprise tier captures large customers with a scalable, custom approach. This tiered model, combined with usage-based components, should support **logarithmic MRR growth** – as customers succeed and use the service more, their spend increases in a way that feels natural and value-aligned.
# User Flow
This section describes the end-to-end **user journey**, from first visiting the site to running an MCP server and using it. Ensuring a frictionless, intuitive flow is key to onboarding and retention.
**1. Sign-Up and Onboarding:** A new user arrives at our landing page, which highlights the value (“Host your own AI tool server in minutes,” etc.). They can click **“Get Started – It’s Free”**, which brings them to the GitHub OAuth login. After OAuth, the user lands in the application. If it’s their first time, we present a brief onboarding sequence: for example, a welcome message and a quick tour of the dashboard. We might highlight where to create a server and how to access docs. *MVP detail:* The tour can be as simple as tooltips or a one-time modal with instructions (“1. Create a server, 2. Add tools, 3. Connect to Claude/Cursor…”). The user’s account is now created (Free tier by default), and they may have 0 servers initially. We could also automatically create a **“Default” MCP server** with no tools (in stopped state) to prompt them to configure it, but this is optional. The key is that after sign-up, the user knows how to proceed to create their first MCP instance.
**2. Creating a New MCP Server:** The user clicks **“Create New MCP Server”** (a prominent button on the dashboard). A dialog or dedicated page appears to set up the new server. They will input:
- A **Name** for the server (e.g. “My Project Tools” or “Salesforce Connector”). This helps identify it later, and could be used in integration settings as a display name.
- (Optional) a short **Description** or notes, especially if they have multiple servers (not critical for MVP, but nice for clarity).
- Possibly, choose a **base template** (if we offer any presets). MVP might not have templates, but in future, we might show template options like “💻 Codebase Assistant – includes Git + Filesystem tools” or “📊 Data Assistant – includes Google Drive + Spreadsheet,” to jump-start configuration. In MVP, likely the user will build from scratch, so this step is minimal.
- Click **“Create”**, which leads them into the configuration interface for the new server.
**3. Selecting Tools and Integrations:** Now on the MCP server config page (or modal), the user sees the library of tools (as described in Core Features – Dashboard). They can browse or search for a tool to add. For each tool:
- The user clicks “Add” (or a toggle switch) to include that tool. Immediately, that tool might expand a configuration pane. For example, if they add **Google Drive**, we display fields to enter Google API credentials or initiate an OAuth flow. Or if they add **Filesystem**, maybe a path or permission setting is needed (though for a cloud-hosted service, filesystem access might be simulated or limited to a cloud storage bucket – this detail will be covered in technical design).
- The user provides required config for each selected tool. We will validate inputs (e.g. check token formats, required fields not empty). If any tool has optional settings, those can be left default.
- They can repeat this for multiple tools. We’ll ensure that adding a tool does not conflict with others. (In general, MCP servers can have multiple tools and the protocol distinguishes them by name when the AI calls them, so it should be fine). If there are any incompatible combinations, the UI should prevent or warn, though none are expected in MVP.
- The UI may assign a default **port or endpoint name** for each tool behind the scenes. (For instance, MCP might identify tools by name; if needed, we ensure unique names or aliases, but likely the tools come with predefined identifiers that the client will see.) We abstract this detail from the user; they just care that those capabilities are present.
**4. Deploying the MCP Server & Receiving Credentials:** Once satisfied with the selected tools and their configuration, the user hits **“Deploy”**. The UI then transitions to a deployment status: showing a spinner or progress bar with messages like “Deploying your server…”, “Provisioning on Cloudflare…”, etc. During this time, the backend is packaging the code and deploying to Cloudflare Workers (as detailed in Technical Architecture). On success (which ideally takes just a few seconds), the status updates to **“Your MCP server is live!”** and presents the integration info. The user will see:
- **Server URL / SSE Endpoint:** e.g. `https://john123.mcp.example.com/sse` (the exact format TBD, but likely a subdomain or unique path). Alongside this, a copyable **API Key** or token (if we use query param or require an `Authorization` header). We will show a copy button for convenience. We might display the full curl example like `curl <URL>?key=XYZ` for advanced users to test connectivity.
- **NPX Command:** e.g. ``npx @ourservice/client -s john123.mcp.example.com -k YOUR_KEY`` or a similar one-liner. This is shown in a code block with another copy button. Possibly we also allow the user to pick a specific client command – for instance, if we have an official NPM package, that’s one, but if their environment is Python, we might show a `pip install ourservice && python -m ourservice_client ...` command instead. MVP can start with just the npx (Node) option since Node is common. Documentation can cover other methods.
- **JSON Config:** We present a pre-filled JSON snippet as described earlier. The user can toggle between, say, “Cursor config” and “Generic JSON” if needed, but likely one format works for most (transport + URL + key). This snippet is in a text area for copying. We will specifically note in the UI: “Use this in tools like Cursor IDE (paste into .cursor/mcp.json)”.
- **Summary of Tools:** We’ll also list which tools are running on this server (for confirmation). E.g. “Tools enabled: Slack, Google Drive”. This helps the user remember what this endpoint includes, which is useful if they set up multiple servers.
The user can now proceed to integrate this with their AI assistant of choice.
**5. Integrating with AI Tools (Claude, Cursor, etc.):** This step happens outside our platform but is crucial to document. We assume once the user has the connection info, they will:
- **In Cursor IDE:** go to Settings -> MCP -> “Add New MCP Server”. They will choose **Type: SSE**, give it a nickname (e.g. “My Tools”), and paste the URL we gave (including the `/sse` path and perhaps the key if required in the URL format ([Cursor – Model Context Protocol](https://docs.cursor.com/context/model-context-protocol#:~:text=For%20SSE%20servers%2C%20the%20URL,http%3A%2F%2Fexample.com%3A8000%2Fsse))). They save, and the Cursor client might test the connection. Once added, the tools should appear in Cursor’s interface (the agent will list the tool names). We should ensure that the tools’ identifiers we provide are descriptive for the user. For example, if the Slack tool is added, Cursor might show an ability like “Slack: channel messaging” – this comes from the MCP server advertising its tools. Our service will make sure the server properly reports the tool names so that clients display them.
- **In Claude (Anthropic’s Claude Desktop):** Currently, Claude Desktop supports local MCP servers via its UI. For remote, users might run the npx command we gave in a terminal on their machine. That npx command effectively connects our cloud MCP server to Claude as if it were local (likely by piping Claude’s requests to our cloud via SSE under the hood). Alternatively, if Claude introduces a way to enter an SSE URL, the user can use that directly. We will provide guidance in our documentation for Claude users. (For MVP, we may test and document the npx approach if that’s the primary method). The user experience would be: run `npx ...` before asking Claude to use the tool, and Claude will detect the MCP server connection through its local interface.
- **Testing:** The user might then ask Claude or the Cursor agent to perform an action using the new tool (e.g. “Find files in my Drive about project X” or “Post a message on Slack channel Y”). They should see the AI utilizing the tool, and our backend would log the interaction (which the user can see in their dashboard logs). This successful round-trip will validate the setup.
**6. Managing and Monitoring via Dashboard:** After initial setup, the user can always return to our web dashboard to manage their MCP servers. Typical actions and flows:
- **View Status:** On the dashboard, each server might show a green “Running” status. If needed, we might allow pausing/stopping an instance to conserve usage (though since Workers are event-driven, an idle server incurs no cost; “stopping” would simply disable access at the logical level). The MVP might not require a stop function, but it could be a nice control (with a play/stop button per instance).
- **Edit Configuration:** The user can click on a server to edit it. This brings back the tool selection UI. They can add a new tool or remove one, or update credentials (say their API key changed). After making changes, they deploy again. We ensure they know that the endpoint might restart briefly. Once updated, they can immediately use the new tools (or see that removed ones are no longer available to the AI).
- **Rotate Credentials:** If the user fears a key leak or just wants to rotate their server’s URL/key, we’ll provide a **Regenerate Key** option. This will issue a new secret and update the endpoint (the old one becomes invalid). The dashboard will then show the updated info for them to reintegrate on the client side. (This is more of a security feature, possibly not mandatory for MVP, but good to include if time permits.)
- **Monitor Usage:** The user can navigate to an analytics section (or within each server’s detail page) to see the usage stats. For example, they open their server and see charts or counts of how many calls were made this week. If they are on a paid plan, this helps them gauge if they’re nearing limits. If on free, it shows how soon they might need to upgrade.
- **Upgrade Prompt:** If a user on free tier hits a limit (e.g. tries to add a 3rd tool), the UI should guide them to upgrade. For instance, when they click “Add” on a third tool, we can show a modal: “Upgrade to Pro to enable more than 2 tools on one server.” This way, the flow itself drives the upsell at natural points. Similarly, if their usage is maxed out, an alert or banner in the dashboard can say “You’ve reached the free plan limit – upgrade to continue uninterrupted service.”
- **Multiple Servers Management:** If the user has several MCP servers, they can manage each one separately. The dashboard list view allows them to select which server to view or edit. If on Free plan with only one server, this part is simple. On Pro, managing multiple might involve a tabbed interface or a list with selection. MVP can keep it basic (a list of names and clicking one loads its details).
Throughout the user flow, we emphasize **clarity and simplicity**. Each step should be as guided as possible: e.g., tooltips or help icons near complex concepts (like what SSE means, or how to use the npx command). We will also maintain a **Documentation section** or links to a docs site with step-by-step guides (including screenshots) for these flows. A smooth user flow not only helps first-time users succeed but also reduces support requests and increases conversion (a user who easily sets up a working integration is more likely to become a paying customer).
# Technical Architecture
The system architecture is designed for **scalability, security, and ease of deployment** across all components – from the web app to the MCP servers running on Cloudflare. Below is an overview of each major component and how they interact:
**1. Component Overview:**
- **Web Frontend (Dashboard):** A single-page application (SPA) or modern web app (built with React/Vue or similar) that users interact with. It communicates with our backend via RESTful or GraphQL APIs. It handles the UI for login, configuring servers, and displaying analytics.
- **Backend API & Orchestration:** A server (or set of serverless functions) that handles all application logic: user authentication callbacks (GitHub OAuth processing), CRUD operations for MCP server configurations, triggering deployments to Cloudflare, and serving usage data. This can be built with Node.js (Express/Fastify) or Python (FastAPI) etc., depending on team expertise. We might host this on a cloud platform or even Cloudflare Workers if we go serverless all-in. The backend has access to a **database** and third-party APIs (Cloudflare API, etc.).
- **Database:** A persistent store for user data. Likely a PostgreSQL or a serverless DB (Cloudflare D1 if sticking to CF, or Firebase/Firestore, etc.). It stores users, their MCP server configs (which tools enabled, any stored credentials or settings), usage logs, and subscription info. Security is paramount: any sensitive info (like API tokens for tools) should be encrypted at rest.
- **Cloudflare Workers (MCP Instances):** Each user’s MCP server runs as code deployed to Cloudflare’s Workers infrastructure (which is global, serverless). We either deploy one Worker **per MCP instance** or use a multi-tenant approach with one Worker handling multiple instances keyed by URL subpaths. For isolation and simplicity, one-per-instance is preferred in MVP. These Workers contain the logic for the selected tools (essentially they include the MCP server libraries for those tools). When an AI client connects via SSE or when the npx client calls, the Worker engages the tool logic and communicates over the MCP protocol.
- **API Gateway / Reverse Proxy:** (If needed) We might have an API gateway layer that fronts the Workers. However, Cloudflare Workers themselves can directly receive HTTP requests on unique routes or subdomains. We will likely use Cloudflare’s routing (e.g., workers.dev subdomains or custom domain routing) to direct requests for a given instance to the correct Worker. We will incorporate **rate limiting** and API key checking at this layer (either within the Worker code or via Cloudflare’s API Gateway rules) to enforce our usage policies.
- **Analytics & Monitoring Service:** To collect usage data, we might use Cloudflare’s built-in analytics (they have some for Workers), or instrument our code to log events to our database or a separate analytics pipeline (like Segment or a simple logging DB table). This service isn’t user-facing but feeds the Dashboard metrics. It could be as simple as writing an entry to a “usage” table every time a request is handled, then aggregating by user.
**2. MCP Server Deployment Process:**
When a user hits “Deploy”, here’s what happens under the hood (MVP approach):
- The frontend calls our **Backend API** (e.g., `POST /deploy`) with the user’s desired config (selected tools and their settings).
- The backend validates that config (ensures the user is allowed that many tools, checks config completeness). Then, it prepares the code bundle for the MCP server. We likely have a template or library for each tool. For example, we maintain an NPM package or a set of modules for each official tool connector (possibly leveraging the open-source implementations ([Introducing the Model Context Protocol \ Anthropic](https://www.anthropic.com/news/model-context-protocol#:~:text=Claude%203,GitHub%2C%20Git%2C%20Postgres%2C%20and%20Puppeteer))). The backend will **dynamically build a Worker script** that includes only the selected tools. This could be done via a build script or by assembling a JS code string.
- Using Cloudflare’s API (or SDK), the backend will **upload the script** as a Cloudflare Worker. If it’s a new server, we create a new worker with an ID tied to the user’s instance (e.g. user `uid123-server1`). If updating, we update that existing worker script. Cloudflare Workers can be deployed via a REST API where we send the JS/WASM bundle and some metadata. We’ll also set any **Environment Variables or Secrets** for that worker at this time – for example, if the user provided a Slack token, we add it as a secret env var accessible to the Worker. We also include an env var for the API key the worker should require for incoming requests, generated by us for this instance.
- Cloudflare deploys this globally. The backend then configures a route: e.g. `john123.mcp.example.com/*` or similar to map to this Worker. Alternatively, we use Cloudflare’s subdomain per worker feature (workers can be addressed at `<worker-name>.<subdomain>.workers.dev`). We’ll provide the user with that route as the SSE endpoint.
- The backend returns success to the frontend with the connection details (the same ones we show the user). We store the mapping of instance -> endpoint, tools, etc. in our DB as well.
This process leverages **serverless deployment** so we don’t manage servers ourselves. It’s important that build and deployment are fast; if building a custom bundle per deploy is too slow, an alternative is deploying a single “universal” worker that has all tools and just reads the config on each request. However, that might be heavier and less secure (all user configs in one runtime). MVP will try per-instance deployment, and we can optimize as needed (e.g., caching built bundles for identical tool sets, though that’s an edge case).
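
A minimal sketch of the deploy step described above, assuming the backend pushes the assembled Worker script and its secrets through Cloudflare's HTTP API; the endpoint paths, payload shapes, and the helper name `deployInstance` are illustrative and should be checked against Cloudflare's current API documentation.

```typescript
// Sketch only: deploy a per-instance Worker through Cloudflare's REST API.
// URLs and payloads are illustrative; verify against Cloudflare's API docs.
const CF_API = "https://api.cloudflare.com/client/v4";

async function deployInstance(
  accountId: string,
  apiToken: string,
  scriptName: string,              // e.g. "uid123-server1"
  workerSource: string,            // assembled JS bundle for the selected tools
  secrets: Record<string, string>  // tool credentials plus the instance API key
): Promise<void> {
  // 1. Upload (create or update) the Worker script.
  const upload = await fetch(
    `${CF_API}/accounts/${accountId}/workers/scripts/${scriptName}`,
    {
      method: "PUT",
      headers: {
        Authorization: `Bearer ${apiToken}`,
        "Content-Type": "application/javascript",
      },
      body: workerSource,
    }
  );
  if (!upload.ok) throw new Error(`Worker upload failed: ${upload.status}`);

  // 2. Bind each credential as an encrypted secret on that script.
  for (const [name, text] of Object.entries(secrets)) {
    const res = await fetch(
      `${CF_API}/accounts/${accountId}/workers/scripts/${scriptName}/secrets`,
      {
        method: "PUT",
        headers: {
          Authorization: `Bearer ${apiToken}`,
          "Content-Type": "application/json",
        },
        body: JSON.stringify({ name, text, type: "secret_text" }),
      }
    );
    if (!res.ok) throw new Error(`Setting secret ${name} failed: ${res.status}`);
  }

  // 3. A route (e.g. john123.mcp.example.com/*) would then be attached to the script.
}
```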
**3. Security Considerations:**
- **API Keys & Auth:** Every request from an AI client to an MCP server must include a valid API key or token that we issued to the user. We will likely use a long random token (e.g. UUID4 or similar) per instance. The Cloudflare Worker checks this token on each request (for SSE, it checks at connection start). If missing or invalid, it rejects the connection. This prevents others from hitting the endpoint if they discover the URL. Communication is always over HTTPS (Cloudflare provides SSL termination), so the token is not exposed in plain text. (A minimal sketch of this check appears after this list.)
- **User Credential Storage:** Any credentials the user provides for tools (like a database password, API tokens for third-party services) will be stored encrypted in our database and only decrypted when deploying to Cloudflare (then set as env vars). We’ll use encryption keys managed by our service (or KMS if available). The frontend will transmit those securely (HTTPS, and possibly we never log them). On Cloudflare’s side, environment secrets are also encrypted at rest. We ensure that when a user deletes a server or a credential, we remove it from our storage and can also remove it from Cloudflare Worker (via their API).
- **Isolation:** By using separate Worker instances, each MCP server is isolated from others. There’s no risk of data leakage across users. Even if multiple instances run on the same physical machine in Cloudflare’s network, their memory and env are separated by the runtime.
- **API Gateway & Rate Limiting:** We will implement rate limiting rules – for instance, using Cloudflare’s **Workers KV or Durable Object** to count requests per API key. If a user exceeds their quota, the Worker can start returning an error (or a notice event) indicating they are over limit. We might also integrate Cloudflare’s own **API Gateway** product or use a middleware in the Worker to throttle. The decision is between simplicity (do it in-code with counters) vs. using Cloudflare config. MVP might do a simple in-memory count + periodic flush to DB for persistence of usage counts. (Given Workers may spawn in many locations, a Durable Object or centralized counter service might be needed for global accuracy – this is a technical challenge to solve in implementation).
- **Audit & Logging:** All actions on the backend (login, create server, deploy, delete) will be logged in an audit log (internal) with timestamp and user ID. This is useful for debugging and security audits. For enterprise especially, we’ll want a record of changes. The Workers can also log accesses (time, which tool was invoked) to a log store. Cloudflare Workers can use a service like **Workers Analytics Engine** or simply send log events back to our API. MVP will capture essential events (e.g. “user X called tool Y at T time”) to enable our analytics and possibly to troubleshoot issues.
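
A minimal sketch of the per-request API key check and quota accounting described in this section, assuming the instance's expected key and a usage counter are exposed to the Worker as environment bindings; the binding names, quota value, and use of KV (rather than a Durable Object) are assumptions.

```typescript
// Sketch only: validate the per-instance API key before serving the SSE stream,
// and keep a rough per-key request count for quota enforcement.
// Env types come from @cloudflare/workers-types; binding names are assumptions.
export interface Env {
  INSTANCE_API_KEY: string;  // secret set at deploy time
  USAGE: KVNamespace;        // KV used here as a simple counter
}

const MONTHLY_QUOTA = 5_000; // e.g. the free-tier cap

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const url = new URL(request.url);
    const key =
      url.searchParams.get("key") ??
      request.headers.get("Authorization")?.replace("Bearer ", "");

    if (key !== env.INSTANCE_API_KEY) {
      return new Response("Unauthorized", { status: 401 });
    }

    // Naive usage accounting; a Durable Object would give globally accurate counts.
    const bucket = `calls:${new Date().toISOString().slice(0, 7)}`; // e.g. "calls:2025-06"
    const used = Number((await env.USAGE.get(bucket)) ?? "0");
    if (used >= MONTHLY_QUOTA) {
      return new Response("Quota exceeded", { status: 429 });
    }
    await env.USAGE.put(bucket, String(used + 1));

    // ...hand off to the MCP/SSE handling for this instance's enabled tools...
    return new Response("ok");
  },
};
```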
**4. Performance & Scalability:**
- The choice of Cloudflare Workers means the **MCP servers scale automatically** with incoming load – Cloudflare will run as many isolates as needed to handle concurrent SSE streams or tool requests. We need to ensure our code for each tool is efficient (but since many tools just call out to an API or perform I/O, the compute per request is small).
- Our **backend API** (for the dashboard) should also be scalable. It will see far less load than the Workers since the heavy traffic is AI agent <-> MCP server communications which go direct to Workers. The backend mainly handles user actions and periodic polling from the dashboard. We can host the backend on a scalable platform or even as a Cloudflare Worker (though likely we’ll use a traditional app server for easier development initially).
- The database should handle potentially many small writes (if logging every request for analytics). We may introduce a buffer or batching for analytics writes to not bottleneck on the DB. Using a time-series DB or an analytics service might be prudent if usage grows (future consideration).
- **Global Access:** Since Cloudflare’s network is worldwide, no additional CDN is needed – users in Europe, Americas, Asia will all have low latency connecting to their MCP endpoint. This makes using the tools feel responsive, which is critical for a good UX in AI assistants (nobody wants a tool that responds slowly).
- If any tool involves heavy processing (like large file reading), we will test that within the constraints of Workers (CPU time, memory). Cloudflare has limits (e.g. 50ms CPU time per request by default, can be higher on paid plans, and memory ~128MB by default). We should note that and possibly restrict certain tools (like “heavy data processing”) or use streaming properly so as not to hit limits. If necessary, for extremely heavy tasks we could integrate with another service or queue (beyond MVP scope).
**5. Secure API Key Management:**
- Each user (or each of their MCP instances) will have an **API key or token** generated by us. We might generate one key per user that can access all their instances (with an instance identifier in requests), or simpler: one unique key per instance. The latter is more secure if the user wants to share one server’s access with someone without exposing others. MVP will issue one key per instance, shown in the dashboard.
- The keys will be long random strings (e.g. 32+ chars) and stored hashed in our database (for security, similar to how passwords are stored). The actual key is only shown to the user when created. If they lose it, they can generate a new one (which invalidates the old). (A sketch of key generation and hashing appears after this list.)
- When an AI client connects (via SSE or uses the npx client), it must provide this key. The Cloudflare Worker for that instance will have the expected hash or key in its environment (set at deploy time). It authenticates on each connection. If auth fails, it doesn’t serve the tool functions.
- Additionally, our backend API (for managing the account) will have its own authentication – since we use OAuth, we’ll issue a session token or JWT to the frontend to include in subsequent API calls. Standard web security (HTTPS, CSRF protection if needed, etc.) will be followed for the dashboard APIs.
- We will enforce that sensitive operations (like deleting an MCP server, or viewing a sensitive config value) require an authenticated session. Possibly re-auth (OAuth) if high security needed, though probably not necessary here.
- Overall, the architecture is designed such that even if one component is compromised, other user data remains safe. For example, even if someone gets the list of deployed Worker scripts, they can’t use them without keys; if someone breaches the database, user’s third-party API creds are encrypted; if someone steals a user’s dashboard session, they still can’t call the MCP tools without also having the API key, etc. Defense in depth will be applied.
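
A small sketch of issuing and verifying instance keys as described above, using the standard Web Crypto API; the helper names and 32-byte key length are illustrative choices, not a final decision.

```typescript
// Sketch only: issue a long random API key and store only its SHA-256 hash,
// using the Web Crypto API available in Workers and modern Node runtimes.
function toHex(bytes: Uint8Array): string {
  return Array.from(bytes, (b) => b.toString(16).padStart(2, "0")).join("");
}

async function issueInstanceKey(): Promise<{ key: string; hash: string }> {
  const raw = crypto.getRandomValues(new Uint8Array(32)); // 32 bytes -> 64 hex chars
  const key = toHex(raw);
  const digest = await crypto.subtle.digest("SHA-256", new TextEncoder().encode(key));
  return { key, hash: toHex(new Uint8Array(digest)) }; // show `key` once; persist only `hash`
}

// At request time: hash the presented key and compare against the stored hash.
async function keyMatches(presented: string, storedHash: string): Promise<boolean> {
  const digest = await crypto.subtle.digest("SHA-256", new TextEncoder().encode(presented));
  return toHex(new Uint8Array(digest)) === storedHash;
}
```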
In summary, the technical architecture leverages modern serverless components to deliver a robust, scalable service. By using Cloudflare Workers for execution, we offload a lot of ops overhead and get global performance. The backend and dashboard handle the user experience and orchestration, with security measures at every step (OAuth for identity, API keys for tool access, rate limits for fairness). This setup is designed to support an MVP launch with a single developer or small team in mind, but can scale to many users and be expanded in functionality over time.
# Go-to-Market & Viral Growth Strategy
Launching a developer-focused micro SaaS requires careful planning to drive adoption and conversion. Our go-to-market approach will emphasize **frictionless onboarding, community engagement, and a freemium model that encourages upgrades**. We will also build virality and word-of-mouth into the product experience. Key strategies include:
**Frictionless Onboarding:** From first impression to first success, we minimize barriers:
- **One-Click Sign Up:** Using GitHub OAuth (as described) means in two clicks a user is in the app. No lengthy forms or credit card required for the free tier. We explicitly allow users to explore without entering payment info upfront. This encourages more trial signups.
- **Instant Value (Time-to-First-Tool):** Our onboarding flow will aim to have users deploy their first MCP server within ~5 minutes of signing up. We provide guided tutorials or even an interactive setup wizard. For instance, a new user could be prompted: “What do you want to connect to your AI first? [GitHub Repo] [Slack] [Custom...]” – they pick one, and we walk them through adding that tool and hitting deploy. Achieving a quick “wow it works” moment (like seeing Claude retrieve data from their chosen source) greatly increases the chance they stick around.
- **Default Templates:** To make it even easier, we might offer a few **pre-configured server templates** on sign-up (especially if we detect their use case). E.g., “Connect your codebase to Claude” as a one-click that sets up the Git + Filesystem tools with sane defaults. The user only needs to provide a repo link or token. These templates could be showcased on the landing page and in-app, making it clear that very useful scenarios can be achieved with minimal setup.
- **Educational Content:** We will prepare short **YouTube videos, GIFs, or docs** that show how to integrate with Cursor or Claude. Possibly a 2-minute demo of someone using our service to, say, have Claude answer questions from a private Notion (via our connector). These materials will be linked in the app and shared on social media to attract interest. A smooth onboarding backed by helpful content reduces drop-offs.
**Community and Viral Loop:** As an extension of an open-source protocol, we will tap into the existing community and foster new sharing mechanisms:
- **Integration with Anthropic/Cursor Community:** We will actively participate in forums (like Reddit, Discord, or the Cursor community forum) where MCP early adopters discuss. By solving a real pain point (hosting), we can get word-of-mouth among developers who are trying MCP locally. For example, if someone asks “How can I run MCP server in the cloud for my team?”, our solution can be the answer – leading to organic adoption. We might even collaborate with Anthropic to be listed as a resource for MCP (if they’re open to third-party services).
- **Referral Incentives:** We can implement a referral program: users who invite others (and those who join) get a bonus, such as extra free usage or a discount on upgrade. E.g., “Get 20% more free calls for 3 months for each friend who signs up with your link.” Developers are often enthusiastic to share useful new tools, especially if there’s a perk. This can drive viral growth through personal networks or developer communities.
- **Social Sharing of Achievements:** Whenever a user successfully sets up a useful integration, encourage them to share. The app could have a subtle “Share your setup” prompt – perhaps generating a sanitized summary (not leaking keys) of “I just connected Slack and GitHub to my AI using [ProductName]!” with a link. On the landing site, highlight cool use cases (“See how @devMike is using it to…”) to create buzz.
- **Marketplace & Templates (Future Viral Loop):** As mentioned in the roadmap, a marketplace of user-contributed MCP servers could be huge for virality. Users might publish interesting connectors or combinations, making the platform a go-to for discovering AI tools. While not MVP, laying the groundwork for this (and hinting it’s “coming soon”) can excite early adopters. Users who publish content are likely to share it, bringing others to the platform.
**Logarithmic MRR Growth Strategy:** We plan for revenue to increase with both **new customer acquisition and expansion within existing customers**:
- **High Conversion Freemium:** The free tier is generous enough to attract users and let them prove the value on a small scale, but it’s intentionally capped so that any serious usage triggers an upgrade. We will monitor usage patterns – e.g., if 25% of free users hit a limit within a month, that indicates a healthy funnel to upsell. Our job is then to prompt them with timely upgrade offers (“You’ve used 90% of your free quota – upgrade now to avoid interruption”). Because the tool naturally becomes part of a workflow (AI assistant), users will have a strong incentive to keep it running smoothly, which encourages converting to paid before hitting a wall. This should lead to a high conversion rate from free to paid relative to typical SaaS, as long as we demonstrate clear value.
- **Expansion in Accounts:** For Pro (or higher) subscribers, we look for opportunities to increase their usage. This is aided by our pricing model’s multiple dimensions. For instance, a small team might start on Pro with one project, then realize they can use the service for another project – now they need another instance or more tools, which might bump them toward Enterprise or buying an add-on. We will reach out (or automate prompts) when usage is climbing: “Looks like your team is getting great use out of the platform! Did you know you can add another server for just $X or upgrade to Enterprise for unlimited…”. Good customer success can drive upsells.
- **Partnerships and Integrations:** We will seek partnerships with complementary platforms (for example, IDEs like Cursor or code hosting like Replit) to feature our service as “MCP hosting partner” or similar. If, say, a dev platform suggests our service for connecting their environment to AI, that funnel can bring high-value users. Partner deals might include revenue share, but more importantly they accelerate user acquisition.
- **Content Marketing & SEO:** We’ll create content (blog posts, tutorials) around keywords like “Claude MCP hosting”, “AI tool integration”, “connect [X] to LLM with MCP”. This will capture search traffic as interest in MCP grows (being an emerging tech from late 2024 ([Introducing the Model Context Protocol \ Anthropic](https://www.anthropic.com/news/model-context-protocol#:~:text=The%20Model%20Context%20Protocol%20is,that%20connect%20to%20these%20servers)), people will seek info). By being early with content, we can establish domain authority and get consistent sign-ups through organic channels. Over time this reduces paid marketing needs and produces steady growth – a logarithmic curve as cumulative content drives compounding traffic.
**Freemium Model to Drive High Conversion:**
Our freemium is designed such that using the product more deeply also increases its stickiness. For example, a single MCP server with one tool might be “nice to have”, but when a user configures 3-4 tools and integrates it into their daily AI workflow, it becomes an integral part of their productivity. At that point, paying a modest fee for reliability and capacity is a no-brainer. We also ensure that the paid plans have clear **added value** beyond just higher limits, like the premium connectors and better support. This way, users don’t feel “forced” to pay – they feel they *want* to pay to get the full experience. High conversion is further supported by:
- **Trust and Reliability:** From the get-go, we’ll emphasize that this is a reliable hosted solution (perhaps showcasing uptime or testimonials). Users are more willing to pay if they trust the service for mission-critical use. Any outage or major bug could hurt that trust, so part of go-to-market is ensuring the MVP is stable and communicating transparently about status (maybe a status page).
- **Customer Feedback Loop:** Early on, we will personally reach out to active free users to ask about their experience, and what would make it worth paying for. This not only builds goodwill (they feel heard) but also gives us insight to refine pricing or features. Satisfied early users can become evangelists.
- **Conversion CTAs:** Within the app, strategic call-to-action prompts will remind users of the benefits of upgrading. For instance, on the analytics page we might show “On Pro, you’d also see detailed per-tool usage” or if they try to add a premium tool on free, it’ll show a lock icon and “Available on Pro plan”. These nudges, without being too naggy, keep the upgrade option in view.
By combining a smooth onboarding (to drive sign-ups), community/viral hooks (to multiply sign-ups), and a thoughtfully crafted freemium funnel (to turn sign-ups into revenue), we aim to grow MRR steadily. The strategy focuses on **developer happiness and empowerment** – if we make users feel empowered by the product, they will naturally promote it and invest in it. As usage of large-language-model tools like Claude grows (and MCP as a standard grows), our service should ride that wave and capture a significant user base by being the easiest way to deploy MCP. Success will be measured by conversion rates, retention rates, and referrals, all of which this strategy seeks to maximize.
# Future Roadmap
After a successful MVP launch, we have a broad vision for expanding the product’s capabilities and market reach. We will prioritize features that enhance the platform’s usefulness, create network effects, and cater to more demanding use cases (enterprise, advanced developers). Below are key items on our future roadmap:
**Marketplace for User-Created MCP Servers:** One of the most exciting opportunities is to turn the platform into not just a host, but a hub for MCP tools. In the future, we plan to allow users to **publish their own MCP server configurations or custom tools** to a marketplace/discoverability section. This could work as follows: advanced users can upload code for a new tool (or entire MCP server) that they developed – for example, a connector to a less common SaaS or a specialized data source. We would sandbox and review these for security, then allow them to be one-click installable by other users. This creates a **network effect**: the more tools the community contributes, the more valuable the platform becomes to every user. We might also introduce a rating/review system so the best community-contributed tools rise to the top. In terms of monetization, we could let creators offer tools for free or for a price (taking a commission), effectively opening a **developer marketplace** similar to app stores. This not only drives viral growth (people share their creations) as mentioned, but also can generate additional revenue and engagement. This is a longer-term project as it involves vetting third-party code and possibly providing sandboxing beyond what Cloudflare offers, but even a curated “recipes library” of configurations could be a stepping stone (e.g. users sharing JSON configs for certain combinations which others can import).
**Expansion of Premium Tools Library:** We will continuously grow the built-in library of tools, especially focusing on **high-value integrations**. Some examples on the roadmap: connectors for enterprise software like Salesforce, databases like Oracle or Microsoft SQL (for legacy systems), knowledge bases like Confluence or SharePoint, or even hardware/IoT integrations for specialized use. Many of these may not be available in the open-source MCP repo or might not be trivial for users to set up by themselves. By providing them, we increase the reasons users come to our platform. Premium tools might also include *composite tools* – for instance, a “Quick Analytics” tool that, behind the scenes, uses a combination of database + spreadsheet logic to let an AI do data analysis. These complex scenarios can be packaged as easy-to-add tools on our platform, which users would otherwise struggle to implement solo. Each new premium tool can be a marketing opportunity (“Now supporting X!”) to attract users who specifically need that. Our development team might partner with API providers (say, partner with Notion or Airtable to make an MCP tool for their product). Over time, our library could become the most comprehensive collection of AI integration tools available. We will also update existing tools to improve them (e.g. if Slack releases new APIs, we update our Slack connector with more features) – ensuring our paying customers always have cutting-edge capabilities.
**Enterprise-Focused Offerings:** As we gain traction, likely larger companies will show interest. Beyond the Enterprise plan features discussed in pricing, there are specific roadmap items to better serve enterprise clients:
- **Team Administration:** Develop features for org admins to manage multiple users under one billing. This includes creating organization accounts where several developers can collaborate on MCP servers, share access to instances, and have role-based permissions. For example, an admin can create an MCP server and a teammate can use it or view logs. This is crucial for adoption in a team setting and for enterprises that want to manage everything centrally.
- **Single Sign-On (SSO):** Implement SSO integration with providers like Okta, Azure AD, or Google Workspace. Many enterprises require that employees authenticate through their centralized system. This not only smooths login for users but also satisfies security/compliance requirements.
- **On-Premise/Private Deployment:** While our core offering is cloud (multi-tenant), some enterprise customers might demand a self-hosted version (due to strict data policies). A future option is a **self-hosted appliance** or a dedicated instance of our service on their cloud. This could be delivered as a container or as a managed single-tenant deployment (perhaps using their Cloudflare account or Workers on their behalf). It’s a non-trivial addition, but it could unlock contracts with big clients (e.g., a bank that wants to run everything in-house). We would likely only pursue this for significant deals, and it would come with premium pricing.
|
||||
- **Compliance and Certifications:** Over time we will work on obtaining relevant certifications (SOC 2, ISO 27001, etc.) to assure enterprises that our service meets security standards. This isn’t a feature per se, but part of the roadmap to being enterprise-ready by addressing legal/infosec hurdles that large companies have for vendors.
|
||||
- **Enhanced Analytics & Reporting:** Enterprises might want detailed reports of usage, perhaps by user or project, and integration with their monitoring tools. We could build an **export or API** for usage data, so enterprises can pull MCP usage into their internal dashboards. Also, audit logs of who accessed what tool could be provided for compliance tracking.
|
||||
|
||||
**AI Model Integrations and Evolution:** The AI landscape is fast-moving. We will keep an eye on how MCP itself evolves and how other AI systems might integrate:
|
||||
- If OpenAI or other LLM providers start supporting a similar plugin protocol, we could adapt our platform to serve those as well. For example, if a future version of ChatGPT allowed external tools (like plugins, but standardized), our service could host those connectors. Because MCP is open, it might become widely adopted, or it could converge with other standards. Our architecture being flexible means we can tweak the interfaces or add new output formats as needed (e.g., maybe a direct integration with LangChain or other agent frameworks as a client).
|
||||
- We might also incorporate **Function Calling** as a concept: some AI models prefer function call interfaces to tools. If relevant, we could present our tools to models via a function-calling gateway. This is speculative, but essentially staying flexible to serve as the “glue” between any AI assistant and any data/tool.
|
||||
- **Performance and ML features:** In the future, we might integrate more AI logic into the platform – e.g., optimizing how context is fetched (maybe pre-fetching or caching results from tools to answer queries faster). We could build a layer that intelligently routes tool requests or caches common queries (helpful for performance-sensitive enterprise use).
|
||||
|
||||
**Improved User Experience and Features:** As we gather user feedback post-MVP, we’ll continuously improve the core experience:
|
||||
- A likely request is **more robust debugging tools** for MCP servers. We could enhance the “log viewer” into an interactive console where users can simulate tool calls or inspect the internal state of their server. This would help developers test their configurations. We could also integrate the open-source MCP Inspector ([Introduction - Model Context Protocol](https://modelcontextprotocol.io/introduction#:~:text=Building%20MCP%20with%20LLMs%20Learn,with%20our%20interactive%20debugging%20tool)) into our UI.
|
||||
- Another area is **mobile access or notifications** – e.g., an admin wants to know if their server is down or hit an error. We might create a simple mobile-friendly dashboard or push notifications for certain events (like “Your instance hit an error, click to view logs”).
|
||||
- **Internationalization**: If the tool gains global traction, having the dashboard in multiple languages could be on the roadmap to capture non-English speaking developer communities (especially since AI is worldwide).
|
||||
- **UI/UX polish**: Post-MVP, invest in making the UI more intuitive based on analytics (where do users drop off in the flow?) and feedback. This includes everything from better forms for config to perhaps a “dark mode” for the developer-friendly vibe.
|
||||
|
||||
**Scalability and Technical Enhancements:** As usage grows, we’ll revisit the architecture to ensure it scales cost-effectively:
|
||||
- We might develop a **smart deployment manager** that batches or optimizes Cloudflare Worker usage. For example, if thousands of instances are running, we’ll work closely with Cloudflare or consider alternate execution methods to keep costs manageable. Cloudflare may introduce new features (like a higher-level MCP service) – if so, we’ll integrate rather than compete (maybe our service becomes a management layer on top of such features).
|
||||
- Implementation of **Durable Objects or Shared Processes:** If many instances use the same tool, we may be able to load one copy of the code and serve multiple instances from it, saving memory. These are low-level optimizations that could reduce per-instance overhead, allowing more users on the platform at lower cost. The roadmap includes continuous performance profiling and cost analysis to refine the backend.
|
||||
- **AI/Automation for Support:** Eventually, we could use an AI agent (perhaps powered by our own MCP setup!) to help support and guide users in-app. For instance, a chatbot that can answer “Why is my server not responding?” by checking their logs or configuration. This leverages our love for AI within the product and provides quicker help.
|
||||
|
||||
</context>
|
||||
# Claude Task Master - Product Requirements Document
|
||||
|
||||
<PRD>
|
||||
1. Overview
|
||||
Project Name: MCP SaaS MVP
|
||||
Objective: Provide a web-based platform where users can host customizable Model Context Protocol (MCP) servers in the cloud (on Cloudflare). Users can sign up with GitHub OAuth, pick from a library of “tools,” configure their own MCP server, and receive a secure SSE endpoint or npx command to integrate with AI assistants (e.g., Claude, Cursor IDE). The platform will track usage, enforce plan limits, and provide analytics.
|
||||
# Technical Architecture
|
||||
|
||||
1.1 Key Goals
|
||||
Simplify MCP Hosting: Users can deploy MCP servers without managing infrastructure.
|
||||
Curated Tools Library: Provide a built-in catalog of connectors (Slack, GitHub, Postgres, etc.).
|
||||
Plan Tiers: Freemium approach with usage-based or tool-based constraints.
|
||||
Analytics: Users can see basic usage stats (call counts) in a web dashboard.
|
||||
Scalability: Cloudflare Workers for hosting, scaling automatically on demand.
|
||||
2. Core Features
|
||||
User Authentication
|
||||
## System Components
|
||||
1. **Task Management Core**
|
||||
- Tasks.json file structure (single source of truth)
|
||||
- Task model with dependencies, priorities, and metadata
|
||||
- Task state management system
|
||||
- Task file generation subsystem
|
||||
|
||||
GitHub OAuth sign-in.
|
||||
Store user profiles (plan tier, usage) in a Cloudflare D1 database.
|
||||
MCP Server Creation
|
||||
2. **AI Integration Layer**
|
||||
- Anthropic Claude API integration
|
||||
- Perplexity API integration (optional)
|
||||
- Prompt engineering components
|
||||
- Response parsing and processing
|
||||
|
||||
Create/Edit servers in a dashboard UI.
|
||||
Assign a unique API key and SSE endpoint.
|
||||
Manage multiple tools per server, each with user-provided config (API tokens, etc.).
|
||||
Deployment
|
||||
3. **Command Line Interface**
|
||||
- Command parsing and execution
|
||||
- Interactive user input handling
|
||||
- Display and formatting utilities
|
||||
- Status reporting and feedback system
|
||||
|
||||
On creation or update, compile a Worker script that includes selected tools.
|
||||
Programmatically deploy via Cloudflare’s API.
|
||||
Provide user-friendly output: SSE URL, or an npx command for local bridging.
|
||||
Tool Integrations (MVP)
|
||||
4. **Cursor AI Integration**
|
||||
- Cursor rules documentation
|
||||
- Agent interaction patterns
|
||||
- Workflow guideline specifications
|
||||
|
||||
Slack (send/read messages)
|
||||
GitHub (list repos/files)
|
||||
Possibly a DB connector (Postgres) or minimal placeholders for future expansion.
|
||||
Usage Analytics
|
||||
## Data Models
|
||||
|
||||
Log each tool usage request for a given MCP server.
|
||||
Show monthly call counts in a simple chart or numeric table.
|
||||
Plan Enforcement
|
||||
### Task Model
|
||||
```json
|
||||
{
|
||||
"id": 1,
|
||||
"title": "Task Title",
|
||||
"description": "Brief task description",
|
||||
"status": "pending|done|deferred",
|
||||
"dependencies": [0],
|
||||
"priority": "high|medium|low",
|
||||
"details": "Detailed implementation instructions",
|
||||
"testStrategy": "Verification approach details",
|
||||
"subtasks": [
|
||||
{
|
||||
"id": 1,
|
||||
"title": "Subtask Title",
|
||||
"description": "Subtask description",
|
||||
"status": "pending|done|deferred",
|
||||
"dependencies": [],
|
||||
"acceptanceCriteria": "Verification criteria"
|
||||
}
|
||||
]
|
||||
}
|
||||
```
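
For illustration, a minimal validation sketch against this model might look like the following. The `validateTask` helper and its exact rules are assumptions for this sketch, not the current implementation:

```javascript
// Minimal task validation sketch (assumed helper, not the actual implementation).
const VALID_STATUSES = ['pending', 'done', 'deferred'];
const VALID_PRIORITIES = ['high', 'medium', 'low'];

function validateTask(task) {
  const errors = [];
  if (!Number.isInteger(task.id) || task.id < 1) errors.push('id must be a positive integer');
  if (!task.title || typeof task.title !== 'string') errors.push('title is required');
  if (!VALID_STATUSES.includes(task.status)) errors.push(`status must be one of ${VALID_STATUSES.join('|')}`);
  if (!Array.isArray(task.dependencies)) errors.push('dependencies must be an array of task IDs');
  if (task.priority && !VALID_PRIORITIES.includes(task.priority)) errors.push('invalid priority');
  if (task.subtasks && !Array.isArray(task.subtasks)) errors.push('subtasks must be an array');
  return { valid: errors.length === 0, errors };
}
```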
|
||||
|
||||
Free Tier: e.g., 1 server, 2 tools, limited monthly calls.
|
||||
Paid Tier: higher limits, premium tools, more calls.
|
||||
Enforcement logic blocks creation if user tries to exceed their plan constraints.
|
||||
Meta-Development Scripts
|
||||
### Tasks Collection Model
|
||||
```json
|
||||
{
|
||||
"meta": {
|
||||
"projectName": "Project Name",
|
||||
"version": "1.0.0",
|
||||
"prdSource": "path/to/prd.txt",
|
||||
"createdAt": "ISO-8601 timestamp",
|
||||
"updatedAt": "ISO-8601 timestamp"
|
||||
},
|
||||
"tasks": [
|
||||
// Array of Task objects
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
A set of Node.js scripts to generate tasks, update them based on new prompts, and produce individual files for AI-driven development. (See “tasks.json” below.)
|
||||
3. Technical Architecture
|
||||
Frontend (React)
|
||||
### Task File Format
|
||||
```
|
||||
# Task ID: <id>
|
||||
# Title: <title>
|
||||
# Status: <status>
|
||||
# Dependencies: <comma-separated list of dependency IDs>
|
||||
# Priority: <priority>
|
||||
# Description: <brief description>
|
||||
# Details:
|
||||
<detailed implementation notes>
|
||||
|
||||
Hosted on Cloudflare Pages or as static assets in a Worker.
|
||||
Communicates with a backend (Wrangler-based or separate Worker) via REST/GraphQL for server CRUD, usage analytics, etc.
|
||||
Backend
|
||||
# Test Strategy:
|
||||
<verification approach>
|
||||
|
||||
Cloudflare Workers (Node.js environment).
|
||||
Handles user OAuth, CRUD for MCP servers, triggers Worker deployment for each server instance.
|
||||
Database: Cloudflare D1
|
||||
# Subtasks:
|
||||
1. <subtask title> - <subtask description>
|
||||
```
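
As a sketch of how a task object could be rendered into this file format, something like the following could work. This is illustrative only; the real generator in scripts/dev.js may differ in details:

```javascript
// Sketch of rendering a task file from a task object using the format above (assumed helper).
function renderTaskFile(task) {
  const lines = [
    `# Task ID: ${task.id}`,
    `# Title: ${task.title}`,
    `# Status: ${task.status}`,
    `# Dependencies: ${(task.dependencies || []).join(', ')}`,
    `# Priority: ${task.priority}`,
    `# Description: ${task.description}`,
    '# Details:',
    task.details || '',
    '',
    '# Test Strategy:',
    task.testStrategy || ''
  ];
  if (task.subtasks && task.subtasks.length > 0) {
    lines.push('', '# Subtasks:');
    task.subtasks.forEach(st => lines.push(`${st.id}. ${st.title} - ${st.description}`));
  }
  return lines.join('\n');
}
```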
|
||||
|
||||
users (id, github_id, plan, created_at, etc.)
|
||||
mcp_servers (id, user_id, name, api_key, config, etc.)
|
||||
server_tools (server_id, tool_name, config_json, etc.)
|
||||
usage_logs (server_id, timestamp, usage_count, etc.)
|
||||
Worker Deployment
|
||||
## APIs and Integrations
|
||||
1. **Anthropic Claude API**
|
||||
- Authentication via API key
|
||||
- Prompt construction and streaming
|
||||
- Response parsing and extraction
|
||||
- Error handling and retries
|
||||
|
||||
Each MCP server is either a dedicated Worker or a single multi-tenant Worker that checks the server’s config on requests.
|
||||
SSE endpoint (/sse) implements the MCP protocol for AI assistants.
|
||||
Security
|
||||
2. **Perplexity API (via OpenAI client)**
|
||||
- Authentication via API key
|
||||
- Research-oriented prompt construction
|
||||
- Enhanced contextual response handling
|
||||
- Fallback mechanisms to Claude
|
||||
|
||||
OAuth tokens and user credentials stored securely in D1.
|
||||
Each MCP server uses a unique API key.
|
||||
Rate limits or usage checks at Worker runtime.
|
||||
4. MVP Scope vs. Future Enhancements
|
||||
In Scope (MVP)
|
||||
3. **File System API**
|
||||
- Reading/writing tasks.json
|
||||
- Managing individual task files
|
||||
- Command execution logging
|
||||
- Debug logging system
|
||||
|
||||
Basic Slack and GitHub tool connectors.
|
||||
Simple plan limits (free vs. paid placeholders).
|
||||
Basic usage analytics.
|
||||
Minimal admin features (e.g., toggle a user to paid in DB).
|
||||
Dashboard to create/edit servers and see usage stats.
|
||||
Out of Scope (Future)
|
||||
## Infrastructure Requirements
|
||||
1. **Node.js Runtime**
|
||||
- Version 14.0.0 or higher
|
||||
- ES Module support
|
||||
- File system access rights
|
||||
- Command execution capabilities
|
||||
|
||||
Payment processing integration (Stripe).
|
||||
Marketplace for community-contributed MCP servers.
|
||||
Advanced enterprise features (SSO, compliance).
|
||||
Team management (multiple logins per org).
|
||||
5. Go-To-Market Strategy (Brief)
|
||||
Freemium: Users can sign up instantly with GitHub OAuth, get a free tier.
|
||||
Upsell: Show usage limits and prompt users to upgrade upon hitting them.
|
||||
Developer Community: Provide easy instructions for integration with Claude or Cursor, emphasize no-code/low-code setup.
|
||||
Future: Offer enterprise-tier with custom usage agreements.
|
||||
6. Development Plan (High-Level)
|
||||
Phase A: Project scaffolding (repo, Wrangler, D1 setup).
|
||||
Phase B: User authentication (GitHub OAuth) and plan logic.
|
||||
Phase C: MCP server CRUD, generating API keys.
|
||||
Phase D: Tool library management and UI to add/remove tools.
|
||||
Phase E: Worker deployment logic, SSE endpoint.
|
||||
Phase F: Implement Slack, GitHub tools with usage logging.
|
||||
Phase G: Analytics & usage display in dashboard.
|
||||
Phase H: Final polish, plan enforcement, better UI.
|
||||
Phase I: Meta scripts for tasks and continuous AI-driven dev.
|
||||
2. **Configuration Management**
|
||||
- Environment variable handling
|
||||
- .env file support
|
||||
- Configuration validation
|
||||
- Sensible defaults with overrides
|
||||
|
||||
3. **Development Environment**
|
||||
- Git repository
|
||||
- NPM package management
|
||||
- Cursor editor integration
|
||||
- Command-line terminal access
|
||||
|
||||
# Development Roadmap
|
||||
|
||||
## Phase 1: Core Task Management System
|
||||
1. **Task Data Structure**
|
||||
- Design and implement the tasks.json structure
|
||||
- Create task model validation
|
||||
- Implement basic task operations (create, read, update)
|
||||
- Develop file system interactions
|
||||
|
||||
2. **Command Line Interface Foundation**
|
||||
- Implement command parsing with Commander.js
|
||||
- Create help documentation
|
||||
- Implement colorized console output
|
||||
- Add logging system with configurable levels
|
||||
|
||||
3. **Basic Task Operations**
|
||||
- Implement task listing functionality
|
||||
- Create task status update capability
|
||||
- Add dependency tracking
|
||||
- Implement priority management
|
||||
|
||||
4. **Task File Generation**
|
||||
- Create task file templates
|
||||
- Implement generation from tasks.json
|
||||
- Add bi-directional synchronization
|
||||
- Implement proper file naming and organization
|
||||
|
||||
## Phase 2: AI Integration
|
||||
1. **Claude API Integration**
|
||||
- Implement API authentication
|
||||
- Create prompt templates for PRD parsing
|
||||
- Design response handlers
|
||||
- Add error management and retries
|
||||
|
||||
2. **PRD Parsing System**
|
||||
- Implement PRD file reading
|
||||
- Create PRD to task conversion logic
|
||||
- Add intelligent dependency inference
|
||||
- Implement priority assignment logic
|
||||
|
||||
3. **Task Expansion With Claude**
|
||||
- Create subtask generation prompts
|
||||
- Implement subtask creation workflow
|
||||
- Add context-aware expansion capabilities
|
||||
- Implement parent-child relationship management
|
||||
|
||||
4. **Implementation Drift Handling**
|
||||
- Add capability to update future tasks
|
||||
- Implement task rewriting based on new context
|
||||
- Create dependency chain updates
|
||||
- Preserve completed work while updating future tasks
|
||||
|
||||
## Phase 3: Advanced Features
|
||||
1. **Perplexity Integration**
|
||||
- Implement Perplexity API authentication
|
||||
- Create research-oriented prompts
|
||||
- Add fallback to Claude when unavailable
|
||||
- Implement response quality comparison logic
|
||||
|
||||
2. **Research-Backed Subtask Generation**
|
||||
- Create specialized research prompts
|
||||
- Implement context enrichment
|
||||
- Add domain-specific knowledge incorporation
|
||||
- Create more detailed subtask generation
|
||||
|
||||
3. **Batch Operations**
|
||||
- Implement multi-task status updates
|
||||
- Add bulk subtask generation
|
||||
- Create task filtering and querying
|
||||
- Implement advanced dependency management
|
||||
|
||||
4. **Project Initialization**
|
||||
- Create project templating system
|
||||
- Implement interactive setup
|
||||
- Add environment configuration
|
||||
- Create documentation generation
|
||||
|
||||
## Phase 4: Cursor AI Integration
|
||||
1. **Cursor Rules Implementation**
|
||||
- Create dev_workflow.mdc documentation
|
||||
- Implement cursor_rules.mdc
|
||||
- Add self_improve.mdc
|
||||
- Design rule integration documentation
|
||||
|
||||
2. **Agent Workflow Guidelines**
|
||||
- Document task discovery workflow
|
||||
- Create task selection guidelines
|
||||
- Implement implementation guidance
|
||||
- Add verification procedures
|
||||
|
||||
3. **Agent Command Integration**
|
||||
- Document command syntax for agents
|
||||
- Create example interactions
|
||||
- Implement agent response patterns
|
||||
- Add context management for agents
|
||||
|
||||
4. **User Documentation**
|
||||
- Create detailed README
|
||||
- Add scripts documentation
|
||||
- Implement example workflows
|
||||
- Create troubleshooting guides
|
||||
|
||||
# Logical Dependency Chain
|
||||
|
||||
## Foundation Layer
|
||||
1. **Task Data Structure**
|
||||
- Must be implemented first as all other functionality depends on this
|
||||
- Defines the core data model for the entire system
|
||||
- Establishes the single source of truth concept
|
||||
|
||||
2. **Command Line Interface**
|
||||
- Built on top of the task data structure
|
||||
- Provides the primary user interaction mechanism
|
||||
- Required for all subsequent operations to be accessible
|
||||
|
||||
3. **Basic Task Operations**
|
||||
- Depends on both task data structure and CLI
|
||||
- Provides the fundamental operations for task management
|
||||
- Enables the minimal viable workflow
|
||||
|
||||
## Functional Layer
|
||||
4. **Task File Generation**
|
||||
- Depends on task data structure and basic operations
|
||||
- Creates the individual task files for reference
|
||||
- Enables the file-based workflow complementing tasks.json
|
||||
|
||||
5. **Claude API Integration**
|
||||
- Independent of most previous components but needs the task data structure
|
||||
- Provides the AI capabilities that enhance the system
|
||||
- Gateway to advanced task generation features
|
||||
|
||||
6. **PRD Parsing System**
|
||||
- Depends on Claude API integration and task data structure
|
||||
- Enables the initial task generation workflow
|
||||
- Creates the starting point for new projects
|
||||
|
||||
## Enhancement Layer
|
||||
7. **Task Expansion With Claude**
|
||||
- Depends on Claude API integration and basic task operations
|
||||
- Enhances existing tasks with more detailed subtasks
|
||||
- Improves the implementation guidance
|
||||
|
||||
8. **Implementation Drift Handling**
|
||||
- Depends on Claude API integration and task operations
|
||||
- Addresses a key challenge in AI-driven development
|
||||
- Maintains the relevance of task planning as implementation evolves
|
||||
|
||||
9. **Perplexity Integration**
|
||||
- Can be developed in parallel with other features after Claude integration
|
||||
- Enhances the quality of generated content
|
||||
- Provides research-backed improvements
|
||||
|
||||
## Advanced Layer
|
||||
10. **Research-Backed Subtask Generation**
|
||||
- Depends on Perplexity integration and task expansion
|
||||
- Provides higher quality, more contextual subtasks
|
||||
- Enhances the value of the task breakdown
|
||||
|
||||
11. **Batch Operations**
|
||||
- Depends on basic task operations
|
||||
- Improves efficiency for managing multiple tasks
|
||||
- Quality-of-life enhancement for larger projects
|
||||
|
||||
12. **Project Initialization**
|
||||
- Depends on most previous components being stable
|
||||
- Provides a smooth onboarding experience
|
||||
- Creates a complete project setup in one step
|
||||
|
||||
## Integration Layer
|
||||
13. **Cursor Rules Implementation**
|
||||
- Can be developed in parallel after basic functionality
|
||||
- Provides the guidance for Cursor AI agent
|
||||
- Enhances the AI-driven workflow
|
||||
|
||||
14. **Agent Workflow Guidelines**
|
||||
- Depends on Cursor rules implementation
|
||||
- Structures how the agent interacts with the system
|
||||
- Ensures consistent agent behavior
|
||||
|
||||
15. **Agent Command Integration**
|
||||
- Depends on agent workflow guidelines
|
||||
- Provides specific command patterns for the agent
|
||||
- Optimizes the agent-user interaction
|
||||
|
||||
16. **User Documentation**
|
||||
- Should be developed alongside all features
|
||||
- Must be completed before release
|
||||
- Ensures users can effectively use the system
|
||||
|
||||
# Risks and Mitigations
|
||||
|
||||
## Technical Challenges
|
||||
|
||||
### API Reliability
|
||||
**Risk**: Anthropic or Perplexity API could have downtime, rate limiting, or breaking changes.
|
||||
**Mitigation**:
|
||||
- Implement robust error handling with exponential backoff (see the sketch after this list)
|
||||
- Add fallback mechanisms (Claude fallback for Perplexity)
|
||||
- Cache important responses to reduce API dependency
|
||||
- Support offline mode for critical functions
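
For the exponential backoff mitigation above, a minimal retry wrapper might look like this. It is illustrative only; `withRetries` is a hypothetical helper, not the project's actual code:

```javascript
// Minimal exponential backoff sketch (illustrative; not the project's actual retry helper).
async function withRetries(fn, { retries = 3, baseDelayMs = 500 } = {}) {
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (attempt === retries) throw err;
      const delay = baseDelayMs * 2 ** attempt; // 500ms, 1s, 2s, ...
      console.warn(`Attempt ${attempt + 1} failed (${err.message}); retrying in ${delay}ms...`);
      await new Promise(resolve => setTimeout(resolve, delay));
    }
  }
}

// Usage example: wrap an API call
// const result = await withRetries(() => anthropic.messages.create({ /* ... */ }));
```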
|
||||
|
||||
### Model Output Variability
|
||||
**Risk**: AI models may produce inconsistent or unexpected outputs.
|
||||
**Mitigation**:
|
||||
- Design robust prompt templates with strict output formatting requirements
|
||||
- Implement response validation and error detection
|
||||
- Add self-correction mechanisms and retries with improved prompts
|
||||
- Allow manual editing of generated content
|
||||
|
||||
### Node.js Version Compatibility
|
||||
**Risk**: Differences in Node.js versions could cause unexpected behavior.
|
||||
**Mitigation**:
|
||||
- Clearly document minimum Node.js version requirements
|
||||
- Use transpilers if needed for compatibility
|
||||
- Test across multiple Node.js versions
|
||||
- Handle version-specific features gracefully
|
||||
|
||||
## MVP Definition
|
||||
|
||||
### Feature Prioritization
|
||||
**Risk**: Including too many features in the MVP could delay release and adoption.
|
||||
**Mitigation**:
|
||||
- Define MVP as core task management + basic Claude integration
|
||||
- Ensure each phase delivers a complete, usable product
|
||||
- Implement feature flags for easy enabling/disabling of features
|
||||
- Get early user feedback to validate feature importance
|
||||
|
||||
### Scope Creep
|
||||
**Risk**: The project could expand beyond its original intent, becoming too complex.
|
||||
**Mitigation**:
|
||||
- Maintain a strict definition of what the tool is and isn't
|
||||
- Focus on task management for AI-driven development
|
||||
- Evaluate new features against core value proposition
|
||||
- Implement extensibility rather than building every feature
|
||||
|
||||
### User Expectations
|
||||
**Risk**: Users might expect a full project management solution rather than a task tracking system.
|
||||
**Mitigation**:
|
||||
- Clearly communicate the tool's purpose and limitations
|
||||
- Provide integration points with existing project management tools
|
||||
- Focus on the unique value of AI-driven development
|
||||
- Document specific use cases and example workflows
|
||||
|
||||
## Resource Constraints
|
||||
|
||||
### Development Capacity
|
||||
**Risk**: Limited development resources could delay implementation.
|
||||
**Mitigation**:
|
||||
- Phase implementation to deliver value incrementally
|
||||
- Focus on core functionality first
|
||||
- Leverage open source libraries where possible
|
||||
- Design for extensibility to allow community contributions
|
||||
|
||||
### AI Cost Management
|
||||
**Risk**: Excessive API usage could lead to high costs.
|
||||
**Mitigation**:
|
||||
- Implement token usage tracking and reporting
|
||||
- Add configurable limits to prevent unexpected costs
|
||||
- Cache responses where appropriate
|
||||
- Optimize prompts for token efficiency
|
||||
- Support local LLM options in the future
|
||||
|
||||
### Documentation Overhead
|
||||
**Risk**: Complexity of the system requires extensive documentation that is time-consuming to maintain.
|
||||
**Mitigation**:
|
||||
- Use AI to help generate and maintain documentation
|
||||
- Create self-documenting commands and features
|
||||
- Implement progressive documentation (basic to advanced)
|
||||
- Build help directly into the CLI
|
||||
|
||||
# Appendix
|
||||
|
||||
## AI Prompt Engineering Specifications
|
||||
|
||||
### PRD Parsing Prompt Structure
|
||||
```
|
||||
You are assisting with transforming a Product Requirements Document (PRD) into a structured set of development tasks.
|
||||
|
||||
Given the following PRD, create a comprehensive list of development tasks that would be needed to implement the described product.
|
||||
|
||||
For each task:
|
||||
1. Assign a short, descriptive title
|
||||
2. Write a concise description
|
||||
3. Identify dependencies (which tasks must be completed before this one)
|
||||
4. Assign a priority (high, medium, low)
|
||||
5. Include detailed implementation notes
|
||||
6. Describe a test strategy to verify completion
|
||||
|
||||
Structure the tasks in a logical order of implementation.
|
||||
|
||||
PRD:
|
||||
{prd_content}
|
||||
```
|
||||
|
||||
### Task Expansion Prompt Structure
|
||||
```
|
||||
You are helping to break down a development task into more manageable subtasks.
|
||||
|
||||
Main task:
|
||||
Title: {task_title}
|
||||
Description: {task_description}
|
||||
Details: {task_details}
|
||||
|
||||
Please create {num_subtasks} specific subtasks that together would accomplish this main task.
|
||||
|
||||
For each subtask, provide:
|
||||
1. A clear, actionable title
|
||||
2. A concise description
|
||||
3. Any dependencies on other subtasks
|
||||
4. Specific acceptance criteria to verify completion
|
||||
|
||||
Additional context:
|
||||
{additional_context}
|
||||
```
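
How these templates get filled is an implementation detail; one simple approach is plain placeholder interpolation, sketched below. The placeholder names match the templates above, but the helper itself is hypothetical:

```javascript
// Hypothetical template interpolation helper for the prompts above.
function fillPromptTemplate(template, values) {
  return template.replace(/\{(\w+)\}/g, (match, key) =>
    key in values ? String(values[key]) : match // leave unknown placeholders untouched
  );
}

// Example usage with the task expansion template:
// const prompt = fillPromptTemplate(taskExpansionTemplate, {
//   task_title: task.title,
//   task_description: task.description,
//   task_details: task.details,
//   num_subtasks: 5,
//   additional_context: ''
// });
```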
|
||||
|
||||
### Research-Backed Expansion Prompt Structure
|
||||
```
|
||||
You are a technical researcher and developer helping to break down a software development task into detailed, well-researched subtasks.
|
||||
|
||||
Main task:
|
||||
Title: {task_title}
|
||||
Description: {task_description}
|
||||
Details: {task_details}
|
||||
|
||||
Research the latest best practices, technologies, and implementation patterns for this type of task. Then create {num_subtasks} specific, actionable subtasks that together would accomplish the main task.
|
||||
|
||||
For each subtask:
|
||||
1. Provide a clear, specific title
|
||||
2. Write a detailed description including technical approach
|
||||
3. Identify dependencies on other subtasks
|
||||
4. Include specific acceptance criteria
|
||||
5. Reference any relevant libraries, tools, or resources that should be used
|
||||
|
||||
Consider security, performance, maintainability, and user experience in your recommendations.
|
||||
```
|
||||
|
||||
## Task File System Specification
|
||||
|
||||
### Directory Structure
|
||||
```
|
||||
/
|
||||
├── .cursor/
|
||||
│ └── rules/
|
||||
│ ├── dev_workflow.mdc
|
||||
│ ├── cursor_rules.mdc
|
||||
│ └── self_improve.mdc
|
||||
├── scripts/
|
||||
│ ├── dev.js
|
||||
│ └── README.md
|
||||
├── tasks/
|
||||
│ ├── task_001.txt
|
||||
│ ├── task_002.txt
|
||||
│ └── ...
|
||||
├── .env
|
||||
├── .env.example
|
||||
├── .gitignore
|
||||
├── package.json
|
||||
├── README.md
|
||||
└── tasks.json
|
||||
```
|
||||
|
||||
### Task ID Specification
|
||||
- Main tasks: Sequential integers (1, 2, 3, ...)
|
||||
- Subtasks: Parent ID + dot + sequential integer (1.1, 1.2, 2.1, ...)
|
||||
- ID references: Used in dependencies, command parameters
|
||||
- ID ordering: Implies suggested implementation order
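
A small sketch of how these IDs might be parsed when referenced in commands or dependencies (assumed helper, not the actual implementation):

```javascript
// Parse a task or subtask ID string such as "3" or "3.2" (assumed helper).
function parseTaskId(idString) {
  const parts = String(idString).split('.').map(Number);
  if (parts.some(Number.isNaN)) throw new Error(`Invalid task ID: ${idString}`);
  return parts.length === 1
    ? { taskId: parts[0] }
    : { taskId: parts[0], subtaskId: parts[1] };
}

// parseTaskId("3")   -> { taskId: 3 }
// parseTaskId("3.2") -> { taskId: 3, subtaskId: 2 }
```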
|
||||
|
||||
## Command-Line Interface Specification
|
||||
|
||||
### Global Options
|
||||
- `--help`: Display help information
|
||||
- `--version`: Display version information
|
||||
- `--file=<file>`: Specify an alternative tasks.json file
|
||||
- `--quiet`: Reduce output verbosity
|
||||
- `--debug`: Increase output verbosity
|
||||
- `--json`: Output in JSON format (for programmatic use)
|
||||
|
||||
### Command Structure
|
||||
- `node scripts/dev.js <command> [options]`
|
||||
- All commands operate on tasks.json by default
|
||||
- Commands follow consistent parameter naming
|
||||
- Common parameter styles: `--id=<id>`, `--status=<status>`, `--prompt="<text>"`
|
||||
- Boolean flags: `--all`, `--force`, `--with-subtasks`
|
||||
|
||||
## API Integration Specifications
|
||||
|
||||
### Anthropic API Configuration
|
||||
- Authentication: ANTHROPIC_API_KEY environment variable
|
||||
- Model selection: MODEL environment variable
|
||||
- Default model: claude-3-7-sonnet-20250219
|
||||
- Maximum tokens: MAX_TOKENS environment variable (default: 4000)
|
||||
- Temperature: TEMPERATURE environment variable (default: 0.7)
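
A minimal sketch of how this configuration might be read and used with the Anthropic SDK follows. It assumes the `@anthropic-ai/sdk` and `dotenv` packages; the defaults mirror the values above:

```javascript
import Anthropic from '@anthropic-ai/sdk';
import dotenv from 'dotenv';

dotenv.config();

// Configuration with defaults matching the specification above.
const CONFIG = {
  model: process.env.MODEL || 'claude-3-7-sonnet-20250219',
  maxTokens: parseInt(process.env.MAX_TOKENS || '4000', 10),
  temperature: parseFloat(process.env.TEMPERATURE || '0.7')
};

const anthropic = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY });

// Example call (sketch):
// const message = await anthropic.messages.create({
//   model: CONFIG.model,
//   max_tokens: CONFIG.maxTokens,
//   temperature: CONFIG.temperature,
//   messages: [{ role: 'user', content: 'Hello' }]
// });
```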
|
||||
|
||||
### Perplexity API Configuration
|
||||
- Authentication: PERPLEXITY_API_KEY environment variable
|
||||
- Model selection: PERPLEXITY_MODEL environment variable
|
||||
- Default model: sonar-medium-online
|
||||
- Connection: Via OpenAI client
|
||||
- Fallback: Use Claude if Perplexity unavailable
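
And a sketch of the Perplexity connection via the OpenAI client, with the fallback condition made explicit. The base URL is an assumption about Perplexity's OpenAI-compatible endpoint:

```javascript
import OpenAI from 'openai';

// Perplexity exposes an OpenAI-compatible API; the exact base URL is an assumption here.
const perplexity = process.env.PERPLEXITY_API_KEY
  ? new OpenAI({
      apiKey: process.env.PERPLEXITY_API_KEY,
      baseURL: 'https://api.perplexity.ai'
    })
  : null;

const PERPLEXITY_MODEL = process.env.PERPLEXITY_MODEL || 'sonar-medium-online';

// Callers can check whether `perplexity` is null (or whether a request throws)
// and fall back to Claude in either case.
```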
|
||||
</PRD>
|
||||
@@ -127,6 +127,7 @@ function preparePackage() {
|
||||
'assets/env.example',
|
||||
'assets/gitignore',
|
||||
'assets/example_prd.txt',
|
||||
'assets/scripts_README.md',
|
||||
'.cursor/rules/dev_workflow.mdc',
|
||||
'.cursor/rules/cursor_rules.mdc',
|
||||
'.cursor/rules/self_improve.mdc'
|
||||
|
||||
171 scripts/task-complexity-report.json (new file)
@@ -0,0 +1,171 @@
|
||||
{
|
||||
"meta": {
|
||||
"generatedAt": "2025-03-21T20:01:53.007Z",
|
||||
"tasksAnalyzed": 20,
|
||||
"thresholdScore": 5,
|
||||
"projectName": "Your Project Name",
|
||||
"usedResearch": true
|
||||
},
|
||||
"complexityAnalysis": [
|
||||
{
|
||||
"taskId": 1,
|
||||
"taskTitle": "Implement Task Data Structure",
|
||||
"complexityScore": 8,
|
||||
"recommendedSubtasks": 5,
|
||||
"expansionPrompt": "Break down the task of creating the tasks.json structure into subtasks focusing on schema design, model creation, validation, file operations, and error handling.",
|
||||
"reasoning": "This task involves multiple critical components including schema design, model creation, and file operations, each requiring detailed attention and validation."
|
||||
},
|
||||
{
|
||||
"taskId": 2,
|
||||
"taskTitle": "Develop Command Line Interface Foundation",
|
||||
"complexityScore": 7,
|
||||
"recommendedSubtasks": 5,
|
||||
"expansionPrompt": "Divide the CLI development into subtasks such as command parsing, help documentation, console output, logging system, and global options handling.",
|
||||
"reasoning": "Creating a CLI involves several distinct functionalities that need to be implemented and integrated, each contributing to the overall complexity."
|
||||
},
|
||||
{
|
||||
"taskId": 3,
|
||||
"taskTitle": "Implement Basic Task Operations",
|
||||
"complexityScore": 9,
|
||||
"recommendedSubtasks": 6,
|
||||
"expansionPrompt": "Break down the task operations into subtasks including listing, creating, updating, deleting, status changes, dependency management, and priority handling.",
|
||||
"reasoning": "This task requires implementing a wide range of operations, each with its own logic and dependencies, increasing the complexity."
|
||||
},
|
||||
{
|
||||
"taskId": 4,
|
||||
"taskTitle": "Create Task File Generation System",
|
||||
"complexityScore": 7,
|
||||
"recommendedSubtasks": 5,
|
||||
"expansionPrompt": "Divide the file generation system into subtasks such as template creation, file generation, synchronization, file naming, and update handling.",
|
||||
"reasoning": "The task involves creating a system that generates and synchronizes files, requiring careful handling of templates and updates."
|
||||
},
|
||||
{
|
||||
"taskId": 5,
|
||||
"taskTitle": "Integrate Anthropic Claude API",
|
||||
"complexityScore": 8,
|
||||
"recommendedSubtasks": 6,
|
||||
"expansionPrompt": "Break down the API integration into subtasks including authentication, prompt templates, response handling, error management, token tracking, and model configuration.",
|
||||
"reasoning": "Integrating an external API involves multiple steps from authentication to response handling, each adding to the complexity."
|
||||
},
|
||||
{
|
||||
"taskId": 6,
|
||||
"taskTitle": "Build PRD Parsing System",
|
||||
"complexityScore": 9,
|
||||
"recommendedSubtasks": 6,
|
||||
"expansionPrompt": "Divide the PRD parsing system into subtasks such as file reading, prompt engineering, task conversion, dependency inference, priority assignment, and chunking.",
|
||||
"reasoning": "Parsing PRDs and converting them into tasks requires handling various complexities including dependency inference and priority assignment."
|
||||
},
|
||||
{
|
||||
"taskId": 7,
|
||||
"taskTitle": "Implement Task Expansion with Claude",
|
||||
"complexityScore": 7,
|
||||
"recommendedSubtasks": 5,
|
||||
"expansionPrompt": "Break down the task expansion into subtasks including prompt creation, workflow implementation, context-aware expansion, relationship management, subtask specification, and regeneration.",
|
||||
"reasoning": "Expanding tasks into subtasks using AI involves creating prompts and managing relationships, adding to the complexity."
|
||||
},
|
||||
{
|
||||
"taskId": 8,
|
||||
"taskTitle": "Develop Implementation Drift Handling",
|
||||
"complexityScore": 8,
|
||||
"recommendedSubtasks": 5,
|
||||
"expansionPrompt": "Divide the drift handling into subtasks including task updates, rewriting, dependency chain updates, completed work preservation, and update analysis.",
|
||||
"reasoning": "Handling implementation drift requires updating tasks and dependencies while preserving completed work, increasing complexity."
|
||||
},
|
||||
{
|
||||
"taskId": 9,
|
||||
"taskTitle": "Integrate Perplexity API",
|
||||
"complexityScore": 7,
|
||||
"recommendedSubtasks": 5,
|
||||
"expansionPrompt": "Break down the API integration into subtasks including authentication, prompt templates, response handling, fallback logic, quality comparison, and model selection.",
|
||||
"reasoning": "Integrating another external API involves similar complexities as the Claude API integration, including authentication and response handling."
|
||||
},
|
||||
{
|
||||
"taskId": 10,
|
||||
"taskTitle": "Create Research-Backed Subtask Generation",
|
||||
"complexityScore": 8,
|
||||
"recommendedSubtasks": 6,
|
||||
"expansionPrompt": "Divide the research-backed generation into subtasks including prompt creation, context enrichment, domain knowledge incorporation, detailed generation, and reference inclusion.",
|
||||
"reasoning": "Enhancing subtask generation with research requires handling domain-specific knowledge and context enrichment, adding complexity."
|
||||
},
|
||||
{
|
||||
"taskId": 11,
|
||||
"taskTitle": "Implement Batch Operations",
|
||||
"complexityScore": 7,
|
||||
"recommendedSubtasks": 5,
|
||||
"expansionPrompt": "Break down the batch operations into subtasks including status updates, subtask generation, task filtering, dependency management, prioritization, and command creation.",
|
||||
"reasoning": "Implementing batch operations involves handling multiple tasks simultaneously, each with its own set of operations."
|
||||
},
|
||||
{
|
||||
"taskId": 12,
|
||||
"taskTitle": "Develop Project Initialization System",
|
||||
"complexityScore": 8,
|
||||
"recommendedSubtasks": 6,
|
||||
"expansionPrompt": "Divide the project initialization into subtasks including templating, setup wizard, environment configuration, directory structure, example tasks, and default configuration.",
|
||||
"reasoning": "Creating a project initialization system involves setting up multiple components and configurations, increasing complexity."
|
||||
},
|
||||
{
|
||||
"taskId": 13,
|
||||
"taskTitle": "Create Cursor Rules Implementation",
|
||||
"complexityScore": 7,
|
||||
"recommendedSubtasks": 5,
|
||||
"expansionPrompt": "Break down the Cursor rules implementation into subtasks including documentation creation, rule implementation, directory setup, and integration documentation.",
|
||||
"reasoning": "Implementing Cursor rules involves creating documentation and setting up directory structures, adding to the complexity."
|
||||
},
|
||||
{
|
||||
"taskId": 14,
|
||||
"taskTitle": "Develop Agent Workflow Guidelines",
|
||||
"complexityScore": 7,
|
||||
"recommendedSubtasks": 5,
|
||||
"expansionPrompt": "Divide the agent workflow guidelines into subtasks including task discovery, selection, implementation, verification, prioritization, and dependency handling.",
|
||||
"reasoning": "Creating guidelines for AI agents involves defining workflows and handling dependencies, increasing complexity."
|
||||
},
|
||||
{
|
||||
"taskId": 15,
|
||||
"taskTitle": "Implement Agent Command Integration",
|
||||
"complexityScore": 8,
|
||||
"recommendedSubtasks": 6,
|
||||
"expansionPrompt": "Break down the agent command integration into subtasks including command syntax, example interactions, response patterns, context management, special flags, and output interpretation.",
|
||||
"reasoning": "Integrating commands for AI agents involves handling syntax, responses, and context, adding to the complexity."
|
||||
},
|
||||
{
|
||||
"taskId": 16,
|
||||
"taskTitle": "Create Configuration Management System",
|
||||
"complexityScore": 8,
|
||||
"recommendedSubtasks": 6,
|
||||
"expansionPrompt": "Divide the configuration management into subtasks including environment handling, .env support, validation, defaults, template creation, documentation, and API key security.",
|
||||
"reasoning": "Implementing a robust configuration system involves handling environment variables, validation, and security, increasing complexity."
|
||||
},
|
||||
{
|
||||
"taskId": 17,
|
||||
"taskTitle": "Implement Comprehensive Logging System",
|
||||
"complexityScore": 7,
|
||||
"recommendedSubtasks": 5,
|
||||
"expansionPrompt": "Break down the logging system into subtasks including log levels, output destinations, command logging, API logging, error tracking, metrics, and file rotation.",
|
||||
"reasoning": "Creating a logging system involves implementing multiple log levels and destinations, adding to the complexity."
|
||||
},
|
||||
{
|
||||
"taskId": 18,
|
||||
"taskTitle": "Create Comprehensive User Documentation",
|
||||
"complexityScore": 8,
|
||||
"recommendedSubtasks": 6,
|
||||
"expansionPrompt": "Divide the user documentation into subtasks including README creation, command reference, configuration guide, examples, troubleshooting, API documentation, and best practices.",
|
||||
"reasoning": "Developing comprehensive documentation involves covering multiple aspects of the system, increasing complexity."
|
||||
},
|
||||
{
|
||||
"taskId": 19,
|
||||
"taskTitle": "Implement Error Handling and Recovery",
|
||||
"complexityScore": 8,
|
||||
"recommendedSubtasks": 6,
|
||||
"expansionPrompt": "Break down the implementation of error handling and recovery into 6 subtasks, focusing on different aspects like message formatting, API handling, file system recovery, data validation, command errors, and system state recovery. For each subtask, specify the key components to implement and any specific techniques or best practices to consider.",
|
||||
"reasoning": "High complexity due to system-wide implementation, multiple error types, and recovery mechanisms. Requires careful design and integration across various system components."
|
||||
},
|
||||
{
|
||||
"taskId": 20,
|
||||
"taskTitle": "Create Token Usage Tracking and Cost Management",
|
||||
"complexityScore": 7,
|
||||
"recommendedSubtasks": 5,
|
||||
"expansionPrompt": "Divide the token usage tracking and cost management system into 5 subtasks, covering usage tracking implementation, limit configuration, reporting and cost estimation, caching and optimization, and alert system development. For each subtask, outline the main features to implement and any key considerations for effective integration with the existing system.",
|
||||
"reasoning": "Moderate to high complexity due to the need for accurate tracking, optimization strategies, and integration with existing API systems. Involves both data processing and user-facing features."
|
||||
}
|
||||
]
|
||||
}
|
||||
@@ -361,8 +361,80 @@ Please mark it as complete and tell me what I should work on next.
|
||||
|
||||
## Documentation
|
||||
|
||||
For more detailed documentation on the scripts, see the [scripts/README.md](scripts/README.md) file in your initialized project.
|
||||
For more detailed documentation on the scripts and command-line options, see the [scripts/README.md](scripts/README.md) file in your initialized project.
|
||||
|
||||
## License
|
||||
|
||||
MIT
|
||||
|
||||
### Analyzing Task Complexity
|
||||
|
||||
To analyze the complexity of tasks and automatically generate expansion recommendations:
|
||||
|
||||
```bash
|
||||
npm run dev -- analyze-complexity
|
||||
```
|
||||
|
||||
This command:
|
||||
- Analyzes each task using AI to assess its complexity
|
||||
- Recommends optimal number of subtasks based on configured DEFAULT_SUBTASKS
|
||||
- Generates tailored prompts for expanding each task
|
||||
- Creates a comprehensive JSON report with ready-to-use commands
|
||||
- Saves the report to scripts/task-complexity-report.json by default
|
||||
|
||||
Options:
|
||||
```bash
|
||||
# Save report to a custom location
|
||||
npm run dev -- analyze-complexity --output=my-report.json
|
||||
|
||||
# Use a specific LLM model
|
||||
npm run dev -- analyze-complexity --model=claude-3-opus-20240229
|
||||
|
||||
# Set a custom complexity threshold (1-10)
|
||||
npm run dev -- analyze-complexity --threshold=6
|
||||
|
||||
# Use an alternative tasks file
|
||||
npm run dev -- analyze-complexity --file=custom-tasks.json
|
||||
|
||||
# Use Perplexity AI for research-backed complexity analysis
|
||||
npm run dev -- analyze-complexity --research
|
||||
```
|
||||
|
||||
The generated report contains:
|
||||
- Complexity analysis for each task (scored 1-10)
|
||||
- Recommended number of subtasks based on complexity
|
||||
- AI-generated expansion prompts customized for each task
|
||||
- Ready-to-run expansion commands directly within each task analysis
|
||||
|
||||
### Smart Task Expansion
|
||||
|
||||
The `expand` command now automatically checks for and uses the complexity report:
|
||||
|
||||
```bash
|
||||
# Expand a task, using complexity report recommendations if available
|
||||
npm run dev -- expand --id=8
|
||||
|
||||
# Expand all tasks, prioritizing by complexity score if a report exists
|
||||
npm run dev -- expand --all
|
||||
```
|
||||
|
||||
When a complexity report exists:
|
||||
- Tasks are automatically expanded using the recommended subtask count and prompts
|
||||
- When expanding all tasks, they're processed in order of complexity (highest first)
|
||||
- Research-backed generation is preserved from the complexity analysis
|
||||
- You can still override recommendations with explicit command-line options
|
||||
|
||||
Example workflow:
|
||||
```bash
|
||||
# Generate the complexity analysis report with research capabilities
|
||||
npm run dev -- analyze-complexity --research
|
||||
|
||||
# Review the report in scripts/task-complexity-report.json
|
||||
|
||||
# Expand tasks using the optimized recommendations
|
||||
npm run dev -- expand --id=8
|
||||
# or expand all tasks
|
||||
npm run dev -- expand --all
|
||||
```
|
||||
|
||||
This integration ensures that task expansion is informed by thorough complexity analysis, resulting in better subtask organization and more efficient development.
|
||||
|
||||
747 templates/dev.js
@@ -31,6 +31,26 @@
|
||||
* -> Use --no-research to disable research-backed generation.
|
||||
* -> Add --force when using --all to regenerate subtasks for tasks that already have them.
|
||||
* -> Note: Tasks marked as 'done' or 'completed' are always skipped.
|
||||
* -> If a complexity report exists for the specified task, its recommended
|
||||
* subtask count and expansion prompt will be used (unless overridden).
|
||||
*
|
||||
* 7) analyze-complexity [options]
|
||||
* -> Analyzes task complexity and generates expansion recommendations
|
||||
* -> Generates a report in scripts/task-complexity-report.json by default
|
||||
* -> Uses configured LLM to assess task complexity and create tailored expansion prompts
|
||||
* -> Can use Perplexity AI for research-backed analysis with --research flag
|
||||
* -> Each task includes:
|
||||
* - Complexity score (1-10)
|
||||
* - Recommended number of subtasks (based on DEFAULT_SUBTASKS config)
|
||||
* - Detailed expansion prompt
|
||||
* - Reasoning for complexity assessment
|
||||
* - Ready-to-run expansion command
|
||||
* -> Options:
|
||||
* --output, -o <file>: Specify output file path (default: 'scripts/task-complexity-report.json')
|
||||
* --model, -m <model>: Override LLM model to use for analysis
|
||||
* --threshold, -t <number>: Set minimum complexity score (1-10) for expansion recommendation (default: 5)
|
||||
* --file, -f <path>: Use alternative tasks.json file instead of default
|
||||
* --research, -r: Use Perplexity AI for research-backed complexity analysis
|
||||
*
|
||||
* Usage examples:
|
||||
* node dev.js parse-prd --input=sample-prd.txt
|
||||
@@ -43,6 +63,10 @@
|
||||
* node dev.js expand --id=3 --no-research
|
||||
* node dev.js expand --all
|
||||
* node dev.js expand --all --force
|
||||
* node dev.js analyze-complexity
|
||||
* node dev.js analyze-complexity --output=custom-report.json
|
||||
* node dev.js analyze-complexity --threshold=6 --model=claude-3.7-sonnet
|
||||
* node dev.js analyze-complexity --research
|
||||
*/
|
||||
|
||||
import fs from 'fs';
|
||||
@@ -662,6 +686,17 @@ function setTaskStatus(tasksPath, taskIdInput, newStatus) {
|
||||
const oldStatus = task.status || 'pending';
|
||||
task.status = newStatus;
|
||||
|
||||
// Automatically update subtasks if the parent task is being marked as done
|
||||
if (newStatus === 'done' && task.subtasks && Array.isArray(task.subtasks) && task.subtasks.length > 0) {
|
||||
log('info', `Task ${taskId} has ${task.subtasks.length} subtasks that will be marked as done too.`);
|
||||
|
||||
task.subtasks.forEach(subtask => {
|
||||
const oldSubtaskStatus = subtask.status || 'pending';
|
||||
subtask.status = newStatus;
|
||||
log('info', ` └─ Updated subtask ${taskId}.${subtask.id} status from '${oldSubtaskStatus}' to '${newStatus}'`);
|
||||
});
|
||||
}
|
||||
|
||||
// Save the changes
|
||||
writeJSON(tasksPath, data);
|
||||
log('info', `Updated task ${taskId} status from '${oldStatus}' to '${newStatus}'`);
|
||||
@@ -728,6 +763,29 @@ async function expandTask(taskId, numSubtasks = CONFIG.defaultSubtasks, useResea
|
||||
return;
|
||||
}
|
||||
|
||||
// Check for complexity report
|
||||
const complexityReport = readComplexityReport();
|
||||
let recommendedSubtasks = numSubtasks;
|
||||
let recommendedPrompt = additionalContext;
|
||||
|
||||
// If report exists and has data for this task, use it
|
||||
if (complexityReport) {
|
||||
const taskAnalysis = findTaskInComplexityReport(complexityReport, parseInt(taskId));
|
||||
if (taskAnalysis) {
|
||||
// Only use report values if not explicitly overridden by command line
|
||||
if (numSubtasks === CONFIG.defaultSubtasks && taskAnalysis.recommendedSubtasks) {
|
||||
recommendedSubtasks = taskAnalysis.recommendedSubtasks;
|
||||
console.log(chalk.blue(`Using recommended subtask count from complexity analysis: ${recommendedSubtasks}`));
|
||||
}
|
||||
|
||||
if (!additionalContext && taskAnalysis.expansionPrompt) {
|
||||
recommendedPrompt = taskAnalysis.expansionPrompt;
|
||||
console.log(chalk.blue(`Using recommended prompt from complexity analysis`));
|
||||
console.log(chalk.gray(`Prompt: ${recommendedPrompt.substring(0, 100)}...`));
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
// Initialize subtasks array if it doesn't exist
|
||||
if (!task.subtasks) {
|
||||
task.subtasks = [];
|
||||
@@ -742,9 +800,9 @@ async function expandTask(taskId, numSubtasks = CONFIG.defaultSubtasks, useResea
|
||||
let subtasks;
|
||||
if (useResearch) {
|
||||
console.log(chalk.blue(`Using Perplexity AI for research-backed subtask generation...`));
|
||||
subtasks = await generateSubtasksWithPerplexity(task, numSubtasks, nextSubtaskId, additionalContext);
|
||||
subtasks = await generateSubtasksWithPerplexity(task, recommendedSubtasks, nextSubtaskId, recommendedPrompt);
|
||||
} else {
|
||||
subtasks = await generateSubtasks(task, numSubtasks, nextSubtaskId, additionalContext);
|
||||
subtasks = await generateSubtasks(task, recommendedSubtasks, nextSubtaskId, recommendedPrompt);
|
||||
}
|
||||
|
||||
// Add the subtasks to the task
|
||||
@@ -785,7 +843,7 @@ async function expandAllTasks(numSubtasks = CONFIG.defaultSubtasks, useResearch
|
||||
}
|
||||
|
||||
// Filter tasks that are not completed
|
||||
const tasksToExpand = tasksData.tasks.filter(task =>
|
||||
let tasksToExpand = tasksData.tasks.filter(task =>
|
||||
task.status !== 'completed' && task.status !== 'done'
|
||||
);
|
||||
|
||||
@@ -794,18 +852,51 @@ async function expandAllTasks(numSubtasks = CONFIG.defaultSubtasks, useResearch
|
||||
return 0;
|
||||
}
|
||||
|
||||
console.log(chalk.blue(`Expanding ${tasksToExpand.length} tasks with ${numSubtasks} subtasks each...`));
|
||||
// Check for complexity report
|
||||
const complexityReport = readComplexityReport();
|
||||
let usedComplexityReport = false;
|
||||
|
||||
// If complexity report exists, sort tasks by complexity
|
||||
if (complexityReport && complexityReport.complexityAnalysis) {
|
||||
console.log(chalk.blue('Found complexity report. Prioritizing tasks by complexity score.'));
|
||||
usedComplexityReport = true;
|
||||
|
||||
// Create a map of task IDs to their complexity scores
|
||||
const complexityMap = new Map();
|
||||
complexityReport.complexityAnalysis.forEach(analysis => {
|
||||
complexityMap.set(analysis.taskId, analysis.complexityScore);
|
||||
});
|
||||
|
||||
// Sort tasks by complexity score (highest first)
|
||||
tasksToExpand.sort((a, b) => {
|
||||
const scoreA = complexityMap.get(a.id) || 0;
|
||||
const scoreB = complexityMap.get(b.id) || 0;
|
||||
return scoreB - scoreA;
|
||||
});
|
||||
|
||||
// Log the sorted tasks
|
||||
console.log(chalk.blue('Tasks will be expanded in this order (by complexity):'));
|
||||
tasksToExpand.forEach(task => {
|
||||
const score = complexityMap.get(task.id) || 'N/A';
|
||||
console.log(chalk.blue(` Task ${task.id}: ${task.title} (Complexity: ${score})`));
|
||||
});
|
||||
}
|
||||
|
||||
console.log(chalk.blue(`\nExpanding ${tasksToExpand.length} tasks...`));
|
||||
|
||||
let tasksExpanded = 0;
|
||||
|
||||
// Expand each task
|
||||
for (const task of tasksToExpand) {
|
||||
console.log(chalk.blue(`\nExpanding task ${task.id}: ${task.title}`));
|
||||
|
||||
// The check for usedComplexityReport is redundant since expandTask will handle it anyway
|
||||
await expandTask(task.id, numSubtasks, useResearch, additionalContext);
|
||||
|
||||
tasksExpanded++;
|
||||
}
|
||||
|
||||
console.log(chalk.green(`\nExpanded ${tasksExpanded} tasks with ${numSubtasks} subtasks each.`));
|
||||
console.log(chalk.green(`\nExpanded ${tasksExpanded} tasks.`));
|
||||
return tasksExpanded;
|
||||
} catch (error) {
|
||||
console.error(chalk.red('Error expanding all tasks:'), error);
|
||||
@@ -1192,10 +1283,16 @@ Research the task thoroughly and ensure the subtasks are comprehensive, specific
|
||||
console.log(chalk.blue('Using Perplexity AI for research-backed subtask generation...'));
|
||||
const result = await perplexity.chat.completions.create({
|
||||
model: PERPLEXITY_MODEL,
|
||||
messages: [{
|
||||
role: "user",
|
||||
content: prompt
|
||||
}],
|
||||
messages: [
|
||||
{
|
||||
role: "system",
|
||||
content: "You are a technical analysis AI that only responds with clean, valid JSON. Never include explanatory text or markdown formatting in your response."
|
||||
},
|
||||
{
|
||||
role: "user",
|
||||
content: researchPrompt
|
||||
}
|
||||
],
|
||||
temperature: TEMPERATURE,
|
||||
max_tokens: MAX_TOKENS,
|
||||
});
|
||||
@@ -1402,9 +1499,641 @@ async function main() {
|
||||
}
|
||||
});
|
||||
|
||||
program
|
||||
.command('analyze-complexity')
|
||||
.description('Analyze tasks and generate complexity-based expansion recommendations')
|
||||
.option('-o, --output <file>', 'Output file path for the report', 'scripts/task-complexity-report.json')
|
||||
.option('-m, --model <model>', 'LLM model to use for analysis (defaults to configured model)')
|
||||
.option('-t, --threshold <number>', 'Minimum complexity score to recommend expansion (1-10)', '5')
|
||||
.option('-f, --file <file>', 'Path to the tasks file', 'tasks/tasks.json')
|
||||
.option('-r, --research', 'Use Perplexity AI for research-backed complexity analysis')
|
||||
.action(async (options) => {
|
||||
const tasksPath = options.file || 'tasks/tasks.json';
|
||||
const outputPath = options.output;
|
||||
const modelOverride = options.model;
|
||||
const thresholdScore = parseFloat(options.threshold);
|
||||
const useResearch = options.research || false;
|
||||
|
||||
console.log(chalk.blue(`Analyzing task complexity from: ${tasksPath}`));
|
||||
console.log(chalk.blue(`Output report will be saved to: ${outputPath}`));
|
||||
|
||||
if (useResearch) {
|
||||
console.log(chalk.blue('Using Perplexity AI for research-backed complexity analysis'));
|
||||
}
|
||||
|
||||
await analyzeTaskComplexity(options);
|
||||
});
|
||||
|
||||
await program.parseAsync(process.argv);
|
||||
}
|
||||
|
||||
/**
 * Analyzes task complexity and generates expansion recommendations
 * @param {Object} options Command options
 */
async function analyzeTaskComplexity(options) {
  const tasksPath = options.file || 'tasks/tasks.json';
  const outputPath = options.output || 'scripts/task-complexity-report.json';
  const modelOverride = options.model;
  const thresholdScore = parseFloat(options.threshold || '5');
  const useResearch = options.research || false;

  console.log(chalk.blue(`Analyzing task complexity and generating expansion recommendations...`));

  try {
    // Read tasks.json
    console.log(chalk.blue(`Reading tasks from ${tasksPath}...`));
    const tasksData = readJSON(tasksPath);

    if (!tasksData || !tasksData.tasks || !Array.isArray(tasksData.tasks) || tasksData.tasks.length === 0) {
      throw new Error('No tasks found in the tasks file');
    }

    console.log(chalk.blue(`Found ${tasksData.tasks.length} tasks to analyze.`));

    // Prepare the prompt for the LLM
    const prompt = generateComplexityAnalysisPrompt(tasksData);

    // Start loading indicator
    const loadingIndicator = startLoadingIndicator('Calling AI to analyze task complexity...');

    let fullResponse = '';
    let streamingInterval = null;

    try {
      // If research flag is set, use Perplexity first
      if (useResearch) {
        try {
          console.log(chalk.blue('Using Perplexity AI for research-backed complexity analysis...'));

          // Modify prompt to include more context for Perplexity and explicitly request JSON
          const researchPrompt = `You are conducting a detailed analysis of software development tasks to determine their complexity and how they should be broken down into subtasks.

Please research each task thoroughly, considering best practices, industry standards, and potential implementation challenges before providing your analysis.

CRITICAL: You MUST respond ONLY with a valid JSON array. Do not include ANY explanatory text, markdown formatting, or code block markers.

${prompt}

Your response must be a clean JSON array only, following exactly this format:
[
  {
    "taskId": 1,
    "taskTitle": "Example Task",
    "complexityScore": 7,
    "recommendedSubtasks": 4,
    "expansionPrompt": "Detailed prompt for expansion",
    "reasoning": "Explanation of complexity assessment"
  },
  // more tasks...
]

DO NOT include any text before or after the JSON array. No explanations, no markdown formatting.`;

          const result = await perplexity.chat.completions.create({
            model: PERPLEXITY_MODEL,
            messages: [
              {
                role: "system",
                content: "You are a technical analysis AI that only responds with clean, valid JSON. Never include explanatory text or markdown formatting in your response."
              },
              {
                role: "user",
                content: researchPrompt
              }
            ],
            temperature: TEMPERATURE,
            max_tokens: MAX_TOKENS,
          });

          // Extract the response text
          fullResponse = result.choices[0].message.content;
          console.log(chalk.green('Successfully generated complexity analysis with Perplexity AI'));

          if (streamingInterval) clearInterval(streamingInterval);
          stopLoadingIndicator(loadingIndicator);

          // ALWAYS log the first part of the response for debugging
          console.log(chalk.gray('Response first 200 chars:'));
          console.log(chalk.gray(fullResponse.substring(0, 200)));
        } catch (perplexityError) {
          console.log(chalk.yellow('Falling back to Claude for complexity analysis...'));
          console.log(chalk.gray('Perplexity error:'), perplexityError.message);

          // Continue to Claude as fallback
          await useClaudeForComplexityAnalysis();
        }
      } else {
        // Use Claude directly if research flag is not set
        await useClaudeForComplexityAnalysis();
      }

      // Helper function to use Claude for complexity analysis
      async function useClaudeForComplexityAnalysis() {
        // Call the LLM API with streaming
        const stream = await anthropic.messages.create({
          max_tokens: CONFIG.maxTokens,
          model: modelOverride || CONFIG.model,
          temperature: CONFIG.temperature,
          messages: [{ role: "user", content: prompt }],
          system: "You are an expert software architect and project manager analyzing task complexity. Respond only with valid JSON.",
          stream: true
        });

        // Update loading indicator to show streaming progress
        let dotCount = 0;
        streamingInterval = setInterval(() => {
          readline.cursorTo(process.stdout, 0);
          process.stdout.write(`Receiving streaming response from Claude${'.'.repeat(dotCount)}`);
          dotCount = (dotCount + 1) % 4;
        }, 500);

        // Process the stream
        for await (const chunk of stream) {
          if (chunk.type === 'content_block_delta' && chunk.delta.text) {
            fullResponse += chunk.delta.text;
          }
        }

        clearInterval(streamingInterval);
        stopLoadingIndicator(loadingIndicator);

        console.log(chalk.green("Completed streaming response from Claude API!"));
      }

      // Parse the JSON response
      console.log(chalk.blue(`Parsing complexity analysis...`));
      let complexityAnalysis;
      try {
        // Clean up the response to ensure it's valid JSON
        let cleanedResponse = fullResponse;

        // First check for JSON code blocks (common in markdown responses)
        const codeBlockMatch = fullResponse.match(/```(?:json)?\s*([\s\S]*?)\s*```/);
        if (codeBlockMatch) {
          cleanedResponse = codeBlockMatch[1];
          console.log(chalk.blue("Extracted JSON from code block"));
        } else {
          // Look for a complete JSON array pattern
          // This regex looks for an array of objects starting with [ and ending with ]
          const jsonArrayMatch = fullResponse.match(/(\[\s*\{\s*"[^"]*"\s*:[\s\S]*\}\s*\])/);
          if (jsonArrayMatch) {
            cleanedResponse = jsonArrayMatch[1];
            console.log(chalk.blue("Extracted JSON array pattern"));
          } else {
            // Try to find the start of a JSON array and capture to the end
            const jsonStartMatch = fullResponse.match(/(\[\s*\{[\s\S]*)/);
            if (jsonStartMatch) {
              cleanedResponse = jsonStartMatch[1];
              // Try to find a proper closing to the array
              const properEndMatch = cleanedResponse.match(/([\s\S]*\}\s*\])/);
              if (properEndMatch) {
                cleanedResponse = properEndMatch[1];
              }
              console.log(chalk.blue("Extracted JSON from start of array to end"));
            }
          }
        }

        // Log the cleaned response for debugging
        console.log(chalk.gray("Attempting to parse cleaned JSON..."));
        console.log(chalk.gray("Cleaned response (first 100 chars):"));
        console.log(chalk.gray(cleanedResponse.substring(0, 100)));
        console.log(chalk.gray("Last 100 chars:"));
        console.log(chalk.gray(cleanedResponse.substring(cleanedResponse.length - 100)));

        // More aggressive cleaning - strip any non-JSON content at the beginning or end
        const strictArrayMatch = cleanedResponse.match(/(\[\s*\{[\s\S]*\}\s*\])/);
        if (strictArrayMatch) {
          cleanedResponse = strictArrayMatch[1];
          console.log(chalk.blue("Applied strict JSON array extraction"));
        }

        try {
          complexityAnalysis = JSON.parse(cleanedResponse);
        } catch (jsonError) {
          console.log(chalk.yellow("Initial JSON parsing failed, attempting to fix common JSON issues..."));

          // Try to fix common JSON issues
          // 1. Remove any trailing commas in arrays or objects
          cleanedResponse = cleanedResponse.replace(/,(\s*[\]}])/g, '$1');

          // 2. Ensure property names are double-quoted
          cleanedResponse = cleanedResponse.replace(/(\s*)(\w+)(\s*):(\s*)/g, '$1"$2"$3:$4');

          // 3. Replace single quotes with double quotes for property values
          cleanedResponse = cleanedResponse.replace(/:(\s*)'([^']*)'(\s*[,}])/g, ':$1"$2"$3');

          // 4. Add a special fallback option if we're still having issues
          try {
            complexityAnalysis = JSON.parse(cleanedResponse);
            console.log(chalk.green("Successfully parsed JSON after fixing common issues"));
          } catch (fixedJsonError) {
            console.log(chalk.red("Failed to parse JSON even after fixes, attempting more aggressive cleanup..."));

            // Try to extract and process each task individually
            try {
              const taskMatches = cleanedResponse.match(/\{\s*"taskId"\s*:\s*(\d+)[^}]*\}/g);
              if (taskMatches && taskMatches.length > 0) {
                console.log(chalk.yellow(`Found ${taskMatches.length} task objects, attempting to process individually`));

                complexityAnalysis = [];
                for (const taskMatch of taskMatches) {
                  try {
                    // Try to parse each task object individually
                    const fixedTask = taskMatch.replace(/,\s*$/, ''); // Remove trailing commas
                    const taskObj = JSON.parse(fixedTask);
                    if (taskObj && taskObj.taskId) {
                      complexityAnalysis.push(taskObj);
                    }
                  } catch (taskParseError) {
                    console.log(chalk.yellow(`Could not parse individual task: ${taskMatch.substring(0, 30)}...`));
                  }
                }

                if (complexityAnalysis.length > 0) {
                  console.log(chalk.green(`Successfully parsed ${complexityAnalysis.length} tasks individually`));
                } else {
                  throw new Error("Could not parse any tasks individually");
                }
              } else {
                throw fixedJsonError;
              }
            } catch (individualError) {
              console.log(chalk.red("All parsing attempts failed"));
              throw jsonError; // throw the original error
            }
          }
        }

        // Ensure complexityAnalysis is an array
        if (!Array.isArray(complexityAnalysis)) {
          console.log(chalk.yellow('Response is not an array, checking if it contains an array property...'));

          // Handle the case where the response might be an object with an array property
          if (complexityAnalysis.tasks || complexityAnalysis.analysis || complexityAnalysis.results) {
            complexityAnalysis = complexityAnalysis.tasks || complexityAnalysis.analysis || complexityAnalysis.results;
          } else {
            // If no recognizable array property, wrap it as an array if it's an object
            if (typeof complexityAnalysis === 'object' && complexityAnalysis !== null) {
              console.log(chalk.yellow('Converting object to array...'));
              complexityAnalysis = [complexityAnalysis];
            } else {
              throw new Error('Response does not contain a valid array or object');
            }
          }
        }

        // Final check to ensure we have an array
        if (!Array.isArray(complexityAnalysis)) {
          throw new Error('Failed to extract an array from the response');
        }

        // Check that we have an analysis for each task in the input file
        const taskIds = tasksData.tasks.map(t => t.id);
        const analysisTaskIds = complexityAnalysis.map(a => a.taskId);
        const missingTaskIds = taskIds.filter(id => !analysisTaskIds.includes(id));

        if (missingTaskIds.length > 0) {
          console.log(chalk.yellow(`Missing analysis for ${missingTaskIds.length} tasks: ${missingTaskIds.join(', ')}`));
          console.log(chalk.blue(`Attempting to analyze missing tasks...`));

          // Create a subset of tasksData with just the missing tasks
          const missingTasks = {
            meta: tasksData.meta,
            tasks: tasksData.tasks.filter(t => missingTaskIds.includes(t.id))
          };

          // Generate a prompt for just the missing tasks
          const missingTasksPrompt = generateComplexityAnalysisPrompt(missingTasks);

          // Call the same AI model to analyze the missing tasks
          let missingAnalysisResponse = '';

          try {
            // Start a new loading indicator
            const missingTasksLoadingIndicator = startLoadingIndicator('Analyzing missing tasks...');

            // Use the same AI model as the original analysis
            if (useResearch) {
              // Create the same research prompt but for missing tasks
              const missingTasksResearchPrompt = `You are conducting a detailed analysis of software development tasks to determine their complexity and how they should be broken down into subtasks.

Please research each task thoroughly, considering best practices, industry standards, and potential implementation challenges before providing your analysis.

CRITICAL: You MUST respond ONLY with a valid JSON array. Do not include ANY explanatory text, markdown formatting, or code block markers.

${missingTasksPrompt}

Your response must be a clean JSON array only, following exactly this format:
[
  {
    "taskId": 1,
    "taskTitle": "Example Task",
    "complexityScore": 7,
    "recommendedSubtasks": 4,
    "expansionPrompt": "Detailed prompt for expansion",
    "reasoning": "Explanation of complexity assessment"
  },
  // more tasks...
]

DO NOT include any text before or after the JSON array. No explanations, no markdown formatting.`;

              const result = await perplexity.chat.completions.create({
                model: PERPLEXITY_MODEL,
                messages: [
                  {
                    role: "system",
                    content: "You are a technical analysis AI that only responds with clean, valid JSON. Never include explanatory text or markdown formatting in your response."
                  },
                  {
                    role: "user",
                    content: missingTasksResearchPrompt
                  }
                ],
                temperature: TEMPERATURE,
                max_tokens: MAX_TOKENS,
              });

              // Extract the response
              missingAnalysisResponse = result.choices[0].message.content;
            } else {
              // Use Claude
              const stream = await anthropic.messages.create({
                max_tokens: CONFIG.maxTokens,
                model: modelOverride || CONFIG.model,
                temperature: CONFIG.temperature,
                messages: [{ role: "user", content: missingTasksPrompt }],
                system: "You are an expert software architect and project manager analyzing task complexity. Respond only with valid JSON.",
                stream: true
              });

              // Process the stream
              for await (const chunk of stream) {
                if (chunk.type === 'content_block_delta' && chunk.delta.text) {
                  missingAnalysisResponse += chunk.delta.text;
                }
              }
            }

            // Stop the loading indicator
            stopLoadingIndicator(missingTasksLoadingIndicator);

            // Parse the response using the same parsing logic as before
            let missingAnalysis;
            try {
              // Clean up the response to ensure it's valid JSON (using same logic as above)
              let cleanedResponse = missingAnalysisResponse;

              // Use the same JSON extraction logic as before
              // First check for JSON code blocks
              const codeBlockMatch = missingAnalysisResponse.match(/```(?:json)?\s*([\s\S]*?)\s*```/);
              if (codeBlockMatch) {
                cleanedResponse = codeBlockMatch[1];
                console.log(chalk.blue("Extracted JSON from code block for missing tasks"));
              } else {
                // Look for a complete JSON array pattern
                const jsonArrayMatch = missingAnalysisResponse.match(/(\[\s*\{\s*"[^"]*"\s*:[\s\S]*\}\s*\])/);
                if (jsonArrayMatch) {
                  cleanedResponse = jsonArrayMatch[1];
                  console.log(chalk.blue("Extracted JSON array pattern for missing tasks"));
                } else {
                  // Try to find the start of a JSON array and capture to the end
                  const jsonStartMatch = missingAnalysisResponse.match(/(\[\s*\{[\s\S]*)/);
                  if (jsonStartMatch) {
                    cleanedResponse = jsonStartMatch[1];
                    // Try to find a proper closing to the array
                    const properEndMatch = cleanedResponse.match(/([\s\S]*\}\s*\])/);
                    if (properEndMatch) {
                      cleanedResponse = properEndMatch[1];
                    }
                    console.log(chalk.blue("Extracted JSON from start of array to end for missing tasks"));
                  }
                }
              }

              // More aggressive cleaning if needed
              const strictArrayMatch = cleanedResponse.match(/(\[\s*\{[\s\S]*\}\s*\])/);
              if (strictArrayMatch) {
                cleanedResponse = strictArrayMatch[1];
                console.log(chalk.blue("Applied strict JSON array extraction for missing tasks"));
              }

              try {
                missingAnalysis = JSON.parse(cleanedResponse);
              } catch (jsonError) {
                // Try to fix common JSON issues (same as before)
                cleanedResponse = cleanedResponse.replace(/,(\s*[\]}])/g, '$1');
                cleanedResponse = cleanedResponse.replace(/(\s*)(\w+)(\s*):(\s*)/g, '$1"$2"$3:$4');
                cleanedResponse = cleanedResponse.replace(/:(\s*)'([^']*)'(\s*[,}])/g, ':$1"$2"$3');

                try {
                  missingAnalysis = JSON.parse(cleanedResponse);
                  console.log(chalk.green("Successfully parsed JSON for missing tasks after fixing common issues"));
                } catch (fixedJsonError) {
                  // Try the individual task extraction as a last resort
                  console.log(chalk.red("Failed to parse JSON for missing tasks, attempting individual extraction..."));

                  const taskMatches = cleanedResponse.match(/\{\s*"taskId"\s*:\s*(\d+)[^}]*\}/g);
                  if (taskMatches && taskMatches.length > 0) {
                    console.log(chalk.yellow(`Found ${taskMatches.length} task objects, attempting to process individually`));

                    missingAnalysis = [];
                    for (const taskMatch of taskMatches) {
                      try {
                        const fixedTask = taskMatch.replace(/,\s*$/, '');
                        const taskObj = JSON.parse(fixedTask);
                        if (taskObj && taskObj.taskId) {
                          missingAnalysis.push(taskObj);
                        }
                      } catch (taskParseError) {
                        console.log(chalk.yellow(`Could not parse individual task: ${taskMatch.substring(0, 30)}...`));
                      }
                    }

                    if (missingAnalysis.length === 0) {
                      throw new Error("Could not parse any missing tasks");
                    }
                  } else {
                    throw fixedJsonError;
                  }
                }
              }

              // Ensure it's an array
              if (!Array.isArray(missingAnalysis)) {
                if (missingAnalysis && typeof missingAnalysis === 'object') {
                  missingAnalysis = [missingAnalysis];
                } else {
                  throw new Error("Missing tasks analysis is not an array or object");
                }
              }

              // Add the missing analyses to the main analysis array
              console.log(chalk.green(`Successfully analyzed ${missingAnalysis.length} missing tasks`));
              complexityAnalysis = [...complexityAnalysis, ...missingAnalysis];

              // Re-check for missing tasks
              const updatedAnalysisTaskIds = complexityAnalysis.map(a => a.taskId);
              const stillMissingTaskIds = taskIds.filter(id => !updatedAnalysisTaskIds.includes(id));

              if (stillMissingTaskIds.length > 0) {
                console.log(chalk.yellow(`Warning: Still missing analysis for ${stillMissingTaskIds.length} tasks: ${stillMissingTaskIds.join(', ')}`));
              } else {
                console.log(chalk.green(`All tasks now have complexity analysis!`));
              }
            } catch (error) {
              console.error(chalk.red(`Error analyzing missing tasks: ${error.message}`));
              console.log(chalk.yellow(`Continuing with partial analysis...`));
            }
          } catch (error) {
            console.error(chalk.red(`Error during retry for missing tasks: ${error.message}`));
            console.log(chalk.yellow(`Continuing with partial analysis...`));
          }
        }
      } catch (error) {
        console.error(chalk.red(`Failed to parse LLM response as JSON: ${error.message}`));
        if (CONFIG.debug) {
          console.debug(chalk.gray(`Raw response: ${fullResponse}`));
        }
        throw new Error('Invalid response format from LLM. Expected JSON.');
      }

      // Create the final report
      const report = {
        meta: {
          generatedAt: new Date().toISOString(),
          tasksAnalyzed: tasksData.tasks.length,
          thresholdScore: thresholdScore,
          projectName: tasksData.meta?.projectName || 'Your Project Name',
          usedResearch: useResearch
        },
        complexityAnalysis: complexityAnalysis
      };

      // Write the report to file
      console.log(chalk.blue(`Writing complexity report to ${outputPath}...`));
      writeJSON(outputPath, report);

      console.log(chalk.green(`Task complexity analysis complete. Report written to ${outputPath}`));

      // Display a summary of findings
      const highComplexity = complexityAnalysis.filter(t => t.complexityScore >= 8).length;
      const mediumComplexity = complexityAnalysis.filter(t => t.complexityScore >= 5 && t.complexityScore < 8).length;
      const lowComplexity = complexityAnalysis.filter(t => t.complexityScore < 5).length;
      const totalAnalyzed = complexityAnalysis.length;

      console.log('\nComplexity Analysis Summary:');
      console.log('----------------------------');
      console.log(`Tasks in input file: ${tasksData.tasks.length}`);
      console.log(`Tasks successfully analyzed: ${totalAnalyzed}`);
      console.log(`High complexity tasks: ${highComplexity}`);
      console.log(`Medium complexity tasks: ${mediumComplexity}`);
      console.log(`Low complexity tasks: ${lowComplexity}`);
      console.log(`Sum verification: ${highComplexity + mediumComplexity + lowComplexity} (should equal ${totalAnalyzed})`);
      console.log(`Research-backed analysis: ${useResearch ? 'Yes' : 'No'}`);
      console.log(`\nSee ${outputPath} for the full report and expansion commands.`);

    } catch (error) {
      if (streamingInterval) clearInterval(streamingInterval);
      stopLoadingIndicator(loadingIndicator);
      throw error;
    }
  } catch (error) {
    console.error(chalk.red(`Error analyzing task complexity: ${error.message}`));
    process.exit(1);
  }
}

/**
 * Generates the prompt for the LLM to analyze task complexity
 * @param {Object} tasksData The tasks data from tasks.json
 * @returns {string} The prompt for the LLM
 */
function generateComplexityAnalysisPrompt(tasksData) {
  return `
You are an expert software architect and project manager. Your task is to analyze the complexity of development tasks and determine how many subtasks each should be broken down into.

Below is a list of development tasks with their descriptions and details. For each task:
1. Assess its complexity on a scale of 1-10
2. Recommend the optimal number of subtasks (between ${Math.max(3, CONFIG.defaultSubtasks - 1)}-${Math.min(8, CONFIG.defaultSubtasks + 2)})
3. Suggest a specific prompt that would help generate good subtasks for this task
4. Explain your reasoning briefly

Tasks:
${tasksData.tasks.map(task => `
ID: ${task.id}
Title: ${task.title}
Description: ${task.description}
Details: ${task.details}
Dependencies: ${JSON.stringify(task.dependencies || [])}
Priority: ${task.priority || 'medium'}
`).join('\n---\n')}

Analyze each task and return a JSON array with the following structure for each task:
[
  {
    "taskId": number,
    "taskTitle": string,
    "complexityScore": number (1-10),
    "recommendedSubtasks": number (${Math.max(3, CONFIG.defaultSubtasks - 1)}-${Math.min(8, CONFIG.defaultSubtasks + 2)}),
    "expansionPrompt": string (a specific prompt for generating good subtasks),
    "reasoning": string (brief explanation of your assessment)
  },
  ...
]

IMPORTANT: Make sure to include an analysis for EVERY task listed above, with the correct taskId matching each task's ID.
`;
}

/**
 * Sanitizes a prompt string for use in a shell command
 * @param {string} prompt The prompt to sanitize
 * @returns {string} Sanitized prompt
 */
function sanitizePrompt(prompt) {
  // Replace double quotes with escaped double quotes
  return prompt.replace(/"/g, '\\"');
}

/**
 * Reads and parses the complexity report if it exists
 * @param {string} customPath - Optional custom path to the report
 * @returns {Object|null} The parsed complexity report or null if not found
 */
function readComplexityReport(customPath = null) {
  try {
    const reportPath = customPath || path.join(process.cwd(), 'scripts', 'task-complexity-report.json');
    if (!fs.existsSync(reportPath)) {
      return null;
    }

    const reportData = fs.readFileSync(reportPath, 'utf8');
    return JSON.parse(reportData);
  } catch (error) {
    console.log(chalk.yellow(`Could not read complexity report: ${error.message}`));
    return null;
  }
}

/**
 * Finds a task analysis in the complexity report
 * @param {Object} report - The complexity report
 * @param {number} taskId - The task ID to find
 * @returns {Object|null} The task analysis or null if not found
 */
function findTaskInComplexityReport(report, taskId) {
  if (!report || !report.complexityAnalysis || !Array.isArray(report.complexityAnalysis)) {
    return null;
  }

  return report.complexityAnalysis.find(task => task.taskId === taskId);
}
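
// Illustrative usage (not part of the original script): a minimal sketch of how the two
// helpers above might be combined before expanding a task, assuming a report has already
// been generated. The task ID and the surrounding expansion call are assumptions made
// purely for illustration; DEFAULT_SUBTASKS comes from CONFIG as elsewhere in this file.
//
//   const report = readComplexityReport();
//   const analysis = report ? findTaskInComplexityReport(report, 8) : null;
//   const subtaskCount = analysis?.recommendedSubtasks ?? CONFIG.defaultSubtasks;
//   // ...pass subtaskCount (and analysis?.expansionPrompt) to the expand logic...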

main().catch(err => {
  log('error', err);
  process.exit(1);
});
@@ -1,267 +1,152 @@
---
description: guide the Cursor Agent in using the meta-development script (scripts/dev.js). It also defines the overall workflow for reading, updating, and generating tasks during AI-driven development.
globs: scripts/dev.js, tasks.json, tasks/*.txt
description: Guide for using meta-development script (scripts/dev.js) to manage task-driven development workflows
globs: **/*
alwaysApply: true
---
rules:
  - name: "Meta Development Workflow for Cursor Agent"
    description: >
      Provides comprehensive guidelines on how the agent (Cursor) should coordinate
      with the meta task script in scripts/dev.js. The agent will call
      these commands at various points in the coding process to keep
      tasks.json up to date and maintain a single source of truth for development tasks.
    triggers:
      # Potential triggers or states in Cursor where these rules apply.
      # You may list relevant event names, e.g., "onTaskCompletion" or "onUserCommand"
      - always
    steps:
      - "**Initial Setup**: If starting a new project with a PRD document, run `node scripts/dev.js parse-prd --input=<prd-file.txt>` to generate the initial tasks.json file. This will create a structured task list with IDs, titles, descriptions, dependencies, priorities, and test strategies."

      - "**Task Discovery**: When a coding session begins, call `node scripts/dev.js list` to see the current tasks, their status, and IDs. This provides a quick overview of all tasks and their current states (pending, done, deferred)."
- **Development Workflow Process**
  - Start new projects by running `node scripts/dev.js parse-prd --input=<prd-file.txt>` to generate initial tasks.json
  - Begin coding sessions with `node scripts/dev.js list` to see current tasks, status, and IDs
  - Analyze task complexity with `node scripts/dev.js analyze-complexity --research` before breaking down tasks
  - Select tasks based on dependencies (all marked 'done'), priority level, and ID order
  - Clarify tasks by checking task files in tasks/ directory or asking for user input
  - Break down complex tasks using `node scripts/dev.js expand --id=<id>` with appropriate flags
  - Implement code following task details, dependencies, and project standards
  - Verify tasks according to test strategies before marking as complete
  - Mark completed tasks with `node scripts/dev.js set-status --id=<id> --status=done`
  - Update dependent tasks when implementation differs from original plan
  - Generate task files with `node scripts/dev.js generate` after updating tasks.json
  - Respect dependency chains and task priorities when selecting work
  - Report progress regularly using the list command

      - "**Task Selection**: Select the next pending task based on these criteria:
        1. Dependencies: Only select tasks whose dependencies are marked as 'done'
        2. Priority: Choose higher priority tasks first ('high' > 'medium' > 'low')
        3. ID order: When priorities are equal, select the task with the lowest ID
        If multiple tasks are eligible, present options to the user for selection."
- **Task Complexity Analysis**
  - Run `node scripts/dev.js analyze-complexity --research` for comprehensive analysis
  - Review complexity report in scripts/task-complexity-report.json
  - Focus on tasks with highest complexity scores (8-10) for detailed breakdown
  - Use analysis results to determine appropriate subtask allocation
  - Note that reports are automatically used by the expand command

      - "**Task Clarification**: If a task description is unclear or lacks detail:
        1. Check if a corresponding task file exists in the tasks/ directory (e.g., task_001.txt)
        2. If more information is needed, ask the user for clarification
        3. If architectural changes have occurred, run `node scripts/dev.js update --from=<id> --prompt=\"<new architectural context>\"` to update the task and all subsequent tasks"
- **Task Breakdown Process**
  - For tasks with complexity analysis, use `node scripts/dev.js expand --id=<id>`
  - Otherwise use `node scripts/dev.js expand --id=<id> --subtasks=<number>`
  - Add `--research` flag to leverage Perplexity AI for research-backed expansion
  - Use `--prompt="<context>"` to provide additional context when needed
  - Review and adjust generated subtasks as necessary
  - Use `--all` flag to expand multiple pending tasks at once

      - "**Task Breakdown**: For complex tasks that need to be broken down into smaller steps:
        1. Use `node scripts/dev.js expand --id=<id> --subtasks=<number>` to generate detailed subtasks
        2. Optionally provide additional context with `--prompt=\"<context>\"` to guide subtask generation
        3. Review the generated subtasks and adjust if necessary
        4. For multiple tasks, use `--all` flag to expand all pending tasks that don't have subtasks"
- **Implementation Drift Handling**
  - When implementation differs significantly from planned approach
  - When future tasks need modification due to current implementation choices
  - When new dependencies or requirements emerge
  - Call `node scripts/dev.js update --from=<futureTaskId> --prompt="<explanation>"` to update tasks.json

      - "**Task Implementation**: Implement the code necessary for the chosen task. Follow these guidelines:
        1. Reference the task's 'details' section for implementation specifics
        2. Consider dependencies on previous tasks when implementing
        3. Follow the project's coding standards and patterns
        4. Create appropriate tests based on the task's 'testStrategy' field"
- **Task Status Management**
  - Use 'pending' for tasks ready to be worked on
  - Use 'done' for completed and verified tasks
  - Use 'deferred' for postponed tasks
  - Add custom status values as needed for project-specific workflows

      - "**Task Verification**: Before marking a task as done, verify it according to:
        1. The task's specified 'testStrategy'
        2. Any automated tests in the codebase
        3. Manual verification if required
        4. Code quality standards (linting, formatting, etc.)"
- **Task File Format Reference**
  ```
  # Task ID: <id>
  # Title: <title>
  # Status: <status>
  # Dependencies: <comma-separated list of dependency IDs>
  # Priority: <priority>
  # Description: <brief description>
  # Details:
  <detailed implementation notes>

      - "**Task Completion**: When a task is completed and verified, run `node scripts/dev.js set-status --id=<id> --status=done` to mark it as done in tasks.json. This ensures the task tracking remains accurate."
  # Test Strategy:
  <verification approach>
  ```

      - "**Implementation Drift Handling**: If during implementation, you discover that:
        1. The current approach differs significantly from what was planned
        2. Future tasks need to be modified due to current implementation choices
        3. New dependencies or requirements have emerged
- **Command Reference: parse-prd**
  - Syntax: `node scripts/dev.js parse-prd --input=<prd-file.txt>`
  - Description: Parses a PRD document and generates a tasks.json file with structured tasks
  - Parameters:
    - `--input=<file>`: Path to the PRD text file (default: sample-prd.txt)
  - Example: `node scripts/dev.js parse-prd --input=requirements.txt`
  - Notes: Will overwrite existing tasks.json file. Use with caution.

        Then call `node scripts/dev.js update --from=<futureTaskId> --prompt=\"Detailed explanation of architectural or implementation changes...\"` to rewrite or re-scope subsequent tasks in tasks.json."
- **Command Reference: update**
  - Syntax: `node scripts/dev.js update --from=<id> --prompt="<prompt>"`
  - Description: Updates tasks with ID >= specified ID based on the provided prompt
  - Parameters:
    - `--from=<id>`: Task ID from which to start updating (required)
    - `--prompt="<text>"`: Explanation of changes or new context (required)
  - Example: `node scripts/dev.js update --from=4 --prompt="Now we are using Express instead of Fastify."`
  - Notes: Only updates tasks not marked as 'done'. Completed tasks remain unchanged.

      - "**Task File Generation**: After any updates to tasks.json (status changes, task updates), run `node scripts/dev.js generate` to regenerate the individual task_XXX.txt files in the tasks/ folder. This ensures that task files are always in sync with tasks.json."
- **Command Reference: generate**
  - Syntax: `node scripts/dev.js generate`
  - Description: Generates individual task files in tasks/ directory based on tasks.json
  - Parameters: None
  - Example: `node scripts/dev.js generate`
  - Notes: Overwrites existing task files. Creates tasks/ directory if needed.

      - "**Task Status Management**: Use appropriate status values when updating tasks:
        1. 'pending': Tasks that are ready to be worked on
        2. 'done': Tasks that have been completed and verified
        3. 'deferred': Tasks that have been postponed to a later time
        4. Any other custom status that might be relevant to the project"
- **Command Reference: set-status**
  - Syntax: `node scripts/dev.js set-status --id=<id> --status=<status>`
  - Description: Updates the status of a specific task in tasks.json
  - Parameters:
    - `--id=<id>`: ID of the task to update (required)
    - `--status=<status>`: New status value (required)
  - Example: `node scripts/dev.js set-status --id=3 --status=done`
  - Notes: Common values are 'done', 'pending', and 'deferred', but any string is accepted.

      - "**Dependency Management**: When selecting tasks, always respect the dependency chain:
        1. Never start a task whose dependencies are not marked as 'done'
        2. If a dependency task is deferred, consider whether dependent tasks should also be deferred
        3. If dependency relationships change during development, update tasks.json accordingly"
- **Command Reference: list**
  - Syntax: `node scripts/dev.js list`
  - Description: Lists all tasks in tasks.json with IDs, titles, and status
  - Parameters: None
  - Example: `node scripts/dev.js list`
  - Notes: Provides quick overview of project progress. Use at start of sessions.

      - "**Progress Reporting**: Periodically (at the beginning of sessions or after completing significant tasks), run `node scripts/dev.js list` to provide the user with an updated view of project progress."
- **Command Reference: expand**
  - Syntax: `node scripts/dev.js expand --id=<id> [--num=<number>] [--research] [--prompt="<context>"]`
  - Description: Expands a task with subtasks for detailed implementation
  - Parameters:
    - `--id=<id>`: ID of task to expand (required unless using --all)
    - `--all`: Expand all pending tasks, prioritized by complexity
    - `--num=<number>`: Number of subtasks to generate (default: from complexity report)
    - `--research`: Use Perplexity AI for research-backed generation
    - `--prompt="<text>"`: Additional context for subtask generation
    - `--force`: Regenerate subtasks even for tasks that already have them
  - Example: `node scripts/dev.js expand --id=3 --num=5 --research --prompt="Focus on security aspects"`
  - Notes: Uses complexity report recommendations if available.

      - "**Task File Format**: When reading task files, understand they follow this structure:
        ```
        # Task ID: <id>
        # Title: <title>
        # Status: <status>
        # Dependencies: <comma-separated list of dependency IDs>
        # Priority: <priority>
        # Description: <brief description>
        # Details:
        <detailed implementation notes>
- **Command Reference: analyze-complexity**
  - Syntax: `node scripts/dev.js analyze-complexity [options]`
  - Description: Analyzes task complexity and generates expansion recommendations
  - Parameters:
    - `--output=<file>, -o`: Output file path (default: scripts/task-complexity-report.json)
    - `--model=<model>, -m`: Override LLM model to use
    - `--threshold=<number>, -t`: Minimum score for expansion recommendation (default: 5)
    - `--file=<path>, -f`: Use alternative tasks.json file
    - `--research, -r`: Use Perplexity AI for research-backed analysis
  - Example: `node scripts/dev.js analyze-complexity --research`
  - Notes: Report includes complexity scores, recommended subtasks, and tailored prompts.

        # Test Strategy:
        <verification approach>
        ```"
- **Task Structure Fields** (see the combined example below)
  - **id**: Unique identifier for the task (Example: `1`)
  - **title**: Brief, descriptive title (Example: `"Initialize Repo"`)
  - **description**: Concise summary of what the task involves (Example: `"Create a new repository, set up initial structure."`)
  - **status**: Current state of the task (Example: `"pending"`, `"done"`, `"deferred"`)
  - **dependencies**: IDs of prerequisite tasks (Example: `[1, 2]`)
  - **priority**: Importance level (Example: `"high"`, `"medium"`, `"low"`)
  - **details**: In-depth implementation instructions (Example: `"Use GitHub client ID/secret, handle callback, set session token."`)
  - **testStrategy**: Verification approach (Example: `"Deploy and call endpoint to confirm 'Hello World' response."`)
  - **subtasks**: List of smaller, more specific tasks (Example: `[{"id": 1, "title": "Configure OAuth", ...}]`)

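  A minimal `tasks.json` entry combining the fields above; the values are taken directly from the per-field examples and are purely illustrative:

  ```json
  {
    "id": 1,
    "title": "Initialize Repo",
    "description": "Create a new repository, set up initial structure.",
    "status": "pending",
    "dependencies": [],
    "priority": "high",
    "details": "Use GitHub client ID/secret, handle callback, set session token.",
    "testStrategy": "Deploy and call endpoint to confirm 'Hello World' response.",
    "subtasks": [
      { "id": 1, "title": "Configure OAuth", "description": "...", "status": "pending", "dependencies": [], "acceptanceCriteria": "..." }
    ]
  }
  ```
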
      - "**Continuous Workflow**: Repeat this process until all tasks relevant to the current development phase are completed. Always maintain tasks.json as the single source of truth for development progress."

  - name: "Meta-Development Script Command Reference"
    description: >
      Detailed reference for all commands available in the scripts/dev.js meta-development script.
      This helps the agent understand the full capabilities of the script and use it effectively.
    triggers:
      - always
    commands:
      - name: "parse-prd"
        syntax: "node scripts/dev.js parse-prd --input=<prd-file.txt>"
        description: "Parses a PRD document and generates a tasks.json file with structured tasks. This initializes the task tracking system."
        parameters:
          - "--input=<file>: Path to the PRD text file (default: sample-prd.txt)"
        example: "node scripts/dev.js parse-prd --input=requirements.txt"
        notes: "This will overwrite any existing tasks.json file. Use with caution on established projects."

      - name: "update"
        syntax: "node scripts/dev.js update --from=<id> --prompt=\"<prompt>\""
        description: "Updates tasks with ID >= the specified ID based on the provided prompt. Useful for handling implementation drift or architectural changes."
        parameters:
          - "--from=<id>: The task ID from which to start updating (required)"
          - "--prompt=\"<text>\": The prompt explaining the changes or new context (required)"
        example: "node scripts/dev.js update --from=4 --prompt=\"Now we are using Express instead of Fastify.\""
        notes: "Only updates tasks that aren't marked as 'done'. Completed tasks remain unchanged."

      - name: "generate"
        syntax: "node scripts/dev.js generate"
        description: "Generates individual task files in the tasks/ directory based on the current state of tasks.json."
        parameters: "None"
        example: "node scripts/dev.js generate"
        notes: "Overwrites existing task files. Creates the tasks/ directory if it doesn't exist."

      - name: "set-status"
        syntax: "node scripts/dev.js set-status --id=<id> --status=<status>"
        description: "Updates the status of a specific task in tasks.json."
        parameters:
          - "--id=<id>: The ID of the task to update (required)"
          - "--status=<status>: The new status (e.g., 'done', 'pending', 'deferred') (required)"
        example: "node scripts/dev.js set-status --id=3 --status=done"
        notes: "Common status values are 'done', 'pending', and 'deferred', but any string is accepted."

      - name: "list"
        syntax: "node scripts/dev.js list"
        description: "Lists all tasks in tasks.json with their IDs, titles, and current status."
        parameters: "None"
        example: "node scripts/dev.js list"
        notes: "Provides a quick overview of project progress. Use this at the start of coding sessions."

      - name: "expand"
        syntax: "node scripts/dev.js expand --id=<id> [--subtasks=<number>] [--prompt=\"<context>\"]"
        description: "Expands a task with subtasks for more detailed implementation. Can also expand all tasks with the --all flag."
        parameters:
          - "--id=<id>: The ID of the task to expand (required unless using --all)"
          - "--all: Expand all pending tasks that don't have subtasks"
          - "--subtasks=<number>: Number of subtasks to generate (default: 3)"
          - "--prompt=\"<text>\": Additional context to guide subtask generation"
          - "--force: When used with --all, regenerates subtasks even for tasks that already have them"
        example: "node scripts/dev.js expand --id=3 --subtasks=5 --prompt=\"Focus on security aspects\""
        notes: "Tasks marked as 'done' or 'completed' are always skipped. By default, tasks that already have subtasks are skipped unless --force is used."

  - name: "Task Structure Reference"
    description: >
      Details the structure of tasks in tasks.json to help the agent understand
      and work with the task data effectively.
    triggers:
      - always
    task_fields:
      - name: "id"
        type: "number"
        description: "Unique identifier for the task. Used in commands and for tracking dependencies."
        example: "1"

      - name: "title"
        type: "string"
        description: "Brief, descriptive title of the task."
        example: "Initialize Repo"

      - name: "description"
        type: "string"
        description: "Concise description of what the task involves."
        example: "Create a new repository, set up initial structure."

      - name: "status"
        type: "string"
        description: "Current state of the task. Common values: 'pending', 'done', 'deferred'."
        example: "pending"

      - name: "dependencies"
        type: "array of numbers"
        description: "IDs of tasks that must be completed before this task can be started."
        example: "[1, 2]"

      - name: "priority"
        type: "string"
        description: "Importance level of the task. Common values: 'high', 'medium', 'low'."
        example: "high"

      - name: "details"
        type: "string"
        description: "In-depth instructions, references, or context for implementing the task."
        example: "Use GitHub client ID/secret, handle callback, set session token."

      - name: "testStrategy"
        type: "string"
        description: "Approach for verifying the task has been completed correctly."
        example: "Deploy and call endpoint to confirm 'Hello World' response."

      - name: "subtasks"
        type: "array of objects"
        description: "List of smaller, more specific tasks that make up the main task."
        example: "[{\"id\": 1, \"title\": \"Configure OAuth\", \"description\": \"...\", \"status\": \"pending\", \"dependencies\": [], \"acceptanceCriteria\": \"...\"}]"

  - name: "Environment Variables Reference"
    description: >
      Details the environment variables that can be used to configure the dev.js script.
      These variables should be set in a .env file at the root of the project.
    triggers:
      - always
    variables:
      - name: "ANTHROPIC_API_KEY"
        required: true
        description: "Your Anthropic API key for Claude. Required for task generation and expansion."
        example: "ANTHROPIC_API_KEY=sk-ant-api03-..."

      - name: "MODEL"
        required: false
        default: "claude-3-7-sonnet-20250219"
        description: "Specify which Claude model to use for task generation and expansion."
        example: "MODEL=claude-3-opus-20240229"

      - name: "MAX_TOKENS"
        required: false
        default: "4000"
        description: "Maximum tokens for model responses. Higher values allow for more detailed task generation."
        example: "MAX_TOKENS=8000"

      - name: "TEMPERATURE"
        required: false
        default: "0.7"
        description: "Temperature for model responses. Higher values (0.0-1.0) increase creativity but may reduce consistency."
        example: "TEMPERATURE=0.5"

      - name: "DEBUG"
        required: false
        default: "false"
        description: "Enable debug logging. When true, detailed logs are written to dev-debug.log."
        example: "DEBUG=true"

      - name: "LOG_LEVEL"
        required: false
        default: "info"
        description: "Log level for console output. Options: debug, info, warn, error."
        example: "LOG_LEVEL=debug"

      - name: "DEFAULT_SUBTASKS"
        required: false
        default: "3"
        description: "Default number of subtasks when expanding a task."
        example: "DEFAULT_SUBTASKS=5"

      - name: "DEFAULT_PRIORITY"
        required: false
        default: "medium"
        description: "Default priority for generated tasks. Options: high, medium, low."
        example: "DEFAULT_PRIORITY=high"

      - name: "PROJECT_NAME"
        required: false
        default: "MCP SaaS MVP"
        description: "Override default project name in tasks.json metadata."
        example: "PROJECT_NAME=My Awesome Project"

      - name: "PROJECT_VERSION"
        required: false
        default: "1.0.0"
        description: "Override default version in tasks.json metadata."
        example: "PROJECT_VERSION=2.1.0"
- **Environment Variables Configuration** (an example `.env` is shown below)
  - **ANTHROPIC_API_KEY** (Required): Your Anthropic API key for Claude (Example: `ANTHROPIC_API_KEY=sk-ant-api03-...`)
  - **MODEL** (Default: `"claude-3-7-sonnet-20250219"`): Claude model to use (Example: `MODEL=claude-3-opus-20240229`)
  - **MAX_TOKENS** (Default: `"4000"`): Maximum tokens for responses (Example: `MAX_TOKENS=8000`)
  - **TEMPERATURE** (Default: `"0.7"`): Temperature for model responses (Example: `TEMPERATURE=0.5`)
  - **DEBUG** (Default: `"false"`): Enable debug logging (Example: `DEBUG=true`)
  - **LOG_LEVEL** (Default: `"info"`): Console output level (Example: `LOG_LEVEL=debug`)
  - **DEFAULT_SUBTASKS** (Default: `"3"`): Default subtask count (Example: `DEFAULT_SUBTASKS=5`)
  - **DEFAULT_PRIORITY** (Default: `"medium"`): Default priority (Example: `DEFAULT_PRIORITY=high`)
  - **PROJECT_NAME** (Default: `"MCP SaaS MVP"`): Project name in metadata (Example: `PROJECT_NAME=My Awesome Project`)
  - **PROJECT_VERSION** (Default: `"1.0.0"`): Version in metadata (Example: `PROJECT_VERSION=2.1.0`)
  - **PERPLEXITY_API_KEY**: For research-backed features (Example: `PERPLEXITY_API_KEY=pplx-...`)
  - **PERPLEXITY_MODEL** (Default: `"sonar-medium-online"`): Perplexity model (Example: `PERPLEXITY_MODEL=sonar-large-online`)

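  A minimal example `.env` assembled from the defaults and examples listed above (the API key values are placeholders):

  ```
  # Required
  ANTHROPIC_API_KEY=sk-ant-api03-...

  # Optional overrides (defaults shown)
  MODEL=claude-3-7-sonnet-20250219
  MAX_TOKENS=4000
  TEMPERATURE=0.7
  DEBUG=false
  LOG_LEVEL=info
  DEFAULT_SUBTASKS=3
  DEFAULT_PRIORITY=medium
  PROJECT_NAME=MCP SaaS MVP
  PROJECT_VERSION=1.0.0

  # Needed only for --research features
  PERPLEXITY_API_KEY=pplx-...
  PERPLEXITY_MODEL=sonar-medium-online
  ```
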
@@ -13,7 +13,8 @@
- User personas
- Key user flows
- UI/UX considerations]

</context>
<PRD>
# Technical Architecture
[Outline the technical implementation details:
- System components

@@ -25,23 +26,22 @@
[Break down the development process into phases:
- MVP requirements
- Future enhancements
- Timeline estimates]
- Do not think about timelines whatsoever -- all that matters is scope and detailing exactly what needs to be built in each phase so it can later be cut up into tasks]

# Success Metrics
[Define how success will be measured:
- Key performance indicators
- User adoption metrics
- Business goals]
# Logical Dependency Chain
[Define the logical order of development:
- Which features need to be built first (foundation)
- Getting as quickly as possible to something usable/visible front end that works
- Properly pacing and scoping each feature so it is atomic but can also be built upon and improved as development approaches]

# Risks and Mitigations
[Identify potential risks and how they'll be addressed:
- Technical challenges
- Market risks
- Figuring out the MVP that we can build upon
- Resource constraints]

# Appendix
[Include any additional information:
- Research findings
- Competitive analysis
- Technical specifications]
</context>
</PRD>
@@ -12,6 +12,7 @@ In an AI-driven development process—particularly with tools like [Cursor](http
4. **Generate** individual task files (e.g., `task_001.txt`) for easy reference or to feed into an AI coding workflow.
5. **Set task status**—mark tasks as `done`, `pending`, or `deferred` based on progress.
6. **Expand** tasks with subtasks—break down complex tasks into smaller, more manageable subtasks.
7. **Research-backed subtask generation**—use Perplexity AI to generate more informed and contextually relevant subtasks.

## Configuration

@@ -24,6 +25,8 @@ The script can be configured through environment variables in a `.env` file at t
- `MODEL`: Specify which Claude model to use (default: "claude-3-7-sonnet-20250219")
- `MAX_TOKENS`: Maximum tokens for model responses (default: 4000)
- `TEMPERATURE`: Temperature for model responses (default: 0.7)
- `PERPLEXITY_API_KEY`: Your Perplexity API key for research-backed subtask generation
- `PERPLEXITY_MODEL`: Specify which Perplexity model to use (default: "sonar-medium-online")
- `DEBUG`: Enable debug logging (default: false)
- `LOG_LEVEL`: Log level - debug, info, warn, error (default: info)
- `DEFAULT_SUBTASKS`: Default number of subtasks when expanding (default: 3)

@@ -56,6 +59,44 @@ The script can be configured through environment variables in a `.env` file at t

Run `node scripts/dev.js` without arguments to see detailed usage information.

## Listing Tasks

The `list` command allows you to view all tasks and their status:

```bash
# List all tasks
node scripts/dev.js list

# List tasks with a specific status
node scripts/dev.js list --status=pending

# List tasks and include their subtasks
node scripts/dev.js list --with-subtasks

# List tasks with a specific status and include their subtasks
node scripts/dev.js list --status=pending --with-subtasks
```

## Updating Tasks

The `update` command allows you to update tasks based on new information or implementation changes:

```bash
# Update tasks starting from ID 4 with a new prompt
node scripts/dev.js update --from=4 --prompt="Refactor tasks from ID 4 onward to use Express instead of Fastify"

# Update all tasks (default from=1)
node scripts/dev.js update --prompt="Add authentication to all relevant tasks"

# Specify a different tasks file
node scripts/dev.js update --file=custom-tasks.json --from=5 --prompt="Change database from MongoDB to PostgreSQL"
```

Notes:
- The `--prompt` parameter is required and should explain the changes or new context
- Only tasks that aren't marked as 'done' will be updated
- Tasks with ID >= the specified --from value will be updated

## Setting Task Status

The `set-status` command allows you to change a task's status:

@@ -89,7 +130,7 @@ The `expand` command allows you to break down tasks into subtasks for more detai
node scripts/dev.js expand --id=3

# Expand a specific task with 5 subtasks
node scripts/dev.js expand --id=3 --subtasks=5
node scripts/dev.js expand --id=3 --num=5

# Expand a task with additional context
node scripts/dev.js expand --id=3 --prompt="Focus on security aspects"
@@ -99,12 +140,35 @@ node scripts/dev.js expand --all

# Force regeneration of subtasks for all pending tasks
node scripts/dev.js expand --all --force

# Use Perplexity AI for research-backed subtask generation
node scripts/dev.js expand --id=3 --research

# Use Perplexity AI for research-backed generation on all pending tasks
node scripts/dev.js expand --all --research
```

Notes:
- Tasks marked as 'done' or 'completed' are always skipped
- By default, tasks that already have subtasks are skipped unless `--force` is used
- Subtasks include title, description, dependencies, and acceptance criteria
- The `--research` flag uses Perplexity AI to generate more informed and contextually relevant subtasks
- If Perplexity API is unavailable, the script will fall back to using Anthropic's Claude

## AI Integration

The script integrates with two AI services:

1. **Anthropic Claude**: Used for parsing PRDs, generating tasks, and creating subtasks.
2. **Perplexity AI**: Used for research-backed subtask generation when the `--research` flag is specified.

The Perplexity integration uses the OpenAI client to connect to Perplexity's API, which provides enhanced research capabilities for generating more informed subtasks. If the Perplexity API is unavailable or encounters an error, the script will automatically fall back to using Anthropic's Claude (see the sketch after the steps below).

To use the Perplexity integration:
1. Obtain a Perplexity API key
2. Add `PERPLEXITY_API_KEY` to your `.env` file
3. Optionally specify `PERPLEXITY_MODEL` in your `.env` file (default: "sonar-medium-online")
4. Use the `--research` flag with the `expand` command
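A rough sketch of how such a client setup and fallback might look. This is not the script's exact code; the variable names, the `generateSubtasks` helper, and the `https://api.perplexity.ai` base URL are assumptions made for illustration:

```js
import OpenAI from 'openai';
import Anthropic from '@anthropic-ai/sdk';

// Perplexity exposes an OpenAI-compatible API, so the OpenAI client is pointed at it.
const perplexity = new OpenAI({
  apiKey: process.env.PERPLEXITY_API_KEY,
  baseURL: 'https://api.perplexity.ai',
});
const anthropic = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY });

async function generateSubtasks(prompt) {
  try {
    // Research-backed path: Perplexity via the OpenAI-compatible endpoint
    const result = await perplexity.chat.completions.create({
      model: process.env.PERPLEXITY_MODEL || 'sonar-medium-online',
      messages: [{ role: 'user', content: prompt }],
    });
    return result.choices[0].message.content;
  } catch (err) {
    // Fallback path: Anthropic Claude when Perplexity is unavailable or errors out
    const response = await anthropic.messages.create({
      model: process.env.MODEL || 'claude-3-7-sonnet-20250219',
      max_tokens: Number(process.env.MAX_TOKENS || 4000),
      messages: [{ role: 'user', content: prompt }],
    });
    return response.content[0].text;
  }
}
```
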
## Logging

@@ -115,3 +179,79 @@ The script supports different logging levels controlled by the `LOG_LEVEL` envir
- `error`: Error messages that might prevent execution

When `DEBUG=true` is set, debug logs are also written to a `dev-debug.log` file in the project root.

## Analyzing Task Complexity

The `analyze-complexity` command allows you to automatically assess task complexity and generate expansion recommendations:

```bash
# Analyze all tasks and generate expansion recommendations
node scripts/dev.js analyze-complexity

# Specify a custom output file
node scripts/dev.js analyze-complexity --output=custom-report.json

# Override the model used for analysis
node scripts/dev.js analyze-complexity --model=claude-3-opus-20240229

# Set a custom complexity threshold (1-10)
node scripts/dev.js analyze-complexity --threshold=6

# Use Perplexity AI for research-backed complexity analysis
node scripts/dev.js analyze-complexity --research
```

Notes:
- The command uses Claude to analyze each task's complexity (or Perplexity with --research flag)
- Tasks are scored on a scale of 1-10
- Each task receives a recommended number of subtasks based on DEFAULT_SUBTASKS configuration
- The default output path is `scripts/task-complexity-report.json`
- Each task in the analysis includes a ready-to-use `expansionCommand` that can be copied directly to the terminal or executed programmatically
- Tasks with complexity scores below the threshold (default: 5) may not need expansion
- The research flag provides more contextual and informed complexity assessments

### Integration with Expand Command

The `expand` command automatically checks for and uses complexity analysis if available:

```bash
# Expand a task, using complexity report recommendations if available
node scripts/dev.js expand --id=8

# Expand all tasks, prioritizing by complexity score if a report exists
node scripts/dev.js expand --all

# Override recommendations with explicit values
node scripts/dev.js expand --id=8 --num=5 --prompt="Custom prompt"
```

When a complexity report exists:
- The `expand` command will use the recommended subtask count from the report (unless overridden)
- It will use the tailored expansion prompt from the report (unless a custom prompt is provided)
- When using `--all`, tasks are sorted by complexity score (highest first)
- The `--research` flag is preserved from the complexity analysis to expansion

The output report structure is:
```json
{
  "meta": {
    "generatedAt": "2023-06-15T12:34:56.789Z",
    "tasksAnalyzed": 20,
    "thresholdScore": 5,
    "projectName": "Your Project Name",
    "usedResearch": true
  },
  "complexityAnalysis": [
    {
      "taskId": 8,
      "taskTitle": "Develop Implementation Drift Handling",
      "complexityScore": 9.5,
      "recommendedSubtasks": 6,
      "expansionPrompt": "Create subtasks that handle detecting...",
      "reasoning": "This task requires sophisticated logic...",
      "expansionCommand": "node scripts/dev.js expand --id=8 --num=6 --prompt=\"Create subtasks...\" --research"
    },
    // More tasks sorted by complexity score (highest first)
  ]
}
```