47 Commits

Author SHA1 Message Date
Auto
55064945a4 version patch 2026-02-09 08:56:33 +02:00
Auto
859987e3b4 0.1.10 2026-02-09 08:55:49 +02:00
Auto
f87970daca fix: prevent temp file accumulation during long agent runs
Address three issues reported after overnight AutoForge runs:
1. ~193GB of .node files in %TEMP% from V8 compile caching
2. Stale npm artifact folders on drive root when %TEMP% fills up
3. PNG screenshot files left in project root by Playwright

Changes:
- Widen .node cleanup glob from ".78912*.node" to ".[0-9a-f]*.node"
  to match all V8 compile cache hex prefixes
- Add "node-compile-cache" directory to temp cleanup patterns
- Set NODE_COMPILE_CACHE="" in all subprocess environments (client.py,
  parallel_orchestrator.py, process_manager.py) to disable V8 compile
  caching at the source
- Add cleanup_project_screenshots() to remove stale .png files from
  project directories (feature*-*.png, screenshot-*.png, step-*.png)
- Run cleanup_stale_temp() at server startup in lifespan()
- Add _run_inter_session_cleanup() to orchestrator, called after each
  agent completes (both coding and testing paths)
- Update coding and testing prompt templates to instruct agents to use
  inline (base64) screenshots only, never saving files to disk

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-09 08:54:52 +02:00
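The glob widening described in this commit can be illustrated with a small sketch. The two pattern strings are taken from the message above; `matches_compile_cache` is a hypothetical helper for illustration, not code from the repo:

```python
import fnmatch

# V8 compile-cache artifacts in %TEMP% are named .<hex-prefix>.node.
# The old pattern only caught one specific prefix; the new one catches any.
OLD_PATTERN = ".78912*.node"
NEW_PATTERN = ".[0-9a-f]*.node"

def matches_compile_cache(filename: str, pattern: str) -> bool:
    """Return True if a temp-file name matches the cleanup glob."""
    return fnmatch.fnmatch(filename, pattern)

# The old glob misses caches with a different hex prefix:
assert matches_compile_cache(".78912abc.node", OLD_PATTERN)
assert not matches_compile_cache(".a3f04de.node", OLD_PATTERN)
# The widened glob matches any hex-prefixed .node cache file:
assert matches_compile_cache(".a3f04de.node", NEW_PATTERN)
```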
Auto
9eb08d3f71 version patch 2026-02-08 15:51:11 +02:00
Auto
8d76deb75f 0.1.9 2026-02-08 15:50:50 +02:00
Auto
3a31761542 ui: add resizable drag handle to assistant chat panel
Add a draggable resize handle on the left edge of the AI assistant
panel, allowing users to adjust the panel width by clicking and
dragging. Width is persisted to localStorage across sessions.

- Drag handle with hover highlight (border -> primary color)
- Min width 300px, max width 90vw
- Width saved to localStorage under 'assistant-panel-width'
- Cursor changes to col-resize and text selection disabled during drag

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-08 15:45:21 +02:00
Auto
96feb38aea ui: restructure header navbar into two-row responsive layout
Redesign the header from a single overflowing row into a clean two-row
layout that prevents content from overlapping the logo and bleeding
outside the navbar on smaller screens.

Row 1: Logo + project selector + spacer + mode badges + utility icons
Row 2: Agent controls + dev server + spacer + settings + reset
(only rendered when a project is selected, with a subtle border divider)

Changes:
- App.tsx: Split header into two logical rows with flex spacers for
  right-alignment; hide title text below md breakpoint; move mode
  badges (Ollama/GLM) to row 1 with sm:hidden for small screens
- ProjectSelector: Responsive min-width (140px mobile, 200px desktop);
  truncate long project names instead of pushing icons off-screen
- AgentControl: Responsive gap (gap-2 mobile, gap-4 desktop)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-08 15:41:17 +02:00
Auto
1925818d49 feat: fix tooltip shortcuts and add dev server config dialog
Tooltip fixes (PR #177 follow-up):
- Remove duplicate title attr on Settings button that caused double-tooltip
- Restore keyboard shortcut hints in tooltip text: Settings (,), Reset (R)
- Clean up spurious peer markers in package-lock.json

Dev server config dialog:
- Add DevServerConfigDialog component for custom dev commands
- Open config dialog automatically when start fails with "no dev command"
- Add useDevServerConfig/useUpdateDevServerConfig hooks
- Add updateDevServerConfig API function
- Add config gear button next to dev server start

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-08 15:29:44 +02:00
Leon van Zyl
38fc8788a2 Merge pull request #177 from brainit-consulting/feat/navbar-tooltips
ui: add Radix tooltips to header icons
2026-02-08 15:26:28 +02:00
Emile du Toit
b439e2d241 ui: add Radix tooltips to header icons 2026-02-07 19:56:59 -05:00
Auto
b0490be501 version patch 2026-02-06 15:27:09 +02:00
Auto
13a3ff9ac1 0.1.8 2026-02-06 15:26:48 +02:00
Auto
71f17c73c2 feat: add structured questions (AskUserQuestion) to assistant chat
Add interactive multiple-choice question support to the project assistant,
allowing it to present clickable options when clarification is needed.

Backend changes:
- Add ask_user MCP tool to feature_mcp.py with input validation
- Add mcp__features__ask_user to assistant allowed tools list
- Intercept ask_user tool calls in _query_claude() to yield question messages
- Add answer WebSocket message handler in assistant_chat router
- Document ask_user tool in assistant system prompt

Frontend changes:
- Add AssistantChatQuestionMessage type and update server message union
- Add currentQuestions state and sendAnswer() to useAssistantChat hook
- Handle question WebSocket messages by attaching to last assistant message
- Render QuestionOptions component between messages and input area
- Disable text input while structured questions are active

Flow: Claude calls ask_user → backend intercepts → WebSocket question message →
frontend renders QuestionOptions → user clicks options → answer sent back →
Claude receives formatted answer and continues conversation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-06 15:26:36 +02:00
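The interception flow described above can be sketched as follows. The function names (`handle_message`, `format_answer`) and message shapes are illustrative assumptions, not the real AutoForge API; only the tool name `mcp__features__ask_user` comes from the commit:

```python
import json

QUESTION_TOOL = "mcp__features__ask_user"

def handle_message(msg: dict) -> dict:
    """Turn an ask_user tool call into a WebSocket question message;
    pass everything else through as a normal chat message."""
    if msg.get("tool") == QUESTION_TOOL:
        return {"type": "question", "questions": msg["input"]["questions"]}
    return {"type": "assistant_message", "text": msg.get("text", "")}

def format_answer(selected: list[str]) -> str:
    """Format the user's clicked options as the tool result Claude reads."""
    return json.dumps({"answers": selected})

out = handle_message({
    "tool": QUESTION_TOOL,
    "input": {"questions": [{"question": "Which DB?",
                             "options": [{"label": "SQLite"}, {"label": "Postgres"}],
                             "multiSelect": False}]},
})
assert out["type"] == "question"
assert format_answer(["SQLite"]) == '{"answers": ["SQLite"]}'
```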
Auto
46ac373748 0.1.7 2026-02-06 14:37:42 +02:00
Auto
0d04a062a2 feat: add full markdown rendering to chat messages
Replace the custom BOLD_REGEX parser in ChatMessage.tsx with
react-markdown + remark-gfm for proper rendering of headers, tables,
lists, code blocks, blockquotes, links, and horizontal rules in all
chat UIs (AssistantChat, SpecCreationChat, ExpandProjectChat).

Changes:
- Add react-markdown and remark-gfm dependencies
- Add vendor-markdown chunk to Vite manual chunks for code splitting
- Add .chat-prose CSS class with styles for all markdown elements
- Add .chat-prose-user modifier for contrast on primary-colored bubbles
- Replace line-splitting + regex logic with ReactMarkdown component
- Links open in new tabs via custom component override
- System messages remain plain text (unchanged)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-06 14:37:39 +02:00
Auto
7d08700f3a version patch 2026-02-06 13:41:17 +02:00
Auto
5ecf74cb31 0.1.6 2026-02-06 13:40:53 +02:00
Auto
9259a799e3 fix: propagate alternative API provider settings to agent subprocesses
When users configured GLM/Ollama/Kimi via the Settings UI, agents still
used Claude because conflicting env vars leaked through subprocess env.

Root cause: get_effective_sdk_env() set ANTHROPIC_AUTH_TOKEN for GLM but
didn't clear ANTHROPIC_API_KEY, which leaked from os.environ. The CLI
prioritized the wrong credential.

Changes:
- registry.py: Clear conflicting auth vars (API_KEY vs AUTH_TOKEN) and
  Vertex AI vars when building env for alternative providers
- client.py: Replace manual os.getenv() loop with get_effective_sdk_env()
  so agent SDK reads provider settings from the database
- autonomous_agent_demo.py: Apply UI-configured provider settings to
  process env so CLI-launched agents also respect Settings UI config
- start.py: Pass --model from settings when launching agent subprocess
- server/schemas.py: Allow non-Claude model names when an alternative
  provider is configured (prevents 422 errors for glm-4.7, etc.)
- .env.example: Document env vars for GLM, Ollama, and Kimi providers

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-06 13:38:36 +02:00
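The root cause above can be sketched as an env-building rule: when an alternative provider authenticates via `ANTHROPIC_AUTH_TOKEN`, any `ANTHROPIC_API_KEY` inherited from `os.environ` must be cleared or the CLI picks the wrong credential. This is a minimal sketch; `build_provider_env` and the conflict list are illustrative, not the actual `registry.py` code:

```python
CONFLICTING_WITH_AUTH_TOKEN = ["ANTHROPIC_API_KEY", "CLAUDE_CODE_USE_VERTEX"]

def build_provider_env(inherited: dict[str, str], provider_env: dict[str, str]) -> dict[str, str]:
    env = dict(inherited)
    env.update(provider_env)
    if "ANTHROPIC_AUTH_TOKEN" in provider_env:
        for var in CONFLICTING_WITH_AUTH_TOKEN:
            env.pop(var, None)  # clear leaked credentials/flags from os.environ
    return env

env = build_provider_env(
    {"ANTHROPIC_API_KEY": "sk-leaked", "PATH": "/usr/bin"},
    {"ANTHROPIC_BASE_URL": "https://api.z.ai/api/anthropic", "ANTHROPIC_AUTH_TOKEN": "glm-token"},
)
assert "ANTHROPIC_API_KEY" not in env       # leaked key cleared
assert env["ANTHROPIC_AUTH_TOKEN"] == "glm-token"
```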
Auto
f24c7cbf62 patch npm version 2026-02-06 09:44:20 +02:00
Auto
f664378775 0.1.5 2026-02-06 09:43:31 +02:00
Auto
a52f191a54 refactor: make Settings UI the single source of truth for API provider
Remove legacy env-var-based provider/mode detection that caused misleading
UI badges (e.g., GLM badge showing when Settings was set to Claude).

Key changes:
- Remove _is_glm_mode() and _is_ollama_mode() env-var sniffing functions
  from server/routers/settings.py; derive glm_mode/ollama_mode purely from
  the api_provider setting
- Remove `import os` from settings router (no longer needed)
- Update schema comments to reflect settings-based derivation
- Remove "(configured via .env)" from badge tooltips in App.tsx
- Remove Kimi/GLM/Ollama/Playwright-headless sections from .env.example;
  add note pointing to Settings UI
- Update CLAUDE.md and README.md documentation to reference Settings UI
  for alternative provider configuration
- Update model IDs from claude-opus-4-5-20251101 to claude-opus-4-6
  across registry, client, chat sessions, tests, and UI defaults
- Add LEGACY_MODEL_MAP with auto-migration in get_all_settings()
- Show model ID subtitle in SettingsModal model selector
- Add Vertex passthrough test for claude-opus-4-6 (no date suffix)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-06 09:23:06 +02:00
Auto
c0aaac241c npm version patch 2026-02-06 08:10:59 +02:00
Auto
547f1e7d9b 0.1.4 2026-02-06 08:10:39 +02:00
Auto
73d6cfcd36 fix: address PR #163 review findings
- Fix model selection regression: _get_settings_defaults() now checks
  api_model (set by new provider UI) before falling back to legacy
  model setting, ensuring Claude model selection works end-to-end
- Add input validation for provider settings: api_base_url must start
  with http:// or https:// (max 500 chars), api_auth_token max 500
  chars, api_model max 200 chars
- Fix terminal.py misleading import alias: replace
  is_valid_project_name aliased as validate_project_name with direct
  is_valid_project_name import across all 5 call sites

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-06 08:10:18 +02:00
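The validation rules listed above can be sketched directly (limits per the commit message; the function name is illustrative):

```python
def validate_provider_settings(base_url: str, auth_token: str, model: str) -> list[str]:
    """Return a list of validation errors; empty list means valid."""
    errors = []
    if not base_url.startswith(("http://", "https://")):
        errors.append("api_base_url must start with http:// or https://")
    if len(base_url) > 500:
        errors.append("api_base_url too long (max 500 chars)")
    if len(auth_token) > 500:
        errors.append("api_auth_token too long (max 500 chars)")
    if len(model) > 200:
        errors.append("api_model too long (max 200 chars)")
    return errors

assert validate_provider_settings("https://api.z.ai/api/anthropic", "tok", "glm-4.7") == []
assert validate_provider_settings("ftp://bad", "tok", "glm-4.7") != []
```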
Leon van Zyl
d15fd37e33 Merge pull request #163 from nioasoft/feat/api-provider-ui
feat: add API provider selection UI (Claude, Kimi, GLM, Ollama, Custom)
2026-02-06 08:06:37 +02:00
Auto
97a3250a37 update README 2026-02-06 07:49:28 +02:00
nioasoft
a752ece70c fix: wrong import alias overwrote project_name with bool
assistant_chat.py and spec_creation.py imported is_valid_project_name
(returns bool) aliased as validate_project_name. When used as
`project_name = validate_project_name(project_name)`, the project name
was replaced with True, causing "Project not found in registry" errors.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-06 06:20:03 +02:00
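The bug above reduces to a predicate (returns `bool`) imported under a name that suggests it returns the validated value. A minimal reproduction (the real validator is more involved; `isidentifier` stands in for it):

```python
def is_valid_project_name(name: str) -> bool:
    return name.isidentifier()

validate_project_name = is_valid_project_name  # the misleading alias

project_name = "my_app"
project_name = validate_project_name(project_name)  # BUG: now True, not "my_app"
assert project_name is True  # registry lookup with True then fails

# The fix: call the predicate without reassigning the name.
project_name = "my_app"
assert is_valid_project_name(project_name)
assert project_name == "my_app"
```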
nioasoft
3c61496021 fix: clean up stuck features on agent start
Ensures features stuck from a previous crash are reset before
launching a new agent, not just on stop/crash going forward.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-06 06:02:30 +02:00
nioasoft
6d4a198380 fix: remove unused API_ENV_VARS imports from chat sessions
The provider refactor moved env building to get_effective_sdk_env(),
making these imports unused. Fixes ruff F401 lint errors in CI.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-06 05:57:47 +02:00
nioasoft
13785325d7 feat: add API provider selection UI and fix stuck features on agent crash
API Provider Selection:
- Add provider switcher in Settings modal (Claude, Kimi, GLM, Ollama, Custom)
- Auth tokens stored locally only (registry.db), never returned by API
- get_effective_sdk_env() builds provider-specific env vars for agent subprocess
- All chat sessions (spec, expand, assistant) use provider settings
- Backward compatible: defaults to Claude, env vars still work as override

Fix Stuck Features:
- Add _cleanup_stale_features() to process_manager.py
- Reset in_progress features when agent stops, crashes, or fails healthcheck
- Prevents features from being permanently stuck after rate limit crashes
- Uses separate SQLAlchemy engine to avoid session conflicts with subprocess

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-06 05:55:51 +02:00
nioasoft
70131f2271 fix: accept WebSocket before validation to prevent opaque 403 errors
All WebSocket endpoints now call websocket.accept() before any
validation checks. Previously, closing the connection before accepting
caused Starlette to return an opaque HTTP 403 instead of a meaningful
error message.

Changes:
- Server: Accept WebSocket first, then send JSON error + close with
  4xxx code if validation fails (expand, spec, assistant, terminal,
  main project WS)
- Server: ConnectionManager.connect() no longer calls accept() to
  avoid double-accept
- UI: Gate expand button and keyboard shortcut on hasSpec
- UI: Skip WebSocket reconnection on application error codes (4000-4999)
- UI: Update keyboard shortcuts help text

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-06 05:46:24 +02:00
nioasoft
035e8fdfca fix: accept WebSocket before validation to prevent opaque 403 errors
All 5 WebSocket endpoints (expand, spec, assistant, terminal, project)
were closing the connection before calling accept() when validation
failed. Starlette converts pre-accept close into an HTTP 403, giving
clients no meaningful error information.

Server changes:
- Move websocket.accept() before all validation checks in every WS handler
- Send JSON error message before closing so clients get actionable errors
- Fix validate_project_name usage (raises HTTPException, not returns bool)
- ConnectionManager.connect() no longer calls accept() (caller's job)

Client changes:
- All 3 WS hooks (useWebSocket, useExpandChat, useSpecChat) skip
  reconnection on 4xxx close codes (application errors won't self-resolve)
- Gate expand button, keyboard shortcut, and modal on hasSpec
- Add hasSpec to useEffect dependency array to prevent stale closure
- Update keyboard shortcuts help text for E key context

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-05 21:08:46 +02:00
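The accept-before-validate pattern above can be sketched with a stand-in WebSocket. `FakeWebSocket` simulates Starlette's behavior for illustration, and the `4004` close code is an assumed application-error code, not necessarily the one the project uses:

```python
import asyncio

class FakeWebSocket:
    def __init__(self):
        self.accepted = False
        self.sent = []
        self.close_code = None

    async def accept(self):
        self.accepted = True

    async def send_json(self, data):
        assert self.accepted, "cannot send before accept()"
        self.sent.append(data)

    async def close(self, code=1000):
        self.close_code = code

async def ws_endpoint(websocket, project_name: str, known_projects: set[str]):
    await websocket.accept()  # accept FIRST, so errors actually reach the client
    if project_name not in known_projects:
        await websocket.send_json({"type": "error", "message": f"Unknown project: {project_name}"})
        await websocket.close(code=4004)  # 4xxx: application error, client skips reconnection
        return
    await websocket.send_json({"type": "connected"})

ws = FakeWebSocket()
asyncio.run(ws_endpoint(ws, "ghost", {"demo"}))
assert ws.sent == [{"type": "error", "message": "Unknown project: ghost"}]
assert ws.close_code == 4004
```

Closing before `accept()` is what Starlette converts into the opaque HTTP 403; accepting first lets the server send a JSON error and a meaningful 4xxx close code instead.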
Auto
f4facb3200 update lock 2026-02-05 09:55:39 +02:00
Auto
2f8a6a6274 0.1.3 2026-02-05 09:54:57 +02:00
Auto
76246bad69 fix: add temp_cleanup.py to npm package files whitelist
PR #158 added temp_cleanup.py and its import in autonomous_agent_demo.py
but did not include the file in the package.json "files" array. This
caused ModuleNotFoundError for npm installations since the module was
missing from the published tarball.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-05 09:54:33 +02:00
Auto
b736fb7382 update packagelock 2026-02-05 08:53:26 +02:00
Auto
032752e564 0.1.2 2026-02-05 08:53:00 +02:00
Auto
c55a1a0182 fix: harden dev server RCE mitigations from PR #153
Address security gaps and improve validation in the dev server command
execution path introduced by PR #153:

Security fixes (critical):
- Add missing shell metacharacters to dangerous_ops blocklist: single &
  (Windows cmd.exe command separator), >, <, ^, %, \n, \r
- The single & gap was a confirmed RCE bypass on Windows where .cmd
  files are always executed via cmd.exe even with shell=False (CPython
  limitation documented in issue #77696)
- Apply validate_custom_command_strict at /start endpoint for
  defense-in-depth against config file tampering

Validation improvements:
- Fix uvicorn --flag=value syntax (split on = before comparing)
- Expand Python support: Django (manage.py), Flask, custom .py scripts
- Add runners: flask, poetry, cargo, go, npx
- Expand npm script allowlist: serve, develop, server, preview
- Reorder PATCH /config validation to run strict check first (fail fast)
- Extract constants: ALLOWED_NPM_SCRIPTS, ALLOWED_PYTHON_MODULES,
  BLOCKED_SHELLS for reuse and testability

Cleanup:
- Remove unused security.py imports from dev_server_manager.py
- Fix deprecated datetime.utcnow() -> datetime.now(timezone.utc)
- Remove unnecessary _remove_lock() in exception handlers where lock
  was never created (Popen failure path)

Tests:
- Add test_devserver_security.py with 78 tests covering valid commands,
  blocked shells, blocked commands, injection attempts, dangerous_ops
  blocking, and constant verification

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-05 08:52:47 +02:00
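The metacharacter blocklist described above can be sketched as a substring check. This is only the `dangerous_ops` portion; the real `validate_custom_command_strict` also applies runner and script allowlists:

```python
# Shell metacharacters that must never appear in a custom dev command.
# Single & was the confirmed RCE bypass on Windows, where .cmd files run
# via cmd.exe even with shell=False (CPython issue #77696).
DANGEROUS_OPS = ["&&", "||", ";", "|", "&", ">", "<", "^", "%", "`", "$(", "\n", "\r"]

def has_dangerous_ops(command: str) -> bool:
    return any(op in command for op in DANGEROUS_OPS)

assert has_dangerous_ops("npm run dev & calc.exe")   # cmd.exe command separator
assert has_dangerous_ops("npm run dev > out.txt")    # redirection
assert not has_dangerous_ops("npm run dev")
```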
Leon van Zyl
75766a433a Merge pull request #153 from syphonetic/master
Implemented RCE mitigation measures
2026-02-05 08:31:28 +02:00
Leon van Zyl
ee993ed8ed Merge pull request #158 from Mediainvita/fix/temp-cleanup
fix: add automatic temp folder cleanup at Maestro startup
2026-02-05 08:20:23 +02:00
Manuel Fischer
a3b0abdc31 fix: add automatic temp folder cleanup at Maestro startup
Problem:
When AutoForge runs agents that use Playwright for browser testing or
mongodb-memory-server for database tests, temporary files accumulate in
the system temp folder (%TEMP% on Windows, /tmp on Linux/macOS). These
files are never cleaned up automatically and can consume hundreds of GB
over time.

Affected temp items:
- playwright_firefoxdev_profile-* (browser profiles)
- playwright-artifacts-* (test artifacts)
- playwright-transform-cache
- mongodb-memory-server* (MongoDB binaries)
- ng-* (Angular CLI temp)
- scoped_dir* (Chrome/Chromium temp)
- .78912*.node (Node.js native module cache, ~7MB each)
- claude-*-cwd (Claude CLI working directory files)
- mat-debug-*.log (Material/Angular debug logs)

Solution:
- New temp_cleanup.py module with cleanup_stale_temp() function
- Called at Maestro (orchestrator) startup in autonomous_agent_demo.py
- Only deletes files/folders older than 1 hour (safe for running processes)
- Runs every time the Play button is clicked or agent auto-restarts
- Reports cleanup stats: dirs deleted, files deleted, MB freed

Why cleanup at Maestro startup:
- Reliable hook point (runs on every agent start, including auto-restart
  after rate limits which happens every ~5 hours)
- No need for background timers or scheduled tasks
- Cleanup happens before new temp files are created

Testing:
- Tested on Windows with 958 items in temp folder
- Successfully cleaned 45 dirs, 758 files, freed 415 MB
- Files younger than 1 hour correctly preserved

Closes #155

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-05 00:08:26 +01:00
Auto
326f38b3c4 version patch 2026-02-04 15:41:15 +02:00
syphonetic
81d2f0cbe0 Merge branch 'master' into master 2026-02-04 05:50:35 +08:00
syphonetic
c7c88449ad Remove unused dev server management functions
Removed unused functions and endpoints related to dev server management, including command validation and configuration updates.
2026-02-04 02:34:29 +08:00
syphonetic
9622da9561 Remove unnecessary blank line in dev_server_manager.py 2026-02-04 02:34:06 +08:00
syphonetic
83d2182107 Refactor dev server API for security and validation
Refactor dev server API to enhance security and command validation. Added logging and improved command handling.
2026-02-04 02:19:19 +08:00
syphonetic
7651436c27 Refactor dev server command execution and locking
Refactor dev server management to improve command execution and security checks. Introduce lock file handling and command validation enhancements.
2026-02-04 02:18:55 +08:00
57 changed files with 4048 additions and 471 deletions


@@ -90,13 +90,13 @@ Use browser automation tools:
 - Navigate to the app in a real browser
 - Interact like a human user (click, type, scroll)
-- Take screenshots at each step
+- Take screenshots at each step (use inline screenshots only -- do NOT save screenshot files to disk)
 - Verify both functionality AND visual appearance

 **DO:**
 - Test through the UI with clicks and keyboard input
-- Take screenshots to verify visual appearance
+- Take screenshots to verify visual appearance (inline only, never save to disk)
 - Check for console errors in browser
 - Verify complete user workflows end-to-end
@@ -194,6 +194,8 @@ Before context fills up:
 Use Playwright MCP tools (`browser_*`) for UI verification. Key tools: `navigate`, `click`, `type`, `fill_form`, `take_screenshot`, `console_messages`, `network_requests`. All tools have auto-wait built in.

+**Screenshot rule:** Always use inline mode (base64). NEVER save screenshots as files to disk.
+
 Test like a human user with mouse and keyboard. Use `browser_console_messages` to detect errors. Don't bypass UI with JavaScript evaluation.

 ---


@@ -31,14 +31,14 @@ For the feature returned:
 1. Read and understand the feature's verification steps
 2. Navigate to the relevant part of the application
 3. Execute each verification step using browser automation
-4. Take screenshots to document the verification
+4. Take screenshots to document the verification (inline only -- do NOT save to disk)
 5. Check for console errors

 Use browser automation tools:

 **Navigation & Screenshots:**
 - browser_navigate - Navigate to a URL
-- browser_take_screenshot - Capture screenshot (use for visual verification)
+- browser_take_screenshot - Capture screenshot (inline mode only -- never save to disk)
 - browser_snapshot - Get accessibility tree snapshot

 **Element Interaction:**
@@ -79,7 +79,7 @@ A regression has been introduced. You MUST fix it:
 4. **Verify the fix:**
    - Run through all verification steps again
-   - Take screenshots confirming the fix
+   - Take screenshots confirming the fix (inline only, never save to disk)

 5. **Mark as passing after fix:**
 ```
@@ -110,7 +110,7 @@ A regression has been introduced. You MUST fix it:
 All interaction tools have **built-in auto-wait** -- no manual timeouts needed.

 - `browser_navigate` - Navigate to URL
-- `browser_take_screenshot` - Capture screenshot
+- `browser_take_screenshot` - Capture screenshot (inline only, never save to disk)
 - `browser_snapshot` - Get accessibility tree
 - `browser_click` - Click elements
 - `browser_type` - Type text


@@ -9,11 +9,6 @@
 # - webkit: Safari engine
 # - msedge: Microsoft Edge
 # PLAYWRIGHT_BROWSER=firefox
-#
-# PLAYWRIGHT_HEADLESS: Run browser without visible window
-# - true: Browser runs in background, saves CPU (default)
-# - false: Browser opens a visible window (useful for debugging)
-# PLAYWRIGHT_HEADLESS=true

 # Extra Read Paths (Optional)
 # Comma-separated list of absolute paths for read-only access to external directories.
@@ -25,40 +20,37 @@
 # Google Cloud Vertex AI Configuration (Optional)
 # To use Claude via Vertex AI on Google Cloud Platform, uncomment and set these variables.
 # Requires: gcloud CLI installed and authenticated (run: gcloud auth application-default login)
-# Note: Use @ instead of - in model names (e.g., claude-opus-4-5@20251101)
+# Note: Use @ instead of - in model names for date-suffixed models (e.g., claude-sonnet-4-5@20250929)
 #
 # CLAUDE_CODE_USE_VERTEX=1
 # CLOUD_ML_REGION=us-east5
 # ANTHROPIC_VERTEX_PROJECT_ID=your-gcp-project-id
-# ANTHROPIC_DEFAULT_OPUS_MODEL=claude-opus-4-5@20251101
+# ANTHROPIC_DEFAULT_OPUS_MODEL=claude-opus-4-6
 # ANTHROPIC_DEFAULT_SONNET_MODEL=claude-sonnet-4-5@20250929
 # ANTHROPIC_DEFAULT_HAIKU_MODEL=claude-3-5-haiku@20241022

-# GLM/Alternative API Configuration (Optional)
-# To use Zhipu AI's GLM models instead of Claude, uncomment and set these variables.
-# This only affects AutoForge - your global Claude Code settings remain unchanged.
-# Get an API key at: https://z.ai/subscribe
+# ===================
+# Alternative API Providers (GLM, Ollama, Kimi, Custom)
+# ===================
+# Configure via Settings UI (recommended) or set env vars below.
+# When both are set, env vars take precedence.
 #
+# GLM (Zhipu AI):
 # ANTHROPIC_BASE_URL=https://api.z.ai/api/anthropic
-# ANTHROPIC_AUTH_TOKEN=your-zhipu-api-key
-# API_TIMEOUT_MS=3000000
-# ANTHROPIC_DEFAULT_SONNET_MODEL=glm-4.7
+# ANTHROPIC_AUTH_TOKEN=your-glm-api-key
 # ANTHROPIC_DEFAULT_OPUS_MODEL=glm-4.7
-# ANTHROPIC_DEFAULT_HAIKU_MODEL=glm-4.5-air
-
-# Ollama Local Model Configuration (Optional)
-# To use local models via Ollama instead of Claude, uncomment and set these variables.
-# Requires Ollama v0.14.0+ with Anthropic API compatibility.
-# See: https://ollama.com/blog/claude
+# ANTHROPIC_DEFAULT_SONNET_MODEL=glm-4.7
+# ANTHROPIC_DEFAULT_HAIKU_MODEL=glm-4.7
+#
+# Ollama (Local):
 # ANTHROPIC_BASE_URL=http://localhost:11434
 # ANTHROPIC_AUTH_TOKEN=ollama
 # API_TIMEOUT_MS=3000000
-# ANTHROPIC_DEFAULT_SONNET_MODEL=qwen3-coder
 # ANTHROPIC_DEFAULT_OPUS_MODEL=qwen3-coder
+# ANTHROPIC_DEFAULT_SONNET_MODEL=qwen3-coder
 # ANTHROPIC_DEFAULT_HAIKU_MODEL=qwen3-coder
 #
-# Model recommendations:
-# - For best results, use a capable coding model like qwen3-coder or deepseek-coder-v2
-# - You can use the same model for all tiers, or different models per tier
-# - Larger models (70B+) work best for Opus tier, smaller (7B-20B) for Haiku
+# Kimi (Moonshot):
+# ANTHROPIC_BASE_URL=https://api.kimi.com/coding/
+# ANTHROPIC_API_KEY=your-kimi-api-key
+# ANTHROPIC_DEFAULT_OPUS_MODEL=kimi-k2.5
+# ANTHROPIC_DEFAULT_SONNET_MODEL=kimi-k2.5
+# ANTHROPIC_DEFAULT_HAIKU_MODEL=kimi-k2.5


@@ -408,44 +408,23 @@ Run coding agents via Google Cloud Vertex AI:
 CLAUDE_CODE_USE_VERTEX=1
 CLOUD_ML_REGION=us-east5
 ANTHROPIC_VERTEX_PROJECT_ID=your-gcp-project-id
-ANTHROPIC_DEFAULT_OPUS_MODEL=claude-opus-4-5@20251101
+ANTHROPIC_DEFAULT_OPUS_MODEL=claude-opus-4-6
 ANTHROPIC_DEFAULT_SONNET_MODEL=claude-sonnet-4-5@20250929
 ANTHROPIC_DEFAULT_HAIKU_MODEL=claude-3-5-haiku@20241022
 ```

 **Note:** Use `@` instead of `-` in model names for Vertex AI.

-### Ollama Local Models (Optional)
+### Alternative API Providers (GLM, Ollama, Kimi, Custom)

-Run coding agents using local models via Ollama v0.14.0+:
+Alternative providers are configured via the **Settings UI** (gear icon > API Provider section). Select a provider, set the base URL, auth token, and model — no `.env` changes needed.

-1. Install Ollama: https://ollama.com
-2. Start Ollama: `ollama serve`
-3. Pull a coding model: `ollama pull qwen3-coder`
-4. Configure `.env`:
-```
-ANTHROPIC_BASE_URL=http://localhost:11434
-ANTHROPIC_AUTH_TOKEN=ollama
-API_TIMEOUT_MS=3000000
-ANTHROPIC_DEFAULT_SONNET_MODEL=qwen3-coder
-ANTHROPIC_DEFAULT_OPUS_MODEL=qwen3-coder
-ANTHROPIC_DEFAULT_HAIKU_MODEL=qwen3-coder
-```
-5. Run AutoForge normally - it will use your local Ollama models
+**Available providers:** Claude (default), GLM (Zhipu AI), Ollama (local models), Kimi (Moonshot), Custom

-**Recommended coding models:**
-- `qwen3-coder` - Good balance of speed and capability
-- `deepseek-coder-v2` - Strong coding performance
-- `codellama` - Meta's code-focused model
-
-**Model tier mapping:**
-- Use the same model for all tiers, or map different models per capability level
-- Larger models (70B+) work best for Opus tier
-- Smaller models (7B-20B) work well for Haiku tier
-
-**Known limitations:**
-- Smaller context windows than Claude (model-dependent)
-- Extended context beta disabled (not supported by Ollama)
+**Ollama notes:**
+- Requires Ollama v0.14.0+ with Anthropic API compatibility
+- Install: https://ollama.com → `ollama serve` → `ollama pull qwen3-coder`
+- Recommended models: `qwen3-coder`, `deepseek-coder-v2`, `codellama`
+- Performance depends on local hardware (GPU recommended)

 ## Claude Code Integration


@@ -6,9 +6,9 @@ A long-running autonomous coding agent powered by the Claude Agent SDK. This too
 ## Video Tutorial

-[![Watch the tutorial](https://img.youtube.com/vi/lGWFlpffWk4/hqdefault.jpg)](https://youtu.be/lGWFlpffWk4)
+[![Watch the tutorial](https://img.youtube.com/vi/nKiPOxDpcJY/hqdefault.jpg)](https://youtu.be/nKiPOxDpcJY)

-> **[Watch the setup and usage guide →](https://youtu.be/lGWFlpffWk4)**
+> **[Watch the setup and usage guide →](https://youtu.be/nKiPOxDpcJY)**

 ---
@@ -326,37 +326,13 @@ When test progress increases, the agent sends:
 }
 ```

-### Using GLM Models (Alternative to Claude)
+### Alternative API Providers (GLM, Ollama, Kimi, Custom)

-Add these variables to your `.env` file to use Zhipu AI's GLM models:
+Alternative providers are configured via the **Settings UI** (gear icon > API Provider). Select your provider, set the base URL, auth token, and model directly in the UI — no `.env` changes needed.

-```bash
-ANTHROPIC_BASE_URL=https://api.z.ai/api/anthropic
-ANTHROPIC_AUTH_TOKEN=your-zhipu-api-key
-API_TIMEOUT_MS=3000000
-ANTHROPIC_DEFAULT_SONNET_MODEL=glm-4.7
-ANTHROPIC_DEFAULT_OPUS_MODEL=glm-4.7
-ANTHROPIC_DEFAULT_HAIKU_MODEL=glm-4.5-air
-```
+Available providers: **Claude** (default), **GLM** (Zhipu AI), **Ollama** (local models), **Kimi** (Moonshot), **Custom**

-This routes AutoForge's API requests through Zhipu's Claude-compatible API, allowing you to use GLM-4.7 and other models. **This only affects AutoForge** - your global Claude Code settings remain unchanged.
-Get an API key at: https://z.ai/subscribe
-
-### Using Ollama Local Models
-
-Add these variables to your `.env` file to run agents with local models via Ollama v0.14.0+:
-
-```bash
-ANTHROPIC_BASE_URL=http://localhost:11434
-ANTHROPIC_AUTH_TOKEN=ollama
-API_TIMEOUT_MS=3000000
-ANTHROPIC_DEFAULT_SONNET_MODEL=qwen3-coder
-ANTHROPIC_DEFAULT_OPUS_MODEL=qwen3-coder
-ANTHROPIC_DEFAULT_HAIKU_MODEL=qwen3-coder
-```
-See the [CLAUDE.md](CLAUDE.md) for recommended models and known limitations.
+For Ollama, install [Ollama v0.14.0+](https://ollama.com), run `ollama serve`, and pull a coding model (e.g., `ollama pull qwen3-coder`). Then select "Ollama" in the Settings UI.

 ### Using Vertex AI
@@ -366,7 +342,7 @@ Add these variables to your `.env` file to run agents via Google Cloud Vertex AI
 CLAUDE_CODE_USE_VERTEX=1
 CLOUD_ML_REGION=us-east5
 ANTHROPIC_VERTEX_PROJECT_ID=your-gcp-project-id
-ANTHROPIC_DEFAULT_OPUS_MODEL=claude-opus-4-5@20251101
+ANTHROPIC_DEFAULT_OPUS_MODEL=claude-opus-4-6
 ANTHROPIC_DEFAULT_SONNET_MODEL=claude-sonnet-4-5@20250929
 ANTHROPIC_DEFAULT_HAIKU_MODEL=claude-3-5-haiku@20241022
 ```


@@ -44,8 +44,10 @@ from dotenv import load_dotenv
 # IMPORTANT: Must be called BEFORE importing other modules that read env vars at load time
 load_dotenv()

+import os
+
 from agent import run_autonomous_agent
-from registry import DEFAULT_MODEL, get_project_path
+from registry import DEFAULT_MODEL, get_effective_sdk_env, get_project_path


 def parse_args() -> argparse.Namespace:
@@ -195,6 +197,14 @@ def main() -> None:
     # Note: Authentication is handled by start.bat/start.sh before this script runs.
     # The Claude SDK auto-detects credentials from ~/.claude/.credentials.json

+    # Apply UI-configured provider settings to this process's environment.
+    # This ensures CLI-launched agents respect Settings UI provider config (GLM, Ollama, etc.).
+    # Uses setdefault so explicit env vars / .env file take precedence.
+    sdk_overrides = get_effective_sdk_env()
+    for key, value in sdk_overrides.items():
+        if value:  # Only set non-empty values (empty values are used to clear conflicts)
+            os.environ.setdefault(key, value)
+
     # Handle deprecated --parallel flag
     if args.parallel is not None:
         print("WARNING: --parallel is deprecated. Use --concurrency instead.", flush=True)
@@ -263,6 +273,17 @@ def main() -> None:
         )
     else:
         # Entry point mode - always use unified orchestrator
+        # Clean up stale temp files before starting (prevents temp folder bloat)
+        from temp_cleanup import cleanup_stale_temp
+        cleanup_stats = cleanup_stale_temp()
+        if cleanup_stats["dirs_deleted"] > 0 or cleanup_stats["files_deleted"] > 0:
+            mb_freed = cleanup_stats["bytes_freed"] / (1024 * 1024)
+            print(
+                f"[CLEANUP] Removed {cleanup_stats['dirs_deleted']} dirs, "
+                f"{cleanup_stats['files_deleted']} files ({mb_freed:.1f} MB freed)",
+                flush=True,
+            )
+
         from parallel_orchestrator import run_parallel_orchestrator

 # Clamp concurrency to valid range (1-5)

bin/autoforge.js Normal file → Executable file

@@ -16,7 +16,6 @@ from claude_agent_sdk import ClaudeAgentOptions, ClaudeSDKClient
from claude_agent_sdk.types import HookContext, HookInput, HookMatcher, SyncHookJSONOutput
from dotenv import load_dotenv
from env_constants import API_ENV_VARS
from security import SENSITIVE_DIRECTORIES, bash_security_hook
# Load environment variables from .env file if present
@@ -46,8 +45,9 @@ def convert_model_for_vertex(model: str) -> str:
"""
Convert model name format for Vertex AI compatibility.
Vertex AI uses @ to separate model name from version (e.g., claude-opus-4-5@20251101)
while the Anthropic API uses - (e.g., claude-opus-4-5-20251101).
Vertex AI uses @ to separate model name from version (e.g., claude-sonnet-4-5@20250929)
while the Anthropic API uses - (e.g., claude-sonnet-4-5-20250929).
Models without a date suffix (e.g., claude-opus-4-6) pass through unchanged.
Args:
model: Model name in Anthropic format (with hyphens)
@@ -61,7 +61,7 @@ def convert_model_for_vertex(model: str) -> str:
return model
# Pattern: claude-{name}-{version}-{date} -> claude-{name}-{version}@{date}
# Example: claude-opus-4-5-20251101 -> claude-opus-4-5@20251101
# Example: claude-sonnet-4-5-20250929 -> claude-sonnet-4-5@20250929
# The date is always 8 digits at the end
match = re.match(r'^(claude-.+)-(\d{8})$', model)
if match:
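For reference, the dated-suffix conversion can be reproduced standalone (a sketch consistent with the docstring above; the exact return expression is inferred, since the hunk is truncated, and `to_vertex` is a name introduced here):

```python
import re

def to_vertex(model: str) -> str:
    # claude-{name}-{version}-{YYYYMMDD} -> claude-{name}-{version}@{YYYYMMDD}
    match = re.match(r'^(claude-.+)-(\d{8})$', model)
    if match:
        return f"{match.group(1)}@{match.group(2)}"
    return model  # no 8-digit date suffix (e.g. claude-opus-4-6): pass through unchanged
```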
@@ -446,17 +446,17 @@ def create_client(
mcp_servers["playwright"] = {
"command": "npx",
"args": playwright_args,
"env": {
"NODE_COMPILE_CACHE": "", # Disable V8 compile caching to prevent .node file accumulation in %TEMP%
},
}
# Build environment overrides for API endpoint configuration
# These override system env vars for the Claude CLI subprocess,
# allowing AutoForge to use alternative APIs (e.g., GLM) without
# affecting the user's global Claude Code settings
sdk_env = {}
for var in API_ENV_VARS:
value = os.getenv(var)
if value:
sdk_env[var] = value
# Uses get_effective_sdk_env() which reads provider settings from the database,
# ensuring UI-configured alternative providers (GLM, Ollama, Kimi, Custom) propagate
# correctly to the Claude CLI subprocess
from registry import get_effective_sdk_env
sdk_env = get_effective_sdk_env()
# Detect alternative API mode (Ollama, GLM, or Vertex AI)
base_url = sdk_env.get("ANTHROPIC_BASE_URL", "")


@@ -15,6 +15,7 @@ API_ENV_VARS: list[str] = [
# Core API configuration
"ANTHROPIC_BASE_URL", # Custom API endpoint (e.g., https://api.z.ai/api/anthropic)
"ANTHROPIC_AUTH_TOKEN", # API authentication token
"ANTHROPIC_API_KEY", # API key (used by Kimi and other providers)
"API_TIMEOUT_MS", # Request timeout in milliseconds
# Model tier overrides
"ANTHROPIC_DEFAULT_SONNET_MODEL", # Model override for Sonnet


@@ -984,5 +984,35 @@ def feature_set_dependencies(
return json.dumps({"error": f"Failed to set dependencies: {str(e)}"})
@mcp.tool()
def ask_user(
questions: Annotated[list[dict], Field(description="List of questions to ask, each with question, header, options (list of {label, description}), and multiSelect (bool)")]
) -> str:
"""Ask the user structured questions with selectable options.
Use this when you need clarification or want to offer choices to the user.
Each question has a short header, the question text, and 2-4 clickable options.
The user's selections will be returned as your next message.
Args:
questions: List of questions, each with:
- question (str): The question to ask
- header (str): Short label (max 12 chars)
- options (list): Each with label (str) and description (str)
- multiSelect (bool): Allow multiple selections (default false)
Returns:
Acknowledgment that questions were presented to the user
"""
# Validate input
for i, q in enumerate(questions):
if not all(key in q for key in ["question", "header", "options"]):
return json.dumps({"error": f"Question at index {i} missing required fields"})
if len(q["options"]) < 2 or len(q["options"]) > 4:
return json.dumps({"error": f"Question at index {i} must have 2-4 options"})
return "Questions presented to the user. Their response will arrive as your next message."
if __name__ == "__main__":
mcp.run()
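A payload that satisfies the validation above might look like this (hypothetical content; the shape is taken directly from the `ask_user` docstring):

```python
questions = [
    {
        "question": "Which database should the app use?",
        "header": "Database",  # short label (max 12 chars)
        "options": [
            {"label": "SQLite", "description": "Zero-config, file-based"},
            {"label": "PostgreSQL", "description": "Full-featured server"},
        ],
        "multiSelect": False,
    },
]

# The same checks ask_user() performs before acknowledging:
for i, q in enumerate(questions):
    assert all(key in q for key in ["question", "header", "options"]), f"question {i} incomplete"
    assert 2 <= len(q["options"]) <= 4, f"question {i} must have 2-4 options"
```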


@@ -1,6 +1,6 @@
{
"name": "autoforge-ai",
"version": "0.1.1",
"version": "0.1.10",
"description": "Autonomous coding agent with web UI - build complete apps with AI",
"license": "AGPL-3.0",
"bin": {
@@ -34,6 +34,7 @@
"registry.py",
"rate_limit_utils.py",
"security.py",
"temp_cleanup.py",
"requirements-prod.txt",
"pyproject.toml",
".env.example",


@@ -846,7 +846,7 @@ class ParallelOrchestrator:
"encoding": "utf-8",
"errors": "replace",
"cwd": str(self.project_dir), # Run from project dir so CLI creates .claude/ in project
"env": {**os.environ, "PYTHONUNBUFFERED": "1"},
"env": {**os.environ, "PYTHONUNBUFFERED": "1", "NODE_COMPILE_CACHE": ""},
}
if sys.platform == "win32":
popen_kwargs["creationflags"] = subprocess.CREATE_NO_WINDOW
@@ -909,7 +909,7 @@ class ParallelOrchestrator:
"encoding": "utf-8",
"errors": "replace",
"cwd": str(self.project_dir), # Run from project dir so CLI creates .claude/ in project
"env": {**os.environ, "PYTHONUNBUFFERED": "1"},
"env": {**os.environ, "PYTHONUNBUFFERED": "1", "NODE_COMPILE_CACHE": ""},
}
if sys.platform == "win32":
popen_kwargs["creationflags"] = subprocess.CREATE_NO_WINDOW
@@ -1013,7 +1013,7 @@ class ParallelOrchestrator:
"encoding": "utf-8",
"errors": "replace",
"cwd": str(self.project_dir), # Run from project dir so CLI creates .claude/ in project
"env": {**os.environ, "PYTHONUNBUFFERED": "1"},
"env": {**os.environ, "PYTHONUNBUFFERED": "1", "NODE_COMPILE_CACHE": ""},
}
if sys.platform == "win32":
popen_kwargs["creationflags"] = subprocess.CREATE_NO_WINDOW
@@ -1074,7 +1074,7 @@ class ParallelOrchestrator:
"encoding": "utf-8",
"errors": "replace",
"cwd": str(AUTOFORGE_ROOT),
"env": {**os.environ, "PYTHONUNBUFFERED": "1"},
"env": {**os.environ, "PYTHONUNBUFFERED": "1", "NODE_COMPILE_CACHE": ""},
}
if sys.platform == "win32":
popen_kwargs["creationflags"] = subprocess.CREATE_NO_WINDOW
@@ -1160,6 +1160,19 @@ class ParallelOrchestrator:
debug_log.log("CLEANUP", f"Error killing process tree for {agent_type} agent", error=str(e))
self._on_agent_complete(feature_id, proc.returncode, agent_type, proc)
def _run_inter_session_cleanup(self):
"""Run lightweight cleanup between agent sessions.
Removes stale temp files and project screenshots to prevent
disk space accumulation during long overnight runs.
"""
try:
from temp_cleanup import cleanup_project_screenshots, cleanup_stale_temp
cleanup_stale_temp()
cleanup_project_screenshots(self.project_dir)
except Exception as e:
debug_log.log("CLEANUP", f"Inter-session cleanup failed (non-fatal): {e}")
def _signal_agent_completed(self):
"""Signal that an agent has completed, waking the main loop.
@@ -1235,6 +1248,8 @@ class ParallelOrchestrator:
pid=proc.pid,
feature_id=feature_id,
status=status)
# Run lightweight cleanup between sessions
self._run_inter_session_cleanup()
# Signal main loop that an agent slot is available
self._signal_agent_completed()
return
@@ -1301,6 +1316,8 @@ class ParallelOrchestrator:
else:
print(f"Feature #{feature_id} {status}", flush=True)
# Run lightweight cleanup between sessions
self._run_inter_session_cleanup()
# Signal main loop that an agent slot is available
self._signal_agent_completed()
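The repeated `env` construction in these hunks relies on dict-unpacking order: keys listed after `**os.environ` win, so the subprocess sees the override without the server's own environment being mutated. A minimal illustration:

```python
import os

# Pretend the parent process had V8 compile caching pointed somewhere.
os.environ["NODE_COMPILE_CACHE"] = "/tmp/v8-cache"

# Later keys override anything inherited from os.environ.
env = {**os.environ, "PYTHONUNBUFFERED": "1", "NODE_COMPILE_CACHE": ""}

assert env["NODE_COMPILE_CACHE"] == ""                       # cleared for the subprocess
assert env["PYTHONUNBUFFERED"] == "1"
assert os.environ["NODE_COMPILE_CACHE"] == "/tmp/v8-cache"   # parent env untouched
```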


@@ -46,10 +46,16 @@ def _migrate_registry_dir() -> None:
# Available models with display names
# To add a new model: add an entry here with {"id": "model-id", "name": "Display Name"}
AVAILABLE_MODELS = [
{"id": "claude-opus-4-5-20251101", "name": "Claude Opus 4.5"},
{"id": "claude-sonnet-4-5-20250929", "name": "Claude Sonnet 4.5"},
{"id": "claude-opus-4-6", "name": "Claude Opus"},
{"id": "claude-sonnet-4-5-20250929", "name": "Claude Sonnet"},
]
# Map legacy model IDs to their current replacements.
# Used by get_all_settings() to auto-migrate stale values on first read after upgrade.
LEGACY_MODEL_MAP = {
"claude-opus-4-5-20251101": "claude-opus-4-6",
}
# List of valid model IDs (derived from AVAILABLE_MODELS)
VALID_MODELS = [m["id"] for m in AVAILABLE_MODELS]
@@ -59,7 +65,7 @@ VALID_MODELS = [m["id"] for m in AVAILABLE_MODELS]
_env_default_model = os.getenv("ANTHROPIC_DEFAULT_OPUS_MODEL")
if _env_default_model is not None:
_env_default_model = _env_default_model.strip()
DEFAULT_MODEL = _env_default_model or "claude-opus-4-5-20251101"
DEFAULT_MODEL = _env_default_model or "claude-opus-4-6"
# Ensure env-provided DEFAULT_MODEL is in VALID_MODELS for validation consistency
# (idempotent: only adds if missing, doesn't alter AVAILABLE_MODELS semantics)
@@ -598,6 +604,9 @@ def get_all_settings() -> dict[str, str]:
"""
Get all settings as a dictionary.
Automatically migrates legacy model IDs (e.g. claude-opus-4-5-20251101 -> claude-opus-4-6)
on first read after upgrade. This is a one-time silent migration.
Returns:
Dictionary mapping setting keys to values.
"""
@@ -606,9 +615,159 @@ def get_all_settings() -> dict[str, str]:
session = SessionLocal()
try:
settings = session.query(Settings).all()
return {s.key: s.value for s in settings}
result = {s.key: s.value for s in settings}
# Auto-migrate legacy model IDs
migrated = False
for key in ("model", "api_model"):
old_id = result.get(key)
if old_id and old_id in LEGACY_MODEL_MAP:
new_id = LEGACY_MODEL_MAP[old_id]
setting = session.query(Settings).filter(Settings.key == key).first()
if setting:
setting.value = new_id
setting.updated_at = datetime.now()
result[key] = new_id
migrated = True
logger.info("Migrated setting '%s': %s -> %s", key, old_id, new_id)
if migrated:
session.commit()
return result
finally:
session.close()
except Exception as e:
logger.warning("Failed to read settings: %s", e)
return {}
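Stripped of the SQLAlchemy plumbing, the one-time migration is a straight map lookup over the two model keys (sketch only; `migrate` is a name introduced here for illustration):

```python
# Mirrors LEGACY_MODEL_MAP and the loop in get_all_settings().
LEGACY_MODEL_MAP = {"claude-opus-4-5-20251101": "claude-opus-4-6"}

def migrate(settings: dict) -> dict:
    for key in ("model", "api_model"):
        old_id = settings.get(key)
        if old_id in LEGACY_MODEL_MAP:
            settings[key] = LEGACY_MODEL_MAP[old_id]
    return settings
```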
# =============================================================================
# API Provider Definitions
# =============================================================================
API_PROVIDERS: dict[str, dict[str, Any]] = {
"claude": {
"name": "Claude (Anthropic)",
"base_url": None,
"requires_auth": False,
"models": [
{"id": "claude-opus-4-6", "name": "Claude Opus"},
{"id": "claude-sonnet-4-5-20250929", "name": "Claude Sonnet"},
],
"default_model": "claude-opus-4-6",
},
"kimi": {
"name": "Kimi K2.5 (Moonshot)",
"base_url": "https://api.kimi.com/coding/",
"requires_auth": True,
"auth_env_var": "ANTHROPIC_API_KEY",
"models": [{"id": "kimi-k2.5", "name": "Kimi K2.5"}],
"default_model": "kimi-k2.5",
},
"glm": {
"name": "GLM (Zhipu AI)",
"base_url": "https://api.z.ai/api/anthropic",
"requires_auth": True,
"auth_env_var": "ANTHROPIC_AUTH_TOKEN",
"models": [
{"id": "glm-4.7", "name": "GLM 4.7"},
{"id": "glm-4.5-air", "name": "GLM 4.5 Air"},
],
"default_model": "glm-4.7",
},
"ollama": {
"name": "Ollama (Local)",
"base_url": "http://localhost:11434",
"requires_auth": False,
"models": [
{"id": "qwen3-coder", "name": "Qwen3 Coder"},
{"id": "deepseek-coder-v2", "name": "DeepSeek Coder V2"},
],
"default_model": "qwen3-coder",
},
"custom": {
"name": "Custom Provider",
"base_url": "",
"requires_auth": True,
"auth_env_var": "ANTHROPIC_AUTH_TOKEN",
"models": [],
"default_model": "",
},
}
def get_effective_sdk_env() -> dict[str, str]:
"""Build environment variable dict for Claude SDK based on current API provider settings.
When api_provider is "claude" (or unset), falls back to existing env vars (current behavior).
For other providers, builds env dict from stored settings (api_base_url, api_auth_token, api_model).
Returns:
Dict ready to merge into subprocess env or pass to SDK.
"""
all_settings = get_all_settings()
provider_id = all_settings.get("api_provider", "claude")
if provider_id == "claude":
# Default behavior: forward existing env vars
from env_constants import API_ENV_VARS
sdk_env: dict[str, str] = {}
for var in API_ENV_VARS:
value = os.getenv(var)
if value:
sdk_env[var] = value
return sdk_env
# Alternative provider: build env from settings
provider = API_PROVIDERS.get(provider_id)
if not provider:
logger.warning("Unknown API provider '%s', falling back to claude", provider_id)
from env_constants import API_ENV_VARS
sdk_env = {}
for var in API_ENV_VARS:
value = os.getenv(var)
if value:
sdk_env[var] = value
return sdk_env
sdk_env: dict[str, str] = {}
# Explicitly clear credentials that could leak from the server process env.
# For providers using ANTHROPIC_AUTH_TOKEN (GLM, Custom), clear ANTHROPIC_API_KEY.
# For providers using ANTHROPIC_API_KEY (Kimi), clear ANTHROPIC_AUTH_TOKEN.
# This prevents the Claude CLI from using the wrong credentials.
auth_env_var = provider.get("auth_env_var", "ANTHROPIC_AUTH_TOKEN")
if auth_env_var == "ANTHROPIC_AUTH_TOKEN":
sdk_env["ANTHROPIC_API_KEY"] = ""
elif auth_env_var == "ANTHROPIC_API_KEY":
sdk_env["ANTHROPIC_AUTH_TOKEN"] = ""
# Clear Vertex AI vars when using non-Vertex alternative providers
sdk_env["CLAUDE_CODE_USE_VERTEX"] = ""
sdk_env["CLOUD_ML_REGION"] = ""
sdk_env["ANTHROPIC_VERTEX_PROJECT_ID"] = ""
# Base URL
base_url = all_settings.get("api_base_url") or provider.get("base_url")
if base_url:
sdk_env["ANTHROPIC_BASE_URL"] = base_url
# Auth token
auth_token = all_settings.get("api_auth_token")
if auth_token:
sdk_env[auth_env_var] = auth_token
# Model - set all three tier overrides to the same model
model = all_settings.get("api_model") or provider.get("default_model")
if model:
sdk_env["ANTHROPIC_DEFAULT_OPUS_MODEL"] = model
sdk_env["ANTHROPIC_DEFAULT_SONNET_MODEL"] = model
sdk_env["ANTHROPIC_DEFAULT_HAIKU_MODEL"] = model
# Timeout
timeout = all_settings.get("api_timeout_ms")
if timeout:
sdk_env["API_TIMEOUT_MS"] = timeout
return sdk_env
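The credential-clearing rule is the subtle part: whichever auth variable the provider does *not* use is explicitly set to the empty string, so it overrides rather than inherits any stale value from the server process env. A trimmed sketch (`build_auth_env` is illustrative, not the real function):

```python
def build_auth_env(auth_env_var: str, token: str) -> dict[str, str]:
    env: dict[str, str] = {}
    # Clear the competing credential so the Claude CLI cannot fall back to it.
    if auth_env_var == "ANTHROPIC_AUTH_TOKEN":
        env["ANTHROPIC_API_KEY"] = ""
    elif auth_env_var == "ANTHROPIC_API_KEY":
        env["ANTHROPIC_AUTH_TOKEN"] = ""
    env[auth_env_var] = token
    return env

glm_env = build_auth_env("ANTHROPIC_AUTH_TOKEN", "glm-token")   # GLM / Custom
kimi_env = build_auth_env("ANTHROPIC_API_KEY", "kimi-key")      # Kimi
```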


@@ -61,6 +61,17 @@ UI_DIST_DIR = ROOT_DIR / "ui" / "dist"
@asynccontextmanager
async def lifespan(app: FastAPI):
"""Lifespan context manager for startup and shutdown."""
# Startup - clean up stale temp files (Playwright profiles, .node cache, etc.)
try:
from temp_cleanup import cleanup_stale_temp
stats = cleanup_stale_temp()
if stats["dirs_deleted"] > 0 or stats["files_deleted"] > 0:
mb_freed = stats["bytes_freed"] / (1024 * 1024)
logger.info("Startup temp cleanup: %d dirs, %d files, %.1f MB freed",
stats["dirs_deleted"], stats["files_deleted"], mb_freed)
except Exception as e:
logger.warning("Startup temp cleanup failed (non-fatal): %s", e)
# Startup - clean up orphaned lock files from previous runs
cleanup_orphaned_locks()
cleanup_orphaned_devserver_locks()


@@ -32,7 +32,7 @@ def _get_settings_defaults() -> tuple[bool, str, int, bool, int]:
settings = get_all_settings()
yolo_mode = (settings.get("yolo_mode") or "false").lower() == "true"
model = settings.get("model", DEFAULT_MODEL)
model = settings.get("api_model") or settings.get("model", DEFAULT_MODEL)
# Parse testing agent settings with defaults
try:


@@ -26,7 +26,7 @@ from ..services.assistant_database import (
get_conversations,
)
from ..utils.project_helpers import get_project_path as _get_project_path
from ..utils.validation import is_valid_project_name as validate_project_name
from ..utils.validation import validate_project_name
logger = logging.getLogger(__name__)
@@ -207,30 +207,38 @@ async def assistant_chat_websocket(websocket: WebSocket, project_name: str):
Client -> Server:
- {"type": "start", "conversation_id": int | null} - Start/resume session
- {"type": "message", "content": "..."} - Send user message
- {"type": "answer", "answers": {...}} - Answer to structured questions
- {"type": "ping"} - Keep-alive ping
Server -> Client:
- {"type": "conversation_created", "conversation_id": int} - New conversation created
- {"type": "text", "content": "..."} - Text chunk from Claude
- {"type": "tool_call", "tool": "...", "input": {...}} - Tool being called
- {"type": "question", "questions": [...]} - Structured questions for user
- {"type": "response_done"} - Response complete
- {"type": "error", "content": "..."} - Error message
- {"type": "pong"} - Keep-alive pong
"""
if not validate_project_name(project_name):
# Always accept WebSocket first to avoid opaque 403 errors
await websocket.accept()
try:
project_name = validate_project_name(project_name)
except HTTPException:
await websocket.send_json({"type": "error", "content": "Invalid project name"})
await websocket.close(code=4000, reason="Invalid project name")
return
project_dir = _get_project_path(project_name)
if not project_dir:
await websocket.send_json({"type": "error", "content": "Project not found in registry"})
await websocket.close(code=4004, reason="Project not found in registry")
return
if not project_dir.exists():
await websocket.send_json({"type": "error", "content": "Project directory not found"})
await websocket.close(code=4004, reason="Project directory not found")
return
await websocket.accept()
logger.info(f"Assistant WebSocket connected for project: {project_name}")
session: Optional[AssistantChatSession] = None
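The accept-first ordering used in these handlers can be shown with a stub socket (Starlette rejects a WebSocket with an opaque HTTP 403 if it is closed before `accept()`; the `FakeWebSocket` class here is a stand-in for illustration, not part of the codebase):

```python
import asyncio

class FakeWebSocket:
    """Records calls in order, just enough to demonstrate the pattern."""
    def __init__(self):
        self.events: list = []
    async def accept(self):
        self.events.append("accept")
    async def send_json(self, data):
        self.events.append(data)
    async def close(self, code=1000, reason=""):
        self.events.append(("close", code))

async def handler(ws, project_name: str):
    await ws.accept()      # accept first -- closing pre-accept yields an opaque 403
    if not project_name:   # stand-in for the real project-name validation
        await ws.send_json({"type": "error", "content": "Invalid project name"})
        await ws.close(code=4000, reason="Invalid project name")
        return

ws = FakeWebSocket()
asyncio.run(handler(ws, ""))
```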
@@ -297,6 +305,34 @@ async def assistant_chat_websocket(websocket: WebSocket, project_name: str):
async for chunk in session.send_message(user_content):
await websocket.send_json(chunk)
elif msg_type == "answer":
# User answered a structured question
if not session:
session = get_session(project_name)
if not session:
await websocket.send_json({
"type": "error",
"content": "No active session. Send 'start' first."
})
continue
# Format the answers as a natural response
answers = message.get("answers", {})
if isinstance(answers, dict):
response_parts = []
for question_idx, answer_value in answers.items():
if isinstance(answer_value, list):
response_parts.append(", ".join(answer_value))
else:
response_parts.append(str(answer_value))
user_response = "; ".join(response_parts) if response_parts else "OK"
else:
user_response = str(answers)
# Stream Claude's response
async for chunk in session.send_message(user_response):
await websocket.send_json(chunk)
else:
await websocket.send_json({
"type": "error",


@@ -7,6 +7,7 @@ Uses project registry for path lookups and project_config for command detection.
"""
import logging
import shlex
import sys
from pathlib import Path
@@ -72,6 +73,116 @@ def get_project_dir(project_name: str) -> Path:
return project_dir
ALLOWED_RUNNERS = {
"npm", "pnpm", "yarn", "npx",
"uvicorn", "python", "python3",
"flask", "poetry",
"cargo", "go",
}
ALLOWED_NPM_SCRIPTS = {"dev", "start", "serve", "develop", "server", "preview"}
# Allowed Python -m modules for dev servers
ALLOWED_PYTHON_MODULES = {"uvicorn", "flask", "gunicorn", "http.server"}
BLOCKED_SHELLS = {"sh", "bash", "zsh", "cmd", "powershell", "pwsh", "cmd.exe"}
def validate_custom_command_strict(cmd: str) -> None:
"""
Strict allowlist validation for dev server commands.
Prevents arbitrary command execution (no sh -c, no cmd /c, no python -c, etc.)
"""
if not isinstance(cmd, str) or not cmd.strip():
raise ValueError("custom_command cannot be empty")
argv = shlex.split(cmd, posix=(sys.platform != "win32"))
if not argv:
raise ValueError("custom_command could not be parsed")
base = Path(argv[0]).name.lower()
# Block direct shells / interpreters commonly used for command injection
if base in BLOCKED_SHELLS:
raise ValueError(f"custom_command runner not allowed: {base}")
if base not in ALLOWED_RUNNERS:
raise ValueError(
f"custom_command runner not allowed: {base}. "
f"Allowed: {', '.join(sorted(ALLOWED_RUNNERS))}"
)
# Block one-liner execution for python
lowered = [a.lower() for a in argv]
if base in {"python", "python3"}:
if "-c" in lowered:
raise ValueError("python -c is not allowed")
if len(argv) >= 3 and argv[1] == "-m":
# Allow: python -m <allowed_module> ...
if argv[2] not in ALLOWED_PYTHON_MODULES:
raise ValueError(
f"python -m {argv[2]} is not allowed. "
f"Allowed modules: {', '.join(sorted(ALLOWED_PYTHON_MODULES))}"
)
elif len(argv) >= 2 and argv[1].endswith(".py"):
# Allow: python manage.py runserver, python app.py, etc.
pass
else:
raise ValueError(
"Python commands must use 'python -m <module> ...' or 'python <script>.py ...'"
)
if base == "flask":
# Allow: flask run [--host ...] [--port ...]
if len(argv) < 2 or argv[1] != "run":
raise ValueError("flask custom_command must be 'flask run [options]'")
if base == "poetry":
# Allow: poetry run <subcmd> ...
if len(argv) < 3 or argv[1] != "run":
raise ValueError("poetry custom_command must be 'poetry run <command> ...'")
if base == "uvicorn":
if len(argv) < 2 or ":" not in argv[1]:
raise ValueError("uvicorn must specify an app like module:app")
allowed_flags = {"--host", "--port", "--reload", "--log-level", "--workers"}
for a in argv[2:]:
if a.startswith("-"):
# Handle --flag=value syntax
flag_key = a.split("=", 1)[0]
if flag_key not in allowed_flags:
raise ValueError(f"uvicorn flag not allowed: {flag_key}")
if base in {"npm", "pnpm", "yarn"}:
# Allow only known safe scripts (no arbitrary exec)
if base == "npm":
if len(argv) < 3 or argv[1] != "run" or argv[2] not in ALLOWED_NPM_SCRIPTS:
raise ValueError(
f"npm custom_command must be 'npm run <script>' where script is one of: "
f"{', '.join(sorted(ALLOWED_NPM_SCRIPTS))}"
)
elif base == "pnpm":
ok = (
(len(argv) >= 2 and argv[1] in ALLOWED_NPM_SCRIPTS)
or (len(argv) >= 3 and argv[1] == "run" and argv[2] in ALLOWED_NPM_SCRIPTS)
)
if not ok:
raise ValueError(
f"pnpm custom_command must use a known script: "
f"{', '.join(sorted(ALLOWED_NPM_SCRIPTS))}"
)
elif base == "yarn":
ok = (
(len(argv) >= 2 and argv[1] in ALLOWED_NPM_SCRIPTS)
or (len(argv) >= 3 and argv[1] == "run" and argv[2] in ALLOWED_NPM_SCRIPTS)
)
if not ok:
raise ValueError(
f"yarn custom_command must use a known script: "
f"{', '.join(sorted(ALLOWED_NPM_SCRIPTS))}"
)
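The first gate, the runner allowlist, can be exercised in isolation (a trimmed sketch of the check above; the real validator goes on to apply per-runner argument rules, and `runner_allowed` is an illustrative name):

```python
import shlex
from pathlib import Path

ALLOWED_RUNNERS = {"npm", "pnpm", "yarn", "npx", "uvicorn", "python", "python3",
                   "flask", "poetry", "cargo", "go"}
BLOCKED_SHELLS = {"sh", "bash", "zsh", "cmd", "powershell", "pwsh", "cmd.exe"}

def runner_allowed(cmd: str) -> bool:
    # First-token check only: strip any path prefix, lowercase, then gate.
    argv = shlex.split(cmd)
    base = Path(argv[0]).name.lower()
    return base not in BLOCKED_SHELLS and base in ALLOWED_RUNNERS
```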
def get_project_devserver_manager(project_name: str):
"""
@@ -180,9 +291,12 @@ async def start_devserver(
# Determine which command to use
command: str | None
if request.command:
command = request.command
else:
command = get_dev_command(project_dir)
raise HTTPException(
status_code=400,
detail="Direct command execution is disabled. Use /config to set a safe custom_command."
)
command = get_dev_command(project_dir)
if not command:
raise HTTPException(
@@ -193,6 +307,13 @@ async def start_devserver(
# Validate command against security allowlist before execution
validate_dev_command(command, project_dir)
# Defense-in-depth: also run strict structural validation at execution time
# (catches config file tampering that bypasses the /config endpoint)
try:
validate_custom_command_strict(command)
except ValueError as e:
raise HTTPException(status_code=400, detail=str(e))
# Now command is definitely str and validated
success, message = await manager.start(command)
@@ -284,7 +405,13 @@ async def update_devserver_config(
except ValueError as e:
raise HTTPException(status_code=400, detail=str(e))
else:
# Validate command against security allowlist before persisting
# Strict structural validation first (most specific errors)
try:
validate_custom_command_strict(update.custom_command)
except ValueError as e:
raise HTTPException(status_code=400, detail=str(e))
# Then validate against security allowlist
validate_dev_command(update.custom_command, project_dir)
# Set the custom command


@@ -104,19 +104,26 @@ async def expand_project_websocket(websocket: WebSocket, project_name: str):
- {"type": "error", "content": "..."} - Error message
- {"type": "pong"} - Keep-alive pong
"""
# Always accept the WebSocket first to avoid opaque 403 errors.
# Starlette returns 403 if we close before accepting.
await websocket.accept()
try:
project_name = validate_project_name(project_name)
except HTTPException:
await websocket.send_json({"type": "error", "content": "Invalid project name"})
await websocket.close(code=4000, reason="Invalid project name")
return
# Look up project directory from registry
project_dir = _get_project_path(project_name)
if not project_dir:
await websocket.send_json({"type": "error", "content": "Project not found in registry"})
await websocket.close(code=4004, reason="Project not found in registry")
return
if not project_dir.exists():
await websocket.send_json({"type": "error", "content": "Project directory not found"})
await websocket.close(code=4004, reason="Project directory not found")
return
@@ -124,11 +131,10 @@ async def expand_project_websocket(websocket: WebSocket, project_name: str):
from autoforge_paths import get_prompts_dir
spec_path = get_prompts_dir(project_dir) / "app_spec.txt"
if not spec_path.exists():
await websocket.send_json({"type": "error", "content": "Project has no spec. Create a spec first before expanding."})
await websocket.close(code=4004, reason="Project has no spec. Create spec first.")
return
await websocket.accept()
session: Optional[ExpandChatSession] = None
try:


@@ -7,12 +7,11 @@ Settings are stored in the registry database and shared across all projects.
"""
import mimetypes
import os
import sys
from fastapi import APIRouter
from ..schemas import ModelInfo, ModelsResponse, SettingsResponse, SettingsUpdate
from ..schemas import ModelInfo, ModelsResponse, ProviderInfo, ProvidersResponse, SettingsResponse, SettingsUpdate
from ..services.chat_constants import ROOT_DIR
# Mimetype fix for Windows - must run before StaticFiles is mounted
@@ -23,9 +22,11 @@ if str(ROOT_DIR) not in sys.path:
sys.path.insert(0, str(ROOT_DIR))
from registry import (
API_PROVIDERS,
AVAILABLE_MODELS,
DEFAULT_MODEL,
get_all_settings,
get_setting,
set_setting,
)
@@ -37,26 +38,40 @@ def _parse_yolo_mode(value: str | None) -> bool:
return (value or "false").lower() == "true"
def _is_glm_mode() -> bool:
"""Check if GLM API is configured via environment variables."""
base_url = os.getenv("ANTHROPIC_BASE_URL", "")
# GLM mode is when ANTHROPIC_BASE_URL is set but NOT pointing to Ollama
return bool(base_url) and not _is_ollama_mode()
def _is_ollama_mode() -> bool:
"""Check if Ollama API is configured via environment variables."""
base_url = os.getenv("ANTHROPIC_BASE_URL", "")
return "localhost:11434" in base_url or "127.0.0.1:11434" in base_url
@router.get("/providers", response_model=ProvidersResponse)
async def get_available_providers():
"""Get list of available API providers."""
current = get_setting("api_provider", "claude") or "claude"
providers = []
for pid, pdata in API_PROVIDERS.items():
providers.append(ProviderInfo(
id=pid,
name=pdata["name"],
base_url=pdata.get("base_url"),
models=[ModelInfo(id=m["id"], name=m["name"]) for m in pdata.get("models", [])],
default_model=pdata.get("default_model", ""),
requires_auth=pdata.get("requires_auth", False),
))
return ProvidersResponse(providers=providers, current=current)
@router.get("/models", response_model=ModelsResponse)
async def get_available_models():
"""Get list of available models.
Frontend should call this to get the current list of models
instead of hardcoding them.
Returns models for the currently selected API provider.
"""
current_provider = get_setting("api_provider", "claude") or "claude"
provider = API_PROVIDERS.get(current_provider)
if provider and current_provider != "claude":
provider_models = provider.get("models", [])
return ModelsResponse(
models=[ModelInfo(id=m["id"], name=m["name"]) for m in provider_models],
default=provider.get("default_model", ""),
)
# Default: return Claude models
return ModelsResponse(
models=[ModelInfo(id=m["id"], name=m["name"]) for m in AVAILABLE_MODELS],
default=DEFAULT_MODEL,
@@ -85,14 +100,23 @@ async def get_settings():
"""Get current global settings."""
all_settings = get_all_settings()
api_provider = all_settings.get("api_provider", "claude")
glm_mode = api_provider == "glm"
ollama_mode = api_provider == "ollama"
return SettingsResponse(
yolo_mode=_parse_yolo_mode(all_settings.get("yolo_mode")),
model=all_settings.get("model", DEFAULT_MODEL),
glm_mode=_is_glm_mode(),
ollama_mode=_is_ollama_mode(),
glm_mode=glm_mode,
ollama_mode=ollama_mode,
testing_agent_ratio=_parse_int(all_settings.get("testing_agent_ratio"), 1),
playwright_headless=_parse_bool(all_settings.get("playwright_headless"), default=True),
batch_size=_parse_int(all_settings.get("batch_size"), 3),
api_provider=api_provider,
api_base_url=all_settings.get("api_base_url"),
api_has_auth_token=bool(all_settings.get("api_auth_token")),
api_model=all_settings.get("api_model"),
)
@@ -114,14 +138,47 @@ async def update_settings(update: SettingsUpdate):
if update.batch_size is not None:
set_setting("batch_size", str(update.batch_size))
# API provider settings
if update.api_provider is not None:
old_provider = get_setting("api_provider", "claude")
set_setting("api_provider", update.api_provider)
# When provider changes, auto-set defaults for the new provider
if update.api_provider != old_provider:
provider = API_PROVIDERS.get(update.api_provider)
if provider:
# Auto-set base URL from provider definition
if provider.get("base_url"):
set_setting("api_base_url", provider["base_url"])
# Auto-set model to provider's default
if provider.get("default_model") and update.api_model is None:
set_setting("api_model", provider["default_model"])
if update.api_base_url is not None:
set_setting("api_base_url", update.api_base_url)
if update.api_auth_token is not None:
set_setting("api_auth_token", update.api_auth_token)
if update.api_model is not None:
set_setting("api_model", update.api_model)
# Return updated settings
all_settings = get_all_settings()
api_provider = all_settings.get("api_provider", "claude")
glm_mode = api_provider == "glm"
ollama_mode = api_provider == "ollama"
return SettingsResponse(
yolo_mode=_parse_yolo_mode(all_settings.get("yolo_mode")),
model=all_settings.get("model", DEFAULT_MODEL),
glm_mode=_is_glm_mode(),
ollama_mode=_is_ollama_mode(),
glm_mode=glm_mode,
ollama_mode=ollama_mode,
testing_agent_ratio=_parse_int(all_settings.get("testing_agent_ratio"), 1),
playwright_headless=_parse_bool(all_settings.get("playwright_headless"), default=True),
batch_size=_parse_int(all_settings.get("batch_size"), 3),
api_provider=api_provider,
api_base_url=all_settings.get("api_base_url"),
api_has_auth_token=bool(all_settings.get("api_auth_token")),
api_model=all_settings.get("api_model"),
)


@@ -21,7 +21,7 @@ from ..services.spec_chat_session import (
remove_session,
)
from ..utils.project_helpers import get_project_path as _get_project_path
from ..utils.validation import is_valid_project_name as validate_project_name
from ..utils.validation import is_valid_project_name, validate_project_name
logger = logging.getLogger(__name__)
@@ -49,7 +49,7 @@ async def list_spec_sessions():
@router.get("/sessions/{project_name}", response_model=SpecSessionStatus)
async def get_session_status(project_name: str):
"""Get status of a spec creation session."""
if not validate_project_name(project_name):
if not is_valid_project_name(project_name):
raise HTTPException(status_code=400, detail="Invalid project name")
session = get_session(project_name)
@@ -67,7 +67,7 @@ async def get_session_status(project_name: str):
@router.delete("/sessions/{project_name}")
async def cancel_session(project_name: str):
"""Cancel and remove a spec creation session."""
if not validate_project_name(project_name):
if not is_valid_project_name(project_name):
raise HTTPException(status_code=400, detail="Invalid project name")
session = get_session(project_name)
@@ -95,7 +95,7 @@ async def get_spec_file_status(project_name: str):
This is used for polling to detect when Claude has finished writing spec files.
Claude writes this status file as the final step after completing all spec work.
"""
if not validate_project_name(project_name):
if not is_valid_project_name(project_name):
raise HTTPException(status_code=400, detail="Invalid project name")
project_dir = _get_project_path(project_name)
@@ -166,22 +166,28 @@ async def spec_chat_websocket(websocket: WebSocket, project_name: str):
- {"type": "error", "content": "..."} - Error message
- {"type": "pong"} - Keep-alive pong
"""
if not validate_project_name(project_name):
# Always accept WebSocket first to avoid opaque 403 errors
await websocket.accept()
try:
project_name = validate_project_name(project_name)
except HTTPException:
await websocket.send_json({"type": "error", "content": "Invalid project name"})
await websocket.close(code=4000, reason="Invalid project name")
return
# Look up project directory from registry
project_dir = _get_project_path(project_name)
if not project_dir:
await websocket.send_json({"type": "error", "content": "Project not found in registry"})
await websocket.close(code=4004, reason="Project not found in registry")
return
if not project_dir.exists():
await websocket.send_json({"type": "error", "content": "Project directory not found"})
await websocket.close(code=4004, reason="Project directory not found")
return
await websocket.accept()
session: Optional[SpecChatSession] = None
try:


@@ -26,7 +26,7 @@ from ..services.terminal_manager import (
stop_terminal_session,
)
from ..utils.project_helpers import get_project_path as _get_project_path
from ..utils.validation import is_valid_project_name as validate_project_name
from ..utils.validation import is_valid_project_name
logger = logging.getLogger(__name__)
@@ -89,7 +89,7 @@ async def list_project_terminals(project_name: str) -> list[TerminalInfoResponse
Returns:
List of terminal info objects
"""
if not validate_project_name(project_name):
if not is_valid_project_name(project_name):
raise HTTPException(status_code=400, detail="Invalid project name")
project_dir = _get_project_path(project_name)
@@ -122,7 +122,7 @@ async def create_project_terminal(
Returns:
The created terminal info
"""
if not validate_project_name(project_name):
if not is_valid_project_name(project_name):
raise HTTPException(status_code=400, detail="Invalid project name")
project_dir = _get_project_path(project_name)
@@ -148,7 +148,7 @@ async def rename_project_terminal(
Returns:
The updated terminal info
"""
if not validate_project_name(project_name):
if not is_valid_project_name(project_name):
raise HTTPException(status_code=400, detail="Invalid project name")
if not validate_terminal_id(terminal_id):
@@ -180,7 +180,7 @@ async def delete_project_terminal(project_name: str, terminal_id: str) -> dict:
Returns:
Success message
"""
if not validate_project_name(project_name):
if not is_valid_project_name(project_name):
raise HTTPException(status_code=400, detail="Invalid project name")
if not validate_terminal_id(terminal_id):
@@ -221,8 +221,12 @@ async def terminal_websocket(websocket: WebSocket, project_name: str, terminal_i
- {"type": "pong"} - Keep-alive response
- {"type": "error", "message": "..."} - Error message
"""
# Always accept WebSocket first to avoid opaque 403 errors
await websocket.accept()
# Validate project name
if not validate_project_name(project_name):
if not is_valid_project_name(project_name):
await websocket.send_json({"type": "error", "message": "Invalid project name"})
await websocket.close(
code=TerminalCloseCode.INVALID_PROJECT_NAME, reason="Invalid project name"
)
@@ -230,6 +234,7 @@ async def terminal_websocket(websocket: WebSocket, project_name: str, terminal_i
# Validate terminal ID
if not validate_terminal_id(terminal_id):
await websocket.send_json({"type": "error", "message": "Invalid terminal ID"})
await websocket.close(
code=TerminalCloseCode.INVALID_PROJECT_NAME, reason="Invalid terminal ID"
)
@@ -238,6 +243,7 @@ async def terminal_websocket(websocket: WebSocket, project_name: str, terminal_i
# Look up project directory from registry
project_dir = _get_project_path(project_name)
if not project_dir:
await websocket.send_json({"type": "error", "message": "Project not found in registry"})
await websocket.close(
code=TerminalCloseCode.PROJECT_NOT_FOUND,
reason="Project not found in registry",
@@ -245,6 +251,7 @@ async def terminal_websocket(websocket: WebSocket, project_name: str, terminal_i
return
if not project_dir.exists():
await websocket.send_json({"type": "error", "message": "Project directory not found"})
await websocket.close(
code=TerminalCloseCode.PROJECT_NOT_FOUND,
reason="Project directory not found",
@@ -254,14 +261,13 @@ async def terminal_websocket(websocket: WebSocket, project_name: str, terminal_i
# Verify terminal exists in metadata
terminal_info = get_terminal_info(project_name, terminal_id)
if not terminal_info:
await websocket.send_json({"type": "error", "message": "Terminal not found"})
await websocket.close(
code=TerminalCloseCode.PROJECT_NOT_FOUND,
reason="Terminal not found",
)
return
await websocket.accept()
# Get or create terminal session for this project/terminal
session = get_terminal_session(project_name, project_dir, terminal_id)


@@ -190,9 +190,12 @@ class AgentStartRequest(BaseModel):
@field_validator('model')
@classmethod
def validate_model(cls, v: str | None) -> str | None:
"""Validate model is in the allowed list."""
"""Validate model is in the allowed list (Claude) or allow any model for alternative providers."""
if v is not None and v not in VALID_MODELS:
raise ValueError(f"Invalid model. Must be one of: {VALID_MODELS}")
from registry import get_all_settings
settings = get_all_settings()
if settings.get("api_provider", "claude") == "claude":
raise ValueError(f"Invalid model. Must be one of: {VALID_MODELS}")
return v
@field_validator('max_concurrency')
@@ -391,15 +394,35 @@ class ModelInfo(BaseModel):
name: str
class ProviderInfo(BaseModel):
"""Information about an API provider."""
id: str
name: str
base_url: str | None = None
models: list[ModelInfo]
default_model: str
requires_auth: bool = False
class ProvidersResponse(BaseModel):
"""Response schema for available providers list."""
providers: list[ProviderInfo]
current: str
class SettingsResponse(BaseModel):
"""Response schema for global settings."""
yolo_mode: bool = False
model: str = DEFAULT_MODEL
glm_mode: bool = False # True if GLM API is configured via .env
ollama_mode: bool = False # True if Ollama API is configured via .env
glm_mode: bool = False # True when api_provider is "glm"
ollama_mode: bool = False # True when api_provider is "ollama"
testing_agent_ratio: int = 1 # Regression testing agents (0-3)
playwright_headless: bool = True
batch_size: int = 3 # Features per coding agent batch (1-3)
api_provider: str = "claude"
api_base_url: str | None = None
api_has_auth_token: bool = False # Never expose actual token
api_model: str | None = None
class ModelsResponse(BaseModel):
@@ -415,12 +438,30 @@ class SettingsUpdate(BaseModel):
testing_agent_ratio: int | None = None # 0-3
playwright_headless: bool | None = None
batch_size: int | None = None # Features per agent batch (1-3)
api_provider: str | None = None
api_base_url: str | None = Field(None, max_length=500)
api_auth_token: str | None = Field(None, max_length=500) # Write-only, never returned
api_model: str | None = Field(None, max_length=200)
@field_validator('api_base_url')
@classmethod
def validate_api_base_url(cls, v: str | None) -> str | None:
if v is not None and v.strip():
v = v.strip()
if not v.startswith(("http://", "https://")):
raise ValueError("api_base_url must start with http:// or https://")
return v
@field_validator('model')
@classmethod
def validate_model(cls, v: str | None) -> str | None:
if v is not None and v not in VALID_MODELS:
raise ValueError(f"Invalid model. Must be one of: {VALID_MODELS}")
def validate_model(cls, v: str | None, info) -> str | None: # type: ignore[override]
if v is not None:
# Skip VALID_MODELS check when using an alternative API provider
api_provider = info.data.get("api_provider")
if api_provider and api_provider != "claude":
return v
if v not in VALID_MODELS:
raise ValueError(f"Invalid model. Must be one of: {VALID_MODELS}")
return v
@field_validator('testing_agent_ratio')
@@ -533,9 +574,12 @@ class ScheduleCreate(BaseModel):
@field_validator('model')
@classmethod
def validate_model(cls, v: str | None) -> str | None:
"""Validate model is in the allowed list."""
"""Validate model is in the allowed list (Claude) or allow any model for alternative providers."""
if v is not None and v not in VALID_MODELS:
raise ValueError(f"Invalid model. Must be one of: {VALID_MODELS}")
from registry import get_all_settings
settings = get_all_settings()
if settings.get("api_provider", "claude") == "claude":
raise ValueError(f"Invalid model. Must be one of: {VALID_MODELS}")
return v
@@ -555,9 +599,12 @@ class ScheduleUpdate(BaseModel):
@field_validator('model')
@classmethod
def validate_model(cls, v: str | None) -> str | None:
"""Validate model is in the allowed list."""
"""Validate model is in the allowed list (Claude) or allow any model for alternative providers."""
if v is not None and v not in VALID_MODELS:
raise ValueError(f"Invalid model. Must be one of: {VALID_MODELS}")
from registry import get_all_settings
settings = get_all_settings()
if settings.get("api_provider", "claude") == "claude":
raise ValueError(f"Invalid model. Must be one of: {VALID_MODELS}")
return v


@@ -25,7 +25,7 @@ from .assistant_database import (
create_conversation,
get_messages,
)
from .chat_constants import API_ENV_VARS, ROOT_DIR
from .chat_constants import ROOT_DIR
# Load environment variables from .env file if present
load_dotenv()
@@ -47,8 +47,13 @@ FEATURE_MANAGEMENT_TOOLS = [
"mcp__features__feature_skip",
]
# Interactive tools
INTERACTIVE_TOOLS = [
"mcp__features__ask_user",
]
# Combined list for assistant
ASSISTANT_FEATURE_TOOLS = READONLY_FEATURE_MCP_TOOLS + FEATURE_MANAGEMENT_TOOLS
ASSISTANT_FEATURE_TOOLS = READONLY_FEATURE_MCP_TOOLS + FEATURE_MANAGEMENT_TOOLS + INTERACTIVE_TOOLS
# Read-only built-in tools (no Write, Edit, Bash)
READONLY_BUILTIN_TOOLS = [
@@ -123,6 +128,9 @@ If the user asks you to modify code, explain that you're a project assistant and
- **feature_create_bulk**: Create multiple features at once
- **feature_skip**: Move a feature to the end of the queue
**Interactive:**
- **ask_user**: Present structured multiple-choice questions to the user. Use this when you need to clarify requirements, offer design choices, or guide a decision. The user sees clickable option buttons and their selection is returned as your next message.
## Creating Features
When a user asks to add a feature, use the `feature_create` or `feature_create_bulk` MCP tools directly:
@@ -157,7 +165,7 @@ class AssistantChatSession:
"""
Manages a read-only assistant conversation for a project.
Uses Claude Opus 4.5 with only read-only tools enabled.
Uses Claude Opus with only read-only tools enabled.
Persists conversation history to SQLite.
"""
@@ -258,15 +266,11 @@ class AssistantChatSession:
system_cli = shutil.which("claude")
# Build environment overrides for API configuration
sdk_env: dict[str, str] = {}
for var in API_ENV_VARS:
value = os.getenv(var)
if value:
sdk_env[var] = value
from registry import DEFAULT_MODEL, get_effective_sdk_env
sdk_env = get_effective_sdk_env()
# Determine model from environment or use default
# This allows using alternative APIs (e.g., GLM via z.ai) that may not support Claude model names
model = os.getenv("ANTHROPIC_DEFAULT_OPUS_MODEL", "claude-opus-4-5-20251101")
# Determine model from SDK env (provider-aware) or fallback to env/default
model = sdk_env.get("ANTHROPIC_DEFAULT_OPUS_MODEL") or os.getenv("ANTHROPIC_DEFAULT_OPUS_MODEL", DEFAULT_MODEL)
try:
logger.info("Creating ClaudeSDKClient...")
@@ -406,6 +410,17 @@ class AssistantChatSession:
elif block_type == "ToolUseBlock" and hasattr(block, "name"):
tool_name = block.name
tool_input = getattr(block, "input", {})
# Intercept ask_user tool calls -> yield as question message
if tool_name == "mcp__features__ask_user":
questions = tool_input.get("questions", [])
if questions:
yield {
"type": "question",
"questions": questions,
}
continue
yield {
"type": "tool_call",
"tool": tool_name,

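The `ask_user` interception above is a filter over streamed content blocks: one tool name is surfaced as a structured question, everything else stays a generic tool call. A simplified sketch using plain dicts — the real SDK yields typed block objects, so the shapes here are assumptions:

```python
def translate_blocks(blocks):
    """Turn raw tool-use blocks into UI messages, intercepting ask_user."""
    for block in blocks:
        name = block.get("name")
        if name == "mcp__features__ask_user":
            questions = block.get("input", {}).get("questions", [])
            if questions:
                # Surface as a clickable question instead of a generic tool call
                yield {"type": "question", "questions": questions}
                continue
        yield {"type": "tool_call", "tool": name, "input": block.get("input", {})}
```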

@@ -14,17 +14,17 @@ This is a simplified version of AgentProcessManager, tailored for dev servers:
import asyncio
import logging
import re
import shlex
import subprocess
import sys
import threading
from datetime import datetime
from datetime import datetime, timezone
from pathlib import Path
from typing import Awaitable, Callable, Literal, Set
import psutil
from registry import list_registered_projects
from security import extract_commands, get_effective_commands, is_command_allowed
from server.utils.process_utils import kill_process_tree
logger = logging.getLogger(__name__)
@@ -291,53 +291,54 @@ class DevServerProcessManager:
Start the dev server as a subprocess.
Args:
command: The shell command to run (e.g., "npm run dev")
command: The command to run (e.g., "npm run dev")
Returns:
Tuple of (success, message)
"""
if self.status == "running":
# Already running?
if self.process and self.status == "running":
return False, "Dev server is already running"
# Lock check (prevents double-start)
if not self._check_lock():
return False, "Another dev server instance is already running for this project"
return False, "Dev server already running (lock file present)"
# Validate that project directory exists
if not self.project_dir.exists():
return False, f"Project directory does not exist: {self.project_dir}"
command = (command or "").strip()
if not command:
return False, "Empty dev server command"
# Defense-in-depth: validate command against security allowlist
commands = extract_commands(command)
if not commands:
return False, "Could not parse command for security validation"
# SECURITY: block shell operators/metacharacters (defense-in-depth)
# NOTE: On Windows, .cmd/.bat files are executed via cmd.exe even with
# shell=False (CPython limitation), so metacharacter blocking is critical.
# Single & is a cmd.exe command separator, ^ is cmd escape, % enables
# environment variable expansion, > < enable redirection.
dangerous_ops = ["&&", "||", ";", "|", "`", "$(", "&", ">", "<", "^", "%"]
if any(op in command for op in dangerous_ops):
return False, "Shell operators are not allowed in dev server command"
# Block newline injection (cmd.exe interprets newlines as command separators)
if "\n" in command or "\r" in command:
return False, "Newlines are not allowed in dev server command"
allowed_commands, blocked_commands = get_effective_commands(self.project_dir)
for cmd in commands:
if cmd in blocked_commands:
logger.warning("Blocked dev server command '%s' (in blocklist) for %s", cmd, self.project_name)
return False, f"Command '{cmd}' is blocked and cannot be used as a dev server command"
if not is_command_allowed(cmd, allowed_commands):
logger.warning("Rejected dev server command '%s' (not in allowlist) for %s", cmd, self.project_name)
return False, f"Command '{cmd}' is not in the allowed commands list"
# Parse into argv and execute without shell
argv = shlex.split(command, posix=(sys.platform != "win32"))
if not argv:
return False, "Empty dev server command"
self._command = command
self._detected_url = None # Reset URL detection
base = Path(argv[0]).name.lower()
# Defense-in-depth: reject direct shells/interpreters commonly used for injection
if base in {"sh", "bash", "zsh", "cmd", "powershell", "pwsh"}:
return False, f"Shell runner '{base}' is not allowed for dev server commands"
# Windows: use .cmd shims for Node package managers
if sys.platform == "win32" and base in {"npm", "pnpm", "yarn", "npx"} and not argv[0].lower().endswith(".cmd"):
argv[0] = argv[0] + ".cmd"
try:
# Determine shell based on platform
if sys.platform == "win32":
# On Windows, use cmd.exe
shell_cmd = ["cmd", "/c", command]
else:
# On Unix-like systems, use sh
shell_cmd = ["sh", "-c", command]
# Start subprocess with piped stdout/stderr
# stdin=DEVNULL prevents interactive dev servers from blocking on stdin
# On Windows, use CREATE_NO_WINDOW to prevent console window from flashing
if sys.platform == "win32":
self.process = subprocess.Popen(
shell_cmd,
argv,
stdin=subprocess.DEVNULL,
stdout=subprocess.PIPE,
stderr=subprocess.STDOUT,
@@ -346,23 +347,33 @@ class DevServerProcessManager:
)
else:
self.process = subprocess.Popen(
shell_cmd,
argv,
stdin=subprocess.DEVNULL,
stdout=subprocess.PIPE,
stderr=subprocess.STDOUT,
cwd=str(self.project_dir),
)
self._create_lock()
self.started_at = datetime.now()
self.status = "running"
self._command = command
self.started_at = datetime.now(timezone.utc)
self._detected_url = None
# Start output streaming task
# Create lock once we have a PID
self._create_lock()
# Start output streaming
self.status = "running"
self._output_task = asyncio.create_task(self._stream_output())
return True, f"Dev server started with PID {self.process.pid}"
return True, "Dev server started"
except FileNotFoundError:
self.status = "stopped"
self.process = None
return False, f"Command not found: {argv[0]}"
except Exception as e:
logger.exception("Failed to start dev server")
self.status = "stopped"
self.process = None
return False, f"Failed to start dev server: {e}"
async def stop(self) -> tuple[bool, str]:

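The layered checks in `start()` — metacharacter blocklist, newline rejection, `shlex` parsing, shell-runner rejection — can be seen as one standalone pre-check. An illustrative condensation (it omits the allowlist/blocklist lookup and the Windows `.cmd` shim):

```python
import shlex
import sys
from pathlib import Path

DANGEROUS_OPS = ["&&", "||", ";", "|", "`", "$(", "&", ">", "<", "^", "%"]
BLOCKED_SHELLS = {"sh", "bash", "zsh", "cmd", "powershell", "pwsh"}

def precheck_dev_command(command: str) -> tuple[bool, str]:
    """Reject shell metacharacters, newlines, and direct shell runners."""
    command = (command or "").strip()
    if not command:
        return False, "Empty dev server command"
    # Single & matters on Windows: cmd.exe treats it as a command separator
    if any(op in command for op in DANGEROUS_OPS):
        return False, "Shell operators are not allowed"
    if "\n" in command or "\r" in command:
        return False, "Newlines are not allowed"
    argv = shlex.split(command, posix=(sys.platform != "win32"))
    if not argv:
        return False, "Empty dev server command"
    base = Path(argv[0]).name.lower()
    if base in BLOCKED_SHELLS:
        return False, f"Shell runner '{base}' is not allowed"
    return True, "ok"
```

Running the parsed argv with `shell=False` (as the diff does) is what makes these checks defense-in-depth rather than the only barrier: no shell ever interprets the string.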

@@ -22,7 +22,7 @@ from claude_agent_sdk import ClaudeAgentOptions, ClaudeSDKClient
from dotenv import load_dotenv
from ..schemas import ImageAttachment
from .chat_constants import API_ENV_VARS, ROOT_DIR, make_multimodal_message
from .chat_constants import ROOT_DIR, make_multimodal_message
# Load environment variables from .env file if present
load_dotenv()
@@ -154,16 +154,11 @@ class ExpandChatSession:
system_prompt = skill_content.replace("$ARGUMENTS", project_path)
# Build environment overrides for API configuration
# Filter to only include vars that are actually set (non-None)
sdk_env: dict[str, str] = {}
for var in API_ENV_VARS:
value = os.getenv(var)
if value:
sdk_env[var] = value
from registry import DEFAULT_MODEL, get_effective_sdk_env
sdk_env = get_effective_sdk_env()
# Determine model from environment or use default
# This allows using alternative APIs (e.g., GLM via z.ai) that may not support Claude model names
model = os.getenv("ANTHROPIC_DEFAULT_OPUS_MODEL", "claude-opus-4-5-20251101")
# Determine model from SDK env (provider-aware) or fallback to env/default
model = sdk_env.get("ANTHROPIC_DEFAULT_OPUS_MODEL") or os.getenv("ANTHROPIC_DEFAULT_OPUS_MODEL", DEFAULT_MODEL)
# Build MCP servers config for feature creation
mcp_servers = {


@@ -227,6 +227,46 @@ class AgentProcessManager:
"""Remove lock file."""
self.lock_file.unlink(missing_ok=True)
def _cleanup_stale_features(self) -> None:
"""Clear in_progress flag for all features when agent stops/crashes.
When the agent process exits (normally or crash), any features left
with in_progress=True were being worked on and didn't complete.
Reset them so they can be picked up on next agent start.
"""
try:
from autoforge_paths import get_features_db_path
features_db = get_features_db_path(self.project_dir)
if not features_db.exists():
return
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker
from api.database import Feature
engine = create_engine(f"sqlite:///{features_db}")
Session = sessionmaker(bind=engine)
session = Session()
try:
stuck = session.query(Feature).filter(
Feature.in_progress == True, # noqa: E712
Feature.passes == False, # noqa: E712
).all()
if stuck:
for f in stuck:
f.in_progress = False
session.commit()
logger.info(
"Cleaned up %d stuck feature(s) for %s",
len(stuck), self.project_name,
)
finally:
session.close()
engine.dispose()
except Exception as e:
logger.warning("Failed to cleanup features for %s: %s", self.project_name, e)
async def _broadcast_output(self, line: str) -> None:
"""Broadcast output line to all registered callbacks."""
with self._callbacks_lock:
@@ -288,6 +328,7 @@ class AgentProcessManager:
self.status = "crashed"
elif self.status == "running":
self.status = "stopped"
self._cleanup_stale_features()
self._remove_lock()
async def start(
@@ -305,7 +346,7 @@ class AgentProcessManager:
Args:
yolo_mode: If True, run in YOLO mode (skip testing agents)
model: Model to use (e.g., claude-opus-4-5-20251101)
model: Model to use (e.g., claude-opus-4-6)
parallel_mode: DEPRECATED - ignored, always uses unified orchestrator
max_concurrency: Max concurrent coding agents (1-5, default 1)
testing_agent_ratio: Number of regression testing agents (0-3, default 1)
@@ -320,6 +361,9 @@ class AgentProcessManager:
if not self._check_lock():
return False, "Another agent instance is already running for this project"
# Clean up features stuck from a previous crash/stop
self._cleanup_stale_features()
# Store for status queries
self.yolo_mode = yolo_mode
self.model = model
@@ -359,12 +403,23 @@ class AgentProcessManager:
# stdin=DEVNULL prevents blocking if Claude CLI or child process tries to read stdin
# CREATE_NO_WINDOW on Windows prevents console window pop-ups
# PYTHONUNBUFFERED ensures output isn't delayed
# Build subprocess environment with API provider settings
from registry import get_effective_sdk_env
api_env = get_effective_sdk_env()
subprocess_env = {
**os.environ,
"PYTHONUNBUFFERED": "1",
"PLAYWRIGHT_HEADLESS": "true" if playwright_headless else "false",
"NODE_COMPILE_CACHE": "", # Disable V8 compile caching to prevent .node file accumulation in %TEMP%
**api_env,
}
popen_kwargs: dict[str, Any] = {
"stdin": subprocess.DEVNULL,
"stdout": subprocess.PIPE,
"stderr": subprocess.STDOUT,
"cwd": str(self.project_dir),
"env": {**os.environ, "PYTHONUNBUFFERED": "1", "PLAYWRIGHT_HEADLESS": "true" if playwright_headless else "false"},
"env": subprocess_env,
}
if sys.platform == "win32":
popen_kwargs["creationflags"] = subprocess.CREATE_NO_WINDOW
@@ -425,6 +480,7 @@ class AgentProcessManager:
result.children_terminated, result.children_killed
)
self._cleanup_stale_features()
self._remove_lock()
self.status = "stopped"
self.process = None
@@ -502,6 +558,7 @@ class AgentProcessManager:
if poll is not None:
# Process has terminated
if self.status in ("running", "paused"):
self._cleanup_stale_features()
self.status = "crashed"
self._remove_lock()
return False

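The stuck-feature reset that `_cleanup_stale_features` performs via SQLAlchemy boils down to a single UPDATE over the features database. A stdlib `sqlite3` sketch against a hypothetical minimal schema (the real table has more columns):

```python
import sqlite3

def reset_stuck_features(db_path: str) -> int:
    """Clear in_progress on features abandoned by a crashed/stopped agent."""
    conn = sqlite3.connect(db_path)
    try:
        cur = conn.execute(
            "UPDATE features SET in_progress = 0 "
            "WHERE in_progress = 1 AND passes = 0"
        )
        conn.commit()
        return cur.rowcount  # number of features reset
    finally:
        conn.close()
```

Filtering on `passes = 0` mirrors the diff: a feature that already passed keeps its state, only incomplete in-progress work is requeued.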

@@ -19,7 +19,7 @@ from claude_agent_sdk import ClaudeAgentOptions, ClaudeSDKClient
from dotenv import load_dotenv
from ..schemas import ImageAttachment
from .chat_constants import API_ENV_VARS, ROOT_DIR, make_multimodal_message
from .chat_constants import ROOT_DIR, make_multimodal_message
# Load environment variables from .env file if present
load_dotenv()
@@ -140,16 +140,11 @@ class SpecChatSession:
system_cli = shutil.which("claude")
# Build environment overrides for API configuration
# Filter to only include vars that are actually set (non-None)
sdk_env: dict[str, str] = {}
for var in API_ENV_VARS:
value = os.getenv(var)
if value:
sdk_env[var] = value
from registry import DEFAULT_MODEL, get_effective_sdk_env
sdk_env = get_effective_sdk_env()
# Determine model from environment or use default
# This allows using alternative APIs (e.g., GLM via z.ai) that may not support Claude model names
model = os.getenv("ANTHROPIC_DEFAULT_OPUS_MODEL", "claude-opus-4-5-20251101")
# Determine model from SDK env (provider-aware) or fallback to env/default
model = sdk_env.get("ANTHROPIC_DEFAULT_OPUS_MODEL") or os.getenv("ANTHROPIC_DEFAULT_OPUS_MODEL", DEFAULT_MODEL)
try:
self.client = ClaudeSDKClient(


@@ -640,9 +640,7 @@ class ConnectionManager:
self._lock = asyncio.Lock()
async def connect(self, websocket: WebSocket, project_name: str):
"""Accept a WebSocket connection for a project."""
await websocket.accept()
"""Register a WebSocket connection for a project (must already be accepted)."""
async with self._lock:
if project_name not in self.active_connections:
self.active_connections[project_name] = set()
@@ -727,16 +725,22 @@ async def project_websocket(websocket: WebSocket, project_name: str):
- Agent status changes
- Agent stdout/stderr lines
"""
# Always accept WebSocket first to avoid opaque 403 errors
await websocket.accept()
if not validate_project_name(project_name):
await websocket.send_json({"type": "error", "content": "Invalid project name"})
await websocket.close(code=4000, reason="Invalid project name")
return
project_dir = _get_project_path(project_name)
if not project_dir:
await websocket.send_json({"type": "error", "content": "Project not found in registry"})
await websocket.close(code=4004, reason="Project not found in registry")
return
if not project_dir.exists():
await websocket.send_json({"type": "error", "content": "Project directory not found"})
await websocket.close(code=4004, reason="Project directory not found")
return
@@ -879,8 +883,7 @@ async def project_websocket(websocket: WebSocket, project_name: str):
break
except json.JSONDecodeError:
logger.warning(f"Invalid JSON from WebSocket: {data[:100] if data else 'empty'}")
except Exception as e:
logger.warning(f"WebSocket error: {e}")
except Exception:
break
finally:


@@ -390,8 +390,11 @@ def run_agent(project_name: str, project_dir: Path) -> None:
print(f"Location: {project_dir}")
print("-" * 50)
# Build the command - pass absolute path
cmd = [sys.executable, "autonomous_agent_demo.py", "--project-dir", str(project_dir.resolve())]
# Build the command - pass absolute path and model from settings
from registry import DEFAULT_MODEL, get_all_settings
settings = get_all_settings()
model = settings.get("api_model") or settings.get("model", DEFAULT_MODEL)
cmd = [sys.executable, "autonomous_agent_demo.py", "--project-dir", str(project_dir.resolve()), "--model", model]
# Run the agent with stderr capture to detect auth errors
# stdout goes directly to terminal for real-time output

temp_cleanup.py (new file, 197 lines)

@@ -0,0 +1,197 @@
"""
Temp Cleanup Module
===================
Cleans up stale temporary files and directories created by AutoForge agents,
Playwright, Node.js, and other development tools.
Called at Maestro (orchestrator) startup to prevent temp folder bloat.
Why this exists:
- Playwright creates browser profiles and artifacts in %TEMP%
- Node.js creates .node cache files (~7MB each, can accumulate to GBs)
- MongoDB Memory Server downloads binaries to temp
- These are never cleaned up automatically
When cleanup runs:
- At Maestro startup (when you click Play or auto-restart after rate limits)
- Only files/folders older than 1 hour are deleted (safe for running processes)
"""
import logging
import shutil
import tempfile
import time
from pathlib import Path
logger = logging.getLogger(__name__)
# Max age in seconds before a temp item is considered stale (1 hour)
MAX_AGE_SECONDS = 3600
# Directory patterns to clean up (glob patterns)
DIR_PATTERNS = [
"playwright_firefoxdev_profile-*", # Playwright Firefox profiles
"playwright-artifacts-*", # Playwright test artifacts
"playwright-transform-cache", # Playwright transform cache
"mongodb-memory-server*", # MongoDB Memory Server binaries
"ng-*", # Angular CLI temp directories
"scoped_dir*", # Chrome/Chromium temp directories
"node-compile-cache", # Node.js V8 compile cache directory
]
# File patterns to clean up (glob patterns)
FILE_PATTERNS = [
".[0-9a-f]*.node", # Node.js/V8 compile cache files (~7MB each, varying hex prefixes)
"claude-*-cwd", # Claude CLI working directory temp files
"mat-debug-*.log", # Material/Angular debug logs
]
def cleanup_stale_temp(max_age_seconds: int = MAX_AGE_SECONDS) -> dict:
"""
Clean up stale temporary files and directories.
Only deletes items older than max_age_seconds to avoid
interfering with currently running processes.
Args:
max_age_seconds: Maximum age in seconds before an item is deleted.
Defaults to 1 hour (3600 seconds).
Returns:
Dictionary with cleanup statistics:
- dirs_deleted: Number of directories deleted
- files_deleted: Number of files deleted
- bytes_freed: Approximate bytes freed
- errors: List of error messages (for debugging, not fatal)
"""
temp_dir = Path(tempfile.gettempdir())
cutoff_time = time.time() - max_age_seconds
stats = {
"dirs_deleted": 0,
"files_deleted": 0,
"bytes_freed": 0,
"errors": [],
}
# Clean up directories
for pattern in DIR_PATTERNS:
for item in temp_dir.glob(pattern):
if not item.is_dir():
continue
try:
mtime = item.stat().st_mtime
if mtime < cutoff_time:
size = _get_dir_size(item)
shutil.rmtree(item, ignore_errors=True)
if not item.exists():
stats["dirs_deleted"] += 1
stats["bytes_freed"] += size
logger.debug(f"Deleted temp directory: {item}")
except Exception as e:
stats["errors"].append(f"Failed to delete {item}: {e}")
logger.debug(f"Failed to delete {item}: {e}")
# Clean up files
for pattern in FILE_PATTERNS:
for item in temp_dir.glob(pattern):
if not item.is_file():
continue
try:
mtime = item.stat().st_mtime
if mtime < cutoff_time:
size = item.stat().st_size
item.unlink(missing_ok=True)
if not item.exists():
stats["files_deleted"] += 1
stats["bytes_freed"] += size
logger.debug(f"Deleted temp file: {item}")
except Exception as e:
stats["errors"].append(f"Failed to delete {item}: {e}")
logger.debug(f"Failed to delete {item}: {e}")
# Log summary if anything was cleaned
if stats["dirs_deleted"] > 0 or stats["files_deleted"] > 0:
mb_freed = stats["bytes_freed"] / (1024 * 1024)
logger.info(
f"Temp cleanup: {stats['dirs_deleted']} dirs, "
f"{stats['files_deleted']} files, {mb_freed:.1f} MB freed"
)
return stats
def cleanup_project_screenshots(project_dir: Path, max_age_seconds: int = 300) -> dict:
"""
Clean up stale screenshot files from the project root.
Playwright browser verification can leave .png files in the project
directory. This removes them after they've aged out (default 5 minutes).
Args:
project_dir: Path to the project directory.
max_age_seconds: Maximum age in seconds before a screenshot is deleted.
Defaults to 5 minutes (300 seconds).
Returns:
Dictionary with cleanup statistics (files_deleted, bytes_freed, errors).
"""
cutoff_time = time.time() - max_age_seconds
stats: dict = {"files_deleted": 0, "bytes_freed": 0, "errors": []}
screenshot_patterns = [
"feature*-*.png",
"screenshot-*.png",
"step-*.png",
]
for pattern in screenshot_patterns:
for item in project_dir.glob(pattern):
if not item.is_file():
continue
try:
mtime = item.stat().st_mtime
if mtime < cutoff_time:
size = item.stat().st_size
item.unlink(missing_ok=True)
if not item.exists():
stats["files_deleted"] += 1
stats["bytes_freed"] += size
logger.debug(f"Deleted project screenshot: {item}")
except Exception as e:
stats["errors"].append(f"Failed to delete {item}: {e}")
logger.debug(f"Failed to delete screenshot {item}: {e}")
if stats["files_deleted"] > 0:
mb_freed = stats["bytes_freed"] / (1024 * 1024)
logger.info(f"Screenshot cleanup: {stats['files_deleted']} files, {mb_freed:.1f} MB freed")
return stats
def _get_dir_size(path: Path) -> int:
"""Get total size of a directory in bytes."""
total = 0
try:
for item in path.rglob("*"):
if item.is_file():
try:
total += item.stat().st_size
except (OSError, PermissionError):
pass
except (OSError, PermissionError):
pass
return total
if __name__ == "__main__":
# Allow running directly for testing/manual cleanup
logging.basicConfig(level=logging.DEBUG)
print("Running temp cleanup...")
stats = cleanup_stale_temp()
mb_freed = stats["bytes_freed"] / (1024 * 1024)
print(f"Cleanup complete: {stats['dirs_deleted']} dirs, {stats['files_deleted']} files, {mb_freed:.1f} MB freed")
if stats["errors"]:
print(f"Errors (non-fatal): {len(stats['errors'])}")
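Two details of this module are worth seeing in isolation: the widened `.node` glob (any hex prefix, not just `.78912*`) and the mtime cutoff that keeps the cleanup safe for running processes. A self-contained check:

```python
import fnmatch
import os
import time

# The widened glob matches any hex-prefixed V8 compile-cache file name
NODE_PATTERN = ".[0-9a-f]*.node"

def is_stale(path: str, max_age_seconds: int = 3600) -> bool:
    """True if the file's mtime is older than the cutoff."""
    return os.stat(path).st_mtime < time.time() - max_age_seconds
```

`fnmatch.fnmatchcase` is used below to sidestep the case-insensitive matching `fnmatch.fnmatch` applies on Windows; `Path.glob` in the module itself evaluates the same pattern syntax.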


@@ -40,15 +40,15 @@ class TestConvertModelForVertex(unittest.TestCase):
def test_returns_model_unchanged_when_vertex_disabled(self):
os.environ.pop("CLAUDE_CODE_USE_VERTEX", None)
self.assertEqual(
convert_model_for_vertex("claude-opus-4-5-20251101"),
"claude-opus-4-5-20251101",
convert_model_for_vertex("claude-opus-4-6"),
"claude-opus-4-6",
)
def test_returns_model_unchanged_when_vertex_set_to_zero(self):
os.environ["CLAUDE_CODE_USE_VERTEX"] = "0"
self.assertEqual(
convert_model_for_vertex("claude-opus-4-5-20251101"),
"claude-opus-4-5-20251101",
convert_model_for_vertex("claude-opus-4-6"),
"claude-opus-4-6",
)
def test_returns_model_unchanged_when_vertex_set_to_empty(self):
@@ -60,13 +60,20 @@ class TestConvertModelForVertex(unittest.TestCase):
# --- Vertex AI enabled: standard conversions ---
def test_converts_opus_model(self):
def test_converts_legacy_opus_model(self):
os.environ["CLAUDE_CODE_USE_VERTEX"] = "1"
self.assertEqual(
convert_model_for_vertex("claude-opus-4-5-20251101"),
"claude-opus-4-5@20251101",
)
def test_opus_4_6_passthrough_on_vertex(self):
os.environ["CLAUDE_CODE_USE_VERTEX"] = "1"
self.assertEqual(
convert_model_for_vertex("claude-opus-4-6"),
"claude-opus-4-6",
)
def test_converts_sonnet_model(self):
os.environ["CLAUDE_CODE_USE_VERTEX"] = "1"
self.assertEqual(
@@ -86,8 +93,8 @@ class TestConvertModelForVertex(unittest.TestCase):
def test_already_vertex_format_unchanged(self):
os.environ["CLAUDE_CODE_USE_VERTEX"] = "1"
self.assertEqual(
convert_model_for_vertex("claude-opus-4-5@20251101"),
"claude-opus-4-5@20251101",
convert_model_for_vertex("claude-sonnet-4-5@20250929"),
"claude-sonnet-4-5@20250929",
)
def test_non_claude_model_unchanged(self):
@@ -100,8 +107,8 @@ class TestConvertModelForVertex(unittest.TestCase):
def test_model_without_date_suffix_unchanged(self):
os.environ["CLAUDE_CODE_USE_VERTEX"] = "1"
self.assertEqual(
convert_model_for_vertex("claude-opus-4-5"),
"claude-opus-4-5",
convert_model_for_vertex("claude-opus-4-6"),
"claude-opus-4-6",
)
def test_empty_string_unchanged(self):

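The behavior these tests pin down — date-suffixed Claude models gain an `@` separator when Vertex is enabled, everything else passes through — fits in a few lines. A hypothetical reimplementation consistent with the expectations above (not the project's actual function):

```python
import os
import re

# Matches date-suffixed Claude model IDs like claude-opus-4-5-20251101
_DATE_SUFFIX = re.compile(r"^(claude-[a-z]+-\d+(?:-\d+)*)-(\d{8})$")

def convert_model_for_vertex(model: str) -> str:
    """Rewrite 'name-YYYYMMDD' to Vertex's 'name@YYYYMMDD' form when enabled."""
    if os.environ.get("CLAUDE_CODE_USE_VERTEX") != "1":
        return model
    m = _DATE_SUFFIX.match(model)
    return f"{m.group(1)}@{m.group(2)}" if m else model
```

The regex explains the new `claude-opus-4-6` passthrough cases: without an eight-digit date suffix there is nothing to convert.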
test_devserver_security.py (new file, 319 lines)

@@ -0,0 +1,319 @@
#!/usr/bin/env python3
"""
Dev Server Security Tests
=========================
Tests for dev server command validation and security hardening.
Run with: python -m pytest test_devserver_security.py -v
"""
import sys
from pathlib import Path
import pytest
# Add project root to path
sys.path.insert(0, str(Path(__file__).parent))
from server.routers.devserver import (
ALLOWED_NPM_SCRIPTS,
ALLOWED_PYTHON_MODULES,
ALLOWED_RUNNERS,
BLOCKED_SHELLS,
validate_custom_command_strict,
)
# =============================================================================
# validate_custom_command_strict - Valid commands
# =============================================================================
class TestValidCommands:
"""Commands that should pass validation."""
def test_npm_run_dev(self):
validate_custom_command_strict("npm run dev")
def test_npm_run_start(self):
validate_custom_command_strict("npm run start")
def test_npm_run_serve(self):
validate_custom_command_strict("npm run serve")
def test_npm_run_preview(self):
validate_custom_command_strict("npm run preview")
def test_pnpm_dev(self):
validate_custom_command_strict("pnpm dev")
def test_pnpm_run_dev(self):
validate_custom_command_strict("pnpm run dev")
def test_yarn_start(self):
validate_custom_command_strict("yarn start")
def test_yarn_run_serve(self):
validate_custom_command_strict("yarn run serve")
def test_uvicorn_basic(self):
validate_custom_command_strict("uvicorn main:app")
def test_uvicorn_with_flags(self):
validate_custom_command_strict("uvicorn main:app --host 0.0.0.0 --port 8000 --reload")
def test_uvicorn_flag_equals_syntax(self):
validate_custom_command_strict("uvicorn main:app --port=8000 --host=0.0.0.0")
def test_python_m_uvicorn(self):
validate_custom_command_strict("python -m uvicorn main:app --reload")
def test_python3_m_uvicorn(self):
validate_custom_command_strict("python3 -m uvicorn main:app")
def test_python_m_flask(self):
validate_custom_command_strict("python -m flask run")
def test_python_m_gunicorn(self):
validate_custom_command_strict("python -m gunicorn main:app")
def test_python_m_http_server(self):
validate_custom_command_strict("python -m http.server 8000")
def test_python_script(self):
validate_custom_command_strict("python app.py")
def test_python_manage_py_runserver(self):
validate_custom_command_strict("python manage.py runserver")
def test_python_manage_py_runserver_with_port(self):
validate_custom_command_strict("python manage.py runserver 0.0.0.0:8000")
def test_flask_run(self):
validate_custom_command_strict("flask run")
def test_flask_run_with_options(self):
validate_custom_command_strict("flask run --host 0.0.0.0 --port 5000")
def test_poetry_run_command(self):
validate_custom_command_strict("poetry run python app.py")
def test_cargo_run(self):
# cargo is allowed but has no special sub-validation
validate_custom_command_strict("cargo run")
def test_go_run(self):
# go is allowed but has no special sub-validation
validate_custom_command_strict("go run .")
# =============================================================================
# validate_custom_command_strict - Blocked shells
# =============================================================================
class TestBlockedShells:
"""Shell interpreters that must be rejected."""
@pytest.mark.parametrize("shell", ["sh", "bash", "zsh", "cmd", "powershell", "pwsh", "cmd.exe"])
def test_blocked_shell(self, shell):
with pytest.raises(ValueError, match="runner not allowed"):
validate_custom_command_strict(f"{shell} -c 'echo hacked'")
# =============================================================================
# validate_custom_command_strict - Blocked commands
# =============================================================================
class TestBlockedCommands:
"""Commands that should be rejected."""
def test_empty_command(self):
with pytest.raises(ValueError, match="cannot be empty"):
validate_custom_command_strict("")
def test_whitespace_only(self):
with pytest.raises(ValueError, match="cannot be empty"):
validate_custom_command_strict(" ")
def test_python_dash_c(self):
with pytest.raises(ValueError, match="python -c is not allowed"):
validate_custom_command_strict("python -c 'import os; os.system(\"rm -rf /\")'")
def test_python3_dash_c(self):
with pytest.raises(ValueError, match="python -c is not allowed"):
validate_custom_command_strict("python3 -c 'print(1)'")
def test_python_no_script_or_module(self):
with pytest.raises(ValueError, match="must use"):
validate_custom_command_strict("python --version")
def test_python_m_disallowed_module(self):
with pytest.raises(ValueError, match="not allowed"):
validate_custom_command_strict("python -m pip install something")
def test_unknown_runner(self):
with pytest.raises(ValueError, match="runner not allowed"):
validate_custom_command_strict("curl http://evil.com")
def test_rm_rf(self):
with pytest.raises(ValueError, match="runner not allowed"):
validate_custom_command_strict("rm -rf /")
def test_npm_arbitrary_script(self):
with pytest.raises(ValueError, match="npm custom_command"):
validate_custom_command_strict("npm run postinstall")
def test_npm_exec(self):
with pytest.raises(ValueError, match="npm custom_command"):
validate_custom_command_strict("npm exec evil-package")
def test_pnpm_arbitrary_script(self):
with pytest.raises(ValueError, match="pnpm custom_command"):
validate_custom_command_strict("pnpm run postinstall")
def test_yarn_arbitrary_script(self):
with pytest.raises(ValueError, match="yarn custom_command"):
validate_custom_command_strict("yarn run postinstall")
def test_uvicorn_no_app(self):
with pytest.raises(ValueError, match="must specify an app"):
validate_custom_command_strict("uvicorn --reload")
def test_uvicorn_disallowed_flag(self):
with pytest.raises(ValueError, match="flag not allowed"):
validate_custom_command_strict("uvicorn main:app --factory")
def test_flask_no_run(self):
with pytest.raises(ValueError, match="flask custom_command"):
validate_custom_command_strict("flask shell")
def test_poetry_no_run(self):
with pytest.raises(ValueError, match="poetry custom_command"):
validate_custom_command_strict("poetry install")
# =============================================================================
# validate_custom_command_strict - Injection attempts
# =============================================================================
class TestInjectionAttempts:
"""Adversarial inputs that attempt to bypass validation."""
def test_shell_via_path_traversal(self):
with pytest.raises(ValueError, match="runner not allowed"):
validate_custom_command_strict("/bin/sh -c 'echo hacked'")
def test_shell_via_relative_path(self):
with pytest.raises(ValueError, match="runner not allowed"):
validate_custom_command_strict("../../bin/bash -c whoami")
def test_none_input(self):
with pytest.raises(ValueError, match="cannot be empty"):
validate_custom_command_strict(None) # type: ignore[arg-type]
def test_integer_input(self):
with pytest.raises(ValueError, match="cannot be empty"):
validate_custom_command_strict(123) # type: ignore[arg-type]
def test_python_dash_c_uppercase(self):
with pytest.raises(ValueError, match="python -c is not allowed"):
validate_custom_command_strict("python -C 'exec(evil)'")
def test_powershell_via_path(self):
with pytest.raises(ValueError, match="runner not allowed"):
validate_custom_command_strict("C:\\Windows\\System32\\powershell.exe -c Get-Process")
# =============================================================================
# dev_server_manager.py - dangerous_ops blocking
# =============================================================================
class TestDangerousOpsBlocking:
"""Test the metacharacter blocking in dev_server_manager.start()."""
@pytest.fixture
def manager(self, tmp_path):
from server.services.dev_server_manager import DevServerProcessManager
return DevServerProcessManager("test-project", tmp_path)
@pytest.mark.asyncio
@pytest.mark.parametrize("cmd,desc", [
("npm run dev && curl evil.com", "double ampersand"),
("npm run dev & curl evil.com", "single ampersand"),
("npm run dev || curl evil.com", "double pipe"),
("npm run dev | curl evil.com", "single pipe"),
("npm run dev ; curl evil.com", "semicolon"),
("npm run dev `curl evil.com`", "backtick"),
("npm run dev $(curl evil.com)", "dollar paren"),
("npm run dev > /etc/passwd", "output redirect"),
("npm run dev < /etc/passwd", "input redirect"),
("npm run dev ^& calc", "caret escape"),
("npm run %COMSPEC%", "percent env expansion"),
])
async def test_blocks_shell_operator(self, manager, cmd, desc):
success, message = await manager.start(cmd)
assert not success, f"Should block {desc}: {cmd}"
assert "not allowed" in message.lower()
@pytest.mark.asyncio
async def test_blocks_newline_injection(self, manager):
success, message = await manager.start("npm run dev\ncurl evil.com")
assert not success
assert "newline" in message.lower()
@pytest.mark.asyncio
async def test_blocks_carriage_return(self, manager):
success, message = await manager.start("npm run dev\r\ncurl evil.com")
assert not success
assert "newline" in message.lower()
@pytest.mark.asyncio
@pytest.mark.parametrize("shell", ["sh", "bash", "zsh", "cmd", "powershell", "pwsh"])
async def test_blocks_shell_runners(self, manager, shell):
success, message = await manager.start(f"{shell} -c 'echo hacked'")
assert not success
assert "not allowed" in message.lower()
@pytest.mark.asyncio
async def test_blocks_empty_command(self, manager):
success, message = await manager.start("")
assert not success
assert "empty" in message.lower()
@pytest.mark.asyncio
async def test_blocks_whitespace_command(self, manager):
success, message = await manager.start(" ")
assert not success
assert "empty" in message.lower()
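The metacharacter screening inside `DevServerProcessManager.start()` is not shown in this diff. A standalone sketch of the pre-check these tests imply (the function name, character set, and messages are assumptions; the real method also validates the runner) could be:

```python
# Hypothetical pre-check mirroring what the tests above expect: reject any
# command containing shell metacharacters or newlines before a process spawns.
SHELL_METACHARS = set("&|;<>`$^%")

def precheck_dev_command(cmd: str) -> tuple[bool, str]:
    if not cmd or not cmd.strip():
        return False, "command cannot be empty"
    if "\n" in cmd or "\r" in cmd:
        return False, "newline characters are not allowed"
    found = sorted(SHELL_METACHARS & set(cmd))
    if found:
        return False, f"shell metacharacters not allowed: {' '.join(found)}"
    return True, "ok"
```

Blocking shell runners themselves (`sh`, `bash`, `powershell`, ...) would still be delegated to `validate_custom_command_strict`; this check only guards against operator injection such as `&&` or `$( )`.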
# =============================================================================
# Constants validation
# =============================================================================
class TestConstants:
"""Verify security constants are properly defined."""
def test_all_common_shells_blocked(self):
for shell in ["sh", "bash", "zsh", "cmd", "powershell", "pwsh", "cmd.exe"]:
assert shell in BLOCKED_SHELLS, f"{shell} should be in BLOCKED_SHELLS"
def test_common_npm_scripts_allowed(self):
for script in ["dev", "start", "serve", "preview"]:
assert script in ALLOWED_NPM_SCRIPTS, f"{script} should be in ALLOWED_NPM_SCRIPTS"
def test_common_python_modules_allowed(self):
for mod in ["uvicorn", "flask", "gunicorn"]:
assert mod in ALLOWED_PYTHON_MODULES, f"{mod} should be in ALLOWED_PYTHON_MODULES"
def test_common_runners_allowed(self):
for runner in ["npm", "pnpm", "yarn", "python", "python3", "uvicorn", "flask", "cargo", "go"]:
assert runner in ALLOWED_RUNNERS, f"{runner} should be in ALLOWED_RUNNERS"
if __name__ == "__main__":
pytest.main([__file__, "-v"])
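For readers following along, `validate_custom_command_strict` lives in `server/routers/devserver.py` and is not included in this diff. A condensed sketch that satisfies the behaviors tested above (the uvicorn app/flag checks are omitted for brevity, and everything beyond the imported names is an assumption):

```python
import shlex

BLOCKED_SHELLS = {"sh", "bash", "zsh", "cmd", "cmd.exe", "powershell", "pwsh"}
ALLOWED_RUNNERS = {"npm", "npx", "pnpm", "yarn", "python", "python3",
                   "uvicorn", "flask", "gunicorn", "poetry", "cargo", "go"}
ALLOWED_NPM_SCRIPTS = {"dev", "start", "serve", "preview"}
ALLOWED_PYTHON_MODULES = {"uvicorn", "flask", "gunicorn", "http.server"}

def validate_custom_command_strict(command) -> None:
    if not isinstance(command, str) or not command.strip():
        raise ValueError("custom_command cannot be empty")
    parts = shlex.split(command)
    # Strip any path prefix so "/bin/sh" or "../../bin/bash" reduces to "sh"/"bash"
    runner = parts[0].replace("\\", "/").rsplit("/", 1)[-1].lower()
    if runner in BLOCKED_SHELLS or runner not in ALLOWED_RUNNERS:
        raise ValueError(f"runner not allowed: {runner}")
    args = parts[1:]
    if runner in ("python", "python3"):
        if args and args[0].lower() == "-c":
            raise ValueError("python -c is not allowed")
        if args[:1] == ["-m"]:
            if len(args) < 2 or args[1] not in ALLOWED_PYTHON_MODULES:
                raise ValueError("python module not allowed")
        elif not args or not args[0].endswith(".py"):
            raise ValueError("python must use a .py script or -m <module>")
    elif runner in ("npm", "pnpm", "yarn"):
        script = args[1] if args[:1] == ["run"] else (args[0] if args else "")
        if script not in ALLOWED_NPM_SCRIPTS:
            raise ValueError(f"{runner} custom_command must run an allowed script")
    elif runner == "flask":
        if args[:1] != ["run"]:
            raise ValueError("flask custom_command must use 'flask run'")
    elif runner == "poetry":
        if args[:1] != ["run"]:
            raise ValueError("poetry custom_command must use 'poetry run'")
```

Note that basename-normalizing the runner before the allowlist check is what defeats the path-traversal cases in `TestInjectionAttempts`.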

47
ui/e2e/tooltip.spec.ts Normal file
View File

@@ -0,0 +1,47 @@
import { test, expect } from '@playwright/test'
/**
* E2E tooltip tests for header icon buttons.
*
* Run tests:
* cd ui && npm run test:e2e
* cd ui && npm run test:e2e -- tooltip.spec.ts
*/
test.describe('Header tooltips', () => {
test.setTimeout(30000)
test.beforeEach(async ({ page }) => {
await page.goto('/')
await page.waitForSelector('button:has-text("Select Project")', { timeout: 10000 })
})
async function selectProject(page: import('@playwright/test').Page) {
const projectSelector = page.locator('button:has-text("Select Project")')
if (await projectSelector.isVisible()) {
await projectSelector.click()
const items = page.locator('.neo-dropdown-item')
const itemCount = await items.count()
if (itemCount === 0) return false
await items.first().click()
await expect(projectSelector).not.toBeVisible({ timeout: 5000 }).catch(() => {})
return true
}
return false
}
test('Settings tooltip shows on hover', async ({ page }) => {
const hasProject = await selectProject(page)
if (!hasProject) {
test.skip(true, 'No projects available')
return
}
const settingsButton = page.locator('button[aria-label="Open Settings"]')
await expect(settingsButton).toBeVisible()
await settingsButton.hover()
const tooltip = page.locator('[data-slot="tooltip-content"]', { hasText: 'Settings' })
await expect(tooltip).toBeVisible({ timeout: 2000 })
})
})

1558
ui/package-lock.json generated

File diff suppressed because it is too large Load Diff

View File

@@ -19,11 +19,13 @@
"@radix-ui/react-separator": "^1.1.8",
"@radix-ui/react-slot": "^1.2.4",
"@radix-ui/react-switch": "^1.2.6",
"@radix-ui/react-tooltip": "^1.2.8",
"@tanstack/react-query": "^5.72.0",
"@xterm/addon-fit": "^0.11.0",
"@xterm/addon-web-links": "^0.12.0",
"@xterm/xterm": "^6.0.0",
"@xyflow/react": "^12.10.0",
"autoforge-ai": "file:..",
"canvas-confetti": "^1.9.4",
"class-variance-authority": "^0.7.1",
"clsx": "^2.1.1",
@@ -31,6 +33,8 @@
"lucide-react": "^0.475.0",
"react": "^19.0.0",
"react-dom": "^19.0.0",
"react-markdown": "^10.1.0",
"remark-gfm": "^4.0.1",
"tailwind-merge": "^3.4.0"
},
"devDependencies": {

View File

@@ -33,6 +33,7 @@ import type { Feature } from './lib/types'
import { Button } from '@/components/ui/button'
import { Card, CardContent } from '@/components/ui/card'
import { Badge } from '@/components/ui/badge'
import { TooltipProvider, Tooltip, TooltipTrigger, TooltipContent } from '@/components/ui/tooltip'
const STORAGE_KEY = 'autoforge-selected-project'
const VIEW_MODE_KEY = 'autoforge-view-mode'
@@ -178,8 +179,8 @@ function App() {
setShowAddFeature(true)
}
-// E : Expand project with AI (when project selected and has features)
-if ((e.key === 'e' || e.key === 'E') && selectedProject && features &&
+// E : Expand project with AI (when project selected, has spec and has features)
+if ((e.key === 'e' || e.key === 'E') && selectedProject && hasSpec && features &&
(features.pending.length + features.in_progress.length + features.done.length) > 0) {
e.preventDefault()
setShowExpandProject(true)
@@ -239,7 +240,7 @@ function App() {
window.addEventListener('keydown', handleKeyDown)
return () => window.removeEventListener('keydown', handleKeyDown)
-}, [selectedProject, showAddFeature, showExpandProject, selectedFeature, debugOpen, debugActiveTab, assistantOpen, features, showSettings, showKeyboardHelp, isSpecCreating, viewMode, showResetModal, wsState.agentStatus])
+}, [selectedProject, showAddFeature, showExpandProject, selectedFeature, debugOpen, debugActiveTab, assistantOpen, features, showSettings, showKeyboardHelp, isSpecCreating, viewMode, showResetModal, wsState.agentStatus, hasSpec])
// Combine WebSocket progress with feature data
const progress = wsState.progress.total > 0 ? wsState.progress : {
@@ -260,18 +261,19 @@ function App() {
<div className="min-h-screen bg-background">
{/* Header */}
<header className="sticky top-0 z-50 bg-card/80 backdrop-blur-md text-foreground border-b-2 border-border">
<div className="max-w-7xl mx-auto px-4 py-4">
<div className="flex items-center justify-between">
{/* Logo and Title */}
<div className="max-w-7xl mx-auto px-4 py-3">
<TooltipProvider>
{/* Row 1: Branding + Project + Utility icons */}
<div className="flex items-center gap-3">
<img src="/logo.png" alt="AutoForge" className="h-9 w-9 rounded-full" />
<h1 className="font-display text-2xl font-bold tracking-tight uppercase">
AutoForge
</h1>
</div>
{/* Logo and Title */}
<div className="flex items-center gap-2 shrink-0">
<img src="/logo.png" alt="AutoForge" className="h-9 w-9 rounded-full" />
<h1 className="font-display text-2xl font-bold tracking-tight uppercase hidden md:block">
AutoForge
</h1>
</div>
{/* Controls */}
<div className="flex items-center gap-4">
{/* Project selector */}
<ProjectSelector
projects={projects ?? []}
selectedProject={selectedProject}
@@ -280,94 +282,114 @@ function App() {
onSpecCreatingChange={setIsSpecCreating}
/>
{selectedProject && (
<>
<AgentControl
projectName={selectedProject}
status={wsState.agentStatus}
defaultConcurrency={selectedProjectData?.default_concurrency}
/>
{/* Spacer */}
<div className="flex-1" />
<DevServerControl
projectName={selectedProject}
status={wsState.devServerStatus}
url={wsState.devServerUrl}
/>
<Button
onClick={() => setShowSettings(true)}
variant="outline"
size="sm"
title="Settings (,)"
aria-label="Open Settings"
>
<Settings size={18} />
</Button>
<Button
onClick={() => setShowResetModal(true)}
variant="outline"
size="sm"
title="Reset Project (R)"
aria-label="Reset Project"
disabled={wsState.agentStatus === 'running'}
>
<RotateCcw size={18} />
</Button>
{/* Ollama Mode Indicator */}
{settings?.ollama_mode && (
<div
className="flex items-center gap-1.5 px-2 py-1 bg-card rounded border-2 border-border shadow-sm"
title="Using Ollama local models (configured via .env)"
>
<img src="/ollama.png" alt="Ollama" className="w-5 h-5" />
<span className="text-xs font-bold text-foreground">Ollama</span>
</div>
)}
{/* GLM Mode Badge */}
{settings?.glm_mode && (
<Badge
className="bg-purple-500 text-white hover:bg-purple-600"
title="Using GLM API (configured via .env)"
>
GLM
</Badge>
)}
</>
{/* Ollama Mode Indicator */}
{selectedProject && settings?.ollama_mode && (
<div
className="hidden sm:flex items-center gap-1.5 px-2 py-1 bg-card rounded border-2 border-border shadow-sm"
title="Using Ollama local models"
>
<img src="/ollama.png" alt="Ollama" className="w-5 h-5" />
<span className="text-xs font-bold text-foreground">Ollama</span>
</div>
)}
{/* Docs link */}
<Button
onClick={() => window.open('https://autoforge.cc', '_blank')}
variant="outline"
size="sm"
title="Documentation"
aria-label="Open Documentation"
>
<BookOpen size={18} />
</Button>
{/* GLM Mode Badge */}
{selectedProject && settings?.glm_mode && (
<Badge
className="hidden sm:inline-flex bg-purple-500 text-white hover:bg-purple-600"
title="Using GLM API"
>
GLM
</Badge>
)}
{/* Utility icons - always visible */}
<Tooltip>
<TooltipTrigger asChild>
<Button
onClick={() => window.open('https://autoforge.cc', '_blank')}
variant="outline"
size="sm"
aria-label="Open Documentation"
>
<BookOpen size={18} />
</Button>
</TooltipTrigger>
<TooltipContent>Docs</TooltipContent>
</Tooltip>
{/* Theme selector */}
<ThemeSelector
themes={themes}
currentTheme={theme}
onThemeChange={setTheme}
/>
{/* Dark mode toggle - always visible */}
<Button
onClick={toggleDarkMode}
variant="outline"
size="sm"
title="Toggle dark mode"
aria-label="Toggle dark mode"
>
{darkMode ? <Sun size={18} /> : <Moon size={18} />}
</Button>
<Tooltip>
<TooltipTrigger asChild>
<Button
onClick={toggleDarkMode}
variant="outline"
size="sm"
aria-label="Toggle dark mode"
>
{darkMode ? <Sun size={18} /> : <Moon size={18} />}
</Button>
</TooltipTrigger>
<TooltipContent>Toggle theme</TooltipContent>
</Tooltip>
</div>
</div>
{/* Row 2: Project controls - only when a project is selected */}
{selectedProject && (
<div className="flex items-center gap-3 mt-2 pt-2 border-t border-border/50">
<AgentControl
projectName={selectedProject}
status={wsState.agentStatus}
defaultConcurrency={selectedProjectData?.default_concurrency}
/>
<DevServerControl
projectName={selectedProject}
status={wsState.devServerStatus}
url={wsState.devServerUrl}
/>
<div className="flex-1" />
<Tooltip>
<TooltipTrigger asChild>
<Button
onClick={() => setShowSettings(true)}
variant="outline"
size="sm"
aria-label="Open Settings"
>
<Settings size={18} />
</Button>
</TooltipTrigger>
<TooltipContent>Settings (,)</TooltipContent>
</Tooltip>
<Tooltip>
<TooltipTrigger asChild>
<Button
onClick={() => setShowResetModal(true)}
variant="outline"
size="sm"
aria-label="Reset Project"
disabled={wsState.agentStatus === 'running'}
>
<RotateCcw size={18} />
</Button>
</TooltipTrigger>
<TooltipContent>Reset (R)</TooltipContent>
</Tooltip>
</div>
)}
</TooltipProvider>
</div>
</header>
@@ -490,7 +512,7 @@ function App() {
)}
{/* Expand Project Modal - AI-powered bulk feature creation */}
-{showExpandProject && selectedProject && (
+{showExpandProject && selectedProject && hasSpec && (
<ExpandProjectModal
isOpen={showExpandProject}
projectName={selectedProject}

View File

@@ -81,7 +81,7 @@ export function AgentControl({ projectName, status, defaultConcurrency = 3 }: Ag
return (
<>
-<div className="flex items-center gap-4">
+<div className="flex items-center gap-2 sm:gap-4">
{/* Concurrency slider - visible when stopped */}
{isStopped && (
<div className="flex items-center gap-2">

View File

@@ -11,6 +11,7 @@ import { Send, Loader2, Wifi, WifiOff, Plus, History } from 'lucide-react'
import { useAssistantChat } from '../hooks/useAssistantChat'
import { ChatMessage as ChatMessageComponent } from './ChatMessage'
import { ConversationHistory } from './ConversationHistory'
import { QuestionOptions } from './QuestionOptions'
import type { ChatMessage } from '../lib/types'
import { isSubmitEnter } from '../lib/keyboard'
import { Button } from '@/components/ui/button'
@@ -52,8 +53,10 @@ export function AssistantChat({
isLoading,
connectionStatus,
conversationId: activeConversationId,
currentQuestions,
start,
sendMessage,
sendAnswer,
clearMessages,
} = useAssistantChat({
projectName,
@@ -268,6 +271,16 @@ export function AssistantChat({
</div>
)}
{/* Structured questions from assistant */}
{currentQuestions && (
<div className="border-t border-border bg-background">
<QuestionOptions
questions={currentQuestions}
onSubmit={sendAnswer}
/>
</div>
)}
{/* Input area */}
<div className="border-t border-border p-4 bg-card">
<div className="flex gap-2">
@@ -277,13 +290,13 @@ export function AssistantChat({
onChange={(e) => setInputValue(e.target.value)}
onKeyDown={handleKeyDown}
placeholder="Ask about the codebase..."
-disabled={isLoading || isLoadingConversation || connectionStatus !== 'connected'}
+disabled={isLoading || isLoadingConversation || connectionStatus !== 'connected' || !!currentQuestions}
className="flex-1 resize-none min-h-[44px] max-h-[120px]"
rows={1}
/>
<Button
onClick={handleSend}
-disabled={!inputValue.trim() || isLoading || isLoadingConversation || connectionStatus !== 'connected'}
+disabled={!inputValue.trim() || isLoading || isLoadingConversation || connectionStatus !== 'connected' || !!currentQuestions}
title="Send message"
>
{isLoading ? (
@@ -294,7 +307,7 @@ export function AssistantChat({
</Button>
</div>
<p className="text-xs text-muted-foreground mt-2">
-Press Enter to send, Shift+Enter for new line
+{currentQuestions ? 'Select an option above to continue' : 'Press Enter to send, Shift+Enter for new line'}
</p>
</div>
</div>

View File

@@ -6,7 +6,7 @@
* Manages conversation state with localStorage persistence.
*/
-import { useState, useEffect, useCallback } from 'react'
+import { useState, useEffect, useCallback, useRef } from 'react'
import { X, Bot } from 'lucide-react'
import { AssistantChat } from './AssistantChat'
import { useConversation } from '../hooks/useConversations'
@@ -20,6 +20,10 @@ interface AssistantPanelProps {
}
const STORAGE_KEY_PREFIX = 'assistant-conversation-'
const WIDTH_STORAGE_KEY = 'assistant-panel-width'
const DEFAULT_WIDTH = 400
const MIN_WIDTH = 300
const MAX_WIDTH_VW = 90
function getStoredConversationId(projectName: string): number | null {
try {
@@ -100,6 +104,49 @@ export function AssistantPanel({ projectName, isOpen, onClose }: AssistantPanelP
setConversationId(id)
}, [])
// Resizable panel width
const [panelWidth, setPanelWidth] = useState<number>(() => {
try {
const stored = localStorage.getItem(WIDTH_STORAGE_KEY)
if (stored) return Math.max(MIN_WIDTH, parseInt(stored, 10))
} catch { /* ignore */ }
return DEFAULT_WIDTH
})
const isResizing = useRef(false)
const handleMouseDown = useCallback((e: React.MouseEvent) => {
e.preventDefault()
isResizing.current = true
const startX = e.clientX
const startWidth = panelWidth
const maxWidth = window.innerWidth * (MAX_WIDTH_VW / 100)
const handleMouseMove = (e: MouseEvent) => {
if (!isResizing.current) return
const delta = startX - e.clientX
const newWidth = Math.min(maxWidth, Math.max(MIN_WIDTH, startWidth + delta))
setPanelWidth(newWidth)
}
const handleMouseUp = () => {
isResizing.current = false
document.removeEventListener('mousemove', handleMouseMove)
document.removeEventListener('mouseup', handleMouseUp)
document.body.style.cursor = ''
document.body.style.userSelect = ''
// Persist width
setPanelWidth((w) => {
localStorage.setItem(WIDTH_STORAGE_KEY, String(w))
return w
})
}
document.body.style.cursor = 'col-resize'
document.body.style.userSelect = 'none'
document.addEventListener('mousemove', handleMouseMove)
document.addEventListener('mouseup', handleMouseUp)
}, [panelWidth])
return (
<>
{/* Backdrop - click to close */}
@@ -115,17 +162,25 @@ export function AssistantPanel({ projectName, isOpen, onClose }: AssistantPanelP
<div
className={`
fixed right-0 top-0 bottom-0 z-50
-w-[400px] max-w-[90vw]
bg-card
border-l border-border
transform transition-transform duration-300 ease-out
flex flex-col shadow-xl
${isOpen ? 'translate-x-0' : 'translate-x-full'}
`}
+style={{ width: `${panelWidth}px`, maxWidth: `${MAX_WIDTH_VW}vw` }}
role="dialog"
aria-label="Project Assistant"
aria-hidden={!isOpen}
>
{/* Resize handle */}
<div
className="absolute left-0 top-0 bottom-0 w-1.5 cursor-col-resize z-10 group"
onMouseDown={handleMouseDown}
>
<div className="absolute inset-y-0 left-0 w-0.5 bg-border group-hover:bg-primary transition-colors" />
</div>
{/* Header */}
<div className="flex items-center justify-between px-4 py-3 border-b border-border bg-primary text-primary-foreground">
<div className="flex items-center gap-2">

View File

@@ -7,6 +7,8 @@
import { memo } from 'react'
import { Bot, User, Info } from 'lucide-react'
import ReactMarkdown, { type Components } from 'react-markdown'
import remarkGfm from 'remark-gfm'
import type { ChatMessage as ChatMessageType } from '../lib/types'
import { Card } from '@/components/ui/card'
@@ -14,8 +16,16 @@ interface ChatMessageProps {
message: ChatMessageType
}
-// Module-level regex to avoid recreating on each render
-const BOLD_REGEX = /\*\*(.*?)\*\*/g
+// Stable references for memo — avoids re-renders
+const remarkPlugins = [remarkGfm]
+const markdownComponents: Components = {
+a: ({ children, href, ...props }) => (
+<a href={href} target="_blank" rel="noopener noreferrer" {...props}>
+{children}
+</a>
+),
+}
export const ChatMessage = memo(function ChatMessage({ message }: ChatMessageProps) {
const { role, content, attachments, timestamp, isStreaming } = message
@@ -86,39 +96,11 @@ export const ChatMessage = memo(function ChatMessage({ message }: ChatMessagePro
)}
<Card className={`${config.bgColor} px-4 py-3 border ${isStreaming ? 'animate-pulse' : ''}`}>
{/* Parse content for basic markdown-like formatting */}
{content && (
-<div className={`whitespace-pre-wrap text-sm leading-relaxed ${config.textColor}`}>
-{content.split('\n').map((line, i) => {
-// Bold text - use module-level regex, reset lastIndex for each line
-BOLD_REGEX.lastIndex = 0
-const parts = []
-let lastIndex = 0
-let match
-while ((match = BOLD_REGEX.exec(line)) !== null) {
-if (match.index > lastIndex) {
-parts.push(line.slice(lastIndex, match.index))
-}
-parts.push(
-<strong key={`bold-${i}-${match.index}`} className="font-bold">
-{match[1]}
-</strong>
-)
-lastIndex = match.index + match[0].length
-}
-if (lastIndex < line.length) {
-parts.push(line.slice(lastIndex))
-}
-return (
-<span key={i}>
-{parts.length > 0 ? parts : line}
-{i < content.split('\n').length - 1 && '\n'}
-</span>
-)
-})}
+<div className={`text-sm leading-relaxed ${config.textColor} chat-prose${role === 'user' ? ' chat-prose-user' : ''}`}>
+<ReactMarkdown remarkPlugins={remarkPlugins} components={markdownComponents}>
+{content}
+</ReactMarkdown>
</div>
)}

View File

@@ -0,0 +1,182 @@
import { useState, useEffect } from 'react'
import { Loader2, RotateCcw, Terminal } from 'lucide-react'
import { useQueryClient } from '@tanstack/react-query'
import {
Dialog,
DialogContent,
DialogDescription,
DialogFooter,
DialogHeader,
DialogTitle,
} from '@/components/ui/dialog'
import { Button } from '@/components/ui/button'
import { Input } from '@/components/ui/input'
import { Label } from '@/components/ui/label'
import { useDevServerConfig, useUpdateDevServerConfig } from '@/hooks/useProjects'
import { startDevServer } from '@/lib/api'
interface DevServerConfigDialogProps {
projectName: string
isOpen: boolean
onClose: () => void
autoStartOnSave?: boolean
}
export function DevServerConfigDialog({
projectName,
isOpen,
onClose,
autoStartOnSave = false,
}: DevServerConfigDialogProps) {
const { data: config } = useDevServerConfig(isOpen ? projectName : null)
const updateConfig = useUpdateDevServerConfig(projectName)
const queryClient = useQueryClient()
const [command, setCommand] = useState('')
const [error, setError] = useState<string | null>(null)
const [isSaving, setIsSaving] = useState(false)
// Sync input with config when dialog opens or config loads
useEffect(() => {
if (isOpen && config) {
setCommand(config.custom_command ?? config.effective_command ?? '')
setError(null)
}
}, [isOpen, config])
const hasCustomCommand = !!config?.custom_command
const handleSaveAndStart = async () => {
const trimmed = command.trim()
if (!trimmed) {
setError('Please enter a dev server command.')
return
}
setIsSaving(true)
setError(null)
try {
await updateConfig.mutateAsync(trimmed)
if (autoStartOnSave) {
await startDevServer(projectName)
queryClient.invalidateQueries({ queryKey: ['dev-server-status', projectName] })
}
onClose()
} catch (err) {
setError(err instanceof Error ? err.message : 'Failed to save configuration')
} finally {
setIsSaving(false)
}
}
const handleClear = async () => {
setIsSaving(true)
setError(null)
try {
await updateConfig.mutateAsync(null)
setCommand(config?.detected_command ?? '')
} catch (err) {
setError(err instanceof Error ? err.message : 'Failed to clear configuration')
} finally {
setIsSaving(false)
}
}
return (
<Dialog open={isOpen} onOpenChange={(open) => !open && onClose()}>
<DialogContent className="sm:max-w-lg">
<DialogHeader>
<div className="flex items-center gap-3">
<div className="p-2 rounded-lg bg-primary/10 text-primary">
<Terminal size={20} />
</div>
<DialogTitle>Dev Server Configuration</DialogTitle>
</div>
</DialogHeader>
<DialogDescription asChild>
<div className="space-y-4">
{/* Detection info */}
<div className="rounded-lg border-2 border-border bg-muted/50 p-3 text-sm">
{config?.detected_type ? (
<p>
Detected project type: <strong className="text-foreground">{config.detected_type}</strong>
{config.detected_command && (
<span className="text-muted-foreground"> {config.detected_command}</span>
)}
</p>
) : (
<p className="text-muted-foreground">
No project type detected. Enter a custom command below.
</p>
)}
</div>
{/* Command input */}
<div className="space-y-2">
<Label htmlFor="dev-command" className="text-foreground">Dev server command</Label>
<Input
id="dev-command"
value={command}
onChange={(e) => {
setCommand(e.target.value)
setError(null)
}}
placeholder="npm run dev"
onKeyDown={(e) => {
if (e.key === 'Enter' && !isSaving) {
handleSaveAndStart()
}
}}
/>
<p className="text-xs text-muted-foreground">
Allowed runners: npm, npx, pnpm, yarn, python, uvicorn, flask, poetry, cargo, go
</p>
</div>
{/* Clear custom command button */}
{hasCustomCommand && (
<Button
variant="outline"
size="sm"
onClick={handleClear}
disabled={isSaving}
className="gap-1.5"
>
<RotateCcw size={14} />
Clear custom command (use auto-detection)
</Button>
)}
{/* Error display */}
{error && (
<p className="text-sm font-mono text-destructive">{error}</p>
)}
</div>
</DialogDescription>
<DialogFooter className="gap-2 sm:gap-0">
<Button variant="outline" onClick={onClose} disabled={isSaving}>
Cancel
</Button>
<Button onClick={handleSaveAndStart} disabled={isSaving}>
{isSaving ? (
<>
<Loader2 size={16} className="animate-spin mr-1.5" />
Saving...
</>
) : autoStartOnSave ? (
'Save & Start'
) : (
'Save'
)}
</Button>
</DialogFooter>
</DialogContent>
</Dialog>
)
}

View File

@@ -1,8 +1,10 @@
-import { Globe, Square, Loader2, ExternalLink, AlertTriangle } from 'lucide-react'
+import { useState } from 'react'
+import { Globe, Square, Loader2, ExternalLink, AlertTriangle, Settings2 } from 'lucide-react'
import { useMutation, useQueryClient } from '@tanstack/react-query'
import type { DevServerStatus } from '../lib/types'
import { startDevServer, stopDevServer } from '../lib/api'
import { Button } from '@/components/ui/button'
import { DevServerConfigDialog } from './DevServerConfigDialog'
// Re-export DevServerStatus from lib/types for consumers that import from here
export type { DevServerStatus }
@@ -59,17 +61,27 @@ interface DevServerControlProps {
* - Shows loading state during operations
* - Displays clickable URL when server is running
* - Uses neobrutalism design with cyan accent when running
* - Config dialog for setting custom dev commands
*/
export function DevServerControl({ projectName, status, url }: DevServerControlProps) {
const startDevServerMutation = useStartDevServer(projectName)
const stopDevServerMutation = useStopDevServer(projectName)
const [showConfigDialog, setShowConfigDialog] = useState(false)
const [autoStartOnSave, setAutoStartOnSave] = useState(false)
const isLoading = startDevServerMutation.isPending || stopDevServerMutation.isPending
const handleStart = () => {
// Clear any previous errors before starting
stopDevServerMutation.reset()
startDevServerMutation.mutate()
startDevServerMutation.mutate(undefined, {
onError: (err) => {
if (err.message?.includes('No dev command available')) {
setAutoStartOnSave(true)
setShowConfigDialog(true)
}
},
})
}
const handleStop = () => {
// Clear any previous errors before stopping
@@ -77,6 +89,19 @@ export function DevServerControl({ projectName, status, url }: DevServerControlP
stopDevServerMutation.mutate()
}
const handleOpenConfig = () => {
setAutoStartOnSave(false)
setShowConfigDialog(true)
}
const handleCloseConfig = () => {
setShowConfigDialog(false)
// Clear the start error if config dialog was opened reactively
if (startDevServerMutation.error?.message?.includes('No dev command available')) {
startDevServerMutation.reset()
}
}
// Server is stopped when status is 'stopped' or 'crashed' (can restart)
const isStopped = status === 'stopped' || status === 'crashed'
// Server is in a running state
@@ -84,25 +109,40 @@ export function DevServerControl({ projectName, status, url }: DevServerControlP
// Server has crashed
const isCrashed = status === 'crashed'
// Hide inline error when config dialog is handling it
const startError = startDevServerMutation.error
const showInlineError = startError && !startError.message?.includes('No dev command available')
return (
<div className="flex items-center gap-2">
{isStopped ? (
<Button
onClick={handleStart}
disabled={isLoading}
variant={isCrashed ? "destructive" : "outline"}
size="sm"
title={isCrashed ? "Dev Server Crashed - Click to Restart" : "Start Dev Server"}
aria-label={isCrashed ? "Restart Dev Server (crashed)" : "Start Dev Server"}
>
{isLoading ? (
<Loader2 size={18} className="animate-spin" />
) : isCrashed ? (
<AlertTriangle size={18} />
) : (
<Globe size={18} />
)}
</Button>
<>
<Button
onClick={handleStart}
disabled={isLoading}
variant={isCrashed ? "destructive" : "outline"}
size="sm"
title={isCrashed ? "Dev Server Crashed - Click to Restart" : "Start Dev Server"}
aria-label={isCrashed ? "Restart Dev Server (crashed)" : "Start Dev Server"}
>
{isLoading ? (
<Loader2 size={18} className="animate-spin" />
) : isCrashed ? (
<AlertTriangle size={18} />
) : (
<Globe size={18} />
)}
</Button>
<Button
onClick={handleOpenConfig}
variant="ghost"
size="sm"
title="Configure Dev Server"
aria-label="Configure Dev Server"
>
<Settings2 size={16} />
</Button>
</>
) : (
<Button
onClick={handleStop}
@@ -139,12 +179,20 @@ export function DevServerControl({ projectName, status, url }: DevServerControlP
</Button>
)}
{/* Error display */}
{(startDevServerMutation.error || stopDevServerMutation.error) && (
{/* Error display (hide "no dev command" error when config dialog handles it) */}
{(showInlineError || stopDevServerMutation.error) && (
<span className="text-xs font-mono text-destructive ml-2">
{String((startDevServerMutation.error || stopDevServerMutation.error)?.message || 'Operation failed')}
{String((showInlineError ? startError : stopDevServerMutation.error)?.message || 'Operation failed')}
</span>
)}
{/* Dev Server Config Dialog */}
<DevServerConfigDialog
projectName={projectName}
isOpen={showConfigDialog}
onClose={handleCloseConfig}
autoStartOnSave={autoStartOnSave}
/>
</div>
)
}

View File

@@ -51,7 +51,7 @@ export function KanbanBoard({ features, onFeatureClick, onAddFeature, onExpandPr
onFeatureClick={onFeatureClick}
onAddFeature={onAddFeature}
onExpandProject={onExpandProject}
showExpandButton={hasFeatures}
showExpandButton={hasFeatures && hasSpec}
onCreateSpec={onCreateSpec}
showCreateSpec={!hasSpec && !hasFeatures}
/>

View File

@@ -19,7 +19,7 @@ const shortcuts: Shortcut[] = [
{ key: 'D', description: 'Toggle debug panel' },
{ key: 'T', description: 'Toggle terminal tab' },
{ key: 'N', description: 'Add new feature', context: 'with project' },
{ key: 'E', description: 'Expand project with AI', context: 'with features' },
{ key: 'E', description: 'Expand project with AI', context: 'with spec & features' },
{ key: 'A', description: 'Toggle AI assistant', context: 'with project' },
{ key: 'G', description: 'Toggle Kanban/Graph view', context: 'with project' },
{ key: ',', description: 'Open settings' },

View File

@@ -73,16 +73,16 @@ export function ProjectSelector({
<DropdownMenuTrigger asChild>
<Button
variant="outline"
className="min-w-[200px] justify-between"
className="min-w-[140px] sm:min-w-[200px] justify-between"
disabled={isLoading}
>
{isLoading ? (
<Loader2 size={18} className="animate-spin" />
) : selectedProject ? (
<>
<span className="flex items-center gap-2">
<FolderOpen size={18} />
{selectedProject}
<span className="flex items-center gap-2 truncate">
<FolderOpen size={18} className="shrink-0" />
<span className="truncate">{selectedProject}</span>
</span>
{selectedProjectData && selectedProjectData.stats.total > 0 && (
<Badge className="ml-2">{selectedProjectData.stats.percentage}%</Badge>

View File

@@ -1,6 +1,8 @@
import { Loader2, AlertCircle, Check, Moon, Sun } from 'lucide-react'
import { useSettings, useUpdateSettings, useAvailableModels } from '../hooks/useProjects'
import { useState } from 'react'
import { Loader2, AlertCircle, Check, Moon, Sun, Eye, EyeOff, ShieldCheck } from 'lucide-react'
import { useSettings, useUpdateSettings, useAvailableModels, useAvailableProviders } from '../hooks/useProjects'
import { useTheme, THEMES } from '../hooks/useTheme'
import type { ProviderInfo } from '../lib/types'
import {
Dialog,
DialogContent,
@@ -17,12 +19,26 @@ interface SettingsModalProps {
onClose: () => void
}
const PROVIDER_INFO_TEXT: Record<string, string> = {
claude: 'Default provider. Uses your Claude CLI credentials.',
kimi: 'Get an API key at kimi.com',
glm: 'Get an API key at open.bigmodel.cn',
ollama: 'Run models locally. Install from ollama.com',
custom: 'Connect to any OpenAI-compatible API endpoint.',
}
export function SettingsModal({ isOpen, onClose }: SettingsModalProps) {
const { data: settings, isLoading, isError, refetch } = useSettings()
const { data: modelsData } = useAvailableModels()
const { data: providersData } = useAvailableProviders()
const updateSettings = useUpdateSettings()
const { theme, setTheme, darkMode, toggleDarkMode } = useTheme()
const [showAuthToken, setShowAuthToken] = useState(false)
const [authTokenInput, setAuthTokenInput] = useState('')
const [customModelInput, setCustomModelInput] = useState('')
const [customBaseUrlInput, setCustomBaseUrlInput] = useState('')
const handleYoloToggle = () => {
if (settings && !updateSettings.isPending) {
updateSettings.mutate({ yolo_mode: !settings.yolo_mode })
@@ -31,7 +47,7 @@ export function SettingsModal({ isOpen, onClose }: SettingsModalProps) {
const handleModelChange = (modelId: string) => {
if (!updateSettings.isPending) {
updateSettings.mutate({ model: modelId })
updateSettings.mutate({ api_model: modelId })
}
}
@@ -47,12 +63,51 @@ export function SettingsModal({ isOpen, onClose }: SettingsModalProps) {
}
}
const handleProviderChange = (providerId: string) => {
if (!updateSettings.isPending) {
updateSettings.mutate({ api_provider: providerId })
// Reset local state
setAuthTokenInput('')
setShowAuthToken(false)
setCustomModelInput('')
setCustomBaseUrlInput('')
}
}
const handleSaveAuthToken = () => {
if (authTokenInput.trim() && !updateSettings.isPending) {
updateSettings.mutate({ api_auth_token: authTokenInput.trim() })
setAuthTokenInput('')
setShowAuthToken(false)
}
}
const handleSaveCustomBaseUrl = () => {
if (customBaseUrlInput.trim() && !updateSettings.isPending) {
updateSettings.mutate({ api_base_url: customBaseUrlInput.trim() })
}
}
const handleSaveCustomModel = () => {
if (customModelInput.trim() && !updateSettings.isPending) {
updateSettings.mutate({ api_model: customModelInput.trim() })
setCustomModelInput('')
}
}
const providers = providersData?.providers ?? []
const models = modelsData?.models ?? []
const isSaving = updateSettings.isPending
const currentProvider = settings?.api_provider ?? 'claude'
const currentProviderInfo: ProviderInfo | undefined = providers.find(p => p.id === currentProvider)
const isAlternativeProvider = currentProvider !== 'claude'
const showAuthField = isAlternativeProvider && currentProviderInfo?.requires_auth
const showBaseUrlField = currentProvider === 'custom'
const showCustomModelInput = currentProvider === 'custom' || currentProvider === 'ollama'
return (
<Dialog open={isOpen} onOpenChange={(open) => !open && onClose()}>
<DialogContent className="sm:max-w-sm">
<DialogContent aria-describedby={undefined} className="sm:max-w-sm max-h-[85vh] overflow-y-auto">
<DialogHeader>
<DialogTitle className="flex items-center gap-2">
Settings
@@ -159,6 +214,147 @@ export function SettingsModal({ isOpen, onClose }: SettingsModalProps) {
<hr className="border-border" />
{/* API Provider Selection */}
<div className="space-y-3">
<Label className="font-medium">API Provider</Label>
<div className="flex flex-wrap gap-1.5">
{providers.map((provider) => (
<button
key={provider.id}
onClick={() => handleProviderChange(provider.id)}
disabled={isSaving}
className={`py-1.5 px-3 text-sm font-medium rounded-md border transition-colors ${
currentProvider === provider.id
? 'bg-primary text-primary-foreground border-primary'
: 'bg-background text-foreground border-border hover:bg-muted'
} ${isSaving ? 'opacity-50 cursor-not-allowed' : ''}`}
>
{provider.name.split(' (')[0]}
</button>
))}
</div>
<p className="text-xs text-muted-foreground">
{PROVIDER_INFO_TEXT[currentProvider] ?? ''}
</p>
{/* Auth Token Field */}
{showAuthField && (
<div className="space-y-2 pt-1">
<Label className="text-sm">API Key</Label>
{settings.api_has_auth_token && !authTokenInput && (
<div className="flex items-center gap-2 text-sm text-muted-foreground">
<ShieldCheck size={14} className="text-green-500" />
<span>Configured</span>
<Button
variant="ghost"
size="sm"
className="h-auto py-0.5 px-2 text-xs"
onClick={() => setAuthTokenInput(' ')}
>
Change
</Button>
</div>
)}
{(!settings.api_has_auth_token || authTokenInput) && (
<div className="flex gap-2">
<div className="relative flex-1">
<input
type={showAuthToken ? 'text' : 'password'}
value={authTokenInput.trim()}
onChange={(e) => setAuthTokenInput(e.target.value)}
placeholder="Enter API key..."
className="w-full py-1.5 px-3 pe-9 text-sm border rounded-md bg-background"
/>
<button
type="button"
onClick={() => setShowAuthToken(!showAuthToken)}
className="absolute end-2 top-1/2 -translate-y-1/2 text-muted-foreground hover:text-foreground"
>
{showAuthToken ? <EyeOff size={14} /> : <Eye size={14} />}
</button>
</div>
<Button
size="sm"
onClick={handleSaveAuthToken}
disabled={!authTokenInput.trim() || isSaving}
>
Save
</Button>
</div>
)}
</div>
)}
{/* Custom Base URL Field */}
{showBaseUrlField && (
<div className="space-y-2 pt-1">
<Label className="text-sm">Base URL</Label>
<div className="flex gap-2">
<input
type="text"
value={customBaseUrlInput || settings.api_base_url || ''}
onChange={(e) => setCustomBaseUrlInput(e.target.value)}
placeholder="https://api.example.com/v1"
className="flex-1 py-1.5 px-3 text-sm border rounded-md bg-background"
/>
<Button
size="sm"
onClick={handleSaveCustomBaseUrl}
disabled={!customBaseUrlInput.trim() || isSaving}
>
Save
</Button>
</div>
</div>
)}
</div>
{/* Model Selection */}
<div className="space-y-2">
<Label className="font-medium">Model</Label>
{models.length > 0 && (
<div className="flex rounded-lg border overflow-hidden">
{models.map((model) => (
<button
key={model.id}
onClick={() => handleModelChange(model.id)}
disabled={isSaving}
className={`flex-1 py-2 px-3 text-sm font-medium transition-colors ${
(settings.api_model ?? settings.model) === model.id
? 'bg-primary text-primary-foreground'
: 'bg-background text-foreground hover:bg-muted'
} ${isSaving ? 'opacity-50 cursor-not-allowed' : ''}`}
>
<span className="block">{model.name}</span>
<span className="block text-xs opacity-60">{model.id}</span>
</button>
))}
</div>
)}
{/* Custom model input for Ollama/Custom */}
{showCustomModelInput && (
<div className="flex gap-2 pt-1">
<input
type="text"
value={customModelInput}
onChange={(e) => setCustomModelInput(e.target.value)}
placeholder="Custom model name..."
className="flex-1 py-1.5 px-3 text-sm border rounded-md bg-background"
onKeyDown={(e) => e.key === 'Enter' && handleSaveCustomModel()}
/>
<Button
size="sm"
onClick={handleSaveCustomModel}
disabled={!customModelInput.trim() || isSaving}
>
Set
</Button>
</div>
)}
</div>
<hr className="border-border" />
{/* YOLO Mode Toggle */}
<div className="flex items-center justify-between">
<div className="space-y-0.5">
@@ -195,27 +391,6 @@ export function SettingsModal({ isOpen, onClose }: SettingsModalProps) {
/>
</div>
{/* Model Selection */}
<div className="space-y-2">
<Label className="font-medium">Model</Label>
<div className="flex rounded-lg border overflow-hidden">
{models.map((model) => (
<button
key={model.id}
onClick={() => handleModelChange(model.id)}
disabled={isSaving}
className={`flex-1 py-2 px-3 text-sm font-medium transition-colors ${
settings.model === model.id
? 'bg-primary text-primary-foreground'
: 'bg-background text-foreground hover:bg-muted'
} ${isSaving ? 'opacity-50 cursor-not-allowed' : ''}`}
>
{model.name}
</button>
))}
</div>
</div>
{/* Regression Agents */}
<div className="space-y-2">
<Label className="font-medium">Regression Agents</Label>

View File

@@ -1,6 +1,7 @@
import { useState, useRef, useEffect } from 'react'
import { Palette, Check } from 'lucide-react'
import { Button } from '@/components/ui/button'
import { Tooltip, TooltipTrigger, TooltipContent } from '@/components/ui/tooltip'
import type { ThemeId, ThemeOption } from '../hooks/useTheme'
interface ThemeSelectorProps {
@@ -97,16 +98,20 @@ export function ThemeSelector({ themes, currentTheme, onThemeChange }: ThemeSele
onMouseEnter={handleMouseEnter}
onMouseLeave={handleMouseLeave}
>
<Button
variant="outline"
size="sm"
title="Theme"
aria-label="Select theme"
aria-expanded={isOpen}
aria-haspopup="true"
>
<Palette size={18} />
</Button>
<Tooltip>
<TooltipTrigger asChild>
<Button
variant="outline"
size="sm"
aria-label="Select theme"
aria-expanded={isOpen}
aria-haspopup="true"
>
<Palette size={18} />
</Button>
</TooltipTrigger>
<TooltipContent>Theme</TooltipContent>
</Tooltip>
{/* Dropdown */}
{isOpen && (

View File

@@ -0,0 +1,65 @@
import * as React from "react"
import * as TooltipPrimitive from "@radix-ui/react-tooltip"
import { cn } from "@/lib/utils"
function TooltipProvider({
delayDuration = 250,
...props
}: React.ComponentProps<typeof TooltipPrimitive.Provider> & {
delayDuration?: number
}) {
return (
<TooltipPrimitive.Provider
data-slot="tooltip-provider"
delayDuration={delayDuration}
{...props}
/>
)
}
function Tooltip({
...props
}: React.ComponentProps<typeof TooltipPrimitive.Root>) {
return <TooltipPrimitive.Root data-slot="tooltip" {...props} />
}
function TooltipTrigger({
...props
}: React.ComponentProps<typeof TooltipPrimitive.Trigger>) {
return <TooltipPrimitive.Trigger data-slot="tooltip-trigger" {...props} />
}
function TooltipContent({
className,
side = "bottom",
align = "center",
sideOffset = 8,
children,
...props
}: React.ComponentProps<typeof TooltipPrimitive.Content>) {
return (
<TooltipPrimitive.Portal>
<TooltipPrimitive.Content
data-slot="tooltip-content"
side={side}
align={align}
sideOffset={sideOffset}
className={cn(
"z-50 overflow-hidden rounded-md border bg-neutral-900 px-3 py-2 text-sm text-white shadow-md leading-tight min-h-7",
"data-[state=delayed-open]:animate-in data-[state=closed]:animate-out data-[state=closed]:fade-out-0 data-[state=delayed-open]:fade-in-0 data-[side=bottom]:slide-in-from-top-2 data-[side=left]:slide-in-from-right-2 data-[side=right]:slide-in-from-left-2 data-[side=top]:slide-in-from-bottom-2",
className
)}
{...props}
>
{children}
<TooltipPrimitive.Arrow
data-slot="tooltip-arrow"
className="fill-neutral-900"
/>
</TooltipPrimitive.Content>
</TooltipPrimitive.Portal>
)
}
export { Tooltip, TooltipContent, TooltipProvider, TooltipTrigger }

View File

@@ -3,7 +3,7 @@
*/
import { useState, useCallback, useRef, useEffect } from "react";
import type { ChatMessage, AssistantChatServerMessage } from "../lib/types";
import type { ChatMessage, AssistantChatServerMessage, SpecQuestion } from "../lib/types";
type ConnectionStatus = "disconnected" | "connecting" | "connected" | "error";
@@ -17,8 +17,10 @@ interface UseAssistantChatReturn {
isLoading: boolean;
connectionStatus: ConnectionStatus;
conversationId: number | null;
currentQuestions: SpecQuestion[] | null;
start: (conversationId?: number | null) => void;
sendMessage: (content: string) => void;
sendAnswer: (answers: Record<string, string | string[]>) => void;
disconnect: () => void;
clearMessages: () => void;
}
@@ -36,6 +38,7 @@ export function useAssistantChat({
const [connectionStatus, setConnectionStatus] =
useState<ConnectionStatus>("disconnected");
const [conversationId, setConversationId] = useState<number | null>(null);
const [currentQuestions, setCurrentQuestions] = useState<SpecQuestion[] | null>(null);
const wsRef = useRef<WebSocket | null>(null);
const currentAssistantMessageRef = useRef<string | null>(null);
@@ -204,6 +207,25 @@ export function useAssistantChat({
break;
}
case "question": {
// Claude is asking structured questions via ask_user tool
setCurrentQuestions(data.questions);
setIsLoading(false);
// Attach questions to the last assistant message for display context
setMessages((prev) => {
const lastMessage = prev[prev.length - 1];
if (lastMessage?.role === "assistant" && lastMessage.isStreaming) {
return [
...prev.slice(0, -1),
{ ...lastMessage, isStreaming: false, questions: data.questions },
];
}
return prev;
});
break;
}
case "conversation_created": {
setConversationId(data.conversation_id);
break;
@@ -327,6 +349,49 @@ export function useAssistantChat({
[onError],
);
const sendAnswer = useCallback(
(answers: Record<string, string | string[]>) => {
if (!wsRef.current || wsRef.current.readyState !== WebSocket.OPEN) {
onError?.("Not connected");
return;
}
// Format answers as display text for user message
const answerParts: string[] = [];
for (const [, value] of Object.entries(answers)) {
if (Array.isArray(value)) {
answerParts.push(value.join(", "));
} else {
answerParts.push(value);
}
}
const displayText = answerParts.join("; ");
// Add user message to chat
setMessages((prev) => [
...prev,
{
id: generateId(),
role: "user",
content: displayText,
timestamp: new Date(),
},
]);
setCurrentQuestions(null);
setIsLoading(true);
// Send structured answer to server
wsRef.current.send(
JSON.stringify({
type: "answer",
answers,
}),
);
},
[onError],
);
const disconnect = useCallback(() => {
reconnectAttempts.current = maxReconnectAttempts; // Prevent reconnection
if (pingIntervalRef.current) {
@@ -350,8 +415,10 @@ export function useAssistantChat({
isLoading,
connectionStatus,
conversationId,
currentQuestions,
start,
sendMessage,
sendAnswer,
disconnect,
clearMessages,
};

View File

@@ -107,16 +107,20 @@ export function useExpandChat({
}, 30000)
}
ws.onclose = () => {
ws.onclose = (event) => {
setConnectionStatus('disconnected')
if (pingIntervalRef.current) {
clearInterval(pingIntervalRef.current)
pingIntervalRef.current = null
}
// Don't retry on application-level errors (4xxx codes won't resolve on retry)
const isAppError = event.code >= 4000 && event.code <= 4999
// Attempt reconnection if not intentionally closed
if (
!manuallyDisconnectedRef.current &&
!isAppError &&
reconnectAttempts.current < maxReconnectAttempts &&
!isCompleteRef.current
) {

View File

@@ -4,7 +4,7 @@
import { useQuery, useMutation, useQueryClient } from '@tanstack/react-query'
import * as api from '../lib/api'
import type { FeatureCreate, FeatureUpdate, ModelsResponse, ProjectSettingsUpdate, Settings, SettingsUpdate } from '../lib/types'
import type { DevServerConfig, FeatureCreate, FeatureUpdate, ModelsResponse, ProjectSettingsUpdate, ProvidersResponse, Settings, SettingsUpdate } from '../lib/types'
// ============================================================================
// Projects
@@ -254,20 +254,41 @@ export function useValidatePath() {
// Default models response for placeholder (until API responds)
const DEFAULT_MODELS: ModelsResponse = {
models: [
{ id: 'claude-opus-4-5-20251101', name: 'Claude Opus 4.5' },
{ id: 'claude-sonnet-4-5-20250929', name: 'Claude Sonnet 4.5' },
{ id: 'claude-opus-4-6', name: 'Claude Opus' },
{ id: 'claude-sonnet-4-5-20250929', name: 'Claude Sonnet' },
],
default: 'claude-opus-4-5-20251101',
default: 'claude-opus-4-6',
}
const DEFAULT_SETTINGS: Settings = {
yolo_mode: false,
model: 'claude-opus-4-5-20251101',
model: 'claude-opus-4-6',
glm_mode: false,
ollama_mode: false,
testing_agent_ratio: 1,
playwright_headless: true,
batch_size: 3,
api_provider: 'claude',
api_base_url: null,
api_has_auth_token: false,
api_model: null,
}
const DEFAULT_PROVIDERS: ProvidersResponse = {
providers: [
{ id: 'claude', name: 'Claude (Anthropic)', base_url: null, models: DEFAULT_MODELS.models, default_model: 'claude-opus-4-6', requires_auth: false },
],
current: 'claude',
}
export function useAvailableProviders() {
return useQuery({
queryKey: ['available-providers'],
queryFn: api.getAvailableProviders,
staleTime: 300000,
retry: 1,
placeholderData: DEFAULT_PROVIDERS,
})
}
export function useAvailableModels() {
@@ -319,6 +340,41 @@ export function useUpdateSettings() {
},
onSettled: () => {
queryClient.invalidateQueries({ queryKey: ['settings'] })
queryClient.invalidateQueries({ queryKey: ['available-models'] })
queryClient.invalidateQueries({ queryKey: ['available-providers'] })
},
})
}
// ============================================================================
// Dev Server Config
// ============================================================================
// Default config for placeholder (until API responds)
const DEFAULT_DEV_SERVER_CONFIG: DevServerConfig = {
detected_type: null,
detected_command: null,
custom_command: null,
effective_command: null,
}
export function useDevServerConfig(projectName: string | null) {
return useQuery({
queryKey: ['dev-server-config', projectName],
queryFn: () => api.getDevServerConfig(projectName!),
enabled: !!projectName,
staleTime: 30_000,
placeholderData: DEFAULT_DEV_SERVER_CONFIG,
})
}
export function useUpdateDevServerConfig(projectName: string) {
const queryClient = useQueryClient()
return useMutation({
mutationFn: (customCommand: string | null) =>
api.updateDevServerConfig(projectName, customCommand),
onSuccess: () => {
queryClient.invalidateQueries({ queryKey: ['dev-server-config', projectName] })
},
})
}

View File

@@ -157,15 +157,18 @@ export function useSpecChat({
}, 30000)
}
ws.onclose = () => {
ws.onclose = (event) => {
setConnectionStatus('disconnected')
if (pingIntervalRef.current) {
clearInterval(pingIntervalRef.current)
pingIntervalRef.current = null
}
// Don't retry on application-level errors (4xxx codes won't resolve on retry)
const isAppError = event.code >= 4000 && event.code <= 4999
// Attempt reconnection if not intentionally closed
if (reconnectAttempts.current < maxReconnectAttempts && !isCompleteRef.current) {
if (!isAppError && reconnectAttempts.current < maxReconnectAttempts && !isCompleteRef.current) {
reconnectAttempts.current++
const delay = Math.min(1000 * Math.pow(2, reconnectAttempts.current), 10000)
reconnectTimeoutRef.current = window.setTimeout(connect, delay)

View File

@@ -335,10 +335,14 @@ export function useProjectWebSocket(projectName: string | null) {
}
}
ws.onclose = () => {
ws.onclose = (event) => {
setState(prev => ({ ...prev, isConnected: false }))
wsRef.current = null
// Don't retry on application-level errors (4xxx codes won't resolve on retry)
const isAppError = event.code >= 4000 && event.code <= 4999
if (isAppError) return
// Exponential backoff reconnection
const delay = Math.min(1000 * Math.pow(2, reconnectAttempts.current), 30000)
reconnectAttempts.current++
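
The reconnection policy repeated across the three WebSocket hooks above (skip application-level close codes, exponential backoff with a cap) can be factored into two pure helpers. This is a minimal sketch, not code from the commit; the helper names are hypothetical, and the cap varies per hook (10s in the chat hooks, 30s in useProjectWebSocket), so it is passed as a parameter here.

```typescript
// Decide whether a closed WebSocket should be retried.
// Codes 4000-4999 are application-defined errors (RFC 6455) that
// won't resolve on retry, so they are excluded.
function shouldReconnect(
  closeCode: number,
  attempt: number,
  maxAttempts: number,
): boolean {
  const isAppError = closeCode >= 4000 && closeCode <= 4999
  return !isAppError && attempt < maxAttempts
}

// Exponential backoff delay in ms, capped: 2s, 4s, 8s, ... up to capMs.
// Mirrors Math.min(1000 * Math.pow(2, attempt), cap) in the hooks above.
function reconnectDelayMs(attempt: number, capMs = 10_000): number {
  return Math.min(1000 * Math.pow(2, attempt), capMs)
}
```

A hook's `onclose` handler would then reduce to `if (shouldReconnect(event.code, attempts, max)) setTimeout(connect, reconnectDelayMs(++attempts))`, keeping the retry policy identical across all three hooks.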

View File

@@ -24,6 +24,7 @@ import type {
Settings,
SettingsUpdate,
ModelsResponse,
ProvidersResponse,
DevServerStatusResponse,
DevServerConfig,
TerminalInfo,
@@ -399,6 +400,10 @@ export async function getAvailableModels(): Promise<ModelsResponse> {
return fetchJSON('/settings/models')
}
export async function getAvailableProviders(): Promise<ProvidersResponse> {
return fetchJSON('/settings/providers')
}
export async function getSettings(): Promise<Settings> {
return fetchJSON('/settings')
}
@@ -440,6 +445,16 @@ export async function getDevServerConfig(projectName: string): Promise<DevServer
return fetchJSON(`/projects/${encodeURIComponent(projectName)}/devserver/config`)
}
export async function updateDevServerConfig(
projectName: string,
customCommand: string | null
): Promise<DevServerConfig> {
return fetchJSON(`/projects/${encodeURIComponent(projectName)}/devserver/config`, {
method: 'PATCH',
body: JSON.stringify({ custom_command: customCommand }),
})
}
// ============================================================================
// Terminal API
// ============================================================================

View File

@@ -465,6 +465,11 @@ export interface AssistantChatConversationCreatedMessage {
conversation_id: number
}
export interface AssistantChatQuestionMessage {
type: 'question'
questions: SpecQuestion[]
}
export interface AssistantChatPongMessage {
type: 'pong'
}
@@ -472,6 +477,7 @@ export interface AssistantChatPongMessage {
export type AssistantChatServerMessage =
| AssistantChatTextMessage
| AssistantChatToolCallMessage
| AssistantChatQuestionMessage
| AssistantChatResponseDoneMessage
| AssistantChatErrorMessage
| AssistantChatConversationCreatedMessage
@@ -525,6 +531,20 @@ export interface ModelsResponse {
default: string
}
export interface ProviderInfo {
id: string
name: string
base_url: string | null
models: ModelInfo[]
default_model: string
requires_auth: boolean
}
export interface ProvidersResponse {
providers: ProviderInfo[]
current: string
}
export interface Settings {
yolo_mode: boolean
model: string
@@ -533,6 +553,10 @@ export interface Settings {
testing_agent_ratio: number // Regression testing agents (0-3)
playwright_headless: boolean
batch_size: number // Features per coding agent batch (1-3)
api_provider: string
api_base_url: string | null
api_has_auth_token: boolean
api_model: string | null
}
export interface SettingsUpdate {
@@ -541,6 +565,10 @@ export interface SettingsUpdate {
testing_agent_ratio?: number
playwright_headless?: boolean
batch_size?: number
api_provider?: string
api_base_url?: string
api_auth_token?: string
api_model?: string
}
export interface ProjectSettingsUpdate {

View File

@@ -1271,6 +1271,186 @@
margin: 2rem 0;
}
/* ============================================================================
Chat Prose Typography (for markdown in chat bubbles)
============================================================================ */
.chat-prose {
line-height: 1.6;
color: inherit;
}
.chat-prose > :first-child {
margin-top: 0;
}
.chat-prose > :last-child {
margin-bottom: 0;
}
.chat-prose h1 {
font-size: 1.25rem;
font-weight: 700;
margin-top: 1.25rem;
margin-bottom: 0.5rem;
}
.chat-prose h2 {
font-size: 1.125rem;
font-weight: 700;
margin-top: 1rem;
margin-bottom: 0.5rem;
}
.chat-prose h3 {
font-size: 1rem;
font-weight: 600;
margin-top: 0.75rem;
margin-bottom: 0.375rem;
}
.chat-prose h4,
.chat-prose h5,
.chat-prose h6 {
font-size: 0.875rem;
font-weight: 600;
margin-top: 0.75rem;
margin-bottom: 0.25rem;
}
.chat-prose p {
margin-bottom: 0.5rem;
}
.chat-prose ul,
.chat-prose ol {
margin-bottom: 0.5rem;
padding-left: 1.25rem;
}
.chat-prose ul {
list-style-type: disc;
}
.chat-prose ol {
list-style-type: decimal;
}
.chat-prose li {
margin-bottom: 0.25rem;
}
.chat-prose li > ul,
.chat-prose li > ol {
margin-top: 0.25rem;
margin-bottom: 0;
}
.chat-prose pre {
background: var(--muted);
border: 1px solid var(--border);
border-radius: var(--radius);
padding: 0.75rem;
overflow-x: auto;
margin-bottom: 0.5rem;
font-family: var(--font-mono);
font-size: 0.75rem;
line-height: 1.5;
}
.chat-prose code:not(pre code) {
background: var(--muted);
padding: 0.1rem 0.3rem;
border-radius: 0.25rem;
font-family: var(--font-mono);
font-size: 0.75rem;
}
.chat-prose table {
width: 100%;
border-collapse: collapse;
margin-bottom: 0.5rem;
font-size: 0.8125rem;
}
.chat-prose th {
background: var(--muted);
font-weight: 600;
text-align: left;
padding: 0.375rem 0.5rem;
border: 1px solid var(--border);
}
.chat-prose td {
padding: 0.375rem 0.5rem;
border: 1px solid var(--border);
}
.chat-prose blockquote {
border-left: 3px solid var(--primary);
padding-left: 0.75rem;
margin-bottom: 0.5rem;
font-style: italic;
opacity: 0.9;
}
.chat-prose a {
color: var(--primary);
text-decoration: underline;
text-underline-offset: 2px;
}
.chat-prose a:hover {
opacity: 0.8;
}
.chat-prose strong {
font-weight: 700;
}
.chat-prose hr {
border: none;
border-top: 1px solid var(--border);
margin: 0.75rem 0;
}
.chat-prose img {
max-width: 100%;
border-radius: var(--radius);
}
/* User message overrides - need contrast against primary-colored bubble */
.chat-prose-user pre {
background: rgb(255 255 255 / 0.15);
border-color: rgb(255 255 255 / 0.2);
}
.chat-prose-user code:not(pre code) {
background: rgb(255 255 255 / 0.15);
}
.chat-prose-user th {
background: rgb(255 255 255 / 0.15);
}
.chat-prose-user th,
.chat-prose-user td {
border-color: rgb(255 255 255 / 0.2);
}
.chat-prose-user blockquote {
border-left-color: rgb(255 255 255 / 0.5);
}
.chat-prose-user a {
color: inherit;
text-decoration: underline;
}
.chat-prose-user hr {
border-top-color: rgb(255 255 255 / 0.2);
}
/* ============================================================================
Scrollbar Styling
============================================================================ */

View File

@@ -36,6 +36,8 @@ export default defineConfig({
'@radix-ui/react-slot',
'@radix-ui/react-switch',
],
// Markdown rendering
'vendor-markdown': ['react-markdown', 'remark-gfm'],
// Icons and utilities
'vendor-utils': [
'lucide-react',