fix: address PR #184 review findings for blocked-for-human-input feature

A) Graph view: add needs_human_input bucket to handleGraphNodeClick so clicking blocked nodes opens the feature modal
B) MCP validation: validate field type enum, require options for select, enforce unique non-empty field IDs and labels
C) Progress fallback: include needs_human_input in non-WebSocket total
D) WebSocket: track needs_human_input count in progress state
E) Cleanup guard: remove unnecessary needs_human_input check in _cleanup_stale_features (resolved via merge conflict)
F) Defensive SQL: require in_progress=1 in feature_request_human_input

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Merge remote-tracking branch 'origin/master' into feature/blocked-for-human-input
2026-03-18 19:33:09 +00:00 · 2026-02-12 07:36:48 +02:00 · 2026-02-12 07:36:11 +02:00 · 2026-02-12 07:28:37 +02:00 · 2026-02-12 07:22:01 +02:00 · 2026-02-11 18:48:44 +02:00
90 changed files with 6412 additions and 695 deletions
--- a/.claude/commands/review-pr.md
+++ b/.claude/commands/review-pr.md
@@ -72,4 +72,21 @@ Pull request(s): $ARGUMENTS
     - What this PR is actually about (one sentence)
     - The key concerns, if any (or "no significant concerns")
     - **Verdict: MERGE** / **MERGE (with minor follow-up)** / **DON'T MERGE** with a one-line reason
-   - This section should be scannable in under 10 seconds
+   - This section should be scannable in under 10 seconds
+
+10. **Post-Review Action**
+    - Immediately after the TLDR, provide a `## Recommended Action` section
+    - Based on the verdict, recommend one of the following actions:
+
+    **If verdict is MERGE (no concerns):**
+    - Recommend merging as-is. No further action needed.
+
+    **If verdict is MERGE (with minor follow-up):**
+    - If the concerns are low-risk and straightforward to fix (e.g., naming tweaks, small refactors, missing type annotations, minor style issues, trivial bug fixes), recommend merging the PR now and offer to immediately address the concerns in a follow-up commit directly on the target branch
+    - List the specific changes you would make in the follow-up
+    - Ask the user: *"Should I merge this PR and push a follow-up commit addressing these concerns?"*
+
+    **If verdict is DON'T MERGE:**
+    - If the blocking concerns are still relatively contained and you are confident you can resolve them quickly (e.g., a small bug fix, a missing validation, a straightforward architectural adjustment), recommend merging the PR and immediately addressing the issues in a follow-up commit — but only if the fixes are low-risk and well-understood
+    - If the issues are too complex, risky, or require author input (e.g., design decisions, major refactors, unclear intent), recommend sending the PR back to the author with specific feedback on what needs to change
+    - Be honest about your confidence level — if you're unsure whether you can address the concerns correctly, say so and defer to the author
--- a/.claude/skills/playwright-cli/SKILL.md
+++ b/.claude/skills/playwright-cli/SKILL.md
@@ -0,0 +1,259 @@
+---
+name: playwright-cli
+description: Automates browser interactions for web testing, form filling, screenshots, and data extraction. Use when the user needs to navigate websites, interact with web pages, fill forms, take screenshots, test web applications, or extract information from web pages.
+allowed-tools: Bash(playwright-cli:*)
+---
+
+# Browser Automation with playwright-cli
+
+## Quick start
+
+```bash
+# open new browser
+playwright-cli open
+# navigate to a page
+playwright-cli goto https://playwright.dev
+# interact with the page using refs from the snapshot
+playwright-cli click e15
+playwright-cli type "page.click"
+playwright-cli press Enter
+# take a screenshot
+playwright-cli screenshot
+# close the browser
+playwright-cli close
+```
+
+## Commands
+
+### Core
+
+```bash
+playwright-cli open
+# open and navigate right away
+playwright-cli open https://example.com/
+playwright-cli goto https://playwright.dev
+playwright-cli type "search query"
+playwright-cli click e3
+playwright-cli dblclick e7
+playwright-cli fill e5 "user@example.com"
+playwright-cli drag e2 e8
+playwright-cli hover e4
+playwright-cli select e9 "option-value"
+playwright-cli upload ./document.pdf
+playwright-cli check e12
+playwright-cli uncheck e12
+playwright-cli snapshot
+playwright-cli snapshot --filename=after-click.yaml
+playwright-cli eval "document.title"
+playwright-cli eval "el => el.textContent" e5
+playwright-cli dialog-accept
+playwright-cli dialog-accept "confirmation text"
+playwright-cli dialog-dismiss
+playwright-cli resize 1920 1080
+playwright-cli close
+```
+
+### Navigation
+
+```bash
+playwright-cli go-back
+playwright-cli go-forward
+playwright-cli reload
+```
+
+### Keyboard
+
+```bash
+playwright-cli press Enter
+playwright-cli press ArrowDown
+playwright-cli keydown Shift
+playwright-cli keyup Shift
+```
+
+### Mouse
+
+```bash
+playwright-cli mousemove 150 300
+playwright-cli mousedown
+playwright-cli mousedown right
+playwright-cli mouseup
+playwright-cli mouseup right
+playwright-cli mousewheel 0 100
+```
+
+### Save as
+
+```bash
+playwright-cli screenshot
+playwright-cli screenshot e5
+playwright-cli screenshot --filename=page.png
+playwright-cli pdf --filename=page.pdf
+```
+
+### Tabs
+
+```bash
+playwright-cli tab-list
+playwright-cli tab-new
+playwright-cli tab-new https://example.com/page
+playwright-cli tab-close
+playwright-cli tab-close 2
+playwright-cli tab-select 0
+```
+
+### Storage
+
+```bash
+playwright-cli state-save
+playwright-cli state-save auth.json
+playwright-cli state-load auth.json
+
+# Cookies
+playwright-cli cookie-list
+playwright-cli cookie-list --domain=example.com
+playwright-cli cookie-get session_id
+playwright-cli cookie-set session_id abc123
+playwright-cli cookie-set session_id abc123 --domain=example.com --httpOnly --secure
+playwright-cli cookie-delete session_id
+playwright-cli cookie-clear
+
+# LocalStorage
+playwright-cli localstorage-list
+playwright-cli localstorage-get theme
+playwright-cli localstorage-set theme dark
+playwright-cli localstorage-delete theme
+playwright-cli localstorage-clear
+
+# SessionStorage
+playwright-cli sessionstorage-list
+playwright-cli sessionstorage-get step
+playwright-cli sessionstorage-set step 3
+playwright-cli sessionstorage-delete step
+playwright-cli sessionstorage-clear
+```
+
+### Network
+
+```bash
+playwright-cli route "**/*.jpg" --status=404
+playwright-cli route "https://api.example.com/**" --body='{"mock": true}'
+playwright-cli route-list
+playwright-cli unroute "**/*.jpg"
+playwright-cli unroute
+```
+
+### DevTools
+
+```bash
+playwright-cli console
+playwright-cli console warning
+playwright-cli network
+playwright-cli run-code "async page => await page.context().grantPermissions(['geolocation'])"
+playwright-cli tracing-start
+playwright-cli tracing-stop
+playwright-cli video-start
+playwright-cli video-stop video.webm
+```
+
+### Install
+
+```bash
+playwright-cli install --skills
+playwright-cli install-browser
+```
+
+### Configuration
+```bash
+# Use specific browser when creating session
+playwright-cli open --browser=chrome
+playwright-cli open --browser=firefox
+playwright-cli open --browser=webkit
+playwright-cli open --browser=msedge
+# Connect to browser via extension
+playwright-cli open --extension
+
+# Use persistent profile (by default profile is in-memory)
+playwright-cli open --persistent
+# Use persistent profile with custom directory
+playwright-cli open --profile=/path/to/profile
+
+# Start with config file
+playwright-cli open --config=my-config.json
+
+# Close the browser
+playwright-cli close
+# Delete user data for the default session
+playwright-cli delete-data
+```
+
+### Browser Sessions
+
+```bash
+# create new browser session named "mysession" with persistent profile
+playwright-cli -s=mysession open example.com --persistent
+# same with manually specified profile directory (use when requested explicitly)
+playwright-cli -s=mysession open example.com --profile=/path/to/profile
+playwright-cli -s=mysession click e6
+playwright-cli -s=mysession close  # stop a named browser
+playwright-cli -s=mysession delete-data  # delete user data for persistent session
+
+playwright-cli list
+# Close all browsers
+playwright-cli close-all
+# Forcefully kill all browser processes
+playwright-cli kill-all
+```
+
+## Example: Form submission
+
+```bash
+playwright-cli open https://example.com/form
+playwright-cli snapshot
+
+playwright-cli fill e1 "user@example.com"
+playwright-cli fill e2 "password123"
+playwright-cli click e3
+playwright-cli snapshot
+playwright-cli close
+```
+
+## Example: Multi-tab workflow
+
+```bash
+playwright-cli open https://example.com
+playwright-cli tab-new https://example.com/other
+playwright-cli tab-list
+playwright-cli tab-select 0
+playwright-cli snapshot
+playwright-cli close
+```
+
+## Example: Debugging with DevTools
+
+```bash
+playwright-cli open https://example.com
+playwright-cli click e4
+playwright-cli fill e7 "test"
+playwright-cli console
+playwright-cli network
+playwright-cli close
+```
+
+```bash
+playwright-cli open https://example.com
+playwright-cli tracing-start
+playwright-cli click e4
+playwright-cli fill e7 "test"
+playwright-cli tracing-stop
+playwright-cli close
+```
+
+## Specific tasks
+
+* **Request mocking** [references/request-mocking.md](references/request-mocking.md)
+* **Running Playwright code** [references/running-code.md](references/running-code.md)
+* **Browser session management** [references/session-management.md](references/session-management.md)
+* **Storage state (cookies, localStorage)** [references/storage-state.md](references/storage-state.md)
+* **Test generation** [references/test-generation.md](references/test-generation.md)
+* **Tracing** [references/tracing.md](references/tracing.md)
+* **Video recording** [references/video-recording.md](references/video-recording.md)
--- a/.claude/skills/playwright-cli/references/request-mocking.md
+++ b/.claude/skills/playwright-cli/references/request-mocking.md
@@ -0,0 +1,87 @@
+# Request Mocking
+
+Intercept, mock, modify, and block network requests.
+
+## CLI Route Commands
+
+```bash
+# Mock with custom status
+playwright-cli route "**/*.jpg" --status=404
+
+# Mock with JSON body
+playwright-cli route "**/api/users" --body='[{"id":1,"name":"Alice"}]' --content-type=application/json
+
+# Mock with custom headers
+playwright-cli route "**/api/data" --body='{"ok":true}' --header="X-Custom: value"
+
+# Remove headers from requests
+playwright-cli route "**/*" --remove-header=cookie,authorization
+
+# List active routes
+playwright-cli route-list
+
+# Remove a route or all routes
+playwright-cli unroute "**/*.jpg"
+playwright-cli unroute
+```
+
+## URL Patterns
+
+```
+**/api/users           - Exact path match
+**/api/*/details       - Wildcard in path
+**/*.{png,jpg,jpeg}    - Match file extensions
+**/search?q=*          - Match query parameters
+```
+
+## Advanced Mocking with run-code
+
+For conditional responses, request body inspection, response modification, or delays:
+
+### Conditional Response Based on Request
+
+```bash
+playwright-cli run-code "async page => {
+  await page.route('**/api/login', route => {
+    const body = route.request().postDataJSON();
+    if (body.username === 'admin') {
+      route.fulfill({ body: JSON.stringify({ token: 'mock-token' }) });
+    } else {
+      route.fulfill({ status: 401, body: JSON.stringify({ error: 'Invalid' }) });
+    }
+  });
+}"
+```
+
+### Modify Real Response
+
+```bash
+playwright-cli run-code "async page => {
+  await page.route('**/api/user', async route => {
+    const response = await route.fetch();
+    const json = await response.json();
+    json.isPremium = true;
+    await route.fulfill({ response, json });
+  });
+}"
+```
+
+### Simulate Network Failures
+
+```bash
+playwright-cli run-code "async page => {
+  await page.route('**/api/offline', route => route.abort('internetdisconnected'));
+}"
+# Options: connectionrefused, timedout, connectionreset, internetdisconnected
+```
+
+### Delayed Response
+
+```bash
+playwright-cli run-code "async page => {
+  await page.route('**/api/slow', async route => {
+    await new Promise(r => setTimeout(r, 3000));
+    await route.fulfill({ body: JSON.stringify({ data: 'loaded' }) });
+  });
+}"
+```
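Note on `references/request-mocking.md`: a route that stays registered can affect later steps in the same session, so it may help to pair each `route` with its `unroute`. The sketch below is one possible way to do that. It assumes only the `route`, `--body`, `--content-type`, and `unroute` usage shown in the file above; `with_json_mock` is a hypothetical helper name, not part of playwright-cli.

```bash
# Hypothetical helper (not part of playwright-cli).
# Usage: with_json_mock PATTERN JSON_BODY COMMAND [ARGS...]
# Installs a JSON mock for PATTERN, runs COMMAND, then removes the mock
# even if COMMAND fails, and returns COMMAND's exit status.
with_json_mock() {
  pattern=$1
  body=$2
  shift 2
  # Give up early if the mock could not be installed.
  playwright-cli route "$pattern" --body="$body" --content-type=application/json || return 1
  "$@"
  status=$?
  # Always remove the route so it does not leak into later steps.
  playwright-cli unroute "$pattern"
  return "$status"
}
```

For example, `with_json_mock "**/api/users" '[{"id":1,"name":"Alice"}]' playwright-cli reload` would reload the page with the mock in place and then drop it.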
--- a/.claude/skills/playwright-cli/references/running-code.md
+++ b/.claude/skills/playwright-cli/references/running-code.md
@@ -0,0 +1,232 @@
+# Running Custom Playwright Code
+
+Use `run-code` to execute arbitrary Playwright code for advanced scenarios not covered by CLI commands.
+
+## Syntax
+
+```bash
+playwright-cli run-code "async page => {
+  // Your Playwright code here
+  // Access page.context() for browser context operations
+}"
+```
+
+## Geolocation
+
+```bash
+# Grant geolocation permission and set location
+playwright-cli run-code "async page => {
+  await page.context().grantPermissions(['geolocation']);
+  await page.context().setGeolocation({ latitude: 37.7749, longitude: -122.4194 });
+}"
+
+# Set location to London
+playwright-cli run-code "async page => {
+  await page.context().grantPermissions(['geolocation']);
+  await page.context().setGeolocation({ latitude: 51.5074, longitude: -0.1278 });
+}"
+
+# Clear geolocation override
+playwright-cli run-code "async page => {
+  await page.context().clearPermissions();
+}"
+```
+
+## Permissions
+
+```bash
+# Grant multiple permissions
+playwright-cli run-code "async page => {
+  await page.context().grantPermissions([
+    'geolocation',
+    'notifications',
+    'camera',
+    'microphone'
+  ]);
+}"
+
+# Grant permissions for specific origin
+playwright-cli run-code "async page => {
+  await page.context().grantPermissions(['clipboard-read'], {
+    origin: 'https://example.com'
+  });
+}"
+```
+
+## Media Emulation
+
+```bash
+# Emulate dark color scheme
+playwright-cli run-code "async page => {
+  await page.emulateMedia({ colorScheme: 'dark' });
+}"
+
+# Emulate light color scheme
+playwright-cli run-code "async page => {
+  await page.emulateMedia({ colorScheme: 'light' });
+}"
+
+# Emulate reduced motion
+playwright-cli run-code "async page => {
+  await page.emulateMedia({ reducedMotion: 'reduce' });
+}"
+
+# Emulate print media
+playwright-cli run-code "async page => {
+  await page.emulateMedia({ media: 'print' });
+}"
+```
+
+## Wait Strategies
+
+```bash
+# Wait for network idle
+playwright-cli run-code "async page => {
+  await page.waitForLoadState('networkidle');
+}"
+
+# Wait for specific element
+playwright-cli run-code "async page => {
+  await page.waitForSelector('.loading', { state: 'hidden' });
+}"
+
+# Wait for function to return true
+playwright-cli run-code "async page => {
+  await page.waitForFunction(() => window.appReady === true);
+}"
+
+# Wait with timeout
+playwright-cli run-code "async page => {
+  await page.waitForSelector('.result', { timeout: 10000 });
+}"
+```
+
+## Frames and Iframes
+
+```bash
+# Work with iframe
+playwright-cli run-code "async page => {
+  const frame = page.locator('iframe#my-iframe').contentFrame();
+  await frame.locator('button').click();
+}"
+
+# Get all frames
+playwright-cli run-code "async page => {
+  const frames = page.frames();
+  return frames.map(f => f.url());
+}"
+```
+
+## File Downloads
+
+```bash
+# Handle file download
+playwright-cli run-code "async page => {
+  const [download] = await Promise.all([
+    page.waitForEvent('download'),
+    page.click('a.download-link')
+  ]);
+  await download.saveAs('./downloaded-file.pdf');
+  return download.suggestedFilename();
+}"
+```
+
+## Clipboard
+
+```bash
+# Read clipboard (requires permission)
+playwright-cli run-code "async page => {
+  await page.context().grantPermissions(['clipboard-read']);
+  return await page.evaluate(() => navigator.clipboard.readText());
+}"
+
+# Write to clipboard
+playwright-cli run-code "async page => {
+  await page.evaluate(text => navigator.clipboard.writeText(text), 'Hello clipboard!');
+}"
+```
+
+## Page Information
+
+```bash
+# Get page title
+playwright-cli run-code "async page => {
+  return await page.title();
+}"
+
+# Get current URL
+playwright-cli run-code "async page => {
+  return page.url();
+}"
+
+# Get page content
+playwright-cli run-code "async page => {
+  return await page.content();
+}"
+
+# Get viewport size
+playwright-cli run-code "async page => {
+  return page.viewportSize();
+}"
+```
+
+## JavaScript Execution
+
+```bash
+# Execute JavaScript and return result
+playwright-cli run-code "async page => {
+  return await page.evaluate(() => {
+    return {
+      userAgent: navigator.userAgent,
+      language: navigator.language,
+      cookiesEnabled: navigator.cookieEnabled
+    };
+  });
+}"
+
+# Pass arguments to evaluate
+playwright-cli run-code "async page => {
+  const multiplier = 5;
+  return await page.evaluate(m => document.querySelectorAll('li').length * m, multiplier);
+}"
+```
+
+## Error Handling
+
+```bash
+# Try-catch in run-code
+playwright-cli run-code "async page => {
+  try {
+    await page.click('.maybe-missing', { timeout: 1000 });
+    return 'clicked';
+  } catch (e) {
+    return 'element not found';
+  }
+}"
+```
+
+## Complex Workflows
+
+```bash
+# Login and save state
+playwright-cli run-code "async page => {
+  await page.goto('https://example.com/login');
+  await page.fill('input[name=email]', 'user@example.com');
+  await page.fill('input[name=password]', 'secret');
+  await page.click('button[type=submit]');
+  await page.waitForURL('**/dashboard');
+  await page.context().storageState({ path: 'auth.json' });
+  return 'Login successful';
+}"
+
+# Scrape data from multiple pages
+playwright-cli run-code "async page => {
+  const results = [];
+  for (let i = 1; i <= 3; i++) {
+    await page.goto(\`https://example.com/page/\${i}\`);
+    const items = await page.locator('.item').allTextContents();
+    results.push(...items);
+  }
+  return results;
+}"
+```
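Note on `references/running-code.md`: the last example above has to escape backticks and `${i}` because the snippet sits inside a double-quoted shell string. A here-document with a quoted delimiter avoids that escaping. The sketch below assumes `run-code` accepts the snippet as a single argument, as in the examples above; `run_code_stdin` and `scrape_pages` are hypothetical names, not part of playwright-cli.

```bash
# Hypothetical helper (not part of playwright-cli): read a snippet from
# stdin and pass it to run-code as one argument.
run_code_stdin() {
  playwright-cli run-code "$(cat)"
}

# The quoted 'EOF' delimiter stops the shell from expanding backticks and
# ${i}, so the template literal reaches Playwright unchanged.
scrape_pages() {
  run_code_stdin <<'EOF'
async page => {
  const results = [];
  for (let i = 1; i <= 3; i++) {
    await page.goto(`https://example.com/page/${i}`);
    const items = await page.locator('.item').allTextContents();
    results.push(...items);
  }
  return results;
}
EOF
}
```

Calling `scrape_pages` sends the same snippet as the "Scrape data from multiple pages" example, without backslashes in the JavaScript.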
--- a/.claude/skills/playwright-cli/references/session-management.md
+++ b/.claude/skills/playwright-cli/references/session-management.md
@@ -0,0 +1,169 @@
+# Browser Session Management
+
+Run multiple isolated browser sessions concurrently with state persistence.
+
+## Named Browser Sessions
+
+Use the `-s` flag to isolate browser contexts:
+
+```bash
+# Browser 1: Authentication flow
+playwright-cli -s=auth open https://app.example.com/login
+
+# Browser 2: Public browsing (separate cookies, storage)
+playwright-cli -s=public open https://example.com
+
+# Commands are isolated by browser session
+playwright-cli -s=auth fill e1 "user@example.com"
+playwright-cli -s=public snapshot
+```
+
+## Browser Session Isolation Properties
+
+Each browser session has independent:
+- Cookies
+- LocalStorage / SessionStorage
+- IndexedDB
+- Cache
+- Browsing history
+- Open tabs
+
+## Browser Session Commands
+
+```bash
+# List all browser sessions
+playwright-cli list
+
+# Stop a browser session (close the browser)
+playwright-cli close                # stop the default browser
+playwright-cli -s=mysession close   # stop a named browser
+
+# Stop all browser sessions
+playwright-cli close-all
+
+# Forcefully kill all daemon processes (for stale/zombie processes)
+playwright-cli kill-all
+
+# Delete browser session user data (profile directory)
+playwright-cli delete-data                # delete default browser data
+playwright-cli -s=mysession delete-data   # delete named browser data
+```
+
+## Environment Variable
+
+Set a default browser session name via environment variable:
+
+```bash
+export PLAYWRIGHT_CLI_SESSION="mysession"
+playwright-cli open example.com  # Uses "mysession" automatically
+```
+
+## Common Patterns
+
+### Concurrent Scraping
+
+```bash
+#!/bin/bash
+# Scrape multiple sites concurrently
+
+# Start all browsers
+playwright-cli -s=site1 open https://site1.com &
+playwright-cli -s=site2 open https://site2.com &
+playwright-cli -s=site3 open https://site3.com &
+wait
+
+# Take snapshots from each
+playwright-cli -s=site1 snapshot
+playwright-cli -s=site2 snapshot
+playwright-cli -s=site3 snapshot
+
+# Cleanup
+playwright-cli close-all
+```
+
+### A/B Testing Sessions
+
+```bash
+# Test different user experiences
+playwright-cli -s=variant-a open "https://app.com?variant=a"
+playwright-cli -s=variant-b open "https://app.com?variant=b"
+
+# Compare
+playwright-cli -s=variant-a screenshot
+playwright-cli -s=variant-b screenshot
+```
+
+### Persistent Profile
+
+By default, the browser profile is kept in memory only. Use the `--persistent` flag on `open` to persist it to disk:
+
+```bash
+# Use persistent profile (auto-generated location)
+playwright-cli open https://example.com --persistent
+
+# Use persistent profile with custom directory
+playwright-cli open https://example.com --profile=/path/to/profile
+```
+
+## Default Browser Session
+
+When `-s` is omitted, commands use the default browser session:
+
+```bash
+# These use the same default browser session
+playwright-cli open https://example.com
+playwright-cli snapshot
+playwright-cli close  # Stops default browser
+```
+
+## Browser Session Configuration
+
+Configure a browser session with specific settings when opening:
+
+```bash
+# Open with config file
+playwright-cli open https://example.com --config=.playwright/my-cli.json
+
+# Open with specific browser
+playwright-cli open https://example.com --browser=firefox
+
+# Open in headed mode
+playwright-cli open https://example.com --headed
+
+# Open with persistent profile
+playwright-cli open https://example.com --persistent
+```
+
+## Best Practices
+
+### 1. Name Browser Sessions Semantically
+
+```bash
+# GOOD: Clear purpose
+playwright-cli -s=github-auth open https://github.com
+playwright-cli -s=docs-scrape open https://docs.example.com
+
+# AVOID: Generic names
+playwright-cli -s=s1 open https://github.com
+```
+
+### 2. Always Clean Up
+
+```bash
+# Stop browsers when done
+playwright-cli -s=auth close
+playwright-cli -s=scrape close
+
+# Or stop all at once
+playwright-cli close-all
+
+# If browsers become unresponsive or zombie processes remain
+playwright-cli kill-all
+```
+
+### 3. Delete Stale Browser Data
+
+```bash
+# Remove old browser data to free disk space
+playwright-cli -s=oldsession delete-data
+```
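Note on `references/session-management.md`: the "Concurrent Scraping" pattern above hard-codes three sessions. A minimal sketch of the same idea for any number of URLs follows. It uses only the `-s=`, `open`, `snapshot`, and `close` usage shown in the file; the helper name `snapshot_sites` and the `site1`, `site2`, ... session names are assumptions made for illustration.

```bash
# Hypothetical helper (not part of playwright-cli).
# Usage: snapshot_sites URL [URL...]
snapshot_sites() {
  count=0
  for url in "$@"; do
    count=$((count + 1))
    # Each URL gets its own named session: site1, site2, ...
    playwright-cli -s="site$count" open "$url" &
  done
  wait  # block until every background `open` has returned

  n=1
  while [ "$n" -le "$count" ]; do
    playwright-cli -s="site$n" snapshot
    # Close only the sessions this function created.
    playwright-cli -s="site$n" close
    n=$((n + 1))
  done
}
```

Closing each named session rather than calling `close-all` is a deliberate choice here, so that unrelated sessions on the same machine are left running.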
--- a/.claude/skills/playwright-cli/references/storage-state.md
+++ b/.claude/skills/playwright-cli/references/storage-state.md
@@ -0,0 +1,275 @@
+# Storage Management
+
+Manage cookies, localStorage, sessionStorage, and browser storage state.
+
+## Storage State
+
+Save and restore complete browser state including cookies and storage.
+
+### Save Storage State
+
+```bash
+# Save to auto-generated filename (storage-state-{timestamp}.json)
+playwright-cli state-save
+
+# Save to specific filename
+playwright-cli state-save my-auth-state.json
+```
+
+### Restore Storage State
+
+```bash
+# Load storage state from file
+playwright-cli state-load my-auth-state.json
+
+# Open the page so the restored cookies take effect
+playwright-cli open https://example.com
+```
+
+### Storage State File Format
+
+The saved file contains:
+
+```json
+{
+  "cookies": [
+    {
+      "name": "session_id",
+      "value": "abc123",
+      "domain": "example.com",
+      "path": "/",
+      "expires": 1735689600,
+      "httpOnly": true,
+      "secure": true,
+      "sameSite": "Lax"
+    }
+  ],
+  "origins": [
+    {
+      "origin": "https://example.com",
+      "localStorage": [
+        { "name": "theme", "value": "dark" },
+        { "name": "user_id", "value": "12345" }
+      ]
+    }
+  ]
+}
+```
+
+## Cookies
+
+### List All Cookies
+
+```bash
+playwright-cli cookie-list
+```
+
+### Filter Cookies by Domain
+
+```bash
+playwright-cli cookie-list --domain=example.com
+```
+
+### Filter Cookies by Path
+
+```bash
+playwright-cli cookie-list --path=/api
+```
+
+### Get Specific Cookie
+
+```bash
+playwright-cli cookie-get session_id
+```
+
+### Set a Cookie
+
+```bash
+# Basic cookie
+playwright-cli cookie-set session abc123
+
+# Cookie with options
+playwright-cli cookie-set session abc123 --domain=example.com --path=/ --httpOnly --secure --sameSite=Lax
+
+# Cookie with expiration (Unix timestamp)
+playwright-cli cookie-set remember_me token123 --expires=1735689600
+```
+
+### Delete a Cookie
+
+```bash
+playwright-cli cookie-delete session_id
+```
+
+### Clear All Cookies
+
+```bash
+playwright-cli cookie-clear
+```
+
+### Advanced: Multiple Cookies or Custom Options
+
+For complex scenarios like adding multiple cookies at once, use `run-code`:
+
+```bash
+playwright-cli run-code "async page => {
+  await page.context().addCookies([
+    { name: 'session_id', value: 'sess_abc123', domain: 'example.com', path: '/', httpOnly: true },
+    { name: 'preferences', value: JSON.stringify({ theme: 'dark' }), domain: 'example.com', path: '/' }
+  ]);
+}"
+```
+
+## Local Storage
+
+### List All localStorage Items
+
+```bash
+playwright-cli localstorage-list
+```
+
+### Get Single Value
+
+```bash
+playwright-cli localstorage-get token
+```
+
+### Set Value
+
+```bash
+playwright-cli localstorage-set theme dark
+```
+
+### Set JSON Value
+
+```bash
+playwright-cli localstorage-set user_settings '{"theme":"dark","language":"en"}'
+```
+
+### Delete Single Item
+
+```bash
+playwright-cli localstorage-delete token
+```
+
+### Clear All localStorage
+
+```bash
+playwright-cli localstorage-clear
+```
+
+### Advanced: Multiple Operations
+
+For complex scenarios like setting multiple values at once, use `run-code`:
+
+```bash
+playwright-cli run-code "async page => {
+  await page.evaluate(() => {
+    localStorage.setItem('token', 'jwt_abc123');
+    localStorage.setItem('user_id', '12345');
+    localStorage.setItem('expires_at', Date.now() + 3600000);
+  });
+}"
+```
+
+## Session Storage
+
+### List All sessionStorage Items
+
+```bash
+playwright-cli sessionstorage-list
+```
+
+### Get Single Value
+
+```bash
+playwright-cli sessionstorage-get form_data
+```
+
+### Set Value
+
+```bash
+playwright-cli sessionstorage-set step 3
+```
+
+### Delete Single Item
+
+```bash
+playwright-cli sessionstorage-delete step
+```
+
+### Clear sessionStorage
+
+```bash
+playwright-cli sessionstorage-clear
+```
+
+## IndexedDB
+
+### List Databases
+
+```bash
+playwright-cli run-code "async page => {
+  return await page.evaluate(async () => {
+    const databases = await indexedDB.databases();
+    return databases;
+  });
+}"
+```
+
+### Delete Database
+
+```bash
+playwright-cli run-code "async page => {
+  await page.evaluate(() => {
+    indexedDB.deleteDatabase('myDatabase');
+  });
+}"
+```
+
+## Common Patterns
+
+### Authentication State Reuse
+
+```bash
+# Step 1: Login and save state
+playwright-cli open https://app.example.com/login
+playwright-cli snapshot
+playwright-cli fill e1 "user@example.com"
+playwright-cli fill e2 "password123"
+playwright-cli click e3
+
+# Save the authenticated state
+playwright-cli state-save auth.json
+
+# Step 2: Later, restore state and skip login
+playwright-cli state-load auth.json
+playwright-cli open https://app.example.com/dashboard
+# Already logged in!
+```
+
+### Save and Restore Roundtrip
+
+```bash
+# Set up authentication state
+playwright-cli open https://example.com
+playwright-cli eval "() => { document.cookie = 'session=abc123'; localStorage.setItem('user', 'john'); }"
+
+# Save state to file
+playwright-cli state-save my-session.json
+
+# ... later, in a new session ...
+
+# Restore state
+playwright-cli state-load my-session.json
+playwright-cli open https://example.com
+# Cookies and localStorage are restored!
+```
+
+## Security Notes
+
+- Never commit storage state files containing auth tokens
+- Add `*.auth-state.json` to `.gitignore`
+- Delete state files after automation completes
+- Use environment variables for sensitive data
+- By default, sessions run in in-memory mode, which is safer for sensitive operations
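Note on `references/storage-state.md`: the Security Notes above recommend deleting state files once automation completes. One way to make that automatic is sketched below. It assumes `state-save` accepts an absolute file path, which the file above does not confirm; `with_temp_state` is a hypothetical helper name, not part of playwright-cli.

```bash
# Hypothetical helper (not part of playwright-cli).
# Usage: with_temp_state COMMAND [ARGS...]
# Saves the current storage state to a private temp file, runs COMMAND with
# the file path appended as its last argument, then deletes the file.
with_temp_state() {
  # mktemp -d creates a directory readable only by the current user.
  state_dir=$(mktemp -d) || return 1
  state_file="$state_dir/auth.json"
  # Assumption: state-save writes to the absolute path it is given.
  if playwright-cli state-save "$state_file"; then
    "$@" "$state_file"
    status=$?
  else
    status=1
  fi
  # Remove the state file whether or not COMMAND succeeded.
  rm -rf "$state_dir"
  return "$status"
}
```

For example, `with_temp_state playwright-cli -s=other state-load` would copy the current state into a session named `other` without leaving a file containing auth tokens on disk.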
--- a/.claude/skills/playwright-cli/references/test-generation.md
+++ b/.claude/skills/playwright-cli/references/test-generation.md
@@ -0,0 +1,88 @@
+# Test Generation
+
+Generate Playwright test code automatically as you interact with the browser.
+
+## How It Works
+
+Every action you perform with `playwright-cli` generates corresponding Playwright TypeScript code.
+This code appears in the output and can be copied directly into your test files.
+
+## Example Workflow
+
+```bash
+# Start a session
+playwright-cli open https://example.com/login
+
+# Take a snapshot to see elements
+playwright-cli snapshot
+# Output shows: e1 [textbox "Email"], e2 [textbox "Password"], e3 [button "Sign In"]
+
+# Fill form fields - generates code automatically
+playwright-cli fill e1 "user@example.com"
+# Ran Playwright code:
+# await page.getByRole('textbox', { name: 'Email' }).fill('user@example.com');
+
+playwright-cli fill e2 "password123"
+# Ran Playwright code:
+# await page.getByRole('textbox', { name: 'Password' }).fill('password123');
+
+playwright-cli click e3
+# Ran Playwright code:
+# await page.getByRole('button', { name: 'Sign In' }).click();
+```
+
+## Building a Test File
+
+Collect the generated code into a Playwright test:
+
+```typescript
+import { test, expect } from '@playwright/test';
+
+test('login flow', async ({ page }) => {
+  // Generated code from playwright-cli session:
+  await page.goto('https://example.com/login');
+  await page.getByRole('textbox', { name: 'Email' }).fill('user@example.com');
+  await page.getByRole('textbox', { name: 'Password' }).fill('password123');
+  await page.getByRole('button', { name: 'Sign In' }).click();
+
+  // Add assertions
+  await expect(page).toHaveURL(/.*dashboard/);
+});
+```
+
+## Best Practices
+
+### 1. Use Semantic Locators
+
+The generated code uses role-based locators when possible, which are more resilient:
+
+```typescript
+// Generated (good - semantic)
+await page.getByRole('button', { name: 'Submit' }).click();
+
+// Avoid (fragile - CSS selectors)
+await page.locator('#submit-btn').click();
+```
+
+### 2. Explore Before Recording
+
+Take snapshots to understand the page structure before recording actions:
+
+```bash
+playwright-cli open https://example.com
+playwright-cli snapshot
+# Review the element structure
+playwright-cli click e5
+```
+
+### 3. Add Assertions Manually
+
+Generated code captures actions but not assertions. Add expectations in your test:
+
+```typescript
+// Generated action
+await page.getByRole('button', { name: 'Submit' }).click();
+
+// Manual assertion
+await expect(page.getByText('Success')).toBeVisible();
+```
--- a/.claude/skills/playwright-cli/references/tracing.md
+++ b/.claude/skills/playwright-cli/references/tracing.md
@@ -0,0 +1,139 @@
+# Tracing
+
+Capture detailed execution traces for debugging and analysis. Traces include DOM snapshots, screenshots, network activity, and console logs.
+
+## Basic Usage
+
+```bash
+# Start trace recording
+playwright-cli tracing-start
+
+# Perform actions
+playwright-cli open https://example.com
+playwright-cli click e1
+playwright-cli fill e2 "test"
+
+# Stop trace recording
+playwright-cli tracing-stop
+```
+
+## Trace Output Files
+
+When you start tracing, Playwright creates a `traces/` directory with several files:
+
+### `trace-{timestamp}.trace`
+
+**Action log** - The main trace file containing:
+- Every action performed (clicks, fills, navigations)
+- DOM snapshots before and after each action
+- Screenshots at each step
+- Timing information
+- Console messages
+- Source locations
+
+### `trace-{timestamp}.network`
+
+**Network log** - Complete network activity:
+- All HTTP requests and responses
+- Request headers and bodies
+- Response headers and bodies
+- Timing (DNS, connect, TLS, TTFB, download)
+- Resource sizes
+- Failed requests and errors
+
+### `resources/`
+
+**Resources directory** - Cached resources:
+- Images, fonts, stylesheets, scripts
+- Response bodies for replay
+- Assets needed to reconstruct page state
+
+## What Traces Capture
+
+| Category | Details |
+|----------|---------|
+| **Actions** | Clicks, fills, hovers, keyboard input, navigations |
+| **DOM** | Full DOM snapshot before/after each action |
+| **Screenshots** | Visual state at each step |
+| **Network** | All requests, responses, headers, bodies, timing |
+| **Console** | All console.log, warn, error messages |
+| **Timing** | Precise timing for each operation |
+
+## Use Cases
+
+### Debugging Failed Actions
+
+```bash
+playwright-cli tracing-start
+playwright-cli open https://app.example.com
+
+# This click fails - why?
+playwright-cli click e5
+
+playwright-cli tracing-stop
+# Open trace to see DOM state when click was attempted
+```
+
+### Analyzing Performance
+
+```bash
+playwright-cli tracing-start
+playwright-cli open https://slow-site.com
+playwright-cli tracing-stop
+
+# View network waterfall to identify slow resources
+```
+
+### Capturing Evidence
+
+```bash
+# Record a complete user flow for documentation
+playwright-cli tracing-start
+
+playwright-cli open https://app.example.com/checkout
+playwright-cli fill e1 "4111111111111111"
+playwright-cli fill e2 "12/25"
+playwright-cli fill e3 "123"
+playwright-cli click e4
+
+playwright-cli tracing-stop
+# Trace shows exact sequence of events
+```
+
+## Trace vs Video vs Screenshot
+
+| Feature | Trace | Video | Screenshot |
+|---------|-------|-------|------------|
+| **Format** | .trace file | .webm video | .png/.jpeg image |
+| **DOM inspection** | Yes | No | No |
+| **Network details** | Yes | No | No |
+| **Step-by-step replay** | Yes | Continuous | Single frame |
+| **File size** | Medium | Large | Small |
+| **Best for** | Debugging | Demos | Quick capture |
+
+## Best Practices
+
+### 1. Start Tracing Before the Problem
+
+```bash
+# Trace the entire flow, not just the failing step
+playwright-cli tracing-start
+playwright-cli open https://example.com
+# ... all steps leading to the issue ...
+playwright-cli tracing-stop
+```
+
+### 2. Clean Up Old Traces
+
+Traces can consume significant disk space:
+
+```bash
+# Remove trace files older than 7 days (-type f leaves the directories in place)
+find .playwright-cli/traces -type f -mtime +7 -delete
+```
+
+## Limitations
+
+- Traces add overhead to automation
+- Large traces can consume significant disk space
+- Some dynamic content may not replay perfectly
--- a/.claude/skills/playwright-cli/references/video-recording.md
+++ b/.claude/skills/playwright-cli/references/video-recording.md
@@ -0,0 +1,43 @@
+# Video Recording
+
+Capture browser automation sessions as video for debugging, documentation, or verification. Produces WebM (VP8/VP9 codec).
+
+## Basic Recording
+
+```bash
+# Start recording
+playwright-cli video-start
+
+# Perform actions
+playwright-cli open https://example.com
+playwright-cli snapshot
+playwright-cli click e1
+playwright-cli fill e2 "test input"
+
+# Stop and save
+playwright-cli video-stop demo.webm
+```
+
+## Best Practices
+
+### 1. Use Descriptive Filenames
+
+```bash
+# Include context in filename
+playwright-cli video-stop recordings/login-flow-2024-01-15.webm
+playwright-cli video-stop recordings/checkout-test-run-42.webm
+```
+
+## Tracing vs Video
+
+| Feature | Video | Tracing |
+|---------|-------|---------|
+| Output | WebM file | Trace file (viewable in Trace Viewer) |
+| Shows | Visual recording | DOM snapshots, network, console, actions |
+| Use case | Demos, documentation | Debugging, analysis |
+| Size | Larger | Smaller |
+
+## Limitations
+
+- Recording adds slight overhead to automation
+- Large recordings can consume significant disk space
--- a/.claude/templates/coding_prompt.template.md
+++ b/.claude/templates/coding_prompt.template.md
@@ -86,24 +86,33 @@ Implement the chosen feature thoroughly:

 **CRITICAL:** You MUST verify features through the actual UI.

-Use browser automation tools:
+Use `playwright-cli` for browser automation:

- Navigate to the app in a real browser
- Interact like a human user (click, type, scroll)
- Take screenshots at each step
- Verify both functionality AND visual appearance
+- Open the browser: `playwright-cli open http://localhost:PORT`
+- Take a snapshot to see page elements: `playwright-cli snapshot`
+- Read the snapshot YAML file to see element refs
+- Click elements by ref: `playwright-cli click e5`
+- Type text: `playwright-cli type "search query"`
+- Fill form fields: `playwright-cli fill e3 "value"`
+- Take screenshots: `playwright-cli screenshot`
+- Read the screenshot file to verify visual appearance
+- Check console errors: `playwright-cli console`
+- Close browser when done: `playwright-cli close`
+
+**Token-efficient workflow:** `playwright-cli screenshot` and `snapshot` save files
+to `.playwright-cli/`. You will see a file link in the output. Read the file only
+when you need to verify visual appearance or find element refs.

 **DO:**
-
 - Test through the UI with clicks and keyboard input
- Take screenshots to verify visual appearance
- Check for console errors in browser
+- Take screenshots and read them to verify visual appearance
+- Check for console errors with `playwright-cli console`
 - Verify complete user workflows end-to-end
+- Always run `playwright-cli close` when finished testing

 **DON'T:**
-
- Only test with curl commands (backend testing alone is insufficient)
- Use JavaScript evaluation to bypass UI (no shortcuts)
+- Only test with curl commands
+- Use JavaScript evaluation to bypass UI (`eval` and `run-code` are blocked)
 - Skip visual verification
 - Mark tests passing without thorough verification

@@ -145,7 +154,7 @@ Use the feature_mark_passing tool with feature_id=42
 - Combine or consolidate features
 - Reorder features

-**ONLY MARK A FEATURE AS PASSING AFTER VERIFICATION WITH SCREENSHOTS.**
+**ONLY MARK A FEATURE AS PASSING AFTER VERIFICATION WITH BROWSER AUTOMATION.**

 ### STEP 7: COMMIT YOUR PROGRESS

@@ -192,9 +201,15 @@ Before context fills up:

 ## BROWSER AUTOMATION

-Use Playwright MCP tools (`browser_*`) for UI verification. Key tools: `navigate`, `click`, `type`, `fill_form`, `take_screenshot`, `console_messages`, `network_requests`. All tools have auto-wait built in.
+Use `playwright-cli` commands for UI verification. Key commands: `open`, `goto`,
+`snapshot`, `click`, `type`, `fill`, `screenshot`, `console`, `close`.

-Test like a human user with mouse and keyboard. Use `browser_console_messages` to detect errors. Don't bypass UI with JavaScript evaluation.
+**How it works:** `playwright-cli` uses a persistent browser daemon. `open` starts it,
+subsequent commands interact via socket, `close` shuts it down. Screenshots and snapshots
+save to `.playwright-cli/` -- read the files when you need to verify content.
+
+Test like a human user with mouse and keyboard. Use `playwright-cli console` to detect
+JS errors. Don't bypass UI with JavaScript evaluation.

 ---

--- a/.claude/templates/testing_prompt.template.md
+++ b/.claude/templates/testing_prompt.template.md
@@ -31,26 +31,32 @@ For the feature returned:
 1. Read and understand the feature's verification steps
 2. Navigate to the relevant part of the application
 3. Execute each verification step using browser automation
-4. Take screenshots to document the verification
+4. Take screenshots and read them to verify visual appearance
 5. Check for console errors

-Use browser automation tools:
+### Browser Automation (Playwright CLI)

 **Navigation & Screenshots:**
- browser_navigate - Navigate to a URL
- browser_take_screenshot - Capture screenshot (use for visual verification)
- browser_snapshot - Get accessibility tree snapshot
+- `playwright-cli open <url>` - Open browser and navigate
+- `playwright-cli goto <url>` - Navigate to URL
+- `playwright-cli screenshot` - Save screenshot to `.playwright-cli/`
+- `playwright-cli snapshot` - Save page snapshot with element refs to `.playwright-cli/`

 **Element Interaction:**
- browser_click - Click elements
- browser_type - Type text into editable elements
- browser_fill_form - Fill multiple form fields
- browser_select_option - Select dropdown options
- browser_press_key - Press keyboard keys
+- `playwright-cli click <ref>` - Click elements (ref from snapshot)
+- `playwright-cli type <text>` - Type text
+- `playwright-cli fill <ref> <text>` - Fill form fields
+- `playwright-cli select <ref> <val>` - Select dropdown
+- `playwright-cli press <key>` - Keyboard input

 **Debugging:**
- browser_console_messages - Get browser console output (check for errors)
- browser_network_requests - Monitor API calls
+- `playwright-cli console` - Check for JS errors
+- `playwright-cli network` - Monitor API calls
+
+**Cleanup:**
+- `playwright-cli close` - Close browser when done (ALWAYS do this)
+
+**Note:** Screenshots and snapshots save to files. Read the file to see the content.

 ### STEP 3: HANDLE RESULTS

@@ -79,7 +85,7 @@ A regression has been introduced. You MUST fix it:

 4. **Verify the fix:**
   - Run through all verification steps again
-   - Take screenshots confirming the fix
+   - Take screenshots and read them to confirm the fix

 5. **Mark as passing after fix:**
   ```
@@ -98,7 +104,7 @@ A regression has been introduced. You MUST fix it:

 ---

-## AVAILABLE MCP TOOLS
+## AVAILABLE TOOLS

 ### Feature Management
 - `feature_get_stats` - Get progress overview (passing/in_progress/total counts)
@@ -106,19 +112,17 @@ A regression has been introduced. You MUST fix it:
 - `feature_mark_failing` - Mark a feature as failing (when you find a regression)
 - `feature_mark_passing` - Mark a feature as passing (after fixing a regression)

-### Browser Automation (Playwright)
-All interaction tools have **built-in auto-wait** -- no manual timeouts needed.
-
- `browser_navigate` - Navigate to URL
- `browser_take_screenshot` - Capture screenshot
- `browser_snapshot` - Get accessibility tree
- `browser_click` - Click elements
- `browser_type` - Type text
- `browser_fill_form` - Fill form fields
- `browser_select_option` - Select dropdown
- `browser_press_key` - Keyboard input
- `browser_console_messages` - Check for JS errors
- `browser_network_requests` - Monitor API calls
+### Browser Automation (Playwright CLI)
+Use `playwright-cli` commands for browser interaction. Key commands:
+- `playwright-cli open <url>` - Open browser
+- `playwright-cli goto <url>` - Navigate to URL
+- `playwright-cli screenshot` - Take screenshot (saved to `.playwright-cli/`)
+- `playwright-cli snapshot` - Get page snapshot with element refs
+- `playwright-cli click <ref>` - Click element
+- `playwright-cli type <text>` - Type text
+- `playwright-cli fill <ref> <text>` - Fill form field
+- `playwright-cli console` - Check for JS errors
+- `playwright-cli close` - Close browser (always do this when done)

 ---

--- a/.env.example
+++ b/.env.example
@@ -9,11 +9,6 @@
 # - webkit: Safari engine
 # - msedge: Microsoft Edge
 # PLAYWRIGHT_BROWSER=firefox
-#
-# PLAYWRIGHT_HEADLESS: Run browser without visible window
-# - true: Browser runs in background, saves CPU (default)
-# - false: Browser opens a visible window (useful for debugging)
-# PLAYWRIGHT_HEADLESS=true

 # Extra Read Paths (Optional)
 # Comma-separated list of absolute paths for read-only access to external directories.
@@ -25,40 +20,44 @@
 # Google Cloud Vertex AI Configuration (Optional)
 # To use Claude via Vertex AI on Google Cloud Platform, uncomment and set these variables.
 # Requires: gcloud CLI installed and authenticated (run: gcloud auth application-default login)
-# Note: Use @ instead of - in model names (e.g., claude-opus-4-5@20251101)
+# Note: Use @ instead of - in model names for date-suffixed models (e.g., claude-sonnet-4-5@20250929)
 #
 # CLAUDE_CODE_USE_VERTEX=1
 # CLOUD_ML_REGION=us-east5
 # ANTHROPIC_VERTEX_PROJECT_ID=your-gcp-project-id
-# ANTHROPIC_DEFAULT_OPUS_MODEL=claude-opus-4-5@20251101
+# ANTHROPIC_DEFAULT_OPUS_MODEL=claude-opus-4-6
 # ANTHROPIC_DEFAULT_SONNET_MODEL=claude-sonnet-4-5@20250929
 # ANTHROPIC_DEFAULT_HAIKU_MODEL=claude-3-5-haiku@20241022

-# GLM/Alternative API Configuration (Optional)
-# To use Zhipu AI's GLM models instead of Claude, uncomment and set these variables.
-# This only affects AutoForge - your global Claude Code settings remain unchanged.
-# Get an API key at: https://z.ai/subscribe
+# ===================
+# Alternative API Providers (Azure, GLM, Ollama, Kimi, Custom)
+# ===================
+# Configure via Settings UI (recommended) or set env vars below.
+# When both are set, env vars take precedence.
 #
+# Azure Anthropic (Claude):
+# ANTHROPIC_BASE_URL=https://your-resource.services.ai.azure.com/anthropic
+# ANTHROPIC_API_KEY=your-azure-api-key
+# ANTHROPIC_DEFAULT_OPUS_MODEL=claude-opus-4-6
+# ANTHROPIC_DEFAULT_SONNET_MODEL=claude-sonnet-4-5
+# ANTHROPIC_DEFAULT_HAIKU_MODEL=claude-haiku-4-5
+#
+# GLM (Zhipu AI):
 # ANTHROPIC_BASE_URL=https://api.z.ai/api/anthropic
-# ANTHROPIC_AUTH_TOKEN=your-zhipu-api-key
-# API_TIMEOUT_MS=3000000
-# ANTHROPIC_DEFAULT_SONNET_MODEL=glm-4.7
+# ANTHROPIC_AUTH_TOKEN=your-glm-api-key
 # ANTHROPIC_DEFAULT_OPUS_MODEL=glm-4.7
-# ANTHROPIC_DEFAULT_HAIKU_MODEL=glm-4.5-air
-
-# Ollama Local Model Configuration (Optional)
-# To use local models via Ollama instead of Claude, uncomment and set these variables.
-# Requires Ollama v0.14.0+ with Anthropic API compatibility.
-# See: https://ollama.com/blog/claude
+# ANTHROPIC_DEFAULT_SONNET_MODEL=glm-4.7
+# ANTHROPIC_DEFAULT_HAIKU_MODEL=glm-4.7
 #
+# Ollama (Local):
 # ANTHROPIC_BASE_URL=http://localhost:11434
-# ANTHROPIC_AUTH_TOKEN=ollama
-# API_TIMEOUT_MS=3000000
-# ANTHROPIC_DEFAULT_SONNET_MODEL=qwen3-coder
 # ANTHROPIC_DEFAULT_OPUS_MODEL=qwen3-coder
+# ANTHROPIC_DEFAULT_SONNET_MODEL=qwen3-coder
 # ANTHROPIC_DEFAULT_HAIKU_MODEL=qwen3-coder
 #
-# Model recommendations:
-# - For best results, use a capable coding model like qwen3-coder or deepseek-coder-v2
-# - You can use the same model for all tiers, or different models per tier
-# - Larger models (70B+) work best for Opus tier, smaller (7B-20B) for Haiku
+# Kimi (Moonshot):
+# ANTHROPIC_BASE_URL=https://api.kimi.com/coding/
+# ANTHROPIC_API_KEY=your-kimi-api-key
+# ANTHROPIC_DEFAULT_OPUS_MODEL=kimi-k2.5
+# ANTHROPIC_DEFAULT_SONNET_MODEL=kimi-k2.5
+# ANTHROPIC_DEFAULT_HAIKU_MODEL=kimi-k2.5
--- a/.gitignore
+++ b/.gitignore
@@ -10,6 +10,10 @@ issues/
 # Browser profiles for parallel agent execution
 .browser-profiles/

+# Playwright CLI daemon artifacts
+.playwright-cli/
+.playwright/
+
 # Log files
 logs/
 *.log
--- a/.npmignore
+++ b/.npmignore
@@ -28,5 +28,4 @@ start.sh
 start_ui.sh
 start_ui.py
 .claude/agents/
-.claude/skills/
 .claude/settings.json
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -85,7 +85,7 @@ python autonomous_agent_demo.py --project-dir my-app --yolo

 **What's different in YOLO mode:**
 - No regression testing
- No Playwright MCP server (browser automation disabled)
+- No Playwright CLI (browser automation disabled)
 - Features marked passing after lint/type-check succeeds
 - Faster iteration for prototyping

@@ -163,7 +163,7 @@ Publishing: `npm publish` (triggers `prepublishOnly` which builds UI, then publi
 - `autonomous_agent_demo.py` - Entry point for running the agent (supports `--yolo`, `--parallel`, `--batch-size`, `--batch-features`)
 - `autoforge_paths.py` - Central path resolution with dual-path backward compatibility and migration
 - `agent.py` - Agent session loop using Claude Agent SDK
- `client.py` - ClaudeSDKClient configuration with security hooks, MCP servers, and Vertex AI support
+- `client.py` - ClaudeSDKClient configuration with security hooks, feature MCP server, and Vertex AI support
 - `security.py` - Bash command allowlist validation (ALLOWED_COMMANDS whitelist)
 - `prompts.py` - Prompt template loading with project-specific fallback and batch feature prompts
 - `progress.py` - Progress tracking, database queries, webhook notifications
@@ -288,6 +288,9 @@ Projects can be stored in any directory (registered in `~/.autoforge/registry.db
 - `.autoforge/.agent.lock` - Lock file to prevent multiple agent instances
 - `.autoforge/allowed_commands.yaml` - Project-specific bash command allowlist (optional)
 - `.autoforge/.gitignore` - Ignores runtime files
+- `.claude/skills/playwright-cli/` - Playwright CLI skill for browser automation
+- `.playwright/cli.config.json` - Browser configuration (headless, viewport, etc.)
+- `.playwright-cli/` - Playwright CLI daemon artifacts (screenshots, snapshots) - gitignored
 - `CLAUDE.md` - Stays at project root (SDK convention)
 - `app_spec.txt` - Root copy for agent template compatibility

@@ -408,44 +411,23 @@ Run coding agents via Google Cloud Vertex AI:
   CLAUDE_CODE_USE_VERTEX=1
   CLOUD_ML_REGION=us-east5
   ANTHROPIC_VERTEX_PROJECT_ID=your-gcp-project-id
-   ANTHROPIC_DEFAULT_OPUS_MODEL=claude-opus-4-5@20251101
+   ANTHROPIC_DEFAULT_OPUS_MODEL=claude-opus-4-6
   ANTHROPIC_DEFAULT_SONNET_MODEL=claude-sonnet-4-5@20250929
   ANTHROPIC_DEFAULT_HAIKU_MODEL=claude-3-5-haiku@20241022
   ```

 **Note:** Use `@` instead of `-` in model names for Vertex AI.

-### Ollama Local Models (Optional)
+### Alternative API Providers (GLM, Ollama, Kimi, Custom)

-Run coding agents using local models via Ollama v0.14.0+:
+Alternative providers are configured via the **Settings UI** (gear icon > API Provider section). Select a provider, set the base URL, auth token, and model — no `.env` changes needed.

-1. Install Ollama: https://ollama.com
-2. Start Ollama: `ollama serve`
-3. Pull a coding model: `ollama pull qwen3-coder`
-4. Configure `.env`:
-   ```
-   ANTHROPIC_BASE_URL=http://localhost:11434
-   ANTHROPIC_AUTH_TOKEN=ollama
-   API_TIMEOUT_MS=3000000
-   ANTHROPIC_DEFAULT_SONNET_MODEL=qwen3-coder
-   ANTHROPIC_DEFAULT_OPUS_MODEL=qwen3-coder
-   ANTHROPIC_DEFAULT_HAIKU_MODEL=qwen3-coder
-   ```
-5. Run AutoForge normally - it will use your local Ollama models
+**Available providers:** Claude (default), GLM (Zhipu AI), Ollama (local models), Kimi (Moonshot), Custom

-**Recommended coding models:**
- `qwen3-coder` - Good balance of speed and capability
- `deepseek-coder-v2` - Strong coding performance
- `codellama` - Meta's code-focused model
-
-**Model tier mapping:**
- Use the same model for all tiers, or map different models per capability level
- Larger models (70B+) work best for Opus tier
- Smaller models (7B-20B) work well for Haiku tier
-
-**Known limitations:**
- Smaller context windows than Claude (model-dependent)
- Extended context beta disabled (not supported by Ollama)
+**Ollama notes:**
+- Requires Ollama v0.14.0+ with Anthropic API compatibility
+- Install: https://ollama.com → `ollama serve` → `ollama pull qwen3-coder`
+- Recommended models: `qwen3-coder`, `deepseek-coder-v2`, `codellama`
 - Performance depends on local hardware (GPU recommended)

 ## Claude Code Integration
@@ -466,6 +448,7 @@ Run coding agents using local models via Ollama v0.14.0+:
 **Skills** (`.claude/skills/`):
 - `frontend-design` - Distinctive, production-grade UI design
 - `gsd-to-autoforge-spec` - Convert GSD codebase mapping to AutoForge app_spec format
+- `playwright-cli` - Browser automation via Playwright CLI (copied to each project)

 **Other:**
 - `.claude/templates/` - Prompt templates copied to new projects
@@ -500,7 +483,7 @@ When running with `--parallel`, the orchestrator:
 1. Spawns multiple Claude agents as subprocesses (up to `--max-concurrency`)
 2. Each agent claims features atomically via `feature_claim_and_get`
 3. Features blocked by unmet dependencies are skipped
-4. Browser contexts are isolated per agent using `--isolated` flag
+4. Browser sessions are isolated per agent via `PLAYWRIGHT_CLI_SESSION` environment variable
 5. AgentTracker parses output and emits `agent_update` messages for UI

 ### Process Limits (Parallel Mode)
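
Note on item 4 of the parallel-mode list above: the diff names the `PLAYWRIGHT_CLI_SESSION` environment variable but does not show how the orchestrator picks a value for it. The sketch below is one possible shape, not the PR's implementation. It assumes the session name reuses the agent ID scheme that this same PR removes from `agent.py` (`testing-<pid>`, `batch-<first id>`, `feature-<id>`). Both `session_name` and `build_agent_env` are hypothetical helpers.

```python
from __future__ import annotations

from typing import Mapping


def session_name(agent_type: str, feature_ids: list[int], pid: int) -> str | None:
    """Derive a per-agent session name (hypothetical; mirrors the removed agent_id logic)."""
    if agent_type == "testing":
        return f"testing-{pid}"
    if len(feature_ids) > 1:
        return f"batch-{feature_ids[0]}"
    if feature_ids:
        return f"feature-{feature_ids[0]}"
    return None


def build_agent_env(base_env: Mapping[str, str], session: str | None) -> dict[str, str]:
    """Copy the parent environment and pin the browser session for one agent subprocess."""
    env = dict(base_env)
    if session:
        # Assumption: one distinct value per agent keeps browser sessions separate.
        env["PLAYWRIGHT_CLI_SESSION"] = session
    return env
```

The resulting dict could then be passed as `env=` when spawning each agent subprocess, so two agents never share a session name.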
--- a/README.md
+++ b/README.md
@@ -6,9 +6,9 @@ A long-running autonomous coding agent powered by the Claude Agent SDK. This too

 ## Video Tutorial

-[![Watch the tutorial](https://img.youtube.com/vi/lGWFlpffWk4/hqdefault.jpg)](https://youtu.be/lGWFlpffWk4)
+[![Watch the tutorial](https://img.youtube.com/vi/nKiPOxDpcJY/hqdefault.jpg)](https://youtu.be/nKiPOxDpcJY)

-> **[Watch the setup and usage guide →](https://youtu.be/lGWFlpffWk4)**
+> **[Watch the setup and usage guide →](https://youtu.be/nKiPOxDpcJY)**

 ---

@@ -326,37 +326,13 @@ When test progress increases, the agent sends:
 }
 ```

-### Using GLM Models (Alternative to Claude)
+### Alternative API Providers (GLM, Ollama, Kimi, Custom)

-Add these variables to your `.env` file to use Zhipu AI's GLM models:
+Alternative providers are configured via the **Settings UI** (gear icon > API Provider). Select your provider, set the base URL, auth token, and model directly in the UI — no `.env` changes needed.

-```bash
-ANTHROPIC_BASE_URL=https://api.z.ai/api/anthropic
-ANTHROPIC_AUTH_TOKEN=your-zhipu-api-key
-API_TIMEOUT_MS=3000000
-ANTHROPIC_DEFAULT_SONNET_MODEL=glm-4.7
-ANTHROPIC_DEFAULT_OPUS_MODEL=glm-4.7
-ANTHROPIC_DEFAULT_HAIKU_MODEL=glm-4.5-air
-```
+Available providers: **Claude** (default), **GLM** (Zhipu AI), **Ollama** (local models), **Kimi** (Moonshot), **Custom**

-This routes AutoForge's API requests through Zhipu's Claude-compatible API, allowing you to use GLM-4.7 and other models. **This only affects AutoForge** - your global Claude Code settings remain unchanged.
-
-Get an API key at: https://z.ai/subscribe
-
-### Using Ollama Local Models
-
-Add these variables to your `.env` file to run agents with local models via Ollama v0.14.0+:
-
-```bash
-ANTHROPIC_BASE_URL=http://localhost:11434
-ANTHROPIC_AUTH_TOKEN=ollama
-API_TIMEOUT_MS=3000000
-ANTHROPIC_DEFAULT_SONNET_MODEL=qwen3-coder
-ANTHROPIC_DEFAULT_OPUS_MODEL=qwen3-coder
-ANTHROPIC_DEFAULT_HAIKU_MODEL=qwen3-coder
-```
-
-See the [CLAUDE.md](CLAUDE.md) for recommended models and known limitations.
+For Ollama, install [Ollama v0.14.0+](https://ollama.com), run `ollama serve`, and pull a coding model (e.g., `ollama pull qwen3-coder`). Then select "Ollama" in the Settings UI.

 ### Using Vertex AI

@@ -366,7 +342,7 @@ Add these variables to your `.env` file to run agents via Google Cloud Vertex AI
 CLAUDE_CODE_USE_VERTEX=1
 CLOUD_ML_REGION=us-east5
 ANTHROPIC_VERTEX_PROJECT_ID=your-gcp-project-id
-ANTHROPIC_DEFAULT_OPUS_MODEL=claude-opus-4-5@20251101
+ANTHROPIC_DEFAULT_OPUS_MODEL=claude-opus-4-6
 ANTHROPIC_DEFAULT_SONNET_MODEL=claude-sonnet-4-5@20250929
 ANTHROPIC_DEFAULT_HAIKU_MODEL=claude-3-5-haiku@20241022
 ```
--- a/agent.py
+++ b/agent.py
@@ -222,7 +222,7 @@ async def run_autonomous_agent(
        # Check if all features are already complete (before starting a new session)
        # Skip this check if running as initializer (needs to create features first)
        if not is_initializer and iteration == 1:
-            passing, in_progress, total = count_passing_tests(project_dir)
+            passing, in_progress, total, _nhi = count_passing_tests(project_dir)
            if total > 0 and passing == total:
                print("\n" + "=" * 70)
                print("  ALL FEATURES ALREADY COMPLETE!")
@@ -240,17 +240,7 @@ async def run_autonomous_agent(
        print_session_header(iteration, is_initializer)

        # Create client (fresh context)
-        # Pass agent_id for browser isolation in multi-agent scenarios
-        import os
-        if agent_type == "testing":
-            agent_id = f"testing-{os.getpid()}"  # Unique ID for testing agents
-        elif feature_ids and len(feature_ids) > 1:
-            agent_id = f"batch-{feature_ids[0]}"
-        elif feature_id:
-            agent_id = f"feature-{feature_id}"
-        else:
-            agent_id = None
-        client = create_client(project_dir, model, yolo_mode=yolo_mode, agent_id=agent_id, agent_type=agent_type)
+        client = create_client(project_dir, model, yolo_mode=yolo_mode, agent_type=agent_type)

        # Choose prompt based on agent type
        if agent_type == "initializer":
@@ -358,7 +348,7 @@ async def run_autonomous_agent(
            print_progress_summary(project_dir)

            # Check if all features are complete - exit gracefully if done
-            passing, in_progress, total = count_passing_tests(project_dir)
+            passing, in_progress, total, _nhi = count_passing_tests(project_dir)
            if total > 0 and passing == total:
                print("\n" + "=" * 70)
                print("  ALL FEATURES COMPLETE!")
--- a/api/database.py
+++ b/api/database.py
@@ -43,10 +43,10 @@ class Feature(Base):

    __tablename__ = "features"

-    # Composite index for common status query pattern (passes, in_progress)
+    # Composite index for common status query pattern (passes, in_progress, needs_human_input)
    # Used by feature_get_stats, get_ready_features, and other status queries
    __table_args__ = (
-        Index('ix_feature_status', 'passes', 'in_progress'),
+        Index('ix_feature_status', 'passes', 'in_progress', 'needs_human_input'),
    )

    id = Column(Integer, primary_key=True, index=True)
@@ -61,6 +61,11 @@ class Feature(Base):
    # NULL/empty = no dependencies (backwards compatible)
    dependencies = Column(JSON, nullable=True, default=None)

+    # Human input: agent can request structured input from a human
+    needs_human_input = Column(Boolean, nullable=False, default=False, index=True)
+    human_input_request = Column(JSON, nullable=True, default=None)   # Agent's structured request
+    human_input_response = Column(JSON, nullable=True, default=None)  # Human's response
+
    def to_dict(self) -> dict:
        """Convert feature to dictionary for JSON serialization."""
        return {
@@ -75,6 +80,10 @@ class Feature(Base):
            "in_progress": self.in_progress if self.in_progress is not None else False,
            # Dependencies: NULL/empty treated as empty list for backwards compat
            "dependencies": self.dependencies if self.dependencies else [],
+            # Human input fields
+            "needs_human_input": self.needs_human_input if self.needs_human_input is not None else False,
+            "human_input_request": self.human_input_request,
+            "human_input_response": self.human_input_response,
        }

    def get_dependencies_safe(self) -> list[int]:
@@ -302,6 +311,21 @@ def _is_network_path(path: Path) -> bool:
    return False


+def _migrate_add_human_input_columns(engine) -> None:
+    """Add human input columns to existing databases that don't have them."""
+    with engine.connect() as conn:
+        result = conn.execute(text("PRAGMA table_info(features)"))
+        columns = [row[1] for row in result.fetchall()]
+
+        if "needs_human_input" not in columns:
+            conn.execute(text("ALTER TABLE features ADD COLUMN needs_human_input BOOLEAN DEFAULT 0"))
+        if "human_input_request" not in columns:
+            conn.execute(text("ALTER TABLE features ADD COLUMN human_input_request TEXT DEFAULT NULL"))
+        if "human_input_response" not in columns:
+            conn.execute(text("ALTER TABLE features ADD COLUMN human_input_response TEXT DEFAULT NULL"))
+        conn.commit()
+
+
 def _migrate_add_schedules_tables(engine) -> None:
    """Create schedules and schedule_overrides tables if they don't exist."""
    from sqlalchemy import inspect
@@ -425,6 +449,7 @@ def create_database(project_dir: Path) -> tuple:
    _migrate_fix_null_boolean_fields(engine)
    _migrate_add_dependencies_column(engine)
    _migrate_add_testing_columns(engine)
+    _migrate_add_human_input_columns(engine)

    # Migrate to add schedules tables
    _migrate_add_schedules_tables(engine)
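
Note on `_migrate_add_human_input_columns` above: the migration is safe to run on every startup because it inspects `PRAGMA table_info` before each `ALTER TABLE`. The sketch below reproduces that check-then-alter pattern with the standard-library `sqlite3` module instead of SQLAlchemy, so it can be run in isolation. The helper names `add_missing_columns` and `demo_migration` and the minimal `features` table are illustrative assumptions; only the three column definitions are taken from the diff.

```python
import sqlite3

# Column definitions copied from the ALTER TABLE statements in the diff.
HUMAN_INPUT_COLUMNS = {
    "needs_human_input": "BOOLEAN DEFAULT 0",
    "human_input_request": "TEXT DEFAULT NULL",
    "human_input_response": "TEXT DEFAULT NULL",
}


def add_missing_columns(conn: sqlite3.Connection) -> list[str]:
    """Add any human-input columns the features table lacks; return the names added."""
    # PRAGMA table_info yields one row per column; index 1 holds the column name.
    existing = {row[1] for row in conn.execute("PRAGMA table_info(features)")}
    added = []
    for name, ddl in HUMAN_INPUT_COLUMNS.items():
        if name not in existing:
            conn.execute(f"ALTER TABLE features ADD COLUMN {name} {ddl}")
            added.append(name)
    conn.commit()
    return added


def demo_migration() -> tuple[list[str], list[str], tuple]:
    """Run the migration twice on a pre-existing table to show it is idempotent."""
    conn = sqlite3.connect(":memory:")
    # Stand-in for an older database created before this PR (assumed schema).
    conn.execute("CREATE TABLE features (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("INSERT INTO features (name) VALUES ('login form')")
    first = add_missing_columns(conn)
    second = add_missing_columns(conn)
    row = conn.execute(
        "SELECT needs_human_input, human_input_request FROM features"
    ).fetchone()
    conn.close()
    return first, second, row
```

The first call adds all three columns, the second adds none, and rows that existed before the migration read back the declared defaults.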
--- a/autoforge_paths.py
+++ b/autoforge_paths.py
@@ -39,10 +39,12 @@ assistant.db-wal
 assistant.db-shm
 .agent.lock
 .devserver.lock
+.pause_drain
 .claude_settings.json
 .claude_assistant_settings.json
 .claude_settings.expand.*.json
 .progress_cache
+.migration_version
 """


@@ -145,6 +147,15 @@ def get_claude_assistant_settings_path(project_dir: Path) -> Path:
    return _resolve_path(project_dir, ".claude_assistant_settings.json")


+def get_pause_drain_path(project_dir: Path) -> Path:
+    """Return the path to the ``.pause_drain`` signal file.
+
+    This file is created to request a graceful pause (drain mode).
+    Always uses the new location since it's a transient signal file.
+    """
+    return project_dir / ".autoforge" / ".pause_drain"
+
+
 def get_progress_cache_path(project_dir: Path) -> Path:
    """Resolve the path to ``.progress_cache``."""
    return _resolve_path(project_dir, ".progress_cache")
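
Note on `get_pause_drain_path` above: the docstring says the file is created to request a graceful pause, but this part of the diff does not show the code that writes or polls it. The sketch below shows how such a signal file is typically used. `request_pause`, `pause_requested`, and `clear_pause` are hypothetical helpers, not functions from the PR; only the path layout is taken from the diff.

```python
import tempfile
from pathlib import Path


def get_pause_drain_path(project_dir: Path) -> Path:
    """Same path layout as the function added in the diff."""
    return project_dir / ".autoforge" / ".pause_drain"


def request_pause(project_dir: Path) -> None:
    """Create the signal file (hypothetical helper). Its existence is the whole message."""
    path = get_pause_drain_path(project_dir)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.touch()


def pause_requested(project_dir: Path) -> bool:
    """Check for the signal, e.g. before an orchestrator hands out new work (hypothetical)."""
    return get_pause_drain_path(project_dir).exists()


def clear_pause(project_dir: Path) -> None:
    """Remove the signal once the pause is lifted; a missing file is not an error."""
    get_pause_drain_path(project_dir).unlink(missing_ok=True)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        project = Path(tmp)
        request_pause(project)
        print(pause_requested(project))
        clear_pause(project)
        print(pause_requested(project))
```

Because the file carries no content, creating and removing it needs no locking beyond what the filesystem already provides.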
--- a/autonomous_agent_demo.py
+++ b/autonomous_agent_demo.py
@@ -44,8 +44,10 @@ from dotenv import load_dotenv
 # IMPORTANT: Must be called BEFORE importing other modules that read env vars at load time
 load_dotenv()

+import os
+
 from agent import run_autonomous_agent
-from registry import DEFAULT_MODEL, get_project_path
+from registry import DEFAULT_MODEL, get_effective_sdk_env, get_project_path


 def parse_args() -> argparse.Namespace:
@@ -195,6 +197,14 @@ def main() -> None:
    # Note: Authentication is handled by start.bat/start.sh before this script runs.
    # The Claude SDK auto-detects credentials from ~/.claude/.credentials.json

+    # Apply UI-configured provider settings to this process's environment.
+    # This ensures CLI-launched agents respect Settings UI provider config (GLM, Ollama, etc.).
+    # Uses setdefault so explicit env vars / .env file take precedence.
+    sdk_overrides = get_effective_sdk_env()
+    for key, value in sdk_overrides.items():
+        if value:  # Only set non-empty values (empty values are used to clear conflicts)
+            os.environ.setdefault(key, value)
+
    # Handle deprecated --parallel flag
    if args.parallel is not None:
        print("WARNING: --parallel is deprecated. Use --concurrency instead.", flush=True)
@@ -227,6 +237,12 @@ def main() -> None:
    if migrated:
        print(f"Migrated project files to .autoforge/: {', '.join(migrated)}", flush=True)

+    # Migrate project to current AutoForge version (idempotent, safe)
+    from prompts import migrate_project_to_current
+    version_migrated = migrate_project_to_current(project_dir)
+    if version_migrated:
+        print(f"Upgraded project: {', '.join(version_migrated)}", flush=True)
+
    # Parse batch testing feature IDs (comma-separated string -> list[int])
    testing_feature_ids: list[int] | None = None
    if args.testing_feature_ids:
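
Note on the provider-settings hunk above: the precedence rule depends on `setdefault`, which writes a key only when it is absent, plus the `if value` guard that skips empty overrides. The sketch below isolates that loop on a plain dict so the behaviour can be checked without touching the real process environment. `apply_sdk_overrides` is a hypothetical wrapper, and the sample values are illustrative, borrowed from `.env.example`.

```python
from typing import Mapping, MutableMapping


def apply_sdk_overrides(
    env: MutableMapping[str, str], overrides: Mapping[str, str]
) -> MutableMapping[str, str]:
    """Apply UI-configured values without clobbering anything already set in env."""
    for key, value in overrides.items():
        if value:  # empty strings are skipped, as in the diff
            env.setdefault(key, value)  # existing keys win
    return env


if __name__ == "__main__":
    # Pretend the user exported a base URL explicitly (illustrative values).
    env = {"ANTHROPIC_BASE_URL": "http://localhost:11434"}
    ui_settings = {
        "ANTHROPIC_BASE_URL": "https://api.z.ai/api/anthropic",
        "ANTHROPIC_DEFAULT_OPUS_MODEL": "glm-4.7",
        "ANTHROPIC_AUTH_TOKEN": "",
    }
    print(apply_sdk_overrides(env, ui_settings))
```

`os.environ` is itself a mutable mapping, so the same function works on it unchanged.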
--- a/bin/autoforge.js
+++ b/bin/autoforge.js
--- a/client.py
+++ b/client.py
@@ -16,22 +16,11 @@ from claude_agent_sdk import ClaudeAgentOptions, ClaudeSDKClient
 from claude_agent_sdk.types import HookContext, HookInput, HookMatcher, SyncHookJSONOutput
 from dotenv import load_dotenv

-from env_constants import API_ENV_VARS
 from security import SENSITIVE_DIRECTORIES, bash_security_hook

 # Load environment variables from .env file if present
 load_dotenv()

-# Default Playwright headless mode - can be overridden via PLAYWRIGHT_HEADLESS env var
-# When True, browser runs invisibly in background (default - saves CPU)
-# When False, browser window is visible (useful for monitoring agent progress)
-DEFAULT_PLAYWRIGHT_HEADLESS = True
-
-# Default browser for Playwright - can be overridden via PLAYWRIGHT_BROWSER env var
-# Options: chrome, firefox, webkit, msedge
-# Firefox is recommended for lower CPU usage
-DEFAULT_PLAYWRIGHT_BROWSER = "firefox"
-
 # Extra read paths for cross-project file access (read-only)
 # Set EXTRA_READ_PATHS environment variable with comma-separated absolute paths
 # Example: EXTRA_READ_PATHS=/Volumes/Data/dev,/Users/shared/libs
@@ -42,12 +31,14 @@ EXTRA_READ_PATHS_VAR = "EXTRA_READ_PATHS"
 # this blocklist and the filesystem browser API share a single source of truth.
 EXTRA_READ_PATHS_BLOCKLIST = SENSITIVE_DIRECTORIES

+
 def convert_model_for_vertex(model: str) -> str:
    """
    Convert model name format for Vertex AI compatibility.

-    Vertex AI uses @ to separate model name from version (e.g., claude-opus-4-5@20251101)
-    while the Anthropic API uses - (e.g., claude-opus-4-5-20251101).
+    Vertex AI uses @ to separate model name from version (e.g., claude-sonnet-4-5@20250929)
+    while the Anthropic API uses - (e.g., claude-sonnet-4-5-20250929).
+    Models without a date suffix (e.g., claude-opus-4-6) pass through unchanged.

    Args:
        model: Model name in Anthropic format (with hyphens)
@@ -61,7 +52,7 @@ def convert_model_for_vertex(model: str) -> str:
        return model

    # Pattern: claude-{name}-{version}-{date} -> claude-{name}-{version}@{date}
-    # Example: claude-opus-4-5-20251101 -> claude-opus-4-5@20251101
+    # Example: claude-sonnet-4-5-20250929 -> claude-sonnet-4-5@20250929
    # The date is always 8 digits at the end
    match = re.match(r'^(claude-.+)-(\d{8})$', model)
    if match:
@@ -72,43 +63,6 @@ def convert_model_for_vertex(model: str) -> str:
    return model


-def get_playwright_headless() -> bool:
-    """
-    Get the Playwright headless mode setting.
-
-    Reads from PLAYWRIGHT_HEADLESS environment variable, defaults to True.
-    Returns True for headless mode (invisible browser), False for visible browser.
-    """
-    value = os.getenv("PLAYWRIGHT_HEADLESS", str(DEFAULT_PLAYWRIGHT_HEADLESS).lower()).strip().lower()
-    truthy = {"true", "1", "yes", "on"}
-    falsy = {"false", "0", "no", "off"}
-    if value not in truthy | falsy:
-        print(f"   - Warning: Invalid PLAYWRIGHT_HEADLESS='{value}', defaulting to {DEFAULT_PLAYWRIGHT_HEADLESS}")
-        return DEFAULT_PLAYWRIGHT_HEADLESS
-    return value in truthy
-
-
-# Valid browsers supported by Playwright MCP
-VALID_PLAYWRIGHT_BROWSERS = {"chrome", "firefox", "webkit", "msedge"}
-
-
-def get_playwright_browser() -> str:
-    """
-    Get the browser to use for Playwright.
-
-    Reads from PLAYWRIGHT_BROWSER environment variable, defaults to firefox.
-    Options: chrome, firefox, webkit, msedge
-    Firefox is recommended for lower CPU usage.
-    """
-    value = os.getenv("PLAYWRIGHT_BROWSER", DEFAULT_PLAYWRIGHT_BROWSER).strip().lower()
-    if value not in VALID_PLAYWRIGHT_BROWSERS:
-        print(f"   - Warning: Invalid PLAYWRIGHT_BROWSER='{value}', "
-              f"valid options: {', '.join(sorted(VALID_PLAYWRIGHT_BROWSERS))}. "
-              f"Defaulting to {DEFAULT_PLAYWRIGHT_BROWSER}")
-        return DEFAULT_PLAYWRIGHT_BROWSER
-    return value
-
-
 def get_extra_read_paths() -> list[Path]:
    """
    Get extra read-only paths from EXTRA_READ_PATHS environment variable.
@@ -187,7 +141,6 @@ def get_extra_read_paths() -> list[Path]:
 # overhead and preventing agents from calling tools meant for other roles.
 #
 # Tools intentionally omitted from ALL agent lists (UI/orchestrator only):
-#   feature_get_ready, feature_get_blocked, feature_get_graph,
 #   feature_remove_dependency
 #
 # The ghost tool "feature_release_testing" was removed entirely -- it was
@@ -197,6 +150,9 @@ CODING_AGENT_TOOLS = [
    "mcp__features__feature_get_stats",
    "mcp__features__feature_get_by_id",
    "mcp__features__feature_get_summary",
+    "mcp__features__feature_get_ready",
+    "mcp__features__feature_get_blocked",
+    "mcp__features__feature_get_graph",
    "mcp__features__feature_claim_and_get",
    "mcp__features__feature_mark_in_progress",
    "mcp__features__feature_mark_passing",
@@ -209,12 +165,18 @@ TESTING_AGENT_TOOLS = [
    "mcp__features__feature_get_stats",
    "mcp__features__feature_get_by_id",
    "mcp__features__feature_get_summary",
+    "mcp__features__feature_get_ready",
+    "mcp__features__feature_get_blocked",
+    "mcp__features__feature_get_graph",
    "mcp__features__feature_mark_passing",
    "mcp__features__feature_mark_failing",
 ]

 INITIALIZER_AGENT_TOOLS = [
    "mcp__features__feature_get_stats",
+    "mcp__features__feature_get_ready",
+    "mcp__features__feature_get_blocked",
+    "mcp__features__feature_get_graph",
    "mcp__features__feature_create_bulk",
    "mcp__features__feature_create",
    "mcp__features__feature_add_dependency",
@@ -228,41 +190,6 @@ ALL_FEATURE_MCP_TOOLS = sorted(
    set(CODING_AGENT_TOOLS) | set(TESTING_AGENT_TOOLS) | set(INITIALIZER_AGENT_TOOLS)
 )

-# Playwright MCP tools for browser automation.
-# Full set of tools for comprehensive UI testing including drag-and-drop,
-# hover menus, file uploads, tab management, etc.
-PLAYWRIGHT_TOOLS = [
-    # Core navigation & screenshots
-    "mcp__playwright__browser_navigate",
-    "mcp__playwright__browser_navigate_back",
-    "mcp__playwright__browser_take_screenshot",
-    "mcp__playwright__browser_snapshot",
-
-    # Element interaction
-    "mcp__playwright__browser_click",
-    "mcp__playwright__browser_type",
-    "mcp__playwright__browser_fill_form",
-    "mcp__playwright__browser_select_option",
-    "mcp__playwright__browser_press_key",
-    "mcp__playwright__browser_drag",
-    "mcp__playwright__browser_hover",
-    "mcp__playwright__browser_file_upload",
-
-    # JavaScript & debugging
-    "mcp__playwright__browser_evaluate",
-    # "mcp__playwright__browser_run_code",  # REMOVED - causes Playwright MCP server crash
-    "mcp__playwright__browser_console_messages",
-    "mcp__playwright__browser_network_requests",
-
-    # Browser management
-    "mcp__playwright__browser_resize",
-    "mcp__playwright__browser_wait_for",
-    "mcp__playwright__browser_handle_dialog",
-    "mcp__playwright__browser_install",
-    "mcp__playwright__browser_close",
-    "mcp__playwright__browser_tabs",
-]
-
 # Built-in tools available to agents.
 # WebFetch and WebSearch are included so coding agents can look up current
 # documentation for frameworks and libraries they are implementing.
@@ -282,7 +209,6 @@ def create_client(
    project_dir: Path,
    model: str,
    yolo_mode: bool = False,
-    agent_id: str | None = None,
    agent_type: str = "coding",
 ):
    """
@@ -291,9 +217,7 @@ def create_client(
    Args:
        project_dir: Directory for the project
        model: Claude model to use
-        yolo_mode: If True, skip Playwright MCP server for rapid prototyping
-        agent_id: Optional unique identifier for browser isolation in parallel mode.
-                  When provided, each agent gets its own browser profile.
+        yolo_mode: If True, skip browser testing for rapid prototyping
        agent_type: One of "coding", "testing", or "initializer". Controls which
                    MCP tools are exposed and the max_turns limit.

@@ -327,11 +251,8 @@ def create_client(
    }
    max_turns = max_turns_map.get(agent_type, 300)

-    # Build allowed tools list based on mode and agent type.
-    # In YOLO mode, exclude Playwright tools for faster prototyping.
+    # Build allowed tools list based on agent type.
    allowed_tools = [*BUILTIN_TOOLS, *feature_tools]
-    if not yolo_mode:
-        allowed_tools.extend(PLAYWRIGHT_TOOLS)

    # Build permissions list.
    # We permit ALL feature MCP tools at the security layer (so the MCP server
@@ -363,10 +284,6 @@ def create_client(
        permissions_list.append(f"Glob({path}/**)")
        permissions_list.append(f"Grep({path}/**)")

-    if not yolo_mode:
-        # Allow Playwright MCP tools for browser automation (standard mode only)
-        permissions_list.extend(PLAYWRIGHT_TOOLS)
-
    # Create comprehensive security settings
    # Note: Using relative paths ("./**") restricts access to project directory
    # since cwd is set to project_dir
@@ -395,9 +312,9 @@ def create_client(
        print(f"   - Extra read paths (validated): {', '.join(str(p) for p in extra_read_paths)}")
    print("   - Bash commands restricted to allowlist (see security.py)")
    if yolo_mode:
-        print("   - MCP servers: features (database) - YOLO MODE (no Playwright)")
+        print("   - MCP servers: features (database) - YOLO MODE (no browser testing)")
    else:
-        print("   - MCP servers: playwright (browser), features (database)")
+        print("   - MCP servers: features (database)")
    print("   - Project settings enabled (skills, commands, CLAUDE.md)")
    print()

@@ -421,48 +338,19 @@ def create_client(
            },
        },
    }
-    if not yolo_mode:
-        # Include Playwright MCP server for browser automation (standard mode only)
-        # Browser and headless mode configurable via environment variables
-        browser = get_playwright_browser()
-        playwright_args = [
-            "@playwright/mcp@latest",
-            "--viewport-size", "1280x720",
-            "--browser", browser,
-        ]
-        if get_playwright_headless():
-            playwright_args.append("--headless")
-        print(f"   - Browser: {browser} (headless={get_playwright_headless()})")
-
-        # Browser isolation for parallel execution
-        # Each agent gets its own isolated browser context to prevent tab conflicts
-        if agent_id:
-            # Use --isolated for ephemeral browser context
-            # This creates a fresh, isolated context without persistent state
-            # Note: --isolated and --user-data-dir are mutually exclusive
-            playwright_args.append("--isolated")
-            print(f"   - Browser isolation enabled for agent: {agent_id}")
-
-        mcp_servers["playwright"] = {
-            "command": "npx",
-            "args": playwright_args,
-        }
-
    # Build environment overrides for API endpoint configuration
-    # These override system env vars for the Claude CLI subprocess,
-    # allowing AutoForge to use alternative APIs (e.g., GLM) without
-    # affecting the user's global Claude Code settings
-    sdk_env = {}
-    for var in API_ENV_VARS:
-        value = os.getenv(var)
-        if value:
-            sdk_env[var] = value
+    # Uses get_effective_sdk_env(), which reads provider settings from the database
+    # so that UI-configured alternative providers (GLM, Ollama, Kimi, Custom) propagate
+    # correctly to the Claude CLI subprocess.
+    from registry import get_effective_sdk_env
+    sdk_env = get_effective_sdk_env()

    # Detect alternative API mode (Ollama, GLM, or Vertex AI)
    base_url = sdk_env.get("ANTHROPIC_BASE_URL", "")
    is_vertex = sdk_env.get("CLAUDE_CODE_USE_VERTEX") == "1"
    is_alternative_api = bool(base_url) or is_vertex
    is_ollama = "localhost:11434" in base_url or "127.0.0.1:11434" in base_url
+    is_azure = "services.ai.azure.com" in base_url
    model = convert_model_for_vertex(model)
    if sdk_env:
        print(f"   - API overrides: {', '.join(sdk_env.keys())}")
@@ -472,8 +360,10 @@ def create_client(
            print(f"   - Vertex AI Mode: Using GCP project '{project_id}' with model '{model}' in region '{region}'")
        elif is_ollama:
            print("   - Ollama Mode: Using local models")
+        elif is_azure:
+            print(f"   - Azure Mode: Using {base_url}")
        elif "ANTHROPIC_BASE_URL" in sdk_env:
-            print(f"   - GLM Mode: Using {sdk_env['ANTHROPIC_BASE_URL']}")
+            print(f"   - Alternative API: Using {sdk_env['ANTHROPIC_BASE_URL']}")

    # Create a wrapper for bash_security_hook that passes project_dir via context
    async def bash_hook_with_context(input_data, tool_use_id=None, context=None):
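Review note on `convert_model_for_vertex` in client.py above: the docstring now says that names without a date suffix pass through unchanged. The following is a minimal sketch of that rule only, reusing the regex from the diff. The helper name `to_vertex_name` is hypothetical, and the early-return check that precedes the regex in the real function is omitted because it is not visible in this hunk.

```python
import re


def to_vertex_name(model: str) -> str:
    """Hypothetical helper: replace a trailing '-YYYYMMDD' with '@YYYYMMDD'."""
    # Same pattern as in the diff: the date is assumed to be 8 digits at the end.
    match = re.match(r'^(claude-.+)-(\d{8})$', model)
    if match:
        return f"{match.group(1)}@{match.group(2)}"
    # No date suffix (e.g. claude-opus-4-6): return the name unchanged.
    return model
```

With this sketch, `to_vertex_name("claude-sonnet-4-5-20250929")` yields the `@`-separated form, while `to_vertex_name("claude-opus-4-6")` is returned as-is.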
--- a/env_constants.py
+++ b/env_constants.py
@@ -15,6 +15,7 @@ API_ENV_VARS: list[str] = [
    # Core API configuration
    "ANTHROPIC_BASE_URL",              # Custom API endpoint (e.g., https://api.z.ai/api/anthropic)
    "ANTHROPIC_AUTH_TOKEN",            # API authentication token
+    "ANTHROPIC_API_KEY",               # API key (used by Kimi and other providers)
    "API_TIMEOUT_MS",                  # Request timeout in milliseconds
    # Model tier overrides
    "ANTHROPIC_DEFAULT_SONNET_MODEL",  # Model override for Sonnet
--- a/lib/cli.js
+++ b/lib/cli.js
@@ -517,6 +517,41 @@ function killProcess(pid) {
  }
 }

+// ---------------------------------------------------------------------------
+// Playwright CLI
+// ---------------------------------------------------------------------------
+
+/**
+ * Ensure playwright-cli is available globally for browser automation.
+ * Returns true if available (already installed or freshly installed).
+ *
+ * @param {boolean} showProgress - If true, print install progress
+ */
+function ensurePlaywrightCli(showProgress) {
+  try {
+    execSync('playwright-cli --version', {
+      timeout: 10_000,
+      stdio: ['pipe', 'pipe', 'pipe'],
+    });
+    return true;
+  } catch {
+    // Not installed — try to install
+  }
+
+  if (showProgress) {
+    log('      Installing playwright-cli for browser automation...');
+  }
+  try {
+    execSync('npm install -g @playwright/cli', {
+      timeout: 120_000,
+      stdio: ['pipe', 'pipe', 'pipe'],
+    });
+    return true;
+  } catch {
+    return false;
+  }
+}
+
 // ---------------------------------------------------------------------------
 // CLI commands
 // ---------------------------------------------------------------------------
@@ -613,6 +648,14 @@ function startServer(opts) {
  }
  const wasAlreadyReady = ensureVenv(python, repair);

+  // Ensure playwright-cli for browser automation (quick check, installs once)
+  if (!ensurePlaywrightCli(!wasAlreadyReady)) {
+    log('');
+    log('  Note: playwright-cli not available (browser automation will be limited)');
+    log('  Install manually: npm install -g @playwright/cli');
+    log('');
+  }
+
  // Step 3: Config file
  const configCreated = ensureEnvFile();

--- a/mcp_server/feature_mcp.py
+++ b/mcp_server/feature_mcp.py
@@ -151,17 +151,20 @@ def feature_get_stats() -> str:
        result = session.query(
            func.count(Feature.id).label('total'),
            func.sum(case((Feature.passes == True, 1), else_=0)).label('passing'),
-            func.sum(case((Feature.in_progress == True, 1), else_=0)).label('in_progress')
+            func.sum(case((Feature.in_progress == True, 1), else_=0)).label('in_progress'),
+            func.sum(case((Feature.needs_human_input == True, 1), else_=0)).label('needs_human_input')
        ).first()

        total = result.total or 0
        passing = int(result.passing or 0)
        in_progress = int(result.in_progress or 0)
+        needs_human_input = int(result.needs_human_input or 0)
        percentage = round((passing / total) * 100, 1) if total > 0 else 0.0

        return json.dumps({
            "passing": passing,
            "in_progress": in_progress,
+            "needs_human_input": needs_human_input,
            "total": total,
            "percentage": percentage
        })
@@ -221,6 +224,7 @@ def feature_get_summary(
            "name": feature.name,
            "passes": feature.passes,
            "in_progress": feature.in_progress,
+            "needs_human_input": feature.needs_human_input if feature.needs_human_input is not None else False,
            "dependencies": feature.dependencies or []
        })
    finally:
@@ -401,11 +405,11 @@ def feature_mark_in_progress(
    """
    session = get_session()
    try:
-        # Atomic claim: only succeeds if feature is not already claimed or passing
+        # Atomic claim: only succeeds if feature is not already claimed, passing, or blocked for human input
        result = session.execute(text("""
            UPDATE features
            SET in_progress = 1
-            WHERE id = :id AND passes = 0 AND in_progress = 0
+            WHERE id = :id AND passes = 0 AND in_progress = 0 AND needs_human_input = 0
        """), {"id": feature_id})
        session.commit()

@@ -418,6 +422,8 @@ def feature_mark_in_progress(
                return json.dumps({"error": f"Feature with ID {feature_id} is already passing"})
            if feature.in_progress:
                return json.dumps({"error": f"Feature with ID {feature_id} is already in-progress"})
+            if getattr(feature, 'needs_human_input', False):
+                return json.dumps({"error": f"Feature with ID {feature_id} is blocked waiting for human input"})
            return json.dumps({"error": "Failed to mark feature in-progress for unknown reason"})

        # Fetch the claimed feature
@@ -455,11 +461,14 @@ def feature_claim_and_get(
        if feature.passes:
            return json.dumps({"error": f"Feature with ID {feature_id} is already passing"})

-        # Try atomic claim: only succeeds if not already claimed
+        if getattr(feature, 'needs_human_input', False):
+            return json.dumps({"error": f"Feature with ID {feature_id} is blocked waiting for human input"})
+
+        # Try atomic claim: only succeeds if not already claimed and not blocked for human input
        result = session.execute(text("""
            UPDATE features
            SET in_progress = 1
-            WHERE id = :id AND passes = 0 AND in_progress = 0
+            WHERE id = :id AND passes = 0 AND in_progress = 0 AND needs_human_input = 0
        """), {"id": feature_id})
        session.commit()

@@ -806,6 +815,8 @@ def feature_get_ready(
        for f in all_features:
            if f.passes or f.in_progress:
                continue
+            if getattr(f, 'needs_human_input', False):
+                continue
            deps = f.dependencies or []
            if all(dep_id in passing_ids for dep_id in deps):
                ready.append(f.to_dict())
@@ -888,6 +899,8 @@ def feature_get_graph() -> str:

            if f.passes:
                status = "done"
+            elif getattr(f, 'needs_human_input', False):
+                status = "needs_human_input"
            elif blocking:
                status = "blocked"
            elif f.in_progress:
@@ -984,5 +997,132 @@ def feature_set_dependencies(
        return json.dumps({"error": f"Failed to set dependencies: {str(e)}"})


+@mcp.tool()
+def feature_request_human_input(
+    feature_id: Annotated[int, Field(description="The ID of the feature that needs human input", ge=1)],
+    prompt: Annotated[str, Field(min_length=1, description="Explain what you need from the human and why")],
+    fields: Annotated[list[dict], Field(min_length=1, description="List of input fields to collect")]
+) -> str:
+    """Request structured input from a human for a feature that is blocked.
+
+    Use this ONLY when the feature genuinely cannot proceed without human intervention:
+    - Creating API keys or external accounts
+    - Choosing between design approaches that require human preference
+    - Configuring external services the agent cannot access
+    - Providing credentials or secrets
+
+    Do NOT use this for issues you can solve yourself (debugging, reading docs, etc.).
+
+    The feature will be moved out of in_progress and into a "needs human input" state.
+    Once the human provides their response, the feature returns to the pending queue
+    and will include the human's response when you pick it up again.
+
+    Args:
+        feature_id: The ID of the feature that needs human input
+        prompt: A clear explanation of what you need and why
+        fields: List of input fields, each with:
+            - id (str): Unique field identifier
+            - label (str): Human-readable label
+            - type (str): "text", "textarea", "select", or "boolean" (default: "text")
+            - required (bool): Whether the field is required (default: true)
+            - placeholder (str, optional): Placeholder text
+            - options (list): Required when type is "select": [{value, label}]
+
+    Returns:
+        JSON with success confirmation or error message
+    """
+    # Validate fields
+    VALID_FIELD_TYPES = {"text", "textarea", "select", "boolean"}
+    seen_ids: set[str] = set()
+    for i, field in enumerate(fields):
+        if "id" not in field or "label" not in field:
+            return json.dumps({"error": f"Field at index {i} missing required 'id' or 'label'"})
+        fid = field["id"]
+        flabel = field["label"]
+        if not isinstance(fid, str) or not fid.strip():
+            return json.dumps({"error": f"Field at index {i} has empty or invalid 'id'"})
+        if not isinstance(flabel, str) or not flabel.strip():
+            return json.dumps({"error": f"Field at index {i} has empty or invalid 'label'"})
+        if fid in seen_ids:
+            return json.dumps({"error": f"Duplicate field id '{fid}' at index {i}"})
+        seen_ids.add(fid)
+        ftype = field.get("type", "text")
+        if ftype not in VALID_FIELD_TYPES:
+            return json.dumps({"error": f"Field at index {i} has invalid type '{ftype}'. Must be one of: {', '.join(sorted(VALID_FIELD_TYPES))}"})
+        if ftype == "select" and not field.get("options"):
+            return json.dumps({"error": f"Field at index {i} is type 'select' but missing 'options' array"})
+
+    request_data = {
+        "prompt": prompt,
+        "fields": fields,
+    }
+
+    session = get_session()
+    try:
+        # Atomically set needs_human_input, clear in_progress, store request, clear previous response
+        result = session.execute(text("""
+            UPDATE features
+            SET needs_human_input = 1,
+                in_progress = 0,
+                human_input_request = :request,
+                human_input_response = NULL
+            WHERE id = :id AND passes = 0 AND in_progress = 1
+        """), {"id": feature_id, "request": json.dumps(request_data)})
+        session.commit()
+
+        if result.rowcount == 0:
+            feature = session.query(Feature).filter(Feature.id == feature_id).first()
+            if feature is None:
+                return json.dumps({"error": f"Feature with ID {feature_id} not found"})
+            if feature.passes:
+                return json.dumps({"error": f"Feature with ID {feature_id} is already passing"})
+            if not feature.in_progress:
+                return json.dumps({"error": f"Feature with ID {feature_id} is not in progress"})
+            return json.dumps({"error": "Failed to request human input for unknown reason"})
+
+        feature = session.query(Feature).filter(Feature.id == feature_id).first()
+        return json.dumps({
+            "success": True,
+            "feature_id": feature_id,
+            "name": feature.name,
+            "message": f"Feature '{feature.name}' is now blocked waiting for human input"
+        })
+    except Exception as e:
+        session.rollback()
+        return json.dumps({"error": f"Failed to request human input: {str(e)}"})
+    finally:
+        session.close()
+
+
+@mcp.tool()
+def ask_user(
+    questions: Annotated[list[dict], Field(description="List of questions to ask, each with question, header, options (list of {label, description}), and multiSelect (bool)")]
+) -> str:
+    """Ask the user structured questions with selectable options.
+
+    Use this when you need clarification or want to offer choices to the user.
+    Each question has a short header, the question text, and 2-4 clickable options.
+    The user's selections will be returned as your next message.
+
+    Args:
+        questions: List of questions, each with:
+            - question (str): The question to ask
+            - header (str): Short label (max 12 chars)
+            - options (list): Each with label (str) and description (str)
+            - multiSelect (bool): Allow multiple selections (default false)
+
+    Returns:
+        Acknowledgment that questions were presented to the user
+    """
+    # Validate input
+    for i, q in enumerate(questions):
+        if not all(key in q for key in ["question", "header", "options"]):
+            return json.dumps({"error": f"Question at index {i} missing required fields"})
+        if len(q["options"]) < 2 or len(q["options"]) > 4:
+            return json.dumps({"error": f"Question at index {i} must have 2-4 options"})
+
+    return "Questions presented to the user. Their response will arrive as your next message."
+
+
 if __name__ == "__main__":
    mcp.run()
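Review note on the claim and block transitions in feature_mcp.py above: both rely on a conditional `UPDATE` whose row count tells the caller whether the state change happened. The sketch below reproduces that pattern with the standard-library `sqlite3` module instead of the SQLAlchemy session used in the diff. The reduced table schema and the helper names `make_demo_db`, `claim`, and `request_human_input` are assumptions for illustration, not the project's actual code.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Create an in-memory table with only the columns the sketch needs (assumed schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE features ("
        " id INTEGER PRIMARY KEY,"
        " passes INTEGER NOT NULL DEFAULT 0,"
        " in_progress INTEGER NOT NULL DEFAULT 0,"
        " needs_human_input INTEGER NOT NULL DEFAULT 0,"
        " human_input_request TEXT,"
        " human_input_response TEXT)"
    )
    conn.execute("INSERT INTO features (id) VALUES (1)")
    conn.commit()
    return conn


def claim(conn: sqlite3.Connection, feature_id: int) -> bool:
    """Claim a feature; succeeds only if it is pending and not blocked."""
    cur = conn.execute(
        "UPDATE features SET in_progress = 1 "
        "WHERE id = ? AND passes = 0 AND in_progress = 0 AND needs_human_input = 0",
        (feature_id,),
    )
    conn.commit()
    # rowcount is 1 when the WHERE clause matched, 0 when the claim was refused.
    return cur.rowcount == 1


def request_human_input(conn: sqlite3.Connection, feature_id: int, request_json: str) -> bool:
    """Block a feature for human input; succeeds only if it is currently in progress."""
    cur = conn.execute(
        "UPDATE features SET needs_human_input = 1, in_progress = 0, "
        "human_input_request = ?, human_input_response = NULL "
        "WHERE id = ? AND passes = 0 AND in_progress = 1",
        (request_json, feature_id),
    )
    conn.commit()
    return cur.rowcount == 1
```

Because the guard lives in the `WHERE` clause, two agents racing on the same feature cannot both see a row count of 1, and a blocked feature cannot be claimed again until its `needs_human_input` flag is cleared.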
--- a/package.json
+++ b/package.json
@@ -1,6 +1,6 @@
 {
  "name": "autoforge-ai",
-  "version": "0.1.2",
+  "version": "0.1.12",
  "description": "Autonomous coding agent with web UI - build complete apps with AI",
  "license": "AGPL-3.0",
  "bin": {
@@ -19,6 +19,7 @@
    "ui/dist/",
    "ui/package.json",
    ".claude/commands/",
+    ".claude/skills/",
    ".claude/templates/",
    "examples/",
    "start.py",
@@ -34,6 +35,7 @@
    "registry.py",
    "rate_limit_utils.py",
    "security.py",
+    "temp_cleanup.py",
    "requirements-prod.txt",
    "pyproject.toml",
    ".env.example",
--- a/parallel_orchestrator.py
+++ b/parallel_orchestrator.py
@@ -194,6 +194,7 @@ class ParallelOrchestrator:
        # Legacy alias for backward compatibility
        self.running_agents = self.running_coding_agents
        self.abort_events: dict[int, threading.Event] = {}
+        self._testing_session_counter = 0
        self.is_running = False

        # Track feature failures to prevent infinite retry loops
@@ -212,6 +213,9 @@ class ParallelOrchestrator:
        # Signal handlers only set this flag; cleanup happens in the main loop
        self._shutdown_requested = False

+        # Graceful pause (drain mode) flag
+        self._drain_requested = False
+
        # Session tracking for logging/debugging
        self.session_start_time: datetime | None = None

@@ -492,6 +496,9 @@ class ParallelOrchestrator:
        for fd in feature_dicts:
            if not fd.get("in_progress") or fd.get("passes"):
                continue
+            # Skip if blocked for human input
+            if fd.get("needs_human_input"):
+                continue
            # Skip if already running in this orchestrator instance
            if fd["id"] in running_ids:
                continue
@@ -536,11 +543,14 @@ class ParallelOrchestrator:
                running_ids.update(batch_ids)

        ready = []
-        skipped_reasons = {"passes": 0, "in_progress": 0, "running": 0, "failed": 0, "deps": 0}
+        skipped_reasons = {"passes": 0, "in_progress": 0, "running": 0, "failed": 0, "deps": 0, "needs_human_input": 0}
        for fd in feature_dicts:
            if fd.get("passes"):
                skipped_reasons["passes"] += 1
                continue
+            if fd.get("needs_human_input"):
+                skipped_reasons["needs_human_input"] += 1
+                continue
            if fd.get("in_progress"):
                skipped_reasons["in_progress"] += 1
                continue
@@ -846,7 +856,7 @@ class ParallelOrchestrator:
                "encoding": "utf-8",
                "errors": "replace",
                "cwd": str(self.project_dir),  # Run from project dir so CLI creates .claude/ in project
-                "env": {**os.environ, "PYTHONUNBUFFERED": "1"},
+                "env": {**os.environ, "PYTHONUNBUFFERED": "1", "NODE_COMPILE_CACHE": "", "PLAYWRIGHT_CLI_SESSION": f"coding-{feature_id}"},
            }
            if sys.platform == "win32":
                popen_kwargs["creationflags"] = subprocess.CREATE_NO_WINDOW
@@ -909,7 +919,7 @@ class ParallelOrchestrator:
                "encoding": "utf-8",
                "errors": "replace",
                "cwd": str(self.project_dir),  # Run from project dir so CLI creates .claude/ in project
-                "env": {**os.environ, "PYTHONUNBUFFERED": "1"},
+                "env": {**os.environ, "PYTHONUNBUFFERED": "1", "NODE_COMPILE_CACHE": "", "PLAYWRIGHT_CLI_SESSION": f"coding-{primary_id}"},
            }
            if sys.platform == "win32":
                popen_kwargs["creationflags"] = subprocess.CREATE_NO_WINDOW
@@ -1013,8 +1023,9 @@ class ParallelOrchestrator:
                    "encoding": "utf-8",
                    "errors": "replace",
                    "cwd": str(self.project_dir),  # Run from project dir so CLI creates .claude/ in project
-                    "env": {**os.environ, "PYTHONUNBUFFERED": "1"},
+                    "env": {**os.environ, "PYTHONUNBUFFERED": "1", "NODE_COMPILE_CACHE": "", "PLAYWRIGHT_CLI_SESSION": f"testing-{self._testing_session_counter}"},
                }
+                self._testing_session_counter += 1
                if sys.platform == "win32":
                    popen_kwargs["creationflags"] = subprocess.CREATE_NO_WINDOW

@@ -1074,7 +1085,7 @@ class ParallelOrchestrator:
            "encoding": "utf-8",
            "errors": "replace",
            "cwd": str(AUTOFORGE_ROOT),
-            "env": {**os.environ, "PYTHONUNBUFFERED": "1"},
+            "env": {**os.environ, "PYTHONUNBUFFERED": "1", "NODE_COMPILE_CACHE": ""},
        }
        if sys.platform == "win32":
            popen_kwargs["creationflags"] = subprocess.CREATE_NO_WINDOW
@@ -1160,6 +1171,19 @@ class ParallelOrchestrator:
                debug_log.log("CLEANUP", f"Error killing process tree for {agent_type} agent", error=str(e))
            self._on_agent_complete(feature_id, proc.returncode, agent_type, proc)

+    def _run_inter_session_cleanup(self):
+        """Run lightweight cleanup between agent sessions.
+
+        Removes stale temp files and project screenshots so that disk
+        usage does not keep growing during long overnight runs.
+        """
+        try:
+            from temp_cleanup import cleanup_project_screenshots, cleanup_stale_temp
+            cleanup_stale_temp()
+            cleanup_project_screenshots(self.project_dir)
+        except Exception as e:
+            debug_log.log("CLEANUP", f"Inter-session cleanup failed (non-fatal): {e}")
+
    def _signal_agent_completed(self):
        """Signal that an agent has completed, waking the main loop.

@@ -1235,6 +1259,8 @@ class ParallelOrchestrator:
                pid=proc.pid,
                feature_id=feature_id,
                status=status)
+            # Run lightweight cleanup between sessions
+            self._run_inter_session_cleanup()
            # Signal main loop that an agent slot is available
            self._signal_agent_completed()
            return
@@ -1301,6 +1327,8 @@ class ParallelOrchestrator:
        else:
            print(f"Feature #{feature_id} {status}", flush=True)

+        # Run lightweight cleanup between sessions
+        self._run_inter_session_cleanup()
        # Signal main loop that an agent slot is available
        self._signal_agent_completed()

@@ -1368,6 +1396,9 @@ class ParallelOrchestrator:
        # Must happen before any debug_log.log() calls
        debug_log.start_session()

+        # Clear any stale drain signal from a previous session
+        self._clear_drain_signal()
+
        # Log startup to debug file
        debug_log.section("ORCHESTRATOR STARTUP")
        debug_log.log("STARTUP", "Orchestrator run_loop starting",
@@ -1489,6 +1520,34 @@ class ParallelOrchestrator:
                    print("\nAll features complete!", flush=True)
                    break

+                # --- Graceful pause (drain mode) ---
+                if not self._drain_requested and self._check_drain_signal():
+                    self._drain_requested = True
+                    print("Graceful pause requested - draining running agents...", flush=True)
+                    debug_log.log("DRAIN", "Graceful pause requested, draining running agents")
+
+                if self._drain_requested:
+                    with self._lock:
+                        coding_count = len(self.running_coding_agents)
+                        testing_count = len(self.running_testing_agents)
+
+                    if coding_count == 0 and testing_count == 0:
+                        print("All agents drained - paused.", flush=True)
+                        debug_log.log("DRAIN", "All agents drained, entering paused state")
+                        # Wait until signal file is removed (resume) or shutdown
+                        while self._check_drain_signal() and self.is_running and not self._shutdown_requested:
+                            await asyncio.sleep(1)
+                        if not self.is_running or self._shutdown_requested:
+                            break
+                        self._drain_requested = False
+                        print("Resuming from graceful pause...", flush=True)
+                        debug_log.log("DRAIN", "Resuming from graceful pause")
+                        continue
+                    else:
+                        debug_log.log("DRAIN", f"Waiting for agents to finish: coding={coding_count}, testing={testing_count}")
+                        await self._wait_for_agent_completion()
+                        continue
+
                # Maintain testing agents independently (runs every iteration)
                self._maintain_testing_agents(feature_dicts)

@@ -1613,6 +1672,17 @@ class ParallelOrchestrator:
                "yolo_mode": self.yolo_mode,
            }

+    def _check_drain_signal(self) -> bool:
+        """Check if the graceful pause (drain) signal file exists."""
+        from autoforge_paths import get_pause_drain_path
+        return get_pause_drain_path(self.project_dir).exists()
+
+    def _clear_drain_signal(self) -> None:
+        """Delete the drain signal file and reset the flag."""
+        from autoforge_paths import get_pause_drain_path
+        get_pause_drain_path(self.project_dir).unlink(missing_ok=True)
+        self._drain_requested = False
+
    def cleanup(self) -> None:
        """Clean up database resources. Safe to call multiple times.

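Review note (illustrative, not part of the patch): the graceful pause above is driven by the presence of a signal file. The sketch below shows that file-as-flag pattern on its own. The `.pause_drain` file name and the `drain_signal_path` helper are assumptions made for this example; in the patch the real location comes from `get_pause_drain_path()`, whose implementation is not shown here.

```python
import tempfile
from pathlib import Path


def drain_signal_path(project_dir: Path) -> Path:
    # Hypothetical location; the patch resolves this via get_pause_drain_path().
    return project_dir / ".pause_drain"


def check_drain_signal(project_dir: Path) -> bool:
    """Return True while the signal file exists (pause requested)."""
    return drain_signal_path(project_dir).exists()


def clear_drain_signal(project_dir: Path) -> None:
    """Remove the signal file; missing_ok makes repeated calls harmless."""
    drain_signal_path(project_dir).unlink(missing_ok=True)


def demo_drain_cycle() -> list[bool]:
    """Walk through idle -> pause requested -> resumed and record each state."""
    with tempfile.TemporaryDirectory() as tmp:
        project = Path(tmp)
        states = [check_drain_signal(project)]   # nothing requested yet
        drain_signal_path(project).touch()       # an external actor requests a pause
        states.append(check_drain_signal(project))
        clear_drain_signal(project)              # resume
        clear_drain_signal(project)              # second call is a no-op
        states.append(check_drain_signal(project))
        return states
```

Because `unlink(missing_ok=True)` tolerates a missing file, clearing the signal at startup is safe whether or not a previous session left one behind.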
--- a/progress.py
+++ b/progress.py
@@ -62,54 +62,71 @@ def has_features(project_dir: Path) -> bool:
        return False


-def count_passing_tests(project_dir: Path) -> tuple[int, int, int]:
+def count_passing_tests(project_dir: Path) -> tuple[int, int, int, int]:
    """
-    Count passing, in_progress, and total tests via direct database access.
+    Count passing, in_progress, total, and needs_human_input tests via direct database access.

    Args:
        project_dir: Directory containing the project

    Returns:
-        (passing_count, in_progress_count, total_count)
+        (passing_count, in_progress_count, total_count, needs_human_input_count)
    """
    from autoforge_paths import get_features_db_path
    db_file = get_features_db_path(project_dir)
    if not db_file.exists():
-        return 0, 0, 0
+        return 0, 0, 0, 0

    try:
        with closing(_get_connection(db_file)) as conn:
            cursor = conn.cursor()
-            # Single aggregate query instead of 3 separate COUNT queries
-            # Handle case where in_progress column doesn't exist yet (legacy DBs)
+            # Single aggregate query instead of separate COUNT queries
+            # Handle case where columns don't exist yet (legacy DBs)
            try:
                cursor.execute("""
                    SELECT
                        COUNT(*) as total,
                        SUM(CASE WHEN passes = 1 THEN 1 ELSE 0 END) as passing,
-                        SUM(CASE WHEN in_progress = 1 THEN 1 ELSE 0 END) as in_progress
+                        SUM(CASE WHEN in_progress = 1 THEN 1 ELSE 0 END) as in_progress,
+                        SUM(CASE WHEN needs_human_input = 1 THEN 1 ELSE 0 END) as needs_human_input
                    FROM features
                """)
                row = cursor.fetchone()
                total = row[0] or 0
                passing = row[1] or 0
                in_progress = row[2] or 0
+                needs_human_input = row[3] or 0
            except sqlite3.OperationalError:
-                # Fallback for databases without in_progress column
-                cursor.execute("""
-                    SELECT
-                        COUNT(*) as total,
-                        SUM(CASE WHEN passes = 1 THEN 1 ELSE 0 END) as passing
-                    FROM features
-                """)
-                row = cursor.fetchone()
-                total = row[0] or 0
-                passing = row[1] or 0
-                in_progress = 0
-            return passing, in_progress, total
+                # Fallback for databases without newer columns
+                try:
+                    cursor.execute("""
+                        SELECT
+                            COUNT(*) as total,
+                            SUM(CASE WHEN passes = 1 THEN 1 ELSE 0 END) as passing,
+                            SUM(CASE WHEN in_progress = 1 THEN 1 ELSE 0 END) as in_progress
+                        FROM features
+                    """)
+                    row = cursor.fetchone()
+                    total = row[0] or 0
+                    passing = row[1] or 0
+                    in_progress = row[2] or 0
+                    needs_human_input = 0
+                except sqlite3.OperationalError:
+                    cursor.execute("""
+                        SELECT
+                            COUNT(*) as total,
+                            SUM(CASE WHEN passes = 1 THEN 1 ELSE 0 END) as passing
+                        FROM features
+                    """)
+                    row = cursor.fetchone()
+                    total = row[0] or 0
+                    passing = row[1] or 0
+                    in_progress = 0
+                    needs_human_input = 0
+            return passing, in_progress, total, needs_human_input
    except Exception as e:
        print(f"[Database error in count_passing_tests: {e}]")
-        return 0, 0, 0
+        return 0, 0, 0, 0


 def get_all_passing_features(project_dir: Path) -> list[dict]:
@@ -234,7 +251,7 @@ def print_session_header(session_num: int, is_initializer: bool) -> None:

 def print_progress_summary(project_dir: Path) -> None:
    """Print a summary of current progress."""
-    passing, in_progress, total = count_passing_tests(project_dir)
+    passing, in_progress, total, _needs_human_input = count_passing_tests(project_dir)

    if total > 0:
        percentage = (passing / total) * 100
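Review note (illustrative, not part of the patch): the nested `try`/`except sqlite3.OperationalError` fallbacks in `count_passing_tests` could also be written as a loop over progressively narrower column sets. The sketch below is one possible shape under that assumption; `count_flags` and `_COLUMN_SETS` are hypothetical names, and the function returns a dict rather than the tuple used in the patch.

```python
import sqlite3

# Widest schema first; each later entry drops the most recently added column.
_COLUMN_SETS = [
    ("passes", "in_progress", "needs_human_input"),
    ("passes", "in_progress"),
    ("passes",),
]


def count_flags(conn: sqlite3.Connection) -> dict[str, int]:
    """Count rows and per-flag totals, tolerating legacy schemas."""
    for columns in _COLUMN_SETS:
        sums = ", ".join(
            f"SUM(CASE WHEN {c} = 1 THEN 1 ELSE 0 END)" for c in columns
        )
        try:
            row = conn.execute(f"SELECT COUNT(*), {sums} FROM features").fetchone()
        except sqlite3.OperationalError:
            continue  # a column is missing in this schema; try a narrower query
        counts = {"total": row[0] or 0}
        for name, value in zip(columns, row[1:]):
            counts[name] = value or 0  # SUM over zero rows yields NULL
        for name in _COLUMN_SETS[0]:
            counts.setdefault(name, 0)  # columns absent from the schema count as 0
        return counts
    # Every query failed (for example, the table does not exist).
    return {"total": 0, "passes": 0, "in_progress": 0, "needs_human_input": 0}
```

The column names are interpolated from a fixed tuple, not from user input, so the f-string query does not open an injection path in this sketch.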
--- a/prompts.py
+++ b/prompts.py
@@ -16,6 +16,9 @@ from pathlib import Path
 # Base templates location (generic templates)
 TEMPLATES_DIR = Path(__file__).parent / ".claude" / "templates"

+# Migration version — bump when adding new migration steps
+CURRENT_MIGRATION_VERSION = 1
+

 def get_project_prompts_dir(project_dir: Path) -> Path:
    """Get the prompts directory for a specific project."""
@@ -99,9 +102,9 @@ def _strip_browser_testing_sections(prompt: str) -> str:
        flags=re.DOTALL,
    )

-    # Replace the screenshots-only marking rule with YOLO-appropriate wording
+    # Replace the marking rule with YOLO-appropriate wording
    prompt = prompt.replace(
-        "**ONLY MARK A FEATURE AS PASSING AFTER VERIFICATION WITH SCREENSHOTS.**",
+        "**ONLY MARK A FEATURE AS PASSING AFTER VERIFICATION WITH BROWSER AUTOMATION.**",
        "**YOLO mode: Mark a feature as passing after lint/type-check succeeds and server starts cleanly.**",
    )

@@ -351,9 +354,70 @@ def scaffold_project_prompts(project_dir: Path) -> Path:
        except (OSError, PermissionError) as e:
            print(f"  Warning: Could not copy allowed_commands.yaml: {e}")

+    # Copy Playwright CLI skill for browser automation
+    skills_src = Path(__file__).parent / ".claude" / "skills" / "playwright-cli"
+    skills_dest = project_dir / ".claude" / "skills" / "playwright-cli"
+    if skills_src.exists() and not skills_dest.exists():
+        try:
+            shutil.copytree(skills_src, skills_dest)
+            copied_files.append(".claude/skills/playwright-cli/")
+        except (OSError, PermissionError) as e:
+            print(f"  Warning: Could not copy playwright-cli skill: {e}")
+
+    # Ensure .playwright-cli/ and .playwright/ are in project .gitignore
+    project_gitignore = project_dir / ".gitignore"
+    entries_to_add = [".playwright-cli/", ".playwright/"]
+    existing_lines: list[str] = []
+    if project_gitignore.exists():
+        try:
+            existing_lines = project_gitignore.read_text(encoding="utf-8").splitlines()
+        except (OSError, PermissionError):
+            pass
+    missing_entries = [e for e in entries_to_add if e not in existing_lines]
+    if missing_entries:
+        try:
+            with open(project_gitignore, "a", encoding="utf-8") as f:
+                # Add newline before entries if file doesn't end with one
+                if existing_lines and existing_lines[-1].strip():
+                    f.write("\n")
+                for entry in missing_entries:
+                    f.write(f"{entry}\n")
+        except (OSError, PermissionError) as e:
+            print(f"  Warning: Could not update .gitignore: {e}")
+
+    # Scaffold .playwright/cli.config.json for browser settings
+    playwright_config_dir = project_dir / ".playwright"
+    playwright_config_file = playwright_config_dir / "cli.config.json"
+    if not playwright_config_file.exists():
+        try:
+            playwright_config_dir.mkdir(parents=True, exist_ok=True)
+            import json
+            config = {
+                "browser": {
+                    "browserName": "chromium",
+                    "launchOptions": {
+                        "channel": "chrome",
+                        "headless": True,
+                    },
+                    "contextOptions": {
+                        "viewport": {"width": 1280, "height": 720},
+                    },
+                    "isolated": True,
+                },
+            }
+            with open(playwright_config_file, "w", encoding="utf-8") as f:
+                json.dump(config, f, indent=2)
+                f.write("\n")
+            copied_files.append(".playwright/cli.config.json")
+        except (OSError, PermissionError) as e:
+            print(f"  Warning: Could not create playwright config: {e}")
+
    if copied_files:
        print(f"  Created project files: {', '.join(copied_files)}")

+    # Stamp new projects at the current migration version so they never trigger migration
+    _set_migration_version(project_dir, CURRENT_MIGRATION_VERSION)
+
    return project_prompts


@@ -425,3 +489,330 @@ def copy_spec_to_project(project_dir: Path) -> None:
            return

    print("Warning: No app_spec.txt found to copy to project directory")
+
+
+# ---------------------------------------------------------------------------
+# Project version migration
+# ---------------------------------------------------------------------------
+
+# Replacement content: coding_prompt.md STEP 5 section (Playwright CLI)
+_CLI_STEP5_CONTENT = """\
+### STEP 5: VERIFY WITH BROWSER AUTOMATION
+
+**CRITICAL:** You MUST verify features through the actual UI.
+
+Use `playwright-cli` for browser automation:
+
+- Open the browser: `playwright-cli open http://localhost:PORT`
+- Take a snapshot to see page elements: `playwright-cli snapshot`
+- Read the snapshot YAML file to see element refs
+- Click elements by ref: `playwright-cli click e5`
+- Type text: `playwright-cli type "search query"`
+- Fill form fields: `playwright-cli fill e3 "value"`
+- Take screenshots: `playwright-cli screenshot`
+- Read the screenshot file to verify visual appearance
+- Check console errors: `playwright-cli console`
+- Close browser when done: `playwright-cli close`
+
+**Token-efficient workflow:** `playwright-cli screenshot` and `snapshot` save files
+to `.playwright-cli/`. You will see a file link in the output. Read the file only
+when you need to verify visual appearance or find element refs.
+
+**DO:**
+- Test through the UI with clicks and keyboard input
+- Take screenshots and read them to verify visual appearance
+- Check for console errors with `playwright-cli console`
+- Verify complete user workflows end-to-end
+- Always run `playwright-cli close` when finished testing
+
+**DON'T:**
+- Only test with curl commands
+- Use JavaScript evaluation to bypass UI (`eval` and `run-code` are blocked)
+- Skip visual verification
+- Mark tests passing without thorough verification
+
+"""
+
+# Replacement content: coding_prompt.md BROWSER AUTOMATION reference section
+_CLI_BROWSER_SECTION = """\
+## BROWSER AUTOMATION
+
+Use `playwright-cli` commands for UI verification. Key commands: `open`, `goto`,
+`snapshot`, `click`, `type`, `fill`, `screenshot`, `console`, `close`.
+
+**How it works:** `playwright-cli` uses a persistent browser daemon. `open` starts it,
+subsequent commands interact via socket, `close` shuts it down. Screenshots and snapshots
+save to `.playwright-cli/` -- read the files when you need to verify content.
+
+Test like a human user with mouse and keyboard. Use `playwright-cli console` to detect
+JS errors. Don't bypass UI with JavaScript evaluation.
+
+"""
+
+# Replacement content: testing_prompt.md STEP 2 section (Playwright CLI)
+_CLI_TESTING_STEP2 = """\
+### STEP 2: VERIFY THE FEATURE
+
+**CRITICAL:** You MUST verify the feature through the actual UI using browser automation.
+
+For the feature returned:
+1. Read and understand the feature's verification steps
+2. Navigate to the relevant part of the application
+3. Execute each verification step using browser automation
+4. Take screenshots and read them to verify visual appearance
+5. Check for console errors
+
+### Browser Automation (Playwright CLI)
+
+**Navigation & Screenshots:**
+- `playwright-cli open <url>` - Open browser and navigate
+- `playwright-cli goto <url>` - Navigate to URL
+- `playwright-cli screenshot` - Save screenshot to `.playwright-cli/`
+- `playwright-cli snapshot` - Save page snapshot with element refs to `.playwright-cli/`
+
+**Element Interaction:**
+- `playwright-cli click <ref>` - Click elements (ref from snapshot)
+- `playwright-cli type <text>` - Type text
+- `playwright-cli fill <ref> <text>` - Fill form fields
+- `playwright-cli select <ref> <val>` - Select dropdown
+- `playwright-cli press <key>` - Keyboard input
+
+**Debugging:**
+- `playwright-cli console` - Check for JS errors
+- `playwright-cli network` - Monitor API calls
+
+**Cleanup:**
+- `playwright-cli close` - Close browser when done (ALWAYS do this)
+
+**Note:** Screenshots and snapshots save to files. Read the file to see the content.
+
+"""
+
+# Replacement content: testing_prompt.md AVAILABLE TOOLS browser subsection
+_CLI_TESTING_TOOLS = """\
+### Browser Automation (Playwright CLI)
+Use `playwright-cli` commands for browser interaction. Key commands:
+- `playwright-cli open <url>` - Open browser
+- `playwright-cli goto <url>` - Navigate to URL
+- `playwright-cli screenshot` - Take screenshot (saved to `.playwright-cli/`)
+- `playwright-cli snapshot` - Get page snapshot with element refs
+- `playwright-cli click <ref>` - Click element
+- `playwright-cli type <text>` - Type text
+- `playwright-cli fill <ref> <text>` - Fill form field
+- `playwright-cli console` - Check for JS errors
+- `playwright-cli close` - Close browser (always do this when done)
+
+"""
+
+
+def _get_migration_version(project_dir: Path) -> int:
+    """Read the migration version from .autoforge/.migration_version."""
+    from autoforge_paths import get_autoforge_dir
+    version_file = get_autoforge_dir(project_dir) / ".migration_version"
+    if not version_file.exists():
+        return 0
+    try:
+        return int(version_file.read_text().strip())
+    except (ValueError, OSError):
+        return 0
+
+
+def _set_migration_version(project_dir: Path, version: int) -> None:
+    """Write the migration version to .autoforge/.migration_version."""
+    from autoforge_paths import get_autoforge_dir
+    version_file = get_autoforge_dir(project_dir) / ".migration_version"
+    version_file.parent.mkdir(parents=True, exist_ok=True)
+    version_file.write_text(str(version))
+
+
+def _migrate_coding_prompt_to_cli(content: str) -> str:
+    """Replace MCP-based Playwright sections with CLI-based content in coding prompt."""
+    # Replace STEP 5 section (from header to just before STEP 5.5)
+    content = re.sub(
+        r"### STEP 5: VERIFY WITH BROWSER AUTOMATION.*?(?=### STEP 5\.5:)",
+        _CLI_STEP5_CONTENT,
+        content,
+        count=1,
+        flags=re.DOTALL,
+    )
+
+    # Replace BROWSER AUTOMATION reference section (from header to next ---)
+    content = re.sub(
+        r"## BROWSER AUTOMATION\n\n.*?(?=---)",
+        _CLI_BROWSER_SECTION,
+        content,
+        count=1,
+        flags=re.DOTALL,
+    )
+
+    # Replace inline screenshot rule
+    content = content.replace(
+        "**ONLY MARK A FEATURE AS PASSING AFTER VERIFICATION WITH SCREENSHOTS.**",
+        "**ONLY MARK A FEATURE AS PASSING AFTER VERIFICATION WITH BROWSER AUTOMATION.**",
+    )
+
+    # Replace inline screenshot references (various phrasings from old templates)
+    for old_phrase in (
+        "(inline only -- do NOT save to disk)",
+        "(inline only, never save to disk)",
+        "(inline mode only -- never save to disk)",
+    ):
+        content = content.replace(old_phrase, "(saved to `.playwright-cli/`)")
+
+    return content
+
+
+def _migrate_testing_prompt_to_cli(content: str) -> str:
+    """Replace MCP-based Playwright sections with CLI-based content in testing prompt."""
+    # Replace AVAILABLE TOOLS browser subsection FIRST (before STEP 2, to avoid
+    # matching the new CLI subsection header that the STEP 2 replacement inserts).
+    # In old prompts, ### Browser Automation (Playwright) only exists in AVAILABLE TOOLS.
+    content = re.sub(
+        r"### Browser Automation \(Playwright[^)]*\)\n.*?(?=---)",
+        _CLI_TESTING_TOOLS,
+        content,
+        count=1,
+        flags=re.DOTALL,
+    )
+
+    # Replace STEP 2 verification section (from header to just before STEP 3)
+    content = re.sub(
+        r"### STEP 2: VERIFY THE FEATURE.*?(?=### STEP 3:)",
+        _CLI_TESTING_STEP2,
+        content,
+        count=1,
+        flags=re.DOTALL,
+    )
+
+    # Replace inline screenshot references (various phrasings from old templates)
+    for old_phrase in (
+        "(inline only -- do NOT save to disk)",
+        "(inline only, never save to disk)",
+        "(inline mode only -- never save to disk)",
+    ):
+        content = content.replace(old_phrase, "(saved to `.playwright-cli/`)")
+
+    return content
+
+
+def _migrate_v0_to_v1(project_dir: Path) -> list[str]:
+    """Migrate from v0 (MCP-based Playwright) to v1 (Playwright CLI).
+
+    Four idempotent sub-steps:
+    A. Copy playwright-cli skill to project
+    B. Scaffold .playwright/cli.config.json
+    C. Update .gitignore with .playwright-cli/ and .playwright/
+    D. Update coding_prompt.md and testing_prompt.md
+    """
+    import json
+
+    migrated: list[str] = []
+
+    # A. Copy Playwright CLI skill
+    skills_src = Path(__file__).parent / ".claude" / "skills" / "playwright-cli"
+    skills_dest = project_dir / ".claude" / "skills" / "playwright-cli"
+    if skills_src.exists() and not skills_dest.exists():
+        try:
+            shutil.copytree(skills_src, skills_dest)
+            migrated.append("Copied playwright-cli skill")
+        except (OSError, PermissionError) as e:
+            print(f"  Warning: Could not copy playwright-cli skill: {e}")
+
+    # B. Scaffold .playwright/cli.config.json
+    playwright_config_dir = project_dir / ".playwright"
+    playwright_config_file = playwright_config_dir / "cli.config.json"
+    if not playwright_config_file.exists():
+        try:
+            playwright_config_dir.mkdir(parents=True, exist_ok=True)
+            config = {
+                "browser": {
+                    "browserName": "chromium",
+                    "launchOptions": {
+                        "channel": "chrome",
+                        "headless": True,
+                    },
+                    "contextOptions": {
+                        "viewport": {"width": 1280, "height": 720},
+                    },
+                    "isolated": True,
+                },
+            }
+            with open(playwright_config_file, "w", encoding="utf-8") as f:
+                json.dump(config, f, indent=2)
+                f.write("\n")
+            migrated.append("Created .playwright/cli.config.json")
+        except (OSError, PermissionError) as e:
+            print(f"  Warning: Could not create playwright config: {e}")
+
+    # C. Update .gitignore
+    project_gitignore = project_dir / ".gitignore"
+    entries_to_add = [".playwright-cli/", ".playwright/"]
+    existing_lines: list[str] = []
+    if project_gitignore.exists():
+        try:
+            existing_lines = project_gitignore.read_text(encoding="utf-8").splitlines()
+        except (OSError, PermissionError):
+            pass
+    missing_entries = [e for e in entries_to_add if e not in existing_lines]
+    if missing_entries:
+        try:
+            with open(project_gitignore, "a", encoding="utf-8") as f:
+                if existing_lines and existing_lines[-1].strip():
+                    f.write("\n")
+                for entry in missing_entries:
+                    f.write(f"{entry}\n")
+            migrated.append(f"Added {', '.join(missing_entries)} to .gitignore")
+        except (OSError, PermissionError) as e:
+            print(f"  Warning: Could not update .gitignore: {e}")
+
+    # D. Update prompts
+    prompts_dir = get_project_prompts_dir(project_dir)
+
+    # D1. Update coding_prompt.md
+    coding_prompt_path = prompts_dir / "coding_prompt.md"
+    if coding_prompt_path.exists():
+        try:
+            content = coding_prompt_path.read_text(encoding="utf-8")
+            if "Playwright MCP" in content or "browser_navigate" in content or "browser_take_screenshot" in content:
+                updated = _migrate_coding_prompt_to_cli(content)
+                if updated != content:
+                    coding_prompt_path.write_text(updated, encoding="utf-8")
+                    migrated.append("Updated coding_prompt.md to Playwright CLI")
+        except (OSError, PermissionError) as e:
+            print(f"  Warning: Could not update coding_prompt.md: {e}")
+
+    # D2. Update testing_prompt.md
+    testing_prompt_path = prompts_dir / "testing_prompt.md"
+    if testing_prompt_path.exists():
+        try:
+            content = testing_prompt_path.read_text(encoding="utf-8")
+            if "browser_navigate" in content or "browser_take_screenshot" in content:
+                updated = _migrate_testing_prompt_to_cli(content)
+                if updated != content:
+                    testing_prompt_path.write_text(updated, encoding="utf-8")
+                    migrated.append("Updated testing_prompt.md to Playwright CLI")
+        except (OSError, PermissionError) as e:
+            print(f"  Warning: Could not update testing_prompt.md: {e}")
+
+    return migrated
+
+
+def migrate_project_to_current(project_dir: Path) -> list[str]:
+    """Migrate an existing project to the current AutoForge version.
+
+    Idempotent — safe to call on every agent start. Returns list of
+    human-readable descriptions of what was migrated.
+    """
+    current = _get_migration_version(project_dir)
+    if current >= CURRENT_MIGRATION_VERSION:
+        return []
+
+    migrated: list[str] = []
+
+    if current < 1:
+        migrated.extend(_migrate_v0_to_v1(project_dir))
+
+    # Future: if current < 2: migrated.extend(_migrate_v1_to_v2(project_dir))
+
+    _set_migration_version(project_dir, CURRENT_MIGRATION_VERSION)
+    return migrated
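Review note (illustrative, not part of the patch): `migrate_project_to_current` follows a version-stamp pattern: read a stored integer, run every step above it, then stamp the current version. The sketch below shows how that pattern might extend to more than one step. The step functions, the `STEPS` table, and `CURRENT_VERSION = 2` are hypothetical; the patch itself defines only version 1 and keeps the stamp under `.autoforge/`.

```python
import tempfile
from pathlib import Path
from typing import Callable

CURRENT_VERSION = 2  # hypothetical; the patch sets CURRENT_MIGRATION_VERSION = 1


def read_version(version_file: Path) -> int:
    """Return the stored version, or 0 if the stamp is missing or unreadable."""
    try:
        return int(version_file.read_text(encoding="utf-8").strip())
    except (ValueError, OSError):
        return 0


def step_one(project: Path) -> list[str]:
    return ["step 1 applied"]  # placeholder for real work


def step_two(project: Path) -> list[str]:
    return ["step 2 applied"]  # placeholder for real work


# Maps the version a step upgrades *to* onto the function that performs it.
STEPS: dict[int, Callable[[Path], list[str]]] = {1: step_one, 2: step_two}


def migrate(project: Path) -> list[str]:
    """Run all pending steps in order, then stamp the current version."""
    version_file = project / ".migration_version"
    current = read_version(version_file)
    done: list[str] = []
    for target in sorted(STEPS):
        if current < target:
            done.extend(STEPS[target](project))
    if current < CURRENT_VERSION:
        version_file.write_text(str(CURRENT_VERSION), encoding="utf-8")
    return done
```

A second call returns an empty list because the stamp already equals the current version, which is what makes it cheap to call on every agent start.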
--- a/registry.py
+++ b/registry.py
@@ -46,10 +46,16 @@ def _migrate_registry_dir() -> None:
 # Available models with display names
 # To add a new model: add an entry here with {"id": "model-id", "name": "Display Name"}
 AVAILABLE_MODELS = [
-    {"id": "claude-opus-4-5-20251101", "name": "Claude Opus 4.5"},
-    {"id": "claude-sonnet-4-5-20250929", "name": "Claude Sonnet 4.5"},
+    {"id": "claude-opus-4-6", "name": "Claude Opus"},
+    {"id": "claude-sonnet-4-5-20250929", "name": "Claude Sonnet"},
 ]

+# Map legacy model IDs to their current replacements.
+# Used by get_all_settings() to auto-migrate stale values on first read after upgrade.
+LEGACY_MODEL_MAP = {
+    "claude-opus-4-5-20251101": "claude-opus-4-6",
+}
+
 # List of valid model IDs (derived from AVAILABLE_MODELS)
 VALID_MODELS = [m["id"] for m in AVAILABLE_MODELS]

@@ -59,7 +65,7 @@ VALID_MODELS = [m["id"] for m in AVAILABLE_MODELS]
 _env_default_model = os.getenv("ANTHROPIC_DEFAULT_OPUS_MODEL")
 if _env_default_model is not None:
    _env_default_model = _env_default_model.strip()
-DEFAULT_MODEL = _env_default_model or "claude-opus-4-5-20251101"
+DEFAULT_MODEL = _env_default_model or "claude-opus-4-6"

 # Ensure env-provided DEFAULT_MODEL is in VALID_MODELS for validation consistency
 # (idempotent: only adds if missing, doesn't alter AVAILABLE_MODELS semantics)
@@ -598,6 +604,9 @@ def get_all_settings() -> dict[str, str]:
    """
    Get all settings as a dictionary.

+    Automatically migrates legacy model IDs (e.g. claude-opus-4-5-20251101 -> claude-opus-4-6)
+    on first read after upgrade. This is a one-time silent migration.
+
    Returns:
        Dictionary mapping setting keys to values.
    """
@@ -606,9 +615,171 @@ def get_all_settings() -> dict[str, str]:
        session = SessionLocal()
        try:
            settings = session.query(Settings).all()
-            return {s.key: s.value for s in settings}
+            result = {s.key: s.value for s in settings}
+
+            # Auto-migrate legacy model IDs
+            migrated = False
+            for key in ("model", "api_model"):
+                old_id = result.get(key)
+                if old_id and old_id in LEGACY_MODEL_MAP:
+                    new_id = LEGACY_MODEL_MAP[old_id]
+                    setting = session.query(Settings).filter(Settings.key == key).first()
+                    if setting:
+                        setting.value = new_id
+                        setting.updated_at = datetime.now()
+                        result[key] = new_id
+                        migrated = True
+                        logger.info("Migrated setting '%s': %s -> %s", key, old_id, new_id)
+
+            if migrated:
+                session.commit()
+
+            return result
        finally:
            session.close()
    except Exception as e:
        logger.warning("Failed to read settings: %s", e)
        return {}
+
+
+# =============================================================================
+# API Provider Definitions
+# =============================================================================
+
+API_PROVIDERS: dict[str, dict[str, Any]] = {
+    "claude": {
+        "name": "Claude (Anthropic)",
+        "base_url": None,
+        "requires_auth": False,
+        "models": [
+            {"id": "claude-opus-4-6", "name": "Claude Opus"},
+            {"id": "claude-sonnet-4-5-20250929", "name": "Claude Sonnet"},
+        ],
+        "default_model": "claude-opus-4-6",
+    },
+    "kimi": {
+        "name": "Kimi K2.5 (Moonshot)",
+        "base_url": "https://api.kimi.com/coding/",
+        "requires_auth": True,
+        "auth_env_var": "ANTHROPIC_API_KEY",
+        "models": [{"id": "kimi-k2.5", "name": "Kimi K2.5"}],
+        "default_model": "kimi-k2.5",
+    },
+    "glm": {
+        "name": "GLM (Zhipu AI)",
+        "base_url": "https://api.z.ai/api/anthropic",
+        "requires_auth": True,
+        "auth_env_var": "ANTHROPIC_AUTH_TOKEN",
+        "models": [
+            {"id": "glm-4.7", "name": "GLM 4.7"},
+            {"id": "glm-4.5-air", "name": "GLM 4.5 Air"},
+        ],
+        "default_model": "glm-4.7",
+    },
+    "azure": {
+        "name": "Azure Anthropic (Claude)",
+        "base_url": "",
+        "requires_auth": True,
+        "auth_env_var": "ANTHROPIC_API_KEY",
+        "models": [
+            {"id": "claude-opus-4-6", "name": "Claude Opus"},
+            {"id": "claude-sonnet-4-5", "name": "Claude Sonnet"},
+            {"id": "claude-haiku-4-5", "name": "Claude Haiku"},
+        ],
+        "default_model": "claude-opus-4-6",
+    },
+    "ollama": {
+        "name": "Ollama (Local)",
+        "base_url": "http://localhost:11434",
+        "requires_auth": False,
+        "models": [
+            {"id": "qwen3-coder", "name": "Qwen3 Coder"},
+            {"id": "deepseek-coder-v2", "name": "DeepSeek Coder V2"},
+        ],
+        "default_model": "qwen3-coder",
+    },
+    "custom": {
+        "name": "Custom Provider",
+        "base_url": "",
+        "requires_auth": True,
+        "auth_env_var": "ANTHROPIC_AUTH_TOKEN",
+        "models": [],
+        "default_model": "",
+    },
+}
+
+
+def get_effective_sdk_env() -> dict[str, str]:
+    """Build environment variable dict for Claude SDK based on current API provider settings.
+
+    When api_provider is "claude" (or unset), falls back to existing env vars (current behavior).
+    For other providers, builds env dict from stored settings (api_base_url, api_auth_token, api_model).
+
+    Returns:
+        Dict ready to merge into subprocess env or pass to SDK.
+    """
+    all_settings = get_all_settings()
+    provider_id = all_settings.get("api_provider", "claude")
+
+    if provider_id == "claude":
+        # Default behavior: forward existing env vars
+        from env_constants import API_ENV_VARS
+        sdk_env: dict[str, str] = {}
+        for var in API_ENV_VARS:
+            value = os.getenv(var)
+            if value:
+                sdk_env[var] = value
+        return sdk_env
+
+    # Alternative provider: build env from settings
+    provider = API_PROVIDERS.get(provider_id)
+    if not provider:
+        logger.warning("Unknown API provider '%s', falling back to claude", provider_id)
+        from env_constants import API_ENV_VARS
+        sdk_env = {}
+        for var in API_ENV_VARS:
+            value = os.getenv(var)
+            if value:
+                sdk_env[var] = value
+        return sdk_env
+
+    sdk_env = {}
+
+    # Explicitly clear credentials that could leak from the server process env.
+    # For providers using ANTHROPIC_AUTH_TOKEN (GLM, Custom), clear ANTHROPIC_API_KEY.
+    # For providers using ANTHROPIC_API_KEY (Kimi, Azure), clear ANTHROPIC_AUTH_TOKEN.
+    # This prevents the Claude CLI from using the wrong credentials.
+    auth_env_var = provider.get("auth_env_var", "ANTHROPIC_AUTH_TOKEN")
+    if auth_env_var == "ANTHROPIC_AUTH_TOKEN":
+        sdk_env["ANTHROPIC_API_KEY"] = ""
+    elif auth_env_var == "ANTHROPIC_API_KEY":
+        sdk_env["ANTHROPIC_AUTH_TOKEN"] = ""
+
+    # Clear Vertex AI vars when using non-Vertex alternative providers
+    sdk_env["CLAUDE_CODE_USE_VERTEX"] = ""
+    sdk_env["CLOUD_ML_REGION"] = ""
+    sdk_env["ANTHROPIC_VERTEX_PROJECT_ID"] = ""
+
+    # Base URL
+    base_url = all_settings.get("api_base_url") or provider.get("base_url")
+    if base_url:
+        sdk_env["ANTHROPIC_BASE_URL"] = base_url
+
+    # Auth token
+    auth_token = all_settings.get("api_auth_token")
+    if auth_token:
+        sdk_env[auth_env_var] = auth_token
+
+    # Model - set all three tier overrides to the same model
+    model = all_settings.get("api_model") or provider.get("default_model")
+    if model:
+        sdk_env["ANTHROPIC_DEFAULT_OPUS_MODEL"] = model
+        sdk_env["ANTHROPIC_DEFAULT_SONNET_MODEL"] = model
+        sdk_env["ANTHROPIC_DEFAULT_HAIKU_MODEL"] = model
+
+    # Timeout
+    timeout = all_settings.get("api_timeout_ms")
+    if timeout:
+        sdk_env["API_TIMEOUT_MS"] = timeout
+
+    return sdk_env
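Review note (illustrative, not part of the patch): `get_effective_sdk_env` sets some variables to an empty string instead of omitting them. The sketch below shows why that matters when the result is merged over an inherited environment: an omitted key would let the inherited value through, while an empty string overrides it. `merge_env` and the sample values are hypothetical, and whether the consuming CLI treats an empty string as "unset" is an assumption not verified here.

```python
def merge_env(base: dict[str, str], overrides: dict[str, str]) -> dict[str, str]:
    """Return a copy of base with overrides applied on top."""
    merged = dict(base)
    merged.update(overrides)
    return merged


# Environment inherited from the server process (sample values).
inherited = {"ANTHROPIC_API_KEY": "server-key", "PATH": "/usr/bin"}

# Shape of what get_effective_sdk_env() might return for a token-based provider.
sdk_env = {
    "ANTHROPIC_API_KEY": "",  # blanked so the inherited key is not used
    "ANTHROPIC_AUTH_TOKEN": "provider-token",
    "ANTHROPIC_BASE_URL": "https://api.z.ai/api/anthropic",
}

child_env = merge_env(inherited, sdk_env)
```

In real use the base would typically be `os.environ` and the merged dict would be passed as the `env` argument when spawning the subprocess.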
--- a/security.py
+++ b/security.py
@@ -66,10 +66,12 @@ ALLOWED_COMMANDS = {
    "bash",
    # Script execution
    "init.sh",  # Init scripts; validated separately
+    # Browser automation
+    "playwright-cli",  # Playwright CLI for browser testing; validated separately
 }

 # Commands that need additional validation even when in the allowlist
-COMMANDS_NEEDING_EXTRA_VALIDATION = {"pkill", "chmod", "init.sh"}
+COMMANDS_NEEDING_EXTRA_VALIDATION = {"pkill", "chmod", "init.sh", "playwright-cli"}

 # Commands that are NEVER allowed, even with user approval
 # These commands can cause permanent system damage or security breaches
@@ -438,6 +440,37 @@ def validate_init_script(command_string: str) -> tuple[bool, str]:
    return False, f"Only ./init.sh is allowed, got: {script}"


+def validate_playwright_command(command_string: str) -> tuple[bool, str]:
+    """
+    Validate playwright-cli commands - block dangerous subcommands.
+
+    Blocks `run-code` (arbitrary Node.js execution) and `eval` (arbitrary JS
+    evaluation) which bypass the security sandbox.
+
+    Returns:
+        Tuple of (is_allowed, reason_if_blocked)
+    """
+    try:
+        tokens = shlex.split(command_string)
+    except ValueError:
+        return False, "Could not parse playwright-cli command"
+
+    if not tokens:
+        return False, "Empty command"
+
+    BLOCKED_SUBCOMMANDS = {"run-code", "eval"}
+
+    # Find the subcommand: first non-flag token after 'playwright-cli'
+    for token in tokens[1:]:
+        if token.startswith("-"):
+            continue  # skip flags like -s=agent-1
+        if token in BLOCKED_SUBCOMMANDS:
+            return False, f"playwright-cli '{token}' is not allowed"
+        break  # first non-flag token is the subcommand
+
+    return True, ""
+
+
 def matches_pattern(command: str, pattern: str) -> bool:
    """
    Check if a command matches a pattern.
@@ -955,5 +988,9 @@ async def bash_security_hook(input_data, tool_use_id=None, context=None):
                allowed, reason = validate_init_script(cmd_segment)
                if not allowed:
                    return {"decision": "block", "reason": reason}
+            elif cmd == "playwright-cli":
+                allowed, reason = validate_playwright_command(cmd_segment)
+                if not allowed:
+                    return {"decision": "block", "reason": reason}

    return {}
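The first-non-flag-token rule in `validate_playwright_command` can be checked by hand with a small sketch. `first_subcommand` and `is_allowed` are hypothetical helper names, and `open` is only a placeholder for a harmless subcommand; neither is taken from the playwright-cli documentation.

```python
from __future__ import annotations

import shlex

BLOCKED_SUBCOMMANDS = {"run-code", "eval"}


def first_subcommand(command_string: str) -> str | None:
    """Return the first non-flag token after the program name, or None."""
    tokens = shlex.split(command_string)
    for token in tokens[1:]:
        if not token.startswith("-"):
            return token
    return None


def is_allowed(command_string: str) -> bool:
    """Allow the command unless its subcommand is on the block list."""
    return first_subcommand(command_string) not in BLOCKED_SUBCOMMANDS


allowed_example = is_allowed("playwright-cli -s=agent-1 open https://example.com")
blocked_example = is_allowed("playwright-cli -s=agent-1 eval 'document.title'")
```

Under this rule a flag written with a space-separated value (`-s agent-1 eval ...`) would make `agent-1` count as the subcommand, so the check relies on flags being passed in the `-s=value` form shown in the diff's comment.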
--- a/server/main.py
+++ b/server/main.py
@@ -61,6 +61,17 @@ UI_DIST_DIR = ROOT_DIR / "ui" / "dist"
@asynccontextmanager
 async def lifespan(app: FastAPI):
    """Lifespan context manager for startup and shutdown."""
+    # Startup - clean up stale temp files (Playwright profiles, .node cache, etc.)
+    try:
+        from temp_cleanup import cleanup_stale_temp
+        stats = cleanup_stale_temp()
+        if stats["dirs_deleted"] > 0 or stats["files_deleted"] > 0:
+            mb_freed = stats["bytes_freed"] / (1024 * 1024)
+            logger.info("Startup temp cleanup: %d dirs, %d files, %.1f MB freed",
+                        stats["dirs_deleted"], stats["files_deleted"], mb_freed)
+    except Exception as e:
+        logger.warning("Startup temp cleanup failed (non-fatal): %s", e)
+
    # Startup - clean up orphaned lock files from previous runs
    cleanup_orphaned_locks()
    cleanup_orphaned_devserver_locks()
--- a/server/routers/agent.py
+++ b/server/routers/agent.py
@@ -32,7 +32,7 @@ def _get_settings_defaults() -> tuple[bool, str, int, bool, int]:

    settings = get_all_settings()
    yolo_mode = (settings.get("yolo_mode") or "false").lower() == "true"
-    model = settings.get("model", DEFAULT_MODEL)
+    model = settings.get("api_model") or settings.get("model", DEFAULT_MODEL)

    # Parse testing agent settings with defaults
    try:
@@ -175,3 +175,31 @@ async def resume_agent(project_name: str):
        status=manager.status,
        message=message,
    )
+
+
+@router.post("/graceful-pause", response_model=AgentActionResponse)
+async def graceful_pause_agent(project_name: str):
+    """Request a graceful pause (drain mode) - finish current work then pause."""
+    manager = get_project_manager(project_name)
+
+    success, message = await manager.graceful_pause()
+
+    return AgentActionResponse(
+        success=success,
+        status=manager.status,
+        message=message,
+    )
+
+
+@router.post("/graceful-resume", response_model=AgentActionResponse)
+async def graceful_resume_agent(project_name: str):
+    """Resume from a graceful pause."""
+    manager = get_project_manager(project_name)
+
+    success, message = await manager.graceful_resume()
+
+    return AgentActionResponse(
+        success=success,
+        status=manager.status,
+        message=message,
+    )
--- a/server/routers/assistant_chat.py
+++ b/server/routers/assistant_chat.py
@@ -26,7 +26,7 @@ from ..services.assistant_database import (
    get_conversations,
 )
 from ..utils.project_helpers import get_project_path as _get_project_path
-from ..utils.validation import is_valid_project_name as validate_project_name
+from ..utils.validation import validate_project_name

 logger = logging.getLogger(__name__)

@@ -207,30 +207,38 @@ async def assistant_chat_websocket(websocket: WebSocket, project_name: str):
    Client -> Server:
    - {"type": "start", "conversation_id": int | null} - Start/resume session
    - {"type": "message", "content": "..."} - Send user message
+    - {"type": "answer", "answers": {...}} - Answer to structured questions
    - {"type": "ping"} - Keep-alive ping

    Server -> Client:
    - {"type": "conversation_created", "conversation_id": int} - New conversation created
    - {"type": "text", "content": "..."} - Text chunk from Claude
    - {"type": "tool_call", "tool": "...", "input": {...}} - Tool being called
+    - {"type": "question", "questions": [...]} - Structured questions for user
    - {"type": "response_done"} - Response complete
    - {"type": "error", "content": "..."} - Error message
    - {"type": "pong"} - Keep-alive pong
    """
-    if not validate_project_name(project_name):
+    # Always accept WebSocket first to avoid opaque 403 errors
+    await websocket.accept()
+
+    try:
+        project_name = validate_project_name(project_name)
+    except HTTPException:
+        await websocket.send_json({"type": "error", "content": "Invalid project name"})
        await websocket.close(code=4000, reason="Invalid project name")
        return

    project_dir = _get_project_path(project_name)
    if not project_dir:
+        await websocket.send_json({"type": "error", "content": "Project not found in registry"})
        await websocket.close(code=4004, reason="Project not found in registry")
        return

    if not project_dir.exists():
+        await websocket.send_json({"type": "error", "content": "Project directory not found"})
        await websocket.close(code=4004, reason="Project directory not found")
        return
-
-    await websocket.accept()
    logger.info(f"Assistant WebSocket connected for project: {project_name}")

    session: Optional[AssistantChatSession] = None
@@ -297,6 +305,34 @@ async def assistant_chat_websocket(websocket: WebSocket, project_name: str):
                    async for chunk in session.send_message(user_content):
                        await websocket.send_json(chunk)

+                elif msg_type == "answer":
+                    # User answered a structured question
+                    if not session:
+                        session = get_session(project_name)
+                        if not session:
+                            await websocket.send_json({
+                                "type": "error",
+                                "content": "No active session. Send 'start' first."
+                            })
+                            continue
+
+                    # Format the answers as a natural response
+                    answers = message.get("answers", {})
+                    if isinstance(answers, dict):
+                        response_parts = []
+                        for answer_value in answers.values():
+                            if isinstance(answer_value, list):
+                                response_parts.append(", ".join(str(v) for v in answer_value))
+                            else:
+                                response_parts.append(str(answer_value))
+                        user_response = "; ".join(response_parts) if response_parts else "OK"
+                    else:
+                        user_response = str(answers)
+
+                    # Stream Claude's response
+                    async for chunk in session.send_message(user_response):
+                        await websocket.send_json(chunk)
+
                else:
                    await websocket.send_json({
                        "type": "error",
--- a/server/routers/expand_project.py
+++ b/server/routers/expand_project.py
@@ -104,19 +104,26 @@ async def expand_project_websocket(websocket: WebSocket, project_name: str):
    - {"type": "error", "content": "..."} - Error message
    - {"type": "pong"} - Keep-alive pong
    """
+    # Always accept the WebSocket first to avoid opaque 403 errors.
+    # Starlette returns 403 if we close before accepting.
+    await websocket.accept()
+
    try:
        project_name = validate_project_name(project_name)
    except HTTPException:
+        await websocket.send_json({"type": "error", "content": "Invalid project name"})
        await websocket.close(code=4000, reason="Invalid project name")
        return

    # Look up project directory from registry
    project_dir = _get_project_path(project_name)
    if not project_dir:
+        await websocket.send_json({"type": "error", "content": "Project not found in registry"})
        await websocket.close(code=4004, reason="Project not found in registry")
        return

    if not project_dir.exists():
+        await websocket.send_json({"type": "error", "content": "Project directory not found"})
        await websocket.close(code=4004, reason="Project directory not found")
        return

@@ -124,11 +131,10 @@ async def expand_project_websocket(websocket: WebSocket, project_name: str):
    from autoforge_paths import get_prompts_dir
    spec_path = get_prompts_dir(project_dir) / "app_spec.txt"
    if not spec_path.exists():
+        await websocket.send_json({"type": "error", "content": "Project has no spec. Create a spec first before expanding."})
        await websocket.close(code=4004, reason="Project has no spec. Create spec first.")
        return

-    await websocket.accept()
-
    session: Optional[ExpandChatSession] = None

    try:
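The accept-then-reject ordering used by the WebSocket handlers in this change can be sketched without a server. `FakeWebSocket`, `reject` and `handler` are hypothetical names; the fake only records calls, so the example shows ordering and payloads rather than real Starlette behavior.

```python
import asyncio


class FakeWebSocket:
    """Hypothetical stand-in that records calls instead of doing network I/O."""

    def __init__(self) -> None:
        self.events = []

    async def accept(self) -> None:
        self.events.append(("accept",))

    async def send_json(self, data: dict) -> None:
        self.events.append(("send_json", data))

    async def close(self, code: int = 1000, reason: str = "") -> None:
        self.events.append(("close", code, reason))


async def reject(websocket, code: int, message: str) -> None:
    """Send a readable error frame, then close with an application close code."""
    await websocket.send_json({"type": "error", "content": message})
    await websocket.close(code=code, reason=message)


async def handler(websocket, project_exists: bool) -> None:
    # Accept first so the client can actually receive the error frame.
    await websocket.accept()
    if not project_exists:
        await reject(websocket, 4004, "Project not found in registry")
        return
    # The normal message loop would start here.


ws = FakeWebSocket()
asyncio.run(handler(ws, project_exists=False))
```

After the run, `ws.events` holds the accept, the error frame, and the close, in that order. Folding the repeated `send_json` plus `close` pairs into one helper like `reject` is a possible follow-up, not something this PR does.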
--- a/server/routers/features.py
+++ b/server/routers/features.py
@@ -23,6 +23,7 @@ from ..schemas import (
    FeatureListResponse,
    FeatureResponse,
    FeatureUpdate,
+    HumanInputResponse,
 )
 from ..utils.project_helpers import get_project_path as _get_project_path
 from ..utils.validation import validate_project_name
@@ -104,6 +105,9 @@ def feature_to_response(f, passing_ids: set[int] | None = None) -> FeatureRespon
        in_progress=f.in_progress if f.in_progress is not None else False,
        blocked=blocked,
        blocking_dependencies=blocking,
+        needs_human_input=getattr(f, 'needs_human_input', False) or False,
+        human_input_request=getattr(f, 'human_input_request', None),
+        human_input_response=getattr(f, 'human_input_response', None),
    )


@@ -143,11 +147,14 @@ async def list_features(project_name: str):
            pending = []
            in_progress = []
            done = []
+            needs_human_input_list = []

            for f in all_features:
                feature_response = feature_to_response(f, passing_ids)
                if f.passes:
                    done.append(feature_response)
+                elif getattr(f, 'needs_human_input', False):
+                    needs_human_input_list.append(feature_response)
                elif f.in_progress:
                    in_progress.append(feature_response)
                else:
@@ -157,6 +164,7 @@ async def list_features(project_name: str):
                pending=pending,
                in_progress=in_progress,
                done=done,
+                needs_human_input=needs_human_input_list,
            )
    except HTTPException:
        raise
@@ -341,9 +349,11 @@ async def get_dependency_graph(project_name: str):
                deps = f.dependencies or []
                blocking = [d for d in deps if d not in passing_ids]

-                status: Literal["pending", "in_progress", "done", "blocked"]
+                status: Literal["pending", "in_progress", "done", "blocked", "needs_human_input"]
                if f.passes:
                    status = "done"
+                elif getattr(f, 'needs_human_input', False):
+                    status = "needs_human_input"
                elif blocking:
                    status = "blocked"
                elif f.in_progress:
@@ -564,6 +574,71 @@ async def skip_feature(project_name: str, feature_id: int):
        raise HTTPException(status_code=500, detail="Failed to skip feature")


+@router.post("/{feature_id}/resolve-human-input", response_model=FeatureResponse)
+async def resolve_human_input(project_name: str, feature_id: int, response: HumanInputResponse):
+    """Resolve a human input request for a feature.
+
+    Validates all required fields have values, stores the response,
+    and returns the feature to the pending queue for agents to pick up.
+    """
+    project_name = validate_project_name(project_name)
+    project_dir = _get_project_path(project_name)
+
+    if not project_dir:
+        raise HTTPException(status_code=404, detail=f"Project '{project_name}' not found in registry")
+
+    if not project_dir.exists():
+        raise HTTPException(status_code=404, detail="Project directory not found")
+
+    _, Feature = _get_db_classes()
+
+    try:
+        with get_db_session(project_dir) as session:
+            feature = session.query(Feature).filter(Feature.id == feature_id).first()
+
+            if not feature:
+                raise HTTPException(status_code=404, detail=f"Feature {feature_id} not found")
+
+            if not getattr(feature, 'needs_human_input', False):
+                raise HTTPException(status_code=400, detail="Feature is not waiting for human input")
+
+            # Validate required fields
+            request_data = feature.human_input_request
+            if request_data and isinstance(request_data, dict):
+                for field_def in request_data.get("fields", []):
+                    if field_def.get("required", True):
+                        field_id = field_def.get("id")
+                        if field_id not in response.fields or response.fields[field_id] in (None, ""):
+                            raise HTTPException(
+                                status_code=400,
+                                detail=f"Required field '{field_def.get('label', field_id)}' is missing"
+                            )
+
+            # Store response and return to pending queue
+            from datetime import datetime, timezone
+            response_data = {
+                "fields": {k: v for k, v in response.fields.items()},
+                "responded_at": datetime.now(timezone.utc).isoformat(),
+            }
+            feature.human_input_response = response_data
+            feature.needs_human_input = False
+            # Keep in_progress=False, passes=False so it returns to pending
+
+            session.commit()
+            session.refresh(feature)
+
+            # Compute passing IDs for response
+            all_features = session.query(Feature).all()
+            passing_ids = {f.id for f in all_features if f.passes}
+
+            return feature_to_response(feature, passing_ids)
+    except HTTPException:
+        raise
+    except Exception:
+        logger.exception("Failed to resolve human input")
+        raise HTTPException(status_code=500, detail="Failed to resolve human input")
+
+
 # ============================================================================
 # Dependency Management Endpoints
 # ============================================================================
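The required-field check in `resolve_human_input` can be illustrated as a pure function. This is a sketch under assumptions: `find_missing_required` is a hypothetical helper that collects every missing label, whereas the endpoint raises a 400 on the first one; the field IDs and labels below are invented for the example.

```python
from __future__ import annotations


def find_missing_required(request_data: dict | None, submitted: dict) -> list[str]:
    """Return labels of required fields that have no usable value."""
    missing: list[str] = []
    if not isinstance(request_data, dict):
        return missing
    for field_def in request_data.get("fields", []):
        # Fields are required unless they say otherwise, as in the endpoint.
        if not field_def.get("required", True):
            continue
        field_id = field_def.get("id")
        # None and the empty string count as "no answer"; False does not.
        if field_id not in submitted or submitted[field_id] in (None, ""):
            missing.append(field_def.get("label", field_id))
    return missing


# Invented request payload, shaped like the HumanInputRequest schema.
example_request = {
    "prompt": "Deployment details needed",
    "fields": [
        {"id": "api_key", "label": "API key"},
        {"id": "notes", "label": "Notes", "required": False},
        {"id": "use_tls", "label": "Use TLS", "type": "boolean"},
    ],
}
missing_labels = find_missing_required(example_request, {"api_key": "", "use_tls": False})
```

Here only `API key` is reported: the optional `notes` field is skipped, and a boolean answer of `False` is treated as a real value.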
--- a/server/routers/projects.py
+++ b/server/routers/projects.py
@@ -102,7 +102,7 @@ def get_project_stats(project_dir: Path) -> ProjectStats:
    """Get statistics for a project."""
    _init_imports()
    assert _count_passing_tests is not None  # guaranteed by _init_imports()
-    passing, in_progress, total = _count_passing_tests(project_dir)
+    passing, in_progress, total, _needs_human_input = _count_passing_tests(project_dir)
    percentage = (passing / total * 100) if total > 0 else 0.0
    return ProjectStats(
        passing=passing,
--- a/server/routers/settings.py
+++ b/server/routers/settings.py
@@ -7,12 +7,11 @@ Settings are stored in the registry database and shared across all projects.
 """

 import mimetypes
-import os
 import sys

 from fastapi import APIRouter

-from ..schemas import ModelInfo, ModelsResponse, SettingsResponse, SettingsUpdate
+from ..schemas import ModelInfo, ModelsResponse, ProviderInfo, ProvidersResponse, SettingsResponse, SettingsUpdate
 from ..services.chat_constants import ROOT_DIR

 # Mimetype fix for Windows - must run before StaticFiles is mounted
@@ -23,9 +22,11 @@ if str(ROOT_DIR) not in sys.path:
    sys.path.insert(0, str(ROOT_DIR))

 from registry import (
+    API_PROVIDERS,
    AVAILABLE_MODELS,
    DEFAULT_MODEL,
    get_all_settings,
+    get_setting,
    set_setting,
 )

@@ -37,26 +38,40 @@ def _parse_yolo_mode(value: str | None) -> bool:
    return (value or "false").lower() == "true"


-def _is_glm_mode() -> bool:
-    """Check if GLM API is configured via environment variables."""
-    base_url = os.getenv("ANTHROPIC_BASE_URL", "")
-    # GLM mode is when ANTHROPIC_BASE_URL is set but NOT pointing to Ollama
-    return bool(base_url) and not _is_ollama_mode()
-
-
-def _is_ollama_mode() -> bool:
-    """Check if Ollama API is configured via environment variables."""
-    base_url = os.getenv("ANTHROPIC_BASE_URL", "")
-    return "localhost:11434" in base_url or "127.0.0.1:11434" in base_url
+@router.get("/providers", response_model=ProvidersResponse)
+async def get_available_providers():
+    """Get list of available API providers."""
+    current = get_setting("api_provider", "claude") or "claude"
+    providers = []
+    for pid, pdata in API_PROVIDERS.items():
+        providers.append(ProviderInfo(
+            id=pid,
+            name=pdata["name"],
+            base_url=pdata.get("base_url"),
+            models=[ModelInfo(id=m["id"], name=m["name"]) for m in pdata.get("models", [])],
+            default_model=pdata.get("default_model", ""),
+            requires_auth=pdata.get("requires_auth", False),
+        ))
+    return ProvidersResponse(providers=providers, current=current)


@router.get("/models", response_model=ModelsResponse)
 async def get_available_models():
    """Get list of available models.

-    Frontend should call this to get the current list of models
-    instead of hardcoding them.
+    Returns models for the currently selected API provider.
    """
+    current_provider = get_setting("api_provider", "claude") or "claude"
+    provider = API_PROVIDERS.get(current_provider)
+
+    if provider and current_provider != "claude":
+        provider_models = provider.get("models", [])
+        return ModelsResponse(
+            models=[ModelInfo(id=m["id"], name=m["name"]) for m in provider_models],
+            default=provider.get("default_model", ""),
+        )
+
+    # Default: return Claude models
    return ModelsResponse(
        models=[ModelInfo(id=m["id"], name=m["name"]) for m in AVAILABLE_MODELS],
        default=DEFAULT_MODEL,
@@ -85,14 +100,23 @@ async def get_settings():
    """Get current global settings."""
    all_settings = get_all_settings()

+    api_provider = all_settings.get("api_provider") or "claude"
+
+    glm_mode = api_provider == "glm"
+    ollama_mode = api_provider == "ollama"
+
    return SettingsResponse(
        yolo_mode=_parse_yolo_mode(all_settings.get("yolo_mode")),
        model=all_settings.get("model", DEFAULT_MODEL),
-        glm_mode=_is_glm_mode(),
-        ollama_mode=_is_ollama_mode(),
+        glm_mode=glm_mode,
+        ollama_mode=ollama_mode,
        testing_agent_ratio=_parse_int(all_settings.get("testing_agent_ratio"), 1),
        playwright_headless=_parse_bool(all_settings.get("playwright_headless"), default=True),
        batch_size=_parse_int(all_settings.get("batch_size"), 3),
+        api_provider=api_provider,
+        api_base_url=all_settings.get("api_base_url"),
+        api_has_auth_token=bool(all_settings.get("api_auth_token")),
+        api_model=all_settings.get("api_model"),
    )


@@ -114,14 +138,47 @@ async def update_settings(update: SettingsUpdate):
    if update.batch_size is not None:
        set_setting("batch_size", str(update.batch_size))

+    # API provider settings
+    if update.api_provider is not None:
+        old_provider = get_setting("api_provider", "claude")
+        set_setting("api_provider", update.api_provider)
+
+        # When provider changes, auto-set defaults for the new provider
+        if update.api_provider != old_provider:
+            provider = API_PROVIDERS.get(update.api_provider)
+            if provider:
+                # Auto-set base URL from provider definition
+                if provider.get("base_url"):
+                    set_setting("api_base_url", provider["base_url"])
+                # Auto-set model to provider's default
+                if provider.get("default_model") and update.api_model is None:
+                    set_setting("api_model", provider["default_model"])
+
+    if update.api_base_url is not None:
+        set_setting("api_base_url", update.api_base_url)
+
+    if update.api_auth_token is not None:
+        set_setting("api_auth_token", update.api_auth_token)
+
+    if update.api_model is not None:
+        set_setting("api_model", update.api_model)
+
    # Return updated settings
    all_settings = get_all_settings()
+    api_provider = all_settings.get("api_provider") or "claude"
+    glm_mode = api_provider == "glm"
+    ollama_mode = api_provider == "ollama"
+
    return SettingsResponse(
        yolo_mode=_parse_yolo_mode(all_settings.get("yolo_mode")),
        model=all_settings.get("model", DEFAULT_MODEL),
-        glm_mode=_is_glm_mode(),
-        ollama_mode=_is_ollama_mode(),
+        glm_mode=glm_mode,
+        ollama_mode=ollama_mode,
        testing_agent_ratio=_parse_int(all_settings.get("testing_agent_ratio"), 1),
        playwright_headless=_parse_bool(all_settings.get("playwright_headless"), default=True),
        batch_size=_parse_int(all_settings.get("batch_size"), 3),
+        api_provider=api_provider,
+        api_base_url=all_settings.get("api_base_url"),
+        api_has_auth_token=bool(all_settings.get("api_auth_token")),
+        api_model=all_settings.get("api_model"),
    )
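The provider-switch defaults in `update_settings` can be traced with an in-memory store. A minimal sketch, assuming a plain dict in place of the registry: `apply_provider_change`, the provider entries and the model names are hypothetical, and only the branching mirrors the diff.

```python
from __future__ import annotations

# Hypothetical provider table, for illustration only.
API_PROVIDERS = {
    "claude": {"name": "Claude", "default_model": "claude-default"},
    "ollama": {
        "name": "Ollama",
        "base_url": "http://localhost:11434",
        "default_model": "local-model",
    },
}


def apply_provider_change(store: dict, new_provider: str, api_model: str | None = None) -> None:
    """Mirror the update order in the diff against a dict-based settings store."""
    old_provider = store.get("api_provider", "claude")
    store["api_provider"] = new_provider

    # Defaults are only applied when the provider actually changes.
    if new_provider != old_provider:
        provider = API_PROVIDERS.get(new_provider)
        if provider:
            if provider.get("base_url"):
                store["api_base_url"] = provider["base_url"]
            # An explicit model in the same update takes precedence.
            if provider.get("default_model") and api_model is None:
                store["api_model"] = provider["default_model"]

    if api_model is not None:
        store["api_model"] = api_model


store: dict = {}
apply_provider_change(store, "ollama")
after_ollama = dict(store)
apply_provider_change(store, "claude")
after_claude = dict(store)
```

In this sketch, switching to a provider whose entry has no `base_url` leaves the previous `api_base_url` in place, because of the `if provider.get("base_url")` guard. Whether the real provider entries always define a base URL is not visible in this diff.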
--- a/server/routers/spec_creation.py
+++ b/server/routers/spec_creation.py
@@ -21,7 +21,7 @@ from ..services.spec_chat_session import (
    remove_session,
 )
 from ..utils.project_helpers import get_project_path as _get_project_path
-from ..utils.validation import is_valid_project_name as validate_project_name
+from ..utils.validation import is_valid_project_name, validate_project_name

 logger = logging.getLogger(__name__)

@@ -49,7 +49,7 @@ async def list_spec_sessions():
@router.get("/sessions/{project_name}", response_model=SpecSessionStatus)
 async def get_session_status(project_name: str):
    """Get status of a spec creation session."""
-    if not validate_project_name(project_name):
+    if not is_valid_project_name(project_name):
        raise HTTPException(status_code=400, detail="Invalid project name")

    session = get_session(project_name)
@@ -67,7 +67,7 @@ async def get_session_status(project_name: str):
@router.delete("/sessions/{project_name}")
 async def cancel_session(project_name: str):
    """Cancel and remove a spec creation session."""
-    if not validate_project_name(project_name):
+    if not is_valid_project_name(project_name):
        raise HTTPException(status_code=400, detail="Invalid project name")

    session = get_session(project_name)
@@ -95,7 +95,7 @@ async def get_spec_file_status(project_name: str):
    This is used for polling to detect when Claude has finished writing spec files.
    Claude writes this status file as the final step after completing all spec work.
    """
-    if not validate_project_name(project_name):
+    if not is_valid_project_name(project_name):
        raise HTTPException(status_code=400, detail="Invalid project name")

    project_dir = _get_project_path(project_name)
@@ -166,22 +166,28 @@ async def spec_chat_websocket(websocket: WebSocket, project_name: str):
    - {"type": "error", "content": "..."} - Error message
    - {"type": "pong"} - Keep-alive pong
    """
-    if not validate_project_name(project_name):
+    # Always accept WebSocket first to avoid opaque 403 errors
+    await websocket.accept()
+
+    try:
+        project_name = validate_project_name(project_name)
+    except HTTPException:
+        await websocket.send_json({"type": "error", "content": "Invalid project name"})
        await websocket.close(code=4000, reason="Invalid project name")
        return

    # Look up project directory from registry
    project_dir = _get_project_path(project_name)
    if not project_dir:
+        await websocket.send_json({"type": "error", "content": "Project not found in registry"})
        await websocket.close(code=4004, reason="Project not found in registry")
        return

    if not project_dir.exists():
+        await websocket.send_json({"type": "error", "content": "Project directory not found"})
        await websocket.close(code=4004, reason="Project directory not found")
        return

-    await websocket.accept()
-
    session: Optional[SpecChatSession] = None

    try:
--- a/server/routers/terminal.py
+++ b/server/routers/terminal.py
@@ -26,7 +26,7 @@ from ..services.terminal_manager import (
    stop_terminal_session,
 )
 from ..utils.project_helpers import get_project_path as _get_project_path
-from ..utils.validation import is_valid_project_name as validate_project_name
+from ..utils.validation import is_valid_project_name

 logger = logging.getLogger(__name__)

@@ -89,7 +89,7 @@ async def list_project_terminals(project_name: str) -> list[TerminalInfoResponse
    Returns:
        List of terminal info objects
    """
-    if not validate_project_name(project_name):
+    if not is_valid_project_name(project_name):
        raise HTTPException(status_code=400, detail="Invalid project name")

    project_dir = _get_project_path(project_name)
@@ -122,7 +122,7 @@ async def create_project_terminal(
    Returns:
        The created terminal info
    """
-    if not validate_project_name(project_name):
+    if not is_valid_project_name(project_name):
        raise HTTPException(status_code=400, detail="Invalid project name")

    project_dir = _get_project_path(project_name)
@@ -148,7 +148,7 @@ async def rename_project_terminal(
    Returns:
        The updated terminal info
    """
-    if not validate_project_name(project_name):
+    if not is_valid_project_name(project_name):
        raise HTTPException(status_code=400, detail="Invalid project name")

    if not validate_terminal_id(terminal_id):
@@ -180,7 +180,7 @@ async def delete_project_terminal(project_name: str, terminal_id: str) -> dict:
    Returns:
        Success message
    """
-    if not validate_project_name(project_name):
+    if not is_valid_project_name(project_name):
        raise HTTPException(status_code=400, detail="Invalid project name")

    if not validate_terminal_id(terminal_id):
@@ -221,8 +221,12 @@ async def terminal_websocket(websocket: WebSocket, project_name: str, terminal_i
    - {"type": "pong"} - Keep-alive response
    - {"type": "error", "message": "..."} - Error message
    """
+    # Always accept WebSocket first to avoid opaque 403 errors
+    await websocket.accept()
+
    # Validate project name
-    if not validate_project_name(project_name):
+    if not is_valid_project_name(project_name):
+        await websocket.send_json({"type": "error", "message": "Invalid project name"})
        await websocket.close(
            code=TerminalCloseCode.INVALID_PROJECT_NAME, reason="Invalid project name"
        )
@@ -230,6 +234,7 @@ async def terminal_websocket(websocket: WebSocket, project_name: str, terminal_i

    # Validate terminal ID
    if not validate_terminal_id(terminal_id):
+        await websocket.send_json({"type": "error", "message": "Invalid terminal ID"})
        await websocket.close(
            code=TerminalCloseCode.INVALID_PROJECT_NAME, reason="Invalid terminal ID"
        )
@@ -238,6 +243,7 @@ async def terminal_websocket(websocket: WebSocket, project_name: str, terminal_i
    # Look up project directory from registry
    project_dir = _get_project_path(project_name)
    if not project_dir:
+        await websocket.send_json({"type": "error", "message": "Project not found in registry"})
        await websocket.close(
            code=TerminalCloseCode.PROJECT_NOT_FOUND,
            reason="Project not found in registry",
@@ -245,6 +251,7 @@ async def terminal_websocket(websocket: WebSocket, project_name: str, terminal_i
        return

    if not project_dir.exists():
+        await websocket.send_json({"type": "error", "message": "Project directory not found"})
        await websocket.close(
            code=TerminalCloseCode.PROJECT_NOT_FOUND,
            reason="Project directory not found",
@@ -254,14 +261,13 @@ async def terminal_websocket(websocket: WebSocket, project_name: str, terminal_i
    # Verify terminal exists in metadata
    terminal_info = get_terminal_info(project_name, terminal_id)
    if not terminal_info:
+        await websocket.send_json({"type": "error", "message": "Terminal not found"})
        await websocket.close(
            code=TerminalCloseCode.PROJECT_NOT_FOUND,
            reason="Terminal not found",
        )
        return

-    await websocket.accept()
-
    # Get or create terminal session for this project/terminal
    session = get_terminal_session(project_name, project_dir, terminal_id)

--- a/server/schemas.py
+++ b/server/schemas.py
@@ -120,16 +120,41 @@ class FeatureResponse(FeatureBase):
    in_progress: bool
    blocked: bool = False  # Computed: has unmet dependencies
    blocking_dependencies: list[int] = Field(default_factory=list)  # Computed
+    needs_human_input: bool = False
+    human_input_request: dict | None = None
+    human_input_response: dict | None = None

    class Config:
        from_attributes = True


+class HumanInputField(BaseModel):
+    """Schema for a single human input field."""
+    id: str
+    label: str
+    type: Literal["text", "textarea", "select", "boolean"] = "text"
+    required: bool = True
+    placeholder: str | None = None
+    options: list[dict] | None = None  # For select: [{value, label}]
+
+
+class HumanInputRequest(BaseModel):
+    """Schema for an agent's human input request."""
+    prompt: str
+    fields: list[HumanInputField]
+
+
+class HumanInputResponse(BaseModel):
+    """Schema for a human's response to an input request."""
+    fields: dict[str, str | bool | list[str]]
+
+
 class FeatureListResponse(BaseModel):
    """Response containing list of features organized by status."""
    pending: list[FeatureResponse]
    in_progress: list[FeatureResponse]
    done: list[FeatureResponse]
+    needs_human_input: list[FeatureResponse] = Field(default_factory=list)


 class FeatureBulkCreate(BaseModel):
@@ -153,7 +178,7 @@ class DependencyGraphNode(BaseModel):
    id: int
    name: str
    category: str
-    status: Literal["pending", "in_progress", "done", "blocked"]
+    status: Literal["pending", "in_progress", "done", "blocked", "needs_human_input"]
    priority: int
    dependencies: list[int]

@@ -190,9 +215,12 @@ class AgentStartRequest(BaseModel):
    @field_validator('model')
    @classmethod
    def validate_model(cls, v: str | None) -> str | None:
-        """Validate model is in the allowed list."""
+        """Validate model is in the allowed list (Claude) or allow any model for alternative providers."""
        if v is not None and v not in VALID_MODELS:
-            raise ValueError(f"Invalid model. Must be one of: {VALID_MODELS}")
+            from registry import get_all_settings
+            settings = get_all_settings()
+            if settings.get("api_provider", "claude") == "claude":
+                raise ValueError(f"Invalid model. Must be one of: {VALID_MODELS}")
        return v

    @field_validator('max_concurrency')
@@ -214,7 +242,7 @@ class AgentStartRequest(BaseModel):

 class AgentStatus(BaseModel):
    """Current agent status."""
-    status: Literal["stopped", "running", "paused", "crashed"]
+    status: Literal["stopped", "running", "paused", "crashed", "pausing", "paused_graceful"]
    pid: int | None = None
    started_at: datetime | None = None
    yolo_mode: bool = False
@@ -254,6 +282,7 @@ class WSProgressMessage(BaseModel):
    in_progress: int
    total: int
    percentage: float
+    needs_human_input: int = 0


 class WSFeatureUpdateMessage(BaseModel):
@@ -391,15 +420,35 @@ class ModelInfo(BaseModel):
    name: str


+class ProviderInfo(BaseModel):
+    """Information about an API provider."""
+    id: str
+    name: str
+    base_url: str | None = None
+    models: list[ModelInfo]
+    default_model: str
+    requires_auth: bool = False
+
+
+class ProvidersResponse(BaseModel):
+    """Response schema for available providers list."""
+    providers: list[ProviderInfo]
+    current: str
+
+
 class SettingsResponse(BaseModel):
    """Response schema for global settings."""
    yolo_mode: bool = False
    model: str = DEFAULT_MODEL
-    glm_mode: bool = False  # True if GLM API is configured via .env
-    ollama_mode: bool = False  # True if Ollama API is configured via .env
+    glm_mode: bool = False  # True when api_provider is "glm"
+    ollama_mode: bool = False  # True when api_provider is "ollama"
    testing_agent_ratio: int = 1  # Regression testing agents (0-3)
    playwright_headless: bool = True
    batch_size: int = 3  # Features per coding agent batch (1-3)
+    api_provider: str = "claude"
+    api_base_url: str | None = None
+    api_has_auth_token: bool = False  # Never expose actual token
+    api_model: str | None = None


 class ModelsResponse(BaseModel):
@@ -415,12 +464,30 @@ class SettingsUpdate(BaseModel):
    testing_agent_ratio: int | None = None  # 0-3
    playwright_headless: bool | None = None
    batch_size: int | None = None  # Features per agent batch (1-3)
+    api_provider: str | None = None
+    api_base_url: str | None = Field(None, max_length=500)
+    api_auth_token: str | None = Field(None, max_length=500)  # Write-only, never returned
+    api_model: str | None = Field(None, max_length=200)
+
+    @field_validator('api_base_url')
+    @classmethod
+    def validate_api_base_url(cls, v: str | None) -> str | None:
+        if v is not None and v.strip():
+            v = v.strip()
+            if not v.startswith(("http://", "https://")):
+                raise ValueError("api_base_url must start with http:// or https://")
+        return v

    @field_validator('model')
    @classmethod
-    def validate_model(cls, v: str | None) -> str | None:
-        if v is not None and v not in VALID_MODELS:
-            raise ValueError(f"Invalid model. Must be one of: {VALID_MODELS}")
+    def validate_model(cls, v: str | None, info) -> str | None:  # type: ignore[override]
+        if v is not None:
+            # Skip VALID_MODELS check when using an alternative API provider
+            api_provider = info.data.get("api_provider")
+            if api_provider and api_provider != "claude":
+                return v
+            if v not in VALID_MODELS:
+                raise ValueError(f"Invalid model. Must be one of: {VALID_MODELS}")
        return v

    @field_validator('testing_agent_ratio')
@@ -533,9 +600,12 @@ class ScheduleCreate(BaseModel):
    @field_validator('model')
    @classmethod
    def validate_model(cls, v: str | None) -> str | None:
-        """Validate model is in the allowed list."""
+        """Validate model is in the allowed list (Claude) or allow any model for alternative providers."""
        if v is not None and v not in VALID_MODELS:
-            raise ValueError(f"Invalid model. Must be one of: {VALID_MODELS}")
+            from registry import get_all_settings
+            settings = get_all_settings()
+            if settings.get("api_provider", "claude") == "claude":
+                raise ValueError(f"Invalid model. Must be one of: {VALID_MODELS}")
        return v


@@ -555,9 +625,12 @@ class ScheduleUpdate(BaseModel):
    @field_validator('model')
    @classmethod
    def validate_model(cls, v: str | None) -> str | None:
-        """Validate model is in the allowed list."""
+        """Validate model is in the allowed list (Claude) or allow any model for alternative providers."""
        if v is not None and v not in VALID_MODELS:
-            raise ValueError(f"Invalid model. Must be one of: {VALID_MODELS}")
+            from registry import get_all_settings
+            settings = get_all_settings()
+            if settings.get("api_provider", "claude") == "claude":
+                raise ValueError(f"Invalid model. Must be one of: {VALID_MODELS}")
        return v
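The commit message (item B) says the MCP layer validates the field type enum, requires options for select fields, and enforces unique non-empty field IDs and labels. The MCP-side code is not part of this file, so the following is only a minimal stdlib sketch of those rules. The function name `validate_human_input_fields` and the error strings are hypothetical, and fields are assumed to be plain dicts shaped like `HumanInputField`.

```python
from typing import Any

# Mirrors the Literal on HumanInputField.type in the schema above.
ALLOWED_FIELD_TYPES = {"text", "textarea", "select", "boolean"}


def validate_human_input_fields(fields: list[dict[str, Any]]) -> list[str]:
    """Return validation errors; an empty list means the fields are acceptable.

    Hypothetical helper: the real MCP validation may differ in detail.
    """
    errors: list[str] = []
    seen_ids: set[str] = set()
    seen_labels: set[str] = set()

    for index, field in enumerate(fields):
        field_id = str(field.get("id", "")).strip()
        label = str(field.get("label", "")).strip()
        field_type = field.get("type", "text")

        # IDs must be non-empty and unique across the request.
        if not field_id:
            errors.append(f"field {index}: id must be non-empty")
        elif field_id in seen_ids:
            errors.append(f"field {index}: duplicate id '{field_id}'")
        else:
            seen_ids.add(field_id)

        # Labels follow the same rule as IDs.
        if not label:
            errors.append(f"field {index}: label must be non-empty")
        elif label in seen_labels:
            errors.append(f"field {index}: duplicate label '{label}'")
        else:
            seen_labels.add(label)

        # The type must be one of the enum values.
        if field_type not in ALLOWED_FIELD_TYPES:
            errors.append(f"field {index}: invalid type '{field_type}'")

        # Select fields need at least one option to choose from.
        if field_type == "select" and not field.get("options"):
            errors.append(f"field {index}: select requires options")

    return errors
```

Returning a list of errors instead of raising on the first one lets a caller report every problem in a single round trip. Whether the real tool does that is not visible in this diff.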


--- a/server/services/assistant_chat_session.py
+++ b/server/services/assistant_chat_session.py
@@ -25,7 +25,7 @@ from .assistant_database import (
    create_conversation,
    get_messages,
 )
-from .chat_constants import API_ENV_VARS, ROOT_DIR
+from .chat_constants import ROOT_DIR

 # Load environment variables from .env file if present
 load_dotenv()
@@ -47,8 +47,13 @@ FEATURE_MANAGEMENT_TOOLS = [
    "mcp__features__feature_skip",
 ]

+# Interactive tools
+INTERACTIVE_TOOLS = [
+    "mcp__features__ask_user",
+]
+
 # Combined list for assistant
-ASSISTANT_FEATURE_TOOLS = READONLY_FEATURE_MCP_TOOLS + FEATURE_MANAGEMENT_TOOLS
+ASSISTANT_FEATURE_TOOLS = READONLY_FEATURE_MCP_TOOLS + FEATURE_MANAGEMENT_TOOLS + INTERACTIVE_TOOLS

 # Read-only built-in tools (no Write, Edit, Bash)
 READONLY_BUILTIN_TOOLS = [
@@ -123,6 +128,9 @@ If the user asks you to modify code, explain that you're a project assistant and
 - **feature_create_bulk**: Create multiple features at once
 - **feature_skip**: Move a feature to the end of the queue

+**Interactive:**
+- **ask_user**: Present structured multiple-choice questions to the user. Use this when you need to clarify requirements, offer design choices, or guide a decision. The user sees clickable option buttons and their selection is returned as your next message.
+
 ## Creating Features

 When a user asks to add a feature, use the `feature_create` or `feature_create_bulk` MCP tools directly:
@@ -157,7 +165,7 @@ class AssistantChatSession:
    """
    Manages a read-only assistant conversation for a project.

-    Uses Claude Opus 4.5 with only read-only tools enabled.
+    Uses Claude Opus with only read-only tools enabled.
    Persists conversation history to SQLite.
    """

@@ -258,15 +266,11 @@ class AssistantChatSession:
        system_cli = shutil.which("claude")

        # Build environment overrides for API configuration
-        sdk_env: dict[str, str] = {}
-        for var in API_ENV_VARS:
-            value = os.getenv(var)
-            if value:
-                sdk_env[var] = value
+        from registry import DEFAULT_MODEL, get_effective_sdk_env
+        sdk_env = get_effective_sdk_env()

-        # Determine model from environment or use default
-        # This allows using alternative APIs (e.g., GLM via z.ai) that may not support Claude model names
-        model = os.getenv("ANTHROPIC_DEFAULT_OPUS_MODEL", "claude-opus-4-5-20251101")
+        # Determine model from SDK env (provider-aware), or fall back to env/default
+        model = sdk_env.get("ANTHROPIC_DEFAULT_OPUS_MODEL") or os.getenv("ANTHROPIC_DEFAULT_OPUS_MODEL", DEFAULT_MODEL)

        try:
            logger.info("Creating ClaudeSDKClient...")
@@ -406,6 +410,17 @@ class AssistantChatSession:
                    elif block_type == "ToolUseBlock" and hasattr(block, "name"):
                        tool_name = block.name
                        tool_input = getattr(block, "input", {})
+
+                        # Intercept ask_user tool calls -> yield as question message
+                        if tool_name == "mcp__features__ask_user":
+                            questions = tool_input.get("questions", [])
+                            if questions:
+                                yield {
+                                    "type": "question",
+                                    "questions": questions,
+                                }
+                                continue
+
                        yield {
                            "type": "tool_call",
                            "tool": tool_name,
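The interception added in the hunk above can be read in isolation as a small generator: an `ask_user` call with a non-empty `questions` list becomes a `question` message, and everything else (including an `ask_user` call with no questions) still falls through as a `tool_call`. This sketch assumes tool blocks are plain dicts rather than SDK objects, and the `input` key on the `tool_call` message is an assumption, since the diff context cuts off before the end of that dict.

```python
from typing import Any, Iterator

ASK_USER_TOOL = "mcp__features__ask_user"


def tool_blocks_to_messages(blocks: list[dict[str, Any]]) -> Iterator[dict[str, Any]]:
    """Translate tool-use blocks into UI messages (illustrative sketch only)."""
    for block in blocks:
        tool_name = block.get("name", "")
        tool_input = block.get("input") or {}

        # ask_user with at least one question is surfaced as a question message.
        if tool_name == ASK_USER_TOOL:
            questions = tool_input.get("questions", [])
            if questions:
                yield {"type": "question", "questions": questions}
                continue

        # Everything else is reported as an ordinary tool call.
        yield {"type": "tool_call", "tool": tool_name, "input": tool_input}
```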
--- a/server/services/expand_chat_session.py
+++ b/server/services/expand_chat_session.py
@@ -22,7 +22,7 @@ from claude_agent_sdk import ClaudeAgentOptions, ClaudeSDKClient
 from dotenv import load_dotenv

 from ..schemas import ImageAttachment
-from .chat_constants import API_ENV_VARS, ROOT_DIR, make_multimodal_message
+from .chat_constants import ROOT_DIR, make_multimodal_message

 # Load environment variables from .env file if present
 load_dotenv()
@@ -154,16 +154,11 @@ class ExpandChatSession:
        system_prompt = skill_content.replace("$ARGUMENTS", project_path)

        # Build environment overrides for API configuration
-        # Filter to only include vars that are actually set (non-None)
-        sdk_env: dict[str, str] = {}
-        for var in API_ENV_VARS:
-            value = os.getenv(var)
-            if value:
-                sdk_env[var] = value
+        from registry import DEFAULT_MODEL, get_effective_sdk_env
+        sdk_env = get_effective_sdk_env()

-        # Determine model from environment or use default
-        # This allows using alternative APIs (e.g., GLM via z.ai) that may not support Claude model names
-        model = os.getenv("ANTHROPIC_DEFAULT_OPUS_MODEL", "claude-opus-4-5-20251101")
+        # Determine model from SDK env (provider-aware), or fall back to env/default
+        model = sdk_env.get("ANTHROPIC_DEFAULT_OPUS_MODEL") or os.getenv("ANTHROPIC_DEFAULT_OPUS_MODEL", DEFAULT_MODEL)

        # Build MCP servers config for feature creation
        mcp_servers = {
--- a/server/services/process_manager.py
+++ b/server/services/process_manager.py
@@ -77,7 +77,7 @@ class AgentProcessManager:
        self.project_dir = project_dir
        self.root_dir = root_dir
        self.process: subprocess.Popen | None = None
-        self._status: Literal["stopped", "running", "paused", "crashed"] = "stopped"
+        self._status: Literal["stopped", "running", "paused", "crashed", "pausing", "paused_graceful"] = "stopped"
        self.started_at: datetime | None = None
        self._output_task: asyncio.Task | None = None
        self.yolo_mode: bool = False  # YOLO mode for rapid prototyping
@@ -96,11 +96,11 @@ class AgentProcessManager:
        self.lock_file = get_agent_lock_path(self.project_dir)

    @property
-    def status(self) -> Literal["stopped", "running", "paused", "crashed"]:
+    def status(self) -> Literal["stopped", "running", "paused", "crashed", "pausing", "paused_graceful"]:
        return self._status

    @status.setter
-    def status(self, value: Literal["stopped", "running", "paused", "crashed"]):
+    def status(self, value: Literal["stopped", "running", "paused", "crashed", "pausing", "paused_graceful"]):
        old_status = self._status
        self._status = value
        if old_status != value:
@@ -227,6 +227,68 @@ class AgentProcessManager:
        """Remove lock file."""
        self.lock_file.unlink(missing_ok=True)

+    def _apply_playwright_headless(self, headless: bool) -> None:
+        """Update .playwright/cli.config.json with the current headless setting.
+
+        playwright-cli reads this config file on each ``open`` command, so
+        updating it before the agent starts is sufficient.
+        """
+        config_file = self.project_dir / ".playwright" / "cli.config.json"
+        if not config_file.exists():
+            return
+        try:
+            import json
+            config = json.loads(config_file.read_text(encoding="utf-8"))
+            launch_opts = config.get("browser", {}).get("launchOptions", {})
+            if launch_opts.get("headless") == headless:
+                return  # already correct
+            launch_opts["headless"] = headless
+            config.setdefault("browser", {})["launchOptions"] = launch_opts
+            config_file.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
+            logger.info("Set playwright headless=%s for %s", headless, self.project_name)
+        except Exception:
+            logger.warning("Failed to update playwright config", exc_info=True)
+
+    def _cleanup_stale_features(self) -> None:
+        """Clear the in_progress flag on unfinished features when the agent stops or crashes.
+
+        When the agent process exits (normally or crash), any features left
+        with in_progress=True were being worked on and didn't complete.
+        Reset them so they can be picked up on next agent start.
+        """
+        try:
+            from autoforge_paths import get_features_db_path
+            features_db = get_features_db_path(self.project_dir)
+            if not features_db.exists():
+                return
+
+            from sqlalchemy import create_engine
+            from sqlalchemy.orm import sessionmaker
+
+            from api.database import Feature
+
+            engine = create_engine(f"sqlite:///{features_db}")
+            Session = sessionmaker(bind=engine)
+            session = Session()
+            try:
+                stuck = session.query(Feature).filter(
+                    Feature.in_progress == True,  # noqa: E712
+                    Feature.passes == False,  # noqa: E712
+                ).all()
+                if stuck:
+                    for f in stuck:
+                        f.in_progress = False  # type: ignore[assignment]
+                    session.commit()
+                    logger.info(
+                        "Cleaned up %d stuck feature(s) for %s",
+                        len(stuck), self.project_name,
+                    )
+            finally:
+                session.close()
+                engine.dispose()
+        except Exception as e:
+            logger.warning("Failed to cleanup features for %s: %s", self.project_name, e)
+
    async def _broadcast_output(self, line: str) -> None:
        """Broadcast output line to all registered callbacks."""
        with self._callbacks_lock:
@@ -268,6 +330,12 @@ class AgentProcessManager:
                    for help_line in AUTH_ERROR_HELP.strip().split('\n'):
                        await self._broadcast_output(help_line)

+                # Detect graceful pause status transitions from orchestrator output
+                if "All agents drained - paused." in decoded:
+                    self.status = "paused_graceful"
+                elif "Resuming from graceful pause..." in decoded:
+                    self.status = "running"
+
                await self._broadcast_output(sanitized)

        except asyncio.CancelledError:
@@ -278,7 +346,7 @@ class AgentProcessManager:
            # Check if process ended
            if self.process and self.process.poll() is not None:
                exit_code = self.process.returncode
-                if exit_code != 0 and self.status == "running":
+                if exit_code != 0 and self.status in ("running", "pausing", "paused_graceful"):
                    # Check buffered output for auth errors if we haven't detected one yet
                    if not auth_error_detected:
                        combined_output = '\n'.join(output_buffer)
@@ -286,9 +354,16 @@ class AgentProcessManager:
                            for help_line in AUTH_ERROR_HELP.strip().split('\n'):
                                await self._broadcast_output(help_line)
                    self.status = "crashed"
-                elif self.status == "running":
+                elif self.status in ("running", "pausing", "paused_graceful"):
                    self.status = "stopped"
+                self._cleanup_stale_features()
                self._remove_lock()
+                # Clean up drain signal file if present
+                try:
+                    from autoforge_paths import get_pause_drain_path
+                    get_pause_drain_path(self.project_dir).unlink(missing_ok=True)
+                except Exception:
+                    pass

    async def start(
        self,
@@ -305,7 +380,7 @@ class AgentProcessManager:

        Args:
            yolo_mode: If True, run in YOLO mode (skip testing agents)
-            model: Model to use (e.g., claude-opus-4-5-20251101)
+            model: Model to use (e.g., claude-opus-4-6)
            parallel_mode: DEPRECATED - ignored, always uses unified orchestrator
            max_concurrency: Max concurrent coding agents (1-5, default 1)
            testing_agent_ratio: Number of regression testing agents (0-3, default 1)
@@ -314,12 +389,24 @@ class AgentProcessManager:
        Returns:
            Tuple of (success, message)
        """
-        if self.status in ("running", "paused"):
+        if self.status in ("running", "paused", "pausing", "paused_graceful"):
            return False, f"Agent is already {self.status}"

        if not self._check_lock():
            return False, "Another agent instance is already running for this project"

+        # Clean up stale browser daemons from previous runs
+        try:
+            subprocess.run(
+                ["playwright-cli", "kill-all"],
+                timeout=5, capture_output=True,
+            )
+        except (subprocess.TimeoutExpired, FileNotFoundError, OSError):
+            pass
+
+        # Clean up features stuck from a previous crash/stop
+        self._cleanup_stale_features()
+
        # Store for status queries
        self.yolo_mode = yolo_mode
        self.model = model
@@ -353,18 +440,33 @@ class AgentProcessManager:
        # Add --batch-size flag for multi-feature batching
        cmd.extend(["--batch-size", str(batch_size)])

+        # Apply headless setting to .playwright/cli.config.json so playwright-cli
+        # picks it up (the only mechanism it supports for headless control)
+        self._apply_playwright_headless(playwright_headless)
+
        try:
            # Start subprocess with piped stdout/stderr
            # Use project_dir as cwd so Claude SDK sandbox allows access to project files
            # stdin=DEVNULL prevents blocking if Claude CLI or child process tries to read stdin
            # CREATE_NO_WINDOW on Windows prevents console window pop-ups
            # PYTHONUNBUFFERED ensures output isn't delayed
+            # Build subprocess environment with API provider settings
+            from registry import get_effective_sdk_env
+            api_env = get_effective_sdk_env()
+            subprocess_env = {
+                **os.environ,
+                "PYTHONUNBUFFERED": "1",
+                "PLAYWRIGHT_CLI_SESSION": f"agent-{self.project_name}-{os.getpid()}",
+                "NODE_COMPILE_CACHE": "",  # Disable V8 compile caching to prevent .node file accumulation in %TEMP%
+                **api_env,
+            }
+
            popen_kwargs: dict[str, Any] = {
                "stdin": subprocess.DEVNULL,
                "stdout": subprocess.PIPE,
                "stderr": subprocess.STDOUT,
                "cwd": str(self.project_dir),
-                "env": {**os.environ, "PYTHONUNBUFFERED": "1", "PLAYWRIGHT_HEADLESS": "true" if playwright_headless else "false"},
+                "env": subprocess_env,
            }
            if sys.platform == "win32":
                popen_kwargs["creationflags"] = subprocess.CREATE_NO_WINDOW
@@ -414,6 +516,15 @@ class AgentProcessManager:
                except asyncio.CancelledError:
                    pass

+            # Kill browser daemons before stopping agent
+            try:
+                subprocess.run(
+                    ["playwright-cli", "kill-all"],
+                    timeout=5, capture_output=True,
+                )
+            except (subprocess.TimeoutExpired, FileNotFoundError, OSError):
+                pass
+
            # CRITICAL: Kill entire process tree, not just orchestrator
            # This ensures all spawned coding/testing agents are also terminated
            proc = self.process  # Capture reference before async call
@@ -425,7 +536,14 @@ class AgentProcessManager:
                result.children_terminated, result.children_killed
            )

+            self._cleanup_stale_features()
            self._remove_lock()
+            # Clean up drain signal file if present
+            try:
+                from autoforge_paths import get_pause_drain_path
+                get_pause_drain_path(self.project_dir).unlink(missing_ok=True)
+            except Exception:
+                pass
            self.status = "stopped"
            self.process = None
            self.started_at = None
@@ -486,6 +604,47 @@ class AgentProcessManager:
            logger.exception("Failed to resume agent")
            return False, f"Failed to resume agent: {e}"

+    async def graceful_pause(self) -> tuple[bool, str]:
+        """Request a graceful pause (drain mode).
+
+        Creates a signal file that the orchestrator polls. Running agents
+        finish their current work before the orchestrator enters a paused state.
+
+        Returns:
+            Tuple of (success, message)
+        """
+        if not self.process or self.status != "running":
+            return False, "Agent is not running"
+
+        try:
+            from autoforge_paths import get_pause_drain_path
+            drain_path = get_pause_drain_path(self.project_dir)
+            drain_path.parent.mkdir(parents=True, exist_ok=True)
+            drain_path.write_text(str(self.process.pid))
+            self.status = "pausing"
+            return True, "Graceful pause requested"
+        except Exception as e:
+            logger.exception("Failed to request graceful pause")
+            return False, f"Failed to request graceful pause: {e}"
+
+    async def graceful_resume(self) -> tuple[bool, str]:
+        """Resume from a graceful pause by removing the drain signal file.
+
+        Returns:
+            Tuple of (success, message)
+        """
+        if not self.process or self.status not in ("pausing", "paused_graceful"):
+            return False, "Agent is not in a graceful pause state"
+
+        try:
+            from autoforge_paths import get_pause_drain_path
+            get_pause_drain_path(self.project_dir).unlink(missing_ok=True)
+            self.status = "running"
+            return True, "Agent resumed from graceful pause"
+        except Exception as e:
+            logger.exception("Failed to resume from graceful pause")
+            return False, f"Failed to resume: {e}"
+
    async def healthcheck(self) -> bool:
        """
        Check if the agent process is still alive.
@@ -501,7 +660,14 @@ class AgentProcessManager:
        poll = self.process.poll()
        if poll is not None:
            # Process has terminated
-            if self.status in ("running", "paused"):
+            if self.status in ("running", "paused", "pausing", "paused_graceful"):
+                self._cleanup_stale_features()
+                # Clean up drain signal file if present
+                try:
+                    from autoforge_paths import get_pause_drain_path
+                    get_pause_drain_path(self.project_dir).unlink(missing_ok=True)
+                except Exception:
+                    pass
                self.status = "crashed"
                self._remove_lock()
            return False
@@ -586,8 +752,14 @@ def cleanup_orphaned_locks() -> int:
            if not project_path.exists():
                continue

+            # Clean up stale drain signal files
+            from autoforge_paths import get_autoforge_dir, get_pause_drain_path
+            drain_file = get_pause_drain_path(project_path)
+            if drain_file.exists():
+                drain_file.unlink(missing_ok=True)
+                logger.info("Removed stale drain signal file for project '%s'", name)
+
            # Check both legacy and new locations for lock files
-            from autoforge_paths import get_autoforge_dir
            lock_locations = [
                project_path / ".agent.lock",
                get_autoforge_dir(project_path) / ".agent.lock",
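The graceful pause in this file is a signal-file handshake: `graceful_pause` writes the drain file, the orchestrator polls for it, and `graceful_resume` (or any cleanup path) removes it. The sketch below shows only that file lifecycle, under stated assumptions: the helper names are hypothetical, the file location is made up for the demo (the real path comes from `get_pause_drain_path`), and the orchestrator side is reduced to an existence check.

```python
import os
import tempfile
from pathlib import Path


def request_drain(drain_path: Path, pid: int) -> None:
    """Server side: create the signal file, recording the orchestrator PID."""
    drain_path.parent.mkdir(parents=True, exist_ok=True)
    drain_path.write_text(str(pid), encoding="utf-8")


def clear_drain(drain_path: Path) -> None:
    """Server side: remove the signal file; safe to call when it is absent."""
    drain_path.unlink(missing_ok=True)


def drain_requested(drain_path: Path) -> bool:
    """Orchestrator side (assumed): poll for the signal file."""
    return drain_path.exists()


def demo() -> list[bool]:
    """Walk through request -> resume and record what a poller would see."""
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical location, used only for this demo.
        drain_path = Path(tmp) / ".autoforge" / ".pause_drain"
        observed = [drain_requested(drain_path)]
        request_drain(drain_path, os.getpid())
        observed.append(drain_requested(drain_path))
        clear_drain(drain_path)
        observed.append(drain_requested(drain_path))
        clear_drain(drain_path)  # second removal is a no-op
        return observed
```

Because removal is idempotent, the stop, crash, healthcheck and orphan-cleanup paths can all delete the file without coordinating with each other, which matches how the diff calls `unlink(missing_ok=True)` in each of them.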
--- a/server/services/spec_chat_session.py
+++ b/server/services/spec_chat_session.py
@@ -19,7 +19,7 @@ from claude_agent_sdk import ClaudeAgentOptions, ClaudeSDKClient
 from dotenv import load_dotenv

 from ..schemas import ImageAttachment
-from .chat_constants import API_ENV_VARS, ROOT_DIR, make_multimodal_message
+from .chat_constants import ROOT_DIR, make_multimodal_message

 # Load environment variables from .env file if present
 load_dotenv()
@@ -140,16 +140,11 @@ class SpecChatSession:
        system_cli = shutil.which("claude")

        # Build environment overrides for API configuration
-        # Filter to only include vars that are actually set (non-None)
-        sdk_env: dict[str, str] = {}
-        for var in API_ENV_VARS:
-            value = os.getenv(var)
-            if value:
-                sdk_env[var] = value
+        from registry import DEFAULT_MODEL, get_effective_sdk_env
+        sdk_env = get_effective_sdk_env()

-        # Determine model from environment or use default
-        # This allows using alternative APIs (e.g., GLM via z.ai) that may not support Claude model names
-        model = os.getenv("ANTHROPIC_DEFAULT_OPUS_MODEL", "claude-opus-4-5-20251101")
+        # Determine model from SDK env (provider-aware), or fall back to env/default
+        model = sdk_env.get("ANTHROPIC_DEFAULT_OPUS_MODEL") or os.getenv("ANTHROPIC_DEFAULT_OPUS_MODEL", DEFAULT_MODEL)

        try:
            self.client = ClaudeSDKClient(
--- a/server/websocket.py
+++ b/server/websocket.py
@@ -61,7 +61,7 @@ THOUGHT_PATTERNS = [
    (re.compile(r'(?:Testing|Verifying|Running tests|Validating)\s+(.+)', re.I), 'testing'),
    (re.compile(r'(?:Error|Failed|Cannot|Unable to|Exception)\s+(.+)', re.I), 'struggling'),
    # Test results
-    (re.compile(r'(?:PASS|passed|success)', re.I), 'success'),
+    (re.compile(r'(?:PASS|passed|success)', re.I), 'testing'),
    (re.compile(r'(?:FAIL|failed|error)', re.I), 'struggling'),
 ]

@@ -78,6 +78,9 @@ ORCHESTRATOR_PATTERNS = {
    'testing_complete': re.compile(r'Feature #(\d+) testing (completed|failed)'),
    'all_complete': re.compile(r'All features complete'),
    'blocked_features': re.compile(r'(\d+) blocked by dependencies'),
+    'drain_start': re.compile(r'Graceful pause requested'),
+    'drain_complete': re.compile(r'All agents drained'),
+    'drain_resume': re.compile(r'Resuming from graceful pause'),
 }


@@ -562,6 +565,30 @@ class OrchestratorTracker:
                    'All features complete!'
                )

+            # Graceful pause (drain mode) events
+            elif ORCHESTRATOR_PATTERNS['drain_start'].search(line):
+                self.state = 'draining'
+                update = self._create_update(
+                    'drain_start',
+                    'Draining active agents...'
+                )
+
+            elif ORCHESTRATOR_PATTERNS['drain_complete'].search(line):
+                self.state = 'paused'
+                self.coding_agents = 0
+                self.testing_agents = 0
+                update = self._create_update(
+                    'drain_complete',
+                    'All agents drained. Paused.'
+                )
+
+            elif ORCHESTRATOR_PATTERNS['drain_resume'].search(line):
+                self.state = 'scheduling'
+                update = self._create_update(
+                    'drain_resume',
+                    'Resuming feature scheduling'
+                )
+
            return update

    def _create_update(
@@ -640,9 +667,7 @@ class ConnectionManager:
        self._lock = asyncio.Lock()

    async def connect(self, websocket: WebSocket, project_name: str):
-        """Accept a WebSocket connection for a project."""
-        await websocket.accept()
-
+        """Register a WebSocket connection for a project (must already be accepted)."""
        async with self._lock:
            if project_name not in self.active_connections:
                self.active_connections[project_name] = set()
@@ -691,15 +716,19 @@ async def poll_progress(websocket: WebSocket, project_name: str, project_dir: Pa
    last_in_progress = -1
    last_total = -1

+    last_needs_human_input = -1
+
    while True:
        try:
-            passing, in_progress, total = count_passing_tests(project_dir)
+            passing, in_progress, total, needs_human_input = count_passing_tests(project_dir)

            # Only send if changed
-            if passing != last_passing or in_progress != last_in_progress or total != last_total:
+            if (passing != last_passing or in_progress != last_in_progress
+                    or total != last_total or needs_human_input != last_needs_human_input):
                last_passing = passing
                last_in_progress = in_progress
                last_total = total
+                last_needs_human_input = needs_human_input
                percentage = (passing / total * 100) if total > 0 else 0

                await websocket.send_json({
@@ -708,6 +737,7 @@ async def poll_progress(websocket: WebSocket, project_name: str, project_dir: Pa
                    "in_progress": in_progress,
                    "total": total,
                    "percentage": round(percentage, 1),
+                    "needs_human_input": needs_human_input,
                })

            await asyncio.sleep(2)  # Poll every 2 seconds
@@ -727,16 +757,22 @@ async def project_websocket(websocket: WebSocket, project_name: str):
    - Agent status changes
    - Agent stdout/stderr lines
    """
+    # Always accept WebSocket first to avoid opaque 403 errors
+    await websocket.accept()
+
    if not validate_project_name(project_name):
+        await websocket.send_json({"type": "error", "content": "Invalid project name"})
        await websocket.close(code=4000, reason="Invalid project name")
        return

    project_dir = _get_project_path(project_name)
    if not project_dir:
+        await websocket.send_json({"type": "error", "content": "Project not found in registry"})
        await websocket.close(code=4004, reason="Project not found in registry")
        return

    if not project_dir.exists():
+        await websocket.send_json({"type": "error", "content": "Project directory not found"})
        await websocket.close(code=4004, reason="Project directory not found")
        return

@@ -854,7 +890,7 @@ async def project_websocket(websocket: WebSocket, project_name: str):

        # Send initial progress
        count_passing_tests = _get_count_passing_tests()
-        passing, in_progress, total = count_passing_tests(project_dir)
+        passing, in_progress, total, needs_human_input = count_passing_tests(project_dir)
        percentage = (passing / total * 100) if total > 0 else 0
        await websocket.send_json({
            "type": "progress",
@@ -862,6 +898,7 @@ async def project_websocket(websocket: WebSocket, project_name: str):
            "in_progress": in_progress,
            "total": total,
            "percentage": round(percentage, 1),
+            "needs_human_input": needs_human_input,
        })

        # Keep connection alive and handle incoming messages
@@ -879,8 +916,7 @@ async def project_websocket(websocket: WebSocket, project_name: str):
                break
            except json.JSONDecodeError:
                logger.warning(f"Invalid JSON from WebSocket: {data[:100] if data else 'empty'}")
-            except Exception as e:
-                logger.warning(f"WebSocket error: {e}")
+            except Exception:
                break

    finally:
--- a/start.bat
+++ b/start.bat
@@ -54,5 +54,15 @@ REM Install dependencies
 echo Installing dependencies...
 pip install -r requirements.txt --quiet

+REM Ensure playwright-cli is available for browser automation
+where playwright-cli >nul 2>&1
+if %ERRORLEVEL% neq 0 (
+    echo Installing playwright-cli for browser automation...
+    call npm install -g @playwright/cli >nul 2>&1
+    if errorlevel 1 (
+        echo Note: Could not install playwright-cli. Install manually: npm install -g @playwright/cli
+    )
+)
+
 REM Run the app
 python start.py
--- a/start.py
+++ b/start.py
@@ -390,8 +390,11 @@ def run_agent(project_name: str, project_dir: Path) -> None:
    print(f"Location: {project_dir}")
    print("-" * 50)

-    # Build the command - pass absolute path
-    cmd = [sys.executable, "autonomous_agent_demo.py", "--project-dir", str(project_dir.resolve())]
+    # Build the command - pass absolute path and model from settings
+    from registry import DEFAULT_MODEL, get_all_settings
+    settings = get_all_settings()
+    model = settings.get("api_model") or settings.get("model", DEFAULT_MODEL)
+    cmd = [sys.executable, "autonomous_agent_demo.py", "--project-dir", str(project_dir.resolve()), "--model", model]

    # Run the agent with stderr capture to detect auth errors
    # stdout goes directly to terminal for real-time output
--- a/start.sh
+++ b/start.sh
@@ -74,5 +74,14 @@ fi
 echo "Installing dependencies..."
 pip install -r requirements.txt --quiet

+# Ensure playwright-cli is available for browser automation
+if ! command -v playwright-cli &> /dev/null; then
+    echo "Installing playwright-cli for browser automation..."
+    npm install -g @playwright/cli --quiet 2>/dev/null
+    if [ $? -ne 0 ]; then
+        echo "Note: Could not install playwright-cli. Install manually: npm install -g @playwright/cli"
+    fi
+fi
+
 # Run the app
 python start.py
--- a/start_ui.bat
+++ b/start_ui.bat
@@ -37,5 +37,15 @@ REM Install dependencies
 echo Installing dependencies...
 pip install -r requirements.txt --quiet

+REM Ensure playwright-cli is available for browser automation
+where playwright-cli >nul 2>&1
+if %ERRORLEVEL% neq 0 (
+    echo Installing playwright-cli for browser automation...
+    call npm install -g @playwright/cli >nul 2>&1
+    if errorlevel 1 (
+        echo Note: Could not install playwright-cli. Install manually: npm install -g @playwright/cli
+    )
+)
+
 REM Run the Python launcher
 python "%~dp0start_ui.py" %*
--- a/start_ui.sh
+++ b/start_ui.sh
@@ -80,5 +80,14 @@ fi
 echo "Installing dependencies..."
 pip install -r requirements.txt --quiet

+# Ensure playwright-cli is available for browser automation
+if ! command -v playwright-cli &> /dev/null; then
+    echo "Installing playwright-cli for browser automation..."
+    npm install -g @playwright/cli --quiet 2>/dev/null
+    if [ $? -ne 0 ]; then
+        echo "Note: Could not install playwright-cli. Install manually: npm install -g @playwright/cli"
+    fi
+fi
+
 # Run the Python launcher
 python start_ui.py "$@"
--- a/temp_cleanup.py
+++ b/temp_cleanup.py
@@ -37,11 +37,12 @@ DIR_PATTERNS = [
    "mongodb-memory-server*",           # MongoDB Memory Server binaries
    "ng-*",                             # Angular CLI temp directories
    "scoped_dir*",                      # Chrome/Chromium temp directories
+    "node-compile-cache",               # Node.js V8 compile cache directory
 ]

 # File patterns to clean up (glob patterns)
 FILE_PATTERNS = [
-    ".78912*.node",   # Node.js native module cache (major space consumer, ~7MB each)
+    ".[0-9a-f]*.node",   # Node.js/V8 compile cache files (~7MB each, varying hex prefixes)
    "claude-*-cwd",   # Claude CLI working directory temp files
    "mat-debug-*.log",  # Material/Angular debug logs
 ]
@@ -122,6 +123,78 @@ def cleanup_stale_temp(max_age_seconds: int = MAX_AGE_SECONDS) -> dict:
    return stats


+def cleanup_project_screenshots(project_dir: Path, max_age_seconds: int = 300) -> dict:
+    """
+    Clean up stale Playwright CLI artifacts from the project.
+
+    The Playwright CLI daemon saves screenshots, snapshots, and other artifacts
+    to `{project_dir}/.playwright-cli/`. This removes them after they've aged
+    out (default 5 minutes).
+
+    Also cleans up legacy screenshot patterns from the project root (from the
+    old Playwright MCP server approach).
+
+    Args:
+        project_dir: Path to the project directory.
+        max_age_seconds: Maximum age in seconds before an artifact is deleted.
+                        Defaults to 5 minutes (300 seconds).
+
+    Returns:
+        Dictionary with cleanup statistics (files_deleted, bytes_freed, errors).
+    """
+    cutoff_time = time.time() - max_age_seconds
+    stats: dict = {"files_deleted": 0, "bytes_freed": 0, "errors": []}
+
+    # Clean up .playwright-cli/ directory (new CLI approach)
+    playwright_cli_dir = project_dir / ".playwright-cli"
+    if playwright_cli_dir.exists():
+        for item in playwright_cli_dir.iterdir():
+            if not item.is_file():
+                continue
+            try:
+                mtime = item.stat().st_mtime
+                if mtime < cutoff_time:
+                    size = item.stat().st_size
+                    item.unlink(missing_ok=True)
+                    if not item.exists():
+                        stats["files_deleted"] += 1
+                        stats["bytes_freed"] += size
+                        logger.debug(f"Deleted playwright-cli artifact: {item}")
+            except Exception as e:
+                stats["errors"].append(f"Failed to delete {item}: {e}")
+                logger.debug(f"Failed to delete artifact {item}: {e}")
+
+    # Legacy cleanup: root-level screenshot patterns (from old MCP server approach)
+    legacy_patterns = [
+        "feature*-*.png",
+        "screenshot-*.png",
+        "step-*.png",
+    ]
+
+    for pattern in legacy_patterns:
+        for item in project_dir.glob(pattern):
+            if not item.is_file():
+                continue
+            try:
+                mtime = item.stat().st_mtime
+                if mtime < cutoff_time:
+                    size = item.stat().st_size
+                    item.unlink(missing_ok=True)
+                    if not item.exists():
+                        stats["files_deleted"] += 1
+                        stats["bytes_freed"] += size
+                        logger.debug(f"Deleted legacy screenshot: {item}")
+            except Exception as e:
+                stats["errors"].append(f"Failed to delete {item}: {e}")
+                logger.debug(f"Failed to delete screenshot {item}: {e}")
+
+    if stats["files_deleted"] > 0:
+        mb_freed = stats["bytes_freed"] / (1024 * 1024)
+        logger.info(f"Artifact cleanup: {stats['files_deleted']} files, {mb_freed:.1f} MB freed")
+
+    return stats
+
+
 def _get_dir_size(path: Path) -> int:
    """Get total size of a directory in bytes."""
    total = 0
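
Review note on `cleanup_project_screenshots`: both loops above apply the same rule, deleting a file only when its mtime is older than `time.time() - max_age_seconds`. The following is a minimal sketch of that rule in isolation, assuming nothing beyond the standard library. The helper name `delete_older_than` and the file names are hypothetical and are not part of this patch.

```python
import os
import tempfile
import time
from pathlib import Path


def delete_older_than(directory: Path, max_age_seconds: int) -> list[str]:
    """Delete regular files in `directory` whose mtime is older than the cutoff.

    Hypothetical helper for illustration only; it mirrors the mtime comparison
    used in the patch but is not the function the patch adds.
    """
    cutoff = time.time() - max_age_seconds
    deleted: list[str] = []
    for item in sorted(directory.iterdir()):
        # Skip subdirectories, as the patch does with `is_file()`.
        if item.is_file() and item.stat().st_mtime < cutoff:
            item.unlink(missing_ok=True)
            deleted.append(item.name)
    return deleted


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    old_file = root / "old.png"
    new_file = root / "new.png"
    old_file.write_bytes(b"x")
    new_file.write_bytes(b"x")

    # Backdate one file by 10 minutes so it falls past a 5 minute cutoff.
    stale = time.time() - 600
    os.utime(old_file, (stale, stale))

    removed = delete_older_than(root, max_age_seconds=300)
    remaining = sorted(p.name for p in root.iterdir())
```

Backdating with `os.utime` keeps the example deterministic without sleeping. The freshly written file survives because its mtime is newer than the cutoff.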
--- a/test_client.py
+++ b/test_client.py
@@ -40,15 +40,15 @@ class TestConvertModelForVertex(unittest.TestCase):
    def test_returns_model_unchanged_when_vertex_disabled(self):
        os.environ.pop("CLAUDE_CODE_USE_VERTEX", None)
        self.assertEqual(
-            convert_model_for_vertex("claude-opus-4-5-20251101"),
-            "claude-opus-4-5-20251101",
+            convert_model_for_vertex("claude-opus-4-6"),
+            "claude-opus-4-6",
        )

    def test_returns_model_unchanged_when_vertex_set_to_zero(self):
        os.environ["CLAUDE_CODE_USE_VERTEX"] = "0"
        self.assertEqual(
-            convert_model_for_vertex("claude-opus-4-5-20251101"),
-            "claude-opus-4-5-20251101",
+            convert_model_for_vertex("claude-opus-4-6"),
+            "claude-opus-4-6",
        )

    def test_returns_model_unchanged_when_vertex_set_to_empty(self):
@@ -60,13 +60,20 @@ class TestConvertModelForVertex(unittest.TestCase):

    # --- Vertex AI enabled: standard conversions ---

-    def test_converts_opus_model(self):
+    def test_converts_legacy_opus_model(self):
        os.environ["CLAUDE_CODE_USE_VERTEX"] = "1"
        self.assertEqual(
            convert_model_for_vertex("claude-opus-4-5-20251101"),
            "claude-opus-4-5@20251101",
        )

+    def test_opus_4_6_passthrough_on_vertex(self):
+        os.environ["CLAUDE_CODE_USE_VERTEX"] = "1"
+        self.assertEqual(
+            convert_model_for_vertex("claude-opus-4-6"),
+            "claude-opus-4-6",
+        )
+
    def test_converts_sonnet_model(self):
        os.environ["CLAUDE_CODE_USE_VERTEX"] = "1"
        self.assertEqual(
@@ -86,8 +93,8 @@ class TestConvertModelForVertex(unittest.TestCase):
    def test_already_vertex_format_unchanged(self):
        os.environ["CLAUDE_CODE_USE_VERTEX"] = "1"
        self.assertEqual(
-            convert_model_for_vertex("claude-opus-4-5@20251101"),
-            "claude-opus-4-5@20251101",
+            convert_model_for_vertex("claude-sonnet-4-5@20250929"),
+            "claude-sonnet-4-5@20250929",
        )

    def test_non_claude_model_unchanged(self):
@@ -100,8 +107,8 @@ class TestConvertModelForVertex(unittest.TestCase):
    def test_model_without_date_suffix_unchanged(self):
        os.environ["CLAUDE_CODE_USE_VERTEX"] = "1"
        self.assertEqual(
-            convert_model_for_vertex("claude-opus-4-5"),
-            "claude-opus-4-5",
+            convert_model_for_vertex("claude-opus-4-6"),
+            "claude-opus-4-6",
        )

    def test_empty_string_unchanged(self):
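
Review note on the Vertex tests: the cases above imply that a dated Claude model ID has its trailing `-YYYYMMDD` rewritten to `@YYYYMMDD` only when `CLAUDE_CODE_USE_VERTEX` is `"1"`, and that everything else passes through unchanged. The sketch below is one way to satisfy those cases. It is an assumption inferred from the tests, not the real `convert_model_for_vertex`, whose implementation is not shown in this diff; the name `sketch_convert_model_for_vertex` is hypothetical.

```python
import os
import re

# Assumption: a "dated" model ID is "claude-...-" followed by exactly 8 digits.
_DATED_MODEL = re.compile(r"^(claude-.+)-(\d{8})$")


def sketch_convert_model_for_vertex(model: str) -> str:
    """Hypothetical stand-in that reproduces the behavior the tests describe."""
    if os.environ.get("CLAUDE_CODE_USE_VERTEX") != "1":
        return model  # Vertex disabled, unset, "0", or empty: pass through
    match = _DATED_MODEL.match(model)
    if match is None:
        return model  # No date suffix, already "@" format, or not a Claude ID
    return f"{match.group(1)}@{match.group(2)}"


os.environ["CLAUDE_CODE_USE_VERTEX"] = "1"
legacy = sketch_convert_model_for_vertex("claude-opus-4-5-20251101")
undated = sketch_convert_model_for_vertex("claude-opus-4-6")
already = sketch_convert_model_for_vertex("claude-sonnet-4-5@20250929")

os.environ["CLAUDE_CODE_USE_VERTEX"] = "0"
disabled = sketch_convert_model_for_vertex("claude-opus-4-5-20251101")
os.environ.pop("CLAUDE_CODE_USE_VERTEX", None)
```

Under this reading, `claude-opus-4-6` needs no special case on Vertex: it has no date suffix, so the pattern does not match.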
--- a/test_security.py
+++ b/test_security.py
@@ -25,6 +25,7 @@ from security import (
    validate_chmod_command,
    validate_init_script,
    validate_pkill_command,
+    validate_playwright_command,
    validate_project_command,
 )

@@ -923,6 +924,70 @@ pkill_processes:
    return passed, failed


+def test_playwright_cli_validation():
+    """Test playwright-cli subcommand validation."""
+    print("\nTesting playwright-cli validation:\n")
+    passed = 0
+    failed = 0
+
+    # Test cases: (command, should_be_allowed, description)
+    test_cases = [
+        # Allowed cases
+        ("playwright-cli screenshot", True, "screenshot allowed"),
+        ("playwright-cli snapshot", True, "snapshot allowed"),
+        ("playwright-cli click e5", True, "click with ref"),
+        ("playwright-cli open http://localhost:3000", True, "open URL"),
+        ("playwright-cli -s=agent-1 click e5", True, "session flag with click"),
+        ("playwright-cli close", True, "close browser"),
+        ("playwright-cli goto http://localhost:3000/page", True, "goto URL"),
+        ("playwright-cli fill e3 'test value'", True, "fill form field"),
+        ("playwright-cli console", True, "console messages"),
+        # Blocked cases
+        ("playwright-cli run-code 'await page.evaluate(() => {})'", False, "run-code blocked"),
+        ("playwright-cli eval 'document.title'", False, "eval blocked"),
+        ("playwright-cli -s=test eval 'document.title'", False, "eval with session flag blocked"),
+    ]
+
+    for cmd, should_allow, description in test_cases:
+        allowed, reason = validate_playwright_command(cmd)
+        if allowed == should_allow:
+            print(f"  PASS: {cmd!r} ({description})")
+            passed += 1
+        else:
+            expected = "allowed" if should_allow else "blocked"
+            actual = "allowed" if allowed else "blocked"
+            print(f"  FAIL: {cmd!r} ({description})")
+            print(f"         Expected: {expected}, Got: {actual}")
+            if reason:
+                print(f"         Reason: {reason}")
+            failed += 1
+
+    # Integration test: verify through the security hook
+    print("\n  Integration tests (via security hook):\n")
+
+    # playwright-cli screenshot should be allowed
+    input_data = {"tool_name": "Bash", "tool_input": {"command": "playwright-cli screenshot"}}
+    result = asyncio.run(bash_security_hook(input_data))
+    if result.get("decision") != "block":
+        print("  PASS: playwright-cli screenshot allowed via hook")
+        passed += 1
+    else:
+        print(f"  FAIL: playwright-cli screenshot should be allowed: {result.get('reason')}")
+        failed += 1
+
+    # playwright-cli run-code should be blocked
+    input_data = {"tool_name": "Bash", "tool_input": {"command": "playwright-cli run-code 'code'"}}
+    result = asyncio.run(bash_security_hook(input_data))
+    if result.get("decision") == "block":
+        print("  PASS: playwright-cli run-code blocked via hook")
+        passed += 1
+    else:
+        print("  FAIL: playwright-cli run-code should be blocked via hook")
+        failed += 1
+
+    return passed, failed
+
+
 def main():
    print("=" * 70)
    print("  SECURITY HOOK TESTS")
@@ -991,6 +1056,11 @@ def main():
    passed += pkill_passed
    failed += pkill_failed

+    # Test playwright-cli validation
+    pw_passed, pw_failed = test_playwright_cli_validation()
+    passed += pw_passed
+    failed += pw_failed
+
    # Commands that SHOULD be blocked
    # Note: blocklisted commands (sudo, shutdown, dd, aws) are tested in
    # test_blocklist_enforcement(). chmod validation is tested in
@@ -1012,6 +1082,9 @@ def main():
        # Shell injection attempts
        "$(echo pkill) node",
        'eval "pkill node"',
+        # playwright-cli dangerous subcommands
+        "playwright-cli run-code 'await page.goto(\"http://evil.com\")'",
+        "playwright-cli eval 'document.cookie'",
    ]

    for cmd in dangerous:
@@ -1077,6 +1150,12 @@ def main():
        "/usr/local/bin/node app.js",
        # Combined chmod and init.sh (integration test for both validators)
        "chmod +x init.sh && ./init.sh",
+        # Playwright CLI allowed commands
+        "playwright-cli open http://localhost:3000",
+        "playwright-cli screenshot",
+        "playwright-cli snapshot",
+        "playwright-cli click e5",
+        "playwright-cli -s=agent-1 close",
    ]

    for cmd in safe:
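
Review note on the playwright-cli tests: the cases above block `run-code` and `eval` even when a session flag such as `-s=agent-1` precedes the subcommand. The sketch below shows one way such a check could work. It is an assumption based only on these test cases; the real `validate_playwright_command` lives in `security.py` and is not shown in this diff. The name `sketch_validate_playwright` and the blocklist contents are hypothetical.

```python
import shlex

# Assumption: only the two subcommands the tests mark as blocked.
BLOCKED_SUBCOMMANDS = {"run-code", "eval"}


def sketch_validate_playwright(command: str) -> tuple[bool, str]:
    """Return (allowed, reason) for a playwright-cli command string."""
    try:
        tokens = shlex.split(command)
    except ValueError:
        return False, "Could not parse command"
    if not tokens or tokens[0] != "playwright-cli":
        return False, "Not a playwright-cli command"

    # Skip leading flags such as -s=agent-1; the first non-flag token is
    # treated as the subcommand.
    subcommand = next((t for t in tokens[1:] if not t.startswith("-")), None)
    if subcommand is None:
        return False, "Missing subcommand"
    if subcommand in BLOCKED_SUBCOMMANDS:
        return False, f"Subcommand '{subcommand}' is not allowed"
    return True, ""


click_result = sketch_validate_playwright("playwright-cli -s=agent-1 click e5")
eval_result = sketch_validate_playwright("playwright-cli -s=test eval 'document.title'")
run_code_result = sketch_validate_playwright(
    "playwright-cli run-code 'await page.evaluate(() => {})'"
)
```

This sketch would misread a flag whose value is a separate token (for example `-s agent-1`) as the subcommand `agent-1`. A production validator would likely need an explicit list of value-taking flags, or an allowlist of subcommands rather than a blocklist.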
--- a/ui/e2e/tooltip.spec.ts
+++ b/ui/e2e/tooltip.spec.ts
@@ -0,0 +1,47 @@
+import { test, expect } from '@playwright/test'
+
+/**
+ * E2E tooltip tests for header icon buttons.
+ *
+ * Run tests:
+ *   cd ui && npm run test:e2e
+ *   cd ui && npm run test:e2e -- tooltip.spec.ts
+ */
+test.describe('Header tooltips', () => {
+  test.setTimeout(30000)
+
+  test.beforeEach(async ({ page }) => {
+    await page.goto('/')
+    await page.waitForSelector('button:has-text("Select Project")', { timeout: 10000 })
+  })
+
+  async function selectProject(page: import('@playwright/test').Page) {
+    const projectSelector = page.locator('button:has-text("Select Project")')
+    if (await projectSelector.isVisible()) {
+      await projectSelector.click()
+      const items = page.locator('.neo-dropdown-item')
+      const itemCount = await items.count()
+      if (itemCount === 0) return false
+      await items.first().click()
+      await expect(projectSelector).not.toBeVisible({ timeout: 5000 }).catch(() => {})
+      return true
+    }
+    return false
+  }
+
+  test('Settings tooltip shows on hover', async ({ page }) => {
+    const hasProject = await selectProject(page)
+    if (!hasProject) {
+      test.skip(true, 'No projects available')
+      return
+    }
+
+    const settingsButton = page.locator('button[aria-label="Open Settings"]')
+    await expect(settingsButton).toBeVisible()
+
+    await settingsButton.hover()
+
+    const tooltip = page.locator('[data-slot="tooltip-content"]', { hasText: 'Settings' })
+    await expect(tooltip).toBeVisible({ timeout: 2000 })
+  })
+})
--- a/ui/package-lock.json
+++ b/ui/package-lock.json
--- a/ui/package.json
+++ b/ui/package.json
@@ -19,6 +19,7 @@
    "@radix-ui/react-separator": "^1.1.8",
    "@radix-ui/react-slot": "^1.2.4",
    "@radix-ui/react-switch": "^1.2.6",
+    "@radix-ui/react-tooltip": "^1.2.8",
    "@tanstack/react-query": "^5.72.0",
    "@xterm/addon-fit": "^0.11.0",
    "@xterm/addon-web-links": "^0.12.0",
@@ -32,6 +33,8 @@
    "lucide-react": "^0.475.0",
    "react": "^19.0.0",
    "react-dom": "^19.0.0",
+    "react-markdown": "^10.1.0",
+    "remark-gfm": "^4.0.1",
    "tailwind-merge": "^3.4.0"
  },
  "devDependencies": {
--- a/ui/src/App.tsx
+++ b/ui/src/App.tsx
@@ -33,6 +33,7 @@ import type { Feature } from './lib/types'
 import { Button } from '@/components/ui/button'
 import { Card, CardContent } from '@/components/ui/card'
 import { Badge } from '@/components/ui/badge'
+import { TooltipProvider, Tooltip, TooltipTrigger, TooltipContent } from '@/components/ui/tooltip'

 const STORAGE_KEY = 'autoforge-selected-project'
 const VIEW_MODE_KEY = 'autoforge-view-mode'
@@ -129,7 +130,8 @@ function App() {
    const allFeatures = [
      ...(features?.pending ?? []),
      ...(features?.in_progress ?? []),
-      ...(features?.done ?? [])
+      ...(features?.done ?? []),
+      ...(features?.needs_human_input ?? [])
    ]
    const feature = allFeatures.find(f => f.id === nodeId)
    if (feature) setSelectedFeature(feature)
@@ -178,9 +180,9 @@ function App() {
        setShowAddFeature(true)
      }

-      // E : Expand project with AI (when project selected and has features)
-      if ((e.key === 'e' || e.key === 'E') && selectedProject && features &&
-          (features.pending.length + features.in_progress.length + features.done.length) > 0) {
+      // E : Expand project with AI (when project selected, has spec and has features)
+      if ((e.key === 'e' || e.key === 'E') && selectedProject && hasSpec && features &&
+          (features.pending.length + features.in_progress.length + features.done.length + (features.needs_human_input?.length || 0)) > 0) {
        e.preventDefault()
        setShowExpandProject(true)
      }
@@ -209,8 +211,8 @@ function App() {
        setShowKeyboardHelp(true)
      }

-      // R : Open reset modal (when project selected and agent not running)
-      if ((e.key === 'r' || e.key === 'R') && selectedProject && wsState.agentStatus !== 'running') {
+      // R : Open reset modal (when project selected and agent not running/draining)
+      if ((e.key === 'r' || e.key === 'R') && selectedProject && !['running', 'pausing', 'paused_graceful'].includes(wsState.agentStatus)) {
        e.preventDefault()
        setShowResetModal(true)
      }
@@ -239,12 +241,12 @@ function App() {

    window.addEventListener('keydown', handleKeyDown)
    return () => window.removeEventListener('keydown', handleKeyDown)
-  }, [selectedProject, showAddFeature, showExpandProject, selectedFeature, debugOpen, debugActiveTab, assistantOpen, features, showSettings, showKeyboardHelp, isSpecCreating, viewMode, showResetModal, wsState.agentStatus])
+  }, [selectedProject, showAddFeature, showExpandProject, selectedFeature, debugOpen, debugActiveTab, assistantOpen, features, showSettings, showKeyboardHelp, isSpecCreating, viewMode, showResetModal, wsState.agentStatus, hasSpec])

  // Combine WebSocket progress with feature data
  const progress = wsState.progress.total > 0 ? wsState.progress : {
    passing: features?.done.length ?? 0,
-    total: (features?.pending.length ?? 0) + (features?.in_progress.length ?? 0) + (features?.done.length ?? 0),
+    total: (features?.pending.length ?? 0) + (features?.in_progress.length ?? 0) + (features?.done.length ?? 0) + (features?.needs_human_input?.length ?? 0),
    percentage: 0,
  }

@@ -260,18 +262,19 @@ function App() {
    <div className="min-h-screen bg-background">
      {/* Header */}
      <header className="sticky top-0 z-50 bg-card/80 backdrop-blur-md text-foreground border-b-2 border-border">
-        <div className="max-w-7xl mx-auto px-4 py-4">
-          <div className="flex items-center justify-between">
-            {/* Logo and Title */}
+        <div className="max-w-7xl mx-auto px-4 py-3">
+          <TooltipProvider>
+            {/* Row 1: Branding + Project + Utility icons */}
            <div className="flex items-center gap-3">
-              <img src="/logo.png" alt="AutoForge" className="h-9 w-9 rounded-full" />
-              <h1 className="font-display text-2xl font-bold tracking-tight uppercase">
-                AutoForge
-              </h1>
-            </div>
+              {/* Logo and Title */}
+              <div className="flex items-center gap-2 shrink-0">
+                <img src="/logo.png" alt="AutoForge" className="h-9 w-9 rounded-full" />
+                <h1 className="font-display text-2xl font-bold tracking-tight uppercase hidden md:block">
+                  AutoForge
+                </h1>
+              </div>

-            {/* Controls */}
-            <div className="flex items-center gap-4">
+              {/* Project selector */}
              <ProjectSelector
                projects={projects ?? []}
                selectedProject={selectedProject}
@@ -280,94 +283,114 @@ function App() {
                onSpecCreatingChange={setIsSpecCreating}
              />

-              {selectedProject && (
-                <>
-                  <AgentControl
-                    projectName={selectedProject}
-                    status={wsState.agentStatus}
-                    defaultConcurrency={selectedProjectData?.default_concurrency}
-                  />
+              {/* Spacer */}
+              <div className="flex-1" />

-                  <DevServerControl
-                    projectName={selectedProject}
-                    status={wsState.devServerStatus}
-                    url={wsState.devServerUrl}
-                  />
-
-                  <Button
-                    onClick={() => setShowSettings(true)}
-                    variant="outline"
-                    size="sm"
-                    title="Settings (,)"
-                    aria-label="Open Settings"
-                  >
-                    <Settings size={18} />
-                  </Button>
-
-                  <Button
-                    onClick={() => setShowResetModal(true)}
-                    variant="outline"
-                    size="sm"
-                    title="Reset Project (R)"
-                    aria-label="Reset Project"
-                    disabled={wsState.agentStatus === 'running'}
-                  >
-                    <RotateCcw size={18} />
-                  </Button>
-
-                  {/* Ollama Mode Indicator */}
-                  {settings?.ollama_mode && (
-                    <div
-                      className="flex items-center gap-1.5 px-2 py-1 bg-card rounded border-2 border-border shadow-sm"
-                      title="Using Ollama local models (configured via .env)"
-                    >
-                      <img src="/ollama.png" alt="Ollama" className="w-5 h-5" />
-                      <span className="text-xs font-bold text-foreground">Ollama</span>
-                    </div>
-                  )}
-
-                  {/* GLM Mode Badge */}
-                  {settings?.glm_mode && (
-                    <Badge
-                      className="bg-purple-500 text-white hover:bg-purple-600"
-                      title="Using GLM API (configured via .env)"
-                    >
-                      GLM
-                    </Badge>
-                  )}
-                </>
+              {/* Ollama Mode Indicator */}
+              {selectedProject && settings?.ollama_mode && (
+                <div
+                  className="hidden sm:flex items-center gap-1.5 px-2 py-1 bg-card rounded border-2 border-border shadow-sm"
+                  title="Using Ollama local models"
+                >
+                  <img src="/ollama.png" alt="Ollama" className="w-5 h-5" />
+                  <span className="text-xs font-bold text-foreground">Ollama</span>
+                </div>
              )}

-              {/* Docs link */}
-              <Button
-                onClick={() => window.open('https://autoforge.cc', '_blank')}
-                variant="outline"
-                size="sm"
-                title="Documentation"
-                aria-label="Open Documentation"
-              >
-                <BookOpen size={18} />
-              </Button>
+              {/* GLM Mode Badge */}
+              {selectedProject && settings?.glm_mode && (
+                <Badge
+                  className="hidden sm:inline-flex bg-purple-500 text-white hover:bg-purple-600"
+                  title="Using GLM API"
+                >
+                  GLM
+                </Badge>
+              )}
+
+              {/* Utility icons - always visible */}
+              <Tooltip>
+                <TooltipTrigger asChild>
+                  <Button
+                    onClick={() => window.open('https://autoforge.cc', '_blank')}
+                    variant="outline"
+                    size="sm"
+                    aria-label="Open Documentation"
+                  >
+                    <BookOpen size={18} />
+                  </Button>
+                </TooltipTrigger>
+                <TooltipContent>Docs</TooltipContent>
+              </Tooltip>

-              {/* Theme selector */}
              <ThemeSelector
                themes={themes}
                currentTheme={theme}
                onThemeChange={setTheme}
              />

-              {/* Dark mode toggle - always visible */}
-              <Button
-                onClick={toggleDarkMode}
-                variant="outline"
-                size="sm"
-                title="Toggle dark mode"
-                aria-label="Toggle dark mode"
-              >
-                {darkMode ? <Sun size={18} /> : <Moon size={18} />}
-              </Button>
+              <Tooltip>
+                <TooltipTrigger asChild>
+                  <Button
+                    onClick={toggleDarkMode}
+                    variant="outline"
+                    size="sm"
+                    aria-label="Toggle dark mode"
+                  >
+                    {darkMode ? <Sun size={18} /> : <Moon size={18} />}
+                  </Button>
+                </TooltipTrigger>
+                <TooltipContent>Toggle theme</TooltipContent>
+              </Tooltip>
            </div>
-          </div>
+
+            {/* Row 2: Project controls - only when a project is selected */}
+            {selectedProject && (
+              <div className="flex items-center gap-3 mt-2 pt-2 border-t border-border/50">
+                <AgentControl
+                  projectName={selectedProject}
+                  status={wsState.agentStatus}
+                  defaultConcurrency={selectedProjectData?.default_concurrency}
+                />
+
+                <DevServerControl
+                  projectName={selectedProject}
+                  status={wsState.devServerStatus}
+                  url={wsState.devServerUrl}
+                />
+
+                <div className="flex-1" />
+
+                <Tooltip>
+                  <TooltipTrigger asChild>
+                    <Button
+                      onClick={() => setShowSettings(true)}
+                      variant="outline"
+                      size="sm"
+                      aria-label="Open Settings"
+                    >
+                      <Settings size={18} />
+                    </Button>
+                  </TooltipTrigger>
+                  <TooltipContent>Settings (,)</TooltipContent>
+                </Tooltip>
+
+                <Tooltip>
+                  <TooltipTrigger asChild>
+                    <Button
+                      onClick={() => setShowResetModal(true)}
+                      variant="outline"
+                      size="sm"
+                      aria-label="Reset Project"
+                      disabled={['running', 'pausing', 'paused_graceful'].includes(wsState.agentStatus)}
+                    >
+                      <RotateCcw size={18} />
+                    </Button>
+                  </TooltipTrigger>
+                  <TooltipContent>Reset (R)</TooltipContent>
+                </Tooltip>
+              </div>
+            )}
+          </TooltipProvider>
        </div>
      </header>

@@ -421,6 +444,7 @@ function App() {
             features.pending.length === 0 &&
             features.in_progress.length === 0 &&
             features.done.length === 0 &&
+             (features.needs_human_input?.length || 0) === 0 &&
             wsState.agentStatus === 'running' && (
              <Card className="p-8 text-center">
                <CardContent className="p-0">
@@ -436,7 +460,7 @@ function App() {
            )}

            {/* View Toggle - only show when there are features */}
-            {features && (features.pending.length + features.in_progress.length + features.done.length) > 0 && (
+            {features && (features.pending.length + features.in_progress.length + features.done.length + (features.needs_human_input?.length || 0)) > 0 && (
              <div className="flex justify-center">
                <ViewToggle viewMode={viewMode} onViewModeChange={setViewMode} />
              </div>
@@ -490,7 +514,7 @@ function App() {
      )}

      {/* Expand Project Modal - AI-powered bulk feature creation */}
-      {showExpandProject && selectedProject && (
+      {showExpandProject && selectedProject && hasSpec && (
        <ExpandProjectModal
          isOpen={showExpandProject}
          projectName={selectedProject}
--- a/ui/src/components/AgentControl.tsx
+++ b/ui/src/components/AgentControl.tsx
@@ -1,8 +1,10 @@
 import { useState, useEffect, useRef, useCallback } from 'react'
-import { Play, Square, Loader2, GitBranch, Clock } from 'lucide-react'
+import { Play, Square, Loader2, GitBranch, Clock, Pause, PlayCircle } from 'lucide-react'
 import {
  useStartAgent,
  useStopAgent,
+  useGracefulPauseAgent,
+  useGracefulResumeAgent,
  useSettings,
  useUpdateProjectSettings,
 } from '../hooks/useProjects'
@@ -60,12 +62,14 @@ export function AgentControl({ projectName, status, defaultConcurrency = 3 }: Ag

  const startAgent = useStartAgent(projectName)
  const stopAgent = useStopAgent(projectName)
+  const gracefulPause = useGracefulPauseAgent(projectName)
+  const gracefulResume = useGracefulResumeAgent(projectName)
  const { data: nextRun } = useNextScheduledRun(projectName)

  const [showScheduleModal, setShowScheduleModal] = useState(false)

-  const isLoading = startAgent.isPending || stopAgent.isPending
-  const isRunning = status === 'running' || status === 'paused'
+  const isLoading = startAgent.isPending || stopAgent.isPending || gracefulPause.isPending || gracefulResume.isPending
+  const isRunning = status === 'running' || status === 'paused' || status === 'pausing' || status === 'paused_graceful'
  const isLoadingStatus = status === 'loading'
  const isParallel = concurrency > 1

@@ -81,7 +85,7 @@ export function AgentControl({ projectName, status, defaultConcurrency = 3 }: Ag

  return (
    <>
-      <div className="flex items-center gap-4">
+      <div className="flex items-center gap-2 sm:gap-4">
        {/* Concurrency slider - visible when stopped */}
        {isStopped && (
          <div className="flex items-center gap-2">
@@ -126,7 +130,7 @@ export function AgentControl({ projectName, status, defaultConcurrency = 3 }: Ag
          </Badge>
        )}

-        {/* Start/Stop button */}
+        {/* Start/Stop/Pause/Resume buttons */}
        {isLoadingStatus ? (
          <Button disabled variant="outline" size="sm">
            <Loader2 size={18} className="animate-spin" />
@@ -146,19 +150,69 @@ export function AgentControl({ projectName, status, defaultConcurrency = 3 }: Ag
            )}
          </Button>
        ) : (
-          <Button
-            onClick={handleStop}
-            disabled={isLoading}
-            variant="destructive"
-            size="sm"
-            title={yoloMode ? 'Stop Agent (YOLO Mode)' : 'Stop Agent'}
-          >
-            {isLoading ? (
-              <Loader2 size={18} className="animate-spin" />
-            ) : (
-              <Square size={18} />
+          <div className="flex items-center gap-1.5">
+            {/* Pausing indicator */}
+            {status === 'pausing' && (
+              <Badge variant="secondary" className="gap-1 animate-pulse">
+                <Loader2 size={12} className="animate-spin" />
+                Pausing...
+              </Badge>
            )}
-          </Button>
+
+            {/* Paused indicator + Resume button */}
+            {status === 'paused_graceful' && (
+              <>
+                <Badge variant="outline" className="gap-1">
+                  Paused
+                </Badge>
+                <Button
+                  onClick={() => gracefulResume.mutate()}
+                  disabled={isLoading}
+                  variant="default"
+                  size="sm"
+                  title="Resume agent"
+                >
+                  {gracefulResume.isPending ? (
+                    <Loader2 size={18} className="animate-spin" />
+                  ) : (
+                    <PlayCircle size={18} />
+                  )}
+                </Button>
+              </>
+            )}
+
+            {/* Graceful pause button (only when running normally) */}
+            {status === 'running' && (
+              <Button
+                onClick={() => gracefulPause.mutate()}
+                disabled={isLoading}
+                variant="outline"
+                size="sm"
+                title="Pause agent (finish current work first)"
+              >
+                {gracefulPause.isPending ? (
+                  <Loader2 size={18} className="animate-spin" />
+                ) : (
+                  <Pause size={18} />
+                )}
+              </Button>
+            )}
+
+            {/* Stop button (always available) */}
+            <Button
+              onClick={handleStop}
+              disabled={isLoading}
+              variant="destructive"
+              size="sm"
+              title="Stop Agent (immediate)"
+            >
+              {stopAgent.isPending ? (
+                <Loader2 size={18} className="animate-spin" />
+              ) : (
+                <Square size={18} />
+              )}
+            </Button>
+          </div>
        )}

        {/* Clock button to open schedule modal */}
--- a/ui/src/components/AgentMissionControl.tsx
+++ b/ui/src/components/AgentMissionControl.tsx
@@ -72,9 +72,13 @@ export function AgentMissionControl({
              ? `${agents.length} ${agents.length === 1 ? 'agent' : 'agents'} active`
              : orchestratorStatus?.state === 'initializing'
                ? 'Initializing'
-                : orchestratorStatus?.state === 'complete'
-                  ? 'Complete'
-                  : 'Orchestrating'
+                : orchestratorStatus?.state === 'draining'
+                  ? 'Draining'
+                  : orchestratorStatus?.state === 'paused'
+                    ? 'Paused'
+                    : orchestratorStatus?.state === 'complete'
+                      ? 'Complete'
+                      : 'Orchestrating'
            }
          </Badge>
        </div>
--- a/ui/src/components/AgentThought.tsx
+++ b/ui/src/components/AgentThought.tsx
@@ -63,7 +63,7 @@ export function AgentThought({ logs, agentStatus }: AgentThoughtProps) {
  // Determine if component should be visible
  const shouldShow = useMemo(() => {
    if (!thought) return false
-    if (agentStatus === 'running') return true
+    if (agentStatus === 'running' || agentStatus === 'pausing') return true
    if (agentStatus === 'paused') {
      return Date.now() - lastLogTimestamp < IDLE_TIMEOUT
    }
--- a/ui/src/components/AssistantChat.tsx
+++ b/ui/src/components/AssistantChat.tsx
@@ -11,6 +11,7 @@ import { Send, Loader2, Wifi, WifiOff, Plus, History } from 'lucide-react'
 import { useAssistantChat } from '../hooks/useAssistantChat'
 import { ChatMessage as ChatMessageComponent } from './ChatMessage'
 import { ConversationHistory } from './ConversationHistory'
+import { QuestionOptions } from './QuestionOptions'
 import type { ChatMessage } from '../lib/types'
 import { isSubmitEnter } from '../lib/keyboard'
 import { Button } from '@/components/ui/button'
@@ -52,8 +53,10 @@ export function AssistantChat({
    isLoading,
    connectionStatus,
    conversationId: activeConversationId,
+    currentQuestions,
    start,
    sendMessage,
+    sendAnswer,
    clearMessages,
  } = useAssistantChat({
    projectName,
@@ -268,6 +271,16 @@ export function AssistantChat({
        </div>
      )}

+      {/* Structured questions from assistant */}
+      {currentQuestions && (
+        <div className="border-t border-border bg-background">
+          <QuestionOptions
+            questions={currentQuestions}
+            onSubmit={sendAnswer}
+          />
+        </div>
+      )}
+
      {/* Input area */}
      <div className="border-t border-border p-4 bg-card">
        <div className="flex gap-2">
@@ -277,13 +290,13 @@ export function AssistantChat({
            onChange={(e) => setInputValue(e.target.value)}
            onKeyDown={handleKeyDown}
            placeholder="Ask about the codebase..."
-            disabled={isLoading || isLoadingConversation || connectionStatus !== 'connected'}
+            disabled={isLoading || isLoadingConversation || connectionStatus !== 'connected' || !!currentQuestions}
            className="flex-1 resize-none min-h-[44px] max-h-[120px]"
            rows={1}
          />
          <Button
            onClick={handleSend}
-            disabled={!inputValue.trim() || isLoading || isLoadingConversation || connectionStatus !== 'connected'}
+            disabled={!inputValue.trim() || isLoading || isLoadingConversation || connectionStatus !== 'connected' || !!currentQuestions}
            title="Send message"
          >
            {isLoading ? (
@@ -294,7 +307,7 @@ export function AssistantChat({
          </Button>
        </div>
        <p className="text-xs text-muted-foreground mt-2">
-          Press Enter to send, Shift+Enter for new line
+          {currentQuestions ? 'Select an option above to continue' : 'Press Enter to send, Shift+Enter for new line'}
        </p>
      </div>
    </div>
--- a/ui/src/components/AssistantPanel.tsx
+++ b/ui/src/components/AssistantPanel.tsx
@@ -6,7 +6,7 @@
 * Manages conversation state with localStorage persistence.
 */

-import { useState, useEffect, useCallback } from 'react'
+import { useState, useEffect, useCallback, useRef } from 'react'
 import { X, Bot } from 'lucide-react'
 import { AssistantChat } from './AssistantChat'
 import { useConversation } from '../hooks/useConversations'
@@ -20,6 +20,10 @@ interface AssistantPanelProps {
 }

 const STORAGE_KEY_PREFIX = 'assistant-conversation-'
+const WIDTH_STORAGE_KEY = 'assistant-panel-width'
+const DEFAULT_WIDTH = 400
+const MIN_WIDTH = 300
+const MAX_WIDTH_VW = 90

 function getStoredConversationId(projectName: string): number | null {
  try {
@@ -100,6 +104,49 @@ export function AssistantPanel({ projectName, isOpen, onClose }: AssistantPanelP
    setConversationId(id)
  }, [])

+  // Resizable panel width
+  const [panelWidth, setPanelWidth] = useState<number>(() => {
+    try {
+      const stored = localStorage.getItem(WIDTH_STORAGE_KEY)
+      if (stored) return Math.max(MIN_WIDTH, parseInt(stored, 10) || DEFAULT_WIDTH)
+    } catch { /* ignore */ }
+    return DEFAULT_WIDTH
+  })
+  const isResizing = useRef(false)
+
+  const handleMouseDown = useCallback((e: React.MouseEvent) => {
+    e.preventDefault()
+    isResizing.current = true
+    const startX = e.clientX
+    const startWidth = panelWidth
+    const maxWidth = window.innerWidth * (MAX_WIDTH_VW / 100)
+
+    const handleMouseMove = (e: MouseEvent) => {
+      if (!isResizing.current) return
+      const delta = startX - e.clientX
+      const newWidth = Math.min(maxWidth, Math.max(MIN_WIDTH, startWidth + delta))
+      setPanelWidth(newWidth)
+    }
+
+    const handleMouseUp = () => {
+      isResizing.current = false
+      document.removeEventListener('mousemove', handleMouseMove)
+      document.removeEventListener('mouseup', handleMouseUp)
+      document.body.style.cursor = ''
+      document.body.style.userSelect = ''
+      // Persist width
+      setPanelWidth((w) => {
+        try { localStorage.setItem(WIDTH_STORAGE_KEY, String(w)) } catch { /* ignore */ }
+        return w
+      })
+    }
+
+    document.body.style.cursor = 'col-resize'
+    document.body.style.userSelect = 'none'
+    document.addEventListener('mousemove', handleMouseMove)
+    document.addEventListener('mouseup', handleMouseUp)
+  }, [panelWidth])
+
  return (
    <>
      {/* Backdrop - click to close */}
@@ -115,17 +162,25 @@ export function AssistantPanel({ projectName, isOpen, onClose }: AssistantPanelP
      <div
        className={`
          fixed right-0 top-0 bottom-0 z-50
-          w-[400px] max-w-[90vw]
          bg-card
          border-l border-border
          transform transition-transform duration-300 ease-out
          flex flex-col shadow-xl
          ${isOpen ? 'translate-x-0' : 'translate-x-full'}
        `}
+        style={{ width: `${panelWidth}px`, maxWidth: `${MAX_WIDTH_VW}vw` }}
        role="dialog"
        aria-label="Project Assistant"
        aria-hidden={!isOpen}
      >
+        {/* Resize handle */}
+        <div
+          className="absolute left-0 top-0 bottom-0 w-1.5 cursor-col-resize z-10 group"
+          onMouseDown={handleMouseDown}
+        >
+          <div className="absolute inset-y-0 left-0 w-0.5 bg-border group-hover:bg-primary transition-colors" />
+        </div>
+
        {/* Header */}
        <div className="flex items-center justify-between px-4 py-3 border-b border-border bg-primary text-primary-foreground">
          <div className="flex items-center gap-2">
--- a/ui/src/components/ChatMessage.tsx
+++ b/ui/src/components/ChatMessage.tsx
@@ -7,6 +7,8 @@

 import { memo } from 'react'
 import { Bot, User, Info } from 'lucide-react'
+import ReactMarkdown, { type Components } from 'react-markdown'
+import remarkGfm from 'remark-gfm'
 import type { ChatMessage as ChatMessageType } from '../lib/types'
 import { Card } from '@/components/ui/card'

@@ -14,8 +16,16 @@ interface ChatMessageProps {
  message: ChatMessageType
 }

-// Module-level regex to avoid recreating on each render
-const BOLD_REGEX = /\*\*(.*?)\*\*/g
+// Module-level constants so the props passed to ReactMarkdown keep a stable identity across renders
+const remarkPlugins = [remarkGfm]
+
+const markdownComponents: Components = {
+  a: ({ children, href, ...props }) => (
+    <a href={href} target="_blank" rel="noopener noreferrer" {...props}>
+      {children}
+    </a>
+  ),
+}

 export const ChatMessage = memo(function ChatMessage({ message }: ChatMessageProps) {
  const { role, content, attachments, timestamp, isStreaming } = message
@@ -86,39 +96,11 @@ export const ChatMessage = memo(function ChatMessage({ message }: ChatMessagePro
          )}

          <Card className={`${config.bgColor} px-4 py-3 border ${isStreaming ? 'animate-pulse' : ''}`}>
-            {/* Parse content for basic markdown-like formatting */}
            {content && (
-              <div className={`whitespace-pre-wrap text-sm leading-relaxed ${config.textColor}`}>
-                {content.split('\n').map((line, i) => {
-                  // Bold text - use module-level regex, reset lastIndex for each line
-                  BOLD_REGEX.lastIndex = 0
-                  const parts = []
-                  let lastIndex = 0
-                  let match
-
-                  while ((match = BOLD_REGEX.exec(line)) !== null) {
-                    if (match.index > lastIndex) {
-                      parts.push(line.slice(lastIndex, match.index))
-                    }
-                    parts.push(
-                      <strong key={`bold-${i}-${match.index}`} className="font-bold">
-                        {match[1]}
-                      </strong>
-                    )
-                    lastIndex = match.index + match[0].length
-                  }
-
-                  if (lastIndex < line.length) {
-                    parts.push(line.slice(lastIndex))
-                  }
-
-                  return (
-                    <span key={i}>
-                      {parts.length > 0 ? parts : line}
-                      {i < content.split('\n').length - 1 && '\n'}
-                    </span>
-                  )
-                })}
+              <div className={`text-sm leading-relaxed ${config.textColor} chat-prose${role === 'user' ? ' chat-prose-user' : ''}`}>
+                <ReactMarkdown remarkPlugins={remarkPlugins} components={markdownComponents}>
+                  {content}
+                </ReactMarkdown>
              </div>
            )}

--- a/ui/src/components/DependencyGraph.tsx
+++ b/ui/src/components/DependencyGraph.tsx
@@ -15,7 +15,7 @@ import {
  Handle,
 } from '@xyflow/react'
 import dagre from 'dagre'
-import { CheckCircle2, Circle, Loader2, AlertTriangle, RefreshCw } from 'lucide-react'
+import { CheckCircle2, Circle, Loader2, AlertTriangle, RefreshCw, UserCircle } from 'lucide-react'
 import type { DependencyGraph as DependencyGraphData, GraphNode, ActiveAgent, AgentMascot, AgentState } from '../lib/types'
 import { AgentAvatar } from './AgentAvatar'
 import { Button } from '@/components/ui/button'
@@ -93,18 +93,20 @@ class GraphErrorBoundary extends Component<ErrorBoundaryProps, ErrorBoundaryStat

 // Custom node component
 function FeatureNode({ data }: { data: GraphNode & { onClick?: () => void; agent?: NodeAgentInfo } }) {
-  const statusColors = {
+  const statusColors: Record<string, string> = {
    pending: 'bg-yellow-100 border-yellow-300 dark:bg-yellow-900/30 dark:border-yellow-700',
    in_progress: 'bg-cyan-100 border-cyan-300 dark:bg-cyan-900/30 dark:border-cyan-700',
    done: 'bg-green-100 border-green-300 dark:bg-green-900/30 dark:border-green-700',
    blocked: 'bg-red-50 border-red-300 dark:bg-red-900/20 dark:border-red-700',
+    needs_human_input: 'bg-amber-100 border-amber-300 dark:bg-amber-900/30 dark:border-amber-700',
  }

-  const textColors = {
+  const textColors: Record<string, string> = {
    pending: 'text-yellow-900 dark:text-yellow-100',
    in_progress: 'text-cyan-900 dark:text-cyan-100',
    done: 'text-green-900 dark:text-green-100',
    blocked: 'text-red-900 dark:text-red-100',
+    needs_human_input: 'text-amber-900 dark:text-amber-100',
  }

  const StatusIcon = () => {
@@ -115,6 +117,8 @@ function FeatureNode({ data }: { data: GraphNode & { onClick?: () => void; agent
        return <Loader2 size={16} className={`${textColors[data.status]} animate-spin`} />
      case 'blocked':
        return <AlertTriangle size={16} className="text-destructive" />
+      case 'needs_human_input':
+        return <UserCircle size={16} className={textColors[data.status]} />
      default:
        return <Circle size={16} className={textColors[data.status]} />
    }
@@ -323,6 +327,8 @@ function DependencyGraphInner({ graphData, onNodeClick, activeAgents = [] }: Dep
        return '#06b6d4' // cyan-500
      case 'blocked':
        return '#ef4444' // red-500
+      case 'needs_human_input':
+        return '#f59e0b' // amber-500
      default:
        return '#eab308' // yellow-500
    }
--- a/ui/src/components/DevServerConfigDialog.tsx
+++ b/ui/src/components/DevServerConfigDialog.tsx
@@ -0,0 +1,182 @@
+import { useState, useEffect } from 'react'
+import { Loader2, RotateCcw, Terminal } from 'lucide-react'
+import { useQueryClient } from '@tanstack/react-query'
+import {
+  Dialog,
+  DialogContent,
+  DialogDescription,
+  DialogFooter,
+  DialogHeader,
+  DialogTitle,
+} from '@/components/ui/dialog'
+import { Button } from '@/components/ui/button'
+import { Input } from '@/components/ui/input'
+import { Label } from '@/components/ui/label'
+import { useDevServerConfig, useUpdateDevServerConfig } from '@/hooks/useProjects'
+import { startDevServer } from '@/lib/api'
+
+interface DevServerConfigDialogProps {
+  projectName: string
+  isOpen: boolean
+  onClose: () => void
+  autoStartOnSave?: boolean
+}
+
+export function DevServerConfigDialog({
+  projectName,
+  isOpen,
+  onClose,
+  autoStartOnSave = false,
+}: DevServerConfigDialogProps) {
+  const { data: config } = useDevServerConfig(isOpen ? projectName : null)
+  const updateConfig = useUpdateDevServerConfig(projectName)
+  const queryClient = useQueryClient()
+
+  const [command, setCommand] = useState('')
+  const [error, setError] = useState<string | null>(null)
+  const [isSaving, setIsSaving] = useState(false)
+
+  // Sync input with config when dialog opens or config loads
+  useEffect(() => {
+    if (isOpen && config) {
+      setCommand(config.custom_command ?? config.effective_command ?? '')
+      setError(null)
+    }
+  }, [isOpen, config])
+
+  const hasCustomCommand = !!config?.custom_command
+
+  const handleSaveAndStart = async () => {
+    const trimmed = command.trim()
+    if (!trimmed) {
+      setError('Please enter a dev server command.')
+      return
+    }
+
+    setIsSaving(true)
+    setError(null)
+
+    try {
+      await updateConfig.mutateAsync(trimmed)
+
+      if (autoStartOnSave) {
+        await startDevServer(projectName)
+        queryClient.invalidateQueries({ queryKey: ['dev-server-status', projectName] })
+      }
+
+      onClose()
+    } catch (err) {
+      setError(err instanceof Error ? err.message : 'Failed to save configuration')
+    } finally {
+      setIsSaving(false)
+    }
+  }
+
+  const handleClear = async () => {
+    setIsSaving(true)
+    setError(null)
+
+    try {
+      await updateConfig.mutateAsync(null)
+      setCommand(config?.detected_command ?? '')
+    } catch (err) {
+      setError(err instanceof Error ? err.message : 'Failed to clear configuration')
+    } finally {
+      setIsSaving(false)
+    }
+  }
+
+  return (
+    <Dialog open={isOpen} onOpenChange={(open) => !open && onClose()}>
+      <DialogContent className="sm:max-w-lg">
+        <DialogHeader>
+          <div className="flex items-center gap-3">
+            <div className="p-2 rounded-lg bg-primary/10 text-primary">
+              <Terminal size={20} />
+            </div>
+            <DialogTitle>Dev Server Configuration</DialogTitle>
+          </div>
+        </DialogHeader>
+
+        <DialogDescription asChild>
+          <div className="space-y-4">
+            {/* Detection info */}
+            <div className="rounded-lg border-2 border-border bg-muted/50 p-3 text-sm">
+              {config?.detected_type ? (
+                <p>
+                  Detected project type: <strong className="text-foreground">{config.detected_type}</strong>
+                  {config.detected_command && (
+                    <span className="text-muted-foreground"> — {config.detected_command}</span>
+                  )}
+                </p>
+              ) : (
+                <p className="text-muted-foreground">
+                  No project type detected. Enter a custom command below.
+                </p>
+              )}
+            </div>
+
+            {/* Command input */}
+            <div className="space-y-2">
+              <Label htmlFor="dev-command" className="text-foreground">Dev server command</Label>
+              <Input
+                id="dev-command"
+                value={command}
+                onChange={(e) => {
+                  setCommand(e.target.value)
+                  setError(null)
+                }}
+                placeholder="npm run dev"
+                onKeyDown={(e) => {
+                  if (e.key === 'Enter' && !isSaving) {
+                    handleSaveAndStart()
+                  }
+                }}
+              />
+              <p className="text-xs text-muted-foreground">
+                Allowed runners: npm, npx, pnpm, yarn, python, uvicorn, flask, poetry, cargo, go
+              </p>
+            </div>
+
+            {/* Clear custom command button */}
+            {hasCustomCommand && (
+              <Button
+                variant="outline"
+                size="sm"
+                onClick={handleClear}
+                disabled={isSaving}
+                className="gap-1.5"
+              >
+                <RotateCcw size={14} />
+                Clear custom command (use auto-detection)
+              </Button>
+            )}
+
+            {/* Error display */}
+            {error && (
+              <p className="text-sm font-mono text-destructive">{error}</p>
+            )}
+          </div>
+        </DialogDescription>
+
+        <DialogFooter className="gap-2 sm:gap-0">
+          <Button variant="outline" onClick={onClose} disabled={isSaving}>
+            Cancel
+          </Button>
+          <Button onClick={handleSaveAndStart} disabled={isSaving}>
+            {isSaving ? (
+              <>
+                <Loader2 size={16} className="animate-spin mr-1.5" />
+                Saving...
+              </>
+            ) : autoStartOnSave ? (
+              'Save & Start'
+            ) : (
+              'Save'
+            )}
+          </Button>
+        </DialogFooter>
+      </DialogContent>
+    </Dialog>
+  )
+}
--- a/ui/src/components/DevServerControl.tsx
+++ b/ui/src/components/DevServerControl.tsx
@@ -1,8 +1,10 @@
-import { Globe, Square, Loader2, ExternalLink, AlertTriangle } from 'lucide-react'
+import { useState } from 'react'
+import { Globe, Square, Loader2, ExternalLink, AlertTriangle, Settings2 } from 'lucide-react'
 import { useMutation, useQueryClient } from '@tanstack/react-query'
 import type { DevServerStatus } from '../lib/types'
 import { startDevServer, stopDevServer } from '../lib/api'
 import { Button } from '@/components/ui/button'
+import { DevServerConfigDialog } from './DevServerConfigDialog'

 // Re-export DevServerStatus from lib/types for consumers that import from here
 export type { DevServerStatus }
@@ -59,17 +61,27 @@ interface DevServerControlProps {
 * - Shows loading state during operations
 * - Displays clickable URL when server is running
 * - Uses neobrutalism design with cyan accent when running
+ * - Config dialog for setting custom dev commands
 */
 export function DevServerControl({ projectName, status, url }: DevServerControlProps) {
  const startDevServerMutation = useStartDevServer(projectName)
  const stopDevServerMutation = useStopDevServer(projectName)
+  const [showConfigDialog, setShowConfigDialog] = useState(false)
+  const [autoStartOnSave, setAutoStartOnSave] = useState(false)

  const isLoading = startDevServerMutation.isPending || stopDevServerMutation.isPending

  const handleStart = () => {
    // Clear any previous errors before starting
    stopDevServerMutation.reset()
-    startDevServerMutation.mutate()
+    startDevServerMutation.mutate(undefined, {
+      onError: (err) => {
+        if (err.message?.includes('No dev command available')) {
+          setAutoStartOnSave(true)
+          setShowConfigDialog(true)
+        }
+      },
+    })
  }
  const handleStop = () => {
    // Clear any previous errors before stopping
@@ -77,6 +89,19 @@ export function DevServerControl({ projectName, status, url }: DevServerControlP
    stopDevServerMutation.mutate()
  }

+  const handleOpenConfig = () => {
+    setAutoStartOnSave(false)
+    setShowConfigDialog(true)
+  }
+
+  const handleCloseConfig = () => {
+    setShowConfigDialog(false)
+    // Clear the start error if config dialog was opened reactively
+    if (startDevServerMutation.error?.message?.includes('No dev command available')) {
+      startDevServerMutation.reset()
+    }
+  }
+
  // Server is stopped when status is 'stopped' or 'crashed' (can restart)
  const isStopped = status === 'stopped' || status === 'crashed'
  // Server is in a running state
@@ -84,25 +109,40 @@ export function DevServerControl({ projectName, status, url }: DevServerControlP
  // Server has crashed
  const isCrashed = status === 'crashed'

+  // Hide inline error when config dialog is handling it
+  const startError = startDevServerMutation.error
+  const showInlineError = startError && !startError.message?.includes('No dev command available')
+
  return (
    <div className="flex items-center gap-2">
      {isStopped ? (
-        <Button
-          onClick={handleStart}
-          disabled={isLoading}
-          variant={isCrashed ? "destructive" : "outline"}
-          size="sm"
-          title={isCrashed ? "Dev Server Crashed - Click to Restart" : "Start Dev Server"}
-          aria-label={isCrashed ? "Restart Dev Server (crashed)" : "Start Dev Server"}
-        >
-          {isLoading ? (
-            <Loader2 size={18} className="animate-spin" />
-          ) : isCrashed ? (
-            <AlertTriangle size={18} />
-          ) : (
-            <Globe size={18} />
-          )}
-        </Button>
+        <>
+          <Button
+            onClick={handleStart}
+            disabled={isLoading}
+            variant={isCrashed ? "destructive" : "outline"}
+            size="sm"
+            title={isCrashed ? "Dev Server Crashed - Click to Restart" : "Start Dev Server"}
+            aria-label={isCrashed ? "Restart Dev Server (crashed)" : "Start Dev Server"}
+          >
+            {isLoading ? (
+              <Loader2 size={18} className="animate-spin" />
+            ) : isCrashed ? (
+              <AlertTriangle size={18} />
+            ) : (
+              <Globe size={18} />
+            )}
+          </Button>
+          <Button
+            onClick={handleOpenConfig}
+            variant="ghost"
+            size="sm"
+            title="Configure Dev Server"
+            aria-label="Configure Dev Server"
+          >
+            <Settings2 size={16} />
+          </Button>
+        </>
      ) : (
        <Button
          onClick={handleStop}
@@ -139,12 +179,20 @@ export function DevServerControl({ projectName, status, url }: DevServerControlP
        </Button>
      )}

-      {/* Error display */}
-      {(startDevServerMutation.error || stopDevServerMutation.error) && (
+      {/* Error display (hide "no dev command" error when config dialog handles it) */}
+      {(showInlineError || stopDevServerMutation.error) && (
        <span className="text-xs font-mono text-destructive ml-2">
-          {String((startDevServerMutation.error || stopDevServerMutation.error)?.message || 'Operation failed')}
+          {String((showInlineError ? startError : stopDevServerMutation.error)?.message || 'Operation failed')}
        </span>
      )}
+
+      {/* Dev Server Config Dialog */}
+      <DevServerConfigDialog
+        projectName={projectName}
+        isOpen={showConfigDialog}
+        onClose={handleCloseConfig}
+        autoStartOnSave={autoStartOnSave}
+      />
    </div>
  )
 }
--- a/ui/src/components/FeatureCard.tsx
+++ b/ui/src/components/FeatureCard.tsx
@@ -1,4 +1,4 @@
-import { CheckCircle2, Circle, Loader2, MessageCircle } from 'lucide-react'
+import { CheckCircle2, Circle, Loader2, MessageCircle, UserCircle } from 'lucide-react'
 import type { Feature, ActiveAgent } from '../lib/types'
 import { DependencyBadge } from './DependencyBadge'
 import { AgentAvatar } from './AgentAvatar'
@@ -45,7 +45,8 @@ export function FeatureCard({ feature, onClick, isInProgress, allFeatures = [],
        cursor-pointer transition-all hover:border-primary py-3
        ${isInProgress ? 'animate-pulse' : ''}
        ${feature.passes ? 'border-primary/50' : ''}
-        ${isBlocked && !feature.passes ? 'border-destructive/50 opacity-80' : ''}
+        ${feature.needs_human_input ? 'border-amber-500/50' : ''}
+        ${isBlocked && !feature.passes && !feature.needs_human_input ? 'border-destructive/50 opacity-80' : ''}
        ${hasActiveAgent ? 'ring-2 ring-primary ring-offset-2' : ''}
      `}
    >
@@ -105,6 +106,11 @@ export function FeatureCard({ feature, onClick, isInProgress, allFeatures = [],
              <CheckCircle2 size={16} className="text-primary" />
              <span className="text-primary font-medium">Complete</span>
            </>
+          ) : feature.needs_human_input ? (
+            <>
+              <UserCircle size={16} className="text-amber-500" />
+              <span className="text-amber-500 font-medium">Needs Your Input</span>
+            </>
          ) : isBlocked ? (
            <>
              <Circle size={16} className="text-destructive" />
--- a/ui/src/components/FeatureModal.tsx
+++ b/ui/src/components/FeatureModal.tsx
@@ -1,7 +1,8 @@
 import { useState } from 'react'
-import { X, CheckCircle2, Circle, SkipForward, Trash2, Loader2, AlertCircle, Pencil, Link2, AlertTriangle } from 'lucide-react'
-import { useSkipFeature, useDeleteFeature, useFeatures } from '../hooks/useProjects'
+import { X, CheckCircle2, Circle, SkipForward, Trash2, Loader2, AlertCircle, Pencil, Link2, AlertTriangle, UserCircle } from 'lucide-react'
+import { useSkipFeature, useDeleteFeature, useFeatures, useResolveHumanInput } from '../hooks/useProjects'
 import { EditFeatureForm } from './EditFeatureForm'
+import { HumanInputForm } from './HumanInputForm'
 import type { Feature } from '../lib/types'
 import {
  Dialog,
@@ -50,10 +51,12 @@ export function FeatureModal({ feature, projectName, onClose }: FeatureModalProp
  const deleteFeature = useDeleteFeature(projectName)
  const { data: allFeatures } = useFeatures(projectName)

+  const resolveHumanInput = useResolveHumanInput(projectName)
+
  // Build a map of feature ID to feature for looking up dependency names
  const featureMap = new Map<number, Feature>()
  if (allFeatures) {
-    ;[...allFeatures.pending, ...allFeatures.in_progress, ...allFeatures.done].forEach(f => {
+    ;[...allFeatures.pending, ...allFeatures.in_progress, ...allFeatures.done, ...(allFeatures.needs_human_input || [])].forEach(f => {
      featureMap.set(f.id, f)
    })
  }
@@ -141,6 +144,11 @@ export function FeatureModal({ feature, projectName, onClose }: FeatureModalProp
                <CheckCircle2 size={24} className="text-primary" />
                <span className="font-semibold text-primary">COMPLETE</span>
              </>
+            ) : feature.needs_human_input ? (
+              <>
+                <UserCircle size={24} className="text-amber-500" />
+                <span className="font-semibold text-amber-500">NEEDS YOUR INPUT</span>
+              </>
            ) : (
              <>
                <Circle size={24} className="text-muted-foreground" />
@@ -152,6 +160,38 @@ export function FeatureModal({ feature, projectName, onClose }: FeatureModalProp
            </span>
          </div>

+          {/* Human Input Request */}
+          {feature.needs_human_input && feature.human_input_request && (
+            <HumanInputForm
+              request={feature.human_input_request}
+              onSubmit={async (fields) => {
+                setError(null)
+                try {
+                  await resolveHumanInput.mutateAsync({ featureId: feature.id, fields })
+                  onClose()
+                } catch (err) {
+                  setError(err instanceof Error ? err.message : 'Failed to submit response')
+                }
+              }}
+              isLoading={resolveHumanInput.isPending}
+            />
+          )}
+
+          {/* Previous Human Input Response */}
+          {feature.human_input_response && !feature.needs_human_input && (
+            <Alert className="border-green-500 bg-green-50 dark:bg-green-950/20">
+              <CheckCircle2 className="h-4 w-4 text-green-600" />
+              <AlertDescription>
+                <h4 className="font-semibold mb-1 text-green-700 dark:text-green-400">Human Input Provided</h4>
+                <p className="text-sm text-green-600 dark:text-green-300">
+                  Response submitted{feature.human_input_response.responded_at
+                    ? ` at ${new Date(feature.human_input_response.responded_at).toLocaleString()}`
+                    : ''}.
+                </p>
+              </AlertDescription>
+            </Alert>
+          )}
+
          {/* Description */}
          <div>
            <h3 className="font-semibold mb-2 text-sm uppercase tracking-wide text-muted-foreground">
--- a/ui/src/components/HumanInputForm.tsx
+++ b/ui/src/components/HumanInputForm.tsx
@@ -0,0 +1,150 @@
+import { useState } from 'react'
+import { Loader2, UserCircle, Send } from 'lucide-react'
+import type { HumanInputRequest } from '../lib/types'
+import { Button } from '@/components/ui/button'
+import { Input } from '@/components/ui/input'
+import { Textarea } from '@/components/ui/textarea'
+import { Label } from '@/components/ui/label'
+import { Alert, AlertDescription } from '@/components/ui/alert'
+import { Switch } from '@/components/ui/switch'
+
+interface HumanInputFormProps {
+  request: HumanInputRequest
+  onSubmit: (fields: Record<string, string | boolean | string[]>) => Promise<void>
+  isLoading: boolean
+}
+
+export function HumanInputForm({ request, onSubmit, isLoading }: HumanInputFormProps) {
+  const [values, setValues] = useState<Record<string, string | boolean | string[]>>(() => {
+    const initial: Record<string, string | boolean | string[]> = {}
+    for (const field of request.fields) {
+      if (field.type === 'boolean') {
+        initial[field.id] = false
+      } else {
+        initial[field.id] = ''
+      }
+    }
+    return initial
+  })
+
+  const [validationError, setValidationError] = useState<string | null>(null)
+
+  const handleSubmit = async () => {
+    // Validate required fields
+    for (const field of request.fields) {
+      if (field.required) {
+        const val = values[field.id]
+        if (val === undefined || val === null || (typeof val === 'string' && val.trim() === '')) {
+          setValidationError(`"${field.label}" is required`)
+          return
+        }
+      }
+    }
+    setValidationError(null)
+    await onSubmit(values)
+  }
+
+  return (
+    <Alert className="border-amber-500 bg-amber-50 dark:bg-amber-950/20">
+      <UserCircle className="h-5 w-5 text-amber-600" />
+      <AlertDescription className="space-y-4">
+        <div>
+          <h4 className="font-semibold text-amber-700 dark:text-amber-400">Agent needs your help</h4>
+          <p className="text-sm text-amber-600 dark:text-amber-300 mt-1">
+            {request.prompt}
+          </p>
+        </div>
+
+        <div className="space-y-3">
+          {request.fields.map((field) => (
+            <div key={field.id} className="space-y-1.5">
+              <Label htmlFor={`human-input-${field.id}`} className="text-sm font-medium text-foreground">
+                {field.label}
+                {field.required && <span className="text-destructive ml-1">*</span>}
+              </Label>
+
+              {field.type === 'text' && (
+                <Input
+                  id={`human-input-${field.id}`}
+                  value={values[field.id] as string}
+                  onChange={(e) => setValues(prev => ({ ...prev, [field.id]: e.target.value }))}
+                  placeholder={field.placeholder || ''}
+                  disabled={isLoading}
+                />
+              )}
+
+              {field.type === 'textarea' && (
+                <Textarea
+                  id={`human-input-${field.id}`}
+                  value={values[field.id] as string}
+                  onChange={(e) => setValues(prev => ({ ...prev, [field.id]: e.target.value }))}
+                  placeholder={field.placeholder || ''}
+                  disabled={isLoading}
+                  rows={3}
+                />
+              )}
+
+              {field.type === 'select' && field.options && (
+                <div className="space-y-1.5">
+                  {field.options.map((option) => (
+                    <label
+                      key={option.value}
+                      className={`flex items-center gap-2 p-2 rounded-md border cursor-pointer transition-colors
+                        ${values[field.id] === option.value
+                          ? 'border-primary bg-primary/10'
+                          : 'border-border hover:border-primary/50'}`}
+                    >
+                      <input
+                        type="radio"
+                        name={`human-input-${field.id}`}
+                        value={option.value}
+                        checked={values[field.id] === option.value}
+                        onChange={(e) => setValues(prev => ({ ...prev, [field.id]: e.target.value }))}
+                        disabled={isLoading}
+                        className="accent-primary"
+                      />
+                      <span className="text-sm">{option.label}</span>
+                    </label>
+                  ))}
+                </div>
+              )}
+
+              {field.type === 'boolean' && (
+                <div className="flex items-center gap-2">
+                  <Switch
+                    id={`human-input-${field.id}`}
+                    checked={values[field.id] as boolean}
+                    onCheckedChange={(checked) => setValues(prev => ({ ...prev, [field.id]: checked }))}
+                    disabled={isLoading}
+                  />
+                  <Label htmlFor={`human-input-${field.id}`} className="text-sm">
+                    {values[field.id] ? 'Yes' : 'No'}
+                  </Label>
+                </div>
+              )}
+            </div>
+          ))}
+        </div>
+
+        {validationError && (
+          <p className="text-sm text-destructive">{validationError}</p>
+        )}
+
+        <Button
+          onClick={handleSubmit}
+          disabled={isLoading}
+          className="w-full"
+        >
+          {isLoading ? (
+            <Loader2 size={16} className="animate-spin" />
+          ) : (
+            <>
+              <Send size={16} />
+              Submit Response
+            </>
+          )}
+        </Button>
+      </AlertDescription>
+    </Alert>
+  )
+}
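
Reviewer note: the required-field check in `handleSubmit` can be sketched as a pure function, which makes it easier to unit test outside React. This is a minimal sketch, not the component's actual code. `SketchField` and `firstMissingRequired` are hypothetical names, and the field shape is a simplified stand-in for `HumanInputRequest` from `../lib/types`. It treats whitespace-only strings and empty lists as missing, which is a design assumption.

```typescript
// Hypothetical, simplified stand-ins for the real types in ../lib/types.
type FieldValue = string | boolean | string[]

interface SketchField {
  id: string
  label: string
  required?: boolean
}

// Returns the first validation message, or null when every required field has a value.
// A boolean `false` counts as a provided value, matching a switch that defaults to "No".
function firstMissingRequired(
  fields: SketchField[],
  values: Record<string, FieldValue | undefined>,
): string | null {
  for (const field of fields) {
    if (!field.required) continue
    const val = values[field.id]
    const isBlankString = typeof val === 'string' && val.trim() === ''
    const isEmptyList = Array.isArray(val) && val.length === 0
    if (val == null || isBlankString || isEmptyList) {
      return `"${field.label}" is required`
    }
  }
  return null
}
```

With a helper of this shape, `handleSubmit` would only need to call it, store the message if one is returned, and otherwise forward `values` to `onSubmit`.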
--- a/ui/src/components/KanbanBoard.tsx
+++ b/ui/src/components/KanbanBoard.tsx
@@ -13,13 +13,16 @@ interface KanbanBoardProps {
 }

 export function KanbanBoard({ features, onFeatureClick, onAddFeature, onExpandProject, activeAgents = [], onCreateSpec, hasSpec = true }: KanbanBoardProps) {
-  const hasFeatures = features && (features.pending.length + features.in_progress.length + features.done.length) > 0
+  const hasFeatures = features && (features.pending.length + features.in_progress.length + features.done.length + (features.needs_human_input?.length || 0)) > 0

  // Combine all features for dependency status calculation
  const allFeatures = features
-    ? [...features.pending, ...features.in_progress, ...features.done]
+    ? [...features.pending, ...features.in_progress, ...features.done, ...(features.needs_human_input || [])]
    : []

+  const needsInputCount = features?.needs_human_input?.length || 0
+  const showNeedsInput = needsInputCount > 0
+
  if (!features) {
    return (
      <div className="grid grid-cols-1 md:grid-cols-3 gap-6">
@@ -40,7 +43,7 @@ export function KanbanBoard({ features, onFeatureClick, onAddFeature, onExpandPr
  }

  return (
-    <div className="grid grid-cols-1 md:grid-cols-3 gap-6">
+    <div className={`grid grid-cols-1 ${showNeedsInput ? 'md:grid-cols-4' : 'md:grid-cols-3'} gap-6`}>
      <KanbanColumn
        title="Pending"
        count={features.pending.length}
@@ -51,7 +54,7 @@ export function KanbanBoard({ features, onFeatureClick, onAddFeature, onExpandPr
        onFeatureClick={onFeatureClick}
        onAddFeature={onAddFeature}
        onExpandProject={onExpandProject}
-        showExpandButton={hasFeatures}
+        showExpandButton={hasFeatures && hasSpec}
        onCreateSpec={onCreateSpec}
        showCreateSpec={!hasSpec && !hasFeatures}
      />
@@ -64,6 +67,17 @@ export function KanbanBoard({ features, onFeatureClick, onAddFeature, onExpandPr
        color="progress"
        onFeatureClick={onFeatureClick}
      />
+      {showNeedsInput && (
+        <KanbanColumn
+          title="Needs Input"
+          count={needsInputCount}
+          features={features.needs_human_input}
+          allFeatures={allFeatures}
+          activeAgents={activeAgents}
+          color="human_input"
+          onFeatureClick={onFeatureClick}
+        />
+      )}
      <KanbanColumn
        title="Done"
        count={features.done.length}
--- a/ui/src/components/KanbanColumn.tsx
+++ b/ui/src/components/KanbanColumn.tsx
@@ -11,7 +11,7 @@ interface KanbanColumnProps {
  features: Feature[]
  allFeatures?: Feature[]
  activeAgents?: ActiveAgent[]
-  color: 'pending' | 'progress' | 'done'
+  color: 'pending' | 'progress' | 'done' | 'human_input'
  onFeatureClick: (feature: Feature) => void
  onAddFeature?: () => void
  onExpandProject?: () => void
@@ -24,6 +24,7 @@ const colorMap = {
  pending: 'border-t-4 border-t-muted',
  progress: 'border-t-4 border-t-primary',
  done: 'border-t-4 border-t-primary',
+  human_input: 'border-t-4 border-t-amber-500',
 }

 export function KanbanColumn({
--- a/ui/src/components/KeyboardShortcutsHelp.tsx
+++ b/ui/src/components/KeyboardShortcutsHelp.tsx
@@ -19,7 +19,7 @@ const shortcuts: Shortcut[] = [
  { key: 'D', description: 'Toggle debug panel' },
  { key: 'T', description: 'Toggle terminal tab' },
  { key: 'N', description: 'Add new feature', context: 'with project' },
-  { key: 'E', description: 'Expand project with AI', context: 'with features' },
+  { key: 'E', description: 'Expand project with AI', context: 'with spec & features' },
  { key: 'A', description: 'Toggle AI assistant', context: 'with project' },
  { key: 'G', description: 'Toggle Kanban/Graph view', context: 'with project' },
  { key: ',', description: 'Open settings' },
--- a/ui/src/components/OrchestratorAvatar.tsx
+++ b/ui/src/components/OrchestratorAvatar.tsx
@@ -103,6 +103,10 @@ function getStateAnimation(state: OrchestratorState): string {
      return 'animate-working'
    case 'monitoring':
      return 'animate-bounce-gentle'
+    case 'draining':
+      return 'animate-thinking'
+    case 'paused':
+      return ''
    case 'complete':
      return 'animate-celebrate'
    default:
@@ -121,6 +125,10 @@ function getStateGlow(state: OrchestratorState): string {
      return 'shadow-[0_0_16px_rgba(124,58,237,0.6)]'
    case 'monitoring':
      return 'shadow-[0_0_8px_rgba(167,139,250,0.4)]'
+    case 'draining':
+      return 'shadow-[0_0_10px_rgba(251,191,36,0.5)]'
+    case 'paused':
+      return ''
    case 'complete':
      return 'shadow-[0_0_20px_rgba(112,224,0,0.6)]'
    default:
@@ -141,6 +149,10 @@ function getStateDescription(state: OrchestratorState): string {
      return 'spawning agents'
    case 'monitoring':
      return 'monitoring progress'
+    case 'draining':
+      return 'draining active agents'
+    case 'paused':
+      return 'paused'
    case 'complete':
      return 'all features complete'
    default:
--- a/ui/src/components/OrchestratorStatusCard.tsx
+++ b/ui/src/components/OrchestratorStatusCard.tsx
@@ -25,6 +25,10 @@ function getStateText(state: OrchestratorState): string {
      return 'Watching progress...'
    case 'complete':
      return 'Mission accomplished!'
+    case 'draining':
+      return 'Draining agents...'
+    case 'paused':
+      return 'Paused'
    default:
      return 'Orchestrating...'
  }
@@ -42,6 +46,10 @@ function getStateColor(state: OrchestratorState): string {
      return 'text-primary'
    case 'initializing':
      return 'text-yellow-600 dark:text-yellow-400'
+    case 'draining':
+      return 'text-amber-600 dark:text-amber-400'
+    case 'paused':
+      return 'text-muted-foreground'
    default:
      return 'text-muted-foreground'
  }
--- a/ui/src/components/ProgressDashboard.tsx
+++ b/ui/src/components/ProgressDashboard.tsx
@@ -55,7 +55,7 @@ export function ProgressDashboard({

  const showThought = useMemo(() => {
    if (!thought) return false
-    if (agentStatus === 'running') return true
+    if (agentStatus === 'running' || agentStatus === 'pausing') return true
    if (agentStatus === 'paused') {
      return Date.now() - lastLogTimestamp < IDLE_TIMEOUT
    }
--- a/ui/src/components/ProjectSelector.tsx
+++ b/ui/src/components/ProjectSelector.tsx
@@ -73,16 +73,17 @@ export function ProjectSelector({
        <DropdownMenuTrigger asChild>
          <Button
            variant="outline"
-            className="min-w-[200px] justify-between"
+            className="min-w-[140px] sm:min-w-[200px] justify-between"
            disabled={isLoading}
+            title={selectedProjectData?.path}
          >
            {isLoading ? (
              <Loader2 size={18} className="animate-spin" />
            ) : selectedProject ? (
              <>
-                <span className="flex items-center gap-2">
-                  <FolderOpen size={18} />
-                  {selectedProject}
+                <span className="flex items-center gap-2 truncate">
+                  <FolderOpen size={18} className="shrink-0" />
+                  <span className="truncate">{selectedProject}</span>
                </span>
                {selectedProjectData && selectedProjectData.stats.total > 0 && (
                  <Badge className="ml-2">{selectedProjectData.stats.percentage}%</Badge>
@@ -101,6 +102,7 @@ export function ProjectSelector({
              {projects.map(project => (
                <DropdownMenuItem
                  key={project.name}
+                  title={project.path}
                  className={`flex items-center justify-between cursor-pointer ${
                    project.name === selectedProject ? 'bg-primary/10' : ''
                  }`}
--- a/ui/src/components/SettingsModal.tsx
+++ b/ui/src/components/SettingsModal.tsx
@@ -1,6 +1,8 @@
-import { Loader2, AlertCircle, Check, Moon, Sun } from 'lucide-react'
-import { useSettings, useUpdateSettings, useAvailableModels } from '../hooks/useProjects'
+import { useState } from 'react'
+import { Loader2, AlertCircle, Check, Moon, Sun, Eye, EyeOff, ShieldCheck } from 'lucide-react'
+import { useSettings, useUpdateSettings, useAvailableModels, useAvailableProviders } from '../hooks/useProjects'
 import { useTheme, THEMES } from '../hooks/useTheme'
+import type { ProviderInfo } from '../lib/types'
 import {
  Dialog,
  DialogContent,
@@ -17,12 +19,26 @@ interface SettingsModalProps {
  onClose: () => void
 }

+const PROVIDER_INFO_TEXT: Record<string, string> = {
+  claude: 'Default provider. Uses your Claude CLI credentials.',
+  kimi: 'Get an API key at kimi.com',
+  glm: 'Get an API key at open.bigmodel.cn',
+  ollama: 'Run models locally. Install from ollama.com',
+  custom: 'Connect to any OpenAI-compatible API endpoint.',
+}
+
 export function SettingsModal({ isOpen, onClose }: SettingsModalProps) {
  const { data: settings, isLoading, isError, refetch } = useSettings()
  const { data: modelsData } = useAvailableModels()
+  const { data: providersData } = useAvailableProviders()
  const updateSettings = useUpdateSettings()
  const { theme, setTheme, darkMode, toggleDarkMode } = useTheme()

+  const [showAuthToken, setShowAuthToken] = useState(false)
+  const [authTokenInput, setAuthTokenInput] = useState('')
+  const [customModelInput, setCustomModelInput] = useState('')
+  const [customBaseUrlInput, setCustomBaseUrlInput] = useState('')
+
  const handleYoloToggle = () => {
    if (settings && !updateSettings.isPending) {
      updateSettings.mutate({ yolo_mode: !settings.yolo_mode })
@@ -31,7 +47,7 @@ export function SettingsModal({ isOpen, onClose }: SettingsModalProps) {

  const handleModelChange = (modelId: string) => {
    if (!updateSettings.isPending) {
-      updateSettings.mutate({ model: modelId })
+      updateSettings.mutate({ api_model: modelId })
    }
  }

@@ -47,12 +63,52 @@ export function SettingsModal({ isOpen, onClose }: SettingsModalProps) {
    }
  }

+  const handleProviderChange = (providerId: string) => {
+    if (!updateSettings.isPending) {
+      updateSettings.mutate({ api_provider: providerId })
+      // Reset local state
+      setAuthTokenInput('')
+      setShowAuthToken(false)
+      setCustomModelInput('')
+      setCustomBaseUrlInput('')
+    }
+  }
+
+  const handleSaveAuthToken = () => {
+    if (authTokenInput.trim() && !updateSettings.isPending) {
+      updateSettings.mutate({ api_auth_token: authTokenInput.trim() })
+      setAuthTokenInput('')
+      setShowAuthToken(false)
+    }
+  }
+
+  const handleSaveCustomBaseUrl = () => {
+    if (customBaseUrlInput.trim() && !updateSettings.isPending) {
+      updateSettings.mutate({ api_base_url: customBaseUrlInput.trim() })
+      setCustomBaseUrlInput('')
+    }
+  }
+
+  const handleSaveCustomModel = () => {
+    if (customModelInput.trim() && !updateSettings.isPending) {
+      updateSettings.mutate({ api_model: customModelInput.trim() })
+      setCustomModelInput('')
+    }
+  }
+
+  const providers = providersData?.providers ?? []
  const models = modelsData?.models ?? []
  const isSaving = updateSettings.isPending
+  const currentProvider = settings?.api_provider ?? 'claude'
+  const currentProviderInfo: ProviderInfo | undefined = providers.find(p => p.id === currentProvider)
+  const isAlternativeProvider = currentProvider !== 'claude'
+  const showAuthField = isAlternativeProvider && currentProviderInfo?.requires_auth
+  const showBaseUrlField = currentProvider === 'custom' || currentProvider === 'azure'
+  const showCustomModelInput = currentProvider === 'custom' || currentProvider === 'ollama'

  return (
    <Dialog open={isOpen} onOpenChange={(open) => !open && onClose()}>
-      <DialogContent className="sm:max-w-sm">
+      <DialogContent aria-describedby={undefined} className="sm:max-w-lg max-h-[90vh] overflow-y-auto">
        <DialogHeader>
          <DialogTitle className="flex items-center gap-2">
            Settings
@@ -159,6 +215,163 @@ export function SettingsModal({ isOpen, onClose }: SettingsModalProps) {

            <hr className="border-border" />

+            {/* API Provider Selection */}
+            <div className="space-y-3">
+              <Label className="font-medium">API Provider</Label>
+              <div className="flex flex-wrap gap-1.5">
+                {providers.map((provider) => (
+                  <button
+                    key={provider.id}
+                    onClick={() => handleProviderChange(provider.id)}
+                    disabled={isSaving}
+                    className={`py-1.5 px-3 text-sm font-medium rounded-md border transition-colors ${
+                      currentProvider === provider.id
+                        ? 'bg-primary text-primary-foreground border-primary'
+                        : 'bg-background text-foreground border-border hover:bg-muted'
+                    } ${isSaving ? 'opacity-50 cursor-not-allowed' : ''}`}
+                  >
+                    {provider.name.split(' (')[0]}
+                  </button>
+                ))}
+              </div>
+              <p className="text-xs text-muted-foreground">
+                {PROVIDER_INFO_TEXT[currentProvider] ?? ''}
+              </p>
+
+              {/* Auth Token Field */}
+              {showAuthField && (
+                <div className="space-y-2 pt-1">
+                  <Label className="text-sm">API Key</Label>
+                  {settings.api_has_auth_token && !authTokenInput && (
+                    <div className="flex items-center gap-2 text-sm text-muted-foreground">
+                      <ShieldCheck size={14} className="text-green-500" />
+                      <span>Configured</span>
+                      <Button
+                        variant="ghost"
+                        size="sm"
+                        className="h-auto py-0.5 px-2 text-xs"
+                        onClick={() => setAuthTokenInput(' ')}
+                      >
+                        Change
+                      </Button>
+                    </div>
+                  )}
+                  {(!settings.api_has_auth_token || authTokenInput) && (
+                    <div className="flex gap-2">
+                      <div className="relative flex-1">
+                        <input
+                          type={showAuthToken ? 'text' : 'password'}
+                          value={authTokenInput.trim()}
+                          onChange={(e) => setAuthTokenInput(e.target.value)}
+                          placeholder="Enter API key..."
+                          className="w-full py-1.5 px-3 pe-9 text-sm border rounded-md bg-background"
+                        />
+                        <button
+                          type="button"
+                          onClick={() => setShowAuthToken(!showAuthToken)}
+                          className="absolute end-2 top-1/2 -translate-y-1/2 text-muted-foreground hover:text-foreground"
+                        >
+                          {showAuthToken ? <EyeOff size={14} /> : <Eye size={14} />}
+                        </button>
+                      </div>
+                      <Button
+                        size="sm"
+                        onClick={handleSaveAuthToken}
+                        disabled={!authTokenInput.trim() || isSaving}
+                      >
+                        Save
+                      </Button>
+                    </div>
+                  )}
+                </div>
+              )}
+
+              {/* Custom Base URL Field */}
+              {showBaseUrlField && (
+                <div className="space-y-2 pt-1">
+                  <Label className="text-sm">Base URL</Label>
+                  {settings.api_base_url && !customBaseUrlInput && (
+                    <div className="flex items-center gap-2 text-sm text-muted-foreground">
+                      <ShieldCheck size={14} className="text-green-500" />
+                      <span className="truncate">{settings.api_base_url}</span>
+                      <Button
+                        variant="ghost"
+                        size="sm"
+                        className="h-auto py-0.5 px-2 text-xs shrink-0"
+                        onClick={() => setCustomBaseUrlInput(settings.api_base_url || '')}
+                      >
+                        Change
+                      </Button>
+                    </div>
+                  )}
+                  {(!settings.api_base_url || customBaseUrlInput) && (
+                    <div className="flex gap-2">
+                      <input
+                        type="text"
+                        value={customBaseUrlInput}
+                        onChange={(e) => setCustomBaseUrlInput(e.target.value)}
+                        placeholder={currentProvider === 'azure' ? 'https://your-resource.services.ai.azure.com/anthropic' : 'https://api.example.com/v1'}
+                        className="flex-1 py-1.5 px-3 text-sm border rounded-md bg-background"
+                      />
+                      <Button
+                        size="sm"
+                        onClick={handleSaveCustomBaseUrl}
+                        disabled={!customBaseUrlInput.trim() || isSaving}
+                      >
+                        Save
+                      </Button>
+                    </div>
+                  )}
+                </div>
+              )}
+            </div>
+
+            {/* Model Selection */}
+            <div className="space-y-2">
+              <Label className="font-medium">Model</Label>
+              {models.length > 0 && (
+                <div className="flex rounded-lg border overflow-hidden">
+                  {models.map((model) => (
+                    <button
+                      key={model.id}
+                      onClick={() => handleModelChange(model.id)}
+                      disabled={isSaving}
+                      className={`flex-1 py-2 px-3 text-sm font-medium transition-colors ${
+                        (settings.api_model ?? settings.model) === model.id
+                          ? 'bg-primary text-primary-foreground'
+                          : 'bg-background text-foreground hover:bg-muted'
+                      } ${isSaving ? 'opacity-50 cursor-not-allowed' : ''}`}
+                    >
+                      <span className="block">{model.name}</span>
+                      <span className="block text-xs opacity-60">{model.id}</span>
+                    </button>
+                  ))}
+                </div>
+              )}
+              {/* Custom model input for Ollama/Custom */}
+              {showCustomModelInput && (
+                <div className="flex gap-2 pt-1">
+                  <input
+                    type="text"
+                    value={customModelInput}
+                    onChange={(e) => setCustomModelInput(e.target.value)}
+                    placeholder="Custom model name..."
+                    className="flex-1 py-1.5 px-3 text-sm border rounded-md bg-background"
+                    onKeyDown={(e) => e.key === 'Enter' && handleSaveCustomModel()}
+                  />
+                  <Button
+                    size="sm"
+                    onClick={handleSaveCustomModel}
+                    disabled={!customModelInput.trim() || isSaving}
+                  >
+                    Set
+                  </Button>
+                </div>
+              )}
+            </div>
+
+            <hr className="border-border" />
+
            {/* YOLO Mode Toggle */}
            <div className="flex items-center justify-between">
              <div className="space-y-0.5">
@@ -195,27 +408,6 @@ export function SettingsModal({ isOpen, onClose }: SettingsModalProps) {
              />
            </div>

-            {/* Model Selection */}
-            <div className="space-y-2">
-              <Label className="font-medium">Model</Label>
-              <div className="flex rounded-lg border overflow-hidden">
-                {models.map((model) => (
-                  <button
-                    key={model.id}
-                    onClick={() => handleModelChange(model.id)}
-                    disabled={isSaving}
-                    className={`flex-1 py-2 px-3 text-sm font-medium transition-colors ${
-                      settings.model === model.id
-                        ? 'bg-primary text-primary-foreground'
-                        : 'bg-background text-foreground hover:bg-muted'
-                    } ${isSaving ? 'opacity-50 cursor-not-allowed' : ''}`}
-                  >
-                    {model.name}
-                  </button>
-                ))}
-              </div>
-            </div>
-
            {/* Regression Agents */}
            <div className="space-y-2">
              <Label className="font-medium">Regression Agents</Label>
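
Reviewer note: the three visibility flags derived in `SettingsModal` (`showAuthField`, `showBaseUrlField`, `showCustomModelInput`) depend only on the current provider id and the provider list, so they could be computed by one pure function. The sketch below assumes a reduced provider shape. `SketchProvider` and `providerFieldVisibility` are hypothetical names, and only the `id` and `requires_auth` fields seen in the diff are used.

```typescript
// Hypothetical reduced shape; the real ProviderInfo type lives in ../lib/types.
interface SketchProvider {
  id: string
  requires_auth: boolean
}

interface FieldVisibility {
  showAuthField: boolean
  showBaseUrlField: boolean
  showCustomModelInput: boolean
}

// Mirrors the conditions in the diff: 'claude' never shows an API key field,
// 'custom' and 'azure' show a base URL, 'custom' and 'ollama' accept a free-form model name.
function providerFieldVisibility(
  currentProvider: string,
  providers: SketchProvider[],
): FieldVisibility {
  const info = providers.find((p) => p.id === currentProvider)
  return {
    // Boolean() normalizes the `undefined` that results when the provider is not in the list.
    showAuthField: currentProvider !== 'claude' && Boolean(info?.requires_auth),
    showBaseUrlField: currentProvider === 'custom' || currentProvider === 'azure',
    showCustomModelInput: currentProvider === 'custom' || currentProvider === 'ollama',
  }
}
```

One consequence worth checking in review: `PROVIDER_INFO_TEXT` has no `azure` entry, so the help line falls back to an empty string for that provider.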
--- a/ui/src/components/ThemeSelector.tsx
+++ b/ui/src/components/ThemeSelector.tsx
@@ -1,6 +1,7 @@
 import { useState, useRef, useEffect } from 'react'
 import { Palette, Check } from 'lucide-react'
 import { Button } from '@/components/ui/button'
+import { Tooltip, TooltipTrigger, TooltipContent } from '@/components/ui/tooltip'
 import type { ThemeId, ThemeOption } from '../hooks/useTheme'

 interface ThemeSelectorProps {
@@ -97,16 +98,20 @@ export function ThemeSelector({ themes, currentTheme, onThemeChange }: ThemeSele
      onMouseEnter={handleMouseEnter}
      onMouseLeave={handleMouseLeave}
    >
-      <Button
-        variant="outline"
-        size="sm"
-        title="Theme"
-        aria-label="Select theme"
-        aria-expanded={isOpen}
-        aria-haspopup="true"
-      >
-        <Palette size={18} />
-      </Button>
+      <Tooltip>
+        <TooltipTrigger asChild>
+          <Button
+            variant="outline"
+            size="sm"
+            aria-label="Select theme"
+            aria-expanded={isOpen}
+            aria-haspopup="true"
+          >
+            <Palette size={18} />
+          </Button>
+        </TooltipTrigger>
+        <TooltipContent>Theme</TooltipContent>
+      </Tooltip>

      {/* Dropdown */}
      {isOpen && (
--- a/ui/src/components/ui/tooltip.tsx
+++ b/ui/src/components/ui/tooltip.tsx
@@ -0,0 +1,65 @@
+import * as React from "react"
+import * as TooltipPrimitive from "@radix-ui/react-tooltip"
+
+import { cn } from "@/lib/utils"
+
+function TooltipProvider({
+  delayDuration = 250,
+  ...props
+}: React.ComponentProps<typeof TooltipPrimitive.Provider> & {
+  delayDuration?: number
+}) {
+  return (
+    <TooltipPrimitive.Provider
+      data-slot="tooltip-provider"
+      delayDuration={delayDuration}
+      {...props}
+    />
+  )
+}
+
+function Tooltip({
+  ...props
+}: React.ComponentProps<typeof TooltipPrimitive.Root>) {
+  return <TooltipPrimitive.Root data-slot="tooltip" {...props} />
+}
+
+function TooltipTrigger({
+  ...props
+}: React.ComponentProps<typeof TooltipPrimitive.Trigger>) {
+  return <TooltipPrimitive.Trigger data-slot="tooltip-trigger" {...props} />
+}
+
+function TooltipContent({
+  className,
+  side = "bottom",
+  align = "center",
+  sideOffset = 8,
+  children,
+  ...props
+}: React.ComponentProps<typeof TooltipPrimitive.Content>) {
+  return (
+    <TooltipPrimitive.Portal>
+      <TooltipPrimitive.Content
+        data-slot="tooltip-content"
+        side={side}
+        align={align}
+        sideOffset={sideOffset}
+        className={cn(
+          "z-50 overflow-hidden rounded-md border bg-neutral-900 px-3 py-2 text-sm text-white shadow-md leading-tight min-h-7",
+          "data-[state=delayed-open]:animate-in data-[state=closed]:animate-out data-[state=closed]:fade-out-0 data-[state=delayed-open]:fade-in-0 data-[side=bottom]:slide-in-from-top-2 data-[side=left]:slide-in-from-right-2 data-[side=right]:slide-in-from-left-2 data-[side=top]:slide-in-from-bottom-2",
+          className
+        )}
+        {...props}
+      >
+        {children}
+        <TooltipPrimitive.Arrow
+          data-slot="tooltip-arrow"
+          className="fill-neutral-900"
+        />
+      </TooltipPrimitive.Content>
+    </TooltipPrimitive.Portal>
+  )
+}
+
+export { Tooltip, TooltipContent, TooltipProvider, TooltipTrigger }
--- a/ui/src/hooks/useAssistantChat.ts
+++ b/ui/src/hooks/useAssistantChat.ts
@@ -3,7 +3,7 @@
 */

 import { useState, useCallback, useRef, useEffect } from "react";
-import type { ChatMessage, AssistantChatServerMessage } from "../lib/types";
+import type { ChatMessage, AssistantChatServerMessage, SpecQuestion } from "../lib/types";

 type ConnectionStatus = "disconnected" | "connecting" | "connected" | "error";

@@ -17,8 +17,10 @@ interface UseAssistantChatReturn {
  isLoading: boolean;
  connectionStatus: ConnectionStatus;
  conversationId: number | null;
+  currentQuestions: SpecQuestion[] | null;
  start: (conversationId?: number | null) => void;
  sendMessage: (content: string) => void;
+  sendAnswer: (answers: Record<string, string | string[]>) => void;
  disconnect: () => void;
  clearMessages: () => void;
 }
@@ -36,6 +38,7 @@ export function useAssistantChat({
  const [connectionStatus, setConnectionStatus] =
    useState<ConnectionStatus>("disconnected");
  const [conversationId, setConversationId] = useState<number | null>(null);
+  const [currentQuestions, setCurrentQuestions] = useState<SpecQuestion[] | null>(null);

  const wsRef = useRef<WebSocket | null>(null);
  const currentAssistantMessageRef = useRef<string | null>(null);
@@ -204,6 +207,25 @@ export function useAssistantChat({
            break;
          }

+          case "question": {
+            // Claude is asking structured questions via ask_user tool
+            setCurrentQuestions(data.questions);
+            setIsLoading(false);
+
+            // Attach questions to the last assistant message for display context
+            setMessages((prev) => {
+              const lastMessage = prev[prev.length - 1];
+              if (lastMessage?.role === "assistant" && lastMessage.isStreaming) {
+                return [
+                  ...prev.slice(0, -1),
+                  { ...lastMessage, isStreaming: false, questions: data.questions },
+                ];
+              }
+              return prev;
+            });
+            break;
+          }
+
          case "conversation_created": {
            setConversationId(data.conversation_id);
            break;
@@ -327,6 +349,49 @@ export function useAssistantChat({
    [onError],
  );

+  const sendAnswer = useCallback(
+    (answers: Record<string, string | string[]>) => {
+      if (!wsRef.current || wsRef.current.readyState !== WebSocket.OPEN) {
+        onError?.("Not connected");
+        return;
+      }
+
+      // Format answers as display text for user message
+      const answerParts: string[] = [];
+      for (const [, value] of Object.entries(answers)) {
+        if (Array.isArray(value)) {
+          answerParts.push(value.join(", "));
+        } else {
+          answerParts.push(value);
+        }
+      }
+      const displayText = answerParts.join("; ");
+
+      // Add user message to chat
+      setMessages((prev) => [
+        ...prev,
+        {
+          id: generateId(),
+          role: "user",
+          content: displayText,
+          timestamp: new Date(),
+        },
+      ]);
+
+      setCurrentQuestions(null);
+      setIsLoading(true);
+
+      // Send structured answer to server
+      wsRef.current.send(
+        JSON.stringify({
+          type: "answer",
+          answers,
+        }),
+      );
+    },
+    [onError],
+  );
+
  const disconnect = useCallback(() => {
    reconnectAttempts.current = maxReconnectAttempts; // Prevent reconnection
    if (pingIntervalRef.current) {
@@ -350,8 +415,10 @@ export function useAssistantChat({
    isLoading,
    connectionStatus,
    conversationId,
+    currentQuestions,
    start,
    sendMessage,
+    sendAnswer,
    disconnect,
    clearMessages,
  };
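Note on `sendAnswer`: the display-text formatting can be read in isolation. The following is a minimal sketch, assuming a standalone helper named `formatAnswersForDisplay` (a hypothetical name; the hook inlines this logic rather than exporting it).

```typescript
// Hypothetical helper mirroring the inline formatting in sendAnswer.
type Answers = Record<string, string | string[]>;

function formatAnswersForDisplay(answers: Answers): string {
  // Multi-select answers are joined with ", "; answers to separate
  // questions are joined with "; ". Question keys are not shown.
  return Object.keys(answers)
    .map((key) => {
      const value = answers[key];
      return Array.isArray(value) ? value.join(", ") : value;
    })
    .join("; ");
}
```

Because the keys are dropped, the chat bubble shows only the chosen values, while the structured `answers` object is what goes over the WebSocket.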
--- a/ui/src/hooks/useCelebration.ts
+++ b/ui/src/hooks/useCelebration.ts
@@ -137,6 +137,7 @@ function isAllComplete(features: FeatureListResponse | undefined): boolean {
  return (
    features.pending.length === 0 &&
    features.in_progress.length === 0 &&
+    (features.needs_human_input?.length || 0) === 0 &&
    features.done.length > 0
  )
 }
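Note on `isAllComplete`: the new clause treats a missing `needs_human_input` bucket as empty, so older API responses still pass. A minimal sketch of the check follows, assuming a reduced `FeatureBuckets` type (hypothetical, standing in for `FeatureListResponse`) and assuming an undefined response counts as not complete, since that branch is outside this hunk.

```typescript
// Hypothetical reduced shape; the real type is FeatureListResponse.
interface FeatureBuckets {
  pending: unknown[];
  in_progress: unknown[];
  done: unknown[];
  needs_human_input?: unknown[]; // may be absent on older servers
}

function isAllComplete(features: FeatureBuckets | undefined): boolean {
  // Assumption: no data yet means "not complete".
  if (!features) return false;
  return (
    features.pending.length === 0 &&
    features.in_progress.length === 0 &&
    (features.needs_human_input?.length || 0) === 0 &&
    features.done.length > 0
  );
}
```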
--- a/ui/src/hooks/useExpandChat.ts
+++ b/ui/src/hooks/useExpandChat.ts
@@ -107,16 +107,20 @@ export function useExpandChat({
      }, 30000)
    }

-    ws.onclose = () => {
+    ws.onclose = (event) => {
      setConnectionStatus('disconnected')
      if (pingIntervalRef.current) {
        clearInterval(pingIntervalRef.current)
        pingIntervalRef.current = null
      }

+      // Don't retry on application-level errors (4xxx codes won't resolve on retry)
+      const isAppError = event.code >= 4000 && event.code <= 4999
+
      // Attempt reconnection if not intentionally closed
      if (
        !manuallyDisconnectedRef.current &&
+        !isAppError &&
        reconnectAttempts.current < maxReconnectAttempts &&
        !isCompleteRef.current
      ) {
--- a/ui/src/hooks/useProjects.ts
+++ b/ui/src/hooks/useProjects.ts
@@ -4,7 +4,7 @@

 import { useQuery, useMutation, useQueryClient } from '@tanstack/react-query'
 import * as api from '../lib/api'
-import type { FeatureCreate, FeatureUpdate, ModelsResponse, ProjectSettingsUpdate, Settings, SettingsUpdate } from '../lib/types'
+import type { DevServerConfig, FeatureCreate, FeatureUpdate, ModelsResponse, ProjectSettingsUpdate, ProvidersResponse, Settings, SettingsUpdate } from '../lib/types'

 // ============================================================================
 // Projects
@@ -133,6 +133,18 @@ export function useUpdateFeature(projectName: string) {
  })
 }

+export function useResolveHumanInput(projectName: string) {
+  const queryClient = useQueryClient()
+
+  return useMutation({
+    mutationFn: ({ featureId, fields }: { featureId: number; fields: Record<string, string | boolean | string[]> }) =>
+      api.resolveHumanInput(projectName, featureId, { fields }),
+    onSuccess: () => {
+      queryClient.invalidateQueries({ queryKey: ['features', projectName] })
+    },
+  })
+}
+
 // ============================================================================
 // Agent
 // ============================================================================
@@ -197,6 +209,28 @@ export function useResumeAgent(projectName: string) {
  })
 }

+export function useGracefulPauseAgent(projectName: string) {
+  const queryClient = useQueryClient()
+
+  return useMutation({
+    mutationFn: () => api.gracefulPauseAgent(projectName),
+    onSuccess: () => {
+      queryClient.invalidateQueries({ queryKey: ['agent-status', projectName] })
+    },
+  })
+}
+
+export function useGracefulResumeAgent(projectName: string) {
+  const queryClient = useQueryClient()
+
+  return useMutation({
+    mutationFn: () => api.gracefulResumeAgent(projectName),
+    onSuccess: () => {
+      queryClient.invalidateQueries({ queryKey: ['agent-status', projectName] })
+    },
+  })
+}
+
 // ============================================================================
 // Setup
 // ============================================================================
@@ -254,20 +288,41 @@ export function useValidatePath() {
 // Default models response for placeholder (until API responds)
 const DEFAULT_MODELS: ModelsResponse = {
  models: [
-    { id: 'claude-opus-4-5-20251101', name: 'Claude Opus 4.5' },
-    { id: 'claude-sonnet-4-5-20250929', name: 'Claude Sonnet 4.5' },
+    { id: 'claude-opus-4-6', name: 'Claude Opus' },
+    { id: 'claude-sonnet-4-5-20250929', name: 'Claude Sonnet' },
  ],
-  default: 'claude-opus-4-5-20251101',
+  default: 'claude-opus-4-6',
 }

 const DEFAULT_SETTINGS: Settings = {
  yolo_mode: false,
-  model: 'claude-opus-4-5-20251101',
+  model: 'claude-opus-4-6',
  glm_mode: false,
  ollama_mode: false,
  testing_agent_ratio: 1,
  playwright_headless: true,
  batch_size: 3,
+  api_provider: 'claude',
+  api_base_url: null,
+  api_has_auth_token: false,
+  api_model: null,
+}
+
+const DEFAULT_PROVIDERS: ProvidersResponse = {
+  providers: [
+    { id: 'claude', name: 'Claude (Anthropic)', base_url: null, models: DEFAULT_MODELS.models, default_model: 'claude-opus-4-6', requires_auth: false },
+  ],
+  current: 'claude',
+}
+
+export function useAvailableProviders() {
+  return useQuery({
+    queryKey: ['available-providers'],
+    queryFn: api.getAvailableProviders,
+    staleTime: 300000,
+    retry: 1,
+    placeholderData: DEFAULT_PROVIDERS,
+  })
 }

 export function useAvailableModels() {
@@ -319,6 +374,41 @@ export function useUpdateSettings() {
    },
    onSettled: () => {
      queryClient.invalidateQueries({ queryKey: ['settings'] })
+      queryClient.invalidateQueries({ queryKey: ['available-models'] })
+      queryClient.invalidateQueries({ queryKey: ['available-providers'] })
+    },
+  })
+}
+
+// ============================================================================
+// Dev Server Config
+// ============================================================================
+
+// Default config for placeholder (until API responds)
+const DEFAULT_DEV_SERVER_CONFIG: DevServerConfig = {
+  detected_type: null,
+  detected_command: null,
+  custom_command: null,
+  effective_command: null,
+}
+
+export function useDevServerConfig(projectName: string | null) {
+  return useQuery({
+    queryKey: ['dev-server-config', projectName],
+    queryFn: () => api.getDevServerConfig(projectName!),
+    enabled: !!projectName,
+    staleTime: 30_000,
+    placeholderData: DEFAULT_DEV_SERVER_CONFIG,
+  })
+}
+
+export function useUpdateDevServerConfig(projectName: string) {
+  const queryClient = useQueryClient()
+  return useMutation({
+    mutationFn: (customCommand: string | null) =>
+      api.updateDevServerConfig(projectName, customCommand),
+    onSuccess: () => {
+      queryClient.invalidateQueries({ queryKey: ['dev-server-config', projectName] })
    },
  })
 }
--- a/ui/src/hooks/useSpecChat.ts
+++ b/ui/src/hooks/useSpecChat.ts
@@ -157,15 +157,18 @@ export function useSpecChat({
      }, 30000)
    }

-    ws.onclose = () => {
+    ws.onclose = (event) => {
      setConnectionStatus('disconnected')
      if (pingIntervalRef.current) {
        clearInterval(pingIntervalRef.current)
        pingIntervalRef.current = null
      }

+      // Don't retry on application-level errors (4xxx codes won't resolve on retry)
+      const isAppError = event.code >= 4000 && event.code <= 4999
+
      // Attempt reconnection if not intentionally closed
-      if (reconnectAttempts.current < maxReconnectAttempts && !isCompleteRef.current) {
+      if (!isAppError && reconnectAttempts.current < maxReconnectAttempts && !isCompleteRef.current) {
        reconnectAttempts.current++
        const delay = Math.min(1000 * Math.pow(2, reconnectAttempts.current), 10000)
        reconnectTimeoutRef.current = window.setTimeout(connect, delay)
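Note on the reconnect policy: the `onclose` change above combines a close-code check with the existing exponential backoff. A minimal sketch of that decision as pure functions follows. The names `isAppLevelClose`, `backoffDelayMs`, and `nextReconnectDelay` are hypothetical, and the `isCompleteRef` condition is left out for brevity.

```typescript
// Close codes 4000-4999 are the private-use range for applications.
function isAppLevelClose(code: number): boolean {
  return code >= 4000 && code <= 4999;
}

// Same formula as the hook: 1000 * 2^attempt, capped (10 s in useSpecChat).
function backoffDelayMs(attempt: number, capMs: number = 10000): number {
  return Math.min(1000 * Math.pow(2, attempt), capMs);
}

// Returns the delay before the next retry, or null when no retry should happen.
function nextReconnectDelay(
  closeCode: number,
  attemptsSoFar: number,
  maxAttempts: number,
): number | null {
  if (isAppLevelClose(closeCode) || attemptsSoFar >= maxAttempts) return null;
  // useSpecChat increments the counter before computing the delay.
  return backoffDelayMs(attemptsSoFar + 1);
}
```

`useWebSocket.ts` differs slightly: it computes the delay before incrementing the counter and caps at 30000 ms, so its first retry waits 1 s rather than 2 s.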
--- a/ui/src/hooks/useWebSocket.ts
+++ b/ui/src/hooks/useWebSocket.ts
@@ -33,6 +33,7 @@ interface WebSocketState {
  progress: {
    passing: number
    in_progress: number
+    needs_human_input: number
    total: number
    percentage: number
  }
@@ -60,7 +61,7 @@ const MAX_AGENT_LOGS = 500 // Keep last 500 log lines per agent

 export function useProjectWebSocket(projectName: string | null) {
  const [state, setState] = useState<WebSocketState>({
-    progress: { passing: 0, in_progress: 0, total: 0, percentage: 0 },
+    progress: { passing: 0, in_progress: 0, needs_human_input: 0, total: 0, percentage: 0 },
    agentStatus: 'loading',
    logs: [],
    isConnected: false,
@@ -107,6 +108,7 @@ export function useProjectWebSocket(projectName: string | null) {
                progress: {
                  passing: message.passing,
                  in_progress: message.in_progress,
+                  needs_human_input: message.needs_human_input ?? 0,
                  total: message.total,
                  percentage: message.percentage,
                },
@@ -335,10 +337,14 @@ export function useProjectWebSocket(projectName: string | null) {
        }
      }

-      ws.onclose = () => {
+      ws.onclose = (event) => {
        setState(prev => ({ ...prev, isConnected: false }))
        wsRef.current = null

+        // Don't retry on application-level errors (4xxx codes won't resolve on retry)
+        const isAppError = event.code >= 4000 && event.code <= 4999
+        if (isAppError) return
+
        // Exponential backoff reconnection
        const delay = Math.min(1000 * Math.pow(2, reconnectAttempts.current), 30000)
        reconnectAttempts.current++
@@ -381,7 +387,7 @@ export function useProjectWebSocket(projectName: string | null) {
    // Reset state when project changes to clear stale data
    // Use 'loading' for agentStatus to show loading indicator until WebSocket provides actual status
    setState({
-      progress: { passing: 0, in_progress: 0, total: 0, percentage: 0 },
+      progress: { passing: 0, in_progress: 0, needs_human_input: 0, total: 0, percentage: 0 },
      agentStatus: 'loading',
      logs: [],
      isConnected: false,
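Note on item C of the commit message (progress fallback): when no WebSocket progress message has arrived, the total has to be derived from the feature list, and blocked features must be counted in it. The fallback code itself is not part of this hunk, so the following is only a sketch under assumptions: `progressFromBuckets` is a hypothetical name, and the one-decimal percentage rounding is an assumption, not taken from the source.

```typescript
// Hypothetical reduced shape of the feature list response.
interface BucketCounts {
  pending: unknown[];
  in_progress: unknown[];
  done: unknown[];
  needs_human_input?: unknown[];
}

interface Progress {
  passing: number;
  in_progress: number;
  needs_human_input: number;
  total: number;
  percentage: number;
}

function progressFromBuckets(features: BucketCounts): Progress {
  const needsInput = features.needs_human_input?.length ?? 0;
  const passing = features.done.length;
  // Blocked-for-input features are still part of the total.
  const total =
    features.pending.length + features.in_progress.length + passing + needsInput;
  // Assumption: percentage is rounded to one decimal place.
  const percentage = total > 0 ? Math.round((passing / total) * 1000) / 10 : 0;
  return {
    passing,
    in_progress: features.in_progress.length,
    needs_human_input: needsInput,
    total,
    percentage,
  };
}
```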
--- a/ui/src/lib/api.ts
+++ b/ui/src/lib/api.ts
@@ -24,6 +24,7 @@ import type {
  Settings,
  SettingsUpdate,
  ModelsResponse,
+  ProvidersResponse,
  DevServerStatusResponse,
  DevServerConfig,
  TerminalInfo,
@@ -180,6 +181,17 @@ export async function createFeaturesBulk(
  })
 }

+export async function resolveHumanInput(
+  projectName: string,
+  featureId: number,
+  response: { fields: Record<string, string | boolean | string[]> }
+): Promise<Feature> {
+  return fetchJSON(`/projects/${encodeURIComponent(projectName)}/features/${featureId}/resolve-human-input`, {
+    method: 'POST',
+    body: JSON.stringify(response),
+  })
+}
+
 // ============================================================================
 // Dependency Graph API
 // ============================================================================
@@ -270,6 +282,18 @@ export async function resumeAgent(projectName: string): Promise<AgentActionRespo
  })
 }

+export async function gracefulPauseAgent(projectName: string): Promise<AgentActionResponse> {
+  return fetchJSON(`/projects/${encodeURIComponent(projectName)}/agent/graceful-pause`, {
+    method: 'POST',
+  })
+}
+
+export async function gracefulResumeAgent(projectName: string): Promise<AgentActionResponse> {
+  return fetchJSON(`/projects/${encodeURIComponent(projectName)}/agent/graceful-resume`, {
+    method: 'POST',
+  })
+}
+
 // ============================================================================
 // Spec Creation API
 // ============================================================================
@@ -399,6 +423,10 @@ export async function getAvailableModels(): Promise<ModelsResponse> {
  return fetchJSON('/settings/models')
 }

+export async function getAvailableProviders(): Promise<ProvidersResponse> {
+  return fetchJSON('/settings/providers')
+}
+
 export async function getSettings(): Promise<Settings> {
  return fetchJSON('/settings')
 }
@@ -440,6 +468,16 @@ export async function getDevServerConfig(projectName: string): Promise<DevServer
  return fetchJSON(`/projects/${encodeURIComponent(projectName)}/devserver/config`)
 }

+export async function updateDevServerConfig(
+  projectName: string,
+  customCommand: string | null
+): Promise<DevServerConfig> {
+  return fetchJSON(`/projects/${encodeURIComponent(projectName)}/devserver/config`, {
+    method: 'PATCH',
+    body: JSON.stringify({ custom_command: customCommand }),
+  })
+}
+
 // ============================================================================
 // Terminal API
 // ============================================================================
--- a/ui/src/lib/types.ts
+++ b/ui/src/lib/types.ts
@@ -57,6 +57,26 @@ export interface ProjectPrompts {
  coding_prompt: string
 }

+// Human input types
+export interface HumanInputField {
+  id: string
+  label: string
+  type: 'text' | 'textarea' | 'select' | 'boolean'
+  required: boolean
+  placeholder?: string
+  options?: { value: string; label: string }[]
+}
+
+export interface HumanInputRequest {
+  prompt: string
+  fields: HumanInputField[]
+}
+
+export interface HumanInputResponseData {
+  fields: Record<string, string | boolean | string[]>
+  responded_at?: string
+}
+
 // Feature types
 export interface Feature {
  id: number
@@ -70,10 +90,13 @@ export interface Feature {
  dependencies?: number[]           // Optional for backwards compat
  blocked?: boolean                 // Computed by API
  blocking_dependencies?: number[]  // Computed by API
+  needs_human_input?: boolean
+  human_input_request?: HumanInputRequest | null
+  human_input_response?: HumanInputResponseData | null
 }

 // Status type for graph nodes
-export type FeatureStatus = 'pending' | 'in_progress' | 'done' | 'blocked'
+export type FeatureStatus = 'pending' | 'in_progress' | 'done' | 'blocked' | 'needs_human_input'

 // Graph visualization types
 export interface GraphNode {
@@ -99,6 +122,7 @@ export interface FeatureListResponse {
  pending: Feature[]
  in_progress: Feature[]
  done: Feature[]
+  needs_human_input: Feature[]
 }

 export interface FeatureCreate {
@@ -120,7 +144,7 @@ export interface FeatureUpdate {
 }

 // Agent types
-export type AgentStatus = 'stopped' | 'running' | 'paused' | 'crashed' | 'loading'
+export type AgentStatus = 'stopped' | 'running' | 'paused' | 'crashed' | 'loading' | 'pausing' | 'paused_graceful'

 export interface AgentStatusResponse {
  status: AgentStatus
@@ -216,6 +240,8 @@ export type OrchestratorState =
  | 'spawning'
  | 'monitoring'
  | 'complete'
+  | 'draining'
+  | 'paused'

 // Orchestrator event for recent activity
 export interface OrchestratorEvent {
@@ -248,6 +274,7 @@ export interface WSProgressMessage {
  in_progress: number
  total: number
  percentage: number
+  needs_human_input?: number
 }

 export interface WSFeatureUpdateMessage {
@@ -465,6 +492,11 @@ export interface AssistantChatConversationCreatedMessage {
  conversation_id: number
 }

+export interface AssistantChatQuestionMessage {
+  type: 'question'
+  questions: SpecQuestion[]
+}
+
 export interface AssistantChatPongMessage {
  type: 'pong'
 }
@@ -472,6 +504,7 @@ export interface AssistantChatPongMessage {
 export type AssistantChatServerMessage =
  | AssistantChatTextMessage
  | AssistantChatToolCallMessage
+  | AssistantChatQuestionMessage
  | AssistantChatResponseDoneMessage
  | AssistantChatErrorMessage
  | AssistantChatConversationCreatedMessage
@@ -525,6 +558,20 @@ export interface ModelsResponse {
  default: string
 }

+export interface ProviderInfo {
+  id: string
+  name: string
+  base_url: string | null
+  models: ModelInfo[]
+  default_model: string
+  requires_auth: boolean
+}
+
+export interface ProvidersResponse {
+  providers: ProviderInfo[]
+  current: string
+}
+
 export interface Settings {
  yolo_mode: boolean
  model: string
@@ -533,6 +580,10 @@ export interface Settings {
  testing_agent_ratio: number  // Regression testing agents (0-3)
  playwright_headless: boolean
  batch_size: number  // Features per coding agent batch (1-3)
+  api_provider: string
+  api_base_url: string | null
+  api_has_auth_token: boolean
+  api_model: string | null
 }

 export interface SettingsUpdate {
@@ -541,6 +592,10 @@ export interface SettingsUpdate {
  testing_agent_ratio?: number
  playwright_headless?: boolean
  batch_size?: number
+  api_provider?: string
+  api_base_url?: string
+  api_auth_token?: string
+  api_model?: string
 }

 export interface ProjectSettingsUpdate {
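Note on item B of the commit message (field validation): the server-side MCP validation is not shown in this part of the diff. The following is a client-side sketch of the same rules against the `HumanInputField` shape above, assuming "unique non-empty field IDs and labels" means both IDs and labels must be unique. `RawHumanInputField` and `validateHumanInputFields` are hypothetical names.

```typescript
// Hypothetical untrusted input shape: type is a plain string until validated.
interface RawHumanInputField {
  id: string;
  label: string;
  type: string;
  required: boolean;
  options?: { value: string; label: string }[];
}

const FIELD_TYPES: readonly string[] = ["text", "textarea", "select", "boolean"];

function validateHumanInputFields(fields: RawHumanInputField[]): string[] {
  const errors: string[] = [];
  // Field lists are short, so plain arrays are enough for duplicate checks.
  const seenIds: string[] = [];
  const seenLabels: string[] = [];

  fields.forEach((field, index) => {
    const id = field.id.trim();
    const label = field.label.trim();

    if (id === "") errors.push(`field ${index}: id must be non-empty`);
    else if (seenIds.indexOf(id) !== -1) errors.push(`field ${index}: duplicate id "${id}"`);
    else seenIds.push(id);

    if (label === "") errors.push(`field ${index}: label must be non-empty`);
    else if (seenLabels.indexOf(label) !== -1) errors.push(`field ${index}: duplicate label "${label}"`);
    else seenLabels.push(label);

    if (FIELD_TYPES.indexOf(field.type) === -1) {
      errors.push(`field ${index}: unknown type "${field.type}"`);
    }
    if (field.type === "select" && (!field.options || field.options.length === 0)) {
      errors.push(`field ${index}: select requires options`);
    }
  });

  return errors;
}
```

Returning a list of messages rather than throwing on the first problem lets a caller report every invalid field at once.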
--- a/ui/src/styles/globals.css
+++ b/ui/src/styles/globals.css
@@ -1271,6 +1271,186 @@
  margin: 2rem 0;
 }

+/* ============================================================================
+   Chat Prose Typography (for markdown in chat bubbles)
+   ============================================================================ */
+
+.chat-prose {
+  line-height: 1.6;
+  color: inherit;
+}
+
+.chat-prose > :first-child {
+  margin-top: 0;
+}
+
+.chat-prose > :last-child {
+  margin-bottom: 0;
+}
+
+.chat-prose h1 {
+  font-size: 1.25rem;
+  font-weight: 700;
+  margin-top: 1.25rem;
+  margin-bottom: 0.5rem;
+}
+
+.chat-prose h2 {
+  font-size: 1.125rem;
+  font-weight: 700;
+  margin-top: 1rem;
+  margin-bottom: 0.5rem;
+}
+
+.chat-prose h3 {
+  font-size: 1rem;
+  font-weight: 600;
+  margin-top: 0.75rem;
+  margin-bottom: 0.375rem;
+}
+
+.chat-prose h4,
+.chat-prose h5,
+.chat-prose h6 {
+  font-size: 0.875rem;
+  font-weight: 600;
+  margin-top: 0.75rem;
+  margin-bottom: 0.25rem;
+}
+
+.chat-prose p {
+  margin-bottom: 0.5rem;
+}
+
+.chat-prose ul,
+.chat-prose ol {
+  margin-bottom: 0.5rem;
+  padding-left: 1.25rem;
+}
+
+.chat-prose ul {
+  list-style-type: disc;
+}
+
+.chat-prose ol {
+  list-style-type: decimal;
+}
+
+.chat-prose li {
+  margin-bottom: 0.25rem;
+}
+
+.chat-prose li > ul,
+.chat-prose li > ol {
+  margin-top: 0.25rem;
+  margin-bottom: 0;
+}
+
+.chat-prose pre {
+  background: var(--muted);
+  border: 1px solid var(--border);
+  border-radius: var(--radius);
+  padding: 0.75rem;
+  overflow-x: auto;
+  margin-bottom: 0.5rem;
+  font-family: var(--font-mono);
+  font-size: 0.75rem;
+  line-height: 1.5;
+}
+
+.chat-prose code:not(pre code) {
+  background: var(--muted);
+  padding: 0.1rem 0.3rem;
+  border-radius: 0.25rem;
+  font-family: var(--font-mono);
+  font-size: 0.75rem;
+}
+
+.chat-prose table {
+  width: 100%;
+  border-collapse: collapse;
+  margin-bottom: 0.5rem;
+  font-size: 0.8125rem;
+}
+
+.chat-prose th {
+  background: var(--muted);
+  font-weight: 600;
+  text-align: left;
+  padding: 0.375rem 0.5rem;
+  border: 1px solid var(--border);
+}
+
+.chat-prose td {
+  padding: 0.375rem 0.5rem;
+  border: 1px solid var(--border);
+}
+
+.chat-prose blockquote {
+  border-left: 3px solid var(--primary);
+  padding-left: 0.75rem;
+  margin-bottom: 0.5rem;
+  font-style: italic;
+  opacity: 0.9;
+}
+
+.chat-prose a {
+  color: var(--primary);
+  text-decoration: underline;
+  text-underline-offset: 2px;
+}
+
+.chat-prose a:hover {
+  opacity: 0.8;
+}
+
+.chat-prose strong {
+  font-weight: 700;
+}
+
+.chat-prose hr {
+  border: none;
+  border-top: 1px solid var(--border);
+  margin: 0.75rem 0;
+}
+
+.chat-prose img {
+  max-width: 100%;
+  border-radius: var(--radius);
+}
+
+/* User message overrides - need contrast against primary-colored bubble */
+.chat-prose-user pre {
+  background: rgb(255 255 255 / 0.15);
+  border-color: rgb(255 255 255 / 0.2);
+}
+
+.chat-prose-user code:not(pre code) {
+  background: rgb(255 255 255 / 0.15);
+}
+
+.chat-prose-user th {
+  background: rgb(255 255 255 / 0.15);
+}
+
+.chat-prose-user th,
+.chat-prose-user td {
+  border-color: rgb(255 255 255 / 0.2);
+}
+
+.chat-prose-user blockquote {
+  border-left-color: rgb(255 255 255 / 0.5);
+}
+
+.chat-prose-user a {
+  color: inherit;
+  text-decoration: underline;
+}
+
+.chat-prose-user hr {
+  border-top-color: rgb(255 255 255 / 0.2);
+}
+
 /* ============================================================================
   Scrollbar Styling
   ============================================================================ */
--- a/ui/vite.config.ts
+++ b/ui/vite.config.ts
@@ -36,6 +36,8 @@ export default defineConfig({
            '@radix-ui/react-slot',
            '@radix-ui/react-switch',
          ],
+          // Markdown rendering
+          'vendor-markdown': ['react-markdown', 'remark-gfm'],
          // Icons and utilities
          'vendor-utils': [
            'lucide-react',