feat: migrate browser automation from Playwright MCP to CLI, fix headless setting

Major changes across 21 files (755 additions, 196 deletions):

Browser Automation Migration:
- Add versioned project migration system (prompts.py) with content-based detection and section-level regex replacement for coding/testing prompts
- Migrate STEP 5 (browser verification) and BROWSER AUTOMATION sections in coding prompt template to use playwright-cli commands
- Migrate STEP 2 and AVAILABLE TOOLS sections in testing prompt template
- Migration auto-runs at agent startup (autonomous_agent_demo.py), copies playwright-cli skill, scaffolds .playwright/cli.config.json, updates .gitignore, and stamps .migration_version file
- Add playwright-cli command validation to security allowlist (security.py) with tests for allowed subcommands and blocked eval/run-code

Headless Browser Setting Fix:
- Add _apply_playwright_headless() to process_manager.py that reads/updates .playwright/cli.config.json before agent subprocess launch
- Remove dead PLAYWRIGHT_HEADLESS env var that was never consumed
- Settings UI toggle now correctly controls visible browser window

Playwright CLI Auto-Install:
- Add ensurePlaywrightCli() to lib/cli.js for npm global entry point
- Add playwright-cli detection + npm install to start.bat, start.sh, start_ui.bat, start_ui.sh for all startup paths

Other Improvements:
- Add project folder path tooltip to ProjectSelector.tsx dropdown items
- Remove legacy Playwright MCP server configuration from client.py
- Update CLAUDE.md with playwright-cli skill documentation

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-16 18:33:08 +00:00 · 2026-02-11 13:37:03 +02:00
parent f285db1ad3
commit e9873a2642
21 changed files with 754 additions and 195 deletions
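
The headless fix in the commit message amounts to rewriting one key in `.playwright/cli.config.json` before the agent subprocess starts. The `process_manager.py` change is not part of this excerpt, so the following is only a minimal sketch of that idea, not the committed `_apply_playwright_headless()`. The function name `apply_headless_setting`, its return value, and the fallback to an empty config are assumptions made for illustration; the `browser.launchOptions.headless` key path follows the config scaffolded in `prompts.py`.

```python
import json
from pathlib import Path


def apply_headless_setting(project_dir: Path, headless: bool) -> bool:
    """Set browser.launchOptions.headless in .playwright/cli.config.json.

    Hypothetical helper for illustration. Returns True if the file was
    written, False if the stored value already matched.
    """
    config_file = project_dir / ".playwright" / "cli.config.json"
    try:
        config = json.loads(config_file.read_text(encoding="utf-8"))
    except (OSError, json.JSONDecodeError):
        config = {}  # missing or unreadable file: start from an empty config
    if not isinstance(config, dict):
        config = {}

    # Walk down to launchOptions, creating the nesting where it is absent.
    browser = config.get("browser")
    if not isinstance(browser, dict):
        browser = config["browser"] = {}
    launch_options = browser.get("launchOptions")
    if not isinstance(launch_options, dict):
        launch_options = browser["launchOptions"] = {}

    if launch_options.get("headless") == headless:
        return False  # already up to date, leave the file untouched

    launch_options["headless"] = headless
    config_file.parent.mkdir(parents=True, exist_ok=True)
    config_file.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return True
```

Writing the file only when the value changes keeps other keys (viewport, channel, `isolated`) intact and avoids touching the config on every launch.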
--- a/.claude/templates/coding_prompt.template.md
+++ b/.claude/templates/coding_prompt.template.md
@@ -86,24 +86,33 @@ Implement the chosen feature thoroughly:

 **CRITICAL:** You MUST verify features through the actual UI.

-Use browser automation tools:
+Use `playwright-cli` for browser automation:

- Navigate to the app in a real browser
- Interact like a human user (click, type, scroll)
- Take screenshots at each step (use inline screenshots only -- do NOT save screenshot files to disk)
- Verify both functionality AND visual appearance
+- Open the browser: `playwright-cli open http://localhost:PORT`
+- Take a snapshot to see page elements: `playwright-cli snapshot`
+- Read the snapshot YAML file to see element refs
+- Click elements by ref: `playwright-cli click e5`
+- Type text: `playwright-cli type "search query"`
+- Fill form fields: `playwright-cli fill e3 "value"`
+- Take screenshots: `playwright-cli screenshot`
+- Read the screenshot file to verify visual appearance
+- Check console errors: `playwright-cli console`
+- Close browser when done: `playwright-cli close`
+
+**Token-efficient workflow:** `playwright-cli screenshot` and `snapshot` save files
+to `.playwright-cli/`. You will see a file link in the output. Read the file only
+when you need to verify visual appearance or find element refs.

 **DO:**
-
 - Test through the UI with clicks and keyboard input
- Take screenshots to verify visual appearance (inline only, never save to disk)
- Check for console errors in browser
+- Take screenshots and read them to verify visual appearance
+- Check for console errors with `playwright-cli console`
 - Verify complete user workflows end-to-end
+- Always run `playwright-cli close` when finished testing

 **DON'T:**
-
- Only test with curl commands (backend testing alone is insufficient)
- Use JavaScript evaluation to bypass UI (no shortcuts)
+- Only test with curl commands
+- Use JavaScript evaluation to bypass UI (`eval` and `run-code` are blocked)
 - Skip visual verification
 - Mark tests passing without thorough verification

@@ -145,7 +154,7 @@ Use the feature_mark_passing tool with feature_id=42
 - Combine or consolidate features
 - Reorder features

-**ONLY MARK A FEATURE AS PASSING AFTER VERIFICATION WITH SCREENSHOTS.**
+**ONLY MARK A FEATURE AS PASSING AFTER VERIFICATION WITH BROWSER AUTOMATION.**

 ### STEP 7: COMMIT YOUR PROGRESS

@@ -192,11 +201,15 @@ Before context fills up:

 ## BROWSER AUTOMATION

-Use Playwright MCP tools (`browser_*`) for UI verification. Key tools: `navigate`, `click`, `type`, `fill_form`, `take_screenshot`, `console_messages`, `network_requests`. All tools have auto-wait built in.
+Use `playwright-cli` commands for UI verification. Key commands: `open`, `goto`,
+`snapshot`, `click`, `type`, `fill`, `screenshot`, `console`, `close`.

-**Screenshot rule:** Always use inline mode (base64). NEVER save screenshots as files to disk.
+**How it works:** `playwright-cli` uses a persistent browser daemon. `open` starts it,
+subsequent commands interact via socket, `close` shuts it down. Screenshots and snapshots
+save to `.playwright-cli/` -- read the files when you need to verify content.

-Test like a human user with mouse and keyboard. Use `browser_console_messages` to detect errors. Don't bypass UI with JavaScript evaluation.
+Test like a human user with mouse and keyboard. Use `playwright-cli console` to detect
+JS errors. Don't bypass UI with JavaScript evaluation.

 ---

--- a/.claude/templates/testing_prompt.template.md
+++ b/.claude/templates/testing_prompt.template.md
@@ -31,26 +31,32 @@ For the feature returned:
 1. Read and understand the feature's verification steps
 2. Navigate to the relevant part of the application
 3. Execute each verification step using browser automation
-4. Take screenshots to document the verification (inline only -- do NOT save to disk)
+4. Take screenshots and read them to verify visual appearance
 5. Check for console errors

-Use browser automation tools:
+### Browser Automation (Playwright CLI)

 **Navigation & Screenshots:**
- browser_navigate - Navigate to a URL
- browser_take_screenshot - Capture screenshot (inline mode only -- never save to disk)
- browser_snapshot - Get accessibility tree snapshot
+- `playwright-cli open <url>` - Open browser and navigate
+- `playwright-cli goto <url>` - Navigate to URL
+- `playwright-cli screenshot` - Save screenshot to `.playwright-cli/`
+- `playwright-cli snapshot` - Save page snapshot with element refs to `.playwright-cli/`

 **Element Interaction:**
- browser_click - Click elements
- browser_type - Type text into editable elements
- browser_fill_form - Fill multiple form fields
- browser_select_option - Select dropdown options
- browser_press_key - Press keyboard keys
+- `playwright-cli click <ref>` - Click elements (ref from snapshot)
+- `playwright-cli type <text>` - Type text
+- `playwright-cli fill <ref> <text>` - Fill form fields
+- `playwright-cli select <ref> <val>` - Select dropdown
+- `playwright-cli press <key>` - Keyboard input

 **Debugging:**
- browser_console_messages - Get browser console output (check for errors)
- browser_network_requests - Monitor API calls
+- `playwright-cli console` - Check for JS errors
+- `playwright-cli network` - Monitor API calls
+
+**Cleanup:**
+- `playwright-cli close` - Close browser when done (ALWAYS do this)
+
+**Note:** Screenshots and snapshots save to files. Read the file to see the content.

 ### STEP 3: HANDLE RESULTS

@@ -79,7 +85,7 @@ A regression has been introduced. You MUST fix it:

 4. **Verify the fix:**
   - Run through all verification steps again
-   - Take screenshots confirming the fix (inline only, never save to disk)
+   - Take screenshots and read them to confirm the fix

 5. **Mark as passing after fix:**
   ```
@@ -98,7 +104,7 @@ A regression has been introduced. You MUST fix it:

 ---

-## AVAILABLE MCP TOOLS
+## AVAILABLE TOOLS

 ### Feature Management
 - `feature_get_stats` - Get progress overview (passing/in_progress/total counts)
@@ -106,19 +112,17 @@ A regression has been introduced. You MUST fix it:
 - `feature_mark_failing` - Mark a feature as failing (when you find a regression)
 - `feature_mark_passing` - Mark a feature as passing (after fixing a regression)

-### Browser Automation (Playwright)
-All interaction tools have **built-in auto-wait** -- no manual timeouts needed.
-
- `browser_navigate` - Navigate to URL
- `browser_take_screenshot` - Capture screenshot (inline only, never save to disk)
- `browser_snapshot` - Get accessibility tree
- `browser_click` - Click elements
- `browser_type` - Type text
- `browser_fill_form` - Fill form fields
- `browser_select_option` - Select dropdown
- `browser_press_key` - Keyboard input
- `browser_console_messages` - Check for JS errors
- `browser_network_requests` - Monitor API calls
+### Browser Automation (Playwright CLI)
+Use `playwright-cli` commands for browser interaction. Key commands:
+- `playwright-cli open <url>` - Open browser
+- `playwright-cli goto <url>` - Navigate to URL
+- `playwright-cli screenshot` - Take screenshot (saved to `.playwright-cli/`)
+- `playwright-cli snapshot` - Get page snapshot with element refs
+- `playwright-cli click <ref>` - Click element
+- `playwright-cli type <text>` - Type text
+- `playwright-cli fill <ref> <text>` - Fill form field
+- `playwright-cli console` - Check for JS errors
+- `playwright-cli close` - Close browser (always do this when done)

 ---

--- a/.gitignore
+++ b/.gitignore
@@ -10,6 +10,10 @@ issues/
 # Browser profiles for parallel agent execution
 .browser-profiles/

+# Playwright CLI daemon artifacts
+.playwright-cli/
+.playwright/
+
 # Log files
 logs/
 *.log
--- a/.npmignore
+++ b/.npmignore
@@ -28,5 +28,4 @@ start.sh
 start_ui.sh
 start_ui.py
 .claude/agents/
-.claude/skills/
 .claude/settings.json
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -85,7 +85,7 @@ python autonomous_agent_demo.py --project-dir my-app --yolo

 **What's different in YOLO mode:**
 - No regression testing
- No Playwright MCP server (browser automation disabled)
+- No Playwright CLI (browser automation disabled)
 - Features marked passing after lint/type-check succeeds
 - Faster iteration for prototyping

@@ -163,7 +163,7 @@ Publishing: `npm publish` (triggers `prepublishOnly` which builds UI, then publi
 - `autonomous_agent_demo.py` - Entry point for running the agent (supports `--yolo`, `--parallel`, `--batch-size`, `--batch-features`)
 - `autoforge_paths.py` - Central path resolution with dual-path backward compatibility and migration
 - `agent.py` - Agent session loop using Claude Agent SDK
- `client.py` - ClaudeSDKClient configuration with security hooks, MCP servers, and Vertex AI support
+- `client.py` - ClaudeSDKClient configuration with security hooks, feature MCP server, and Vertex AI support
 - `security.py` - Bash command allowlist validation (ALLOWED_COMMANDS whitelist)
 - `prompts.py` - Prompt template loading with project-specific fallback and batch feature prompts
 - `progress.py` - Progress tracking, database queries, webhook notifications
@@ -288,6 +288,9 @@ Projects can be stored in any directory (registered in `~/.autoforge/registry.db
 - `.autoforge/.agent.lock` - Lock file to prevent multiple agent instances
 - `.autoforge/allowed_commands.yaml` - Project-specific bash command allowlist (optional)
 - `.autoforge/.gitignore` - Ignores runtime files
+- `.claude/skills/playwright-cli/` - Playwright CLI skill for browser automation
+- `.playwright/cli.config.json` - Browser configuration (headless, viewport, etc.)
+- `.playwright-cli/` - Playwright CLI daemon artifacts (screenshots, snapshots) - gitignored
 - `CLAUDE.md` - Stays at project root (SDK convention)
 - `app_spec.txt` - Root copy for agent template compatibility

@@ -445,6 +448,7 @@ Alternative providers are configured via the **Settings UI** (gear icon > API Pr
 **Skills** (`.claude/skills/`):
 - `frontend-design` - Distinctive, production-grade UI design
 - `gsd-to-autoforge-spec` - Convert GSD codebase mapping to AutoForge app_spec format
+- `playwright-cli` - Browser automation via Playwright CLI (copied to each project)

 **Other:**
 - `.claude/templates/` - Prompt templates copied to new projects
@@ -479,7 +483,7 @@ When running with `--parallel`, the orchestrator:
 1. Spawns multiple Claude agents as subprocesses (up to `--max-concurrency`)
 2. Each agent claims features atomically via `feature_claim_and_get`
 3. Features blocked by unmet dependencies are skipped
-4. Browser contexts are isolated per agent using `--isolated` flag
+4. Browser sessions are isolated per agent via `PLAYWRIGHT_CLI_SESSION` environment variable
 5. AgentTracker parses output and emits `agent_update` messages for UI

 ### Process Limits (Parallel Mode)
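
The updated item 4 in the CLAUDE.md hunk above says browser sessions are isolated per agent via the `PLAYWRIGHT_CLI_SESSION` environment variable. The code that sets it is not shown in this excerpt, so this is a rough sketch of how an orchestrator might build a per-agent environment. The helper name `build_agent_env` and the `agent-<label>` naming scheme are hypothetical.

```python
from __future__ import annotations

import os


def build_agent_env(agent_label: str, base_env: dict[str, str] | None = None) -> dict[str, str]:
    """Return a copy of the environment with a per-agent session name.

    Hypothetical helper: the session name format is an assumption, chosen
    only so that each agent ends up with a distinct value.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["PLAYWRIGHT_CLI_SESSION"] = f"agent-{agent_label}"
    return env
```

The returned mapping could then be passed as `env=` to `subprocess.Popen` when spawning each agent, so commands issued by different agents do not share a browser session.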
--- a/agent.py
+++ b/agent.py
@@ -240,17 +240,7 @@ async def run_autonomous_agent(
        print_session_header(iteration, is_initializer)

        # Create client (fresh context)
-        # Pass agent_id for browser isolation in multi-agent scenarios
-        import os
-        if agent_type == "testing":
-            agent_id = f"testing-{os.getpid()}"  # Unique ID for testing agents
-        elif feature_ids and len(feature_ids) > 1:
-            agent_id = f"batch-{feature_ids[0]}"
-        elif feature_id:
-            agent_id = f"feature-{feature_id}"
-        else:
-            agent_id = None
-        client = create_client(project_dir, model, yolo_mode=yolo_mode, agent_id=agent_id, agent_type=agent_type)
+        client = create_client(project_dir, model, yolo_mode=yolo_mode, agent_type=agent_type)

        # Choose prompt based on agent type
        if agent_type == "initializer":
--- a/autoforge_paths.py
+++ b/autoforge_paths.py
@@ -43,6 +43,7 @@ assistant.db-shm
 .claude_assistant_settings.json
 .claude_settings.expand.*.json
 .progress_cache
+.migration_version
 """


--- a/autonomous_agent_demo.py
+++ b/autonomous_agent_demo.py
@@ -237,6 +237,12 @@ def main() -> None:
    if migrated:
        print(f"Migrated project files to .autoforge/: {', '.join(migrated)}", flush=True)

+    # Migrate project to current AutoForge version (idempotent, safe)
+    from prompts import migrate_project_to_current
+    version_migrated = migrate_project_to_current(project_dir)
+    if version_migrated:
+        print(f"Upgraded project: {', '.join(version_migrated)}", flush=True)
+
    # Parse batch testing feature IDs (comma-separated string -> list[int])
    testing_feature_ids: list[int] | None = None
    if args.testing_feature_ids:
--- a/client.py
+++ b/client.py
@@ -21,16 +21,6 @@ from security import SENSITIVE_DIRECTORIES, bash_security_hook
 # Load environment variables from .env file if present
 load_dotenv()

-# Default Playwright headless mode - can be overridden via PLAYWRIGHT_HEADLESS env var
-# When True, browser runs invisibly in background (default - saves CPU)
-# When False, browser window is visible (useful for monitoring agent progress)
-DEFAULT_PLAYWRIGHT_HEADLESS = True
-
-# Default browser for Playwright - can be overridden via PLAYWRIGHT_BROWSER env var
-# Options: chrome, firefox, webkit, msedge
-# Firefox is recommended for lower CPU usage
-DEFAULT_PLAYWRIGHT_BROWSER = "firefox"
-
 # Extra read paths for cross-project file access (read-only)
 # Set EXTRA_READ_PATHS environment variable with comma-separated absolute paths
 # Example: EXTRA_READ_PATHS=/Volumes/Data/dev,/Users/shared/libs
@@ -41,6 +31,7 @@ EXTRA_READ_PATHS_VAR = "EXTRA_READ_PATHS"
 # this blocklist and the filesystem browser API share a single source of truth.
 EXTRA_READ_PATHS_BLOCKLIST = SENSITIVE_DIRECTORIES

+
 def convert_model_for_vertex(model: str) -> str:
    """
    Convert model name format for Vertex AI compatibility.
@@ -72,43 +63,6 @@ def convert_model_for_vertex(model: str) -> str:
    return model


-def get_playwright_headless() -> bool:
-    """
-    Get the Playwright headless mode setting.
-
-    Reads from PLAYWRIGHT_HEADLESS environment variable, defaults to True.
-    Returns True for headless mode (invisible browser), False for visible browser.
-    """
-    value = os.getenv("PLAYWRIGHT_HEADLESS", str(DEFAULT_PLAYWRIGHT_HEADLESS).lower()).strip().lower()
-    truthy = {"true", "1", "yes", "on"}
-    falsy = {"false", "0", "no", "off"}
-    if value not in truthy | falsy:
-        print(f"   - Warning: Invalid PLAYWRIGHT_HEADLESS='{value}', defaulting to {DEFAULT_PLAYWRIGHT_HEADLESS}")
-        return DEFAULT_PLAYWRIGHT_HEADLESS
-    return value in truthy
-
-
-# Valid browsers supported by Playwright MCP
-VALID_PLAYWRIGHT_BROWSERS = {"chrome", "firefox", "webkit", "msedge"}
-
-
-def get_playwright_browser() -> str:
-    """
-    Get the browser to use for Playwright.
-
-    Reads from PLAYWRIGHT_BROWSER environment variable, defaults to firefox.
-    Options: chrome, firefox, webkit, msedge
-    Firefox is recommended for lower CPU usage.
-    """
-    value = os.getenv("PLAYWRIGHT_BROWSER", DEFAULT_PLAYWRIGHT_BROWSER).strip().lower()
-    if value not in VALID_PLAYWRIGHT_BROWSERS:
-        print(f"   - Warning: Invalid PLAYWRIGHT_BROWSER='{value}', "
-              f"valid options: {', '.join(sorted(VALID_PLAYWRIGHT_BROWSERS))}. "
-              f"Defaulting to {DEFAULT_PLAYWRIGHT_BROWSER}")
-        return DEFAULT_PLAYWRIGHT_BROWSER
-    return value
-
-
 def get_extra_read_paths() -> list[Path]:
    """
    Get extra read-only paths from EXTRA_READ_PATHS environment variable.
@@ -228,41 +182,6 @@ ALL_FEATURE_MCP_TOOLS = sorted(
    set(CODING_AGENT_TOOLS) | set(TESTING_AGENT_TOOLS) | set(INITIALIZER_AGENT_TOOLS)
 )

-# Playwright MCP tools for browser automation.
-# Full set of tools for comprehensive UI testing including drag-and-drop,
-# hover menus, file uploads, tab management, etc.
-PLAYWRIGHT_TOOLS = [
-    # Core navigation & screenshots
-    "mcp__playwright__browser_navigate",
-    "mcp__playwright__browser_navigate_back",
-    "mcp__playwright__browser_take_screenshot",
-    "mcp__playwright__browser_snapshot",
-
-    # Element interaction
-    "mcp__playwright__browser_click",
-    "mcp__playwright__browser_type",
-    "mcp__playwright__browser_fill_form",
-    "mcp__playwright__browser_select_option",
-    "mcp__playwright__browser_press_key",
-    "mcp__playwright__browser_drag",
-    "mcp__playwright__browser_hover",
-    "mcp__playwright__browser_file_upload",
-
-    # JavaScript & debugging
-    "mcp__playwright__browser_evaluate",
-    # "mcp__playwright__browser_run_code",  # REMOVED - causes Playwright MCP server crash
-    "mcp__playwright__browser_console_messages",
-    "mcp__playwright__browser_network_requests",
-
-    # Browser management
-    "mcp__playwright__browser_resize",
-    "mcp__playwright__browser_wait_for",
-    "mcp__playwright__browser_handle_dialog",
-    "mcp__playwright__browser_install",
-    "mcp__playwright__browser_close",
-    "mcp__playwright__browser_tabs",
-]
-
 # Built-in tools available to agents.
 # WebFetch and WebSearch are included so coding agents can look up current
 # documentation for frameworks and libraries they are implementing.
@@ -282,7 +201,6 @@ def create_client(
    project_dir: Path,
    model: str,
    yolo_mode: bool = False,
-    agent_id: str | None = None,
    agent_type: str = "coding",
 ):
    """
@@ -291,9 +209,7 @@ def create_client(
    Args:
        project_dir: Directory for the project
        model: Claude model to use
-        yolo_mode: If True, skip Playwright MCP server for rapid prototyping
-        agent_id: Optional unique identifier for browser isolation in parallel mode.
-                  When provided, each agent gets its own browser profile.
+        yolo_mode: If True, skip browser testing for rapid prototyping
        agent_type: One of "coding", "testing", or "initializer". Controls which
                    MCP tools are exposed and the max_turns limit.

@@ -327,11 +243,8 @@ def create_client(
    }
    max_turns = max_turns_map.get(agent_type, 300)

-    # Build allowed tools list based on mode and agent type.
-    # In YOLO mode, exclude Playwright tools for faster prototyping.
+    # Build allowed tools list based on agent type.
    allowed_tools = [*BUILTIN_TOOLS, *feature_tools]
-    if not yolo_mode:
-        allowed_tools.extend(PLAYWRIGHT_TOOLS)

    # Build permissions list.
    # We permit ALL feature MCP tools at the security layer (so the MCP server
@@ -363,10 +276,6 @@ def create_client(
        permissions_list.append(f"Glob({path}/**)")
        permissions_list.append(f"Grep({path}/**)")

-    if not yolo_mode:
-        # Allow Playwright MCP tools for browser automation (standard mode only)
-        permissions_list.extend(PLAYWRIGHT_TOOLS)
-
    # Create comprehensive security settings
    # Note: Using relative paths ("./**") restricts access to project directory
    # since cwd is set to project_dir
@@ -395,9 +304,9 @@ def create_client(
        print(f"   - Extra read paths (validated): {', '.join(str(p) for p in extra_read_paths)}")
    print("   - Bash commands restricted to allowlist (see security.py)")
    if yolo_mode:
-        print("   - MCP servers: features (database) - YOLO MODE (no Playwright)")
+        print("   - MCP servers: features (database) - YOLO MODE (no browser testing)")
    else:
-        print("   - MCP servers: playwright (browser), features (database)")
+        print("   - MCP servers: features (database)")
    print("   - Project settings enabled (skills, commands, CLAUDE.md)")
    print()

@@ -421,36 +330,6 @@ def create_client(
            },
        },
    }
-    if not yolo_mode:
-        # Include Playwright MCP server for browser automation (standard mode only)
-        # Browser and headless mode configurable via environment variables
-        browser = get_playwright_browser()
-        playwright_args = [
-            "@playwright/mcp@latest",
-            "--viewport-size", "1280x720",
-            "--browser", browser,
-        ]
-        if get_playwright_headless():
-            playwright_args.append("--headless")
-        print(f"   - Browser: {browser} (headless={get_playwright_headless()})")
-
-        # Browser isolation for parallel execution
-        # Each agent gets its own isolated browser context to prevent tab conflicts
-        if agent_id:
-            # Use --isolated for ephemeral browser context
-            # This creates a fresh, isolated context without persistent state
-            # Note: --isolated and --user-data-dir are mutually exclusive
-            playwright_args.append("--isolated")
-            print(f"   - Browser isolation enabled for agent: {agent_id}")
-
-        mcp_servers["playwright"] = {
-            "command": "npx",
-            "args": playwright_args,
-            "env": {
-                "NODE_COMPILE_CACHE": "",  # Disable V8 compile caching to prevent .node file accumulation in %TEMP%
-            },
-        }
-
    # Build environment overrides for API endpoint configuration
    # Uses get_effective_sdk_env() which reads provider settings from the database,
    # ensuring UI-configured alternative providers (GLM, Ollama, Kimi, Custom) propagate
--- a/lib/cli.js
+++ b/lib/cli.js
@@ -517,6 +517,41 @@ function killProcess(pid) {
  }
 }

+// ---------------------------------------------------------------------------
+// Playwright CLI
+// ---------------------------------------------------------------------------
+
+/**
+ * Ensure playwright-cli is available globally for browser automation.
+ * Returns true if available (already installed or freshly installed).
+ *
+ * @param {boolean} showProgress - If true, print install progress
+ */
+function ensurePlaywrightCli(showProgress) {
+  try {
+    execSync('playwright-cli --version', {
+      timeout: 10_000,
+      stdio: ['pipe', 'pipe', 'pipe'],
+    });
+    return true;
+  } catch {
+    // Not installed — try to install
+  }
+
+  if (showProgress) {
+    log('      Installing playwright-cli for browser automation...');
+  }
+  try {
+    execSync('npm install -g @playwright/cli', {
+      timeout: 120_000,
+      stdio: ['pipe', 'pipe', 'pipe'],
+    });
+    return true;
+  } catch {
+    return false;
+  }
+}
+
 // ---------------------------------------------------------------------------
 // CLI commands
 // ---------------------------------------------------------------------------
@@ -613,6 +648,14 @@ function startServer(opts) {
  }
  const wasAlreadyReady = ensureVenv(python, repair);

+  // Ensure playwright-cli for browser automation (quick check, installs once)
+  if (!ensurePlaywrightCli(!wasAlreadyReady)) {
+    log('');
+    log('  Note: playwright-cli not available (browser automation will be limited)');
+    log('  Install manually: npm install -g @playwright/cli');
+    log('');
+  }
+
  // Step 3: Config file
  const configCreated = ensureEnvFile();

--- a/package.json
+++ b/package.json
@@ -19,6 +19,7 @@
    "ui/dist/",
    "ui/package.json",
    ".claude/commands/",
+    ".claude/skills/",
    ".claude/templates/",
    "examples/",
    "start.py",
--- a/prompts.py
+++ b/prompts.py
@@ -16,6 +16,9 @@ from pathlib import Path
 # Base templates location (generic templates)
 TEMPLATES_DIR = Path(__file__).parent / ".claude" / "templates"

+# Migration version — bump when adding new migration steps
+CURRENT_MIGRATION_VERSION = 1
+

 def get_project_prompts_dir(project_dir: Path) -> Path:
    """Get the prompts directory for a specific project."""
@@ -99,9 +102,9 @@ def _strip_browser_testing_sections(prompt: str) -> str:
        flags=re.DOTALL,
    )

-    # Replace the screenshots-only marking rule with YOLO-appropriate wording
+    # Replace the marking rule with YOLO-appropriate wording
    prompt = prompt.replace(
-        "**ONLY MARK A FEATURE AS PASSING AFTER VERIFICATION WITH SCREENSHOTS.**",
+        "**ONLY MARK A FEATURE AS PASSING AFTER VERIFICATION WITH BROWSER AUTOMATION.**",
        "**YOLO mode: Mark a feature as passing after lint/type-check succeeds and server starts cleanly.**",
    )

@@ -351,9 +354,70 @@ def scaffold_project_prompts(project_dir: Path) -> Path:
        except (OSError, PermissionError) as e:
            print(f"  Warning: Could not copy allowed_commands.yaml: {e}")

+    # Copy Playwright CLI skill for browser automation
+    skills_src = Path(__file__).parent / ".claude" / "skills" / "playwright-cli"
+    skills_dest = project_dir / ".claude" / "skills" / "playwright-cli"
+    if skills_src.exists() and not skills_dest.exists():
+        try:
+            shutil.copytree(skills_src, skills_dest)
+            copied_files.append(".claude/skills/playwright-cli/")
+        except (OSError, PermissionError) as e:
+            print(f"  Warning: Could not copy playwright-cli skill: {e}")
+
+    # Ensure .playwright-cli/ and .playwright/ are in project .gitignore
+    project_gitignore = project_dir / ".gitignore"
+    entries_to_add = [".playwright-cli/", ".playwright/"]
+    existing_lines: list[str] = []
+    if project_gitignore.exists():
+        try:
+            existing_lines = project_gitignore.read_text(encoding="utf-8").splitlines()
+        except (OSError, PermissionError):
+            pass
+    missing_entries = [e for e in entries_to_add if e not in existing_lines]
+    if missing_entries:
+        try:
+            with open(project_gitignore, "a", encoding="utf-8") as f:
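+                # Write a newline first when the last existing line is non-blank
+                # (splitlines() drops the trailing newline, so a file that already
+                # ends with one gets a blank separator line before the new entries)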
+                if existing_lines and existing_lines[-1].strip():
+                    f.write("\n")
+                for entry in missing_entries:
+                    f.write(f"{entry}\n")
+        except (OSError, PermissionError) as e:
+            print(f"  Warning: Could not update .gitignore: {e}")
+
+    # Scaffold .playwright/cli.config.json for browser settings
+    playwright_config_dir = project_dir / ".playwright"
+    playwright_config_file = playwright_config_dir / "cli.config.json"
+    if not playwright_config_file.exists():
+        try:
+            playwright_config_dir.mkdir(parents=True, exist_ok=True)
+            import json
+            config = {
+                "browser": {
+                    "browserName": "chromium",
+                    "launchOptions": {
+                        "channel": "chrome",
+                        "headless": True,
+                    },
+                    "contextOptions": {
+                        "viewport": {"width": 1280, "height": 720},
+                    },
+                    "isolated": True,
+                },
+            }
+            with open(playwright_config_file, "w", encoding="utf-8") as f:
+                json.dump(config, f, indent=2)
+                f.write("\n")
+            copied_files.append(".playwright/cli.config.json")
+        except (OSError, PermissionError) as e:
+            print(f"  Warning: Could not create playwright config: {e}")
+
    if copied_files:
        print(f"  Created project files: {', '.join(copied_files)}")

+    # Stamp new projects at the current migration version so they never trigger migration
+    _set_migration_version(project_dir, CURRENT_MIGRATION_VERSION)
+
    return project_prompts


@@ -425,3 +489,330 @@ def copy_spec_to_project(project_dir: Path) -> None:
            return

    print("Warning: No app_spec.txt found to copy to project directory")
+
+
+# ---------------------------------------------------------------------------
+# Project version migration
+# ---------------------------------------------------------------------------
+
+# Replacement content: coding_prompt.md STEP 5 section (Playwright CLI)
+_CLI_STEP5_CONTENT = """\
+### STEP 5: VERIFY WITH BROWSER AUTOMATION
+
+**CRITICAL:** You MUST verify features through the actual UI.
+
+Use `playwright-cli` for browser automation:
+
+- Open the browser: `playwright-cli open http://localhost:PORT`
+- Take a snapshot to see page elements: `playwright-cli snapshot`
+- Read the snapshot YAML file to see element refs
+- Click elements by ref: `playwright-cli click e5`
+- Type text: `playwright-cli type "search query"`
+- Fill form fields: `playwright-cli fill e3 "value"`
+- Take screenshots: `playwright-cli screenshot`
+- Read the screenshot file to verify visual appearance
+- Check console errors: `playwright-cli console`
+- Close browser when done: `playwright-cli close`
+
+**Token-efficient workflow:** `playwright-cli screenshot` and `snapshot` save files
+to `.playwright-cli/`. You will see a file link in the output. Read the file only
+when you need to verify visual appearance or find element refs.
+
+**DO:**
+- Test through the UI with clicks and keyboard input
+- Take screenshots and read them to verify visual appearance
+- Check for console errors with `playwright-cli console`
+- Verify complete user workflows end-to-end
+- Always run `playwright-cli close` when finished testing
+
+**DON'T:**
+- Only test with curl commands
+- Use JavaScript evaluation to bypass UI (`eval` and `run-code` are blocked)
+- Skip visual verification
+- Mark tests passing without thorough verification
+
+"""
+
+# Replacement content: coding_prompt.md BROWSER AUTOMATION reference section
+_CLI_BROWSER_SECTION = """\
+## BROWSER AUTOMATION
+
+Use `playwright-cli` commands for UI verification. Key commands: `open`, `goto`,
+`snapshot`, `click`, `type`, `fill`, `screenshot`, `console`, `close`.
+
+**How it works:** `playwright-cli` uses a persistent browser daemon. `open` starts it,
+subsequent commands interact via socket, `close` shuts it down. Screenshots and snapshots
+save to `.playwright-cli/` -- read the files when you need to verify content.
+
+Test like a human user with mouse and keyboard. Use `playwright-cli console` to detect
+JS errors. Don't bypass UI with JavaScript evaluation.
+
+"""
+
+# Replacement content: testing_prompt.md STEP 2 section (Playwright CLI)
+_CLI_TESTING_STEP2 = """\
+### STEP 2: VERIFY THE FEATURE
+
+**CRITICAL:** You MUST verify the feature through the actual UI using browser automation.
+
+For the feature returned:
+1. Read and understand the feature's verification steps
+2. Navigate to the relevant part of the application
+3. Execute each verification step using browser automation
+4. Take screenshots and read them to verify visual appearance
+5. Check for console errors
+
+### Browser Automation (Playwright CLI)
+
+**Navigation & Screenshots:**
+- `playwright-cli open <url>` - Open browser and navigate
+- `playwright-cli goto <url>` - Navigate to URL
+- `playwright-cli screenshot` - Save screenshot to `.playwright-cli/`
+- `playwright-cli snapshot` - Save page snapshot with element refs to `.playwright-cli/`
+
+**Element Interaction:**
+- `playwright-cli click <ref>` - Click elements (ref from snapshot)
+- `playwright-cli type <text>` - Type text
+- `playwright-cli fill <ref> <text>` - Fill form fields
+- `playwright-cli select <ref> <val>` - Select dropdown
+- `playwright-cli press <key>` - Keyboard input
+
+**Debugging:**
+- `playwright-cli console` - Check for JS errors
+- `playwright-cli network` - Monitor API calls
+
+**Cleanup:**
+- `playwright-cli close` - Close browser when done (ALWAYS do this)
+
+**Note:** Screenshots and snapshots save to files. Read the file to see the content.
+
+"""
+
+# Replacement content: testing_prompt.md AVAILABLE TOOLS browser subsection
+_CLI_TESTING_TOOLS = """\
+### Browser Automation (Playwright CLI)
+Use `playwright-cli` commands for browser interaction. Key commands:
+- `playwright-cli open <url>` - Open browser
+- `playwright-cli goto <url>` - Navigate to URL
+- `playwright-cli screenshot` - Take screenshot (saved to `.playwright-cli/`)
+- `playwright-cli snapshot` - Get page snapshot with element refs
+- `playwright-cli click <ref>` - Click element
+- `playwright-cli type <text>` - Type text
+- `playwright-cli fill <ref> <text>` - Fill form field
+- `playwright-cli console` - Check for JS errors
+- `playwright-cli close` - Close browser (always do this when done)
+
+"""
+
+
+def _get_migration_version(project_dir: Path) -> int:
+    """Read the migration version from .autoforge/.migration_version."""
+    from autoforge_paths import get_autoforge_dir
+    version_file = get_autoforge_dir(project_dir) / ".migration_version"
+    if not version_file.exists():
+        return 0
+    try:
+        return int(version_file.read_text().strip())
+    except (ValueError, OSError):
+        return 0
+
+
+def _set_migration_version(project_dir: Path, version: int) -> None:
+    """Write the migration version to .autoforge/.migration_version."""
+    from autoforge_paths import get_autoforge_dir
+    version_file = get_autoforge_dir(project_dir) / ".migration_version"
+    version_file.parent.mkdir(parents=True, exist_ok=True)
+    version_file.write_text(str(version))
+
+
+def _migrate_coding_prompt_to_cli(content: str) -> str:
+    """Replace MCP-based Playwright sections with CLI-based content in coding prompt."""
+    # Replace STEP 5 section (from header to just before STEP 5.5)
+    content = re.sub(
+        r"### STEP 5: VERIFY WITH BROWSER AUTOMATION.*?(?=### STEP 5\.5:)",
+        lambda _match: _CLI_STEP5_CONTENT,  # callable repl: inserted literally, no escape processing
+        content,
+        count=1,
+        flags=re.DOTALL,
+    )
+
+    # Replace BROWSER AUTOMATION reference section (from header to next ---)
+    content = re.sub(
+        r"## BROWSER AUTOMATION\n\n.*?(?=---)",
+        lambda _match: _CLI_BROWSER_SECTION,  # callable repl: inserted literally, no escape processing
+        content,
+        count=1,
+        flags=re.DOTALL,
+    )
+
+    # Replace inline screenshot rule
+    content = content.replace(
+        "**ONLY MARK A FEATURE AS PASSING AFTER VERIFICATION WITH SCREENSHOTS.**",
+        "**ONLY MARK A FEATURE AS PASSING AFTER VERIFICATION WITH BROWSER AUTOMATION.**",
+    )
+
+    # Replace inline screenshot references (various phrasings from old templates)
+    for old_phrase in (
+        "(inline only -- do NOT save to disk)",
+        "(inline only, never save to disk)",
+        "(inline mode only -- never save to disk)",
+    ):
+        content = content.replace(old_phrase, "(saved to `.playwright-cli/`)")
+
+    return content
+
+
+def _migrate_testing_prompt_to_cli(content: str) -> str:
+    """Replace MCP-based Playwright sections with CLI-based content in testing prompt."""
+    # Replace AVAILABLE TOOLS browser subsection FIRST (before STEP 2, to avoid
+    # matching the new CLI subsection header that the STEP 2 replacement inserts).
+    # In old prompts, ### Browser Automation (Playwright) only exists in AVAILABLE TOOLS.
+    content = re.sub(
+        r"### Browser Automation \(Playwright[^)]*\)\n.*?(?=---)",
+        lambda _match: _CLI_TESTING_TOOLS,  # callable repl: inserted literally, no escape processing
+        content,
+        count=1,
+        flags=re.DOTALL,
+    )
+
+    # Replace STEP 2 verification section (from header to just before STEP 3)
+    content = re.sub(
+        r"### STEP 2: VERIFY THE FEATURE.*?(?=### STEP 3:)",
+        lambda _match: _CLI_TESTING_STEP2,  # callable repl: inserted literally, no escape processing
+        content,
+        count=1,
+        flags=re.DOTALL,
+    )
+
+    # Replace inline screenshot references (various phrasings from old templates)
+    for old_phrase in (
+        "(inline only -- do NOT save to disk)",
+        "(inline only, never save to disk)",
+        "(inline mode only -- never save to disk)",
+    ):
+        content = content.replace(old_phrase, "(saved to `.playwright-cli/`)")
+
+    return content
+
+
+def _migrate_v0_to_v1(project_dir: Path) -> list[str]:
+    """Migrate from v0 (MCP-based Playwright) to v1 (Playwright CLI).
+
+    Four idempotent sub-steps:
+    A. Copy playwright-cli skill to project
+    B. Scaffold .playwright/cli.config.json
+    C. Update .gitignore with .playwright-cli/ and .playwright/
+    D. Update coding_prompt.md and testing_prompt.md
+    """
+    import json
+
+    migrated: list[str] = []
+
+    # A. Copy Playwright CLI skill
+    skills_src = Path(__file__).parent / ".claude" / "skills" / "playwright-cli"
+    skills_dest = project_dir / ".claude" / "skills" / "playwright-cli"
+    if skills_src.exists() and not skills_dest.exists():
+        try:
+            shutil.copytree(skills_src, skills_dest)
+            migrated.append("Copied playwright-cli skill")
+        except (OSError, PermissionError) as e:
+            print(f"  Warning: Could not copy playwright-cli skill: {e}")
+
+    # B. Scaffold .playwright/cli.config.json
+    playwright_config_dir = project_dir / ".playwright"
+    playwright_config_file = playwright_config_dir / "cli.config.json"
+    if not playwright_config_file.exists():
+        try:
+            playwright_config_dir.mkdir(parents=True, exist_ok=True)
+            config = {
+                "browser": {
+                    "browserName": "chromium",
+                    "launchOptions": {
+                        "channel": "chrome",
+                        "headless": True,
+                    },
+                    "contextOptions": {
+                        "viewport": {"width": 1280, "height": 720},
+                    },
+                    "isolated": True,
+                },
+            }
+            with open(playwright_config_file, "w", encoding="utf-8") as f:
+                json.dump(config, f, indent=2)
+                f.write("\n")
+            migrated.append("Created .playwright/cli.config.json")
+        except (OSError, PermissionError) as e:
+            print(f"  Warning: Could not create playwright config: {e}")
+
+    # C. Update .gitignore
+    project_gitignore = project_dir / ".gitignore"
+    entries_to_add = [".playwright-cli/", ".playwright/"]
+    existing_lines: list[str] = []
+    if project_gitignore.exists():
+        try:
+            existing_lines = project_gitignore.read_text(encoding="utf-8").splitlines()
+        except (OSError, PermissionError):
+            pass
+    missing_entries = [e for e in entries_to_add if e not in existing_lines]
+    if missing_entries:
+        try:
+            with open(project_gitignore, "a", encoding="utf-8") as f:
+                if existing_lines and existing_lines[-1].strip():
+                    f.write("\n")
+                for entry in missing_entries:
+                    f.write(f"{entry}\n")
+            migrated.append(f"Added {', '.join(missing_entries)} to .gitignore")
+        except (OSError, PermissionError) as e:
+            print(f"  Warning: Could not update .gitignore: {e}")
+
+    # D. Update prompts
+    prompts_dir = get_project_prompts_dir(project_dir)
+
+    # D1. Update coding_prompt.md
+    coding_prompt_path = prompts_dir / "coding_prompt.md"
+    if coding_prompt_path.exists():
+        try:
+            content = coding_prompt_path.read_text(encoding="utf-8")
+            if "Playwright MCP" in content or "browser_navigate" in content or "browser_take_screenshot" in content:
+                updated = _migrate_coding_prompt_to_cli(content)
+                if updated != content:
+                    coding_prompt_path.write_text(updated, encoding="utf-8")
+                    migrated.append("Updated coding_prompt.md to Playwright CLI")
+        except (OSError, PermissionError) as e:
+            print(f"  Warning: Could not update coding_prompt.md: {e}")
+
+    # D2. Update testing_prompt.md
+    testing_prompt_path = prompts_dir / "testing_prompt.md"
+    if testing_prompt_path.exists():
+        try:
+            content = testing_prompt_path.read_text(encoding="utf-8")
+            if "browser_navigate" in content or "browser_take_screenshot" in content:
+                updated = _migrate_testing_prompt_to_cli(content)
+                if updated != content:
+                    testing_prompt_path.write_text(updated, encoding="utf-8")
+                    migrated.append("Updated testing_prompt.md to Playwright CLI")
+        except (OSError, PermissionError) as e:
+            print(f"  Warning: Could not update testing_prompt.md: {e}")
+
+    return migrated
+
+
+def migrate_project_to_current(project_dir: Path) -> list[str]:
+    """Migrate an existing project to the current AutoForge version.
+
+    Idempotent — safe to call on every agent start. Returns list of
+    human-readable descriptions of what was migrated.
+    """
+    current = _get_migration_version(project_dir)
+    if current >= CURRENT_MIGRATION_VERSION:
+        return []
+
+    migrated: list[str] = []
+
+    if current < 1:
+        migrated.extend(_migrate_v0_to_v1(project_dir))
+
+    # Future: if current < 2: migrated.extend(_migrate_v1_to_v2(project_dir))
+
+    _set_migration_version(project_dir, CURRENT_MIGRATION_VERSION)
+    return migrated
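
The versioned runner above can be illustrated with a minimal, self-contained sketch. This is not the project's code: `CURRENT_VERSION`, `STEPS`, `step_v0_to_v1`, and the marker file are hypothetical stand-ins, and the version stamp is written directly into the project directory instead of being resolved through `get_autoforge_dir`.

```python
import tempfile
from pathlib import Path
from typing import Callable

CURRENT_VERSION = 1  # hypothetical; plays the role of CURRENT_MIGRATION_VERSION


def read_version(project_dir: Path) -> int:
    """Return the stamped version, or 0 if the stamp is missing or unreadable."""
    try:
        return int((project_dir / ".migration_version").read_text().strip())
    except (ValueError, OSError):
        return 0


def step_v0_to_v1(project_dir: Path) -> list[str]:
    """Hypothetical idempotent step: create a marker file only if it is absent."""
    marker = project_dir / "cli.config.json"
    if marker.exists():
        return []
    marker.write_text("{}\n", encoding="utf-8")
    return ["Created cli.config.json"]


# Maps "version a step upgrades from" to the step function.
STEPS: dict[int, Callable[[Path], list[str]]] = {0: step_v0_to_v1}


def migrate(project_dir: Path) -> list[str]:
    """Run every pending step in order, then stamp the current version."""
    current = read_version(project_dir)
    done: list[str] = []
    for version in range(current, CURRENT_VERSION):
        done.extend(STEPS[version](project_dir))
    (project_dir / ".migration_version").write_text(str(CURRENT_VERSION))
    return done


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        project = Path(tmp)
        print(migrate(project))  # first call performs the pending step
        print(migrate(project))  # second call finds nothing left to do
```

As in the diff, the stamp is written even when a step reports nothing, so later startups skip the content checks entirely. Any step that fails with only a warning is therefore not retried unless the stamp is removed.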
--- a/security.py
+++ b/security.py
@@ -66,10 +66,12 @@ ALLOWED_COMMANDS = {
    "bash",
    # Script execution
    "init.sh",  # Init scripts; validated separately
+    # Browser automation
+    "playwright-cli",  # Playwright CLI for browser testing; validated separately
 }

 # Commands that need additional validation even when in the allowlist
-COMMANDS_NEEDING_EXTRA_VALIDATION = {"pkill", "chmod", "init.sh"}
+COMMANDS_NEEDING_EXTRA_VALIDATION = {"pkill", "chmod", "init.sh", "playwright-cli"}

 # Commands that are NEVER allowed, even with user approval
 # These commands can cause permanent system damage or security breaches
@@ -438,6 +440,37 @@ def validate_init_script(command_string: str) -> tuple[bool, str]:
    return False, f"Only ./init.sh is allowed, got: {script}"


+def validate_playwright_command(command_string: str) -> tuple[bool, str]:
+    """
+    Validate playwright-cli commands - block dangerous subcommands.
+
+    Blocks `run-code` (arbitrary Node.js execution) and `eval` (arbitrary JS
+    evaluation), both of which bypass the security sandbox.
+
+    Returns:
+        Tuple of (is_allowed, reason_if_blocked)
+    """
+    try:
+        tokens = shlex.split(command_string)
+    except ValueError:
+        return False, "Could not parse playwright-cli command"
+
+    if not tokens:
+        return False, "Empty command"
+
+    BLOCKED_SUBCOMMANDS = {"run-code", "eval"}
+
+    # Find the subcommand: first non-flag token after 'playwright-cli'
+    for token in tokens[1:]:
+        if token.startswith("-"):
+            continue  # skip flags like -s=agent-1
+        if token in BLOCKED_SUBCOMMANDS:
+            return False, f"playwright-cli '{token}' is not allowed"
+        break  # first non-flag token is the subcommand
+
+    return True, ""
+
+
 def matches_pattern(command: str, pattern: str) -> bool:
    """
    Check if a command matches a pattern.
@@ -955,5 +988,9 @@ async def bash_security_hook(input_data, tool_use_id=None, context=None):
                allowed, reason = validate_init_script(cmd_segment)
                if not allowed:
                    return {"decision": "block", "reason": reason}
+            elif cmd == "playwright-cli":
+                allowed, reason = validate_playwright_command(cmd_segment)
+                if not allowed:
+                    return {"decision": "block", "reason": reason}

    return {}
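
The subcommand check in `validate_playwright_command` reduces to "find the first token after the program name that does not start with `-`". The sketch below isolates that rule so its behavior can be tried by hand. `first_subcommand` and `is_allowed` are hypothetical helper names, not part of `security.py`.

```python
import shlex
from typing import Optional

BLOCKED_SUBCOMMANDS = {"run-code", "eval"}


def first_subcommand(command_string: str) -> Optional[str]:
    """Return the first non-flag token after the program name, if any."""
    tokens = shlex.split(command_string)
    for token in tokens[1:]:
        if not token.startswith("-"):
            return token
    return None


def is_allowed(command_string: str) -> bool:
    """Mirror the diff's rule: only the first non-flag token is inspected."""
    return first_subcommand(command_string) not in BLOCKED_SUBCOMMANDS


if __name__ == "__main__":
    for cmd in (
        "playwright-cli -s=agent-1 click e5",
        "playwright-cli -s=test eval 'document.title'",
        "playwright-cli -s agent-1 eval 'x'",  # space-separated flag value
    ):
        print(cmd, "->", first_subcommand(cmd), is_allowed(cmd))
```

One consequence worth reviewing: with a space-separated flag value, the value itself (`agent-1` in the last case) is taken as the subcommand, so a blocked word that follows it is never inspected. Whether `playwright-cli` accepts that flag form is not established by this diff. If it does, the validator would need to know which flags take a value.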
--- a/server/services/process_manager.py
+++ b/server/services/process_manager.py
@@ -227,6 +227,28 @@ class AgentProcessManager:
        """Remove lock file."""
        self.lock_file.unlink(missing_ok=True)

+    def _apply_playwright_headless(self, headless: bool) -> None:
+        """Update .playwright/cli.config.json with the current headless setting.
+
+        playwright-cli reads this config file on each ``open`` command, so
+        updating it before the agent starts is sufficient.
+        """
+        config_file = self.project_dir / ".playwright" / "cli.config.json"
+        if not config_file.exists():
+            return
+        try:
+            import json
+            config = json.loads(config_file.read_text(encoding="utf-8"))
+            launch_opts = config.get("browser", {}).get("launchOptions", {})
+            if launch_opts.get("headless") == headless:
+                return  # already correct
+            launch_opts["headless"] = headless
+            config.setdefault("browser", {})["launchOptions"] = launch_opts
+            config_file.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
+            logger.info("Set playwright headless=%s for %s", headless, self.project_name)
+        except Exception:
+            logger.warning("Failed to update playwright config", exc_info=True)
+
    def _cleanup_stale_features(self) -> None:
        """Clear in_progress flag for all features when agent stops/crashes.

@@ -361,6 +383,15 @@ class AgentProcessManager:
        if not self._check_lock():
            return False, "Another agent instance is already running for this project"

+        # Clean up stale browser daemons from previous runs
+        try:
+            subprocess.run(
+                ["playwright-cli", "kill-all"],
+                timeout=5, capture_output=True,
+            )
+        except (subprocess.TimeoutExpired, FileNotFoundError, OSError):
+            pass
+
        # Clean up features stuck from a previous crash/stop
        self._cleanup_stale_features()

@@ -397,6 +428,10 @@ class AgentProcessManager:
        # Add --batch-size flag for multi-feature batching
        cmd.extend(["--batch-size", str(batch_size)])

+        # Apply headless setting to .playwright/cli.config.json so playwright-cli
+        # picks it up (the only mechanism it supports for headless control)
+        self._apply_playwright_headless(playwright_headless)
+
        try:
            # Start subprocess with piped stdout/stderr
            # Use project_dir as cwd so Claude SDK sandbox allows access to project files
@@ -409,7 +444,7 @@ class AgentProcessManager:
            subprocess_env = {
                **os.environ,
                "PYTHONUNBUFFERED": "1",
-                "PLAYWRIGHT_HEADLESS": "true" if playwright_headless else "false",
+                "PLAYWRIGHT_CLI_SESSION": f"agent-{self.project_name}-{os.getpid()}",
                "NODE_COMPILE_CACHE": "",  # Disable V8 compile caching to prevent .node file accumulation in %TEMP%
                **api_env,
            }
@@ -469,6 +504,15 @@ class AgentProcessManager:
                except asyncio.CancelledError:
                    pass

+            # Kill browser daemons before stopping agent
+            try:
+                subprocess.run(
+                    ["playwright-cli", "kill-all"],
+                    timeout=5, capture_output=True,
+                )
+            except (subprocess.TimeoutExpired, FileNotFoundError, OSError):
+                pass
+
            # CRITICAL: Kill entire process tree, not just orchestrator
            # This ensures all spawned coding/testing agents are also terminated
            proc = self.process  # Capture reference before async call
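
The read-modify-write done by `_apply_playwright_headless` can be exercised outside the process manager with a small standalone function. This is a sketch under assumptions: `set_headless` is a hypothetical name, it takes the config path directly instead of deriving it from `self.project_dir`, and it returns a flag where the original logs.

```python
import json
import tempfile
from pathlib import Path


def set_headless(config_file: Path, headless: bool) -> bool:
    """Set browser.launchOptions.headless; return True if the file was rewritten."""
    if not config_file.exists():
        return False  # nothing to update; the migration scaffolds this file
    config = json.loads(config_file.read_text(encoding="utf-8"))
    launch_opts = config.setdefault("browser", {}).setdefault("launchOptions", {})
    if launch_opts.get("headless") == headless:
        return False  # already correct, leave the file untouched
    launch_opts["headless"] = headless
    config_file.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        cfg = Path(tmp) / "cli.config.json"
        cfg.write_text('{"browser": {"launchOptions": {"headless": true}}}\n', encoding="utf-8")
        print(set_headless(cfg, False))
        print(cfg.read_text(encoding="utf-8"))
```

Skipping the write when the value already matches leaves the file's modification time alone, and other keys in the config (viewport, channel) pass through unchanged.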
--- a/start.bat
+++ b/start.bat
@@ -54,5 +54,15 @@ REM Install dependencies
 echo Installing dependencies...
 pip install -r requirements.txt --quiet

+REM Ensure playwright-cli is available for browser automation
+where playwright-cli >nul 2>&1
+if %ERRORLEVEL% neq 0 (
+    echo Installing playwright-cli for browser automation...
+    call npm install -g @playwright/cli >nul 2>&1
+    if errorlevel 1 (
+        echo Note: Could not install playwright-cli. Install manually: npm install -g @playwright/cli
+    )
+)
+
 REM Run the app
 python start.py
--- a/start.sh
+++ b/start.sh
@@ -74,5 +74,14 @@ fi
 echo "Installing dependencies..."
 pip install -r requirements.txt --quiet

+# Ensure playwright-cli is available for browser automation
+if ! command -v playwright-cli &> /dev/null; then
+    echo "Installing playwright-cli for browser automation..."
+    npm install -g @playwright/cli --quiet 2>/dev/null
+    if [ $? -ne 0 ]; then
+        echo "Note: Could not install playwright-cli. Install manually: npm install -g @playwright/cli"
+    fi
+fi
+
 # Run the app
 python start.py
--- a/start_ui.bat
+++ b/start_ui.bat
@@ -37,5 +37,15 @@ REM Install dependencies
 echo Installing dependencies...
 pip install -r requirements.txt --quiet

+REM Ensure playwright-cli is available for browser automation
+where playwright-cli >nul 2>&1
+if %ERRORLEVEL% neq 0 (
+    echo Installing playwright-cli for browser automation...
+    call npm install -g @playwright/cli >nul 2>&1
+    if errorlevel 1 (
+        echo Note: Could not install playwright-cli. Install manually: npm install -g @playwright/cli
+    )
+)
+
 REM Run the Python launcher
 python "%~dp0start_ui.py" %*
--- a/start_ui.sh
+++ b/start_ui.sh
@@ -80,5 +80,14 @@ fi
 echo "Installing dependencies..."
 pip install -r requirements.txt --quiet

+# Ensure playwright-cli is available for browser automation
+if ! command -v playwright-cli &> /dev/null; then
+    echo "Installing playwright-cli for browser automation..."
+    npm install -g @playwright/cli --quiet 2>/dev/null
+    if [ $? -ne 0 ]; then
+        echo "Note: Could not install playwright-cli. Install manually: npm install -g @playwright/cli"
+    fi
+fi
+
 # Run the Python launcher
 python start_ui.py "$@"
--- a/temp_cleanup.py
+++ b/temp_cleanup.py
@@ -125,14 +125,18 @@ def cleanup_stale_temp(max_age_seconds: int = MAX_AGE_SECONDS) -> dict:

 def cleanup_project_screenshots(project_dir: Path, max_age_seconds: int = 300) -> dict:
    """
-    Clean up stale screenshot files from the project root.
+    Clean up stale Playwright CLI artifacts from the project.

-    Playwright browser verification can leave .png files in the project
-    directory. This removes them after they've aged out (default 5 minutes).
+    The Playwright CLI daemon saves screenshots, snapshots, and other artifacts
+    to `{project_dir}/.playwright-cli/`. This removes them after they've aged
+    out (default 5 minutes).
+
+    Also cleans up legacy screenshot patterns from the project root (from the
+    old Playwright MCP server approach).

    Args:
        project_dir: Path to the project directory.
-        max_age_seconds: Maximum age in seconds before a screenshot is deleted.
+        max_age_seconds: Maximum age in seconds before an artifact is deleted.
                        Defaults to 5 minutes (300 seconds).

    Returns:
@@ -141,13 +145,33 @@ def cleanup_project_screenshots(project_dir: Path, max_age_seconds: int = 300) -
    cutoff_time = time.time() - max_age_seconds
    stats: dict = {"files_deleted": 0, "bytes_freed": 0, "errors": []}

-    screenshot_patterns = [
+    # Clean up .playwright-cli/ directory (new CLI approach)
+    playwright_cli_dir = project_dir / ".playwright-cli"
+    if playwright_cli_dir.exists():
+        for item in playwright_cli_dir.iterdir():
+            if not item.is_file():
+                continue
+            try:
+                mtime = item.stat().st_mtime
+                if mtime < cutoff_time:
+                    size = item.stat().st_size
+                    item.unlink(missing_ok=True)
+                    if not item.exists():
+                        stats["files_deleted"] += 1
+                        stats["bytes_freed"] += size
+                        logger.debug(f"Deleted playwright-cli artifact: {item}")
+            except Exception as e:
+                stats["errors"].append(f"Failed to delete {item}: {e}")
+                logger.debug(f"Failed to delete artifact {item}: {e}")
+
+    # Legacy cleanup: root-level screenshot patterns (from old MCP server approach)
+    legacy_patterns = [
        "feature*-*.png",
        "screenshot-*.png",
        "step-*.png",
    ]

-    for pattern in screenshot_patterns:
+    for pattern in legacy_patterns:
        for item in project_dir.glob(pattern):
            if not item.is_file():
                continue
@@ -159,14 +183,14 @@ def cleanup_project_screenshots(project_dir: Path, max_age_seconds: int = 300) -
                    if not item.exists():
                        stats["files_deleted"] += 1
                        stats["bytes_freed"] += size
-                        logger.debug(f"Deleted project screenshot: {item}")
+                        logger.debug(f"Deleted legacy screenshot: {item}")
            except Exception as e:
                stats["errors"].append(f"Failed to delete {item}: {e}")
                logger.debug(f"Failed to delete screenshot {item}: {e}")

    if stats["files_deleted"] > 0:
        mb_freed = stats["bytes_freed"] / (1024 * 1024)
-        logger.info(f"Screenshot cleanup: {stats['files_deleted']} files, {mb_freed:.1f} MB freed")
+        logger.info(f"Artifact cleanup: {stats['files_deleted']} files, {mb_freed:.1f} MB freed")

    return stats
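
The age-based sweep of `.playwright-cli/` can be tried in isolation with a reduced version of the loop. This is a simplified sketch, not the function above: `delete_older_than` is a hypothetical name, and it returns only a count instead of the full `stats` dictionary.

```python
import os
import tempfile
import time
from pathlib import Path


def delete_older_than(directory: Path, max_age_seconds: int) -> int:
    """Delete regular files in `directory` whose mtime is older than the cutoff."""
    if not directory.exists():
        return 0
    cutoff = time.time() - max_age_seconds
    deleted = 0
    for item in directory.iterdir():
        if not item.is_file():
            continue  # subdirectories are left alone, as in the diff
        if item.stat().st_mtime < cutoff:
            item.unlink(missing_ok=True)
            deleted += 1
    return deleted


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        artifacts = Path(tmp)
        old = artifacts / "old.png"
        old.write_bytes(b"x")
        stale = time.time() - 600
        os.utime(old, (stale, stale))  # backdate the file by ten minutes
        (artifacts / "new.png").write_bytes(b"y")
        print(delete_older_than(artifacts, 300))
```

Backdating with `os.utime` is a convenient way to test the cutoff without sleeping.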

--- a/test_security.py
+++ b/test_security.py
@@ -25,6 +25,7 @@ from security import (
    validate_chmod_command,
    validate_init_script,
    validate_pkill_command,
+    validate_playwright_command,
    validate_project_command,
 )

@@ -923,6 +924,70 @@ pkill_processes:
    return passed, failed


+def test_playwright_cli_validation():
+    """Test playwright-cli subcommand validation."""
+    print("\nTesting playwright-cli validation:\n")
+    passed = 0
+    failed = 0
+
+    # Test cases: (command, should_be_allowed, description)
+    test_cases = [
+        # Allowed cases
+        ("playwright-cli screenshot", True, "screenshot allowed"),
+        ("playwright-cli snapshot", True, "snapshot allowed"),
+        ("playwright-cli click e5", True, "click with ref"),
+        ("playwright-cli open http://localhost:3000", True, "open URL"),
+        ("playwright-cli -s=agent-1 click e5", True, "session flag with click"),
+        ("playwright-cli close", True, "close browser"),
+        ("playwright-cli goto http://localhost:3000/page", True, "goto URL"),
+        ("playwright-cli fill e3 'test value'", True, "fill form field"),
+        ("playwright-cli console", True, "console messages"),
+        # Blocked cases
+        ("playwright-cli run-code 'await page.evaluate(() => {})'", False, "run-code blocked"),
+        ("playwright-cli eval 'document.title'", False, "eval blocked"),
+        ("playwright-cli -s=test eval 'document.title'", False, "eval with session flag blocked"),
+    ]
+
+    for cmd, should_allow, description in test_cases:
+        allowed, reason = validate_playwright_command(cmd)
+        if allowed == should_allow:
+            print(f"  PASS: {cmd!r} ({description})")
+            passed += 1
+        else:
+            expected = "allowed" if should_allow else "blocked"
+            actual = "allowed" if allowed else "blocked"
+            print(f"  FAIL: {cmd!r} ({description})")
+            print(f"         Expected: {expected}, Got: {actual}")
+            if reason:
+                print(f"         Reason: {reason}")
+            failed += 1
+
+    # Integration test: verify through the security hook
+    print("\n  Integration tests (via security hook):\n")
+
+    # playwright-cli screenshot should be allowed
+    input_data = {"tool_name": "Bash", "tool_input": {"command": "playwright-cli screenshot"}}
+    result = asyncio.run(bash_security_hook(input_data))
+    if result.get("decision") != "block":
+        print("  PASS: playwright-cli screenshot allowed via hook")
+        passed += 1
+    else:
+        print(f"  FAIL: playwright-cli screenshot should be allowed: {result.get('reason')}")
+        failed += 1
+
+    # playwright-cli run-code should be blocked
+    input_data = {"tool_name": "Bash", "tool_input": {"command": "playwright-cli run-code 'code'"}}
+    result = asyncio.run(bash_security_hook(input_data))
+    if result.get("decision") == "block":
+        print("  PASS: playwright-cli run-code blocked via hook")
+        passed += 1
+    else:
+        print("  FAIL: playwright-cli run-code should be blocked via hook")
+        failed += 1
+
+    return passed, failed
+
+
 def main():
    print("=" * 70)
    print("  SECURITY HOOK TESTS")
@@ -991,6 +1056,11 @@ def main():
    passed += pkill_passed
    failed += pkill_failed

+    # Test playwright-cli validation
+    pw_passed, pw_failed = test_playwright_cli_validation()
+    passed += pw_passed
+    failed += pw_failed
+
    # Commands that SHOULD be blocked
    # Note: blocklisted commands (sudo, shutdown, dd, aws) are tested in
    # test_blocklist_enforcement(). chmod validation is tested in
@@ -1012,6 +1082,9 @@ def main():
        # Shell injection attempts
        "$(echo pkill) node",
        'eval "pkill node"',
+        # playwright-cli dangerous subcommands
+        "playwright-cli run-code 'await page.goto(\"http://evil.com\")'",
+        "playwright-cli eval 'document.cookie'",
    ]

    for cmd in dangerous:
@@ -1077,6 +1150,12 @@ def main():
        "/usr/local/bin/node app.js",
        # Combined chmod and init.sh (integration test for both validators)
        "chmod +x init.sh && ./init.sh",
+        # Playwright CLI allowed commands
+        "playwright-cli open http://localhost:3000",
+        "playwright-cli screenshot",
+        "playwright-cli snapshot",
+        "playwright-cli click e5",
+        "playwright-cli -s=agent-1 close",
    ]

    for cmd in safe:
--- a/ui/src/components/ProjectSelector.tsx
+++ b/ui/src/components/ProjectSelector.tsx
@@ -75,6 +75,7 @@ export function ProjectSelector({
            variant="outline"
            className="min-w-[140px] sm:min-w-[200px] justify-between"
            disabled={isLoading}
+            title={selectedProjectData?.path}
          >
            {isLoading ? (
              <Loader2 size={18} className="animate-spin" />
@@ -101,6 +102,7 @@ export function ProjectSelector({
              {projects.map(project => (
                <DropdownMenuItem
                  key={project.name}
+                  title={project.path}
                  className={`flex items-center justify-between cursor-pointer ${
                    project.name === selectedProject ? 'bg-primary/10' : ''
                  }`}