mirror of
https://github.com/leonvanzyl/autocoder.git
synced 2026-03-16 18:33:08 +00:00
feat: migrate browser automation from Playwright MCP to CLI, fix headless setting
Major changes across 21 files (755 additions, 196 deletions):

Browser Automation Migration:
- Add versioned project migration system (prompts.py) with content-based detection and section-level regex replacement for coding/testing prompts
- Migrate STEP 5 (browser verification) and BROWSER AUTOMATION sections in coding prompt template to use playwright-cli commands
- Migrate STEP 2 and AVAILABLE TOOLS sections in testing prompt template
- Migration auto-runs at agent startup (autonomous_agent_demo.py), copies playwright-cli skill, scaffolds .playwright/cli.config.json, updates .gitignore, and stamps .migration_version file
- Add playwright-cli command validation to security allowlist (security.py) with tests for allowed subcommands and blocked eval/run-code

Headless Browser Setting Fix:
- Add _apply_playwright_headless() to process_manager.py that reads/updates .playwright/cli.config.json before agent subprocess launch
- Remove dead PLAYWRIGHT_HEADLESS env var that was never consumed
- Settings UI toggle now correctly controls visible browser window

Playwright CLI Auto-Install:
- Add ensurePlaywrightCli() to lib/cli.js for npm global entry point
- Add playwright-cli detection + npm install to start.bat, start.sh, start_ui.bat, start_ui.sh for all startup paths

Other Improvements:
- Add project folder path tooltip to ProjectSelector.tsx dropdown items
- Remove legacy Playwright MCP server configuration from client.py
- Update CLAUDE.md with playwright-cli skill documentation

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
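The headless fix described in the commit message works by rewriting `.playwright/cli.config.json` before the agent subprocess starts. The following is a minimal sketch of that idea, not the actual `_apply_playwright_headless()` from process_manager.py (which is not shown in this diff). The helper name `apply_headless_setting` is hypothetical, and the `browser.launchOptions.headless` layout is assumed from the config scaffolded in prompts.py.

```python
import json
from pathlib import Path


def apply_headless_setting(project_dir: Path, headless: bool) -> bool:
    """Write the headless flag into .playwright/cli.config.json.

    Hypothetical helper (an assumption, not the project's real function).
    Returns True if the file was created or changed, False if it was
    already up to date.
    """
    config_file = project_dir / ".playwright" / "cli.config.json"
    config: dict = {}
    if config_file.exists():
        try:
            loaded = json.loads(config_file.read_text(encoding="utf-8"))
            if isinstance(loaded, dict):
                config = loaded
        except (OSError, json.JSONDecodeError):
            # Unreadable or corrupt config: start again from an empty one.
            config = {}

    # Walk down to browser.launchOptions, creating the levels if missing.
    browser = config.get("browser")
    if not isinstance(browser, dict):
        browser = config["browser"] = {}
    launch_options = browser.get("launchOptions")
    if not isinstance(launch_options, dict):
        launch_options = browser["launchOptions"] = {}

    if config_file.exists() and launch_options.get("headless") is headless:
        return False  # nothing to do

    launch_options["headless"] = headless
    config_file.parent.mkdir(parents=True, exist_ok=True)
    config_file.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return True
```

Because the sketch only touches the `headless` key, any other settings already in the file (viewport, channel, and so on) would be preserved.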
@@ -86,24 +86,33 @@ Implement the chosen feature thoroughly:
|
||||
|
||||
**CRITICAL:** You MUST verify features through the actual UI.
|
||||
|
||||
Use browser automation tools:
|
||||
Use `playwright-cli` for browser automation:
|
||||
|
||||
- Navigate to the app in a real browser
|
||||
- Interact like a human user (click, type, scroll)
|
||||
- Take screenshots at each step (use inline screenshots only -- do NOT save screenshot files to disk)
|
||||
- Verify both functionality AND visual appearance
|
||||
- Open the browser: `playwright-cli open http://localhost:PORT`
|
||||
- Take a snapshot to see page elements: `playwright-cli snapshot`
|
||||
- Read the snapshot YAML file to see element refs
|
||||
- Click elements by ref: `playwright-cli click e5`
|
||||
- Type text: `playwright-cli type "search query"`
|
||||
- Fill form fields: `playwright-cli fill e3 "value"`
|
||||
- Take screenshots: `playwright-cli screenshot`
|
||||
- Read the screenshot file to verify visual appearance
|
||||
- Check console errors: `playwright-cli console`
|
||||
- Close browser when done: `playwright-cli close`
|
||||
|
||||
**Token-efficient workflow:** `playwright-cli screenshot` and `snapshot` save files
|
||||
to `.playwright-cli/`. You will see a file link in the output. Read the file only
|
||||
when you need to verify visual appearance or find element refs.
|
||||
|
||||
**DO:**
|
||||
|
||||
- Test through the UI with clicks and keyboard input
|
||||
- Take screenshots to verify visual appearance (inline only, never save to disk)
|
||||
- Check for console errors in browser
|
||||
- Take screenshots and read them to verify visual appearance
|
||||
- Check for console errors with `playwright-cli console`
|
||||
- Verify complete user workflows end-to-end
|
||||
- Always run `playwright-cli close` when finished testing
|
||||
|
||||
**DON'T:**
|
||||
|
||||
- Only test with curl commands (backend testing alone is insufficient)
|
||||
- Use JavaScript evaluation to bypass UI (no shortcuts)
|
||||
- Only test with curl commands
|
||||
- Use JavaScript evaluation to bypass UI (`eval` and `run-code` are blocked)
|
||||
- Skip visual verification
|
||||
- Mark tests passing without thorough verification
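The rule that `eval` and `run-code` are blocked could be enforced with a subcommand allowlist. The sketch below is an illustration only: the real check lives in security.py and is not shown in this diff, the function name `validate_playwright_cli` is hypothetical, and the allowed set is assumed from the commands named in these prompt templates.

```python
import shlex

# Assumption: subcommands taken from the prompt templates in this commit.
ALLOWED_SUBCOMMANDS = {
    "open", "goto", "snapshot", "click", "type", "fill",
    "select", "press", "screenshot", "console", "network", "close",
}
BLOCKED_SUBCOMMANDS = {"eval", "run-code"}


def validate_playwright_cli(command: str) -> tuple[bool, str]:
    """Return (allowed, reason) for a playwright-cli command string."""
    try:
        tokens = shlex.split(command)
    except ValueError as exc:
        # Unbalanced quotes and similar parse errors are rejected.
        return False, f"could not parse command: {exc}"
    if not tokens or tokens[0] != "playwright-cli":
        return False, "not a playwright-cli command"
    # The first non-flag token is treated as the subcommand. A flag that
    # takes a value would make this check fail closed (reject).
    subcommand = next((t for t in tokens[1:] if not t.startswith("-")), None)
    if subcommand is None:
        return False, "missing subcommand"
    if subcommand in BLOCKED_SUBCOMMANDS:
        return False, f"subcommand '{subcommand}' is blocked"
    if subcommand not in ALLOWED_SUBCOMMANDS:
        return False, f"subcommand '{subcommand}' is not in the allowlist"
    return True, ""
```

Checking the blocked set before the allowlist is redundant for correctness, but it lets the rejection message say explicitly that the subcommand is blocked rather than merely unknown.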
|
||||
|
||||
@@ -145,7 +154,7 @@ Use the feature_mark_passing tool with feature_id=42
|
||||
- Combine or consolidate features
|
||||
- Reorder features
|
||||
|
||||
**ONLY MARK A FEATURE AS PASSING AFTER VERIFICATION WITH SCREENSHOTS.**
|
||||
**ONLY MARK A FEATURE AS PASSING AFTER VERIFICATION WITH BROWSER AUTOMATION.**
|
||||
|
||||
### STEP 7: COMMIT YOUR PROGRESS
|
||||
|
||||
@@ -192,11 +201,15 @@ Before context fills up:
|
||||
|
||||
## BROWSER AUTOMATION
|
||||
|
||||
Use Playwright MCP tools (`browser_*`) for UI verification. Key tools: `navigate`, `click`, `type`, `fill_form`, `take_screenshot`, `console_messages`, `network_requests`. All tools have auto-wait built in.
|
||||
Use `playwright-cli` commands for UI verification. Key commands: `open`, `goto`,
|
||||
`snapshot`, `click`, `type`, `fill`, `screenshot`, `console`, `close`.
|
||||
|
||||
**Screenshot rule:** Always use inline mode (base64). NEVER save screenshots as files to disk.
|
||||
**How it works:** `playwright-cli` uses a persistent browser daemon. `open` starts it,
|
||||
subsequent commands interact via socket, `close` shuts it down. Screenshots and snapshots
|
||||
save to `.playwright-cli/` -- read the files when you need to verify content.
|
||||
|
||||
Test like a human user with mouse and keyboard. Use `browser_console_messages` to detect errors. Don't bypass UI with JavaScript evaluation.
|
||||
Test like a human user with mouse and keyboard. Use `playwright-cli console` to detect
|
||||
JS errors. Don't bypass UI with JavaScript evaluation.
|
||||
|
||||
---
|
||||
|
||||
|
||||
@@ -31,26 +31,32 @@ For the feature returned:
|
||||
1. Read and understand the feature's verification steps
|
||||
2. Navigate to the relevant part of the application
|
||||
3. Execute each verification step using browser automation
|
||||
4. Take screenshots to document the verification (inline only -- do NOT save to disk)
|
||||
4. Take screenshots and read them to verify visual appearance
|
||||
5. Check for console errors
|
||||
|
||||
Use browser automation tools:
|
||||
### Browser Automation (Playwright CLI)
|
||||
|
||||
**Navigation & Screenshots:**
|
||||
- browser_navigate - Navigate to a URL
|
||||
- browser_take_screenshot - Capture screenshot (inline mode only -- never save to disk)
|
||||
- browser_snapshot - Get accessibility tree snapshot
|
||||
- `playwright-cli open <url>` - Open browser and navigate
|
||||
- `playwright-cli goto <url>` - Navigate to URL
|
||||
- `playwright-cli screenshot` - Save screenshot to `.playwright-cli/`
|
||||
- `playwright-cli snapshot` - Save page snapshot with element refs to `.playwright-cli/`
|
||||
|
||||
**Element Interaction:**
|
||||
- browser_click - Click elements
|
||||
- browser_type - Type text into editable elements
|
||||
- browser_fill_form - Fill multiple form fields
|
||||
- browser_select_option - Select dropdown options
|
||||
- browser_press_key - Press keyboard keys
|
||||
- `playwright-cli click <ref>` - Click elements (ref from snapshot)
|
||||
- `playwright-cli type <text>` - Type text
|
||||
- `playwright-cli fill <ref> <text>` - Fill form fields
|
||||
- `playwright-cli select <ref> <val>` - Select dropdown
|
||||
- `playwright-cli press <key>` - Keyboard input
|
||||
|
||||
**Debugging:**
|
||||
- browser_console_messages - Get browser console output (check for errors)
|
||||
- browser_network_requests - Monitor API calls
|
||||
- `playwright-cli console` - Check for JS errors
|
||||
- `playwright-cli network` - Monitor API calls
|
||||
|
||||
**Cleanup:**
|
||||
- `playwright-cli close` - Close browser when done (ALWAYS do this)
|
||||
|
||||
**Note:** Screenshots and snapshots save to files. Read the file to see the content.
|
||||
|
||||
### STEP 3: HANDLE RESULTS
|
||||
|
||||
@@ -79,7 +85,7 @@ A regression has been introduced. You MUST fix it:
|
||||
|
||||
4. **Verify the fix:**
|
||||
- Run through all verification steps again
|
||||
- Take screenshots confirming the fix (inline only, never save to disk)
|
||||
- Take screenshots and read them to confirm the fix
|
||||
|
||||
5. **Mark as passing after fix:**
|
||||
```
|
||||
@@ -98,7 +104,7 @@ A regression has been introduced. You MUST fix it:
|
||||
|
||||
---
|
||||
|
||||
## AVAILABLE MCP TOOLS
|
||||
## AVAILABLE TOOLS
|
||||
|
||||
### Feature Management
|
||||
- `feature_get_stats` - Get progress overview (passing/in_progress/total counts)
|
||||
@@ -106,19 +112,17 @@ A regression has been introduced. You MUST fix it:
|
||||
- `feature_mark_failing` - Mark a feature as failing (when you find a regression)
|
||||
- `feature_mark_passing` - Mark a feature as passing (after fixing a regression)
|
||||
|
||||
### Browser Automation (Playwright)
|
||||
All interaction tools have **built-in auto-wait** -- no manual timeouts needed.
|
||||
|
||||
- `browser_navigate` - Navigate to URL
|
||||
- `browser_take_screenshot` - Capture screenshot (inline only, never save to disk)
|
||||
- `browser_snapshot` - Get accessibility tree
|
||||
- `browser_click` - Click elements
|
||||
- `browser_type` - Type text
|
||||
- `browser_fill_form` - Fill form fields
|
||||
- `browser_select_option` - Select dropdown
|
||||
- `browser_press_key` - Keyboard input
|
||||
- `browser_console_messages` - Check for JS errors
|
||||
- `browser_network_requests` - Monitor API calls
|
||||
### Browser Automation (Playwright CLI)
|
||||
Use `playwright-cli` commands for browser interaction. Key commands:
|
||||
- `playwright-cli open <url>` - Open browser
|
||||
- `playwright-cli goto <url>` - Navigate to URL
|
||||
- `playwright-cli screenshot` - Take screenshot (saved to `.playwright-cli/`)
|
||||
- `playwright-cli snapshot` - Get page snapshot with element refs
|
||||
- `playwright-cli click <ref>` - Click element
|
||||
- `playwright-cli type <text>` - Type text
|
||||
- `playwright-cli fill <ref> <text>` - Fill form field
|
||||
- `playwright-cli console` - Check for JS errors
|
||||
- `playwright-cli close` - Close browser (always do this when done)
|
||||
|
||||
---
|
||||
|
||||
|
||||
4
.gitignore
vendored
@@ -10,6 +10,10 @@ issues/
|
||||
# Browser profiles for parallel agent execution
|
||||
.browser-profiles/
|
||||
|
||||
# Playwright CLI daemon artifacts
|
||||
.playwright-cli/
|
||||
.playwright/
|
||||
|
||||
# Log files
|
||||
logs/
|
||||
*.log
|
||||
|
||||
@@ -28,5 +28,4 @@ start.sh
|
||||
start_ui.sh
|
||||
start_ui.py
|
||||
.claude/agents/
|
||||
.claude/skills/
|
||||
.claude/settings.json
|
||||
|
||||
10
CLAUDE.md
@@ -85,7 +85,7 @@ python autonomous_agent_demo.py --project-dir my-app --yolo
|
||||
|
||||
**What's different in YOLO mode:**
|
||||
- No regression testing
|
||||
- No Playwright MCP server (browser automation disabled)
|
||||
- No Playwright CLI (browser automation disabled)
|
||||
- Features marked passing after lint/type-check succeeds
|
||||
- Faster iteration for prototyping
|
||||
|
||||
@@ -163,7 +163,7 @@ Publishing: `npm publish` (triggers `prepublishOnly` which builds UI, then publi
|
||||
- `autonomous_agent_demo.py` - Entry point for running the agent (supports `--yolo`, `--parallel`, `--batch-size`, `--batch-features`)
|
||||
- `autoforge_paths.py` - Central path resolution with dual-path backward compatibility and migration
|
||||
- `agent.py` - Agent session loop using Claude Agent SDK
|
||||
- `client.py` - ClaudeSDKClient configuration with security hooks, MCP servers, and Vertex AI support
|
||||
- `client.py` - ClaudeSDKClient configuration with security hooks, feature MCP server, and Vertex AI support
|
||||
- `security.py` - Bash command allowlist validation (ALLOWED_COMMANDS whitelist)
|
||||
- `prompts.py` - Prompt template loading with project-specific fallback and batch feature prompts
|
||||
- `progress.py` - Progress tracking, database queries, webhook notifications
|
||||
@@ -288,6 +288,9 @@ Projects can be stored in any directory (registered in `~/.autoforge/registry.db
|
||||
- `.autoforge/.agent.lock` - Lock file to prevent multiple agent instances
|
||||
- `.autoforge/allowed_commands.yaml` - Project-specific bash command allowlist (optional)
|
||||
- `.autoforge/.gitignore` - Ignores runtime files
|
||||
- `.claude/skills/playwright-cli/` - Playwright CLI skill for browser automation
|
||||
- `.playwright/cli.config.json` - Browser configuration (headless, viewport, etc.)
|
||||
- `.playwright-cli/` - Playwright CLI daemon artifacts (screenshots, snapshots) - gitignored
|
||||
- `CLAUDE.md` - Stays at project root (SDK convention)
|
||||
- `app_spec.txt` - Root copy for agent template compatibility
|
||||
|
||||
@@ -445,6 +448,7 @@ Alternative providers are configured via the **Settings UI** (gear icon > API Pr
|
||||
**Skills** (`.claude/skills/`):
|
||||
- `frontend-design` - Distinctive, production-grade UI design
|
||||
- `gsd-to-autoforge-spec` - Convert GSD codebase mapping to AutoForge app_spec format
|
||||
- `playwright-cli` - Browser automation via Playwright CLI (copied to each project)
|
||||
|
||||
**Other:**
|
||||
- `.claude/templates/` - Prompt templates copied to new projects
|
||||
@@ -479,7 +483,7 @@ When running with `--parallel`, the orchestrator:
|
||||
1. Spawns multiple Claude agents as subprocesses (up to `--max-concurrency`)
|
||||
2. Each agent claims features atomically via `feature_claim_and_get`
|
||||
3. Features blocked by unmet dependencies are skipped
|
||||
4. Browser contexts are isolated per agent using `--isolated` flag
|
||||
4. Browser sessions are isolated per agent via `PLAYWRIGHT_CLI_SESSION` environment variable
|
||||
5. AgentTracker parses output and emits `agent_update` messages for UI
|
||||
|
||||
### Process Limits (Parallel Mode)
|
||||
|
||||
12
agent.py
@@ -240,17 +240,7 @@ async def run_autonomous_agent(
|
||||
print_session_header(iteration, is_initializer)
|
||||
|
||||
# Create client (fresh context)
|
||||
# Pass agent_id for browser isolation in multi-agent scenarios
|
||||
import os
|
||||
if agent_type == "testing":
|
||||
agent_id = f"testing-{os.getpid()}" # Unique ID for testing agents
|
||||
elif feature_ids and len(feature_ids) > 1:
|
||||
agent_id = f"batch-{feature_ids[0]}"
|
||||
elif feature_id:
|
||||
agent_id = f"feature-{feature_id}"
|
||||
else:
|
||||
agent_id = None
|
||||
client = create_client(project_dir, model, yolo_mode=yolo_mode, agent_id=agent_id, agent_type=agent_type)
|
||||
client = create_client(project_dir, model, yolo_mode=yolo_mode, agent_type=agent_type)
|
||||
|
||||
# Choose prompt based on agent type
|
||||
if agent_type == "initializer":
|
||||
|
||||
@@ -43,6 +43,7 @@ assistant.db-shm
|
||||
.claude_assistant_settings.json
|
||||
.claude_settings.expand.*.json
|
||||
.progress_cache
|
||||
.migration_version
|
||||
"""
|
||||
|
||||
|
||||
|
||||
@@ -237,6 +237,12 @@ def main() -> None:
|
||||
    if migrated:
        print(f"Migrated project files to .autoforge/: {', '.join(migrated)}", flush=True)

    # Migrate project to current AutoForge version (idempotent, safe)
    from prompts import migrate_project_to_current
    version_migrated = migrate_project_to_current(project_dir)
    if version_migrated:
        print(f"Upgraded project: {', '.join(version_migrated)}", flush=True)

    # Parse batch testing feature IDs (comma-separated string -> list[int])
    testing_feature_ids: list[int] | None = None
    if args.testing_feature_ids:
|
||||
|
||||
131
client.py
@@ -21,16 +21,6 @@ from security import SENSITIVE_DIRECTORIES, bash_security_hook
|
||||
# Load environment variables from .env file if present
|
||||
load_dotenv()
|
||||
|
||||
# Default Playwright headless mode - can be overridden via PLAYWRIGHT_HEADLESS env var
|
||||
# When True, browser runs invisibly in background (default - saves CPU)
|
||||
# When False, browser window is visible (useful for monitoring agent progress)
|
||||
DEFAULT_PLAYWRIGHT_HEADLESS = True
|
||||
|
||||
# Default browser for Playwright - can be overridden via PLAYWRIGHT_BROWSER env var
|
||||
# Options: chrome, firefox, webkit, msedge
|
||||
# Firefox is recommended for lower CPU usage
|
||||
DEFAULT_PLAYWRIGHT_BROWSER = "firefox"
|
||||
|
||||
# Extra read paths for cross-project file access (read-only)
|
||||
# Set EXTRA_READ_PATHS environment variable with comma-separated absolute paths
|
||||
# Example: EXTRA_READ_PATHS=/Volumes/Data/dev,/Users/shared/libs
|
||||
@@ -41,6 +31,7 @@ EXTRA_READ_PATHS_VAR = "EXTRA_READ_PATHS"
|
||||
# this blocklist and the filesystem browser API share a single source of truth.
|
||||
EXTRA_READ_PATHS_BLOCKLIST = SENSITIVE_DIRECTORIES
|
||||
|
||||
|
||||
def convert_model_for_vertex(model: str) -> str:
|
||||
"""
|
||||
Convert model name format for Vertex AI compatibility.
|
||||
@@ -72,43 +63,6 @@ def convert_model_for_vertex(model: str) -> str:
|
||||
return model
|
||||
|
||||
|
||||
def get_playwright_headless() -> bool:
|
||||
"""
|
||||
Get the Playwright headless mode setting.
|
||||
|
||||
Reads from PLAYWRIGHT_HEADLESS environment variable, defaults to True.
|
||||
Returns True for headless mode (invisible browser), False for visible browser.
|
||||
"""
|
||||
value = os.getenv("PLAYWRIGHT_HEADLESS", str(DEFAULT_PLAYWRIGHT_HEADLESS).lower()).strip().lower()
|
||||
truthy = {"true", "1", "yes", "on"}
|
||||
falsy = {"false", "0", "no", "off"}
|
||||
if value not in truthy | falsy:
|
||||
print(f" - Warning: Invalid PLAYWRIGHT_HEADLESS='{value}', defaulting to {DEFAULT_PLAYWRIGHT_HEADLESS}")
|
||||
return DEFAULT_PLAYWRIGHT_HEADLESS
|
||||
return value in truthy
|
||||
|
||||
|
||||
# Valid browsers supported by Playwright MCP
|
||||
VALID_PLAYWRIGHT_BROWSERS = {"chrome", "firefox", "webkit", "msedge"}
|
||||
|
||||
|
||||
def get_playwright_browser() -> str:
|
||||
"""
|
||||
Get the browser to use for Playwright.
|
||||
|
||||
Reads from PLAYWRIGHT_BROWSER environment variable, defaults to firefox.
|
||||
Options: chrome, firefox, webkit, msedge
|
||||
Firefox is recommended for lower CPU usage.
|
||||
"""
|
||||
value = os.getenv("PLAYWRIGHT_BROWSER", DEFAULT_PLAYWRIGHT_BROWSER).strip().lower()
|
||||
if value not in VALID_PLAYWRIGHT_BROWSERS:
|
||||
print(f" - Warning: Invalid PLAYWRIGHT_BROWSER='{value}', "
|
||||
f"valid options: {', '.join(sorted(VALID_PLAYWRIGHT_BROWSERS))}. "
|
||||
f"Defaulting to {DEFAULT_PLAYWRIGHT_BROWSER}")
|
||||
return DEFAULT_PLAYWRIGHT_BROWSER
|
||||
return value
|
||||
|
||||
|
||||
def get_extra_read_paths() -> list[Path]:
|
||||
"""
|
||||
Get extra read-only paths from EXTRA_READ_PATHS environment variable.
|
||||
@@ -228,41 +182,6 @@ ALL_FEATURE_MCP_TOOLS = sorted(
|
||||
set(CODING_AGENT_TOOLS) | set(TESTING_AGENT_TOOLS) | set(INITIALIZER_AGENT_TOOLS)
|
||||
)
|
||||
|
||||
# Playwright MCP tools for browser automation.
|
||||
# Full set of tools for comprehensive UI testing including drag-and-drop,
|
||||
# hover menus, file uploads, tab management, etc.
|
||||
PLAYWRIGHT_TOOLS = [
|
||||
# Core navigation & screenshots
|
||||
"mcp__playwright__browser_navigate",
|
||||
"mcp__playwright__browser_navigate_back",
|
||||
"mcp__playwright__browser_take_screenshot",
|
||||
"mcp__playwright__browser_snapshot",
|
||||
|
||||
# Element interaction
|
||||
"mcp__playwright__browser_click",
|
||||
"mcp__playwright__browser_type",
|
||||
"mcp__playwright__browser_fill_form",
|
||||
"mcp__playwright__browser_select_option",
|
||||
"mcp__playwright__browser_press_key",
|
||||
"mcp__playwright__browser_drag",
|
||||
"mcp__playwright__browser_hover",
|
||||
"mcp__playwright__browser_file_upload",
|
||||
|
||||
# JavaScript & debugging
|
||||
"mcp__playwright__browser_evaluate",
|
||||
# "mcp__playwright__browser_run_code", # REMOVED - causes Playwright MCP server crash
|
||||
"mcp__playwright__browser_console_messages",
|
||||
"mcp__playwright__browser_network_requests",
|
||||
|
||||
# Browser management
|
||||
"mcp__playwright__browser_resize",
|
||||
"mcp__playwright__browser_wait_for",
|
||||
"mcp__playwright__browser_handle_dialog",
|
||||
"mcp__playwright__browser_install",
|
||||
"mcp__playwright__browser_close",
|
||||
"mcp__playwright__browser_tabs",
|
||||
]
|
||||
|
||||
# Built-in tools available to agents.
|
||||
# WebFetch and WebSearch are included so coding agents can look up current
|
||||
# documentation for frameworks and libraries they are implementing.
|
||||
@@ -282,7 +201,6 @@ def create_client(
|
||||
project_dir: Path,
|
||||
model: str,
|
||||
yolo_mode: bool = False,
|
||||
agent_id: str | None = None,
|
||||
agent_type: str = "coding",
|
||||
):
|
||||
"""
|
||||
@@ -291,9 +209,7 @@ def create_client(
|
||||
Args:
|
||||
project_dir: Directory for the project
|
||||
model: Claude model to use
|
||||
yolo_mode: If True, skip Playwright MCP server for rapid prototyping
|
||||
agent_id: Optional unique identifier for browser isolation in parallel mode.
|
||||
When provided, each agent gets its own browser profile.
|
||||
yolo_mode: If True, skip browser testing for rapid prototyping
|
||||
agent_type: One of "coding", "testing", or "initializer". Controls which
|
||||
MCP tools are exposed and the max_turns limit.
|
||||
|
||||
@@ -327,11 +243,8 @@ def create_client(
|
||||
}
|
||||
max_turns = max_turns_map.get(agent_type, 300)
|
||||
|
||||
# Build allowed tools list based on mode and agent type.
|
||||
# In YOLO mode, exclude Playwright tools for faster prototyping.
|
||||
# Build allowed tools list based on agent type.
|
||||
allowed_tools = [*BUILTIN_TOOLS, *feature_tools]
|
||||
if not yolo_mode:
|
||||
allowed_tools.extend(PLAYWRIGHT_TOOLS)
|
||||
|
||||
# Build permissions list.
|
||||
# We permit ALL feature MCP tools at the security layer (so the MCP server
|
||||
@@ -363,10 +276,6 @@ def create_client(
|
||||
permissions_list.append(f"Glob({path}/**)")
|
||||
permissions_list.append(f"Grep({path}/**)")
|
||||
|
||||
if not yolo_mode:
|
||||
# Allow Playwright MCP tools for browser automation (standard mode only)
|
||||
permissions_list.extend(PLAYWRIGHT_TOOLS)
|
||||
|
||||
# Create comprehensive security settings
|
||||
# Note: Using relative paths ("./**") restricts access to project directory
|
||||
# since cwd is set to project_dir
|
||||
@@ -395,9 +304,9 @@ def create_client(
|
||||
print(f" - Extra read paths (validated): {', '.join(str(p) for p in extra_read_paths)}")
|
||||
print(" - Bash commands restricted to allowlist (see security.py)")
|
||||
if yolo_mode:
|
||||
print(" - MCP servers: features (database) - YOLO MODE (no Playwright)")
|
||||
print(" - MCP servers: features (database) - YOLO MODE (no browser testing)")
|
||||
else:
|
||||
print(" - MCP servers: playwright (browser), features (database)")
|
||||
print(" - MCP servers: features (database)")
|
||||
print(" - Project settings enabled (skills, commands, CLAUDE.md)")
|
||||
print()
|
||||
|
||||
@@ -421,36 +330,6 @@ def create_client(
|
||||
},
|
||||
},
|
||||
}
|
||||
if not yolo_mode:
|
||||
# Include Playwright MCP server for browser automation (standard mode only)
|
||||
# Browser and headless mode configurable via environment variables
|
||||
browser = get_playwright_browser()
|
||||
playwright_args = [
|
||||
"@playwright/mcp@latest",
|
||||
"--viewport-size", "1280x720",
|
||||
"--browser", browser,
|
||||
]
|
||||
if get_playwright_headless():
|
||||
playwright_args.append("--headless")
|
||||
print(f" - Browser: {browser} (headless={get_playwright_headless()})")
|
||||
|
||||
# Browser isolation for parallel execution
|
||||
# Each agent gets its own isolated browser context to prevent tab conflicts
|
||||
if agent_id:
|
||||
# Use --isolated for ephemeral browser context
|
||||
# This creates a fresh, isolated context without persistent state
|
||||
# Note: --isolated and --user-data-dir are mutually exclusive
|
||||
playwright_args.append("--isolated")
|
||||
print(f" - Browser isolation enabled for agent: {agent_id}")
|
||||
|
||||
mcp_servers["playwright"] = {
|
||||
"command": "npx",
|
||||
"args": playwright_args,
|
||||
"env": {
|
||||
"NODE_COMPILE_CACHE": "", # Disable V8 compile caching to prevent .node file accumulation in %TEMP%
|
||||
},
|
||||
}
|
||||
|
||||
# Build environment overrides for API endpoint configuration
|
||||
# Uses get_effective_sdk_env() which reads provider settings from the database,
|
||||
# ensuring UI-configured alternative providers (GLM, Ollama, Kimi, Custom) propagate
|
||||
|
||||
43
lib/cli.js
@@ -517,6 +517,41 @@ function killProcess(pid) {
|
||||
}
|
||||
}
|
||||
|
||||
// ---------------------------------------------------------------------------
// Playwright CLI
// ---------------------------------------------------------------------------

/**
 * Ensure playwright-cli is available globally for browser automation.
 * Returns true if available (already installed or freshly installed).
 *
 * @param {boolean} showProgress - If true, print install progress
 */
function ensurePlaywrightCli(showProgress) {
  try {
    execSync('playwright-cli --version', {
      timeout: 10_000,
      stdio: ['pipe', 'pipe', 'pipe'],
    });
    return true;
  } catch {
    // Not installed — try to install
  }

  if (showProgress) {
    log(' Installing playwright-cli for browser automation...');
  }
  try {
    execSync('npm install -g @playwright/cli', {
      timeout: 120_000,
      stdio: ['pipe', 'pipe', 'pipe'],
    });
    return true;
  } catch {
    return false;
  }
}
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// CLI commands
|
||||
// ---------------------------------------------------------------------------
|
||||
@@ -613,6 +648,14 @@ function startServer(opts) {
|
||||
  }
  const wasAlreadyReady = ensureVenv(python, repair);

  // Ensure playwright-cli for browser automation (quick check, installs once)
  if (!ensurePlaywrightCli(!wasAlreadyReady)) {
    log('');
    log(' Note: playwright-cli not available (browser automation will be limited)');
    log(' Install manually: npm install -g @playwright/cli');
    log('');
  }

  // Step 3: Config file
  const configCreated = ensureEnvFile();
|
||||
|
||||
|
||||
@@ -19,6 +19,7 @@
|
||||
"ui/dist/",
|
||||
"ui/package.json",
|
||||
".claude/commands/",
|
||||
".claude/skills/",
|
||||
".claude/templates/",
|
||||
"examples/",
|
||||
"start.py",
|
||||
|
||||
395
prompts.py
@@ -16,6 +16,9 @@ from pathlib import Path
|
||||
# Base templates location (generic templates)
|
||||
TEMPLATES_DIR = Path(__file__).parent / ".claude" / "templates"
|
||||
|
||||
# Migration version — bump when adding new migration steps
|
||||
CURRENT_MIGRATION_VERSION = 1
|
||||
|
||||
|
||||
def get_project_prompts_dir(project_dir: Path) -> Path:
|
||||
"""Get the prompts directory for a specific project."""
|
||||
@@ -99,9 +102,9 @@ def _strip_browser_testing_sections(prompt: str) -> str:
|
||||
flags=re.DOTALL,
|
||||
)
|
||||
|
||||
# Replace the screenshots-only marking rule with YOLO-appropriate wording
|
||||
# Replace the marking rule with YOLO-appropriate wording
|
||||
prompt = prompt.replace(
|
||||
"**ONLY MARK A FEATURE AS PASSING AFTER VERIFICATION WITH SCREENSHOTS.**",
|
||||
"**ONLY MARK A FEATURE AS PASSING AFTER VERIFICATION WITH BROWSER AUTOMATION.**",
|
||||
"**YOLO mode: Mark a feature as passing after lint/type-check succeeds and server starts cleanly.**",
|
||||
)
|
||||
|
||||
@@ -351,9 +354,70 @@ def scaffold_project_prompts(project_dir: Path) -> Path:
|
||||
except (OSError, PermissionError) as e:
|
||||
print(f" Warning: Could not copy allowed_commands.yaml: {e}")
|
||||
|
||||
    # Copy Playwright CLI skill for browser automation
    skills_src = Path(__file__).parent / ".claude" / "skills" / "playwright-cli"
    skills_dest = project_dir / ".claude" / "skills" / "playwright-cli"
    if skills_src.exists() and not skills_dest.exists():
        try:
            shutil.copytree(skills_src, skills_dest)
            copied_files.append(".claude/skills/playwright-cli/")
        except (OSError, PermissionError) as e:
            print(f" Warning: Could not copy playwright-cli skill: {e}")

    # Ensure .playwright-cli/ and .playwright/ are in project .gitignore
    project_gitignore = project_dir / ".gitignore"
    entries_to_add = [".playwright-cli/", ".playwright/"]
    existing_lines: list[str] = []
    if project_gitignore.exists():
        try:
            existing_lines = project_gitignore.read_text(encoding="utf-8").splitlines()
        except (OSError, PermissionError):
            pass
    missing_entries = [e for e in entries_to_add if e not in existing_lines]
    if missing_entries:
        try:
            with open(project_gitignore, "a", encoding="utf-8") as f:
                # Add newline before entries if file doesn't end with one
                if existing_lines and existing_lines[-1].strip():
                    f.write("\n")
                for entry in missing_entries:
                    f.write(f"{entry}\n")
        except (OSError, PermissionError) as e:
            print(f" Warning: Could not update .gitignore: {e}")

    # Scaffold .playwright/cli.config.json for browser settings
    playwright_config_dir = project_dir / ".playwright"
    playwright_config_file = playwright_config_dir / "cli.config.json"
    if not playwright_config_file.exists():
        try:
            playwright_config_dir.mkdir(parents=True, exist_ok=True)
            import json
            config = {
                "browser": {
                    "browserName": "chromium",
                    "launchOptions": {
                        "channel": "chrome",
                        "headless": True,
                    },
                    "contextOptions": {
                        "viewport": {"width": 1280, "height": 720},
                    },
                    "isolated": True,
                },
            }
            with open(playwright_config_file, "w", encoding="utf-8") as f:
                json.dump(config, f, indent=2)
                f.write("\n")
            copied_files.append(".playwright/cli.config.json")
        except (OSError, PermissionError) as e:
            print(f" Warning: Could not create playwright config: {e}")

    if copied_files:
        print(f" Created project files: {', '.join(copied_files)}")

    # Stamp new projects at the current migration version so they never trigger migration
    _set_migration_version(project_dir, CURRENT_MIGRATION_VERSION)

    return project_prompts
|
||||
|
||||
|
||||
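The scaffold hunk above stamps new projects with `_set_migration_version(project_dir, CURRENT_MIGRATION_VERSION)`, but the helper's body is not visible in this part of the diff. A minimal sketch of how such a version stamp might be read and written follows. The function names (without the leading underscore), the plain-integer file format, and the `.autoforge/.migration_version` location are all assumptions made for illustration.

```python
from pathlib import Path

# Assumption: the stamp is a one-line integer file under .autoforge/.
STAMP_RELATIVE_PATH = Path(".autoforge") / ".migration_version"


def get_migration_version(project_dir: Path) -> int:
    """Read the stamped version; 0 means never migrated or unreadable."""
    try:
        text = (project_dir / STAMP_RELATIVE_PATH).read_text(encoding="utf-8")
        return int(text.strip())
    except (OSError, ValueError):
        return 0


def set_migration_version(project_dir: Path, version: int) -> None:
    """Write the version stamp, creating the parent directory if needed."""
    stamp = project_dir / STAMP_RELATIVE_PATH
    stamp.parent.mkdir(parents=True, exist_ok=True)
    stamp.write_text(f"{version}\n", encoding="utf-8")


def needs_migration(project_dir: Path, current_version: int = 1) -> bool:
    """True when the project was stamped below the current version."""
    return get_migration_version(project_dir) < current_version
```

Treating a missing or corrupt stamp as version 0 means legacy projects are migrated on the next startup, while freshly scaffolded projects, stamped at the current version, skip migration entirely.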
@@ -425,3 +489,330 @@ def copy_spec_to_project(project_dir: Path) -> None:
|
||||
return
|
||||
|
||||
print("Warning: No app_spec.txt found to copy to project directory")
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Project version migration
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
# Replacement content: coding_prompt.md STEP 5 section (Playwright CLI)
|
||||
_CLI_STEP5_CONTENT = """\
|
||||
### STEP 5: VERIFY WITH BROWSER AUTOMATION
|
||||
|
||||
**CRITICAL:** You MUST verify features through the actual UI.
|
||||
|
||||
Use `playwright-cli` for browser automation:
|
||||
|
||||
- Open the browser: `playwright-cli open http://localhost:PORT`
|
||||
- Take a snapshot to see page elements: `playwright-cli snapshot`
|
||||
- Read the snapshot YAML file to see element refs
|
||||
- Click elements by ref: `playwright-cli click e5`
|
||||
- Type text: `playwright-cli type "search query"`
|
||||
- Fill form fields: `playwright-cli fill e3 "value"`
|
||||
- Take screenshots: `playwright-cli screenshot`
|
||||
- Read the screenshot file to verify visual appearance
|
||||
- Check console errors: `playwright-cli console`
|
||||
- Close browser when done: `playwright-cli close`
|
||||
|
||||
**Token-efficient workflow:** `playwright-cli screenshot` and `snapshot` save files
|
||||
to `.playwright-cli/`. You will see a file link in the output. Read the file only
|
||||
when you need to verify visual appearance or find element refs.
|
||||
|
||||
**DO:**
|
||||
- Test through the UI with clicks and keyboard input
|
||||
- Take screenshots and read them to verify visual appearance
|
||||
- Check for console errors with `playwright-cli console`
|
||||
- Verify complete user workflows end-to-end
|
||||
- Always run `playwright-cli close` when finished testing
|
||||
|
||||
**DON'T:**
|
||||
- Only test with curl commands
|
||||
- Use JavaScript evaluation to bypass UI (`eval` and `run-code` are blocked)
|
||||
- Skip visual verification
|
||||
- Mark tests passing without thorough verification
|
||||
|
||||
"""
|
||||
|
||||
# Replacement content: coding_prompt.md BROWSER AUTOMATION reference section
|
||||
_CLI_BROWSER_SECTION = """\
|
||||
## BROWSER AUTOMATION
|
||||
|
||||
Use `playwright-cli` commands for UI verification. Key commands: `open`, `goto`,
|
||||
`snapshot`, `click`, `type`, `fill`, `screenshot`, `console`, `close`.
|
||||
|
||||
**How it works:** `playwright-cli` uses a persistent browser daemon. `open` starts it,
|
||||
subsequent commands interact via socket, `close` shuts it down. Screenshots and snapshots
|
||||
save to `.playwright-cli/` -- read the files when you need to verify content.
|
||||
|
||||
Test like a human user with mouse and keyboard. Use `playwright-cli console` to detect
|
||||
JS errors. Don't bypass UI with JavaScript evaluation.
|
||||
|
||||
"""
|
||||
|
||||
# Replacement content: testing_prompt.md STEP 2 section (Playwright CLI)
|
||||
_CLI_TESTING_STEP2 = """\
|
||||
### STEP 2: VERIFY THE FEATURE
|
||||
|
||||
**CRITICAL:** You MUST verify the feature through the actual UI using browser automation.
|
||||
|
||||
For the feature returned:
|
||||
1. Read and understand the feature's verification steps
|
||||
2. Navigate to the relevant part of the application
|
||||
3. Execute each verification step using browser automation
|
||||
4. Take screenshots and read them to verify visual appearance
|
||||
5. Check for console errors
|
||||
|
||||
### Browser Automation (Playwright CLI)
|
||||
|
||||
**Navigation & Screenshots:**
|
||||
- `playwright-cli open <url>` - Open browser and navigate
|
||||
- `playwright-cli goto <url>` - Navigate to URL
|
||||
- `playwright-cli screenshot` - Save screenshot to `.playwright-cli/`
|
||||
- `playwright-cli snapshot` - Save page snapshot with element refs to `.playwright-cli/`
|
||||
|
||||
**Element Interaction:**
|
||||
- `playwright-cli click <ref>` - Click elements (ref from snapshot)
|
||||
- `playwright-cli type <text>` - Type text
|
||||
- `playwright-cli fill <ref> <text>` - Fill form fields
|
||||
- `playwright-cli select <ref> <val>` - Select dropdown
|
||||
- `playwright-cli press <key>` - Keyboard input
|
||||
|
||||
**Debugging:**
|
||||
- `playwright-cli console` - Check for JS errors
|
||||
- `playwright-cli network` - Monitor API calls
|
||||
|
||||
**Cleanup:**
|
||||
- `playwright-cli close` - Close browser when done (ALWAYS do this)
|
||||
|
||||
**Note:** Screenshots and snapshots save to files. Read the file to see the content.
|
||||
|
||||
"""
|
||||
|
||||
# Replacement content: testing_prompt.md AVAILABLE TOOLS browser subsection
|
||||
_CLI_TESTING_TOOLS = """\
|
||||
### Browser Automation (Playwright CLI)
|
||||
Use `playwright-cli` commands for browser interaction. Key commands:
|
||||
- `playwright-cli open <url>` - Open browser
|
||||
- `playwright-cli goto <url>` - Navigate to URL
|
||||
- `playwright-cli screenshot` - Take screenshot (saved to `.playwright-cli/`)
|
||||
- `playwright-cli snapshot` - Get page snapshot with element refs
|
||||
- `playwright-cli click <ref>` - Click element
|
||||
- `playwright-cli type <text>` - Type text
|
||||
- `playwright-cli fill <ref> <text>` - Fill form field
|
||||
- `playwright-cli console` - Check for JS errors
|
||||
- `playwright-cli close` - Close browser (always do this when done)
|
||||
|
||||
"""
|
||||
|
||||
|
||||
def _get_migration_version(project_dir: Path) -> int:
    """Read the migration version from .autoforge/.migration_version."""
    from autoforge_paths import get_autoforge_dir
    version_file = get_autoforge_dir(project_dir) / ".migration_version"
    if not version_file.exists():
        return 0
    try:
        return int(version_file.read_text().strip())
    except (ValueError, OSError):
        return 0


def _set_migration_version(project_dir: Path, version: int) -> None:
    """Write the migration version to .autoforge/.migration_version."""
    from autoforge_paths import get_autoforge_dir
    version_file = get_autoforge_dir(project_dir) / ".migration_version"
    version_file.parent.mkdir(parents=True, exist_ok=True)
    version_file.write_text(str(version))
|
||||
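Reviewer sketch: the version stamp is a plain integer file, and a missing or unreadable stamp is treated as version 0. The snippet below illustrates that round trip in isolation. `read_version` and `write_version` are hypothetical stand-ins for the two helpers above; they take the `.autoforge` directory directly instead of calling `get_autoforge_dir`, which is an assumption made to keep the example self-contained.

```python
import tempfile
from pathlib import Path


def read_version(autoforge_dir: Path) -> int:
    """Return the stamped version, or 0 when the stamp is missing or unreadable."""
    version_file = autoforge_dir / ".migration_version"
    try:
        return int(version_file.read_text(encoding="utf-8").strip())
    except (ValueError, OSError):  # OSError also covers a missing file
        return 0


def write_version(autoforge_dir: Path, version: int) -> None:
    """Create the directory if needed and stamp the version."""
    autoforge_dir.mkdir(parents=True, exist_ok=True)
    (autoforge_dir / ".migration_version").write_text(str(version), encoding="utf-8")


def demo_round_trip() -> tuple[int, int, int]:
    """Read before stamping, after stamping, and after corrupting the stamp."""
    with tempfile.TemporaryDirectory() as tmp:
        autoforge_dir = Path(tmp) / ".autoforge"
        before = read_version(autoforge_dir)
        write_version(autoforge_dir, 1)
        after = read_version(autoforge_dir)
        (autoforge_dir / ".migration_version").write_text("not-a-number", encoding="utf-8")
        corrupt = read_version(autoforge_dir)
    return before, after, corrupt
```

Falling back to 0 on a corrupt stamp means the migration simply re-runs, which is only safe because every sub-step is written to be idempotent.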
|
||||
|
||||
def _migrate_coding_prompt_to_cli(content: str) -> str:
    """Replace MCP-based Playwright sections with CLI-based content in coding prompt."""
    # Replace STEP 5 section (from header to just before STEP 5.5)
    content = re.sub(
        r"### STEP 5: VERIFY WITH BROWSER AUTOMATION.*?(?=### STEP 5\.5:)",
        _CLI_STEP5_CONTENT,
        content,
        count=1,
        flags=re.DOTALL,
    )

    # Replace BROWSER AUTOMATION reference section (from header to next ---)
    content = re.sub(
        r"## BROWSER AUTOMATION\n\n.*?(?=---)",
        _CLI_BROWSER_SECTION,
        content,
        count=1,
        flags=re.DOTALL,
    )

    # Replace inline screenshot rule
    content = content.replace(
        "**ONLY MARK A FEATURE AS PASSING AFTER VERIFICATION WITH SCREENSHOTS.**",
        "**ONLY MARK A FEATURE AS PASSING AFTER VERIFICATION WITH BROWSER AUTOMATION.**",
    )

    # Replace inline screenshot references (various phrasings from old templates)
    for old_phrase in (
        "(inline only -- do NOT save to disk)",
        "(inline only, never save to disk)",
        "(inline mode only -- never save to disk)",
    ):
        content = content.replace(old_phrase, "(saved to `.playwright-cli/`)")

    return content
|
||||
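Reviewer sketch: the section-level replacement above relies on a lazy `.*?` under `re.DOTALL` plus a lookahead, so the match stops just before the next header and that header is left in place. The example below shows the same technique on a toy prompt. `replace_section`, `OLD_PROMPT`, and `NEW_SECTION` are hypothetical names, and the text after `### STEP 5.5:` is a placeholder rather than the real template. Unlike the code above, this sketch passes a callable as the replacement, so backslashes in the new text would be inserted literally instead of being read as escapes.

```python
import re

# Toy stand-in for an old prompt; the STEP 5.5 title is a placeholder.
OLD_PROMPT = """\
### STEP 5: VERIFY WITH BROWSER AUTOMATION

Use browser_navigate and browser_take_screenshot.

### STEP 5.5: (next section)

Unchanged text.
"""

NEW_SECTION = """\
### STEP 5: VERIFY WITH BROWSER AUTOMATION

Use `playwright-cli` for browser automation.

"""


def replace_section(text: str, header: str, next_header: str, replacement: str) -> str:
    """Swap everything from `header` up to (not including) `next_header`."""
    pattern = re.escape(header) + r".*?(?=" + re.escape(next_header) + r")"
    # A callable replacement is used verbatim; no backslash processing happens.
    return re.sub(pattern, lambda _match: replacement, text, count=1, flags=re.DOTALL)


HEADER = "### STEP 5: VERIFY WITH BROWSER AUTOMATION"
NEXT_HEADER = "### STEP 5.5:"

migrated = replace_section(OLD_PROMPT, HEADER, NEXT_HEADER, NEW_SECTION)
```

If either header is absent, the pattern does not match and the text comes back unchanged, which is what lets the caller compare `updated != content` before writing.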
|
||||
|
||||
def _migrate_testing_prompt_to_cli(content: str) -> str:
    """Replace MCP-based Playwright sections with CLI-based content in testing prompt."""
    # Replace AVAILABLE TOOLS browser subsection FIRST (before STEP 2, to avoid
    # matching the new CLI subsection header that the STEP 2 replacement inserts).
    # In old prompts, ### Browser Automation (Playwright) only exists in AVAILABLE TOOLS.
    content = re.sub(
        r"### Browser Automation \(Playwright[^)]*\)\n.*?(?=---)",
        _CLI_TESTING_TOOLS,
        content,
        count=1,
        flags=re.DOTALL,
    )

    # Replace STEP 2 verification section (from header to just before STEP 3)
    content = re.sub(
        r"### STEP 2: VERIFY THE FEATURE.*?(?=### STEP 3:)",
        _CLI_TESTING_STEP2,
        content,
        count=1,
        flags=re.DOTALL,
    )

    # Replace inline screenshot references (various phrasings from old templates)
    for old_phrase in (
        "(inline only -- do NOT save to disk)",
        "(inline only, never save to disk)",
        "(inline mode only -- never save to disk)",
    ):
        content = content.replace(old_phrase, "(saved to `.playwright-cli/`)")

    return content
|
||||
|
||||
|
||||
def _migrate_v0_to_v1(project_dir: Path) -> list[str]:
    """Migrate from v0 (MCP-based Playwright) to v1 (Playwright CLI).

    Four idempotent sub-steps:
    A. Copy playwright-cli skill to project
    B. Scaffold .playwright/cli.config.json
    C. Update .gitignore with .playwright-cli/ and .playwright/
    D. Update coding_prompt.md and testing_prompt.md
    """
    import json

    migrated: list[str] = []

    # A. Copy Playwright CLI skill
    skills_src = Path(__file__).parent / ".claude" / "skills" / "playwright-cli"
    skills_dest = project_dir / ".claude" / "skills" / "playwright-cli"
    if skills_src.exists() and not skills_dest.exists():
        try:
            shutil.copytree(skills_src, skills_dest)
            migrated.append("Copied playwright-cli skill")
        except (OSError, PermissionError) as e:
            print(f" Warning: Could not copy playwright-cli skill: {e}")

    # B. Scaffold .playwright/cli.config.json
    playwright_config_dir = project_dir / ".playwright"
    playwright_config_file = playwright_config_dir / "cli.config.json"
    if not playwright_config_file.exists():
        try:
            playwright_config_dir.mkdir(parents=True, exist_ok=True)
            config = {
                "browser": {
                    "browserName": "chromium",
                    "launchOptions": {
                        "channel": "chrome",
                        "headless": True,
                    },
                    "contextOptions": {
                        "viewport": {"width": 1280, "height": 720},
                    },
                    "isolated": True,
                },
            }
            with open(playwright_config_file, "w", encoding="utf-8") as f:
                json.dump(config, f, indent=2)
                f.write("\n")
            migrated.append("Created .playwright/cli.config.json")
        except (OSError, PermissionError) as e:
            print(f" Warning: Could not create playwright config: {e}")

    # C. Update .gitignore
    project_gitignore = project_dir / ".gitignore"
    entries_to_add = [".playwright-cli/", ".playwright/"]
    existing_lines: list[str] = []
    if project_gitignore.exists():
        try:
            existing_lines = project_gitignore.read_text(encoding="utf-8").splitlines()
        except (OSError, PermissionError):
            pass
    missing_entries = [e for e in entries_to_add if e not in existing_lines]
    if missing_entries:
        try:
            with open(project_gitignore, "a", encoding="utf-8") as f:
                if existing_lines and existing_lines[-1].strip():
                    f.write("\n")
                for entry in missing_entries:
                    f.write(f"{entry}\n")
            migrated.append(f"Added {', '.join(missing_entries)} to .gitignore")
        except (OSError, PermissionError) as e:
            print(f" Warning: Could not update .gitignore: {e}")

    # D. Update prompts
    prompts_dir = get_project_prompts_dir(project_dir)

    # D1. Update coding_prompt.md
    coding_prompt_path = prompts_dir / "coding_prompt.md"
    if coding_prompt_path.exists():
        try:
            content = coding_prompt_path.read_text(encoding="utf-8")
            if "Playwright MCP" in content or "browser_navigate" in content or "browser_take_screenshot" in content:
                updated = _migrate_coding_prompt_to_cli(content)
                if updated != content:
                    coding_prompt_path.write_text(updated, encoding="utf-8")
                    migrated.append("Updated coding_prompt.md to Playwright CLI")
        except (OSError, PermissionError) as e:
            print(f" Warning: Could not update coding_prompt.md: {e}")

    # D2. Update testing_prompt.md
    testing_prompt_path = prompts_dir / "testing_prompt.md"
    if testing_prompt_path.exists():
        try:
            content = testing_prompt_path.read_text(encoding="utf-8")
            if "browser_navigate" in content or "browser_take_screenshot" in content:
                updated = _migrate_testing_prompt_to_cli(content)
                if updated != content:
                    testing_prompt_path.write_text(updated, encoding="utf-8")
                    migrated.append("Updated testing_prompt.md to Playwright CLI")
        except (OSError, PermissionError) as e:
            print(f" Warning: Could not update testing_prompt.md: {e}")

    return migrated
|
||||
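Reviewer sketch: sub-step C is idempotent because it appends only the entries that are not already present as whole lines. The example below isolates that logic so it can be run against a temporary file. `ensure_gitignore_entries` and `demo_gitignore` are hypothetical names introduced for illustration, not functions from this commit.

```python
import tempfile
from pathlib import Path


def ensure_gitignore_entries(gitignore: Path, entries: list[str]) -> list[str]:
    """Append entries missing from the file; return the entries that were added."""
    existing: list[str] = []
    if gitignore.exists():
        existing = gitignore.read_text(encoding="utf-8").splitlines()
    # Whole-line comparison: "/.playwright/" or ".playwright" would not count as present.
    missing = [entry for entry in entries if entry not in existing]
    if missing:
        with open(gitignore, "a", encoding="utf-8") as f:
            # Write a newline first when the last existing line is not blank.
            if existing and existing[-1].strip():
                f.write("\n")
            for entry in missing:
                f.write(f"{entry}\n")
    return missing


def demo_gitignore() -> tuple[list[str], list[str], str]:
    """Run the helper twice on a file that already ignores .playwright/."""
    wanted = [".playwright-cli/", ".playwright/"]
    with tempfile.TemporaryDirectory() as tmp:
        gitignore = Path(tmp) / ".gitignore"
        gitignore.write_text("node_modules/\n.playwright/\n", encoding="utf-8")
        first = ensure_gitignore_entries(gitignore, wanted)
        second = ensure_gitignore_entries(gitignore, wanted)
        return first, second, gitignore.read_text(encoding="utf-8")
```

The exact-line match is a deliberate simplification: an equivalent pattern written differently (for example without the trailing slash) would lead to a harmless duplicate entry.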
|
||||
|
||||
def migrate_project_to_current(project_dir: Path) -> list[str]:
    """Migrate an existing project to the current AutoForge version.

    Idempotent; safe to call on every agent start. Returns list of
    human-readable descriptions of what was migrated.
    """
    current = _get_migration_version(project_dir)
    if current >= CURRENT_MIGRATION_VERSION:
        return []

    migrated: list[str] = []

    if current < 1:
        migrated.extend(_migrate_v0_to_v1(project_dir))

    # Future: if current < 2: migrated.extend(_migrate_v1_to_v2(project_dir))

    _set_migration_version(project_dir, CURRENT_MIGRATION_VERSION)
    return migrated
|
||||
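Reviewer sketch: the dispatcher above is a chain of `if current < N` checks followed by a single stamp. One way to picture how it might grow once a second migration exists is a table of `(target_version, step)` pairs. Everything in the example is hypothetical: the in-memory `state` dict stands in for the project directory, and `step_v1_to_v2` is an invented placeholder for a future migration.

```python
from typing import Callable

CURRENT_VERSION = 2  # hypothetical; the commit itself only defines a v1 migration


def step_v0_to_v1(state: dict) -> list[str]:
    state["browser"] = "playwright-cli"
    return ["Switched browser automation to playwright-cli"]


def step_v1_to_v2(state: dict) -> list[str]:
    state["example_flag"] = True  # invented placeholder change
    return ["Enabled example_flag"]


# Each step runs when the stored version is below its target version.
STEPS: list[tuple[int, Callable[[dict], list[str]]]] = [
    (1, step_v0_to_v1),
    (2, step_v1_to_v2),
]


def migrate(state: dict) -> list[str]:
    """Apply every pending step in order, then stamp the current version."""
    current = state.get("version", 0)
    if current >= CURRENT_VERSION:
        return []
    log: list[str] = []
    for target, step in STEPS:
        if current < target:
            log.extend(step(state))
    state["version"] = CURRENT_VERSION
    return log
```

As in the function above, the stamp is written once at the end, so a step that fails softly (logging a warning instead of raising) is not retried on the next start.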
|
||||
39
security.py
@@ -66,10 +66,12 @@ ALLOWED_COMMANDS = {
|
||||
"bash",
|
||||
# Script execution
|
||||
"init.sh", # Init scripts; validated separately
|
||||
# Browser automation
|
||||
"playwright-cli", # Playwright CLI for browser testing; validated separately
|
||||
}
|
||||
|
||||
# Commands that need additional validation even when in the allowlist
|
||||
-COMMANDS_NEEDING_EXTRA_VALIDATION = {"pkill", "chmod", "init.sh"}
+COMMANDS_NEEDING_EXTRA_VALIDATION = {"pkill", "chmod", "init.sh", "playwright-cli"}
|
||||
|
||||
# Commands that are NEVER allowed, even with user approval
|
||||
# These commands can cause permanent system damage or security breaches
|
||||
@@ -438,6 +440,37 @@ def validate_init_script(command_string: str) -> tuple[bool, str]:
|
||||
return False, f"Only ./init.sh is allowed, got: {script}"
|
||||
|
||||
|
||||
def validate_playwright_command(command_string: str) -> tuple[bool, str]:
    """
    Validate playwright-cli commands - block dangerous subcommands.

    Blocks `run-code` (arbitrary Node.js execution) and `eval` (arbitrary JS
    evaluation) which bypass the security sandbox.

    Returns:
        Tuple of (is_allowed, reason_if_blocked)
    """
    try:
        tokens = shlex.split(command_string)
    except ValueError:
        return False, "Could not parse playwright-cli command"

    if not tokens:
        return False, "Empty command"

    BLOCKED_SUBCOMMANDS = {"run-code", "eval"}

    # Find the subcommand: first non-flag token after 'playwright-cli'
    for token in tokens[1:]:
        if token.startswith("-"):
            continue  # skip flags like -s=agent-1
        if token in BLOCKED_SUBCOMMANDS:
            return False, f"playwright-cli '{token}' is not allowed"
        break  # first non-flag token is the subcommand

    return True, ""
|
||||
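Reviewer sketch: the validator treats the first token that does not start with `-` as the subcommand. The example below reproduces only that detection rule so its edge cases can be checked. `first_subcommand` and `is_allowed` are hypothetical helpers, and error handling for unparseable input is omitted. Whether `playwright-cli` accepts a space-separated flag value such as `-s agent-1` is not confirmed by this diff; if it does, the last case shows that the flag value would be mistaken for the subcommand.

```python
import shlex
from typing import Optional

BLOCKED_SUBCOMMANDS = {"run-code", "eval"}


def first_subcommand(command_string: str) -> Optional[str]:
    """Return the first non-flag token after the program name, if any."""
    tokens = shlex.split(command_string)
    for token in tokens[1:]:
        if not token.startswith("-"):
            return token
    return None


def is_allowed(command_string: str) -> bool:
    """Allow the command unless its detected subcommand is blocked."""
    return first_subcommand(command_string) not in BLOCKED_SUBCOMMANDS


examples = {
    "playwright-cli -s=agent-1 click e5": first_subcommand("playwright-cli -s=agent-1 click e5"),
    "playwright-cli eval 'document.title'": first_subcommand("playwright-cli eval 'document.title'"),
    # Assumed syntax: a flag whose value is a separate token.
    "playwright-cli -s agent-1 eval 'x'": first_subcommand("playwright-cli -s agent-1 eval 'x'"),
}
```

If the space-separated form turns out to be valid, scanning every token for a blocked name (rather than stopping at the first non-flag token) would be the more conservative rule, at the cost of rejecting arguments that merely equal `eval`.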
|
||||
|
||||
def matches_pattern(command: str, pattern: str) -> bool:
|
||||
"""
|
||||
Check if a command matches a pattern.
|
||||
@@ -955,5 +988,9 @@ async def bash_security_hook(input_data, tool_use_id=None, context=None):
|
||||
allowed, reason = validate_init_script(cmd_segment)
|
||||
if not allowed:
|
||||
return {"decision": "block", "reason": reason}
|
||||
elif cmd == "playwright-cli":
|
||||
allowed, reason = validate_playwright_command(cmd_segment)
|
||||
if not allowed:
|
||||
return {"decision": "block", "reason": reason}
|
||||
|
||||
return {}
|
||||
|
||||
@@ -227,6 +227,28 @@ class AgentProcessManager:
|
||||
"""Remove lock file."""
|
||||
self.lock_file.unlink(missing_ok=True)
|
||||
|
||||
    def _apply_playwright_headless(self, headless: bool) -> None:
        """Update .playwright/cli.config.json with the current headless setting.

        playwright-cli reads this config file on each ``open`` command, so
        updating it before the agent starts is sufficient.
        """
        config_file = self.project_dir / ".playwright" / "cli.config.json"
        if not config_file.exists():
            return
        try:
            import json
            config = json.loads(config_file.read_text(encoding="utf-8"))
            launch_opts = config.get("browser", {}).get("launchOptions", {})
            if launch_opts.get("headless") == headless:
                return  # already correct
            launch_opts["headless"] = headless
            config.setdefault("browser", {})["launchOptions"] = launch_opts
            config_file.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
            logger.info("Set playwright headless=%s for %s", headless, self.project_name)
        except Exception:
            logger.warning("Failed to update playwright config", exc_info=True)
|
||||
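Reviewer sketch: the headless fix works by rewriting one key in `.playwright/cli.config.json` and leaving the rest of the file alone. The example below exercises that read-modify-write step against a temporary file. `set_headless` and `demo_headless` are hypothetical names, and the sample config contains only a subset of the keys the migration scaffolds.

```python
import json
import tempfile
from pathlib import Path


def set_headless(config_file: Path, headless: bool) -> bool:
    """Write `headless` into browser.launchOptions; return True if the file changed."""
    config = json.loads(config_file.read_text(encoding="utf-8"))
    launch_opts = config.setdefault("browser", {}).setdefault("launchOptions", {})
    if launch_opts.get("headless") == headless:
        return False  # already correct, skip the write
    launch_opts["headless"] = headless
    config_file.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return True


def demo_headless() -> tuple[bool, bool, dict]:
    """Flip headless off twice; only the first call should rewrite the file."""
    with tempfile.TemporaryDirectory() as tmp:
        config_file = Path(tmp) / "cli.config.json"
        initial = {"browser": {"browserName": "chromium", "launchOptions": {"headless": True}}}
        config_file.write_text(json.dumps(initial), encoding="utf-8")
        changed = set_headless(config_file, False)
        changed_again = set_headless(config_file, False)
        return changed, changed_again, json.loads(config_file.read_text(encoding="utf-8"))
```

Skipping the write when the value already matches keeps the file's modification time stable across agent restarts.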
|
||||
def _cleanup_stale_features(self) -> None:
|
||||
"""Clear in_progress flag for all features when agent stops/crashes.
|
||||
|
||||
@@ -361,6 +383,15 @@ class AgentProcessManager:
|
||||
if not self._check_lock():
|
||||
return False, "Another agent instance is already running for this project"
|
||||
|
||||
# Clean up stale browser daemons from previous runs
|
||||
try:
|
||||
subprocess.run(
|
||||
["playwright-cli", "kill-all"],
|
||||
timeout=5, capture_output=True,
|
||||
)
|
||||
except (subprocess.TimeoutExpired, FileNotFoundError, OSError):
|
||||
pass
|
||||
|
||||
# Clean up features stuck from a previous crash/stop
|
||||
self._cleanup_stale_features()
|
||||
|
||||
@@ -397,6 +428,10 @@ class AgentProcessManager:
|
||||
# Add --batch-size flag for multi-feature batching
|
||||
cmd.extend(["--batch-size", str(batch_size)])
|
||||
|
||||
# Apply headless setting to .playwright/cli.config.json so playwright-cli
|
||||
# picks it up (the only mechanism it supports for headless control)
|
||||
self._apply_playwright_headless(playwright_headless)
|
||||
|
||||
try:
|
||||
# Start subprocess with piped stdout/stderr
|
||||
# Use project_dir as cwd so Claude SDK sandbox allows access to project files
|
||||
@@ -409,7 +444,7 @@ class AgentProcessManager:
|
||||
subprocess_env = {
|
||||
**os.environ,
|
||||
"PYTHONUNBUFFERED": "1",
|
||||
-"PLAYWRIGHT_HEADLESS": "true" if playwright_headless else "false",
+"PLAYWRIGHT_CLI_SESSION": f"agent-{self.project_name}-{os.getpid()}",
|
||||
"NODE_COMPILE_CACHE": "", # Disable V8 compile caching to prevent .node file accumulation in %TEMP%
|
||||
**api_env,
|
||||
}
|
||||
@@ -469,6 +504,15 @@ class AgentProcessManager:
|
||||
except asyncio.CancelledError:
|
||||
pass
|
||||
|
||||
# Kill browser daemons before stopping agent
|
||||
try:
|
||||
subprocess.run(
|
||||
["playwright-cli", "kill-all"],
|
||||
timeout=5, capture_output=True,
|
||||
)
|
||||
except (subprocess.TimeoutExpired, FileNotFoundError, OSError):
|
||||
pass
|
||||
|
||||
# CRITICAL: Kill entire process tree, not just orchestrator
|
||||
# This ensures all spawned coding/testing agents are also terminated
|
||||
proc = self.process # Capture reference before async call
|
||||
|
||||
10
start.bat
@@ -54,5 +54,15 @@ REM Install dependencies
|
||||
echo Installing dependencies...
|
||||
pip install -r requirements.txt --quiet
|
||||
|
||||
REM Ensure playwright-cli is available for browser automation
where playwright-cli >nul 2>&1
if %ERRORLEVEL% neq 0 (
    echo Installing playwright-cli for browser automation...
    call npm install -g @playwright/cli >nul 2>&1
    REM %ERRORLEVEL% is expanded when the whole block is parsed, so test the
    REM runtime exit code of npm with "if errorlevel" instead.
    if errorlevel 1 (
        echo Note: Could not install playwright-cli. Install manually: npm install -g @playwright/cli
    )
)
|
||||
|
||||
REM Run the app
|
||||
python start.py
|
||||
|
||||
9
start.sh
@@ -74,5 +74,14 @@ fi
|
||||
echo "Installing dependencies..."
|
||||
pip install -r requirements.txt --quiet
|
||||
|
||||
# Ensure playwright-cli is available for browser automation
if ! command -v playwright-cli &> /dev/null; then
    echo "Installing playwright-cli for browser automation..."
    if ! npm install -g @playwright/cli --quiet 2>/dev/null; then
        echo "Note: Could not install playwright-cli. Install manually: npm install -g @playwright/cli"
    fi
fi
|
||||
|
||||
# Run the app
|
||||
python start.py
|
||||
|
||||
10
start_ui.bat
@@ -37,5 +37,15 @@ REM Install dependencies
|
||||
echo Installing dependencies...
|
||||
pip install -r requirements.txt --quiet
|
||||
|
||||
REM Ensure playwright-cli is available for browser automation
where playwright-cli >nul 2>&1
if %ERRORLEVEL% neq 0 (
    echo Installing playwright-cli for browser automation...
    call npm install -g @playwright/cli >nul 2>&1
    REM %ERRORLEVEL% is expanded when the whole block is parsed, so test the
    REM runtime exit code of npm with "if errorlevel" instead.
    if errorlevel 1 (
        echo Note: Could not install playwright-cli. Install manually: npm install -g @playwright/cli
    )
)
|
||||
|
||||
REM Run the Python launcher
|
||||
python "%~dp0start_ui.py" %*
|
||||
|
||||
@@ -80,5 +80,14 @@ fi
|
||||
echo "Installing dependencies..."
|
||||
pip install -r requirements.txt --quiet
|
||||
|
||||
# Ensure playwright-cli is available for browser automation
if ! command -v playwright-cli &> /dev/null; then
    echo "Installing playwright-cli for browser automation..."
    if ! npm install -g @playwright/cli --quiet 2>/dev/null; then
        echo "Note: Could not install playwright-cli. Install manually: npm install -g @playwright/cli"
    fi
fi
|
||||
|
||||
# Run the Python launcher
|
||||
python start_ui.py "$@"
|
||||
|
||||
@@ -125,14 +125,18 @@ def cleanup_stale_temp(max_age_seconds: int = MAX_AGE_SECONDS) -> dict:
|
||||
|
||||
def cleanup_project_screenshots(project_dir: Path, max_age_seconds: int = 300) -> dict:
|
||||
"""
|
||||
-Clean up stale screenshot files from the project root.
+Clean up stale Playwright CLI artifacts from the project.
|
||||
|
||||
-Playwright browser verification can leave .png files in the project
-directory. This removes them after they've aged out (default 5 minutes).
+The Playwright CLI daemon saves screenshots, snapshots, and other artifacts
+to `{project_dir}/.playwright-cli/`. This removes them after they've aged
+out (default 5 minutes).
+
+Also cleans up legacy screenshot patterns from the project root (from the
+old Playwright MCP server approach).
|
||||
|
||||
Args:
|
||||
project_dir: Path to the project directory.
|
||||
-max_age_seconds: Maximum age in seconds before a screenshot is deleted.
+max_age_seconds: Maximum age in seconds before an artifact is deleted.
|
||||
Defaults to 5 minutes (300 seconds).
|
||||
|
||||
Returns:
|
||||
@@ -141,13 +145,33 @@ def cleanup_project_screenshots(project_dir: Path, max_age_seconds: int = 300) -
|
||||
cutoff_time = time.time() - max_age_seconds
|
||||
stats: dict = {"files_deleted": 0, "bytes_freed": 0, "errors": []}
|
||||
|
||||
-screenshot_patterns = [
|
||||
# Clean up .playwright-cli/ directory (new CLI approach)
|
||||
playwright_cli_dir = project_dir / ".playwright-cli"
|
||||
if playwright_cli_dir.exists():
|
||||
for item in playwright_cli_dir.iterdir():
|
||||
if not item.is_file():
|
||||
continue
|
||||
try:
|
||||
mtime = item.stat().st_mtime
|
||||
if mtime < cutoff_time:
|
||||
size = item.stat().st_size
|
||||
item.unlink(missing_ok=True)
|
||||
if not item.exists():
|
||||
stats["files_deleted"] += 1
|
||||
stats["bytes_freed"] += size
|
||||
logger.debug(f"Deleted playwright-cli artifact: {item}")
|
||||
except Exception as e:
|
||||
stats["errors"].append(f"Failed to delete {item}: {e}")
|
||||
logger.debug(f"Failed to delete artifact {item}: {e}")
|
||||
|
||||
# Legacy cleanup: root-level screenshot patterns (from old MCP server approach)
|
||||
legacy_patterns = [
|
||||
"feature*-*.png",
|
||||
"screenshot-*.png",
|
||||
"step-*.png",
|
||||
]
|
||||
|
||||
-for pattern in screenshot_patterns:
+for pattern in legacy_patterns:
|
||||
for item in project_dir.glob(pattern):
|
||||
if not item.is_file():
|
||||
continue
|
||||
@@ -159,14 +183,14 @@ def cleanup_project_screenshots(project_dir: Path, max_age_seconds: int = 300) -
|
||||
if not item.exists():
|
||||
stats["files_deleted"] += 1
|
||||
stats["bytes_freed"] += size
|
||||
-logger.debug(f"Deleted project screenshot: {item}")
+logger.debug(f"Deleted legacy screenshot: {item}")
|
||||
except Exception as e:
|
||||
stats["errors"].append(f"Failed to delete {item}: {e}")
|
||||
logger.debug(f"Failed to delete screenshot {item}: {e}")
|
||||
|
||||
if stats["files_deleted"] > 0:
|
||||
mb_freed = stats["bytes_freed"] / (1024 * 1024)
|
||||
-logger.info(f"Screenshot cleanup: {stats['files_deleted']} files, {mb_freed:.1f} MB freed")
+logger.info(f"Artifact cleanup: {stats['files_deleted']} files, {mb_freed:.1f} MB freed")
|
||||
|
||||
return stats
|
||||
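Reviewer sketch: the new `.playwright-cli/` branch deletes regular files whose modification time is older than a cutoff and records what it freed. The example below shows that age-based sweep on a temporary directory, using `os.utime` to backdate one file. `remove_stale_files` and `demo_cleanup` are hypothetical names. Unlike the function above, this sketch reads `stat()` once per file and does not re-check `item.exists()` after unlinking.

```python
import os
import tempfile
import time
from pathlib import Path


def remove_stale_files(directory: Path, max_age_seconds: int = 300) -> dict:
    """Delete regular files in `directory` older than `max_age_seconds`."""
    cutoff = time.time() - max_age_seconds
    stats: dict = {"files_deleted": 0, "bytes_freed": 0, "errors": []}
    if not directory.exists():
        return stats
    for item in directory.iterdir():
        if not item.is_file():
            continue  # subdirectories are left alone
        try:
            info = item.stat()
            if info.st_mtime < cutoff:
                item.unlink(missing_ok=True)
                stats["files_deleted"] += 1
                stats["bytes_freed"] += info.st_size
        except OSError as e:
            stats["errors"].append(f"Failed to delete {item}: {e}")
    return stats


def demo_cleanup() -> tuple[dict, list[str]]:
    """Create one stale and one fresh artifact, then sweep with the default age."""
    with tempfile.TemporaryDirectory() as tmp:
        artifacts = Path(tmp) / ".playwright-cli"
        artifacts.mkdir()
        old_file = artifacts / "old.png"
        new_file = artifacts / "new.png"
        old_file.write_bytes(b"x" * 10)
        new_file.write_bytes(b"y" * 5)
        stale = time.time() - 600  # ten minutes ago
        os.utime(old_file, (stale, stale))
        stats = remove_stale_files(artifacts)
        return stats, sorted(p.name for p in artifacts.iterdir())
```

A five-minute default leaves recent screenshots available for the agent to read back while still bounding how much the directory can grow.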
|
||||
|
||||
@@ -25,6 +25,7 @@ from security import (
|
||||
validate_chmod_command,
|
||||
validate_init_script,
|
||||
validate_pkill_command,
|
||||
validate_playwright_command,
|
||||
validate_project_command,
|
||||
)
|
||||
|
||||
@@ -923,6 +924,70 @@ pkill_processes:
|
||||
return passed, failed
|
||||
|
||||
|
||||
def test_playwright_cli_validation():
|
||||
"""Test playwright-cli subcommand validation."""
|
||||
print("\nTesting playwright-cli validation:\n")
|
||||
passed = 0
|
||||
failed = 0
|
||||
|
||||
# Test cases: (command, should_be_allowed, description)
|
||||
test_cases = [
|
||||
# Allowed cases
|
||||
("playwright-cli screenshot", True, "screenshot allowed"),
|
||||
("playwright-cli snapshot", True, "snapshot allowed"),
|
||||
("playwright-cli click e5", True, "click with ref"),
|
||||
("playwright-cli open http://localhost:3000", True, "open URL"),
|
||||
("playwright-cli -s=agent-1 click e5", True, "session flag with click"),
|
||||
("playwright-cli close", True, "close browser"),
|
||||
("playwright-cli goto http://localhost:3000/page", True, "goto URL"),
|
||||
("playwright-cli fill e3 'test value'", True, "fill form field"),
|
||||
("playwright-cli console", True, "console messages"),
|
||||
# Blocked cases
|
||||
("playwright-cli run-code 'await page.evaluate(() => {})'", False, "run-code blocked"),
|
||||
("playwright-cli eval 'document.title'", False, "eval blocked"),
|
||||
("playwright-cli -s=test eval 'document.title'", False, "eval with session flag blocked"),
|
||||
]
|
||||
|
||||
for cmd, should_allow, description in test_cases:
|
||||
allowed, reason = validate_playwright_command(cmd)
|
||||
if allowed == should_allow:
|
||||
print(f" PASS: {cmd!r} ({description})")
|
||||
passed += 1
|
||||
else:
|
||||
expected = "allowed" if should_allow else "blocked"
|
||||
actual = "allowed" if allowed else "blocked"
|
||||
print(f" FAIL: {cmd!r} ({description})")
|
||||
print(f" Expected: {expected}, Got: {actual}")
|
||||
if reason:
|
||||
print(f" Reason: {reason}")
|
||||
failed += 1
|
||||
|
||||
# Integration test: verify through the security hook
|
||||
print("\n Integration tests (via security hook):\n")
|
||||
|
||||
# playwright-cli screenshot should be allowed
|
||||
input_data = {"tool_name": "Bash", "tool_input": {"command": "playwright-cli screenshot"}}
|
||||
result = asyncio.run(bash_security_hook(input_data))
|
||||
if result.get("decision") != "block":
|
||||
print(" PASS: playwright-cli screenshot allowed via hook")
|
||||
passed += 1
|
||||
else:
|
||||
print(f" FAIL: playwright-cli screenshot should be allowed: {result.get('reason')}")
|
||||
failed += 1
|
||||
|
||||
# playwright-cli run-code should be blocked
|
||||
input_data = {"tool_name": "Bash", "tool_input": {"command": "playwright-cli run-code 'code'"}}
|
||||
result = asyncio.run(bash_security_hook(input_data))
|
||||
if result.get("decision") == "block":
|
||||
print(" PASS: playwright-cli run-code blocked via hook")
|
||||
passed += 1
|
||||
else:
|
||||
print(" FAIL: playwright-cli run-code should be blocked via hook")
|
||||
failed += 1
|
||||
|
||||
return passed, failed
|
||||
|
||||
|
||||
def main():
|
||||
print("=" * 70)
|
||||
print(" SECURITY HOOK TESTS")
|
||||
@@ -991,6 +1056,11 @@ def main():
|
||||
passed += pkill_passed
|
||||
failed += pkill_failed
|
||||
|
||||
# Test playwright-cli validation
|
||||
pw_passed, pw_failed = test_playwright_cli_validation()
|
||||
passed += pw_passed
|
||||
failed += pw_failed
|
||||
|
||||
# Commands that SHOULD be blocked
|
||||
# Note: blocklisted commands (sudo, shutdown, dd, aws) are tested in
|
||||
# test_blocklist_enforcement(). chmod validation is tested in
|
||||
@@ -1012,6 +1082,9 @@ def main():
|
||||
# Shell injection attempts
|
||||
"$(echo pkill) node",
|
||||
'eval "pkill node"',
|
||||
# playwright-cli dangerous subcommands
|
||||
"playwright-cli run-code 'await page.goto(\"http://evil.com\")'",
|
||||
"playwright-cli eval 'document.cookie'",
|
||||
]
|
||||
|
||||
for cmd in dangerous:
|
||||
@@ -1077,6 +1150,12 @@ def main():
|
||||
"/usr/local/bin/node app.js",
|
||||
# Combined chmod and init.sh (integration test for both validators)
|
||||
"chmod +x init.sh && ./init.sh",
|
||||
# Playwright CLI allowed commands
|
||||
"playwright-cli open http://localhost:3000",
|
||||
"playwright-cli screenshot",
|
||||
"playwright-cli snapshot",
|
||||
"playwright-cli click e5",
|
||||
"playwright-cli -s=agent-1 close",
|
||||
]
|
||||
|
||||
for cmd in safe:
|
||||
|
||||
@@ -75,6 +75,7 @@ export function ProjectSelector({
|
||||
variant="outline"
|
||||
className="min-w-[140px] sm:min-w-[200px] justify-between"
|
||||
disabled={isLoading}
|
||||
title={selectedProjectData?.path}
|
||||
>
|
||||
{isLoading ? (
|
||||
<Loader2 size={18} className="animate-spin" />
|
||||
@@ -101,6 +102,7 @@ export function ProjectSelector({
|
||||
{projects.map(project => (
|
||||
<DropdownMenuItem
|
||||
key={project.name}
|
||||
title={project.path}
|
||||
className={`flex items-center justify-between cursor-pointer ${
|
||||
project.name === selectedProject ? 'bg-primary/10' : ''
|
||||
}`}
|
||||
|
||||