mirror of
https://github.com/leonvanzyl/autocoder.git
synced 2026-03-16 18:33:08 +00:00
feat: migrate browser automation from Playwright MCP to CLI, fix headless setting
Major changes across 21 files (755 additions, 196 deletions):

Browser Automation Migration:
- Add versioned project migration system (prompts.py) with content-based detection and section-level regex replacement for coding/testing prompts
- Migrate STEP 5 (browser verification) and BROWSER AUTOMATION sections in coding prompt template to use playwright-cli commands
- Migrate STEP 2 and AVAILABLE TOOLS sections in testing prompt template
- Migration auto-runs at agent startup (autonomous_agent_demo.py), copies playwright-cli skill, scaffolds .playwright/cli.config.json, updates .gitignore, and stamps .migration_version file
- Add playwright-cli command validation to security allowlist (security.py) with tests for allowed subcommands and blocked eval/run-code

Headless Browser Setting Fix:
- Add _apply_playwright_headless() to process_manager.py that reads/updates .playwright/cli.config.json before agent subprocess launch
- Remove dead PLAYWRIGHT_HEADLESS env var that was never consumed
- Settings UI toggle now correctly controls visible browser window

Playwright CLI Auto-Install:
- Add ensurePlaywrightCli() to lib/cli.js for npm global entry point
- Add playwright-cli detection + npm install to start.bat, start.sh, start_ui.bat, start_ui.sh for all startup paths

Other Improvements:
- Add project folder path tooltip to ProjectSelector.tsx dropdown items
- Remove legacy Playwright MCP server configuration from client.py
- Update CLAUDE.md with playwright-cli skill documentation

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
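
The headless fix in the commit message comes down to a read-modify-write of `.playwright/cli.config.json` before the agent subprocess starts. The actual `_apply_playwright_headless()` in process_manager.py is not part of this excerpt, so the following is only a minimal sketch of that idea. The function name and the `browser.launchOptions.headless` key path are assumptions about the config layout, not confirmed by the diff.

```python
import json
from pathlib import Path


def apply_playwright_headless(project_dir: Path, headless: bool) -> Path:
    """Set the headless flag in .playwright/cli.config.json, keeping other keys."""
    config_path = project_dir / ".playwright" / "cli.config.json"
    config_path.parent.mkdir(parents=True, exist_ok=True)

    config: dict = {}
    if config_path.exists():
        try:
            loaded = json.loads(config_path.read_text(encoding="utf-8"))
        except json.JSONDecodeError:
            loaded = None  # unreadable file: fall back to an empty config
        if isinstance(loaded, dict):
            config = loaded

    # ASSUMPTION: the flag lives at browser.launchOptions.headless.
    browser = config.get("browser")
    if not isinstance(browser, dict):
        browser = config["browser"] = {}
    launch = browser.get("launchOptions")
    if not isinstance(launch, dict):
        launch = browser["launchOptions"] = {}
    launch["headless"] = headless

    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config_path
```

Writing the flag into the file, rather than exporting an environment variable, matches the commit's reasoning that the old `PLAYWRIGHT_HEADLESS` variable was never consumed.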
@@ -86,24 +86,33 @@ Implement the chosen feature thoroughly:
 
 **CRITICAL:** You MUST verify features through the actual UI.
 
-Use browser automation tools:
+Use `playwright-cli` for browser automation:
 
-- Navigate to the app in a real browser
-- Interact like a human user (click, type, scroll)
-- Take screenshots at each step (use inline screenshots only -- do NOT save screenshot files to disk)
-- Verify both functionality AND visual appearance
+- Open the browser: `playwright-cli open http://localhost:PORT`
+- Take a snapshot to see page elements: `playwright-cli snapshot`
+- Read the snapshot YAML file to see element refs
+- Click elements by ref: `playwright-cli click e5`
+- Type text: `playwright-cli type "search query"`
+- Fill form fields: `playwright-cli fill e3 "value"`
+- Take screenshots: `playwright-cli screenshot`
+- Read the screenshot file to verify visual appearance
+- Check console errors: `playwright-cli console`
+- Close browser when done: `playwright-cli close`
+
+**Token-efficient workflow:** `playwright-cli screenshot` and `snapshot` save files
+to `.playwright-cli/`. You will see a file link in the output. Read the file only
+when you need to verify visual appearance or find element refs.
 
 **DO:**
-
 - Test through the UI with clicks and keyboard input
-- Take screenshots to verify visual appearance (inline only, never save to disk)
-- Check for console errors in browser
+- Take screenshots and read them to verify visual appearance
+- Check for console errors with `playwright-cli console`
 - Verify complete user workflows end-to-end
+- Always run `playwright-cli close` when finished testing
 
 **DON'T:**
-
-- Only test with curl commands (backend testing alone is insufficient)
-- Use JavaScript evaluation to bypass UI (no shortcuts)
+- Only test with curl commands
+- Use JavaScript evaluation to bypass UI (`eval` and `run-code` are blocked)
 - Skip visual verification
 - Mark tests passing without thorough verification
 
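
The prompt says `eval` and `run-code` are blocked, and the commit message says security.py validates `playwright-cli` subcommands against an allowlist. That file is not shown in this excerpt, so here is a hedged Python sketch of such a check. The function name and the exact allowed set are assumptions: the set below simply lists the subcommands that appear in the prompt templates.

```python
import shlex

# ASSUMPTION: subcommands taken from the prompt templates; the real
# allowlist in security.py may be longer or shaped differently.
ALLOWED_SUBCOMMANDS = {
    "open", "goto", "snapshot", "click", "type", "fill",
    "select", "press", "screenshot", "console", "network", "close",
}
BLOCKED_SUBCOMMANDS = {"eval", "run-code"}


def validate_playwright_cli(command: str) -> tuple[bool, str]:
    """Return (allowed, reason) for a single playwright-cli command line."""
    try:
        tokens = shlex.split(command)
    except ValueError as exc:
        return False, f"could not parse command: {exc}"

    if not tokens or tokens[0] != "playwright-cli":
        return False, "not a playwright-cli command"
    if len(tokens) < 2:
        return False, "missing subcommand"

    subcommand = tokens[1]
    if subcommand in BLOCKED_SUBCOMMANDS:
        return False, f"subcommand '{subcommand}' is blocked"
    if subcommand not in ALLOWED_SUBCOMMANDS:
        # Fail closed: anything not explicitly listed is rejected.
        return False, f"subcommand '{subcommand}' is not on the allowlist"
    return True, "ok"
```

The sketch fails closed: a command with a leading flag or an unknown subcommand is rejected, not guessed at.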
@@ -145,7 +154,7 @@ Use the feature_mark_passing tool with feature_id=42
 - Combine or consolidate features
 - Reorder features
 
-**ONLY MARK A FEATURE AS PASSING AFTER VERIFICATION WITH SCREENSHOTS.**
+**ONLY MARK A FEATURE AS PASSING AFTER VERIFICATION WITH BROWSER AUTOMATION.**
 
 ### STEP 7: COMMIT YOUR PROGRESS
 
@@ -192,11 +201,15 @@ Before context fills up:
 
 ## BROWSER AUTOMATION
 
-Use Playwright MCP tools (`browser_*`) for UI verification. Key tools: `navigate`, `click`, `type`, `fill_form`, `take_screenshot`, `console_messages`, `network_requests`. All tools have auto-wait built in.
+Use `playwright-cli` commands for UI verification. Key commands: `open`, `goto`,
+`snapshot`, `click`, `type`, `fill`, `screenshot`, `console`, `close`.
 
-**Screenshot rule:** Always use inline mode (base64). NEVER save screenshots as files to disk.
+**How it works:** `playwright-cli` uses a persistent browser daemon. `open` starts it,
+subsequent commands interact via socket, `close` shuts it down. Screenshots and snapshots
+save to `.playwright-cli/` -- read the files when you need to verify content.
 
-Test like a human user with mouse and keyboard. Use `browser_console_messages` to detect errors. Don't bypass UI with JavaScript evaluation.
+Test like a human user with mouse and keyboard. Use `playwright-cli console` to detect
+JS errors. Don't bypass UI with JavaScript evaluation.
 
 ---
 
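
The open / interact / close lifecycle described in the BROWSER AUTOMATION section can be sketched as a small Python driver. This is illustrative only: `run_cli` and `verify_page` are hypothetical helper names, and only subcommands that the prompt itself lists are used. The `runner` parameter exists so the sequence can be exercised without a real browser.

```python
import subprocess
from typing import Callable, Sequence

Runner = Callable[[Sequence[str]], str]


def run_cli(args: Sequence[str]) -> str:
    """Run one playwright-cli command and return its stdout."""
    result = subprocess.run(
        ["playwright-cli", *args],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def verify_page(url: str, click_ref: str, runner: Runner = run_cli) -> str:
    """Open a page, snapshot it, click one element, and return console output."""
    runner(["open", url])  # starts the persistent browser daemon
    try:
        runner(["snapshot"])  # saves a snapshot file with element refs
        runner(["click", click_ref])
        runner(["screenshot"])  # saves a screenshot file
        return runner(["console"])
    finally:
        runner(["close"])  # always shut the daemon down, even on failure
```

The `try`/`finally` mirrors the prompt's rule to always run `playwright-cli close` when finished, including when a step fails partway through.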
@@ -31,26 +31,32 @@ For the feature returned:
 1. Read and understand the feature's verification steps
 2. Navigate to the relevant part of the application
 3. Execute each verification step using browser automation
-4. Take screenshots to document the verification (inline only -- do NOT save to disk)
+4. Take screenshots and read them to verify visual appearance
 5. Check for console errors
 
-Use browser automation tools:
+### Browser Automation (Playwright CLI)
 
 **Navigation & Screenshots:**
-- browser_navigate - Navigate to a URL
-- browser_take_screenshot - Capture screenshot (inline mode only -- never save to disk)
-- browser_snapshot - Get accessibility tree snapshot
+- `playwright-cli open <url>` - Open browser and navigate
+- `playwright-cli goto <url>` - Navigate to URL
+- `playwright-cli screenshot` - Save screenshot to `.playwright-cli/`
+- `playwright-cli snapshot` - Save page snapshot with element refs to `.playwright-cli/`
 
 **Element Interaction:**
-- browser_click - Click elements
-- browser_type - Type text into editable elements
-- browser_fill_form - Fill multiple form fields
-- browser_select_option - Select dropdown options
-- browser_press_key - Press keyboard keys
+- `playwright-cli click <ref>` - Click elements (ref from snapshot)
+- `playwright-cli type <text>` - Type text
+- `playwright-cli fill <ref> <text>` - Fill form fields
+- `playwright-cli select <ref> <val>` - Select dropdown
+- `playwright-cli press <key>` - Keyboard input
 
 **Debugging:**
-- browser_console_messages - Get browser console output (check for errors)
-- browser_network_requests - Monitor API calls
+- `playwright-cli console` - Check for JS errors
+- `playwright-cli network` - Monitor API calls
 
+**Cleanup:**
+- `playwright-cli close` - Close browser when done (ALWAYS do this)
+
+**Note:** Screenshots and snapshots save to files. Read the file to see the content.
+
 ### STEP 3: HANDLE RESULTS
 
@@ -79,7 +85,7 @@ A regression has been introduced. You MUST fix it:
 
 4. **Verify the fix:**
 - Run through all verification steps again
-- Take screenshots confirming the fix (inline only, never save to disk)
+- Take screenshots and read them to confirm the fix
 
 5. **Mark as passing after fix:**
 ```
@@ -98,7 +104,7 @@ A regression has been introduced. You MUST fix it:
 
 ---
 
-## AVAILABLE MCP TOOLS
+## AVAILABLE TOOLS
 
 ### Feature Management
 - `feature_get_stats` - Get progress overview (passing/in_progress/total counts)
@@ -106,19 +112,17 @@ A regression has been introduced. You MUST fix it:
 - `feature_mark_failing` - Mark a feature as failing (when you find a regression)
 - `feature_mark_passing` - Mark a feature as passing (after fixing a regression)
 
-### Browser Automation (Playwright)
-All interaction tools have **built-in auto-wait** -- no manual timeouts needed.
-
-- `browser_navigate` - Navigate to URL
-- `browser_take_screenshot` - Capture screenshot (inline only, never save to disk)
-- `browser_snapshot` - Get accessibility tree
-- `browser_click` - Click elements
-- `browser_type` - Type text
-- `browser_fill_form` - Fill form fields
-- `browser_select_option` - Select dropdown
-- `browser_press_key` - Keyboard input
-- `browser_console_messages` - Check for JS errors
-- `browser_network_requests` - Monitor API calls
+### Browser Automation (Playwright CLI)
+Use `playwright-cli` commands for browser interaction. Key commands:
+- `playwright-cli open <url>` - Open browser
+- `playwright-cli goto <url>` - Navigate to URL
+- `playwright-cli screenshot` - Take screenshot (saved to `.playwright-cli/`)
+- `playwright-cli snapshot` - Get page snapshot with element refs
+- `playwright-cli click <ref>` - Click element
+- `playwright-cli type <text>` - Type text
+- `playwright-cli fill <ref> <text>` - Fill form field
+- `playwright-cli console` - Check for JS errors
+- `playwright-cli close` - Close browser (always do this when done)
 
 ---
 
.gitignore (4 lines changed, vendored)
@@ -10,6 +10,10 @@ issues/
 # Browser profiles for parallel agent execution
 .browser-profiles/
 
+# Playwright CLI daemon artifacts
+.playwright-cli/
+.playwright/
+
 # Log files
 logs/
 *.log
@@ -28,5 +28,4 @@ start.sh
 start_ui.sh
 start_ui.py
 .claude/agents/
-.claude/skills/
 .claude/settings.json
CLAUDE.md (10 lines changed)
@@ -85,7 +85,7 @@ python autonomous_agent_demo.py --project-dir my-app --yolo
 
 **What's different in YOLO mode:**
 - No regression testing
-- No Playwright MCP server (browser automation disabled)
+- No Playwright CLI (browser automation disabled)
 - Features marked passing after lint/type-check succeeds
 - Faster iteration for prototyping
 
@@ -163,7 +163,7 @@ Publishing: `npm publish` (triggers `prepublishOnly` which builds UI, then publi
 - `autonomous_agent_demo.py` - Entry point for running the agent (supports `--yolo`, `--parallel`, `--batch-size`, `--batch-features`)
 - `autoforge_paths.py` - Central path resolution with dual-path backward compatibility and migration
 - `agent.py` - Agent session loop using Claude Agent SDK
-- `client.py` - ClaudeSDKClient configuration with security hooks, MCP servers, and Vertex AI support
+- `client.py` - ClaudeSDKClient configuration with security hooks, feature MCP server, and Vertex AI support
 - `security.py` - Bash command allowlist validation (ALLOWED_COMMANDS whitelist)
 - `prompts.py` - Prompt template loading with project-specific fallback and batch feature prompts
 - `progress.py` - Progress tracking, database queries, webhook notifications
@@ -288,6 +288,9 @@ Projects can be stored in any directory (registered in `~/.autoforge/registry.db
 - `.autoforge/.agent.lock` - Lock file to prevent multiple agent instances
 - `.autoforge/allowed_commands.yaml` - Project-specific bash command allowlist (optional)
 - `.autoforge/.gitignore` - Ignores runtime files
+- `.claude/skills/playwright-cli/` - Playwright CLI skill for browser automation
+- `.playwright/cli.config.json` - Browser configuration (headless, viewport, etc.)
+- `.playwright-cli/` - Playwright CLI daemon artifacts (screenshots, snapshots) - gitignored
 - `CLAUDE.md` - Stays at project root (SDK convention)
 - `app_spec.txt` - Root copy for agent template compatibility
 
@@ -445,6 +448,7 @@ Alternative providers are configured via the **Settings UI** (gear icon > API Pr
 **Skills** (`.claude/skills/`):
 - `frontend-design` - Distinctive, production-grade UI design
 - `gsd-to-autoforge-spec` - Convert GSD codebase mapping to AutoForge app_spec format
+- `playwright-cli` - Browser automation via Playwright CLI (copied to each project)
 
 **Other:**
 - `.claude/templates/` - Prompt templates copied to new projects
@@ -479,7 +483,7 @@ When running with `--parallel`, the orchestrator:
 1. Spawns multiple Claude agents as subprocesses (up to `--max-concurrency`)
 2. Each agent claims features atomically via `feature_claim_and_get`
 3. Features blocked by unmet dependencies are skipped
-4. Browser contexts are isolated per agent using `--isolated` flag
+4. Browser sessions are isolated per agent via `PLAYWRIGHT_CLI_SESSION` environment variable
 5. AgentTracker parses output and emits `agent_update` messages for UI
 
 ### Process Limits (Parallel Mode)
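
Per-agent isolation via `PLAYWRIGHT_CLI_SESSION` could look roughly like the Python sketch below. The orchestrator code that sets the variable is not in this excerpt. The ID scheme here is borrowed from the `agent_id` logic that this same commit removes from agent.py, and reusing it for session names is an assumption. Both function names are hypothetical.

```python
import os
from typing import Mapping, Optional, Sequence


def agent_session_id(
    agent_type: str,
    feature_id: Optional[int] = None,
    feature_ids: Optional[Sequence[int]] = None,
    pid: Optional[int] = None,
) -> Optional[str]:
    """Derive a per-agent session name (scheme assumed, see lead-in)."""
    if agent_type == "testing":
        return f"testing-{pid if pid is not None else os.getpid()}"
    if feature_ids and len(feature_ids) > 1:
        return f"batch-{feature_ids[0]}"
    if feature_id:
        return f"feature-{feature_id}"
    return None


def session_env(
    session_id: Optional[str],
    base_env: Optional[Mapping[str, str]] = None,
) -> dict[str, str]:
    """Copy the environment and set PLAYWRIGHT_CLI_SESSION when an ID is given."""
    env = dict(os.environ if base_env is None else base_env)
    if session_id:
        env["PLAYWRIGHT_CLI_SESSION"] = session_id
    return env
```

The returned mapping would be passed as `env=` when spawning each agent subprocess, so that concurrent agents do not share one browser daemon.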
agent.py (12 lines changed)
@@ -240,17 +240,7 @@ async def run_autonomous_agent(
         print_session_header(iteration, is_initializer)
 
         # Create client (fresh context)
-        # Pass agent_id for browser isolation in multi-agent scenarios
-        import os
-        if agent_type == "testing":
-            agent_id = f"testing-{os.getpid()}"  # Unique ID for testing agents
-        elif feature_ids and len(feature_ids) > 1:
-            agent_id = f"batch-{feature_ids[0]}"
-        elif feature_id:
-            agent_id = f"feature-{feature_id}"
-        else:
-            agent_id = None
-        client = create_client(project_dir, model, yolo_mode=yolo_mode, agent_id=agent_id, agent_type=agent_type)
+        client = create_client(project_dir, model, yolo_mode=yolo_mode, agent_type=agent_type)
 
         # Choose prompt based on agent type
         if agent_type == "initializer":
@@ -43,6 +43,7 @@ assistant.db-shm
 .claude_assistant_settings.json
 .claude_settings.expand.*.json
 .progress_cache
+.migration_version
 """
 
 
@@ -237,6 +237,12 @@ def main() -> None:
     if migrated:
         print(f"Migrated project files to .autoforge/: {', '.join(migrated)}", flush=True)
 
+    # Migrate project to current AutoForge version (idempotent, safe)
+    from prompts import migrate_project_to_current
+    version_migrated = migrate_project_to_current(project_dir)
+    if version_migrated:
+        print(f"Upgraded project: {', '.join(version_migrated)}", flush=True)
+
     # Parse batch testing feature IDs (comma-separated string -> list[int])
     testing_feature_ids: list[int] | None = None
     if args.testing_feature_ids:
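
The body of `migrate_project_to_current` lives in prompts.py, which this excerpt does not show. The commit message describes it as versioned and content-based, using section-level regex replacement and a `.migration_version` stamp. The Python sketch below illustrates that general shape under stated assumptions: the version number, the prompt file name, the detection marker, and the replacement text are all hypothetical stand-ins.

```python
import re
from pathlib import Path

CURRENT_VERSION = 1  # hypothetical version number
VERSION_FILE = ".migration_version"
NEW_SECTION = (
    "## BROWSER AUTOMATION\n\n"
    "Use `playwright-cli` commands for UI verification.\n\n"
)


def read_version(project_dir: Path) -> int:
    """Return the stamped migration version, or 0 if missing or unreadable."""
    try:
        return int((project_dir / VERSION_FILE).read_text(encoding="utf-8").strip())
    except (FileNotFoundError, ValueError):
        return 0


def replace_section(text: str, heading: str, new_section: str) -> str:
    """Replace one section, from its heading up to the next '## ' or '---' line."""
    pattern = re.compile(
        rf"^{re.escape(heading)}[ \t]*\n.*?(?=^## |^---[ \t]*$|\Z)",
        re.DOTALL | re.MULTILINE,
    )
    # A callable replacement avoids backslash processing in new_section.
    return pattern.sub(lambda _match: new_section, text, count=1)


def migrate_project(project_dir: Path, prompt_name: str = "coding_prompt.md") -> list[str]:
    """Upgrade old prompt sections once; later calls are no-ops."""
    changed: list[str] = []
    if read_version(project_dir) >= CURRENT_VERSION:
        return changed

    prompt = project_dir / prompt_name
    if prompt.exists():
        text = prompt.read_text(encoding="utf-8")
        if "Playwright MCP" in text:  # content-based detection (assumed marker)
            updated = replace_section(text, "## BROWSER AUTOMATION", NEW_SECTION)
            prompt.write_text(updated, encoding="utf-8")
            changed.append(prompt_name)

    (project_dir / VERSION_FILE).write_text(f"{CURRENT_VERSION}\n", encoding="utf-8")
    return changed
```

Stamping the version after the rewrite is what makes a second run return an empty list, which fits the "idempotent, safe" comment in the hunk above.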
client.py (131 lines changed)
@@ -21,16 +21,6 @@ from security import SENSITIVE_DIRECTORIES, bash_security_hook
 # Load environment variables from .env file if present
 load_dotenv()
 
-# Default Playwright headless mode - can be overridden via PLAYWRIGHT_HEADLESS env var
-# When True, browser runs invisibly in background (default - saves CPU)
-# When False, browser window is visible (useful for monitoring agent progress)
-DEFAULT_PLAYWRIGHT_HEADLESS = True
-
-# Default browser for Playwright - can be overridden via PLAYWRIGHT_BROWSER env var
-# Options: chrome, firefox, webkit, msedge
-# Firefox is recommended for lower CPU usage
-DEFAULT_PLAYWRIGHT_BROWSER = "firefox"
-
 # Extra read paths for cross-project file access (read-only)
 # Set EXTRA_READ_PATHS environment variable with comma-separated absolute paths
 # Example: EXTRA_READ_PATHS=/Volumes/Data/dev,/Users/shared/libs
@@ -41,6 +31,7 @@ EXTRA_READ_PATHS_VAR = "EXTRA_READ_PATHS"
 # this blocklist and the filesystem browser API share a single source of truth.
 EXTRA_READ_PATHS_BLOCKLIST = SENSITIVE_DIRECTORIES
 
+
 def convert_model_for_vertex(model: str) -> str:
     """
     Convert model name format for Vertex AI compatibility.
@@ -72,43 +63,6 @@ def convert_model_for_vertex(model: str) -> str:
     return model
 
 
-def get_playwright_headless() -> bool:
-    """
-    Get the Playwright headless mode setting.
-
-    Reads from PLAYWRIGHT_HEADLESS environment variable, defaults to True.
-    Returns True for headless mode (invisible browser), False for visible browser.
-    """
-    value = os.getenv("PLAYWRIGHT_HEADLESS", str(DEFAULT_PLAYWRIGHT_HEADLESS).lower()).strip().lower()
-    truthy = {"true", "1", "yes", "on"}
-    falsy = {"false", "0", "no", "off"}
-    if value not in truthy | falsy:
-        print(f" - Warning: Invalid PLAYWRIGHT_HEADLESS='{value}', defaulting to {DEFAULT_PLAYWRIGHT_HEADLESS}")
-        return DEFAULT_PLAYWRIGHT_HEADLESS
-    return value in truthy
-
-
-# Valid browsers supported by Playwright MCP
-VALID_PLAYWRIGHT_BROWSERS = {"chrome", "firefox", "webkit", "msedge"}
-
-
-def get_playwright_browser() -> str:
-    """
-    Get the browser to use for Playwright.
-
-    Reads from PLAYWRIGHT_BROWSER environment variable, defaults to firefox.
-    Options: chrome, firefox, webkit, msedge
-    Firefox is recommended for lower CPU usage.
-    """
-    value = os.getenv("PLAYWRIGHT_BROWSER", DEFAULT_PLAYWRIGHT_BROWSER).strip().lower()
-    if value not in VALID_PLAYWRIGHT_BROWSERS:
-        print(f" - Warning: Invalid PLAYWRIGHT_BROWSER='{value}', "
-              f"valid options: {', '.join(sorted(VALID_PLAYWRIGHT_BROWSERS))}. "
-              f"Defaulting to {DEFAULT_PLAYWRIGHT_BROWSER}")
-        return DEFAULT_PLAYWRIGHT_BROWSER
-    return value
-
-
 def get_extra_read_paths() -> list[Path]:
     """
     Get extra read-only paths from EXTRA_READ_PATHS environment variable.
@@ -228,41 +182,6 @@ ALL_FEATURE_MCP_TOOLS = sorted(
     set(CODING_AGENT_TOOLS) | set(TESTING_AGENT_TOOLS) | set(INITIALIZER_AGENT_TOOLS)
 )
 
-# Playwright MCP tools for browser automation.
-# Full set of tools for comprehensive UI testing including drag-and-drop,
-# hover menus, file uploads, tab management, etc.
-PLAYWRIGHT_TOOLS = [
-    # Core navigation & screenshots
-    "mcp__playwright__browser_navigate",
-    "mcp__playwright__browser_navigate_back",
-    "mcp__playwright__browser_take_screenshot",
-    "mcp__playwright__browser_snapshot",
-
-    # Element interaction
-    "mcp__playwright__browser_click",
-    "mcp__playwright__browser_type",
-    "mcp__playwright__browser_fill_form",
-    "mcp__playwright__browser_select_option",
-    "mcp__playwright__browser_press_key",
-    "mcp__playwright__browser_drag",
-    "mcp__playwright__browser_hover",
-    "mcp__playwright__browser_file_upload",
-
-    # JavaScript & debugging
-    "mcp__playwright__browser_evaluate",
-    # "mcp__playwright__browser_run_code",  # REMOVED - causes Playwright MCP server crash
-    "mcp__playwright__browser_console_messages",
-    "mcp__playwright__browser_network_requests",
-
-    # Browser management
-    "mcp__playwright__browser_resize",
-    "mcp__playwright__browser_wait_for",
-    "mcp__playwright__browser_handle_dialog",
-    "mcp__playwright__browser_install",
-    "mcp__playwright__browser_close",
-    "mcp__playwright__browser_tabs",
-]
-
 # Built-in tools available to agents.
 # WebFetch and WebSearch are included so coding agents can look up current
 # documentation for frameworks and libraries they are implementing.
@@ -282,7 +201,6 @@ def create_client(
     project_dir: Path,
     model: str,
     yolo_mode: bool = False,
-    agent_id: str | None = None,
     agent_type: str = "coding",
 ):
     """
@@ -291,9 +209,7 @@ def create_client(
     Args:
         project_dir: Directory for the project
         model: Claude model to use
-        yolo_mode: If True, skip Playwright MCP server for rapid prototyping
-        agent_id: Optional unique identifier for browser isolation in parallel mode.
-            When provided, each agent gets its own browser profile.
+        yolo_mode: If True, skip browser testing for rapid prototyping
         agent_type: One of "coding", "testing", or "initializer". Controls which
             MCP tools are exposed and the max_turns limit.
 
@@ -327,11 +243,8 @@ def create_client(
     }
     max_turns = max_turns_map.get(agent_type, 300)
 
-    # Build allowed tools list based on mode and agent type.
-    # In YOLO mode, exclude Playwright tools for faster prototyping.
+    # Build allowed tools list based on agent type.
     allowed_tools = [*BUILTIN_TOOLS, *feature_tools]
-    if not yolo_mode:
-        allowed_tools.extend(PLAYWRIGHT_TOOLS)
 
     # Build permissions list.
     # We permit ALL feature MCP tools at the security layer (so the MCP server
@@ -363,10 +276,6 @@ def create_client(
             permissions_list.append(f"Glob({path}/**)")
             permissions_list.append(f"Grep({path}/**)")
 
-    if not yolo_mode:
-        # Allow Playwright MCP tools for browser automation (standard mode only)
-        permissions_list.extend(PLAYWRIGHT_TOOLS)
-
     # Create comprehensive security settings
     # Note: Using relative paths ("./**") restricts access to project directory
     # since cwd is set to project_dir
@@ -395,9 +304,9 @@ def create_client(
         print(f" - Extra read paths (validated): {', '.join(str(p) for p in extra_read_paths)}")
     print(" - Bash commands restricted to allowlist (see security.py)")
     if yolo_mode:
-        print(" - MCP servers: features (database) - YOLO MODE (no Playwright)")
+        print(" - MCP servers: features (database) - YOLO MODE (no browser testing)")
     else:
-        print(" - MCP servers: playwright (browser), features (database)")
+        print(" - MCP servers: features (database)")
     print(" - Project settings enabled (skills, commands, CLAUDE.md)")
     print()
 
@@ -421,36 +330,6 @@ def create_client(
             },
         },
     }
-    if not yolo_mode:
-        # Include Playwright MCP server for browser automation (standard mode only)
-        # Browser and headless mode configurable via environment variables
-        browser = get_playwright_browser()
-        playwright_args = [
-            "@playwright/mcp@latest",
-            "--viewport-size", "1280x720",
-            "--browser", browser,
-        ]
-        if get_playwright_headless():
-            playwright_args.append("--headless")
-        print(f" - Browser: {browser} (headless={get_playwright_headless()})")
-
-        # Browser isolation for parallel execution
-        # Each agent gets its own isolated browser context to prevent tab conflicts
-        if agent_id:
-            # Use --isolated for ephemeral browser context
-            # This creates a fresh, isolated context without persistent state
-            # Note: --isolated and --user-data-dir are mutually exclusive
-            playwright_args.append("--isolated")
-            print(f" - Browser isolation enabled for agent: {agent_id}")
-
-        mcp_servers["playwright"] = {
-            "command": "npx",
-            "args": playwright_args,
-            "env": {
-                "NODE_COMPILE_CACHE": "",  # Disable V8 compile caching to prevent .node file accumulation in %TEMP%
-            },
-        }
-
     # Build environment overrides for API endpoint configuration
     # Uses get_effective_sdk_env() which reads provider settings from the database,
     # ensuring UI-configured alternative providers (GLM, Ollama, Kimi, Custom) propagate
lib/cli.js (43 lines changed)
@@ -517,6 +517,41 @@ function killProcess(pid) {
   }
 }
 
+// ---------------------------------------------------------------------------
+// Playwright CLI
+// ---------------------------------------------------------------------------
+
+/**
+ * Ensure playwright-cli is available globally for browser automation.
+ * Returns true if available (already installed or freshly installed).
+ *
+ * @param {boolean} showProgress - If true, print install progress
+ */
+function ensurePlaywrightCli(showProgress) {
+  try {
+    execSync('playwright-cli --version', {
+      timeout: 10_000,
+      stdio: ['pipe', 'pipe', 'pipe'],
+    });
+    return true;
+  } catch {
+    // Not installed — try to install
+  }
+
+  if (showProgress) {
+    log(' Installing playwright-cli for browser automation...');
+  }
+  try {
+    execSync('npm install -g @playwright/cli', {
+      timeout: 120_000,
+      stdio: ['pipe', 'pipe', 'pipe'],
+    });
+    return true;
+  } catch {
+    return false;
+  }
+}
+
 // ---------------------------------------------------------------------------
 // CLI commands
 // ---------------------------------------------------------------------------
@@ -613,6 +648,14 @@ function startServer(opts) {
   }
   const wasAlreadyReady = ensureVenv(python, repair);
 
+  // Ensure playwright-cli for browser automation (quick check, installs once)
+  if (!ensurePlaywrightCli(!wasAlreadyReady)) {
+    log('');
+    log(' Note: playwright-cli not available (browser automation will be limited)');
+    log(' Install manually: npm install -g @playwright/cli');
+    log('');
+  }
+
   // Step 3: Config file
   const configCreated = ensureEnvFile();
 
@@ -19,6 +19,7 @@
     "ui/dist/",
     "ui/package.json",
     ".claude/commands/",
+    ".claude/skills/",
     ".claude/templates/",
     "examples/",
     "start.py",
prompts.py (395 lines changed)
@@ -16,6 +16,9 @@ from pathlib import Path
 # Base templates location (generic templates)
 TEMPLATES_DIR = Path(__file__).parent / ".claude" / "templates"
 
+# Migration version - bump when adding new migration steps
+CURRENT_MIGRATION_VERSION = 1
+
 
 def get_project_prompts_dir(project_dir: Path) -> Path:
     """Get the prompts directory for a specific project."""
@@ -99,9 +102,9 @@ def _strip_browser_testing_sections(prompt: str) -> str:
         flags=re.DOTALL,
     )
 
-    # Replace the screenshots-only marking rule with YOLO-appropriate wording
+    # Replace the marking rule with YOLO-appropriate wording
     prompt = prompt.replace(
-        "**ONLY MARK A FEATURE AS PASSING AFTER VERIFICATION WITH SCREENSHOTS.**",
+        "**ONLY MARK A FEATURE AS PASSING AFTER VERIFICATION WITH BROWSER AUTOMATION.**",
         "**YOLO mode: Mark a feature as passing after lint/type-check succeeds and server starts cleanly.**",
     )
 
@@ -351,9 +354,70 @@ def scaffold_project_prompts(project_dir: Path) -> Path:
         except (OSError, PermissionError) as e:
             print(f" Warning: Could not copy allowed_commands.yaml: {e}")
 
+    # Copy Playwright CLI skill for browser automation
+    skills_src = Path(__file__).parent / ".claude" / "skills" / "playwright-cli"
+    skills_dest = project_dir / ".claude" / "skills" / "playwright-cli"
+    if skills_src.exists() and not skills_dest.exists():
+        try:
+            shutil.copytree(skills_src, skills_dest)
+            copied_files.append(".claude/skills/playwright-cli/")
+        except (OSError, PermissionError) as e:
+            print(f" Warning: Could not copy playwright-cli skill: {e}")
+
+    # Ensure .playwright-cli/ and .playwright/ are in project .gitignore
+    project_gitignore = project_dir / ".gitignore"
+    entries_to_add = [".playwright-cli/", ".playwright/"]
+    existing_lines: list[str] = []
+    if project_gitignore.exists():
+        try:
+            existing_lines = project_gitignore.read_text(encoding="utf-8").splitlines()
+        except (OSError, PermissionError):
+            pass
+    missing_entries = [e for e in entries_to_add if e not in existing_lines]
+    if missing_entries:
+        try:
+            with open(project_gitignore, "a", encoding="utf-8") as f:
+                # Add newline before entries if file doesn't end with one
+                if existing_lines and existing_lines[-1].strip():
+                    f.write("\n")
+                for entry in missing_entries:
+                    f.write(f"{entry}\n")
+        except (OSError, PermissionError) as e:
+            print(f" Warning: Could not update .gitignore: {e}")
+
+    # Scaffold .playwright/cli.config.json for browser settings
+    playwright_config_dir = project_dir / ".playwright"
+    playwright_config_file = playwright_config_dir / "cli.config.json"
+    if not playwright_config_file.exists():
+        try:
+            playwright_config_dir.mkdir(parents=True, exist_ok=True)
+            import json
+            config = {
+                "browser": {
+                    "browserName": "chromium",
+                    "launchOptions": {
+                        "channel": "chrome",
+                        "headless": True,
+                    },
+                    "contextOptions": {
+                        "viewport": {"width": 1280, "height": 720},
+                    },
+                    "isolated": True,
+                },
+            }
+            with open(playwright_config_file, "w", encoding="utf-8") as f:
+                json.dump(config, f, indent=2)
+                f.write("\n")
+            copied_files.append(".playwright/cli.config.json")
+        except (OSError, PermissionError) as e:
+            print(f" Warning: Could not create playwright config: {e}")
+
     if copied_files:
         print(f" Created project files: {', '.join(copied_files)}")
 
+    # Stamp new projects at the current migration version so they never trigger migration
+    _set_migration_version(project_dir, CURRENT_MIGRATION_VERSION)
+
     return project_prompts
 
 
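The `.gitignore` step in the hunk above only appends entries that are not already present, which is what makes it safe to run repeatedly. A minimal standalone sketch of that pattern follows; the helper name `ensure_gitignore_entries` and the temporary-directory driver are illustrative assumptions, not code from this commit.

```python
import tempfile
from pathlib import Path


def ensure_gitignore_entries(gitignore: Path, entries: list[str]) -> list[str]:
    """Append entries missing from a .gitignore file and return the ones added.

    Hypothetical helper: it mirrors the append-if-missing logic in the diff,
    but is not a function defined by the commit.
    """
    existing = gitignore.read_text(encoding="utf-8").splitlines() if gitignore.exists() else []
    missing = [e for e in entries if e not in existing]
    if missing:
        with open(gitignore, "a", encoding="utf-8") as f:
            # Write a separator when the last existing line is non-blank.
            if existing and existing[-1].strip():
                f.write("\n")
            for entry in missing:
                f.write(f"{entry}\n")
    return missing


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / ".gitignore"
    path.write_text("node_modules/\n", encoding="utf-8")
    wanted = [".playwright-cli/", ".playwright/"]
    first = ensure_gitignore_entries(path, wanted)   # both entries appended
    second = ensure_gitignore_entries(path, wanted)  # nothing left to add
    final_text = path.read_text(encoding="utf-8")
```

Because `splitlines()` discards the trailing newline, the separator check looks at whether the last line has content rather than at how the file ends, so a file that already ends with a newline gains one blank line before the new entries.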
@@ -425,3 +489,330 @@ def copy_spec_to_project(project_dir: Path) -> None:
             return
 
     print("Warning: No app_spec.txt found to copy to project directory")
+
+
+# ---------------------------------------------------------------------------
+# Project version migration
+# ---------------------------------------------------------------------------
+
+# Replacement content: coding_prompt.md STEP 5 section (Playwright CLI)
+_CLI_STEP5_CONTENT = """\
+### STEP 5: VERIFY WITH BROWSER AUTOMATION
+
+**CRITICAL:** You MUST verify features through the actual UI.
+
+Use `playwright-cli` for browser automation:
+
+- Open the browser: `playwright-cli open http://localhost:PORT`
+- Take a snapshot to see page elements: `playwright-cli snapshot`
+- Read the snapshot YAML file to see element refs
+- Click elements by ref: `playwright-cli click e5`
+- Type text: `playwright-cli type "search query"`
+- Fill form fields: `playwright-cli fill e3 "value"`
+- Take screenshots: `playwright-cli screenshot`
+- Read the screenshot file to verify visual appearance
+- Check console errors: `playwright-cli console`
+- Close browser when done: `playwright-cli close`
+
+**Token-efficient workflow:** `playwright-cli screenshot` and `snapshot` save files
+to `.playwright-cli/`. You will see a file link in the output. Read the file only
+when you need to verify visual appearance or find element refs.
+
+**DO:**
+- Test through the UI with clicks and keyboard input
+- Take screenshots and read them to verify visual appearance
+- Check for console errors with `playwright-cli console`
+- Verify complete user workflows end-to-end
+- Always run `playwright-cli close` when finished testing
+
+**DON'T:**
+- Only test with curl commands
+- Use JavaScript evaluation to bypass UI (`eval` and `run-code` are blocked)
+- Skip visual verification
+- Mark tests passing without thorough verification
+
+"""
+
+# Replacement content: coding_prompt.md BROWSER AUTOMATION reference section
+_CLI_BROWSER_SECTION = """\
+## BROWSER AUTOMATION
+
+Use `playwright-cli` commands for UI verification. Key commands: `open`, `goto`,
+`snapshot`, `click`, `type`, `fill`, `screenshot`, `console`, `close`.
+
+**How it works:** `playwright-cli` uses a persistent browser daemon. `open` starts it,
+subsequent commands interact via socket, `close` shuts it down. Screenshots and snapshots
+save to `.playwright-cli/` -- read the files when you need to verify content.
+
+Test like a human user with mouse and keyboard. Use `playwright-cli console` to detect
+JS errors. Don't bypass UI with JavaScript evaluation.
+
+"""
+
+# Replacement content: testing_prompt.md STEP 2 section (Playwright CLI)
+_CLI_TESTING_STEP2 = """\
+### STEP 2: VERIFY THE FEATURE
+
+**CRITICAL:** You MUST verify the feature through the actual UI using browser automation.
+
+For the feature returned:
+1. Read and understand the feature's verification steps
+2. Navigate to the relevant part of the application
+3. Execute each verification step using browser automation
+4. Take screenshots and read them to verify visual appearance
+5. Check for console errors
+
+### Browser Automation (Playwright CLI)
+
+**Navigation & Screenshots:**
+- `playwright-cli open <url>` - Open browser and navigate
+- `playwright-cli goto <url>` - Navigate to URL
+- `playwright-cli screenshot` - Save screenshot to `.playwright-cli/`
+- `playwright-cli snapshot` - Save page snapshot with element refs to `.playwright-cli/`
+
+**Element Interaction:**
+- `playwright-cli click <ref>` - Click elements (ref from snapshot)
+- `playwright-cli type <text>` - Type text
+- `playwright-cli fill <ref> <text>` - Fill form fields
+- `playwright-cli select <ref> <val>` - Select dropdown
+- `playwright-cli press <key>` - Keyboard input
+
+**Debugging:**
+- `playwright-cli console` - Check for JS errors
+- `playwright-cli network` - Monitor API calls
+
+**Cleanup:**
+- `playwright-cli close` - Close browser when done (ALWAYS do this)
+
+**Note:** Screenshots and snapshots save to files. Read the file to see the content.
+
+"""
+
+# Replacement content: testing_prompt.md AVAILABLE TOOLS browser subsection
+_CLI_TESTING_TOOLS = """\
+### Browser Automation (Playwright CLI)
+Use `playwright-cli` commands for browser interaction. Key commands:
+- `playwright-cli open <url>` - Open browser
+- `playwright-cli goto <url>` - Navigate to URL
+- `playwright-cli screenshot` - Take screenshot (saved to `.playwright-cli/`)
+- `playwright-cli snapshot` - Get page snapshot with element refs
+- `playwright-cli click <ref>` - Click element
+- `playwright-cli type <text>` - Type text
+- `playwright-cli fill <ref> <text>` - Fill form field
+- `playwright-cli console` - Check for JS errors
+- `playwright-cli close` - Close browser (always do this when done)
+
+"""
+
+
+def _get_migration_version(project_dir: Path) -> int:
+    """Read the migration version from .autoforge/.migration_version."""
+    from autoforge_paths import get_autoforge_dir
+    version_file = get_autoforge_dir(project_dir) / ".migration_version"
+    if not version_file.exists():
+        return 0
+    try:
+        return int(version_file.read_text().strip())
+    except (ValueError, OSError):
+        return 0
+
+
+def _set_migration_version(project_dir: Path, version: int) -> None:
+    """Write the migration version to .autoforge/.migration_version."""
+    from autoforge_paths import get_autoforge_dir
+    version_file = get_autoforge_dir(project_dir) / ".migration_version"
+    version_file.parent.mkdir(parents=True, exist_ok=True)
+    version_file.write_text(str(version))
+
+
+def _migrate_coding_prompt_to_cli(content: str) -> str:
+    """Replace MCP-based Playwright sections with CLI-based content in coding prompt."""
+    # Replace STEP 5 section (from header to just before STEP 5.5)
+    content = re.sub(
+        r"### STEP 5: VERIFY WITH BROWSER AUTOMATION.*?(?=### STEP 5\.5:)",
+        _CLI_STEP5_CONTENT,
+        content,
+        count=1,
+        flags=re.DOTALL,
+    )
+
+    # Replace BROWSER AUTOMATION reference section (from header to next ---)
+    content = re.sub(
+        r"## BROWSER AUTOMATION\n\n.*?(?=---)",
+        _CLI_BROWSER_SECTION,
+        content,
+        count=1,
+        flags=re.DOTALL,
+    )
+
+    # Replace inline screenshot rule
+    content = content.replace(
+        "**ONLY MARK A FEATURE AS PASSING AFTER VERIFICATION WITH SCREENSHOTS.**",
+        "**ONLY MARK A FEATURE AS PASSING AFTER VERIFICATION WITH BROWSER AUTOMATION.**",
+    )
+
+    # Replace inline screenshot references (various phrasings from old templates)
+    for old_phrase in (
+        "(inline only -- do NOT save to disk)",
+        "(inline only, never save to disk)",
+        "(inline mode only -- never save to disk)",
+    ):
+        content = content.replace(old_phrase, "(saved to `.playwright-cli/`)")
+
+    return content
+
+
+def _migrate_testing_prompt_to_cli(content: str) -> str:
+    """Replace MCP-based Playwright sections with CLI-based content in testing prompt."""
+    # Replace AVAILABLE TOOLS browser subsection FIRST (before STEP 2, to avoid
+    # matching the new CLI subsection header that the STEP 2 replacement inserts).
+    # In old prompts, ### Browser Automation (Playwright) only exists in AVAILABLE TOOLS.
+    content = re.sub(
+        r"### Browser Automation \(Playwright[^)]*\)\n.*?(?=---)",
+        _CLI_TESTING_TOOLS,
+        content,
+        count=1,
+        flags=re.DOTALL,
+    )
+
+    # Replace STEP 2 verification section (from header to just before STEP 3)
+    content = re.sub(
+        r"### STEP 2: VERIFY THE FEATURE.*?(?=### STEP 3:)",
+        _CLI_TESTING_STEP2,
+        content,
+        count=1,
+        flags=re.DOTALL,
+    )
+
+    # Replace inline screenshot references (various phrasings from old templates)
+    for old_phrase in (
+        "(inline only -- do NOT save to disk)",
+        "(inline only, never save to disk)",
+        "(inline mode only -- never save to disk)",
+    ):
+        content = content.replace(old_phrase, "(saved to `.playwright-cli/`)")
+
+    return content
+
+
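Both prompt migrators above rely on the same idiom: match lazily from a section header up to a lookahead for the next header, so the following section is left untouched. The sketch below isolates that idiom. The sample prompt text, the `replace_section` helper, and the wording after `### STEP 5.5:` are invented for illustration and do not come from the real templates.

```python
import re

# Invented sample standing in for an old coding prompt.
OLD_PROMPT = """\
### STEP 4: IMPLEMENT

Write the code.

### STEP 5: VERIFY WITH BROWSER AUTOMATION

Use browser_navigate and browser_take_screenshot.

### STEP 5.5: MANDATORY VERIFICATION CHECKLIST

Check everything.
"""

NEW_STEP5 = """\
### STEP 5: VERIFY WITH BROWSER AUTOMATION

Use `playwright-cli` for browser automation.

"""


def replace_section(text: str, start_header: str, next_header: str, replacement: str) -> str:
    """Replace everything from start_header up to (not including) next_header."""
    pattern = re.escape(start_header) + r".*?(?=" + re.escape(next_header) + r")"
    # A callable replacement is inserted verbatim, so backslashes in the new
    # text are not interpreted as group references or escapes.
    return re.sub(pattern, lambda _m: replacement, text, count=1, flags=re.DOTALL)


migrated = replace_section(
    OLD_PROMPT, "### STEP 5: VERIFY WITH BROWSER AUTOMATION", "### STEP 5.5:", NEW_STEP5
)
# Running it again matches the new section and swaps in identical text.
again = replace_section(
    migrated, "### STEP 5: VERIFY WITH BROWSER AUTOMATION", "### STEP 5.5:", NEW_STEP5
)
```

The diff passes the replacement as a plain string, which works there because the replacement constants contain no backslashes; the callable form is a defensive variation, not what the commit does. If the terminating header is missing, the pattern does not match and the text is returned unchanged.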
+def _migrate_v0_to_v1(project_dir: Path) -> list[str]:
+    """Migrate from v0 (MCP-based Playwright) to v1 (Playwright CLI).
+
+    Four idempotent sub-steps:
+    A. Copy playwright-cli skill to project
+    B. Scaffold .playwright/cli.config.json
+    C. Update .gitignore with .playwright-cli/ and .playwright/
+    D. Update coding_prompt.md and testing_prompt.md
+    """
+    import json
+
+    migrated: list[str] = []
+
+    # A. Copy Playwright CLI skill
+    skills_src = Path(__file__).parent / ".claude" / "skills" / "playwright-cli"
+    skills_dest = project_dir / ".claude" / "skills" / "playwright-cli"
+    if skills_src.exists() and not skills_dest.exists():
+        try:
+            shutil.copytree(skills_src, skills_dest)
+            migrated.append("Copied playwright-cli skill")
+        except (OSError, PermissionError) as e:
+            print(f" Warning: Could not copy playwright-cli skill: {e}")
+
+    # B. Scaffold .playwright/cli.config.json
+    playwright_config_dir = project_dir / ".playwright"
+    playwright_config_file = playwright_config_dir / "cli.config.json"
+    if not playwright_config_file.exists():
+        try:
+            playwright_config_dir.mkdir(parents=True, exist_ok=True)
+            config = {
+                "browser": {
+                    "browserName": "chromium",
+                    "launchOptions": {
+                        "channel": "chrome",
+                        "headless": True,
+                    },
+                    "contextOptions": {
+                        "viewport": {"width": 1280, "height": 720},
+                    },
+                    "isolated": True,
+                },
+            }
+            with open(playwright_config_file, "w", encoding="utf-8") as f:
+                json.dump(config, f, indent=2)
+                f.write("\n")
+            migrated.append("Created .playwright/cli.config.json")
+        except (OSError, PermissionError) as e:
+            print(f" Warning: Could not create playwright config: {e}")
+
+    # C. Update .gitignore
+    project_gitignore = project_dir / ".gitignore"
+    entries_to_add = [".playwright-cli/", ".playwright/"]
+    existing_lines: list[str] = []
+    if project_gitignore.exists():
+        try:
+            existing_lines = project_gitignore.read_text(encoding="utf-8").splitlines()
+        except (OSError, PermissionError):
+            pass
+    missing_entries = [e for e in entries_to_add if e not in existing_lines]
+    if missing_entries:
+        try:
+            with open(project_gitignore, "a", encoding="utf-8") as f:
+                if existing_lines and existing_lines[-1].strip():
+                    f.write("\n")
+                for entry in missing_entries:
+                    f.write(f"{entry}\n")
+            migrated.append(f"Added {', '.join(missing_entries)} to .gitignore")
+        except (OSError, PermissionError) as e:
+            print(f" Warning: Could not update .gitignore: {e}")
+
+    # D. Update prompts
+    prompts_dir = get_project_prompts_dir(project_dir)
+
+    # D1. Update coding_prompt.md
+    coding_prompt_path = prompts_dir / "coding_prompt.md"
+    if coding_prompt_path.exists():
+        try:
+            content = coding_prompt_path.read_text(encoding="utf-8")
+            if "Playwright MCP" in content or "browser_navigate" in content or "browser_take_screenshot" in content:
+                updated = _migrate_coding_prompt_to_cli(content)
+                if updated != content:
+                    coding_prompt_path.write_text(updated, encoding="utf-8")
+                    migrated.append("Updated coding_prompt.md to Playwright CLI")
+        except (OSError, PermissionError) as e:
+            print(f" Warning: Could not update coding_prompt.md: {e}")
+
+    # D2. Update testing_prompt.md
+    testing_prompt_path = prompts_dir / "testing_prompt.md"
+    if testing_prompt_path.exists():
+        try:
+            content = testing_prompt_path.read_text(encoding="utf-8")
+            if "browser_navigate" in content or "browser_take_screenshot" in content:
+                updated = _migrate_testing_prompt_to_cli(content)
+                if updated != content:
+                    testing_prompt_path.write_text(updated, encoding="utf-8")
+                    migrated.append("Updated testing_prompt.md to Playwright CLI")
+        except (OSError, PermissionError) as e:
+            print(f" Warning: Could not update testing_prompt.md: {e}")
+
+    return migrated
+
+
+def migrate_project_to_current(project_dir: Path) -> list[str]:
+    """Migrate an existing project to the current AutoForge version.
+
+    Idempotent - safe to call on every agent start. Returns list of
+    human-readable descriptions of what was migrated.
+    """
+    current = _get_migration_version(project_dir)
+    if current >= CURRENT_MIGRATION_VERSION:
+        return []
+
+    migrated: list[str] = []
+
+    if current < 1:
+        migrated.extend(_migrate_v0_to_v1(project_dir))
+
+    # Future: if current < 2: migrated.extend(_migrate_v1_to_v2(project_dir))
+
+    _set_migration_version(project_dir, CURRENT_MIGRATION_VERSION)
+    return migrated
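The version-stamp scheme above can be sketched as an ordered list of steps, where step N upgrades a project from version N to N + 1. This is one possible generalisation, not the commit's implementation: the step functions, the marker files, and a target version of 2 are all hypothetical (the diff defines a single step and `CURRENT_MIGRATION_VERSION = 1`).

```python
import tempfile
from pathlib import Path

CURRENT_VERSION = 2  # hypothetical target; the diff above uses 1


def read_version(version_file: Path) -> int:
    """Return the stamped version, treating a missing or corrupt file as 0."""
    try:
        return int(version_file.read_text(encoding="utf-8").strip())
    except (OSError, ValueError):
        return 0


def step_v0_to_v1(project: Path) -> list[str]:
    (project / "v1.marker").touch()  # stand-in for real migration work
    return ["applied v0 -> v1"]


def step_v1_to_v2(project: Path) -> list[str]:
    (project / "v2.marker").touch()  # stand-in for real migration work
    return ["applied v1 -> v2"]


# Index N holds the step that upgrades version N to version N + 1.
STEPS = [step_v0_to_v1, step_v1_to_v2]


def migrate(project: Path) -> list[str]:
    """Run every pending step in order, then stamp the current version."""
    version_file = project / ".migration_version"
    current = read_version(version_file)
    done: list[str] = []
    for step in STEPS[current:CURRENT_VERSION]:
        done.extend(step(project))
    if current < CURRENT_VERSION:
        version_file.write_text(str(CURRENT_VERSION), encoding="utf-8")
    return done


with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp)
    first_run = migrate(project)    # runs both steps
    second_run = migrate(project)   # already stamped, nothing to do
    (project / ".migration_version").write_text("1", encoding="utf-8")
    partial_run = migrate(project)  # only the v1 -> v2 step is pending
```

As in the diff, the stamp is written after the steps return, whether or not an individual step only logged a warning. Each step therefore has to tolerate being skipped on later runs, or the stamp would need to be withheld on failure.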
security.py (39 lines changed)
@@ -66,10 +66,12 @@ ALLOWED_COMMANDS = {
     "bash",
     # Script execution
     "init.sh",  # Init scripts; validated separately
+    # Browser automation
+    "playwright-cli",  # Playwright CLI for browser testing; validated separately
 }
 
 # Commands that need additional validation even when in the allowlist
-COMMANDS_NEEDING_EXTRA_VALIDATION = {"pkill", "chmod", "init.sh"}
+COMMANDS_NEEDING_EXTRA_VALIDATION = {"pkill", "chmod", "init.sh", "playwright-cli"}
 
 # Commands that are NEVER allowed, even with user approval
 # These commands can cause permanent system damage or security breaches
@@ -438,6 +440,37 @@ def validate_init_script(command_string: str) -> tuple[bool, str]:
     return False, f"Only ./init.sh is allowed, got: {script}"
 
 
+def validate_playwright_command(command_string: str) -> tuple[bool, str]:
+    """
+    Validate playwright-cli commands - block dangerous subcommands.
+
+    Blocks `run-code` (arbitrary Node.js execution) and `eval` (arbitrary JS
+    evaluation) which bypass the security sandbox.
+
+    Returns:
+        Tuple of (is_allowed, reason_if_blocked)
+    """
+    try:
+        tokens = shlex.split(command_string)
+    except ValueError:
+        return False, "Could not parse playwright-cli command"
+
+    if not tokens:
+        return False, "Empty command"
+
+    BLOCKED_SUBCOMMANDS = {"run-code", "eval"}
+
+    # Find the subcommand: first non-flag token after 'playwright-cli'
+    for token in tokens[1:]:
+        if token.startswith("-"):
+            continue  # skip flags like -s=agent-1
+        if token in BLOCKED_SUBCOMMANDS:
+            return False, f"playwright-cli '{token}' is not allowed"
+        break  # first non-flag token is the subcommand
+
+    return True, ""
+
+
 def matches_pattern(command: str, pattern: str) -> bool:
     """
     Check if a command matches a pattern.
@@ -955,5 +988,9 @@ async def bash_security_hook(input_data, tool_use_id=None, context=None):
                     allowed, reason = validate_init_script(cmd_segment)
                     if not allowed:
                         return {"decision": "block", "reason": reason}
+                elif cmd == "playwright-cli":
+                    allowed, reason = validate_playwright_command(cmd_segment)
+                    if not allowed:
+                        return {"decision": "block", "reason": reason}
 
     return {}
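To see how the subcommand check behaves on concrete inputs, the validation logic from the security.py hunk can be run standalone. The function below is copied from the diff with the blocked set moved to module level; the sample command strings are made up for illustration.

```python
import shlex

BLOCKED_SUBCOMMANDS = {"run-code", "eval"}


def validate_playwright_command(command_string: str) -> tuple[bool, str]:
    """Standalone copy of the check in the diff: inspect the first non-flag token."""
    try:
        tokens = shlex.split(command_string)
    except ValueError:
        return False, "Could not parse playwright-cli command"
    if not tokens:
        return False, "Empty command"
    for token in tokens[1:]:
        if token.startswith("-"):
            continue  # flags such as -s=agent-1 are skipped
        if token in BLOCKED_SUBCOMMANDS:
            return False, f"playwright-cli '{token}' is not allowed"
        break  # the first non-flag token is treated as the subcommand
    return True, ""


allowed_open = validate_playwright_command("playwright-cli open http://localhost:3000")
blocked_eval = validate_playwright_command("playwright-cli -s=agent-1 eval 'document.title'")
blocked_run = validate_playwright_command("playwright-cli run-code 'process.exit()'")
# "eval" as an argument to another subcommand is not the subcommand itself.
allowed_arg = validate_playwright_command("playwright-cli fill e3 eval")
# An unterminated quote makes shlex.split raise ValueError.
unparsable = validate_playwright_command("playwright-cli click 'e5")
# A flag value passed as its own token is mistaken for the subcommand.
space_flag = validate_playwright_command("playwright-cli -s agent-1 eval x")
```

The last case shows a limit of the first-non-flag heuristic: when a flag's value is a separate token, that value is taken as the subcommand and the real one is never inspected. Whether playwright-cli accepts flags in that form is not established by this diff, so treat it as a case to verify rather than a confirmed bypass.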
@@ -227,6 +227,28 @@ class AgentProcessManager:
         """Remove lock file."""
         self.lock_file.unlink(missing_ok=True)
 
+    def _apply_playwright_headless(self, headless: bool) -> None:
+        """Update .playwright/cli.config.json with the current headless setting.
+
+        playwright-cli reads this config file on each ``open`` command, so
+        updating it before the agent starts is sufficient.
+        """
+        config_file = self.project_dir / ".playwright" / "cli.config.json"
+        if not config_file.exists():
+            return
+        try:
+            import json
+            config = json.loads(config_file.read_text(encoding="utf-8"))
+            launch_opts = config.get("browser", {}).get("launchOptions", {})
+            if launch_opts.get("headless") == headless:
+                return  # already correct
+            launch_opts["headless"] = headless
+            config.setdefault("browser", {})["launchOptions"] = launch_opts
+            config_file.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
+            logger.info("Set playwright headless=%s for %s", headless, self.project_name)
+        except Exception:
+            logger.warning("Failed to update playwright config", exc_info=True)
+
     def _cleanup_stale_features(self) -> None:
         """Clear in_progress flag for all features when agent stops/crashes.
 
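The headless fix works by rewriting one nested key in `.playwright/cli.config.json` before the agent subprocess starts. A minimal sketch of that read-modify-write step, run against a temporary file, is shown below. The free function `apply_headless` and its boolean return value are assumptions made for the example; the method in the diff returns `None` and logs instead.

```python
import json
import tempfile
from pathlib import Path


def apply_headless(config_file: Path, headless: bool) -> bool:
    """Set browser.launchOptions.headless; return True if the file was rewritten.

    Hypothetical standalone variant of the method in the diff.
    """
    if not config_file.exists():
        return False
    config = json.loads(config_file.read_text(encoding="utf-8"))
    # setdefault creates the nested dicts when they are absent.
    launch_opts = config.setdefault("browser", {}).setdefault("launchOptions", {})
    if launch_opts.get("headless") == headless:
        return False  # already correct, leave the file untouched
    launch_opts["headless"] = headless
    config_file.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return True


with tempfile.TemporaryDirectory() as tmp:
    cfg = Path(tmp) / "cli.config.json"
    missing = apply_headless(cfg, False)  # no file yet, nothing to do
    initial = {
        "browser": {
            "launchOptions": {"channel": "chrome", "headless": True},
            "isolated": True,
        }
    }
    cfg.write_text(json.dumps(initial), encoding="utf-8")
    changed = apply_headless(cfg, False)    # flips headless to False
    unchanged = apply_headless(cfg, False)  # second call is a no-op
    result = json.loads(cfg.read_text(encoding="utf-8"))
```

Only the `headless` key is modified; sibling settings such as `channel` and `isolated` survive the round trip because the whole document is parsed and re-serialised.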
@@ -361,6 +383,15 @@ class AgentProcessManager:
         if not self._check_lock():
             return False, "Another agent instance is already running for this project"
 
+        # Clean up stale browser daemons from previous runs
+        try:
+            subprocess.run(
+                ["playwright-cli", "kill-all"],
+                timeout=5, capture_output=True,
+            )
+        except (subprocess.TimeoutExpired, FileNotFoundError, OSError):
+            pass
+
         # Clean up features stuck from a previous crash/stop
         self._cleanup_stale_features()
 
@@ -397,6 +428,10 @@ class AgentProcessManager:
         # Add --batch-size flag for multi-feature batching
         cmd.extend(["--batch-size", str(batch_size)])
 
+        # Apply headless setting to .playwright/cli.config.json so playwright-cli
+        # picks it up (the only mechanism it supports for headless control)
+        self._apply_playwright_headless(playwright_headless)
+
         try:
             # Start subprocess with piped stdout/stderr
             # Use project_dir as cwd so Claude SDK sandbox allows access to project files
@@ -409,7 +444,7 @@ class AgentProcessManager:
             subprocess_env = {
                 **os.environ,
                 "PYTHONUNBUFFERED": "1",
-                "PLAYWRIGHT_HEADLESS": "true" if playwright_headless else "false",
+                "PLAYWRIGHT_CLI_SESSION": f"agent-{self.project_name}-{os.getpid()}",
                 "NODE_COMPILE_CACHE": "",  # Disable V8 compile caching to prevent .node file accumulation in %TEMP%
                 **api_env,
             }
@@ -469,6 +504,15 @@ class AgentProcessManager:
             except asyncio.CancelledError:
                 pass
 
+        # Kill browser daemons before stopping agent
+        try:
+            subprocess.run(
+                ["playwright-cli", "kill-all"],
+                timeout=5, capture_output=True,
+            )
+        except (subprocess.TimeoutExpired, FileNotFoundError, OSError):
+            pass
+
         # CRITICAL: Kill entire process tree, not just orchestrator
         # This ensures all spawned coding/testing agents are also terminated
         proc = self.process  # Capture reference before async call
start.bat (10 lines changed)
@@ -54,5 +54,15 @@ REM Install dependencies
 echo Installing dependencies...
 pip install -r requirements.txt --quiet
 
+REM Ensure playwright-cli is available for browser automation
+where playwright-cli >nul 2>&1
+if %ERRORLEVEL% neq 0 (
+    echo Installing playwright-cli for browser automation...
+    call npm install -g @playwright/cli >nul 2>&1
+    if errorlevel 1 (
+        echo Note: Could not install playwright-cli. Install manually: npm install -g @playwright/cli
+    )
+)
+
 REM Run the app
 python start.py
start.sh (9 lines changed)
@@ -74,5 +74,14 @@ fi
 echo "Installing dependencies..."
 pip install -r requirements.txt --quiet
 
+# Ensure playwright-cli is available for browser automation
+if ! command -v playwright-cli &> /dev/null; then
+    echo "Installing playwright-cli for browser automation..."
+    npm install -g @playwright/cli --quiet 2>/dev/null
+    if [ $? -ne 0 ]; then
+        echo "Note: Could not install playwright-cli. Install manually: npm install -g @playwright/cli"
+    fi
+fi
+
 # Run the app
 python start.py
start_ui.bat (10 lines changed)
@@ -37,5 +37,15 @@ REM Install dependencies
 echo Installing dependencies...
 pip install -r requirements.txt --quiet
 
+REM Ensure playwright-cli is available for browser automation
+where playwright-cli >nul 2>&1
+if %ERRORLEVEL% neq 0 (
+    echo Installing playwright-cli for browser automation...
+    call npm install -g @playwright/cli >nul 2>&1
+    if errorlevel 1 (
+        echo Note: Could not install playwright-cli. Install manually: npm install -g @playwright/cli
+    )
+)
+
 REM Run the Python launcher
 python "%~dp0start_ui.py" %*
|
|||||||
@@ -80,5 +80,14 @@ fi
 echo "Installing dependencies..."
 pip install -r requirements.txt --quiet
 
+# Ensure playwright-cli is available for browser automation
+if ! command -v playwright-cli &> /dev/null; then
+    echo "Installing playwright-cli for browser automation..."
+    npm install -g @playwright/cli --quiet 2>/dev/null
+    if [ $? -ne 0 ]; then
+        echo "Note: Could not install playwright-cli. Install manually: npm install -g @playwright/cli"
+    fi
+fi
+
 # Run the Python launcher
 python start_ui.py "$@"
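The launcher hunks above all follow one pattern: look for `playwright-cli` on PATH, attempt a global npm install if it is missing, and print a note instead of aborting when the install fails. A minimal Python sketch of that pattern follows. The function name `ensure_playwright_cli` and its injectable `which`/`run` parameters are hypothetical and not part of this commit; they exist only so the logic can be exercised without touching npm.

```python
import shutil
import subprocess
from typing import Callable, Optional


def ensure_playwright_cli(
    which: Callable[[str], Optional[str]] = shutil.which,
    run: Callable[..., "subprocess.CompletedProcess"] = subprocess.run,
) -> bool:
    """Return True if playwright-cli is (or becomes) available.

    Hypothetical helper mirroring the shell/batch logic in the hunks above.
    """
    # Step 1: detection, the equivalent of `command -v` / `where`.
    if which("playwright-cli"):
        return True

    # Step 2: best-effort global install with output suppressed.
    print("Installing playwright-cli for browser automation...")
    try:
        result = run(
            ["npm", "install", "-g", "@playwright/cli"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=False,
        )
        ok = result.returncode == 0
    except OSError:
        # npm itself is missing or could not be executed.
        ok = False

    # Step 3: a failed install is reported but never raised.
    if not ok:
        print(
            "Note: Could not install playwright-cli. "
            "Install manually: npm install -g @playwright/cli"
        )
    return ok
```

The exit status is read immediately after the install call, which is the property the nested batch check has to preserve: inside a parenthesized `if` block, `%ERRORLEVEL%` is expanded when the block is parsed, not when the inner line runs.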
@@ -125,14 +125,18 @@ def cleanup_stale_temp(max_age_seconds: int = MAX_AGE_SECONDS) -> dict:
 
 def cleanup_project_screenshots(project_dir: Path, max_age_seconds: int = 300) -> dict:
     """
-    Clean up stale screenshot files from the project root.
+    Clean up stale Playwright CLI artifacts from the project.
 
-    Playwright browser verification can leave .png files in the project
-    directory. This removes them after they've aged out (default 5 minutes).
+    The Playwright CLI daemon saves screenshots, snapshots, and other artifacts
+    to `{project_dir}/.playwright-cli/`. This removes them after they've aged
+    out (default 5 minutes).
+
+    Also cleans up legacy screenshot patterns from the project root (from the
+    old Playwright MCP server approach).
 
     Args:
         project_dir: Path to the project directory.
-        max_age_seconds: Maximum age in seconds before a screenshot is deleted.
+        max_age_seconds: Maximum age in seconds before an artifact is deleted.
             Defaults to 5 minutes (300 seconds).
 
     Returns:
@@ -141,13 +145,33 @@ def cleanup_project_screenshots(project_dir: Path, max_age_seconds: int = 300) -
     cutoff_time = time.time() - max_age_seconds
     stats: dict = {"files_deleted": 0, "bytes_freed": 0, "errors": []}
 
-    screenshot_patterns = [
+    # Clean up .playwright-cli/ directory (new CLI approach)
+    playwright_cli_dir = project_dir / ".playwright-cli"
+    if playwright_cli_dir.exists():
+        for item in playwright_cli_dir.iterdir():
+            if not item.is_file():
+                continue
+            try:
+                mtime = item.stat().st_mtime
+                if mtime < cutoff_time:
+                    size = item.stat().st_size
+                    item.unlink(missing_ok=True)
+                    if not item.exists():
+                        stats["files_deleted"] += 1
+                        stats["bytes_freed"] += size
+                        logger.debug(f"Deleted playwright-cli artifact: {item}")
+            except Exception as e:
+                stats["errors"].append(f"Failed to delete {item}: {e}")
+                logger.debug(f"Failed to delete artifact {item}: {e}")
+
+    # Legacy cleanup: root-level screenshot patterns (from old MCP server approach)
+    legacy_patterns = [
         "feature*-*.png",
         "screenshot-*.png",
         "step-*.png",
     ]
 
-    for pattern in screenshot_patterns:
+    for pattern in legacy_patterns:
         for item in project_dir.glob(pattern):
             if not item.is_file():
                 continue
@@ -159,14 +183,14 @@ def cleanup_project_screenshots(project_dir: Path, max_age_seconds: int = 300) -
                     if not item.exists():
                         stats["files_deleted"] += 1
                         stats["bytes_freed"] += size
-                        logger.debug(f"Deleted project screenshot: {item}")
+                        logger.debug(f"Deleted legacy screenshot: {item}")
             except Exception as e:
                 stats["errors"].append(f"Failed to delete {item}: {e}")
                 logger.debug(f"Failed to delete screenshot {item}: {e}")
 
     if stats["files_deleted"] > 0:
         mb_freed = stats["bytes_freed"] / (1024 * 1024)
-        logger.info(f"Screenshot cleanup: {stats['files_deleted']} files, {mb_freed:.1f} MB freed")
+        logger.info(f"Artifact cleanup: {stats['files_deleted']} files, {mb_freed:.1f} MB freed")
 
     return stats
 
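The age cutoff used by the cleanup hunks above can be tried in isolation. The sketch below is a self-contained approximation, not the function from the diff: `remove_stale_files` and `demo` are hypothetical names, and the sketch covers only the directory sweep (no legacy glob patterns, no logging). It backdates one file with `os.utime` so the 300-second default has something to remove.

```python
import os
import tempfile
import time
from pathlib import Path


def remove_stale_files(directory: Path, max_age_seconds: int = 300) -> dict:
    """Delete regular files in `directory` whose mtime is older than the cutoff."""
    cutoff_time = time.time() - max_age_seconds
    stats: dict = {"files_deleted": 0, "bytes_freed": 0, "errors": []}
    if not directory.is_dir():
        return stats
    for item in directory.iterdir():
        if not item.is_file():
            continue
        try:
            info = item.stat()  # one stat call covers both mtime and size
            if info.st_mtime < cutoff_time:
                item.unlink(missing_ok=True)
                stats["files_deleted"] += 1
                stats["bytes_freed"] += info.st_size
        except OSError as e:
            stats["errors"].append(f"Failed to delete {item}: {e}")
    return stats


def demo() -> dict:
    """Create one stale and one fresh artifact, then run the sweep."""
    with tempfile.TemporaryDirectory() as tmp:
        artifacts = Path(tmp) / ".playwright-cli"
        artifacts.mkdir()
        old = artifacts / "old.png"
        new = artifacts / "new.png"
        old.write_bytes(b"x" * 10)
        new.write_bytes(b"y" * 20)
        # Backdate the first file by ten minutes so it falls past the cutoff.
        ten_minutes_ago = time.time() - 600
        os.utime(old, (ten_minutes_ago, ten_minutes_ago))
        stats = remove_stale_files(artifacts)
        stats["remaining"] = sorted(p.name for p in artifacts.iterdir())
        return stats
```

Calling `demo()` removes only the backdated file and leaves the fresh one in place. The sketch reads `stat()` once per file, whereas the diff calls it twice; both give the same result unless the file changes between the two calls.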
@@ -25,6 +25,7 @@ from security import (
     validate_chmod_command,
     validate_init_script,
     validate_pkill_command,
+    validate_playwright_command,
     validate_project_command,
 )
 
@@ -923,6 +924,70 @@ pkill_processes:
     return passed, failed
 
 
+def test_playwright_cli_validation():
+    """Test playwright-cli subcommand validation."""
+    print("\nTesting playwright-cli validation:\n")
+    passed = 0
+    failed = 0
+
+    # Test cases: (command, should_be_allowed, description)
+    test_cases = [
+        # Allowed cases
+        ("playwright-cli screenshot", True, "screenshot allowed"),
+        ("playwright-cli snapshot", True, "snapshot allowed"),
+        ("playwright-cli click e5", True, "click with ref"),
+        ("playwright-cli open http://localhost:3000", True, "open URL"),
+        ("playwright-cli -s=agent-1 click e5", True, "session flag with click"),
+        ("playwright-cli close", True, "close browser"),
+        ("playwright-cli goto http://localhost:3000/page", True, "goto URL"),
+        ("playwright-cli fill e3 'test value'", True, "fill form field"),
+        ("playwright-cli console", True, "console messages"),
+        # Blocked cases
+        ("playwright-cli run-code 'await page.evaluate(() => {})'", False, "run-code blocked"),
+        ("playwright-cli eval 'document.title'", False, "eval blocked"),
+        ("playwright-cli -s=test eval 'document.title'", False, "eval with session flag blocked"),
+    ]
+
+    for cmd, should_allow, description in test_cases:
+        allowed, reason = validate_playwright_command(cmd)
+        if allowed == should_allow:
+            print(f" PASS: {cmd!r} ({description})")
+            passed += 1
+        else:
+            expected = "allowed" if should_allow else "blocked"
+            actual = "allowed" if allowed else "blocked"
+            print(f" FAIL: {cmd!r} ({description})")
+            print(f" Expected: {expected}, Got: {actual}")
+            if reason:
+                print(f" Reason: {reason}")
+            failed += 1
+
+    # Integration test: verify through the security hook
+    print("\n Integration tests (via security hook):\n")
+
+    # playwright-cli screenshot should be allowed
+    input_data = {"tool_name": "Bash", "tool_input": {"command": "playwright-cli screenshot"}}
+    result = asyncio.run(bash_security_hook(input_data))
+    if result.get("decision") != "block":
+        print(" PASS: playwright-cli screenshot allowed via hook")
+        passed += 1
+    else:
+        print(f" FAIL: playwright-cli screenshot should be allowed: {result.get('reason')}")
+        failed += 1
+
+    # playwright-cli run-code should be blocked
+    input_data = {"tool_name": "Bash", "tool_input": {"command": "playwright-cli run-code 'code'"}}
+    result = asyncio.run(bash_security_hook(input_data))
+    if result.get("decision") == "block":
+        print(" PASS: playwright-cli run-code blocked via hook")
+        passed += 1
+    else:
+        print(" FAIL: playwright-cli run-code should be blocked via hook")
+        failed += 1
+
+    return passed, failed
+
+
 def main():
     print("=" * 70)
     print(" SECURITY HOOK TESTS")
@@ -991,6 +1056,11 @@ def main():
     passed += pkill_passed
     failed += pkill_failed
 
+    # Test playwright-cli validation
+    pw_passed, pw_failed = test_playwright_cli_validation()
+    passed += pw_passed
+    failed += pw_failed
+
     # Commands that SHOULD be blocked
     # Note: blocklisted commands (sudo, shutdown, dd, aws) are tested in
     # test_blocklist_enforcement(). chmod validation is tested in
@@ -1012,6 +1082,9 @@ def main():
         # Shell injection attempts
         "$(echo pkill) node",
         'eval "pkill node"',
+        # playwright-cli dangerous subcommands
+        "playwright-cli run-code 'await page.goto(\"http://evil.com\")'",
+        "playwright-cli eval 'document.cookie'",
     ]
 
     for cmd in dangerous:
@@ -1077,6 +1150,12 @@ def main():
         "/usr/local/bin/node app.js",
         # Combined chmod and init.sh (integration test for both validators)
         "chmod +x init.sh && ./init.sh",
+        # Playwright CLI allowed commands
+        "playwright-cli open http://localhost:3000",
+        "playwright-cli screenshot",
+        "playwright-cli snapshot",
+        "playwright-cli click e5",
+        "playwright-cli -s=agent-1 close",
     ]
 
     for cmd in safe:
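The tests above exercise `validate_playwright_command`, whose body lives in security.py and is not shown in this part of the diff. As an illustration of what the test cases require, here is a minimal allowlist sketch. Everything in it is an assumption: the name `validate_playwright_command_sketch`, the `ALLOWED_SUBCOMMANDS` set (built only from the subcommands the tests use), and the rule for skipping leading flags. The real validator may differ.

```python
import shlex
from typing import Tuple

# Assumption: only the subcommands that appear in the allowed test cases.
ALLOWED_SUBCOMMANDS = {
    "open", "goto", "click", "fill",
    "screenshot", "snapshot", "console", "close",
}


def validate_playwright_command_sketch(command: str) -> Tuple[bool, str]:
    """Return (allowed, reason) for a playwright-cli invocation.

    Hypothetical stand-in for the validator imported by the tests.
    """
    try:
        tokens = shlex.split(command)
    except ValueError:
        # Unbalanced quotes and similar: fail closed.
        return False, "Could not parse playwright-cli command"

    if not tokens or tokens[0] != "playwright-cli":
        return False, "Not a playwright-cli command"

    # Skip leading flags such as -s=agent-1; the first non-flag token is
    # treated as the subcommand. A space-separated flag value (-s agent-1)
    # would be read as the subcommand and rejected, which is the safe outcome.
    subcommand = next((t for t in tokens[1:] if not t.startswith("-")), None)
    if subcommand is None:
        return False, "playwright-cli requires a subcommand"

    if subcommand not in ALLOWED_SUBCOMMANDS:
        return False, f"playwright-cli subcommand '{subcommand}' is not allowed"
    return True, ""
```

An allowlist is used here rather than a blocklist of `eval` and `run-code`, so a subcommand added to the CLI later stays blocked until someone reviews it.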
@@ -75,6 +75,7 @@ export function ProjectSelector({
           variant="outline"
           className="min-w-[140px] sm:min-w-[200px] justify-between"
           disabled={isLoading}
+          title={selectedProjectData?.path}
         >
           {isLoading ? (
             <Loader2 size={18} className="animate-spin" />
@@ -101,6 +102,7 @@ export function ProjectSelector({
         {projects.map(project => (
           <DropdownMenuItem
             key={project.name}
+            title={project.path}
             className={`flex items-center justify-between cursor-pointer ${
               project.name === selectedProject ? 'bg-primary/10' : ''
             }`}