34 Commits

Author SHA1 Message Date
Auto
d846a021b8 fix: address PR #184 review findings for blocked-for-human-input feature
A) Graph view: add needs_human_input bucket to handleGraphNodeClick so
   clicking blocked nodes opens the feature modal
B) MCP validation: validate field type enum, require options for select,
   enforce unique non-empty field IDs and labels
C) Progress fallback: include needs_human_input in non-WebSocket total
D) WebSocket: track needs_human_input count in progress state
E) Cleanup guard: remove unnecessary needs_human_input check in
   _cleanup_stale_features (resolved via merge conflict)
F) Defensive SQL: require in_progress=1 in feature_request_human_input

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-12 07:36:48 +02:00
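Note on item B: the validation rules listed above could look roughly like the sketch below. It is only an illustration, not the project's code. The function name `validate_fields`, the dictionary keys (`id`, `label`, `type`, `options`), and the set of allowed types are assumptions, since the commit message does not show them.

```python
from typing import Any

# Hypothetical set of allowed field types; the real enum is not shown in the commit.
ALLOWED_FIELD_TYPES = {"text", "textarea", "select", "boolean"}


def _clean(value: Any) -> str:
    """Return a stripped string, or "" for anything that is not a string."""
    return value.strip() if isinstance(value, str) else ""


def validate_fields(fields: list[dict[str, Any]]) -> list[str]:
    """Return a list of validation errors; an empty list means the request is valid."""
    errors: list[str] = []
    seen_ids: set[str] = set()
    seen_labels: set[str] = set()

    for index, field in enumerate(fields):
        field_id = _clean(field.get("id"))
        label = _clean(field.get("label"))
        field_type = field.get("type")

        # IDs must be non-empty and unique across the form.
        if not field_id:
            errors.append(f"field {index}: id must be non-empty")
        elif field_id in seen_ids:
            errors.append(f"field {index}: duplicate id {field_id!r}")
        else:
            seen_ids.add(field_id)

        # Labels follow the same rule as IDs.
        if not label:
            errors.append(f"field {index}: label must be non-empty")
        elif label in seen_labels:
            errors.append(f"field {index}: duplicate label {label!r}")
        else:
            seen_labels.add(label)

        # The type must be one of the allowed enum values.
        if not (isinstance(field_type, str) and field_type in ALLOWED_FIELD_TYPES):
            errors.append(f"field {index}: unsupported type {field_type!r}")

        # A select field is unusable without options.
        if field_type == "select" and not field.get("options"):
            errors.append(f"field {index}: select requires options")

    return errors
```

Collecting all errors instead of raising on the first one lets the caller report every problem in a single response.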
Auto
819ebcd112 Merge remote-tracking branch 'origin/master' into feature/blocked-for-human-input
# Conflicts:
#	server/services/process_manager.py
2026-02-12 07:36:11 +02:00
Auto
f4636fdfd5 fix: handle pausing/draining states in UI guards and process cleanup
Follow-up fixes after merging PR #183 (graceful pause/drain mode):

- process_manager: _stream_output finally block now transitions from
  pausing/paused_graceful to crashed/stopped (not just running), and
  cleans up the drain signal file on process exit
- App.tsx: block Reset button and R shortcut during pausing/paused_graceful
- AgentThought/ProgressDashboard: keep thought bubble visible while pausing
- OrchestratorAvatar: add draining/paused cases to animation, glow, and
  description switch statements
- AgentMissionControl: show Draining/Paused badge text for new states
- registry.py: remove redundant type annotation to fix mypy no-redef
- process_manager.py: add type:ignore for SQLAlchemy Column assignment
- websocket.py: reclassify test-pass lines as 'testing' not 'success'
- review-pr.md: add post-review recommended action guidance

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-12 07:28:37 +02:00
Leon van Zyl
c114248b09 Merge pull request #183 from CaitlynByrne/feat/pause-drain
feat: add graceful pause (drain mode) for running agents
2026-02-12 07:22:01 +02:00
Auto
76dd4b8d80 version patch 2026-02-11 18:48:44 +02:00
Auto
4e84de3839 0.1.12 2026-02-11 18:48:21 +02:00
Auto
8a934c3374 fix: isolate Playwright CLI browser sessions per agent in parallel mode
Set unique PLAYWRIGHT_CLI_SESSION environment variable for each spawned
agent subprocess to prevent concurrent agents from sharing a single
browser instance and interfering with each other's navigation.

- _spawn_coding_agent: session named "coding-{feature_id}"
- _spawn_coding_agent_batch: session named "coding-{primary_id}"
- _spawn_testing_agent: session named "testing-{counter}" using an
  incrementing counter (since multiple testing agents can test
  overlapping features, feature ID alone isn't sufficient)

Previously, after migrating from Playwright MCP to CLI, all parallel
agents shared the default browser session, causing them to navigate
away from each other's pages.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-11 18:48:19 +02:00
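Note: the per-agent session naming described above can be sketched as follows. The session name formats and the `PLAYWRIGHT_CLI_SESSION` variable come from the commit message. The helper names, the module-level counter, and its starting value are assumptions made for illustration.

```python
import itertools
import os
from typing import Mapping, Optional

# Hypothetical process-wide counter; the commit only says it is "incrementing".
_testing_counter = itertools.count(1)


def coding_session_env(
    feature_id: int, base_env: Optional[Mapping[str, str]] = None
) -> dict[str, str]:
    """Copy the environment and give a coding agent its own browser session."""
    env = dict(os.environ if base_env is None else base_env)
    env["PLAYWRIGHT_CLI_SESSION"] = f"coding-{feature_id}"
    return env


def testing_session_env(
    base_env: Optional[Mapping[str, str]] = None
) -> dict[str, str]:
    """Copy the environment and give a testing agent a unique session.

    A counter is used instead of a feature ID because several testing
    agents may be working on overlapping features at the same time.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["PLAYWRIGHT_CLI_SESSION"] = f"testing-{next(_testing_counter)}"
    return env
```

The resulting dictionary would be passed as the `env` argument when spawning the agent subprocess, so the parent process environment is never mutated.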
Auto
81e8c37f29 feat: expose read-only MCP tools to all agent types, fix settings base URL handling
Add feature_get_ready, feature_get_blocked, and feature_get_graph to
CODING_AGENT_TOOLS, TESTING_AGENT_TOOLS, and INITIALIZER_AGENT_TOOLS.
These read-only tools were available on the MCP server but blocked by
the allowed_tools lists, causing "blocked/not allowed" errors when
agents tried to query project state.

Fix SettingsModal custom base URL input:
- Remove fallback to current settings value when saving, so empty input
  is not silently replaced with the existing URL
- Remove .trim() on the input value to prevent cursor jumping while typing
- Fix "Change" button pre-fill using empty string instead of space

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-11 17:09:22 +02:00
Leon van Zyl
6ffbf09b91 Merge pull request #190 from nogataka/feature/azure-claude-provider
feat: add Azure Anthropic (Claude) provider support
2026-02-11 16:59:35 +02:00
Auto
d1b0b73b20 version patch 2026-02-11 13:38:55 +02:00
Auto
9fb7926df1 0.1.11 2026-02-11 13:38:30 +02:00
Auto
e9873a2642 feat: migrate browser automation from Playwright MCP to CLI, fix headless setting
Major changes across 21 files (755 additions, 196 deletions):

Browser Automation Migration:
- Add versioned project migration system (prompts.py) with content-based
  detection and section-level regex replacement for coding/testing prompts
- Migrate STEP 5 (browser verification) and BROWSER AUTOMATION sections
  in coding prompt template to use playwright-cli commands
- Migrate STEP 2 and AVAILABLE TOOLS sections in testing prompt template
- Migration auto-runs at agent startup (autonomous_agent_demo.py), copies
  playwright-cli skill, scaffolds .playwright/cli.config.json, updates
  .gitignore, and stamps .migration_version file
- Add playwright-cli command validation to security allowlist (security.py)
  with tests for allowed subcommands and blocked eval/run-code

Headless Browser Setting Fix:
- Add _apply_playwright_headless() to process_manager.py that reads/updates
  .playwright/cli.config.json before agent subprocess launch
- Remove dead PLAYWRIGHT_HEADLESS env var that was never consumed
- Settings UI toggle now correctly controls visible browser window

Playwright CLI Auto-Install:
- Add ensurePlaywrightCli() to lib/cli.js for npm global entry point
- Add playwright-cli detection + npm install to start.bat, start.sh,
  start_ui.bat, start_ui.sh for all startup paths

Other Improvements:
- Add project folder path tooltip to ProjectSelector.tsx dropdown items
- Remove legacy Playwright MCP server configuration from client.py
- Update CLAUDE.md with playwright-cli skill documentation

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-11 13:37:03 +02:00
Auto
f285db1ad3 add playwright cli skill 2026-02-11 08:38:53 +02:00
nogataka
d2b3ba9aee feat: add Azure Anthropic (Claude) provider support
- Add "Azure Anthropic (Claude)" to API_PROVIDERS in registry.py
  with ANTHROPIC_API_KEY auth (required for Claude CLI to route
  through custom base URL instead of default Anthropic endpoint)
- Add Azure env var template to .env.example
- Show Base URL input field for Azure provider in Settings UI
  with "Configured" state and Azure-specific placeholder
- Widen Settings modal for better readability with long URLs
- Add Azure endpoint detection and "Azure Mode" log label
- Rename misleading "GLM Mode" fallback label to "Alternative API"

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-10 21:29:05 +09:00
Auto
55064945a4 version patch 2026-02-09 08:56:33 +02:00
Auto
859987e3b4 0.1.10 2026-02-09 08:55:49 +02:00
Auto
f87970daca fix: prevent temp file accumulation during long agent runs
Address three issues reported after overnight AutoForge runs:
1. ~193GB of .node files in %TEMP% from V8 compile caching
2. Stale npm artifact folders on drive root when %TEMP% fills up
3. PNG screenshot files left in project root by Playwright

Changes:
- Widen .node cleanup glob from ".78912*.node" to ".[0-9a-f]*.node"
  to match all V8 compile cache hex prefixes
- Add "node-compile-cache" directory to temp cleanup patterns
- Set NODE_COMPILE_CACHE="" in all subprocess environments (client.py,
  parallel_orchestrator.py, process_manager.py) to disable V8 compile
  caching at the source
- Add cleanup_project_screenshots() to remove stale .png files from
  project directories (feature*-*.png, screenshot-*.png, step-*.png)
- Run cleanup_stale_temp() at server startup in lifespan()
- Add _run_inter_session_cleanup() to orchestrator, called after each
  agent completes (both coding and testing paths)
- Update coding and testing prompt templates to instruct agents to use
  inline (base64) screenshots only, never saving files to disk

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-09 08:54:52 +02:00
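Note: the effect of the widened glob and the screenshot cleanup can be illustrated with the sketch below. The three glob patterns and the two `.node` patterns are quoted from the commit message. The helper names are hypothetical, and the sketch removes every matching file in the project root regardless of age, which may differ from what `cleanup_project_screenshots()` actually does.

```python
import fnmatch
from pathlib import Path

OLD_NODE_GLOB = ".78912*.node"     # pattern before the fix
NEW_NODE_GLOB = ".[0-9a-f]*.node"  # pattern after the fix
SCREENSHOT_GLOBS = ("feature*-*.png", "screenshot-*.png", "step-*.png")


def matching_names(names: list[str], pattern: str) -> list[str]:
    """Return the file names that match a shell-style glob pattern."""
    return [name for name in names if fnmatch.fnmatchcase(name, pattern)]


def remove_project_screenshots(project_dir: Path) -> list[str]:
    """Delete screenshot files in the project root and return their names."""
    removed: set[str] = set()
    for pattern in SCREENSHOT_GLOBS:
        for path in project_dir.glob(pattern):
            if path.is_file():
                path.unlink(missing_ok=True)
                removed.add(path.name)
    return sorted(removed)
```

With sample names such as `.78912abc.node` and `.a1b2c3d4.node`, the old pattern matches only the first, while the new one matches both, which is the gap the commit closes.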
Caitlyn Byrne
656df0fd9a feat: add "blocked for human input" feature across full stack
Agents can now request structured human input when they encounter
genuine blockers (API keys, design choices, external configs). The
request is displayed in the UI with a dynamic form, and the human's
response is stored and made available when the agent resumes.

Changes span 21 files + 1 new component:
- Database: 3 new columns (needs_human_input, human_input_request,
  human_input_response) with migration
- MCP: new feature_request_human_input tool + guards on existing tools
- API: new resolve-human-input endpoint, 4th feature bucket
- Orchestrator: skip needs_human_input features in scheduling
- Progress: 4-tuple return from count_passing_tests
- WebSocket: needs_human_input count in progress messages
- UI: conditional 4th Kanban column, HumanInputForm component,
  amber status indicators, dependency graph support

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-08 14:11:35 -05:00
Caitlyn Byrne
9721368188 feat: add graceful pause (drain mode) for running agents
File-based signal (.pause_drain) lets the orchestrator finish current
work before pausing instead of hard-freezing the process tree.  New
status states pausing/paused_graceful flow through WebSocket to the UI
where a Pause button, draining indicator, and Resume button are shown.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-08 13:37:22 -05:00
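Note: a file-based drain signal of this kind can be sketched as below. The `.pause_drain` file name and the `pausing` / `paused_graceful` status names come from the commit message. The helper functions and the transition rule in `next_status` are assumptions for illustration, not the orchestrator's actual logic.

```python
from pathlib import Path

DRAIN_SIGNAL = ".pause_drain"  # file name taken from the commit message


def request_drain(project_dir: Path) -> None:
    """Ask the orchestrator to pause once current work is finished."""
    (project_dir / DRAIN_SIGNAL).touch()


def drain_requested(project_dir: Path) -> bool:
    """Check whether a drain has been requested."""
    return (project_dir / DRAIN_SIGNAL).exists()


def clear_drain(project_dir: Path) -> None:
    """Remove the signal file, for example on resume or process exit."""
    (project_dir / DRAIN_SIGNAL).unlink(missing_ok=True)


def next_status(current: str, project_dir: Path, active_agents: int) -> str:
    """Hypothetical transition: keep draining while agents are still working."""
    if not drain_requested(project_dir):
        return current
    return "pausing" if active_agents > 0 else "paused_graceful"
```

A signal file keeps the mechanism independent of how the orchestrator process was launched, at the cost of needing cleanup when the process exits.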
Auto
9eb08d3f71 version patch 2026-02-08 15:51:11 +02:00
Auto
8d76deb75f 0.1.9 2026-02-08 15:50:50 +02:00
Auto
3a31761542 ui: add resizable drag handle to assistant chat panel
Add a draggable resize handle on the left edge of the AI assistant
panel, allowing users to adjust the panel width by clicking and
dragging. Width is persisted to localStorage across sessions.

- Drag handle with hover highlight (border -> primary color)
- Min width 300px, max width 90vw
- Width saved to localStorage under 'assistant-panel-width'
- Cursor changes to col-resize and text selection disabled during drag

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-08 15:45:21 +02:00
Auto
96feb38aea ui: restructure header navbar into two-row responsive layout
Redesign the header from a single overflowing row into a clean two-row
layout that prevents content from overlapping the logo and bleeding
outside the navbar on smaller screens.

Row 1: Logo + project selector + spacer + mode badges + utility icons
Row 2: Agent controls + dev server + spacer + settings + reset
(only rendered when a project is selected, with a subtle border divider)

Changes:
- App.tsx: Split header into two logical rows with flex spacers for
  right-alignment; hide title text below md breakpoint; move mode
  badges (Ollama/GLM) to row 1 with sm:hidden for small screens
- ProjectSelector: Responsive min-width (140px mobile, 200px desktop);
  truncate long project names instead of pushing icons off-screen
- AgentControl: Responsive gap (gap-2 mobile, gap-4 desktop)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-08 15:41:17 +02:00
Auto
1925818d49 feat: fix tooltip shortcuts and add dev server config dialog
Tooltip fixes (PR #177 follow-up):
- Remove duplicate title attr on Settings button that caused double-tooltip
- Restore keyboard shortcut hints in tooltip text: Settings (,), Reset (R)
- Clean up spurious peer markers in package-lock.json

Dev server config dialog:
- Add DevServerConfigDialog component for custom dev commands
- Open config dialog automatically when start fails with "no dev command"
- Add useDevServerConfig/useUpdateDevServerConfig hooks
- Add updateDevServerConfig API function
- Add config gear button next to dev server start

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-08 15:29:44 +02:00
Leon van Zyl
38fc8788a2 Merge pull request #177 from brainit-consulting/feat/navbar-tooltips
ui: add Radix tooltips to header icons
2026-02-08 15:26:28 +02:00
Emile du Toit
b439e2d241 ui: add Radix tooltips to header icons 2026-02-07 19:56:59 -05:00
Auto
b0490be501 version patch 2026-02-06 15:27:09 +02:00
Auto
13a3ff9ac1 0.1.8 2026-02-06 15:26:48 +02:00
Auto
71f17c73c2 feat: add structured questions (AskUserQuestion) to assistant chat
Add interactive multiple-choice question support to the project assistant,
allowing it to present clickable options when clarification is needed.

Backend changes:
- Add ask_user MCP tool to feature_mcp.py with input validation
- Add mcp__features__ask_user to assistant allowed tools list
- Intercept ask_user tool calls in _query_claude() to yield question messages
- Add answer WebSocket message handler in assistant_chat router
- Document ask_user tool in assistant system prompt

Frontend changes:
- Add AssistantChatQuestionMessage type and update server message union
- Add currentQuestions state and sendAnswer() to useAssistantChat hook
- Handle question WebSocket messages by attaching to last assistant message
- Render QuestionOptions component between messages and input area
- Disable text input while structured questions are active

Flow: Claude calls ask_user → backend intercepts → WebSocket question message →
frontend renders QuestionOptions → user clicks options → answer sent back →
Claude receives formatted answer and continues conversation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-06 15:26:36 +02:00
Auto
46ac373748 0.1.7 2026-02-06 14:37:42 +02:00
Auto
0d04a062a2 feat: add full markdown rendering to chat messages
Replace the custom BOLD_REGEX parser in ChatMessage.tsx with
react-markdown + remark-gfm for proper rendering of headers, tables,
lists, code blocks, blockquotes, links, and horizontal rules in all
chat UIs (AssistantChat, SpecCreationChat, ExpandProjectChat).

Changes:
- Add react-markdown and remark-gfm dependencies
- Add vendor-markdown chunk to Vite manual chunks for code splitting
- Add .chat-prose CSS class with styles for all markdown elements
- Add .chat-prose-user modifier for contrast on primary-colored bubbles
- Replace line-splitting + regex logic with ReactMarkdown component
- Links open in new tabs via custom component override
- System messages remain plain text (unchanged)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-06 14:37:39 +02:00
Auto
7d08700f3a version patch 2026-02-06 13:41:17 +02:00
Auto
5ecf74cb31 0.1.6 2026-02-06 13:40:53 +02:00
Auto
9259a799e3 fix: propagate alternative API provider settings to agent subprocesses
When users configured GLM/Ollama/Kimi via the Settings UI, agents still
used Claude because conflicting env vars leaked through subprocess env.

Root cause: get_effective_sdk_env() set ANTHROPIC_AUTH_TOKEN for GLM but
didn't clear ANTHROPIC_API_KEY, which leaked from os.environ. The CLI
prioritized the wrong credential.

Changes:
- registry.py: Clear conflicting auth vars (API_KEY vs AUTH_TOKEN) and
  Vertex AI vars when building env for alternative providers
- client.py: Replace manual os.getenv() loop with get_effective_sdk_env()
  so agent SDK reads provider settings from the database
- autonomous_agent_demo.py: Apply UI-configured provider settings to
  process env so CLI-launched agents also respect Settings UI config
- start.py: Pass --model from settings when launching agent subprocess
- server/schemas.py: Allow non-Claude model names when an alternative
  provider is configured (prevents 422 errors for glm-4.7, etc.)
- .env.example: Document env vars for GLM, Ollama, and Kimi providers

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-06 13:38:36 +02:00
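Note: the root cause and fix described above amount to removing every competing credential before setting the one the provider needs. The sketch below shows that idea under stated assumptions: the function name `effective_env` and its parameters are hypothetical, and the list of Vertex AI variables is a guess at what "Vertex AI vars" covers, not a copy of `get_effective_sdk_env()`.

```python
from typing import Mapping

AUTH_VARS = ("ANTHROPIC_API_KEY", "ANTHROPIC_AUTH_TOKEN")
# Assumed list of Vertex AI related variables; the commit does not enumerate them.
VERTEX_VARS = (
    "CLAUDE_CODE_USE_VERTEX",
    "ANTHROPIC_VERTEX_PROJECT_ID",
    "CLOUD_ML_REGION",
)


def effective_env(
    base_env: Mapping[str, str],
    auth_var: str,
    credential: str,
    base_url: str,
) -> dict[str, str]:
    """Build a subprocess environment for an alternative provider."""
    if auth_var not in AUTH_VARS:
        raise ValueError(f"unknown auth variable: {auth_var}")

    env = dict(base_env)

    # Drop anything inherited from the parent process that could win
    # over the provider's own credential.
    for name in AUTH_VARS + VERTEX_VARS:
        env.pop(name, None)

    env[auth_var] = credential
    env["ANTHROPIC_BASE_URL"] = base_url
    return env
```

Clearing both auth variables first and then setting exactly one makes the result independent of whatever happened to be exported in the parent shell.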
77 changed files with 5729 additions and 494 deletions


@@ -72,4 +72,21 @@ Pull request(s): $ARGUMENTS
- What this PR is actually about (one sentence)
- The key concerns, if any (or "no significant concerns")
- **Verdict: MERGE** / **MERGE (with minor follow-up)** / **DON'T MERGE** with a one-line reason
- This section should be scannable in under 10 seconds
10. **Post-Review Action**
- Immediately after the TLDR, provide a `## Recommended Action` section
- Based on the verdict, recommend one of the following actions:
**If verdict is MERGE (no concerns):**
- Recommend merging as-is. No further action needed.
**If verdict is MERGE (with minor follow-up):**
- If the concerns are low-risk and straightforward to fix (e.g., naming tweaks, small refactors, missing type annotations, minor style issues, trivial bug fixes), recommend merging the PR now and offer to immediately address the concerns in a follow-up commit directly on the target branch
- List the specific changes you would make in the follow-up
- Ask the user: *"Should I merge this PR and push a follow-up commit addressing these concerns?"*
**If verdict is DON'T MERGE:**
- If the blocking concerns are still relatively contained and you are confident you can resolve them quickly (e.g., a small bug fix, a missing validation, a straightforward architectural adjustment), recommend merging the PR and immediately addressing the issues in a follow-up commit — but only if the fixes are low-risk and well-understood
- If the issues are too complex, risky, or require author input (e.g., design decisions, major refactors, unclear intent), recommend sending the PR back to the author with specific feedback on what needs to change
- Be honest about your confidence level — if you're unsure whether you can address the concerns correctly, say so and defer to the author


@@ -0,0 +1,259 @@
---
name: playwright-cli
description: Automates browser interactions for web testing, form filling, screenshots, and data extraction. Use when the user needs to navigate websites, interact with web pages, fill forms, take screenshots, test web applications, or extract information from web pages.
allowed-tools: Bash(playwright-cli:*)
---
# Browser Automation with playwright-cli
## Quick start
```bash
# open new browser
playwright-cli open
# navigate to a page
playwright-cli goto https://playwright.dev
# interact with the page using refs from the snapshot
playwright-cli click e15
playwright-cli type "page.click"
playwright-cli press Enter
# take a screenshot
playwright-cli screenshot
# close the browser
playwright-cli close
```
## Commands
### Core
```bash
playwright-cli open
# open and navigate right away
playwright-cli open https://example.com/
playwright-cli goto https://playwright.dev
playwright-cli type "search query"
playwright-cli click e3
playwright-cli dblclick e7
playwright-cli fill e5 "user@example.com"
playwright-cli drag e2 e8
playwright-cli hover e4
playwright-cli select e9 "option-value"
playwright-cli upload ./document.pdf
playwright-cli check e12
playwright-cli uncheck e12
playwright-cli snapshot
playwright-cli snapshot --filename=after-click.yaml
playwright-cli eval "document.title"
playwright-cli eval "el => el.textContent" e5
playwright-cli dialog-accept
playwright-cli dialog-accept "confirmation text"
playwright-cli dialog-dismiss
playwright-cli resize 1920 1080
playwright-cli close
```
### Navigation
```bash
playwright-cli go-back
playwright-cli go-forward
playwright-cli reload
```
### Keyboard
```bash
playwright-cli press Enter
playwright-cli press ArrowDown
playwright-cli keydown Shift
playwright-cli keyup Shift
```
### Mouse
```bash
playwright-cli mousemove 150 300
playwright-cli mousedown
playwright-cli mousedown right
playwright-cli mouseup
playwright-cli mouseup right
playwright-cli mousewheel 0 100
```
### Save as
```bash
playwright-cli screenshot
playwright-cli screenshot e5
playwright-cli screenshot --filename=page.png
playwright-cli pdf --filename=page.pdf
```
### Tabs
```bash
playwright-cli tab-list
playwright-cli tab-new
playwright-cli tab-new https://example.com/page
playwright-cli tab-close
playwright-cli tab-close 2
playwright-cli tab-select 0
```
### Storage
```bash
playwright-cli state-save
playwright-cli state-save auth.json
playwright-cli state-load auth.json
# Cookies
playwright-cli cookie-list
playwright-cli cookie-list --domain=example.com
playwright-cli cookie-get session_id
playwright-cli cookie-set session_id abc123
playwright-cli cookie-set session_id abc123 --domain=example.com --httpOnly --secure
playwright-cli cookie-delete session_id
playwright-cli cookie-clear
# LocalStorage
playwright-cli localstorage-list
playwright-cli localstorage-get theme
playwright-cli localstorage-set theme dark
playwright-cli localstorage-delete theme
playwright-cli localstorage-clear
# SessionStorage
playwright-cli sessionstorage-list
playwright-cli sessionstorage-get step
playwright-cli sessionstorage-set step 3
playwright-cli sessionstorage-delete step
playwright-cli sessionstorage-clear
```
### Network
```bash
playwright-cli route "**/*.jpg" --status=404
playwright-cli route "https://api.example.com/**" --body='{"mock": true}'
playwright-cli route-list
playwright-cli unroute "**/*.jpg"
playwright-cli unroute
```
### DevTools
```bash
playwright-cli console
playwright-cli console warning
playwright-cli network
playwright-cli run-code "async page => await page.context().grantPermissions(['geolocation'])"
playwright-cli tracing-start
playwright-cli tracing-stop
playwright-cli video-start
playwright-cli video-stop video.webm
```
### Install
```bash
playwright-cli install --skills
playwright-cli install-browser
```
### Configuration
```bash
# Use specific browser when creating session
playwright-cli open --browser=chrome
playwright-cli open --browser=firefox
playwright-cli open --browser=webkit
playwright-cli open --browser=msedge
# Connect to browser via extension
playwright-cli open --extension
# Use persistent profile (by default profile is in-memory)
playwright-cli open --persistent
# Use persistent profile with custom directory
playwright-cli open --profile=/path/to/profile
# Start with config file
playwright-cli open --config=my-config.json
# Close the browser
playwright-cli close
# Delete user data for the default session
playwright-cli delete-data
```
### Browser Sessions
```bash
# create new browser session named "mysession" with persistent profile
playwright-cli -s=mysession open example.com --persistent
# same with manually specified profile directory (use when requested explicitly)
playwright-cli -s=mysession open example.com --profile=/path/to/profile
playwright-cli -s=mysession click e6
playwright-cli -s=mysession close # stop a named browser
playwright-cli -s=mysession delete-data # delete user data for persistent session
playwright-cli list
# Close all browsers
playwright-cli close-all
# Forcefully kill all browser processes
playwright-cli kill-all
```
## Example: Form submission
```bash
playwright-cli open https://example.com/form
playwright-cli snapshot
playwright-cli fill e1 "user@example.com"
playwright-cli fill e2 "password123"
playwright-cli click e3
playwright-cli snapshot
playwright-cli close
```
## Example: Multi-tab workflow
```bash
playwright-cli open https://example.com
playwright-cli tab-new https://example.com/other
playwright-cli tab-list
playwright-cli tab-select 0
playwright-cli snapshot
playwright-cli close
```
## Example: Debugging with DevTools
```bash
playwright-cli open https://example.com
playwright-cli click e4
playwright-cli fill e7 "test"
playwright-cli console
playwright-cli network
playwright-cli close
```
```bash
playwright-cli open https://example.com
playwright-cli tracing-start
playwright-cli click e4
playwright-cli fill e7 "test"
playwright-cli tracing-stop
playwright-cli close
```
## Specific tasks
* **Request mocking** [references/request-mocking.md](references/request-mocking.md)
* **Running Playwright code** [references/running-code.md](references/running-code.md)
* **Browser session management** [references/session-management.md](references/session-management.md)
* **Storage state (cookies, localStorage)** [references/storage-state.md](references/storage-state.md)
* **Test generation** [references/test-generation.md](references/test-generation.md)
* **Tracing** [references/tracing.md](references/tracing.md)
* **Video recording** [references/video-recording.md](references/video-recording.md)


@@ -0,0 +1,87 @@
# Request Mocking
Intercept, mock, modify, and block network requests.
## CLI Route Commands
```bash
# Mock with custom status
playwright-cli route "**/*.jpg" --status=404
# Mock with JSON body
playwright-cli route "**/api/users" --body='[{"id":1,"name":"Alice"}]' --content-type=application/json
# Mock with custom headers
playwright-cli route "**/api/data" --body='{"ok":true}' --header="X-Custom: value"
# Remove headers from requests
playwright-cli route "**/*" --remove-header=cookie,authorization
# List active routes
playwright-cli route-list
# Remove a route or all routes
playwright-cli unroute "**/*.jpg"
playwright-cli unroute
```
## URL Patterns
```
**/api/users - Exact path match
**/api/*/details - Wildcard in path
**/*.{png,jpg,jpeg} - Match file extensions
**/search?q=* - Match query parameters
```
## Advanced Mocking with run-code
For conditional responses, request body inspection, response modification, or delays:
### Conditional Response Based on Request
```bash
playwright-cli run-code "async page => {
  await page.route('**/api/login', route => {
    const body = route.request().postDataJSON();
    if (body.username === 'admin') {
      route.fulfill({ body: JSON.stringify({ token: 'mock-token' }) });
    } else {
      route.fulfill({ status: 401, body: JSON.stringify({ error: 'Invalid' }) });
    }
  });
}"
```
### Modify Real Response
```bash
playwright-cli run-code "async page => {
  await page.route('**/api/user', async route => {
    const response = await route.fetch();
    const json = await response.json();
    json.isPremium = true;
    await route.fulfill({ response, json });
  });
}"
```
### Simulate Network Failures
```bash
playwright-cli run-code "async page => {
await page.route('**/api/offline', route => route.abort('internetdisconnected'));
}"
# Options: connectionrefused, timedout, connectionreset, internetdisconnected
```
### Delayed Response
```bash
playwright-cli run-code "async page => {
  await page.route('**/api/slow', async route => {
    await new Promise(r => setTimeout(r, 3000));
    route.fulfill({ body: JSON.stringify({ data: 'loaded' }) });
  });
}"
```


@@ -0,0 +1,232 @@
# Running Custom Playwright Code
Use `run-code` to execute arbitrary Playwright code for advanced scenarios not covered by CLI commands.
## Syntax
```bash
playwright-cli run-code "async page => {
// Your Playwright code here
// Access page.context() for browser context operations
}"
```
## Geolocation
```bash
# Grant geolocation permission and set location
playwright-cli run-code "async page => {
await page.context().grantPermissions(['geolocation']);
await page.context().setGeolocation({ latitude: 37.7749, longitude: -122.4194 });
}"
# Set location to London
playwright-cli run-code "async page => {
await page.context().grantPermissions(['geolocation']);
await page.context().setGeolocation({ latitude: 51.5074, longitude: -0.1278 });
}"
# Clear geolocation override
playwright-cli run-code "async page => {
await page.context().clearPermissions();
}"
```
## Permissions
```bash
# Grant multiple permissions
playwright-cli run-code "async page => {
await page.context().grantPermissions([
'geolocation',
'notifications',
'camera',
'microphone'
]);
}"
# Grant permissions for specific origin
playwright-cli run-code "async page => {
await page.context().grantPermissions(['clipboard-read'], {
origin: 'https://example.com'
});
}"
```
## Media Emulation
```bash
# Emulate dark color scheme
playwright-cli run-code "async page => {
await page.emulateMedia({ colorScheme: 'dark' });
}"
# Emulate light color scheme
playwright-cli run-code "async page => {
await page.emulateMedia({ colorScheme: 'light' });
}"
# Emulate reduced motion
playwright-cli run-code "async page => {
await page.emulateMedia({ reducedMotion: 'reduce' });
}"
# Emulate print media
playwright-cli run-code "async page => {
await page.emulateMedia({ media: 'print' });
}"
```
## Wait Strategies
```bash
# Wait for network idle
playwright-cli run-code "async page => {
await page.waitForLoadState('networkidle');
}"
# Wait for specific element
playwright-cli run-code "async page => {
await page.waitForSelector('.loading', { state: 'hidden' });
}"
# Wait for function to return true
playwright-cli run-code "async page => {
await page.waitForFunction(() => window.appReady === true);
}"
# Wait with timeout
playwright-cli run-code "async page => {
await page.waitForSelector('.result', { timeout: 10000 });
}"
```
## Frames and Iframes
```bash
# Work with iframe
playwright-cli run-code "async page => {
const frame = page.locator('iframe#my-iframe').contentFrame();
await frame.locator('button').click();
}"
# Get all frames
playwright-cli run-code "async page => {
const frames = page.frames();
return frames.map(f => f.url());
}"
```
## File Downloads
```bash
# Handle file download
playwright-cli run-code "async page => {
const [download] = await Promise.all([
page.waitForEvent('download'),
page.click('a.download-link')
]);
await download.saveAs('./downloaded-file.pdf');
return download.suggestedFilename();
}"
```
## Clipboard
```bash
# Read clipboard (requires permission)
playwright-cli run-code "async page => {
await page.context().grantPermissions(['clipboard-read']);
return await page.evaluate(() => navigator.clipboard.readText());
}"
# Write to clipboard
playwright-cli run-code "async page => {
await page.evaluate(text => navigator.clipboard.writeText(text), 'Hello clipboard!');
}"
```
## Page Information
```bash
# Get page title
playwright-cli run-code "async page => {
return await page.title();
}"
# Get current URL
playwright-cli run-code "async page => {
return page.url();
}"
# Get page content
playwright-cli run-code "async page => {
return await page.content();
}"
# Get viewport size
playwright-cli run-code "async page => {
return page.viewportSize();
}"
```
## JavaScript Execution
```bash
# Execute JavaScript and return result
playwright-cli run-code "async page => {
return await page.evaluate(() => {
return {
userAgent: navigator.userAgent,
language: navigator.language,
cookiesEnabled: navigator.cookieEnabled
};
});
}"
# Pass arguments to evaluate
playwright-cli run-code "async page => {
const multiplier = 5;
return await page.evaluate(m => document.querySelectorAll('li').length * m, multiplier);
}"
```
## Error Handling
```bash
# Try-catch in run-code
playwright-cli run-code "async page => {
  try {
    await page.click('.maybe-missing', { timeout: 1000 });
    return 'clicked';
  } catch (e) {
    return 'element not found';
  }
}"
```
## Complex Workflows
```bash
# Login and save state
playwright-cli run-code "async page => {
await page.goto('https://example.com/login');
await page.fill('input[name=email]', 'user@example.com');
await page.fill('input[name=password]', 'secret');
await page.click('button[type=submit]');
await page.waitForURL('**/dashboard');
await page.context().storageState({ path: 'auth.json' });
return 'Login successful';
}"
# Scrape data from multiple pages
playwright-cli run-code "async page => {
  const results = [];
  for (let i = 1; i <= 3; i++) {
    await page.goto(\`https://example.com/page/\${i}\`);
    const items = await page.locator('.item').allTextContents();
    results.push(...items);
  }
  return results;
}"
```


@@ -0,0 +1,169 @@
# Browser Session Management
Run multiple isolated browser sessions concurrently with state persistence.
## Named Browser Sessions
Use the `-s` flag to isolate browser contexts:
```bash
# Browser 1: Authentication flow
playwright-cli -s=auth open https://app.example.com/login
# Browser 2: Public browsing (separate cookies, storage)
playwright-cli -s=public open https://example.com
# Commands are isolated by browser session
playwright-cli -s=auth fill e1 "user@example.com"
playwright-cli -s=public snapshot
```
## Browser Session Isolation Properties
Each browser session has independent:
- Cookies
- LocalStorage / SessionStorage
- IndexedDB
- Cache
- Browsing history
- Open tabs
## Browser Session Commands
```bash
# List all browser sessions
playwright-cli list
# Stop a browser session (close the browser)
playwright-cli close # stop the default browser
playwright-cli -s=mysession close # stop a named browser
# Stop all browser sessions
playwright-cli close-all
# Forcefully kill all daemon processes (for stale/zombie processes)
playwright-cli kill-all
# Delete browser session user data (profile directory)
playwright-cli delete-data # delete default browser data
playwright-cli -s=mysession delete-data # delete named browser data
```
## Environment Variable
Set a default browser session name via environment variable:
```bash
export PLAYWRIGHT_CLI_SESSION="mysession"
playwright-cli open example.com # Uses "mysession" automatically
```
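When several automation processes are started from a script, each one can be given its own session through this variable. The following is a minimal Python sketch of building such a per-process environment. The helper name `session_env` and the `agent-1` label are hypothetical; only the `PLAYWRIGHT_CLI_SESSION` variable name comes from the section above.

```python
import os
from typing import Dict, Optional


def session_env(session_name: str, base_env: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Return a copy of the environment with PLAYWRIGHT_CLI_SESSION set.

    session_env is a hypothetical helper; it never modifies os.environ itself.
    """
    if not session_name:
        raise ValueError("session name must be non-empty")
    env = dict(os.environ if base_env is None else base_env)
    env["PLAYWRIGHT_CLI_SESSION"] = session_name
    return env


# The returned mapping can be passed as env= to subprocess.run(...).
env = session_env("agent-1", {"PATH": "/usr/bin"})
print(env["PLAYWRIGHT_CLI_SESSION"])
```

Copying the environment rather than mutating `os.environ` keeps the parent process on its own session while each child gets a distinct one.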
## Common Patterns
### Concurrent Scraping
```bash
#!/bin/bash
# Scrape multiple sites concurrently
# Start all browsers
playwright-cli -s=site1 open https://site1.com &
playwright-cli -s=site2 open https://site2.com &
playwright-cli -s=site3 open https://site3.com &
wait
# Take snapshots from each
playwright-cli -s=site1 snapshot
playwright-cli -s=site2 snapshot
playwright-cli -s=site3 snapshot
# Cleanup
playwright-cli close-all
```
### A/B Testing Sessions
```bash
# Test different user experiences
playwright-cli -s=variant-a open "https://app.com?variant=a"
playwright-cli -s=variant-b open "https://app.com?variant=b"
# Compare
playwright-cli -s=variant-a screenshot
playwright-cli -s=variant-b screenshot
```
### Persistent Profile
By default, the browser profile is kept in memory only. Use the `--persistent` flag on `open` to persist it to disk:
```bash
# Use persistent profile (auto-generated location)
playwright-cli open https://example.com --persistent
# Use persistent profile with custom directory
playwright-cli open https://example.com --profile=/path/to/profile
```
## Default Browser Session
When `-s` is omitted, commands use the default browser session:
```bash
# These use the same default browser session
playwright-cli open https://example.com
playwright-cli snapshot
playwright-cli close # Stops default browser
```
## Browser Session Configuration
Configure a browser session with specific settings when opening:
```bash
# Open with config file
playwright-cli open https://example.com --config=.playwright/my-cli.json
# Open with specific browser
playwright-cli open https://example.com --browser=firefox
# Open in headed mode
playwright-cli open https://example.com --headed
# Open with persistent profile
playwright-cli open https://example.com --persistent
```
## Best Practices
### 1. Name Browser Sessions Semantically
```bash
# GOOD: Clear purpose
playwright-cli -s=github-auth open https://github.com
playwright-cli -s=docs-scrape open https://docs.example.com
# AVOID: Generic names
playwright-cli -s=s1 open https://github.com
```
### 2. Always Clean Up
```bash
# Stop browsers when done
playwright-cli -s=auth close
playwright-cli -s=scrape close
# Or stop all at once
playwright-cli close-all
# If browsers become unresponsive or zombie processes remain
playwright-cli kill-all
```
### 3. Delete Stale Browser Data
```bash
# Remove old browser data to free disk space
playwright-cli -s=oldsession delete-data
```

@@ -0,0 +1,275 @@
# Storage Management
Manage cookies, localStorage, sessionStorage, and browser storage state.
## Storage State
Save and restore complete browser state including cookies and storage.
### Save Storage State
```bash
# Save to auto-generated filename (storage-state-{timestamp}.json)
playwright-cli state-save
# Save to specific filename
playwright-cli state-save my-auth-state.json
```
### Restore Storage State
```bash
# Load storage state from file
playwright-cli state-load my-auth-state.json
# Open the page so the restored cookies apply
playwright-cli open https://example.com
```
### Storage State File Format
The saved file contains:
```json
{
"cookies": [
{
"name": "session_id",
"value": "abc123",
"domain": "example.com",
"path": "/",
"expires": 1735689600,
"httpOnly": true,
"secure": true,
"sameSite": "Lax"
}
],
"origins": [
{
"origin": "https://example.com",
"localStorage": [
{ "name": "theme", "value": "dark" },
{ "name": "user_id", "value": "12345" }
]
}
]
}
```
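Before reusing a saved state file, it can help to check whether its cookies are still valid. The following is a minimal Python sketch, assuming the file layout shown above with `expires` given in Unix seconds. The helper name `expired_cookies` is hypothetical, and treating non-positive `expires` values as session cookies is an assumption of this sketch.

```python
import json
import time
from typing import Any, Dict, List, Optional


def expired_cookies(state: Dict[str, Any], now: Optional[float] = None) -> List[str]:
    """Return the names of cookies whose `expires` timestamp lies in the past.

    Assumption: non-positive `expires` values mark session cookies and are skipped.
    """
    current = time.time() if now is None else now
    names = []
    for cookie in state.get("cookies", []):
        expires = cookie.get("expires", -1)
        if 0 < expires < current:
            names.append(cookie["name"])
    return names


# Inline sample mirroring the format above; a real file would be read with json.load().
sample = json.loads("""
{
  "cookies": [
    {"name": "session_id", "value": "abc123", "domain": "example.com",
     "path": "/", "expires": 1735689600},
    {"name": "csrf", "value": "x", "domain": "example.com", "path": "/", "expires": -1}
  ],
  "origins": []
}
""")
print(expired_cookies(sample, now=1800000000))
```

If the returned list is non-empty, logging in again and re-saving the state is likely safer than loading the stale file.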
## Cookies
### List All Cookies
```bash
playwright-cli cookie-list
```
### Filter Cookies by Domain
```bash
playwright-cli cookie-list --domain=example.com
```
### Filter Cookies by Path
```bash
playwright-cli cookie-list --path=/api
```
### Get Specific Cookie
```bash
playwright-cli cookie-get session_id
```
### Set a Cookie
```bash
# Basic cookie
playwright-cli cookie-set session abc123
# Cookie with options
playwright-cli cookie-set session abc123 --domain=example.com --path=/ --httpOnly --secure --sameSite=Lax
# Cookie with expiration (Unix timestamp)
playwright-cli cookie-set remember_me token123 --expires=1735689600
```
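The `--expires` value is a Unix timestamp in seconds. The following is a minimal Python sketch for computing one relative to the current time. The helper name `expires_in` is hypothetical and not part of `playwright-cli`.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def expires_in(days: int, now: Optional[datetime] = None) -> int:
    """Return a Unix timestamp (seconds) that lies `days` days after `now`.

    `now` defaults to the current UTC time; pass it explicitly for repeatable results.
    """
    moment = datetime.now(timezone.utc) if now is None else now
    return int((moment + timedelta(days=days)).timestamp())


# 2025-01-01T00:00:00Z corresponds to the 1735689600 value used in the example above.
print(expires_in(0, datetime(2025, 1, 1, tzinfo=timezone.utc)))
```

The printed number can then be supplied as the `--expires` argument shown above.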
### Delete a Cookie
```bash
playwright-cli cookie-delete session_id
```
### Clear All Cookies
```bash
playwright-cli cookie-clear
```
### Advanced: Multiple Cookies or Custom Options
For complex scenarios like adding multiple cookies at once, use `run-code`:
```bash
playwright-cli run-code "async page => {
await page.context().addCookies([
{ name: 'session_id', value: 'sess_abc123', domain: 'example.com', path: '/', httpOnly: true },
{ name: 'preferences', value: JSON.stringify({ theme: 'dark' }), domain: 'example.com', path: '/' }
]);
}"
```
## Local Storage
### List All localStorage Items
```bash
playwright-cli localstorage-list
```
### Get Single Value
```bash
playwright-cli localstorage-get token
```
### Set Value
```bash
playwright-cli localstorage-set theme dark
```
### Set JSON Value
```bash
playwright-cli localstorage-set user_settings '{"theme":"dark","language":"en"}'
```
### Delete Single Item
```bash
playwright-cli localstorage-delete token
```
### Clear All localStorage
```bash
playwright-cli localstorage-clear
```
### Advanced: Multiple Operations
For complex scenarios like setting multiple values at once, use `run-code`:
```bash
playwright-cli run-code "async page => {
await page.evaluate(() => {
localStorage.setItem('token', 'jwt_abc123');
localStorage.setItem('user_id', '12345');
localStorage.setItem('expires_at', Date.now() + 3600000);
});
}"
```
## Session Storage
### List All sessionStorage Items
```bash
playwright-cli sessionstorage-list
```
### Get Single Value
```bash
playwright-cli sessionstorage-get form_data
```
### Set Value
```bash
playwright-cli sessionstorage-set step 3
```
### Delete Single Item
```bash
playwright-cli sessionstorage-delete step
```
### Clear sessionStorage
```bash
playwright-cli sessionstorage-clear
```
## IndexedDB
### List Databases
```bash
playwright-cli run-code "async page => {
return await page.evaluate(async () => {
const databases = await indexedDB.databases();
return databases;
});
}"
```
### Delete Database
```bash
playwright-cli run-code "async page => {
await page.evaluate(() => {
indexedDB.deleteDatabase('myDatabase');
});
}"
```
## Common Patterns
### Authentication State Reuse
```bash
# Step 1: Login and save state
playwright-cli open https://app.example.com/login
playwright-cli snapshot
playwright-cli fill e1 "user@example.com"
playwright-cli fill e2 "password123"
playwright-cli click e3
# Save the authenticated state
playwright-cli state-save auth.json
# Step 2: Later, restore state and skip login
playwright-cli state-load auth.json
playwright-cli open https://app.example.com/dashboard
# Already logged in!
```
### Save and Restore Roundtrip
```bash
# Set up authentication state
playwright-cli open https://example.com
playwright-cli eval "() => { document.cookie = 'session=abc123'; localStorage.setItem('user', 'john'); }"
# Save state to file
playwright-cli state-save my-session.json
# ... later, in a new session ...
# Restore state
playwright-cli state-load my-session.json
playwright-cli open https://example.com
# Cookies and localStorage are restored!
```
## Security Notes
- Never commit storage state files containing auth tokens
- Add storage state files to `.gitignore` (for example `auth.json` or the pattern `*auth-state.json`)
- Delete state files after automation completes
- Use environment variables for sensitive data
- By default, sessions run in in-memory mode, which is safer for sensitive operations

@@ -0,0 +1,88 @@
# Test Generation
Generate Playwright test code automatically as you interact with the browser.
## How It Works
Every action you perform with `playwright-cli` generates corresponding Playwright TypeScript code.
This code appears in the output and can be copied directly into your test files.
## Example Workflow
```bash
# Start a session
playwright-cli open https://example.com/login
# Take a snapshot to see elements
playwright-cli snapshot
# Output shows: e1 [textbox "Email"], e2 [textbox "Password"], e3 [button "Sign In"]
# Fill form fields - generates code automatically
playwright-cli fill e1 "user@example.com"
# Ran Playwright code:
# await page.getByRole('textbox', { name: 'Email' }).fill('user@example.com');
playwright-cli fill e2 "password123"
# Ran Playwright code:
# await page.getByRole('textbox', { name: 'Password' }).fill('password123');
playwright-cli click e3
# Ran Playwright code:
# await page.getByRole('button', { name: 'Sign In' }).click();
```
## Building a Test File
Collect the generated code into a Playwright test:
```typescript
import { test, expect } from '@playwright/test';
test('login flow', async ({ page }) => {
// Generated code from playwright-cli session:
await page.goto('https://example.com/login');
await page.getByRole('textbox', { name: 'Email' }).fill('user@example.com');
await page.getByRole('textbox', { name: 'Password' }).fill('password123');
await page.getByRole('button', { name: 'Sign In' }).click();
// Add assertions
await expect(page).toHaveURL(/.*dashboard/);
});
```
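Collecting the statements can also be scripted. The following is a minimal Python sketch that wraps already-copied Playwright statements in a test skeleton like the one above. The helper name `build_test` is hypothetical, and the statements are assumed to have been copied by hand from the CLI output; the sketch does not parse that output.

```python
import json
from typing import List


def build_test(name: str, statements: List[str]) -> str:
    """Wrap copied Playwright statements in a @playwright/test skeleton."""
    body = "\n".join(f"  {line.strip()}" for line in statements if line.strip())
    return (
        "import { test, expect } from '@playwright/test';\n\n"
        # json.dumps yields a double-quoted string literal that is also valid TypeScript.
        f"test({json.dumps(name)}, async ({{ page }}) => {{\n"
        f"{body}\n"
        "});\n"
    )


source = build_test("login flow", [
    "await page.goto('https://example.com/login');",
    "await page.getByRole('button', { name: 'Sign In' }).click();",
])
print(source)
```

Assertions still have to be added by hand afterwards, since the generated statements only cover actions.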
## Best Practices
### 1. Use Semantic Locators
The generated code uses role-based locators when possible, which are more resilient:
```typescript
// Generated (good - semantic)
await page.getByRole('button', { name: 'Submit' }).click();
// Avoid (fragile - CSS selectors)
await page.locator('#submit-btn').click();
```
### 2. Explore Before Recording
Take snapshots to understand the page structure before recording actions:
```bash
playwright-cli open https://example.com
playwright-cli snapshot
# Review the element structure
playwright-cli click e5
```
### 3. Add Assertions Manually
Generated code captures actions but not assertions. Add expectations in your test:
```typescript
// Generated action
await page.getByRole('button', { name: 'Submit' }).click();
// Manual assertion
await expect(page.getByText('Success')).toBeVisible();
```

@@ -0,0 +1,139 @@
# Tracing
Capture detailed execution traces for debugging and analysis. Traces include DOM snapshots, screenshots, network activity, and console logs.
## Basic Usage
```bash
# Start trace recording
playwright-cli tracing-start
# Perform actions
playwright-cli open https://example.com
playwright-cli click e1
playwright-cli fill e2 "test"
# Stop trace recording
playwright-cli tracing-stop
```
## Trace Output Files
When you start tracing, Playwright creates a `traces/` directory with several files:
### `trace-{timestamp}.trace`
**Action log** - The main trace file containing:
- Every action performed (clicks, fills, navigations)
- DOM snapshots before and after each action
- Screenshots at each step
- Timing information
- Console messages
- Source locations
### `trace-{timestamp}.network`
**Network log** - Complete network activity:
- All HTTP requests and responses
- Request headers and bodies
- Response headers and bodies
- Timing (DNS, connect, TLS, TTFB, download)
- Resource sizes
- Failed requests and errors
### `resources/`
**Resources directory** - Cached resources:
- Images, fonts, stylesheets, scripts
- Response bodies for replay
- Assets needed to reconstruct page state
## What Traces Capture
| Category | Details |
|----------|---------|
| **Actions** | Clicks, fills, hovers, keyboard input, navigations |
| **DOM** | Full DOM snapshot before/after each action |
| **Screenshots** | Visual state at each step |
| **Network** | All requests, responses, headers, bodies, timing |
| **Console** | All console.log, warn, error messages |
| **Timing** | Precise timing for each operation |
## Use Cases
### Debugging Failed Actions
```bash
playwright-cli tracing-start
playwright-cli open https://app.example.com
# This click fails - why?
playwright-cli click e5
playwright-cli tracing-stop
# Open trace to see DOM state when click was attempted
```
### Analyzing Performance
```bash
playwright-cli tracing-start
playwright-cli open https://slow-site.com
playwright-cli tracing-stop
# View network waterfall to identify slow resources
```
### Capturing Evidence
```bash
# Record a complete user flow for documentation
playwright-cli tracing-start
playwright-cli open https://app.example.com/checkout
playwright-cli fill e1 "4111111111111111"
playwright-cli fill e2 "12/25"
playwright-cli fill e3 "123"
playwright-cli click e4
playwright-cli tracing-stop
# Trace shows exact sequence of events
```
## Trace vs Video vs Screenshot
| Feature | Trace | Video | Screenshot |
|---------|-------|-------|------------|
| **Format** | .trace file | .webm video | .png/.jpeg image |
| **DOM inspection** | Yes | No | No |
| **Network details** | Yes | No | No |
| **Step-by-step replay** | Yes | Continuous | Single frame |
| **File size** | Medium | Large | Small |
| **Best for** | Debugging | Demos | Quick capture |
## Best Practices
### 1. Start Tracing Before the Problem
```bash
# Trace the entire flow, not just the failing step
playwright-cli tracing-start
playwright-cli open https://example.com
# ... all steps leading to the issue ...
playwright-cli tracing-stop
```
### 2. Clean Up Old Traces
Traces can consume significant disk space:
```bash
# Remove traces older than 7 days
find .playwright-cli/traces -type f -mtime +7 -delete
```
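Where `find` is not available, the same cleanup can be approximated in Python. This is a minimal sketch: the helper name `remove_old_traces` is hypothetical, the `.playwright-cli/traces` path is taken from the command above, and the age check compares modification times in seconds rather than reproducing the day rounding of `find -mtime`.

```python
import time
from pathlib import Path
from typing import List, Optional


def remove_old_traces(trace_dir: Path, max_age_days: float = 7,
                      now: Optional[float] = None) -> List[Path]:
    """Delete regular files under trace_dir older than max_age_days.

    Returns the removed paths; directories are left in place.
    """
    current = time.time() if now is None else now
    cutoff = current - max_age_days * 86400
    removed: List[Path] = []
    if not trace_dir.is_dir():
        return removed
    # Materialize the listing first so deletions do not disturb the traversal.
    for path in sorted(trace_dir.rglob("*")):
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path)
    return removed


if __name__ == "__main__":
    for path in remove_old_traces(Path(".playwright-cli/traces")):
        print(f"removed {path}")
```

Returning the removed paths makes it easy to log what was deleted or to run the function against a temporary directory first.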
## Limitations
- Traces add overhead to automation
- Large traces can consume significant disk space
- Some dynamic content may not replay perfectly

@@ -0,0 +1,43 @@
# Video Recording
Capture browser automation sessions as video for debugging, documentation, or verification. Produces WebM (VP8/VP9 codec).
## Basic Recording
```bash
# Start recording
playwright-cli video-start
# Perform actions
playwright-cli open https://example.com
playwright-cli snapshot
playwright-cli click e1
playwright-cli fill e2 "test input"
# Stop and save
playwright-cli video-stop demo.webm
```
## Best Practices
### 1. Use Descriptive Filenames
```bash
# Include context in filename
playwright-cli video-stop recordings/login-flow-2024-01-15.webm
playwright-cli video-stop recordings/checkout-test-run-42.webm
```
## Tracing vs Video
| Feature | Video | Tracing |
|---------|-------|---------|
| Output | WebM file | Trace file (viewable in Trace Viewer) |
| Shows | Visual recording | DOM snapshots, network, console, actions |
| Use case | Demos, documentation | Debugging, analysis |
| Size | Larger | Smaller |
## Limitations
- Recording adds slight overhead to automation
- Large recordings can consume significant disk space

@@ -86,24 +86,33 @@ Implement the chosen feature thoroughly:
**CRITICAL:** You MUST verify features through the actual UI. **CRITICAL:** You MUST verify features through the actual UI.
Use browser automation tools: Use `playwright-cli` for browser automation:
- Navigate to the app in a real browser - Open the browser: `playwright-cli open http://localhost:PORT`
- Interact like a human user (click, type, scroll) - Take a snapshot to see page elements: `playwright-cli snapshot`
- Take screenshots at each step - Read the snapshot YAML file to see element refs
- Verify both functionality AND visual appearance - Click elements by ref: `playwright-cli click e5`
- Type text: `playwright-cli type "search query"`
- Fill form fields: `playwright-cli fill e3 "value"`
- Take screenshots: `playwright-cli screenshot`
- Read the screenshot file to verify visual appearance
- Check console errors: `playwright-cli console`
- Close browser when done: `playwright-cli close`
**Token-efficient workflow:** `playwright-cli screenshot` and `snapshot` save files
to `.playwright-cli/`. You will see a file link in the output. Read the file only
when you need to verify visual appearance or find element refs.
**DO:** **DO:**
- Test through the UI with clicks and keyboard input - Test through the UI with clicks and keyboard input
- Take screenshots to verify visual appearance - Take screenshots and read them to verify visual appearance
- Check for console errors in browser - Check for console errors with `playwright-cli console`
- Verify complete user workflows end-to-end - Verify complete user workflows end-to-end
- Always run `playwright-cli close` when finished testing
**DON'T:** **DON'T:**
- Only test with curl commands
- Only test with curl commands (backend testing alone is insufficient) - Use JavaScript evaluation to bypass UI (`eval` and `run-code` are blocked)
- Use JavaScript evaluation to bypass UI (no shortcuts)
- Skip visual verification - Skip visual verification
- Mark tests passing without thorough verification - Mark tests passing without thorough verification
@@ -145,7 +154,7 @@ Use the feature_mark_passing tool with feature_id=42
- Combine or consolidate features - Combine or consolidate features
- Reorder features - Reorder features
**ONLY MARK A FEATURE AS PASSING AFTER VERIFICATION WITH SCREENSHOTS.** **ONLY MARK A FEATURE AS PASSING AFTER VERIFICATION WITH BROWSER AUTOMATION.**
### STEP 7: COMMIT YOUR PROGRESS ### STEP 7: COMMIT YOUR PROGRESS
@@ -192,9 +201,15 @@ Before context fills up:
## BROWSER AUTOMATION ## BROWSER AUTOMATION
Use Playwright MCP tools (`browser_*`) for UI verification. Key tools: `navigate`, `click`, `type`, `fill_form`, `take_screenshot`, `console_messages`, `network_requests`. All tools have auto-wait built in. Use `playwright-cli` commands for UI verification. Key commands: `open`, `goto`,
`snapshot`, `click`, `type`, `fill`, `screenshot`, `console`, `close`.
Test like a human user with mouse and keyboard. Use `browser_console_messages` to detect errors. Don't bypass UI with JavaScript evaluation. **How it works:** `playwright-cli` uses a persistent browser daemon. `open` starts it,
subsequent commands interact via socket, `close` shuts it down. Screenshots and snapshots
save to `.playwright-cli/` -- read the files when you need to verify content.
Test like a human user with mouse and keyboard. Use `playwright-cli console` to detect
JS errors. Don't bypass UI with JavaScript evaluation.
--- ---

@@ -31,26 +31,32 @@ For the feature returned:
1. Read and understand the feature's verification steps 1. Read and understand the feature's verification steps
2. Navigate to the relevant part of the application 2. Navigate to the relevant part of the application
3. Execute each verification step using browser automation 3. Execute each verification step using browser automation
4. Take screenshots to document the verification 4. Take screenshots and read them to verify visual appearance
5. Check for console errors 5. Check for console errors
Use browser automation tools: ### Browser Automation (Playwright CLI)
**Navigation & Screenshots:** **Navigation & Screenshots:**
- browser_navigate - Navigate to a URL - `playwright-cli open <url>` - Open browser and navigate
- browser_take_screenshot - Capture screenshot (use for visual verification) - `playwright-cli goto <url>` - Navigate to URL
- browser_snapshot - Get accessibility tree snapshot - `playwright-cli screenshot` - Save screenshot to `.playwright-cli/`
- `playwright-cli snapshot` - Save page snapshot with element refs to `.playwright-cli/`
**Element Interaction:** **Element Interaction:**
- browser_click - Click elements - `playwright-cli click <ref>` - Click elements (ref from snapshot)
- browser_type - Type text into editable elements - `playwright-cli type <text>` - Type text
- browser_fill_form - Fill multiple form fields - `playwright-cli fill <ref> <text>` - Fill form fields
- browser_select_option - Select dropdown options - `playwright-cli select <ref> <val>` - Select dropdown
- browser_press_key - Press keyboard keys - `playwright-cli press <key>` - Keyboard input
**Debugging:** **Debugging:**
- browser_console_messages - Get browser console output (check for errors) - `playwright-cli console` - Check for JS errors
- browser_network_requests - Monitor API calls - `playwright-cli network` - Monitor API calls
**Cleanup:**
- `playwright-cli close` - Close browser when done (ALWAYS do this)
**Note:** Screenshots and snapshots save to files. Read the file to see the content.
### STEP 3: HANDLE RESULTS ### STEP 3: HANDLE RESULTS
@@ -79,7 +85,7 @@ A regression has been introduced. You MUST fix it:
4. **Verify the fix:** 4. **Verify the fix:**
- Run through all verification steps again - Run through all verification steps again
- Take screenshots confirming the fix - Take screenshots and read them to confirm the fix
5. **Mark as passing after fix:** 5. **Mark as passing after fix:**
``` ```
@@ -98,7 +104,7 @@ A regression has been introduced. You MUST fix it:
--- ---
## AVAILABLE MCP TOOLS ## AVAILABLE TOOLS
### Feature Management ### Feature Management
- `feature_get_stats` - Get progress overview (passing/in_progress/total counts) - `feature_get_stats` - Get progress overview (passing/in_progress/total counts)
@@ -106,19 +112,17 @@ A regression has been introduced. You MUST fix it:
- `feature_mark_failing` - Mark a feature as failing (when you find a regression) - `feature_mark_failing` - Mark a feature as failing (when you find a regression)
- `feature_mark_passing` - Mark a feature as passing (after fixing a regression) - `feature_mark_passing` - Mark a feature as passing (after fixing a regression)
### Browser Automation (Playwright) ### Browser Automation (Playwright CLI)
All interaction tools have **built-in auto-wait** -- no manual timeouts needed. Use `playwright-cli` commands for browser interaction. Key commands:
- `playwright-cli open <url>` - Open browser
- `browser_navigate` - Navigate to URL - `playwright-cli goto <url>` - Navigate to URL
- `browser_take_screenshot` - Capture screenshot - `playwright-cli screenshot` - Take screenshot (saved to `.playwright-cli/`)
- `browser_snapshot` - Get accessibility tree - `playwright-cli snapshot` - Get page snapshot with element refs
- `browser_click` - Click elements - `playwright-cli click <ref>` - Click element
- `browser_type` - Type text - `playwright-cli type <text>` - Type text
- `browser_fill_form` - Fill form fields - `playwright-cli fill <ref> <text>` - Fill form field
- `browser_select_option` - Select dropdown - `playwright-cli console` - Check for JS errors
- `browser_press_key` - Keyboard input - `playwright-cli close` - Close browser (always do this when done)
- `browser_console_messages` - Check for JS errors
- `browser_network_requests` - Monitor API calls
--- ---

@@ -30,7 +30,34 @@
# ANTHROPIC_DEFAULT_HAIKU_MODEL=claude-3-5-haiku@20241022 # ANTHROPIC_DEFAULT_HAIKU_MODEL=claude-3-5-haiku@20241022
# =================== # ===================
# Alternative API Providers (GLM, Ollama, Kimi, Custom) # Alternative API Providers (Azure, GLM, Ollama, Kimi, Custom)
# =================== # ===================
# Configure alternative providers via the Settings UI (gear icon > API Provider). # Configure via Settings UI (recommended) or set env vars below.
# The Settings UI is the recommended way to switch providers and models. # When both are set, env vars take precedence.
#
# Azure Anthropic (Claude):
# ANTHROPIC_BASE_URL=https://your-resource.services.ai.azure.com/anthropic
# ANTHROPIC_API_KEY=your-azure-api-key
# ANTHROPIC_DEFAULT_OPUS_MODEL=claude-opus-4-6
# ANTHROPIC_DEFAULT_SONNET_MODEL=claude-sonnet-4-5
# ANTHROPIC_DEFAULT_HAIKU_MODEL=claude-haiku-4-5
#
# GLM (Zhipu AI):
# ANTHROPIC_BASE_URL=https://api.z.ai/api/anthropic
# ANTHROPIC_AUTH_TOKEN=your-glm-api-key
# ANTHROPIC_DEFAULT_OPUS_MODEL=glm-4.7
# ANTHROPIC_DEFAULT_SONNET_MODEL=glm-4.7
# ANTHROPIC_DEFAULT_HAIKU_MODEL=glm-4.7
#
# Ollama (Local):
# ANTHROPIC_BASE_URL=http://localhost:11434
# ANTHROPIC_DEFAULT_OPUS_MODEL=qwen3-coder
# ANTHROPIC_DEFAULT_SONNET_MODEL=qwen3-coder
# ANTHROPIC_DEFAULT_HAIKU_MODEL=qwen3-coder
#
# Kimi (Moonshot):
# ANTHROPIC_BASE_URL=https://api.kimi.com/coding/
# ANTHROPIC_API_KEY=your-kimi-api-key
# ANTHROPIC_DEFAULT_OPUS_MODEL=kimi-k2.5
# ANTHROPIC_DEFAULT_SONNET_MODEL=kimi-k2.5
# ANTHROPIC_DEFAULT_HAIKU_MODEL=kimi-k2.5

.gitignore

@@ -10,6 +10,10 @@ issues/
# Browser profiles for parallel agent execution # Browser profiles for parallel agent execution
.browser-profiles/ .browser-profiles/
# Playwright CLI daemon artifacts
.playwright-cli/
.playwright/
# Log files # Log files
logs/ logs/
*.log *.log

@@ -28,5 +28,4 @@ start.sh
start_ui.sh start_ui.sh
start_ui.py start_ui.py
.claude/agents/ .claude/agents/
.claude/skills/
.claude/settings.json .claude/settings.json

@@ -85,7 +85,7 @@ python autonomous_agent_demo.py --project-dir my-app --yolo
**What's different in YOLO mode:** **What's different in YOLO mode:**
- No regression testing - No regression testing
- No Playwright MCP server (browser automation disabled) - No Playwright CLI (browser automation disabled)
- Features marked passing after lint/type-check succeeds - Features marked passing after lint/type-check succeeds
- Faster iteration for prototyping - Faster iteration for prototyping
@@ -163,7 +163,7 @@ Publishing: `npm publish` (triggers `prepublishOnly` which builds UI, then publi
- `autonomous_agent_demo.py` - Entry point for running the agent (supports `--yolo`, `--parallel`, `--batch-size`, `--batch-features`) - `autonomous_agent_demo.py` - Entry point for running the agent (supports `--yolo`, `--parallel`, `--batch-size`, `--batch-features`)
- `autoforge_paths.py` - Central path resolution with dual-path backward compatibility and migration - `autoforge_paths.py` - Central path resolution with dual-path backward compatibility and migration
- `agent.py` - Agent session loop using Claude Agent SDK - `agent.py` - Agent session loop using Claude Agent SDK
- `client.py` - ClaudeSDKClient configuration with security hooks, MCP servers, and Vertex AI support - `client.py` - ClaudeSDKClient configuration with security hooks, feature MCP server, and Vertex AI support
- `security.py` - Bash command allowlist validation (ALLOWED_COMMANDS whitelist) - `security.py` - Bash command allowlist validation (ALLOWED_COMMANDS whitelist)
- `prompts.py` - Prompt template loading with project-specific fallback and batch feature prompts - `prompts.py` - Prompt template loading with project-specific fallback and batch feature prompts
- `progress.py` - Progress tracking, database queries, webhook notifications - `progress.py` - Progress tracking, database queries, webhook notifications
@@ -288,6 +288,9 @@ Projects can be stored in any directory (registered in `~/.autoforge/registry.db
- `.autoforge/.agent.lock` - Lock file to prevent multiple agent instances - `.autoforge/.agent.lock` - Lock file to prevent multiple agent instances
- `.autoforge/allowed_commands.yaml` - Project-specific bash command allowlist (optional) - `.autoforge/allowed_commands.yaml` - Project-specific bash command allowlist (optional)
- `.autoforge/.gitignore` - Ignores runtime files - `.autoforge/.gitignore` - Ignores runtime files
- `.claude/skills/playwright-cli/` - Playwright CLI skill for browser automation
- `.playwright/cli.config.json` - Browser configuration (headless, viewport, etc.)
- `.playwright-cli/` - Playwright CLI daemon artifacts (screenshots, snapshots) - gitignored
- `CLAUDE.md` - Stays at project root (SDK convention) - `CLAUDE.md` - Stays at project root (SDK convention)
- `app_spec.txt` - Root copy for agent template compatibility - `app_spec.txt` - Root copy for agent template compatibility
@@ -445,6 +448,7 @@ Alternative providers are configured via the **Settings UI** (gear icon > API Pr
**Skills** (`.claude/skills/`): **Skills** (`.claude/skills/`):
- `frontend-design` - Distinctive, production-grade UI design - `frontend-design` - Distinctive, production-grade UI design
- `gsd-to-autoforge-spec` - Convert GSD codebase mapping to AutoForge app_spec format - `gsd-to-autoforge-spec` - Convert GSD codebase mapping to AutoForge app_spec format
- `playwright-cli` - Browser automation via Playwright CLI (copied to each project)
**Other:** **Other:**
- `.claude/templates/` - Prompt templates copied to new projects - `.claude/templates/` - Prompt templates copied to new projects
@@ -479,7 +483,7 @@ When running with `--parallel`, the orchestrator:
1. Spawns multiple Claude agents as subprocesses (up to `--max-concurrency`) 1. Spawns multiple Claude agents as subprocesses (up to `--max-concurrency`)
2. Each agent claims features atomically via `feature_claim_and_get` 2. Each agent claims features atomically via `feature_claim_and_get`
3. Features blocked by unmet dependencies are skipped 3. Features blocked by unmet dependencies are skipped
4. Browser contexts are isolated per agent using `--isolated` flag 4. Browser sessions are isolated per agent via `PLAYWRIGHT_CLI_SESSION` environment variable
5. AgentTracker parses output and emits `agent_update` messages for UI 5. AgentTracker parses output and emits `agent_update` messages for UI
### Process Limits (Parallel Mode) ### Process Limits (Parallel Mode)

@@ -222,7 +222,7 @@ async def run_autonomous_agent(
# Check if all features are already complete (before starting a new session) # Check if all features are already complete (before starting a new session)
# Skip this check if running as initializer (needs to create features first) # Skip this check if running as initializer (needs to create features first)
if not is_initializer and iteration == 1: if not is_initializer and iteration == 1:
passing, in_progress, total = count_passing_tests(project_dir) passing, in_progress, total, _nhi = count_passing_tests(project_dir)
if total > 0 and passing == total: if total > 0 and passing == total:
print("\n" + "=" * 70) print("\n" + "=" * 70)
print(" ALL FEATURES ALREADY COMPLETE!") print(" ALL FEATURES ALREADY COMPLETE!")
@@ -240,17 +240,7 @@ async def run_autonomous_agent(
print_session_header(iteration, is_initializer) print_session_header(iteration, is_initializer)
# Create client (fresh context) # Create client (fresh context)
# Pass agent_id for browser isolation in multi-agent scenarios client = create_client(project_dir, model, yolo_mode=yolo_mode, agent_type=agent_type)
import os
if agent_type == "testing":
agent_id = f"testing-{os.getpid()}" # Unique ID for testing agents
elif feature_ids and len(feature_ids) > 1:
agent_id = f"batch-{feature_ids[0]}"
elif feature_id:
agent_id = f"feature-{feature_id}"
else:
agent_id = None
client = create_client(project_dir, model, yolo_mode=yolo_mode, agent_id=agent_id, agent_type=agent_type)
# Choose prompt based on agent type # Choose prompt based on agent type
if agent_type == "initializer": if agent_type == "initializer":
@@ -358,7 +348,7 @@ async def run_autonomous_agent(
print_progress_summary(project_dir) print_progress_summary(project_dir)
# Check if all features are complete - exit gracefully if done # Check if all features are complete - exit gracefully if done
passing, in_progress, total = count_passing_tests(project_dir) passing, in_progress, total, _nhi = count_passing_tests(project_dir)
if total > 0 and passing == total: if total > 0 and passing == total:
print("\n" + "=" * 70) print("\n" + "=" * 70)
print(" ALL FEATURES COMPLETE!") print(" ALL FEATURES COMPLETE!")

@@ -43,10 +43,10 @@ class Feature(Base):
__tablename__ = "features" __tablename__ = "features"
# Composite index for common status query pattern (passes, in_progress) # Composite index for common status query pattern (passes, in_progress, needs_human_input)
# Used by feature_get_stats, get_ready_features, and other status queries # Used by feature_get_stats, get_ready_features, and other status queries
__table_args__ = ( __table_args__ = (
Index('ix_feature_status', 'passes', 'in_progress'), Index('ix_feature_status', 'passes', 'in_progress', 'needs_human_input'),
) )
id = Column(Integer, primary_key=True, index=True) id = Column(Integer, primary_key=True, index=True)
@@ -61,6 +61,11 @@ class Feature(Base):
# NULL/empty = no dependencies (backwards compatible) # NULL/empty = no dependencies (backwards compatible)
dependencies = Column(JSON, nullable=True, default=None) dependencies = Column(JSON, nullable=True, default=None)
# Human input: agent can request structured input from a human
needs_human_input = Column(Boolean, nullable=False, default=False, index=True)
human_input_request = Column(JSON, nullable=True, default=None) # Agent's structured request
human_input_response = Column(JSON, nullable=True, default=None) # Human's response
def to_dict(self) -> dict: def to_dict(self) -> dict:
"""Convert feature to dictionary for JSON serialization.""" """Convert feature to dictionary for JSON serialization."""
return { return {
@@ -75,6 +80,10 @@ class Feature(Base):
"in_progress": self.in_progress if self.in_progress is not None else False, "in_progress": self.in_progress if self.in_progress is not None else False,
# Dependencies: NULL/empty treated as empty list for backwards compat # Dependencies: NULL/empty treated as empty list for backwards compat
"dependencies": self.dependencies if self.dependencies else [], "dependencies": self.dependencies if self.dependencies else [],
# Human input fields
"needs_human_input": self.needs_human_input if self.needs_human_input is not None else False,
"human_input_request": self.human_input_request,
"human_input_response": self.human_input_response,
} }
def get_dependencies_safe(self) -> list[int]: def get_dependencies_safe(self) -> list[int]:
@@ -302,6 +311,21 @@ def _is_network_path(path: Path) -> bool:
return False return False
def _migrate_add_human_input_columns(engine) -> None:
"""Add human input columns to existing databases that don't have them."""
with engine.connect() as conn:
result = conn.execute(text("PRAGMA table_info(features)"))
columns = [row[1] for row in result.fetchall()]
if "needs_human_input" not in columns:
conn.execute(text("ALTER TABLE features ADD COLUMN needs_human_input BOOLEAN DEFAULT 0"))
if "human_input_request" not in columns:
conn.execute(text("ALTER TABLE features ADD COLUMN human_input_request TEXT DEFAULT NULL"))
if "human_input_response" not in columns:
conn.execute(text("ALTER TABLE features ADD COLUMN human_input_response TEXT DEFAULT NULL"))
conn.commit()
def _migrate_add_schedules_tables(engine) -> None: def _migrate_add_schedules_tables(engine) -> None:
"""Create schedules and schedule_overrides tables if they don't exist.""" """Create schedules and schedule_overrides tables if they don't exist."""
from sqlalchemy import inspect from sqlalchemy import inspect
@@ -425,6 +449,7 @@ def create_database(project_dir: Path) -> tuple:
_migrate_fix_null_boolean_fields(engine) _migrate_fix_null_boolean_fields(engine)
_migrate_add_dependencies_column(engine) _migrate_add_dependencies_column(engine)
_migrate_add_testing_columns(engine) _migrate_add_testing_columns(engine)
_migrate_add_human_input_columns(engine)
# Migrate to add schedules tables # Migrate to add schedules tables
_migrate_add_schedules_tables(engine) _migrate_add_schedules_tables(engine)
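
Note on `_migrate_add_human_input_columns`: the migration checks `PRAGMA table_info` before each `ALTER TABLE`, so it can run on every startup without failing on an already-migrated database. A minimal sketch of that pattern, using the standard-library `sqlite3` module rather than the project's SQLAlchemy engine; `add_missing_columns` and `NEW_COLUMNS` are hypothetical names for illustration, not part of the codebase:

```python
import sqlite3

# Hypothetical mapping for illustration; column names and DDL copied from the diff.
NEW_COLUMNS = {
    "needs_human_input": "BOOLEAN DEFAULT 0",
    "human_input_request": "TEXT DEFAULT NULL",
    "human_input_response": "TEXT DEFAULT NULL",
}


def add_missing_columns(conn: sqlite3.Connection) -> list[str]:
    """Add any of NEW_COLUMNS the features table lacks; return the names added."""
    # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
    existing = {row[1] for row in conn.execute("PRAGMA table_info(features)")}
    added = []
    for name, ddl in NEW_COLUMNS.items():
        if name not in existing:
            conn.execute(f"ALTER TABLE features ADD COLUMN {name} {ddl}")
            added.append(name)
    conn.commit()
    return added


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE features (id INTEGER PRIMARY KEY, passes BOOLEAN, in_progress BOOLEAN)"
)
first_run = add_missing_columns(conn)   # adds all three columns
second_run = add_missing_columns(conn)  # nothing left to add
```

Running the helper twice shows the idempotence: the second call finds every column present and issues no DDL.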


@@ -39,10 +39,12 @@ assistant.db-wal
assistant.db-shm assistant.db-shm
.agent.lock .agent.lock
.devserver.lock .devserver.lock
.pause_drain
.claude_settings.json .claude_settings.json
.claude_assistant_settings.json .claude_assistant_settings.json
.claude_settings.expand.*.json .claude_settings.expand.*.json
.progress_cache .progress_cache
.migration_version
""" """
@@ -145,6 +147,15 @@ def get_claude_assistant_settings_path(project_dir: Path) -> Path:
return _resolve_path(project_dir, ".claude_assistant_settings.json") return _resolve_path(project_dir, ".claude_assistant_settings.json")
def get_pause_drain_path(project_dir: Path) -> Path:
"""Return the path to the ``.pause_drain`` signal file.
This file is created to request a graceful pause (drain mode).
Always uses the new location since it's a transient signal file.
"""
return project_dir / ".autoforge" / ".pause_drain"
def get_progress_cache_path(project_dir: Path) -> Path: def get_progress_cache_path(project_dir: Path) -> Path:
"""Resolve the path to ``.progress_cache``.""" """Resolve the path to ``.progress_cache``."""
return _resolve_path(project_dir, ".progress_cache") return _resolve_path(project_dir, ".progress_cache")
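
Note on `get_pause_drain_path`: the docstring describes `.pause_drain` as a transient signal file whose presence requests a graceful pause. A rough sketch of how such a file might be written, polled, and cleared; only the path layout comes from the diff, and `request_drain`, `drain_requested`, and `clear_drain` are hypothetical helpers, not functions from the project:

```python
import tempfile
from pathlib import Path


def pause_drain_path(project_dir: Path) -> Path:
    # Same location that get_pause_drain_path returns in the diff.
    return project_dir / ".autoforge" / ".pause_drain"


def request_drain(project_dir: Path) -> None:
    """Hypothetical: create the signal file to ask for a graceful pause."""
    path = pause_drain_path(project_dir)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.touch()


def drain_requested(project_dir: Path) -> bool:
    """Hypothetical: check whether a pause has been requested."""
    return pause_drain_path(project_dir).exists()


def clear_drain(project_dir: Path) -> None:
    """Hypothetical: remove the signal file; a missing file is not an error."""
    pause_drain_path(project_dir).unlink(missing_ok=True)


with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp)
    before = drain_requested(project)
    request_drain(project)
    during = drain_requested(project)
    clear_drain(project)
    after = drain_requested(project)
```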


@@ -44,8 +44,10 @@ from dotenv import load_dotenv
# IMPORTANT: Must be called BEFORE importing other modules that read env vars at load time # IMPORTANT: Must be called BEFORE importing other modules that read env vars at load time
load_dotenv() load_dotenv()
import os
from agent import run_autonomous_agent from agent import run_autonomous_agent
from registry import DEFAULT_MODEL, get_project_path from registry import DEFAULT_MODEL, get_effective_sdk_env, get_project_path
def parse_args() -> argparse.Namespace: def parse_args() -> argparse.Namespace:
@@ -195,6 +197,14 @@ def main() -> None:
# Note: Authentication is handled by start.bat/start.sh before this script runs. # Note: Authentication is handled by start.bat/start.sh before this script runs.
# The Claude SDK auto-detects credentials from ~/.claude/.credentials.json # The Claude SDK auto-detects credentials from ~/.claude/.credentials.json
# Apply UI-configured provider settings to this process's environment.
# This ensures CLI-launched agents respect Settings UI provider config (GLM, Ollama, etc.).
# Uses setdefault so explicit env vars / .env file take precedence.
sdk_overrides = get_effective_sdk_env()
for key, value in sdk_overrides.items():
if value: # Only set non-empty values (empty values are used to clear conflicts)
os.environ.setdefault(key, value)
# Handle deprecated --parallel flag # Handle deprecated --parallel flag
if args.parallel is not None: if args.parallel is not None:
print("WARNING: --parallel is deprecated. Use --concurrency instead.", flush=True) print("WARNING: --parallel is deprecated. Use --concurrency instead.", flush=True)
@@ -227,6 +237,12 @@ def main() -> None:
if migrated: if migrated:
print(f"Migrated project files to .autoforge/: {', '.join(migrated)}", flush=True) print(f"Migrated project files to .autoforge/: {', '.join(migrated)}", flush=True)
# Migrate project to current AutoForge version (idempotent, safe)
from prompts import migrate_project_to_current
version_migrated = migrate_project_to_current(project_dir)
if version_migrated:
print(f"Upgraded project: {', '.join(version_migrated)}", flush=True)
# Parse batch testing feature IDs (comma-separated string -> list[int]) # Parse batch testing feature IDs (comma-separated string -> list[int])
testing_feature_ids: list[int] | None = None testing_feature_ids: list[int] | None = None
if args.testing_feature_ids: if args.testing_feature_ids:
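
Note on the provider-settings hunk in `main()`: the loop uses `setdefault` and skips empty values, so a variable already set in the environment (or loaded from `.env`) wins over the UI-configured value. A small sketch of that precedence on a plain dict instead of `os.environ`; `apply_provider_defaults` and the `EXAMPLE_*` keys are made up for illustration:

```python
def apply_provider_defaults(env: dict[str, str], overrides: dict[str, str]) -> dict[str, str]:
    """Copy overrides into env without clobbering existing keys; skip empty values."""
    for key, value in overrides.items():
        if value:  # empty strings are never written, matching the diff's guard
            env.setdefault(key, value)
    return env


# Stand-in for os.environ after load_dotenv(); the URL is a placeholder.
env = {"ANTHROPIC_BASE_URL": "https://example.invalid/from-dotenv"}

# Stand-in for get_effective_sdk_env(); keys other than ANTHROPIC_BASE_URL are hypothetical.
overrides = {
    "ANTHROPIC_BASE_URL": "https://example.invalid/from-ui",
    "EXAMPLE_TOKEN": "",
    "EXAMPLE_TIMEOUT": "3000",
}

result = apply_provider_defaults(env, overrides)
```

Here the existing `ANTHROPIC_BASE_URL` is kept, the empty `EXAMPLE_TOKEN` is skipped, and only `EXAMPLE_TIMEOUT` is added.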

client.py

@@ -16,22 +16,11 @@ from claude_agent_sdk import ClaudeAgentOptions, ClaudeSDKClient
from claude_agent_sdk.types import HookContext, HookInput, HookMatcher, SyncHookJSONOutput from claude_agent_sdk.types import HookContext, HookInput, HookMatcher, SyncHookJSONOutput
from dotenv import load_dotenv from dotenv import load_dotenv
from env_constants import API_ENV_VARS
from security import SENSITIVE_DIRECTORIES, bash_security_hook from security import SENSITIVE_DIRECTORIES, bash_security_hook
# Load environment variables from .env file if present # Load environment variables from .env file if present
load_dotenv() load_dotenv()
# Default Playwright headless mode - can be overridden via PLAYWRIGHT_HEADLESS env var
# When True, browser runs invisibly in background (default - saves CPU)
# When False, browser window is visible (useful for monitoring agent progress)
DEFAULT_PLAYWRIGHT_HEADLESS = True
# Default browser for Playwright - can be overridden via PLAYWRIGHT_BROWSER env var
# Options: chrome, firefox, webkit, msedge
# Firefox is recommended for lower CPU usage
DEFAULT_PLAYWRIGHT_BROWSER = "firefox"
# Extra read paths for cross-project file access (read-only) # Extra read paths for cross-project file access (read-only)
# Set EXTRA_READ_PATHS environment variable with comma-separated absolute paths # Set EXTRA_READ_PATHS environment variable with comma-separated absolute paths
# Example: EXTRA_READ_PATHS=/Volumes/Data/dev,/Users/shared/libs # Example: EXTRA_READ_PATHS=/Volumes/Data/dev,/Users/shared/libs
@@ -42,6 +31,7 @@ EXTRA_READ_PATHS_VAR = "EXTRA_READ_PATHS"
# this blocklist and the filesystem browser API share a single source of truth. # this blocklist and the filesystem browser API share a single source of truth.
EXTRA_READ_PATHS_BLOCKLIST = SENSITIVE_DIRECTORIES EXTRA_READ_PATHS_BLOCKLIST = SENSITIVE_DIRECTORIES
def convert_model_for_vertex(model: str) -> str: def convert_model_for_vertex(model: str) -> str:
""" """
Convert model name format for Vertex AI compatibility. Convert model name format for Vertex AI compatibility.
@@ -73,43 +63,6 @@ def convert_model_for_vertex(model: str) -> str:
return model return model
def get_playwright_headless() -> bool:
"""
Get the Playwright headless mode setting.
Reads from PLAYWRIGHT_HEADLESS environment variable, defaults to True.
Returns True for headless mode (invisible browser), False for visible browser.
"""
value = os.getenv("PLAYWRIGHT_HEADLESS", str(DEFAULT_PLAYWRIGHT_HEADLESS).lower()).strip().lower()
truthy = {"true", "1", "yes", "on"}
falsy = {"false", "0", "no", "off"}
if value not in truthy | falsy:
print(f" - Warning: Invalid PLAYWRIGHT_HEADLESS='{value}', defaulting to {DEFAULT_PLAYWRIGHT_HEADLESS}")
return DEFAULT_PLAYWRIGHT_HEADLESS
return value in truthy
# Valid browsers supported by Playwright MCP
VALID_PLAYWRIGHT_BROWSERS = {"chrome", "firefox", "webkit", "msedge"}
def get_playwright_browser() -> str:
"""
Get the browser to use for Playwright.
Reads from PLAYWRIGHT_BROWSER environment variable, defaults to firefox.
Options: chrome, firefox, webkit, msedge
Firefox is recommended for lower CPU usage.
"""
value = os.getenv("PLAYWRIGHT_BROWSER", DEFAULT_PLAYWRIGHT_BROWSER).strip().lower()
if value not in VALID_PLAYWRIGHT_BROWSERS:
print(f" - Warning: Invalid PLAYWRIGHT_BROWSER='{value}', "
f"valid options: {', '.join(sorted(VALID_PLAYWRIGHT_BROWSERS))}. "
f"Defaulting to {DEFAULT_PLAYWRIGHT_BROWSER}")
return DEFAULT_PLAYWRIGHT_BROWSER
return value
def get_extra_read_paths() -> list[Path]: def get_extra_read_paths() -> list[Path]:
""" """
Get extra read-only paths from EXTRA_READ_PATHS environment variable. Get extra read-only paths from EXTRA_READ_PATHS environment variable.
@@ -188,7 +141,6 @@ def get_extra_read_paths() -> list[Path]:
# overhead and preventing agents from calling tools meant for other roles. # overhead and preventing agents from calling tools meant for other roles.
# #
# Tools intentionally omitted from ALL agent lists (UI/orchestrator only): # Tools intentionally omitted from ALL agent lists (UI/orchestrator only):
# feature_get_ready, feature_get_blocked, feature_get_graph,
# feature_remove_dependency # feature_remove_dependency
# #
# The ghost tool "feature_release_testing" was removed entirely -- it was # The ghost tool "feature_release_testing" was removed entirely -- it was
@@ -198,6 +150,9 @@ CODING_AGENT_TOOLS = [
"mcp__features__feature_get_stats", "mcp__features__feature_get_stats",
"mcp__features__feature_get_by_id", "mcp__features__feature_get_by_id",
"mcp__features__feature_get_summary", "mcp__features__feature_get_summary",
"mcp__features__feature_get_ready",
"mcp__features__feature_get_blocked",
"mcp__features__feature_get_graph",
"mcp__features__feature_claim_and_get", "mcp__features__feature_claim_and_get",
"mcp__features__feature_mark_in_progress", "mcp__features__feature_mark_in_progress",
"mcp__features__feature_mark_passing", "mcp__features__feature_mark_passing",
@@ -210,12 +165,18 @@ TESTING_AGENT_TOOLS = [
"mcp__features__feature_get_stats", "mcp__features__feature_get_stats",
"mcp__features__feature_get_by_id", "mcp__features__feature_get_by_id",
"mcp__features__feature_get_summary", "mcp__features__feature_get_summary",
"mcp__features__feature_get_ready",
"mcp__features__feature_get_blocked",
"mcp__features__feature_get_graph",
"mcp__features__feature_mark_passing", "mcp__features__feature_mark_passing",
"mcp__features__feature_mark_failing", "mcp__features__feature_mark_failing",
] ]
INITIALIZER_AGENT_TOOLS = [ INITIALIZER_AGENT_TOOLS = [
"mcp__features__feature_get_stats", "mcp__features__feature_get_stats",
"mcp__features__feature_get_ready",
"mcp__features__feature_get_blocked",
"mcp__features__feature_get_graph",
"mcp__features__feature_create_bulk", "mcp__features__feature_create_bulk",
"mcp__features__feature_create", "mcp__features__feature_create",
"mcp__features__feature_add_dependency", "mcp__features__feature_add_dependency",
@@ -229,41 +190,6 @@ ALL_FEATURE_MCP_TOOLS = sorted(
set(CODING_AGENT_TOOLS) | set(TESTING_AGENT_TOOLS) | set(INITIALIZER_AGENT_TOOLS) set(CODING_AGENT_TOOLS) | set(TESTING_AGENT_TOOLS) | set(INITIALIZER_AGENT_TOOLS)
) )
# Playwright MCP tools for browser automation.
# Full set of tools for comprehensive UI testing including drag-and-drop,
# hover menus, file uploads, tab management, etc.
PLAYWRIGHT_TOOLS = [
# Core navigation & screenshots
"mcp__playwright__browser_navigate",
"mcp__playwright__browser_navigate_back",
"mcp__playwright__browser_take_screenshot",
"mcp__playwright__browser_snapshot",
# Element interaction
"mcp__playwright__browser_click",
"mcp__playwright__browser_type",
"mcp__playwright__browser_fill_form",
"mcp__playwright__browser_select_option",
"mcp__playwright__browser_press_key",
"mcp__playwright__browser_drag",
"mcp__playwright__browser_hover",
"mcp__playwright__browser_file_upload",
# JavaScript & debugging
"mcp__playwright__browser_evaluate",
# "mcp__playwright__browser_run_code", # REMOVED - causes Playwright MCP server crash
"mcp__playwright__browser_console_messages",
"mcp__playwright__browser_network_requests",
# Browser management
"mcp__playwright__browser_resize",
"mcp__playwright__browser_wait_for",
"mcp__playwright__browser_handle_dialog",
"mcp__playwright__browser_install",
"mcp__playwright__browser_close",
"mcp__playwright__browser_tabs",
]
# Built-in tools available to agents. # Built-in tools available to agents.
# WebFetch and WebSearch are included so coding agents can look up current # WebFetch and WebSearch are included so coding agents can look up current
# documentation for frameworks and libraries they are implementing. # documentation for frameworks and libraries they are implementing.
@@ -283,7 +209,6 @@ def create_client(
project_dir: Path, project_dir: Path,
model: str, model: str,
yolo_mode: bool = False, yolo_mode: bool = False,
agent_id: str | None = None,
agent_type: str = "coding", agent_type: str = "coding",
): ):
""" """
@@ -292,9 +217,7 @@ def create_client(
Args: Args:
project_dir: Directory for the project project_dir: Directory for the project
model: Claude model to use model: Claude model to use
yolo_mode: If True, skip Playwright MCP server for rapid prototyping yolo_mode: If True, skip browser testing for rapid prototyping
agent_id: Optional unique identifier for browser isolation in parallel mode.
When provided, each agent gets its own browser profile.
agent_type: One of "coding", "testing", or "initializer". Controls which agent_type: One of "coding", "testing", or "initializer". Controls which
MCP tools are exposed and the max_turns limit. MCP tools are exposed and the max_turns limit.
@@ -328,11 +251,8 @@ def create_client(
} }
max_turns = max_turns_map.get(agent_type, 300) max_turns = max_turns_map.get(agent_type, 300)
# Build allowed tools list based on mode and agent type. # Build allowed tools list based on agent type.
# In YOLO mode, exclude Playwright tools for faster prototyping.
allowed_tools = [*BUILTIN_TOOLS, *feature_tools] allowed_tools = [*BUILTIN_TOOLS, *feature_tools]
if not yolo_mode:
allowed_tools.extend(PLAYWRIGHT_TOOLS)
# Build permissions list. # Build permissions list.
# We permit ALL feature MCP tools at the security layer (so the MCP server # We permit ALL feature MCP tools at the security layer (so the MCP server
@@ -364,10 +284,6 @@ def create_client(
permissions_list.append(f"Glob({path}/**)") permissions_list.append(f"Glob({path}/**)")
permissions_list.append(f"Grep({path}/**)") permissions_list.append(f"Grep({path}/**)")
if not yolo_mode:
# Allow Playwright MCP tools for browser automation (standard mode only)
permissions_list.extend(PLAYWRIGHT_TOOLS)
# Create comprehensive security settings # Create comprehensive security settings
# Note: Using relative paths ("./**") restricts access to project directory # Note: Using relative paths ("./**") restricts access to project directory
# since cwd is set to project_dir # since cwd is set to project_dir
@@ -396,9 +312,9 @@ def create_client(
print(f" - Extra read paths (validated): {', '.join(str(p) for p in extra_read_paths)}") print(f" - Extra read paths (validated): {', '.join(str(p) for p in extra_read_paths)}")
print(" - Bash commands restricted to allowlist (see security.py)") print(" - Bash commands restricted to allowlist (see security.py)")
if yolo_mode: if yolo_mode:
print(" - MCP servers: features (database) - YOLO MODE (no Playwright)") print(" - MCP servers: features (database) - YOLO MODE (no browser testing)")
else: else:
print(" - MCP servers: playwright (browser), features (database)") print(" - MCP servers: features (database)")
print(" - Project settings enabled (skills, commands, CLAUDE.md)") print(" - Project settings enabled (skills, commands, CLAUDE.md)")
print() print()
@@ -422,48 +338,19 @@ def create_client(
}, },
}, },
} }
if not yolo_mode:
# Include Playwright MCP server for browser automation (standard mode only)
# Browser and headless mode configurable via environment variables
browser = get_playwright_browser()
playwright_args = [
"@playwright/mcp@latest",
"--viewport-size", "1280x720",
"--browser", browser,
]
if get_playwright_headless():
playwright_args.append("--headless")
print(f" - Browser: {browser} (headless={get_playwright_headless()})")
# Browser isolation for parallel execution
# Each agent gets its own isolated browser context to prevent tab conflicts
if agent_id:
# Use --isolated for ephemeral browser context
# This creates a fresh, isolated context without persistent state
# Note: --isolated and --user-data-dir are mutually exclusive
playwright_args.append("--isolated")
print(f" - Browser isolation enabled for agent: {agent_id}")
mcp_servers["playwright"] = {
"command": "npx",
"args": playwright_args,
}
# Build environment overrides for API endpoint configuration # Build environment overrides for API endpoint configuration
# These override system env vars for the Claude CLI subprocess, # Uses get_effective_sdk_env() which reads provider settings from the database,
# allowing AutoForge to use alternative APIs (e.g., GLM) without # ensuring UI-configured alternative providers (GLM, Ollama, Kimi, Custom) propagate
# affecting the user's global Claude Code settings # correctly to the Claude CLI subprocess
sdk_env = {} from registry import get_effective_sdk_env
for var in API_ENV_VARS: sdk_env = get_effective_sdk_env()
value = os.getenv(var)
if value:
sdk_env[var] = value
# Detect alternative API mode (Ollama, GLM, or Vertex AI) # Detect alternative API mode (Ollama, GLM, or Vertex AI)
base_url = sdk_env.get("ANTHROPIC_BASE_URL", "") base_url = sdk_env.get("ANTHROPIC_BASE_URL", "")
is_vertex = sdk_env.get("CLAUDE_CODE_USE_VERTEX") == "1" is_vertex = sdk_env.get("CLAUDE_CODE_USE_VERTEX") == "1"
is_alternative_api = bool(base_url) or is_vertex is_alternative_api = bool(base_url) or is_vertex
is_ollama = "localhost:11434" in base_url or "127.0.0.1:11434" in base_url is_ollama = "localhost:11434" in base_url or "127.0.0.1:11434" in base_url
is_azure = "services.ai.azure.com" in base_url
model = convert_model_for_vertex(model) model = convert_model_for_vertex(model)
if sdk_env: if sdk_env:
print(f" - API overrides: {', '.join(sdk_env.keys())}") print(f" - API overrides: {', '.join(sdk_env.keys())}")
@@ -473,8 +360,10 @@ def create_client(
print(f" - Vertex AI Mode: Using GCP project '{project_id}' with model '{model}' in region '{region}'") print(f" - Vertex AI Mode: Using GCP project '{project_id}' with model '{model}' in region '{region}'")
elif is_ollama: elif is_ollama:
print(" - Ollama Mode: Using local models") print(" - Ollama Mode: Using local models")
elif is_azure:
print(f" - Azure Mode: Using {base_url}")
elif "ANTHROPIC_BASE_URL" in sdk_env: elif "ANTHROPIC_BASE_URL" in sdk_env:
print(f" - GLM Mode: Using {sdk_env['ANTHROPIC_BASE_URL']}") print(f" - Alternative API: Using {sdk_env['ANTHROPIC_BASE_URL']}")
# Create a wrapper for bash_security_hook that passes project_dir via context # Create a wrapper for bash_security_hook that passes project_dir via context
async def bash_hook_with_context(input_data, tool_use_id=None, context=None): async def bash_hook_with_context(input_data, tool_use_id=None, context=None):
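
Note on the API-mode detection in `create_client`: the new `is_azure` branch sits between the Ollama check and the generic fallback, so branch order decides which label is printed. A simplified sketch of that ordering, assuming the Vertex check comes first as the visible branches suggest; `describe_api_mode` is a hypothetical helper, and it tests the base URL for truthiness where the diff tests for key presence:

```python
def describe_api_mode(sdk_env: dict[str, str]) -> str:
    """Classify the provider mode from SDK env overrides (illustrative only)."""
    base_url = sdk_env.get("ANTHROPIC_BASE_URL", "")
    if sdk_env.get("CLAUDE_CODE_USE_VERTEX") == "1":
        return "vertex"
    if "localhost:11434" in base_url or "127.0.0.1:11434" in base_url:
        return "ollama"
    if "services.ai.azure.com" in base_url:
        return "azure"
    if base_url:
        return "alternative"
    return "default"


# Placeholder URLs; only the substrings being matched come from the diff.
modes = [
    describe_api_mode({}),
    describe_api_mode({"CLAUDE_CODE_USE_VERTEX": "1"}),
    describe_api_mode({"ANTHROPIC_BASE_URL": "http://localhost:11434"}),
    describe_api_mode({"ANTHROPIC_BASE_URL": "https://demo.services.ai.azure.com/v1"}),
    describe_api_mode({"ANTHROPIC_BASE_URL": "https://example.invalid/api"}),
]
```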


@@ -517,6 +517,41 @@ function killProcess(pid) {
} }
} }
// ---------------------------------------------------------------------------
// Playwright CLI
// ---------------------------------------------------------------------------
/**
* Ensure playwright-cli is available globally for browser automation.
* Returns true if available (already installed or freshly installed).
*
* @param {boolean} showProgress - If true, print install progress
*/
function ensurePlaywrightCli(showProgress) {
try {
execSync('playwright-cli --version', {
timeout: 10_000,
stdio: ['pipe', 'pipe', 'pipe'],
});
return true;
} catch {
// Not installed — try to install
}
if (showProgress) {
log(' Installing playwright-cli for browser automation...');
}
try {
execSync('npm install -g @playwright/cli', {
timeout: 120_000,
stdio: ['pipe', 'pipe', 'pipe'],
});
return true;
} catch {
return false;
}
}
// --------------------------------------------------------------------------- // ---------------------------------------------------------------------------
// CLI commands // CLI commands
// --------------------------------------------------------------------------- // ---------------------------------------------------------------------------
@@ -613,6 +648,14 @@ function startServer(opts) {
} }
const wasAlreadyReady = ensureVenv(python, repair); const wasAlreadyReady = ensureVenv(python, repair);
// Ensure playwright-cli for browser automation (quick check, installs once)
if (!ensurePlaywrightCli(!wasAlreadyReady)) {
log('');
log(' Note: playwright-cli not available (browser automation will be limited)');
log(' Install manually: npm install -g @playwright/cli');
log('');
}
// Step 3: Config file // Step 3: Config file
const configCreated = ensureEnvFile(); const configCreated = ensureEnvFile();


@@ -151,17 +151,20 @@ def feature_get_stats() -> str:
result = session.query( result = session.query(
func.count(Feature.id).label('total'), func.count(Feature.id).label('total'),
func.sum(case((Feature.passes == True, 1), else_=0)).label('passing'), func.sum(case((Feature.passes == True, 1), else_=0)).label('passing'),
func.sum(case((Feature.in_progress == True, 1), else_=0)).label('in_progress') func.sum(case((Feature.in_progress == True, 1), else_=0)).label('in_progress'),
func.sum(case((Feature.needs_human_input == True, 1), else_=0)).label('needs_human_input')
).first() ).first()
total = result.total or 0 total = result.total or 0
passing = int(result.passing or 0) passing = int(result.passing or 0)
in_progress = int(result.in_progress or 0) in_progress = int(result.in_progress or 0)
needs_human_input = int(result.needs_human_input or 0)
percentage = round((passing / total) * 100, 1) if total > 0 else 0.0 percentage = round((passing / total) * 100, 1) if total > 0 else 0.0
return json.dumps({ return json.dumps({
"passing": passing, "passing": passing,
"in_progress": in_progress, "in_progress": in_progress,
"needs_human_input": needs_human_input,
"total": total, "total": total,
"percentage": percentage "percentage": percentage
}) })
@@ -221,6 +224,7 @@ def feature_get_summary(
"name": feature.name, "name": feature.name,
"passes": feature.passes, "passes": feature.passes,
"in_progress": feature.in_progress, "in_progress": feature.in_progress,
"needs_human_input": feature.needs_human_input if feature.needs_human_input is not None else False,
"dependencies": feature.dependencies or [] "dependencies": feature.dependencies or []
}) })
finally: finally:
@@ -401,11 +405,11 @@ def feature_mark_in_progress(
""" """
session = get_session() session = get_session()
try: try:
# Atomic claim: only succeeds if feature is not already claimed or passing # Atomic claim: only succeeds if feature is not already claimed, passing, or blocked for human input
result = session.execute(text(""" result = session.execute(text("""
UPDATE features UPDATE features
SET in_progress = 1 SET in_progress = 1
WHERE id = :id AND passes = 0 AND in_progress = 0 WHERE id = :id AND passes = 0 AND in_progress = 0 AND needs_human_input = 0
"""), {"id": feature_id}) """), {"id": feature_id})
session.commit() session.commit()
@@ -418,6 +422,8 @@ def feature_mark_in_progress(
return json.dumps({"error": f"Feature with ID {feature_id} is already passing"}) return json.dumps({"error": f"Feature with ID {feature_id} is already passing"})
if feature.in_progress: if feature.in_progress:
return json.dumps({"error": f"Feature with ID {feature_id} is already in-progress"}) return json.dumps({"error": f"Feature with ID {feature_id} is already in-progress"})
if getattr(feature, 'needs_human_input', False):
return json.dumps({"error": f"Feature with ID {feature_id} is blocked waiting for human input"})
return json.dumps({"error": "Failed to mark feature in-progress for unknown reason"}) return json.dumps({"error": "Failed to mark feature in-progress for unknown reason"})
# Fetch the claimed feature # Fetch the claimed feature
@@ -455,11 +461,14 @@ def feature_claim_and_get(
if feature.passes: if feature.passes:
return json.dumps({"error": f"Feature with ID {feature_id} is already passing"}) return json.dumps({"error": f"Feature with ID {feature_id} is already passing"})
# Try atomic claim: only succeeds if not already claimed if getattr(feature, 'needs_human_input', False):
return json.dumps({"error": f"Feature with ID {feature_id} is blocked waiting for human input"})
# Try atomic claim: only succeeds if not already claimed and not blocked for human input
result = session.execute(text(""" result = session.execute(text("""
UPDATE features UPDATE features
SET in_progress = 1 SET in_progress = 1
WHERE id = :id AND passes = 0 AND in_progress = 0 WHERE id = :id AND passes = 0 AND in_progress = 0 AND needs_human_input = 0
"""), {"id": feature_id}) """), {"id": feature_id})
session.commit() session.commit()
@@ -806,6 +815,8 @@ def feature_get_ready(
for f in all_features: for f in all_features:
if f.passes or f.in_progress: if f.passes or f.in_progress:
continue continue
if getattr(f, 'needs_human_input', False):
continue
deps = f.dependencies or [] deps = f.dependencies or []
if all(dep_id in passing_ids for dep_id in deps): if all(dep_id in passing_ids for dep_id in deps):
ready.append(f.to_dict()) ready.append(f.to_dict())
@@ -888,6 +899,8 @@ def feature_get_graph() -> str:
if f.passes: if f.passes:
status = "done" status = "done"
elif getattr(f, 'needs_human_input', False):
status = "needs_human_input"
elif blocking: elif blocking:
status = "blocked" status = "blocked"
elif f.in_progress: elif f.in_progress:
@@ -984,5 +997,132 @@ def feature_set_dependencies(
return json.dumps({"error": f"Failed to set dependencies: {str(e)}"}) return json.dumps({"error": f"Failed to set dependencies: {str(e)}"})
@mcp.tool()
def feature_request_human_input(
feature_id: Annotated[int, Field(description="The ID of the feature that needs human input", ge=1)],
prompt: Annotated[str, Field(min_length=1, description="Explain what you need from the human and why")],
fields: Annotated[list[dict], Field(min_length=1, description="List of input fields to collect")]
) -> str:
"""Request structured input from a human for a feature that is blocked.
Use this ONLY when the feature genuinely cannot proceed without human intervention:
- Creating API keys or external accounts
- Choosing between design approaches that require human preference
- Configuring external services the agent cannot access
- Providing credentials or secrets
Do NOT use this for issues you can solve yourself (debugging, reading docs, etc.).
The feature will be moved out of in_progress and into a "needs human input" state.
Once the human provides their response, the feature returns to the pending queue
and will include the human's response when you pick it up again.
Args:
feature_id: The ID of the feature that needs human input
prompt: A clear explanation of what you need and why
fields: List of input fields, each with:
- id (str): Unique field identifier
- label (str): Human-readable label
- type (str): "text", "textarea", "select", or "boolean" (default: "text")
- required (bool): Whether the field is required (default: true)
- placeholder (str, optional): Placeholder text
- options (list, optional): For select type: [{value, label}]
Returns:
JSON with success confirmation or error message
"""
# Validate fields
VALID_FIELD_TYPES = {"text", "textarea", "select", "boolean"}
seen_ids: set[str] = set()
for i, field in enumerate(fields):
if "id" not in field or "label" not in field:
return json.dumps({"error": f"Field at index {i} missing required 'id' or 'label'"})
fid = field["id"]
flabel = field["label"]
if not isinstance(fid, str) or not fid.strip():
return json.dumps({"error": f"Field at index {i} has empty or invalid 'id'"})
if not isinstance(flabel, str) or not flabel.strip():
return json.dumps({"error": f"Field at index {i} has empty or invalid 'label'"})
if fid in seen_ids:
return json.dumps({"error": f"Duplicate field id '{fid}' at index {i}"})
seen_ids.add(fid)
ftype = field.get("type", "text")
if ftype not in VALID_FIELD_TYPES:
return json.dumps({"error": f"Field at index {i} has invalid type '{ftype}'. Must be one of: {', '.join(sorted(VALID_FIELD_TYPES))}"})
if ftype == "select" and not field.get("options"):
return json.dumps({"error": f"Field at index {i} is type 'select' but missing 'options' array"})
request_data = {
"prompt": prompt,
"fields": fields,
}
session = get_session()
try:
# Atomically set needs_human_input, clear in_progress, store request, clear previous response
result = session.execute(text("""
UPDATE features
SET needs_human_input = 1,
in_progress = 0,
human_input_request = :request,
human_input_response = NULL
WHERE id = :id AND passes = 0 AND in_progress = 1
"""), {"id": feature_id, "request": json.dumps(request_data)})
session.commit()
if result.rowcount == 0:
feature = session.query(Feature).filter(Feature.id == feature_id).first()
if feature is None:
return json.dumps({"error": f"Feature with ID {feature_id} not found"})
if feature.passes:
return json.dumps({"error": f"Feature with ID {feature_id} is already passing"})
if not feature.in_progress:
return json.dumps({"error": f"Feature with ID {feature_id} is not in progress"})
return json.dumps({"error": "Failed to request human input for unknown reason"})
feature = session.query(Feature).filter(Feature.id == feature_id).first()
return json.dumps({
"success": True,
"feature_id": feature_id,
"name": feature.name,
"message": f"Feature '{feature.name}' is now blocked waiting for human input"
})
except Exception as e:
session.rollback()
return json.dumps({"error": f"Failed to request human input: {str(e)}"})
finally:
session.close()
@mcp.tool()
def ask_user(
questions: Annotated[list[dict], Field(description="List of questions to ask, each with question, header, options (list of {label, description}), and multiSelect (bool)")]
) -> str:
"""Ask the user structured questions with selectable options.
Use this when you need clarification or want to offer choices to the user.
Each question has a short header, the question text, and 2-4 clickable options.
The user's selections will be returned as your next message.
Args:
questions: List of questions, each with:
- question (str): The question to ask
- header (str): Short label (max 12 chars)
- options (list): Each with label (str) and description (str)
- multiSelect (bool): Allow multiple selections (default false)
Returns:
Acknowledgment that questions were presented to the user
"""
# Validate input
for i, q in enumerate(questions):
if not all(key in q for key in ["question", "header", "options"]):
return json.dumps({"error": f"Question at index {i} missing required fields"})
if len(q["options"]) < 2 or len(q["options"]) > 4:
return json.dumps({"error": f"Question at index {i} must have 2-4 options"})
return "Questions presented to the user. Their response will arrive as your next message."
if __name__ == "__main__":
    mcp.run()
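The field checks added to `feature_request_human_input` can be tried out in isolation. The sketch below is a hypothetical standalone helper (`validate_fields` is not a name from the diff), and the contents of `VALID_FIELD_TYPES` are an assumption because its definition lies outside this hunk.

```python
import json
from typing import List, Optional

# Assumed values; the real VALID_FIELD_TYPES is defined elsewhere in the file.
VALID_FIELD_TYPES = {"text", "textarea", "select", "boolean"}


def validate_fields(fields: List[dict]) -> Optional[str]:
    """Return a JSON error string for the first invalid field, or None if all pass."""
    seen_ids = set()
    for i, field in enumerate(fields):
        fid = field.get("id")
        flabel = field.get("label")
        # IDs and labels must be non-empty strings
        if not isinstance(fid, str) or not fid.strip():
            return json.dumps({"error": f"Field at index {i} has empty or invalid 'id'"})
        if not isinstance(flabel, str) or not flabel.strip():
            return json.dumps({"error": f"Field at index {i} has empty or invalid 'label'"})
        # IDs must be unique across the request
        if fid in seen_ids:
            return json.dumps({"error": f"Duplicate field id '{fid}' at index {i}"})
        seen_ids.add(fid)
        # Type defaults to "text"; "select" additionally needs options
        ftype = field.get("type", "text")
        if ftype not in VALID_FIELD_TYPES:
            return json.dumps({"error": f"Field at index {i} has invalid type '{ftype}'"})
        if ftype == "select" and not field.get("options"):
            return json.dumps({"error": f"Field at index {i} is type 'select' but missing 'options' array"})
    return None
```

Returning on the first failure mirrors the early-return style of the tool itself, so the caller sees one actionable error at a time.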


@@ -1,6 +1,6 @@
 {
   "name": "autoforge-ai",
-  "version": "0.1.5",
+  "version": "0.1.12",
   "description": "Autonomous coding agent with web UI - build complete apps with AI",
   "license": "AGPL-3.0",
   "bin": {
@@ -19,6 +19,7 @@
     "ui/dist/",
     "ui/package.json",
     ".claude/commands/",
+    ".claude/skills/",
     ".claude/templates/",
     "examples/",
     "start.py",


@@ -194,6 +194,7 @@ class ParallelOrchestrator:
         # Legacy alias for backward compatibility
         self.running_agents = self.running_coding_agents
         self.abort_events: dict[int, threading.Event] = {}
+        self._testing_session_counter = 0
         self.is_running = False
         # Track feature failures to prevent infinite retry loops
@@ -212,6 +213,9 @@ class ParallelOrchestrator:
         # Signal handlers only set this flag; cleanup happens in the main loop
         self._shutdown_requested = False
+        # Graceful pause (drain mode) flag
+        self._drain_requested = False
         # Session tracking for logging/debugging
         self.session_start_time: datetime | None = None
@@ -492,6 +496,9 @@ class ParallelOrchestrator:
         for fd in feature_dicts:
             if not fd.get("in_progress") or fd.get("passes"):
                 continue
+            # Skip if blocked for human input
+            if fd.get("needs_human_input"):
+                continue
             # Skip if already running in this orchestrator instance
             if fd["id"] in running_ids:
                 continue
@@ -536,11 +543,14 @@ class ParallelOrchestrator:
             running_ids.update(batch_ids)
         ready = []
-        skipped_reasons = {"passes": 0, "in_progress": 0, "running": 0, "failed": 0, "deps": 0}
+        skipped_reasons = {"passes": 0, "in_progress": 0, "running": 0, "failed": 0, "deps": 0, "needs_human_input": 0}
         for fd in feature_dicts:
             if fd.get("passes"):
                 skipped_reasons["passes"] += 1
                 continue
+            if fd.get("needs_human_input"):
+                skipped_reasons["needs_human_input"] += 1
+                continue
             if fd.get("in_progress"):
                 skipped_reasons["in_progress"] += 1
                 continue
@@ -846,7 +856,7 @@ class ParallelOrchestrator:
             "encoding": "utf-8",
             "errors": "replace",
             "cwd": str(self.project_dir),  # Run from project dir so CLI creates .claude/ in project
-            "env": {**os.environ, "PYTHONUNBUFFERED": "1"},
+            "env": {**os.environ, "PYTHONUNBUFFERED": "1", "NODE_COMPILE_CACHE": "", "PLAYWRIGHT_CLI_SESSION": f"coding-{feature_id}"},
         }
         if sys.platform == "win32":
             popen_kwargs["creationflags"] = subprocess.CREATE_NO_WINDOW
@@ -909,7 +919,7 @@ class ParallelOrchestrator:
             "encoding": "utf-8",
             "errors": "replace",
             "cwd": str(self.project_dir),  # Run from project dir so CLI creates .claude/ in project
-            "env": {**os.environ, "PYTHONUNBUFFERED": "1"},
+            "env": {**os.environ, "PYTHONUNBUFFERED": "1", "NODE_COMPILE_CACHE": "", "PLAYWRIGHT_CLI_SESSION": f"coding-{primary_id}"},
         }
         if sys.platform == "win32":
             popen_kwargs["creationflags"] = subprocess.CREATE_NO_WINDOW
@@ -1013,8 +1023,9 @@ class ParallelOrchestrator:
             "encoding": "utf-8",
             "errors": "replace",
             "cwd": str(self.project_dir),  # Run from project dir so CLI creates .claude/ in project
-            "env": {**os.environ, "PYTHONUNBUFFERED": "1"},
+            "env": {**os.environ, "PYTHONUNBUFFERED": "1", "NODE_COMPILE_CACHE": "", "PLAYWRIGHT_CLI_SESSION": f"testing-{self._testing_session_counter}"},
         }
+        self._testing_session_counter += 1
         if sys.platform == "win32":
             popen_kwargs["creationflags"] = subprocess.CREATE_NO_WINDOW
@@ -1074,7 +1085,7 @@ class ParallelOrchestrator:
             "encoding": "utf-8",
             "errors": "replace",
             "cwd": str(AUTOFORGE_ROOT),
-            "env": {**os.environ, "PYTHONUNBUFFERED": "1"},
+            "env": {**os.environ, "PYTHONUNBUFFERED": "1", "NODE_COMPILE_CACHE": ""},
         }
         if sys.platform == "win32":
             popen_kwargs["creationflags"] = subprocess.CREATE_NO_WINDOW
@@ -1160,6 +1171,19 @@ class ParallelOrchestrator:
                 debug_log.log("CLEANUP", f"Error killing process tree for {agent_type} agent", error=str(e))
         self._on_agent_complete(feature_id, proc.returncode, agent_type, proc)
+    def _run_inter_session_cleanup(self):
+        """Run lightweight cleanup between agent sessions.
+
+        Removes stale temp files and project screenshots to prevent
+        disk space accumulation during long overnight runs.
+        """
+        try:
+            from temp_cleanup import cleanup_project_screenshots, cleanup_stale_temp
+            cleanup_stale_temp()
+            cleanup_project_screenshots(self.project_dir)
+        except Exception as e:
+            debug_log.log("CLEANUP", f"Inter-session cleanup failed (non-fatal): {e}")
+
     def _signal_agent_completed(self):
         """Signal that an agent has completed, waking the main loop.
@@ -1235,6 +1259,8 @@ class ParallelOrchestrator:
                           pid=proc.pid,
                           feature_id=feature_id,
                           status=status)
+            # Run lightweight cleanup between sessions
+            self._run_inter_session_cleanup()
             # Signal main loop that an agent slot is available
             self._signal_agent_completed()
             return
@@ -1301,6 +1327,8 @@ class ParallelOrchestrator:
         else:
             print(f"Feature #{feature_id} {status}", flush=True)
+        # Run lightweight cleanup between sessions
+        self._run_inter_session_cleanup()
         # Signal main loop that an agent slot is available
         self._signal_agent_completed()
@@ -1368,6 +1396,9 @@ class ParallelOrchestrator:
         # Must happen before any debug_log.log() calls
         debug_log.start_session()
+        # Clear any stale drain signal from a previous session
+        self._clear_drain_signal()
         # Log startup to debug file
         debug_log.section("ORCHESTRATOR STARTUP")
         debug_log.log("STARTUP", "Orchestrator run_loop starting",
@@ -1489,6 +1520,34 @@ class ParallelOrchestrator:
                     print("\nAll features complete!", flush=True)
                     break
+                # --- Graceful pause (drain mode) ---
+                if not self._drain_requested and self._check_drain_signal():
+                    self._drain_requested = True
+                    print("Graceful pause requested - draining running agents...", flush=True)
+                    debug_log.log("DRAIN", "Graceful pause requested, draining running agents")
+                if self._drain_requested:
+                    with self._lock:
+                        coding_count = len(self.running_coding_agents)
+                        testing_count = len(self.running_testing_agents)
+                    if coding_count == 0 and testing_count == 0:
+                        print("All agents drained - paused.", flush=True)
+                        debug_log.log("DRAIN", "All agents drained, entering paused state")
+                        # Wait until signal file is removed (resume) or shutdown
+                        while self._check_drain_signal() and self.is_running and not self._shutdown_requested:
+                            await asyncio.sleep(1)
+                        if not self.is_running or self._shutdown_requested:
+                            break
+                        self._drain_requested = False
+                        print("Resuming from graceful pause...", flush=True)
+                        debug_log.log("DRAIN", "Resuming from graceful pause")
+                        continue
+                    else:
+                        debug_log.log("DRAIN", f"Waiting for agents to finish: coding={coding_count}, testing={testing_count}")
+                        await self._wait_for_agent_completion()
+                        continue
                 # Maintain testing agents independently (runs every iteration)
                 self._maintain_testing_agents(feature_dicts)
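The drain logic above treats the presence of a signal file as "pause requested" and its removal as "resume". A minimal sketch of that file-as-flag pattern follows, assuming a hypothetical file name `.pause_drain` and a shortened poll interval; the orchestrator's real path comes from `get_pause_drain_path`, which this hunk does not show.

```python
import asyncio
from pathlib import Path


async def wait_while_paused(signal_file: Path, poll_seconds: float = 0.01) -> int:
    """Poll until the signal file disappears; return the number of polls performed."""
    polls = 0
    while signal_file.exists():
        polls += 1
        await asyncio.sleep(poll_seconds)
    return polls


async def demo(tmp_dir: Path) -> int:
    signal_file = tmp_dir / ".pause_drain"  # hypothetical file name
    signal_file.touch()  # creating the file requests a graceful pause

    async def resume_later() -> None:
        await asyncio.sleep(0.05)
        signal_file.unlink(missing_ok=True)  # removing the file resumes the loop

    resumer = asyncio.create_task(resume_later())
    polls = await wait_while_paused(signal_file)
    await resumer
    return polls
```

A file on disk lets a separate process (here, presumably the web server) pause and resume the orchestrator without sharing memory or sockets, at the cost of polling latency.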
@@ -1613,6 +1672,17 @@ class ParallelOrchestrator:
             "yolo_mode": self.yolo_mode,
         }
+    def _check_drain_signal(self) -> bool:
+        """Check if the graceful pause (drain) signal file exists."""
+        from autoforge_paths import get_pause_drain_path
+        return get_pause_drain_path(self.project_dir).exists()
+
+    def _clear_drain_signal(self) -> None:
+        """Delete the drain signal file and reset the flag."""
+        from autoforge_paths import get_pause_drain_path
+        get_pause_drain_path(self.project_dir).unlink(missing_ok=True)
+        self._drain_requested = False
+
     def cleanup(self) -> None:
         """Clean up database resources. Safe to call multiple times.


@@ -62,54 +62,71 @@ def has_features(project_dir: Path) -> bool:
         return False
-def count_passing_tests(project_dir: Path) -> tuple[int, int, int]:
+def count_passing_tests(project_dir: Path) -> tuple[int, int, int, int]:
     """
-    Count passing, in_progress, and total tests via direct database access.
+    Count passing, in_progress, total, and needs_human_input tests via direct database access.
     Args:
         project_dir: Directory containing the project
     Returns:
-        (passing_count, in_progress_count, total_count)
+        (passing_count, in_progress_count, total_count, needs_human_input_count)
     """
     from autoforge_paths import get_features_db_path
     db_file = get_features_db_path(project_dir)
     if not db_file.exists():
-        return 0, 0, 0
+        return 0, 0, 0, 0
     try:
         with closing(_get_connection(db_file)) as conn:
             cursor = conn.cursor()
-            # Single aggregate query instead of 3 separate COUNT queries
-            # Handle case where in_progress column doesn't exist yet (legacy DBs)
+            # Single aggregate query instead of separate COUNT queries
+            # Handle case where columns don't exist yet (legacy DBs)
             try:
                 cursor.execute("""
                     SELECT
                         COUNT(*) as total,
                         SUM(CASE WHEN passes = 1 THEN 1 ELSE 0 END) as passing,
-                        SUM(CASE WHEN in_progress = 1 THEN 1 ELSE 0 END) as in_progress
+                        SUM(CASE WHEN in_progress = 1 THEN 1 ELSE 0 END) as in_progress,
+                        SUM(CASE WHEN needs_human_input = 1 THEN 1 ELSE 0 END) as needs_human_input
                     FROM features
                 """)
                 row = cursor.fetchone()
                 total = row[0] or 0
                 passing = row[1] or 0
                 in_progress = row[2] or 0
+                needs_human_input = row[3] or 0
             except sqlite3.OperationalError:
-                # Fallback for databases without in_progress column
-                cursor.execute("""
-                    SELECT
-                        COUNT(*) as total,
-                        SUM(CASE WHEN passes = 1 THEN 1 ELSE 0 END) as passing
-                    FROM features
-                """)
-                row = cursor.fetchone()
-                total = row[0] or 0
-                passing = row[1] or 0
-                in_progress = 0
-            return passing, in_progress, total
+                # Fallback for databases without newer columns
+                try:
+                    cursor.execute("""
+                        SELECT
+                            COUNT(*) as total,
+                            SUM(CASE WHEN passes = 1 THEN 1 ELSE 0 END) as passing,
+                            SUM(CASE WHEN in_progress = 1 THEN 1 ELSE 0 END) as in_progress
+                        FROM features
+                    """)
+                    row = cursor.fetchone()
+                    total = row[0] or 0
+                    passing = row[1] or 0
+                    in_progress = row[2] or 0
+                    needs_human_input = 0
+                except sqlite3.OperationalError:
+                    cursor.execute("""
+                        SELECT
+                            COUNT(*) as total,
+                            SUM(CASE WHEN passes = 1 THEN 1 ELSE 0 END) as passing
+                        FROM features
+                    """)
+                    row = cursor.fetchone()
+                    total = row[0] or 0
+                    passing = row[1] or 0
+                    in_progress = 0
+                    needs_human_input = 0
+            return passing, in_progress, total, needs_human_input
     except Exception as e:
         print(f"[Database error in count_passing_tests: {e}]")
-        return 0, 0, 0
+        return 0, 0, 0, 0
 def get_all_passing_features(project_dir: Path) -> list[dict]:
def get_all_passing_features(project_dir: Path) -> list[dict]: def get_all_passing_features(project_dir: Path) -> list[dict]:
@@ -234,7 +251,7 @@ def print_session_header(session_num: int, is_initializer: bool) -> None:
 def print_progress_summary(project_dir: Path) -> None:
     """Print a summary of current progress."""
-    passing, in_progress, total = count_passing_tests(project_dir)
+    passing, in_progress, total, _needs_human_input = count_passing_tests(project_dir)
     if total > 0:
         percentage = (passing / total) * 100
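The nested `try`/`except sqlite3.OperationalError` fallback in `count_passing_tests` can also be written as a loop over progressively older schemas. The sketch below is an illustrative alternative, not the project's code: `count_features` is a hypothetical name, and it takes an open connection rather than a project directory.

```python
import sqlite3


def count_features(conn: sqlite3.Connection) -> tuple:
    """Return (passing, in_progress, total, needs_human_input), tolerating legacy schemas."""
    # Column sets to try, newest schema first.
    attempts = [
        ["passes", "in_progress", "needs_human_input"],
        ["passes", "in_progress"],
        ["passes"],
    ]
    for columns in attempts:
        sums = ", ".join(f"SUM(CASE WHEN {c} = 1 THEN 1 ELSE 0 END)" for c in columns)
        try:
            row = conn.execute(f"SELECT COUNT(*), {sums} FROM features").fetchone()
        except sqlite3.OperationalError:
            continue  # a column is missing; fall back to the next, older schema
        # SUM over zero rows yields NULL; columns absent from the schema count as 0
        values = [v or 0 for v in row] + [0] * (4 - len(row))
        total, passing, in_progress, needs_human_input = values
        return passing, in_progress, total, needs_human_input
    return 0, 0, 0, 0
```

The loop form avoids deepening the nesting each time a column is added, though the explicit nested version in the diff makes each fallback query easier to read on its own.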


@@ -16,6 +16,9 @@ from pathlib import Path
 # Base templates location (generic templates)
 TEMPLATES_DIR = Path(__file__).parent / ".claude" / "templates"
+# Migration version — bump when adding new migration steps
+CURRENT_MIGRATION_VERSION = 1
 def get_project_prompts_dir(project_dir: Path) -> Path:
     """Get the prompts directory for a specific project."""
@@ -99,9 +102,9 @@ def _strip_browser_testing_sections(prompt: str) -> str:
         flags=re.DOTALL,
     )
-    # Replace the screenshots-only marking rule with YOLO-appropriate wording
+    # Replace the marking rule with YOLO-appropriate wording
     prompt = prompt.replace(
-        "**ONLY MARK A FEATURE AS PASSING AFTER VERIFICATION WITH SCREENSHOTS.**",
+        "**ONLY MARK A FEATURE AS PASSING AFTER VERIFICATION WITH BROWSER AUTOMATION.**",
         "**YOLO mode: Mark a feature as passing after lint/type-check succeeds and server starts cleanly.**",
     )
@@ -351,9 +354,70 @@ def scaffold_project_prompts(project_dir: Path) -> Path:
         except (OSError, PermissionError) as e:
             print(f" Warning: Could not copy allowed_commands.yaml: {e}")
+    # Copy Playwright CLI skill for browser automation
+    skills_src = Path(__file__).parent / ".claude" / "skills" / "playwright-cli"
+    skills_dest = project_dir / ".claude" / "skills" / "playwright-cli"
+    if skills_src.exists() and not skills_dest.exists():
+        try:
+            shutil.copytree(skills_src, skills_dest)
+            copied_files.append(".claude/skills/playwright-cli/")
+        except (OSError, PermissionError) as e:
+            print(f" Warning: Could not copy playwright-cli skill: {e}")
+    # Ensure .playwright-cli/ and .playwright/ are in project .gitignore
+    project_gitignore = project_dir / ".gitignore"
+    entries_to_add = [".playwright-cli/", ".playwright/"]
+    existing_lines: list[str] = []
+    if project_gitignore.exists():
+        try:
+            existing_lines = project_gitignore.read_text(encoding="utf-8").splitlines()
+        except (OSError, PermissionError):
+            pass
+    missing_entries = [e for e in entries_to_add if e not in existing_lines]
+    if missing_entries:
+        try:
+            with open(project_gitignore, "a", encoding="utf-8") as f:
+                # Add newline before entries if file doesn't end with one
+                if existing_lines and existing_lines[-1].strip():
+                    f.write("\n")
+                for entry in missing_entries:
+                    f.write(f"{entry}\n")
+        except (OSError, PermissionError) as e:
+            print(f" Warning: Could not update .gitignore: {e}")
+    # Scaffold .playwright/cli.config.json for browser settings
+    playwright_config_dir = project_dir / ".playwright"
+    playwright_config_file = playwright_config_dir / "cli.config.json"
+    if not playwright_config_file.exists():
+        try:
+            playwright_config_dir.mkdir(parents=True, exist_ok=True)
+            import json
+            config = {
+                "browser": {
+                    "browserName": "chromium",
+                    "launchOptions": {
+                        "channel": "chrome",
+                        "headless": True,
+                    },
+                    "contextOptions": {
+                        "viewport": {"width": 1280, "height": 720},
+                    },
+                    "isolated": True,
+                },
+            }
+            with open(playwright_config_file, "w", encoding="utf-8") as f:
+                json.dump(config, f, indent=2)
+                f.write("\n")
+            copied_files.append(".playwright/cli.config.json")
+        except (OSError, PermissionError) as e:
+            print(f" Warning: Could not create playwright config: {e}")
     if copied_files:
         print(f" Created project files: {', '.join(copied_files)}")
+    # Stamp new projects at the current migration version so they never trigger migration
+    _set_migration_version(project_dir, CURRENT_MIGRATION_VERSION)
     return project_prompts
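The `.gitignore` update above is meant to be safe to repeat: it appends only entries that are not already present. A minimal sketch of that idempotent append, using a hypothetical helper name `ensure_gitignore_entries` that does not appear in the diff:

```python
from pathlib import Path
from typing import List


def ensure_gitignore_entries(gitignore: Path, entries: List[str]) -> List[str]:
    """Append entries missing from the file; return the entries that were added."""
    existing: List[str] = []
    if gitignore.exists():
        existing = gitignore.read_text(encoding="utf-8").splitlines()
    missing = [e for e in entries if e not in existing]
    if missing:
        with open(gitignore, "a", encoding="utf-8") as f:
            # Write a newline first whenever the last existing line has content
            if existing and existing[-1].strip():
                f.write("\n")
            for entry in missing:
                f.write(f"{entry}\n")
    return missing
```

Because `splitlines()` discards the trailing newline, the separator check fires for any file whose last line is non-blank, so a file that already ends in a newline gains one blank line before the new entries. That is harmless in a `.gitignore`.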
@@ -425,3 +489,330 @@ def copy_spec_to_project(project_dir: Path) -> None:
             return
     print("Warning: No app_spec.txt found to copy to project directory")
# ---------------------------------------------------------------------------
# Project version migration
# ---------------------------------------------------------------------------
# Replacement content: coding_prompt.md STEP 5 section (Playwright CLI)
_CLI_STEP5_CONTENT = """\
### STEP 5: VERIFY WITH BROWSER AUTOMATION
**CRITICAL:** You MUST verify features through the actual UI.
Use `playwright-cli` for browser automation:
- Open the browser: `playwright-cli open http://localhost:PORT`
- Take a snapshot to see page elements: `playwright-cli snapshot`
- Read the snapshot YAML file to see element refs
- Click elements by ref: `playwright-cli click e5`
- Type text: `playwright-cli type "search query"`
- Fill form fields: `playwright-cli fill e3 "value"`
- Take screenshots: `playwright-cli screenshot`
- Read the screenshot file to verify visual appearance
- Check console errors: `playwright-cli console`
- Close browser when done: `playwright-cli close`
**Token-efficient workflow:** `playwright-cli screenshot` and `snapshot` save files
to `.playwright-cli/`. You will see a file link in the output. Read the file only
when you need to verify visual appearance or find element refs.
**DO:**
- Test through the UI with clicks and keyboard input
- Take screenshots and read them to verify visual appearance
- Check for console errors with `playwright-cli console`
- Verify complete user workflows end-to-end
- Always run `playwright-cli close` when finished testing
**DON'T:**
- Only test with curl commands
- Use JavaScript evaluation to bypass UI (`eval` and `run-code` are blocked)
- Skip visual verification
- Mark tests passing without thorough verification
"""
# Replacement content: coding_prompt.md BROWSER AUTOMATION reference section
_CLI_BROWSER_SECTION = """\
## BROWSER AUTOMATION
Use `playwright-cli` commands for UI verification. Key commands: `open`, `goto`,
`snapshot`, `click`, `type`, `fill`, `screenshot`, `console`, `close`.
**How it works:** `playwright-cli` uses a persistent browser daemon. `open` starts it,
subsequent commands interact via socket, `close` shuts it down. Screenshots and snapshots
save to `.playwright-cli/` -- read the files when you need to verify content.
Test like a human user with mouse and keyboard. Use `playwright-cli console` to detect
JS errors. Don't bypass UI with JavaScript evaluation.
"""
# Replacement content: testing_prompt.md STEP 2 section (Playwright CLI)
_CLI_TESTING_STEP2 = """\
### STEP 2: VERIFY THE FEATURE
**CRITICAL:** You MUST verify the feature through the actual UI using browser automation.
For the feature returned:
1. Read and understand the feature's verification steps
2. Navigate to the relevant part of the application
3. Execute each verification step using browser automation
4. Take screenshots and read them to verify visual appearance
5. Check for console errors
### Browser Automation (Playwright CLI)
**Navigation & Screenshots:**
- `playwright-cli open <url>` - Open browser and navigate
- `playwright-cli goto <url>` - Navigate to URL
- `playwright-cli screenshot` - Save screenshot to `.playwright-cli/`
- `playwright-cli snapshot` - Save page snapshot with element refs to `.playwright-cli/`
**Element Interaction:**
- `playwright-cli click <ref>` - Click elements (ref from snapshot)
- `playwright-cli type <text>` - Type text
- `playwright-cli fill <ref> <text>` - Fill form fields
- `playwright-cli select <ref> <val>` - Select dropdown
- `playwright-cli press <key>` - Keyboard input
**Debugging:**
- `playwright-cli console` - Check for JS errors
- `playwright-cli network` - Monitor API calls
**Cleanup:**
- `playwright-cli close` - Close browser when done (ALWAYS do this)
**Note:** Screenshots and snapshots save to files. Read the file to see the content.
"""
# Replacement content: testing_prompt.md AVAILABLE TOOLS browser subsection
_CLI_TESTING_TOOLS = """\
### Browser Automation (Playwright CLI)
Use `playwright-cli` commands for browser interaction. Key commands:
- `playwright-cli open <url>` - Open browser
- `playwright-cli goto <url>` - Navigate to URL
- `playwright-cli screenshot` - Take screenshot (saved to `.playwright-cli/`)
- `playwright-cli snapshot` - Get page snapshot with element refs
- `playwright-cli click <ref>` - Click element
- `playwright-cli type <text>` - Type text
- `playwright-cli fill <ref> <text>` - Fill form field
- `playwright-cli console` - Check for JS errors
- `playwright-cli close` - Close browser (always do this when done)
"""
def _get_migration_version(project_dir: Path) -> int:
"""Read the migration version from .autoforge/.migration_version."""
from autoforge_paths import get_autoforge_dir
version_file = get_autoforge_dir(project_dir) / ".migration_version"
if not version_file.exists():
return 0
try:
return int(version_file.read_text().strip())
except (ValueError, OSError):
return 0
def _set_migration_version(project_dir: Path, version: int) -> None:
"""Write the migration version to .autoforge/.migration_version."""
from autoforge_paths import get_autoforge_dir
version_file = get_autoforge_dir(project_dir) / ".migration_version"
version_file.parent.mkdir(parents=True, exist_ok=True)
version_file.write_text(str(version))
def _migrate_coding_prompt_to_cli(content: str) -> str:
"""Replace MCP-based Playwright sections with CLI-based content in coding prompt."""
# Replace STEP 5 section (from header to just before STEP 5.5)
content = re.sub(
r"### STEP 5: VERIFY WITH BROWSER AUTOMATION.*?(?=### STEP 5\.5:)",
_CLI_STEP5_CONTENT,
content,
count=1,
flags=re.DOTALL,
)
# Replace BROWSER AUTOMATION reference section (from header to next ---)
content = re.sub(
r"## BROWSER AUTOMATION\n\n.*?(?=---)",
_CLI_BROWSER_SECTION,
content,
count=1,
flags=re.DOTALL,
)
# Replace inline screenshot rule
content = content.replace(
"**ONLY MARK A FEATURE AS PASSING AFTER VERIFICATION WITH SCREENSHOTS.**",
"**ONLY MARK A FEATURE AS PASSING AFTER VERIFICATION WITH BROWSER AUTOMATION.**",
)
# Replace inline screenshot references (various phrasings from old templates)
for old_phrase in (
"(inline only -- do NOT save to disk)",
"(inline only, never save to disk)",
"(inline mode only -- never save to disk)",
):
content = content.replace(old_phrase, "(saved to `.playwright-cli/`)")
return content
def _migrate_testing_prompt_to_cli(content: str) -> str:
"""Replace MCP-based Playwright sections with CLI-based content in testing prompt."""
# Replace AVAILABLE TOOLS browser subsection FIRST (before STEP 2, to avoid
# matching the new CLI subsection header that the STEP 2 replacement inserts).
# In old prompts, ### Browser Automation (Playwright) only exists in AVAILABLE TOOLS.
content = re.sub(
r"### Browser Automation \(Playwright[^)]*\)\n.*?(?=---)",
_CLI_TESTING_TOOLS,
content,
count=1,
flags=re.DOTALL,
)
# Replace STEP 2 verification section (from header to just before STEP 3)
content = re.sub(
r"### STEP 2: VERIFY THE FEATURE.*?(?=### STEP 3:)",
_CLI_TESTING_STEP2,
content,
count=1,
flags=re.DOTALL,
)
# Replace inline screenshot references (various phrasings from old templates)
for old_phrase in (
"(inline only -- do NOT save to disk)",
"(inline only, never save to disk)",
"(inline mode only -- never save to disk)",
):
content = content.replace(old_phrase, "(saved to `.playwright-cli/`)")
return content
def _migrate_v0_to_v1(project_dir: Path) -> list[str]:
"""Migrate from v0 (MCP-based Playwright) to v1 (Playwright CLI).
Four idempotent sub-steps:
A. Copy playwright-cli skill to project
B. Scaffold .playwright/cli.config.json
C. Update .gitignore with .playwright-cli/ and .playwright/
D. Update coding_prompt.md and testing_prompt.md
"""
import json
migrated: list[str] = []
# A. Copy Playwright CLI skill
skills_src = Path(__file__).parent / ".claude" / "skills" / "playwright-cli"
skills_dest = project_dir / ".claude" / "skills" / "playwright-cli"
if skills_src.exists() and not skills_dest.exists():
try:
shutil.copytree(skills_src, skills_dest)
migrated.append("Copied playwright-cli skill")
except (OSError, PermissionError) as e:
print(f" Warning: Could not copy playwright-cli skill: {e}")
# B. Scaffold .playwright/cli.config.json
playwright_config_dir = project_dir / ".playwright"
playwright_config_file = playwright_config_dir / "cli.config.json"
if not playwright_config_file.exists():
try:
playwright_config_dir.mkdir(parents=True, exist_ok=True)
config = {
"browser": {
"browserName": "chromium",
"launchOptions": {
"channel": "chrome",
"headless": True,
},
"contextOptions": {
"viewport": {"width": 1280, "height": 720},
},
"isolated": True,
},
}
with open(playwright_config_file, "w", encoding="utf-8") as f:
json.dump(config, f, indent=2)
f.write("\n")
migrated.append("Created .playwright/cli.config.json")
except (OSError, PermissionError) as e:
print(f" Warning: Could not create playwright config: {e}")
# C. Update .gitignore
project_gitignore = project_dir / ".gitignore"
entries_to_add = [".playwright-cli/", ".playwright/"]
existing_lines: list[str] = []
if project_gitignore.exists():
try:
existing_lines = project_gitignore.read_text(encoding="utf-8").splitlines()
except (OSError, PermissionError):
pass
missing_entries = [e for e in entries_to_add if e not in existing_lines]
if missing_entries:
try:
with open(project_gitignore, "a", encoding="utf-8") as f:
if existing_lines and existing_lines[-1].strip():
f.write("\n")
for entry in missing_entries:
f.write(f"{entry}\n")
migrated.append(f"Added {', '.join(missing_entries)} to .gitignore")
except (OSError, PermissionError) as e:
print(f" Warning: Could not update .gitignore: {e}")
# D. Update prompts
prompts_dir = get_project_prompts_dir(project_dir)
# D1. Update coding_prompt.md
coding_prompt_path = prompts_dir / "coding_prompt.md"
if coding_prompt_path.exists():
try:
content = coding_prompt_path.read_text(encoding="utf-8")
if "Playwright MCP" in content or "browser_navigate" in content or "browser_take_screenshot" in content:
updated = _migrate_coding_prompt_to_cli(content)
if updated != content:
coding_prompt_path.write_text(updated, encoding="utf-8")
migrated.append("Updated coding_prompt.md to Playwright CLI")
except (OSError, PermissionError) as e:
print(f" Warning: Could not update coding_prompt.md: {e}")
# D2. Update testing_prompt.md
testing_prompt_path = prompts_dir / "testing_prompt.md"
if testing_prompt_path.exists():
try:
content = testing_prompt_path.read_text(encoding="utf-8")
if "browser_navigate" in content or "browser_take_screenshot" in content:
updated = _migrate_testing_prompt_to_cli(content)
if updated != content:
testing_prompt_path.write_text(updated, encoding="utf-8")
migrated.append("Updated testing_prompt.md to Playwright CLI")
except (OSError, PermissionError) as e:
print(f" Warning: Could not update testing_prompt.md: {e}")
return migrated
def migrate_project_to_current(project_dir: Path) -> list[str]:
"""Migrate an existing project to the current AutoForge version.
Idempotent — safe to call on every agent start. Returns list of
human-readable descriptions of what was migrated.
"""
current = _get_migration_version(project_dir)
if current >= CURRENT_MIGRATION_VERSION:
return []
migrated: list[str] = []
if current < 1:
migrated.extend(_migrate_v0_to_v1(project_dir))
# Future: if current < 2: migrated.extend(_migrate_v1_to_v2(project_dir))
_set_migration_version(project_dir, CURRENT_MIGRATION_VERSION)
return migrated
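The version-stamp scheme in `migrate_project_to_current` generalizes to any number of steps. The following is a sketch under assumptions, not the project's implementation: `migrate`, `read_version`, and the `steps` mapping are hypothetical names, and the version file path is passed in directly instead of being derived from `get_autoforge_dir`.

```python
from pathlib import Path
from typing import Callable, Dict, List

CURRENT_VERSION = 1  # plays the role of CURRENT_MIGRATION_VERSION


def read_version(version_file: Path) -> int:
    """Treat a missing or unreadable version file as version 0."""
    try:
        return int(version_file.read_text().strip())
    except (ValueError, OSError):
        return 0


def migrate(version_file: Path, steps: Dict[int, Callable[[], List[str]]]) -> List[str]:
    """Run every step above the stored version, then stamp the current version."""
    current = read_version(version_file)
    if current >= CURRENT_VERSION:
        return []  # already up to date, nothing to do
    done: List[str] = []
    for target in range(current + 1, CURRENT_VERSION + 1):
        done.extend(steps[target]())  # steps[1] migrates v0 -> v1, and so on
    version_file.parent.mkdir(parents=True, exist_ok=True)
    version_file.write_text(str(CURRENT_VERSION))
    return done
```

Stamping the version only after all steps have run means an interrupted migration is retried on the next start, which is why each sub-step in `_migrate_v0_to_v1` is written to be idempotent.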


@@ -676,6 +676,18 @@ API_PROVIDERS: dict[str, dict[str, Any]] = {
         ],
         "default_model": "glm-4.7",
     },
+    "azure": {
+        "name": "Azure Anthropic (Claude)",
+        "base_url": "",
+        "requires_auth": True,
+        "auth_env_var": "ANTHROPIC_API_KEY",
+        "models": [
+            {"id": "claude-opus-4-6", "name": "Claude Opus"},
+            {"id": "claude-sonnet-4-5", "name": "Claude Sonnet"},
+            {"id": "claude-haiku-4-5", "name": "Claude Haiku"},
+        ],
+        "default_model": "claude-opus-4-6",
+    },
     "ollama": {
         "name": "Ollama (Local)",
         "base_url": "http://localhost:11434",
@@ -733,6 +745,21 @@ def get_effective_sdk_env() -> dict[str, str]:
     sdk_env = {}
+    # Explicitly clear credentials that could leak from the server process env.
+    # For providers using ANTHROPIC_AUTH_TOKEN (GLM, Custom), clear ANTHROPIC_API_KEY.
+    # For providers using ANTHROPIC_API_KEY (Kimi), clear ANTHROPIC_AUTH_TOKEN.
+    # This prevents the Claude CLI from using the wrong credentials.
+    auth_env_var = provider.get("auth_env_var", "ANTHROPIC_AUTH_TOKEN")
+    if auth_env_var == "ANTHROPIC_AUTH_TOKEN":
+        sdk_env["ANTHROPIC_API_KEY"] = ""
+    elif auth_env_var == "ANTHROPIC_API_KEY":
+        sdk_env["ANTHROPIC_AUTH_TOKEN"] = ""
+    # Clear Vertex AI vars when using non-Vertex alternative providers
+    sdk_env["CLAUDE_CODE_USE_VERTEX"] = ""
+    sdk_env["CLOUD_ML_REGION"] = ""
+    sdk_env["ANTHROPIC_VERTEX_PROJECT_ID"] = ""
     # Base URL
     base_url = all_settings.get("api_base_url") or provider.get("base_url")
     if base_url:
@@ -741,7 +768,6 @@ def get_effective_sdk_env() -> dict[str, str]:
# Auth token # Auth token
auth_token = all_settings.get("api_auth_token") auth_token = all_settings.get("api_auth_token")
if auth_token: if auth_token:
-auth_env_var = provider.get("auth_env_var", "ANTHROPIC_AUTH_TOKEN")
sdk_env[auth_env_var] = auth_token sdk_env[auth_env_var] = auth_token
# Model - set all three tier overrides to the same model # Model - set all three tier overrides to the same model
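Review note: the credential-clearing logic added to `get_effective_sdk_env` can be isolated as below. This is a sketch only; the helper names `clear_conflicting_credentials` and `build_sdk_env` are hypothetical, and the provider dictionaries are trimmed to the one key the logic reads.

```python
from typing import Optional

VERTEX_VARS = ("CLAUDE_CODE_USE_VERTEX", "CLOUD_ML_REGION", "ANTHROPIC_VERTEX_PROJECT_ID")


def clear_conflicting_credentials(auth_env_var: str) -> dict[str, str]:
    """Blank out the credential variable the chosen provider does NOT use."""
    sdk_env: dict[str, str] = {}
    if auth_env_var == "ANTHROPIC_AUTH_TOKEN":
        sdk_env["ANTHROPIC_API_KEY"] = ""
    elif auth_env_var == "ANTHROPIC_API_KEY":
        sdk_env["ANTHROPIC_AUTH_TOKEN"] = ""
    # Vertex AI variables are cleared for every alternative provider.
    for name in VERTEX_VARS:
        sdk_env[name] = ""
    return sdk_env


def build_sdk_env(provider: dict, auth_token: Optional[str]) -> dict[str, str]:
    """Resolve the auth variable once, clear its counterpart, then set the token."""
    auth_env_var = provider.get("auth_env_var", "ANTHROPIC_AUTH_TOKEN")
    env = clear_conflicting_credentials(auth_env_var)
    if auth_token:
        env[auth_env_var] = auth_token
    return env


azure_env = build_sdk_env({"auth_env_var": "ANTHROPIC_API_KEY"}, "example-key")
default_env = build_sdk_env({}, "example-token")
```

Resolving `auth_env_var` once near the top is also why the later lookup in the auth-token block could be removed in this hunk.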


@@ -66,10 +66,12 @@ ALLOWED_COMMANDS = {
"bash", "bash",
# Script execution # Script execution
"init.sh", # Init scripts; validated separately "init.sh", # Init scripts; validated separately
# Browser automation
"playwright-cli", # Playwright CLI for browser testing; validated separately
} }
# Commands that need additional validation even when in the allowlist # Commands that need additional validation even when in the allowlist
-COMMANDS_NEEDING_EXTRA_VALIDATION = {"pkill", "chmod", "init.sh"}
+COMMANDS_NEEDING_EXTRA_VALIDATION = {"pkill", "chmod", "init.sh", "playwright-cli"}
# Commands that are NEVER allowed, even with user approval # Commands that are NEVER allowed, even with user approval
# These commands can cause permanent system damage or security breaches # These commands can cause permanent system damage or security breaches
@@ -438,6 +440,37 @@ def validate_init_script(command_string: str) -> tuple[bool, str]:
return False, f"Only ./init.sh is allowed, got: {script}" return False, f"Only ./init.sh is allowed, got: {script}"
def validate_playwright_command(command_string: str) -> tuple[bool, str]:
"""
Validate playwright-cli commands - block dangerous subcommands.
Blocks `run-code` (arbitrary Node.js execution) and `eval` (arbitrary JS
evaluation) which bypass the security sandbox.
Returns:
Tuple of (is_allowed, reason_if_blocked)
"""
try:
tokens = shlex.split(command_string)
except ValueError:
return False, "Could not parse playwright-cli command"
if not tokens:
return False, "Empty command"
BLOCKED_SUBCOMMANDS = {"run-code", "eval"}
# Find the subcommand: first non-flag token after 'playwright-cli'
for token in tokens[1:]:
if token.startswith("-"):
continue # skip flags like -s=agent-1
if token in BLOCKED_SUBCOMMANDS:
return False, f"playwright-cli '{token}' is not allowed"
break # first non-flag token is the subcommand
return True, ""
def matches_pattern(command: str, pattern: str) -> bool: def matches_pattern(command: str, pattern: str) -> bool:
""" """
Check if a command matches a pattern. Check if a command matches a pattern.
@@ -955,5 +988,9 @@ async def bash_security_hook(input_data, tool_use_id=None, context=None):
allowed, reason = validate_init_script(cmd_segment) allowed, reason = validate_init_script(cmd_segment)
if not allowed: if not allowed:
return {"decision": "block", "reason": reason} return {"decision": "block", "reason": reason}
elif cmd == "playwright-cli":
allowed, reason = validate_playwright_command(cmd_segment)
if not allowed:
return {"decision": "block", "reason": reason}
return {} return {}
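Review note: the behaviour of `validate_playwright_command` is easiest to judge against concrete inputs. The function below follows the added code; the sample commands and the URL are illustrative, and the space-separated `-s agent-1` form is an assumption about what `playwright-cli` might accept (only `-s=agent-1` appears in the diff).

```python
import shlex

BLOCKED_SUBCOMMANDS = {"run-code", "eval"}


def validate_playwright_command(command_string: str) -> tuple[bool, str]:
    """Return (is_allowed, reason_if_blocked) for a playwright-cli invocation."""
    try:
        tokens = shlex.split(command_string)
    except ValueError:
        return False, "Could not parse playwright-cli command"
    if not tokens:
        return False, "Empty command"
    # The first non-flag token after the program name is treated as the subcommand.
    for token in tokens[1:]:
        if token.startswith("-"):
            continue
        if token in BLOCKED_SUBCOMMANDS:
            return False, f"playwright-cli '{token}' is not allowed"
        break
    return True, ""


allowed_open = validate_playwright_command("playwright-cli open http://localhost:3000")
blocked_eval = validate_playwright_command("playwright-cli -s=agent-1 eval document.title")
blocked_run = validate_playwright_command("playwright-cli run-code 'process.exit()'")
unparsable = validate_playwright_command("playwright-cli open 'unterminated")
# Caveat: a flag value passed as its own token is mistaken for the subcommand.
separate_flag_value = validate_playwright_command("playwright-cli -s agent-1 eval document.title")
```

If `playwright-cli` does accept flag values as separate tokens, the last case shows a way past the check, since `agent-1` is taken as the subcommand and `eval` is never inspected. Scanning every non-flag token, or rejecting unknown flag forms, would close that gap.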


@@ -61,6 +61,17 @@ UI_DIST_DIR = ROOT_DIR / "ui" / "dist"
@asynccontextmanager @asynccontextmanager
async def lifespan(app: FastAPI): async def lifespan(app: FastAPI):
"""Lifespan context manager for startup and shutdown.""" """Lifespan context manager for startup and shutdown."""
# Startup - clean up stale temp files (Playwright profiles, .node cache, etc.)
try:
from temp_cleanup import cleanup_stale_temp
stats = cleanup_stale_temp()
if stats["dirs_deleted"] > 0 or stats["files_deleted"] > 0:
mb_freed = stats["bytes_freed"] / (1024 * 1024)
logger.info("Startup temp cleanup: %d dirs, %d files, %.1f MB freed",
stats["dirs_deleted"], stats["files_deleted"], mb_freed)
except Exception as e:
logger.warning("Startup temp cleanup failed (non-fatal): %s", e)
# Startup - clean up orphaned lock files from previous runs # Startup - clean up orphaned lock files from previous runs
cleanup_orphaned_locks() cleanup_orphaned_locks()
cleanup_orphaned_devserver_locks() cleanup_orphaned_devserver_locks()


@@ -175,3 +175,31 @@ async def resume_agent(project_name: str):
status=manager.status, status=manager.status,
message=message, message=message,
) )
@router.post("/graceful-pause", response_model=AgentActionResponse)
async def graceful_pause_agent(project_name: str):
"""Request a graceful pause (drain mode) - finish current work then pause."""
manager = get_project_manager(project_name)
success, message = await manager.graceful_pause()
return AgentActionResponse(
success=success,
status=manager.status,
message=message,
)
@router.post("/graceful-resume", response_model=AgentActionResponse)
async def graceful_resume_agent(project_name: str):
"""Resume from a graceful pause."""
manager = get_project_manager(project_name)
success, message = await manager.graceful_resume()
return AgentActionResponse(
success=success,
status=manager.status,
message=message,
)


@@ -207,12 +207,14 @@ async def assistant_chat_websocket(websocket: WebSocket, project_name: str):
Client -> Server: Client -> Server:
- {"type": "start", "conversation_id": int | null} - Start/resume session - {"type": "start", "conversation_id": int | null} - Start/resume session
- {"type": "message", "content": "..."} - Send user message - {"type": "message", "content": "..."} - Send user message
- {"type": "answer", "answers": {...}} - Answer to structured questions
- {"type": "ping"} - Keep-alive ping - {"type": "ping"} - Keep-alive ping
Server -> Client: Server -> Client:
- {"type": "conversation_created", "conversation_id": int} - New conversation created - {"type": "conversation_created", "conversation_id": int} - New conversation created
- {"type": "text", "content": "..."} - Text chunk from Claude - {"type": "text", "content": "..."} - Text chunk from Claude
- {"type": "tool_call", "tool": "...", "input": {...}} - Tool being called - {"type": "tool_call", "tool": "...", "input": {...}} - Tool being called
- {"type": "question", "questions": [...]} - Structured questions for user
- {"type": "response_done"} - Response complete - {"type": "response_done"} - Response complete
- {"type": "error", "content": "..."} - Error message - {"type": "error", "content": "..."} - Error message
- {"type": "pong"} - Keep-alive pong - {"type": "pong"} - Keep-alive pong
@@ -303,6 +305,34 @@ async def assistant_chat_websocket(websocket: WebSocket, project_name: str):
async for chunk in session.send_message(user_content): async for chunk in session.send_message(user_content):
await websocket.send_json(chunk) await websocket.send_json(chunk)
elif msg_type == "answer":
# User answered a structured question
if not session:
session = get_session(project_name)
if not session:
await websocket.send_json({
"type": "error",
"content": "No active session. Send 'start' first."
})
continue
# Format the answers as a natural response
answers = message.get("answers", {})
if isinstance(answers, dict):
response_parts = []
for question_idx, answer_value in answers.items():
if isinstance(answer_value, list):
response_parts.append(", ".join(answer_value))
else:
response_parts.append(str(answer_value))
user_response = "; ".join(response_parts) if response_parts else "OK"
else:
user_response = str(answers)
# Stream Claude's response
async for chunk in session.send_message(user_response):
await websocket.send_json(chunk)
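Review note: the answer-flattening step in the `answer` branch can be pulled out and checked on its own. `format_answers` is a hypothetical helper name; the logic mirrors the added lines, and the sample payloads are illustrative.

```python
def format_answers(answers: object) -> str:
    """Flatten an 'answer' payload into the text sent on to the session."""
    if isinstance(answers, dict):
        response_parts = []
        for _question_idx, answer_value in answers.items():
            if isinstance(answer_value, list):
                # Multi-select answers are joined with commas.
                response_parts.append(", ".join(answer_value))
            else:
                response_parts.append(str(answer_value))
        # Separate questions with semicolons; fall back to "OK" when empty.
        return "; ".join(response_parts) if response_parts else "OK"
    return str(answers)


single_and_multi = format_answers({"0": "React", "1": ["Auth", "Billing"]})
empty_payload = format_answers({})
non_dict_payload = format_answers("yes")
```

As written in the diff, `", ".join(answer_value)` raises `TypeError` when a list contains non-string items, so mapping each item through `str()` first may be worth considering.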
else: else:
await websocket.send_json({ await websocket.send_json({
"type": "error", "type": "error",


@@ -23,6 +23,7 @@ from ..schemas import (
FeatureListResponse, FeatureListResponse,
FeatureResponse, FeatureResponse,
FeatureUpdate, FeatureUpdate,
HumanInputResponse,
) )
from ..utils.project_helpers import get_project_path as _get_project_path from ..utils.project_helpers import get_project_path as _get_project_path
from ..utils.validation import validate_project_name from ..utils.validation import validate_project_name
@@ -104,6 +105,9 @@ def feature_to_response(f, passing_ids: set[int] | None = None) -> FeatureRespon
in_progress=f.in_progress if f.in_progress is not None else False, in_progress=f.in_progress if f.in_progress is not None else False,
blocked=blocked, blocked=blocked,
blocking_dependencies=blocking, blocking_dependencies=blocking,
needs_human_input=getattr(f, 'needs_human_input', False) or False,
human_input_request=getattr(f, 'human_input_request', None),
human_input_response=getattr(f, 'human_input_response', None),
) )
@@ -143,11 +147,14 @@ async def list_features(project_name: str):
pending = [] pending = []
in_progress = [] in_progress = []
done = [] done = []
needs_human_input_list = []
for f in all_features: for f in all_features:
feature_response = feature_to_response(f, passing_ids) feature_response = feature_to_response(f, passing_ids)
if f.passes: if f.passes:
done.append(feature_response) done.append(feature_response)
elif getattr(f, 'needs_human_input', False):
needs_human_input_list.append(feature_response)
elif f.in_progress: elif f.in_progress:
in_progress.append(feature_response) in_progress.append(feature_response)
else: else:
@@ -157,6 +164,7 @@ async def list_features(project_name: str):
pending=pending, pending=pending,
in_progress=in_progress, in_progress=in_progress,
done=done, done=done,
needs_human_input=needs_human_input_list,
) )
except HTTPException: except HTTPException:
raise raise
@@ -341,9 +349,11 @@ async def get_dependency_graph(project_name: str):
deps = f.dependencies or [] deps = f.dependencies or []
blocking = [d for d in deps if d not in passing_ids] blocking = [d for d in deps if d not in passing_ids]
-status: Literal["pending", "in_progress", "done", "blocked"]
+status: Literal["pending", "in_progress", "done", "blocked", "needs_human_input"]
if f.passes: if f.passes:
status = "done" status = "done"
elif getattr(f, 'needs_human_input', False):
status = "needs_human_input"
elif blocking: elif blocking:
status = "blocked" status = "blocked"
elif f.in_progress: elif f.in_progress:
@@ -564,6 +574,71 @@ async def skip_feature(project_name: str, feature_id: int):
raise HTTPException(status_code=500, detail="Failed to skip feature") raise HTTPException(status_code=500, detail="Failed to skip feature")
@router.post("/{feature_id}/resolve-human-input", response_model=FeatureResponse)
async def resolve_human_input(project_name: str, feature_id: int, response: HumanInputResponse):
"""Resolve a human input request for a feature.
Validates all required fields have values, stores the response,
and returns the feature to the pending queue for agents to pick up.
"""
project_name = validate_project_name(project_name)
project_dir = _get_project_path(project_name)
if not project_dir:
raise HTTPException(status_code=404, detail=f"Project '{project_name}' not found in registry")
if not project_dir.exists():
raise HTTPException(status_code=404, detail="Project directory not found")
_, Feature = _get_db_classes()
try:
with get_db_session(project_dir) as session:
feature = session.query(Feature).filter(Feature.id == feature_id).first()
if not feature:
raise HTTPException(status_code=404, detail=f"Feature {feature_id} not found")
if not getattr(feature, 'needs_human_input', False):
raise HTTPException(status_code=400, detail="Feature is not waiting for human input")
# Validate required fields
request_data = feature.human_input_request
if request_data and isinstance(request_data, dict):
for field_def in request_data.get("fields", []):
if field_def.get("required", True):
field_id = field_def.get("id")
if field_id not in response.fields or response.fields[field_id] in (None, ""):
raise HTTPException(
status_code=400,
detail=f"Required field '{field_def.get('label', field_id)}' is missing"
)
# Store response and return to pending queue
from datetime import datetime, timezone
response_data = {
"fields": {k: v for k, v in response.fields.items()},
"responded_at": datetime.now(timezone.utc).isoformat(),
}
feature.human_input_response = response_data
feature.needs_human_input = False
# Keep in_progress=False, passes=False so it returns to pending
session.commit()
session.refresh(feature)
# Compute passing IDs for response
all_features = session.query(Feature).all()
passing_ids = {f.id for f in all_features if f.passes}
return feature_to_response(feature, passing_ids)
except HTTPException:
raise
except Exception:
logger.exception("Failed to resolve human input")
raise HTTPException(status_code=500, detail="Failed to resolve human input")
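Review note: the required-field check inside `resolve_human_input` is reproduced below as a pure function so its edge cases are visible. `find_missing_required_field` is a hypothetical name, and the sample request is illustrative; only the `id`, `label`, and `required` keys come from the schema in this PR.

```python
from typing import Any, Optional


def find_missing_required_field(request_data: Any, fields: dict[str, Any]) -> Optional[str]:
    """Return the label of the first required field without a value, else None."""
    if not (request_data and isinstance(request_data, dict)):
        return None  # nothing to validate against
    for field_def in request_data.get("fields", []):
        if field_def.get("required", True):  # fields are required unless stated otherwise
            field_id = field_def.get("id")
            if field_id not in fields or fields[field_id] in (None, ""):
                return field_def.get("label", field_id)
    return None


request = {
    "prompt": "Deployment details",
    "fields": [
        {"id": "api_key", "label": "API key"},
        {"id": "use_ssl", "label": "Use SSL"},
        {"id": "notes", "label": "Notes", "required": False},
    ],
}

complete = find_missing_required_field(request, {"api_key": "abc", "use_ssl": False})
empty_string = find_missing_required_field(request, {"api_key": "", "use_ssl": True})
absent_key = find_missing_required_field(request, {"api_key": "abc"})
```

A boolean answer of `False` counts as provided, because the check only rejects `None` and the empty string.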
# ============================================================================ # ============================================================================
# Dependency Management Endpoints # Dependency Management Endpoints
# ============================================================================ # ============================================================================


@@ -102,7 +102,7 @@ def get_project_stats(project_dir: Path) -> ProjectStats:
"""Get statistics for a project.""" """Get statistics for a project."""
_init_imports() _init_imports()
assert _count_passing_tests is not None # guaranteed by _init_imports() assert _count_passing_tests is not None # guaranteed by _init_imports()
-passing, in_progress, total = _count_passing_tests(project_dir)
+passing, in_progress, total, _needs_human_input = _count_passing_tests(project_dir)
percentage = (passing / total * 100) if total > 0 else 0.0 percentage = (passing / total * 100) if total > 0 else 0.0
return ProjectStats( return ProjectStats(
passing=passing, passing=passing,


@@ -120,16 +120,41 @@ class FeatureResponse(FeatureBase):
in_progress: bool in_progress: bool
blocked: bool = False # Computed: has unmet dependencies blocked: bool = False # Computed: has unmet dependencies
blocking_dependencies: list[int] = Field(default_factory=list) # Computed blocking_dependencies: list[int] = Field(default_factory=list) # Computed
needs_human_input: bool = False
human_input_request: dict | None = None
human_input_response: dict | None = None
class Config: class Config:
from_attributes = True from_attributes = True
class HumanInputField(BaseModel):
"""Schema for a single human input field."""
id: str
label: str
type: Literal["text", "textarea", "select", "boolean"] = "text"
required: bool = True
placeholder: str | None = None
options: list[dict] | None = None # For select: [{value, label}]
class HumanInputRequest(BaseModel):
"""Schema for an agent's human input request."""
prompt: str
fields: list[HumanInputField]
class HumanInputResponse(BaseModel):
"""Schema for a human's response to an input request."""
fields: dict[str, str | bool | list[str]]
class FeatureListResponse(BaseModel): class FeatureListResponse(BaseModel):
"""Response containing list of features organized by status.""" """Response containing list of features organized by status."""
pending: list[FeatureResponse] pending: list[FeatureResponse]
in_progress: list[FeatureResponse] in_progress: list[FeatureResponse]
done: list[FeatureResponse] done: list[FeatureResponse]
needs_human_input: list[FeatureResponse] = Field(default_factory=list)
class FeatureBulkCreate(BaseModel): class FeatureBulkCreate(BaseModel):
@@ -153,7 +178,7 @@ class DependencyGraphNode(BaseModel):
id: int id: int
name: str name: str
category: str category: str
-status: Literal["pending", "in_progress", "done", "blocked"]
+status: Literal["pending", "in_progress", "done", "blocked", "needs_human_input"]
priority: int priority: int
dependencies: list[int] dependencies: list[int]
@@ -190,9 +215,12 @@ class AgentStartRequest(BaseModel):
@field_validator('model') @field_validator('model')
@classmethod @classmethod
def validate_model(cls, v: str | None) -> str | None: def validate_model(cls, v: str | None) -> str | None:
-"""Validate model is in the allowed list."""
+"""Validate model is in the allowed list (Claude) or allow any model for alternative providers."""
if v is not None and v not in VALID_MODELS: if v is not None and v not in VALID_MODELS:
-raise ValueError(f"Invalid model. Must be one of: {VALID_MODELS}")
+from registry import get_all_settings
settings = get_all_settings()
if settings.get("api_provider", "claude") == "claude":
raise ValueError(f"Invalid model. Must be one of: {VALID_MODELS}")
return v return v
@field_validator('max_concurrency') @field_validator('max_concurrency')
@@ -214,7 +242,7 @@ class AgentStartRequest(BaseModel):
class AgentStatus(BaseModel): class AgentStatus(BaseModel):
"""Current agent status.""" """Current agent status."""
-status: Literal["stopped", "running", "paused", "crashed"]
+status: Literal["stopped", "running", "paused", "crashed", "pausing", "paused_graceful"]
pid: int | None = None pid: int | None = None
started_at: datetime | None = None started_at: datetime | None = None
yolo_mode: bool = False yolo_mode: bool = False
@@ -254,6 +282,7 @@ class WSProgressMessage(BaseModel):
in_progress: int in_progress: int
total: int total: int
percentage: float percentage: float
needs_human_input: int = 0
class WSFeatureUpdateMessage(BaseModel): class WSFeatureUpdateMessage(BaseModel):
@@ -571,9 +600,12 @@ class ScheduleCreate(BaseModel):
@field_validator('model') @field_validator('model')
@classmethod @classmethod
def validate_model(cls, v: str | None) -> str | None: def validate_model(cls, v: str | None) -> str | None:
-"""Validate model is in the allowed list."""
+"""Validate model is in the allowed list (Claude) or allow any model for alternative providers."""
if v is not None and v not in VALID_MODELS: if v is not None and v not in VALID_MODELS:
-raise ValueError(f"Invalid model. Must be one of: {VALID_MODELS}")
+from registry import get_all_settings
settings = get_all_settings()
if settings.get("api_provider", "claude") == "claude":
raise ValueError(f"Invalid model. Must be one of: {VALID_MODELS}")
return v return v
@@ -593,9 +625,12 @@ class ScheduleUpdate(BaseModel):
@field_validator('model') @field_validator('model')
@classmethod @classmethod
def validate_model(cls, v: str | None) -> str | None: def validate_model(cls, v: str | None) -> str | None:
-"""Validate model is in the allowed list."""
+"""Validate model is in the allowed list (Claude) or allow any model for alternative providers."""
if v is not None and v not in VALID_MODELS: if v is not None and v not in VALID_MODELS:
-raise ValueError(f"Invalid model. Must be one of: {VALID_MODELS}")
+from registry import get_all_settings
settings = get_all_settings()
if settings.get("api_provider", "claude") == "claude":
raise ValueError(f"Invalid model. Must be one of: {VALID_MODELS}")
return v return v
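Review note: the relaxed `validate_model` rule, repeated in `AgentStartRequest`, `ScheduleCreate`, and `ScheduleUpdate`, reduces to the sketch below. The contents of `VALID_MODELS` and the model name `llama3` are placeholders, and the provider is passed as a parameter here rather than read through `get_all_settings()`, so this is an illustration and not the validator itself.

```python
from typing import Optional

VALID_MODELS = ["claude-opus-4-6", "claude-sonnet-4-5"]  # placeholder list


def validate_model(v: Optional[str], api_provider: str = "claude") -> Optional[str]:
    """Enforce the allow-list only when the active provider is 'claude'."""
    if v is not None and v not in VALID_MODELS:
        if api_provider == "claude":
            raise ValueError(f"Invalid model. Must be one of: {VALID_MODELS}")
    return v


accepted_for_alternative = validate_model("llama3", api_provider="ollama")
accepted_known_model = validate_model("claude-opus-4-6")

try:
    validate_model("llama3")  # default provider is 'claude', so this is rejected
    rejected_for_claude = False
except ValueError:
    rejected_for_claude = True
```

Since the same body now appears three times, a shared module-level helper could keep the copies from drifting apart.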


@@ -47,8 +47,13 @@ FEATURE_MANAGEMENT_TOOLS = [
"mcp__features__feature_skip", "mcp__features__feature_skip",
] ]
# Interactive tools
INTERACTIVE_TOOLS = [
"mcp__features__ask_user",
]
# Combined list for assistant # Combined list for assistant
-ASSISTANT_FEATURE_TOOLS = READONLY_FEATURE_MCP_TOOLS + FEATURE_MANAGEMENT_TOOLS
+ASSISTANT_FEATURE_TOOLS = READONLY_FEATURE_MCP_TOOLS + FEATURE_MANAGEMENT_TOOLS + INTERACTIVE_TOOLS
# Read-only built-in tools (no Write, Edit, Bash) # Read-only built-in tools (no Write, Edit, Bash)
READONLY_BUILTIN_TOOLS = [ READONLY_BUILTIN_TOOLS = [
@@ -123,6 +128,9 @@ If the user asks you to modify code, explain that you're a project assistant and
- **feature_create_bulk**: Create multiple features at once - **feature_create_bulk**: Create multiple features at once
- **feature_skip**: Move a feature to the end of the queue - **feature_skip**: Move a feature to the end of the queue
**Interactive:**
- **ask_user**: Present structured multiple-choice questions to the user. Use this when you need to clarify requirements, offer design choices, or guide a decision. The user sees clickable option buttons and their selection is returned as your next message.
## Creating Features ## Creating Features
When a user asks to add a feature, use the `feature_create` or `feature_create_bulk` MCP tools directly: When a user asks to add a feature, use the `feature_create` or `feature_create_bulk` MCP tools directly:
@@ -402,6 +410,17 @@ class AssistantChatSession:
elif block_type == "ToolUseBlock" and hasattr(block, "name"): elif block_type == "ToolUseBlock" and hasattr(block, "name"):
tool_name = block.name tool_name = block.name
tool_input = getattr(block, "input", {}) tool_input = getattr(block, "input", {})
# Intercept ask_user tool calls -> yield as question message
if tool_name == "mcp__features__ask_user":
questions = tool_input.get("questions", [])
if questions:
yield {
"type": "question",
"questions": questions,
}
continue
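Review note: the `ask_user` interception can be sketched as a small generator. Several things here are assumptions: tool blocks are modelled as plain dictionaries rather than SDK objects, the shape of each question is made up, and `continue` is assumed to sit at the level of the outer `if`, which the flattened diff does not make clear.

```python
from typing import Any, Iterator

ASK_USER_TOOL = "mcp__features__ask_user"


def tool_blocks_to_messages(blocks: list[dict[str, Any]]) -> Iterator[dict[str, Any]]:
    """Turn tool-use blocks into WebSocket messages, rerouting ask_user calls."""
    for block in blocks:
        tool_name = block["name"]
        tool_input = block.get("input", {})
        if tool_name == ASK_USER_TOOL:
            questions = tool_input.get("questions", [])
            if questions:
                yield {"type": "question", "questions": questions}
            continue  # assumed: ask_user never surfaces as a generic tool_call
        yield {"type": "tool_call", "tool": tool_name, "input": tool_input}


blocks = [
    {"name": "mcp__features__feature_skip", "input": {"feature_id": 7}},
    {"name": ASK_USER_TOOL, "input": {"questions": [{"question": "Which stack?"}]}},
    {"name": ASK_USER_TOOL, "input": {}},  # no questions: produces no message here
]
messages = list(tool_blocks_to_messages(blocks))
```

If `continue` is instead nested under `if questions:`, an `ask_user` call without questions would fall through and be reported as an ordinary `tool_call`.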
yield { yield {
"type": "tool_call", "type": "tool_call",
"tool": tool_name, "tool": tool_name,


@@ -77,7 +77,7 @@ class AgentProcessManager:
self.project_dir = project_dir self.project_dir = project_dir
self.root_dir = root_dir self.root_dir = root_dir
self.process: subprocess.Popen | None = None self.process: subprocess.Popen | None = None
-self._status: Literal["stopped", "running", "paused", "crashed"] = "stopped"
+self._status: Literal["stopped", "running", "paused", "crashed", "pausing", "paused_graceful"] = "stopped"
self.started_at: datetime | None = None self.started_at: datetime | None = None
self._output_task: asyncio.Task | None = None self._output_task: asyncio.Task | None = None
self.yolo_mode: bool = False # YOLO mode for rapid prototyping self.yolo_mode: bool = False # YOLO mode for rapid prototyping
@@ -96,11 +96,11 @@ class AgentProcessManager:
self.lock_file = get_agent_lock_path(self.project_dir) self.lock_file = get_agent_lock_path(self.project_dir)
@property @property
-def status(self) -> Literal["stopped", "running", "paused", "crashed"]:
+def status(self) -> Literal["stopped", "running", "paused", "crashed", "pausing", "paused_graceful"]:
return self._status return self._status
@status.setter @status.setter
-def status(self, value: Literal["stopped", "running", "paused", "crashed"]):
+def status(self, value: Literal["stopped", "running", "paused", "crashed", "pausing", "paused_graceful"]):
old_status = self._status old_status = self._status
self._status = value self._status = value
if old_status != value: if old_status != value:
@@ -227,6 +227,28 @@ class AgentProcessManager:
"""Remove lock file.""" """Remove lock file."""
self.lock_file.unlink(missing_ok=True) self.lock_file.unlink(missing_ok=True)
def _apply_playwright_headless(self, headless: bool) -> None:
"""Update .playwright/cli.config.json with the current headless setting.
playwright-cli reads this config file on each ``open`` command, so
updating it before the agent starts is sufficient.
"""
config_file = self.project_dir / ".playwright" / "cli.config.json"
if not config_file.exists():
return
try:
import json
config = json.loads(config_file.read_text(encoding="utf-8"))
launch_opts = config.get("browser", {}).get("launchOptions", {})
if launch_opts.get("headless") == headless:
return # already correct
launch_opts["headless"] = headless
config.setdefault("browser", {})["launchOptions"] = launch_opts
config_file.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
logger.info("Set playwright headless=%s for %s", headless, self.project_name)
except Exception:
logger.warning("Failed to update playwright config", exc_info=True)
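Review note: the read-modify-write in `_apply_playwright_headless` can be tried against a temporary directory. The standalone function below follows the added method but returns a boolean saying whether the file was rewritten, which is an addition for illustration; the starting config content is also made up.

```python
import json
import tempfile
from pathlib import Path


def apply_playwright_headless(project_dir: Path, headless: bool) -> bool:
    """Set browser.launchOptions.headless; return True only if the file changed."""
    config_file = project_dir / ".playwright" / "cli.config.json"
    if not config_file.exists():
        return False
    config = json.loads(config_file.read_text(encoding="utf-8"))
    launch_opts = config.get("browser", {}).get("launchOptions", {})
    if launch_opts.get("headless") == headless:
        return False  # already correct, avoid a needless write
    launch_opts["headless"] = headless
    # setdefault creates the "browser" section when the config lacks one.
    config.setdefault("browser", {})["launchOptions"] = launch_opts
    config_file.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return True


with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp)
    (project / ".playwright").mkdir()
    (project / ".playwright" / "cli.config.json").write_text("{}", encoding="utf-8")
    first_write = apply_playwright_headless(project, True)
    second_write = apply_playwright_headless(project, True)
    stored = json.loads((project / ".playwright" / "cli.config.json").read_text(encoding="utf-8"))
```

The early return on an unchanged value keeps the file's modification time stable across repeated agent starts.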
def _cleanup_stale_features(self) -> None: def _cleanup_stale_features(self) -> None:
"""Clear in_progress flag for all features when agent stops/crashes. """Clear in_progress flag for all features when agent stops/crashes.
@@ -255,7 +277,7 @@ class AgentProcessManager:
).all() ).all()
if stuck: if stuck:
for f in stuck: for f in stuck:
-f.in_progress = False
+f.in_progress = False # type: ignore[assignment]
session.commit() session.commit()
logger.info( logger.info(
"Cleaned up %d stuck feature(s) for %s", "Cleaned up %d stuck feature(s) for %s",
@@ -308,6 +330,12 @@ class AgentProcessManager:
for help_line in AUTH_ERROR_HELP.strip().split('\n'): for help_line in AUTH_ERROR_HELP.strip().split('\n'):
await self._broadcast_output(help_line) await self._broadcast_output(help_line)
# Detect graceful pause status transitions from orchestrator output
if "All agents drained - paused." in decoded:
self.status = "paused_graceful"
elif "Resuming from graceful pause..." in decoded:
self.status = "running"
await self._broadcast_output(sanitized) await self._broadcast_output(sanitized)
except asyncio.CancelledError: except asyncio.CancelledError:
@@ -318,7 +346,7 @@ class AgentProcessManager:
# Check if process ended # Check if process ended
if self.process and self.process.poll() is not None: if self.process and self.process.poll() is not None:
exit_code = self.process.returncode exit_code = self.process.returncode
-if exit_code != 0 and self.status == "running":
+if exit_code != 0 and self.status in ("running", "pausing", "paused_graceful"):
# Check buffered output for auth errors if we haven't detected one yet # Check buffered output for auth errors if we haven't detected one yet
if not auth_error_detected: if not auth_error_detected:
combined_output = '\n'.join(output_buffer) combined_output = '\n'.join(output_buffer)
@@ -326,10 +354,16 @@ class AgentProcessManager:
for help_line in AUTH_ERROR_HELP.strip().split('\n'): for help_line in AUTH_ERROR_HELP.strip().split('\n'):
await self._broadcast_output(help_line) await self._broadcast_output(help_line)
self.status = "crashed" self.status = "crashed"
-elif self.status == "running":
+elif self.status in ("running", "pausing", "paused_graceful"):
self.status = "stopped" self.status = "stopped"
self._cleanup_stale_features() self._cleanup_stale_features()
self._remove_lock() self._remove_lock()
# Clean up drain signal file if present
try:
from autoforge_paths import get_pause_drain_path
get_pause_drain_path(self.project_dir).unlink(missing_ok=True)
except Exception:
pass
async def start( async def start(
self, self,
@@ -355,12 +389,21 @@ class AgentProcessManager:
Returns: Returns:
Tuple of (success, message) Tuple of (success, message)
""" """
-if self.status in ("running", "paused"):
+if self.status in ("running", "paused", "pausing", "paused_graceful"):
return False, f"Agent is already {self.status}" return False, f"Agent is already {self.status}"
if not self._check_lock(): if not self._check_lock():
return False, "Another agent instance is already running for this project" return False, "Another agent instance is already running for this project"
# Clean up stale browser daemons from previous runs
try:
subprocess.run(
["playwright-cli", "kill-all"],
timeout=5, capture_output=True,
)
except (subprocess.TimeoutExpired, FileNotFoundError, OSError):
pass
# Clean up features stuck from a previous crash/stop # Clean up features stuck from a previous crash/stop
self._cleanup_stale_features() self._cleanup_stale_features()
@@ -397,6 +440,10 @@ class AgentProcessManager:
# Add --batch-size flag for multi-feature batching # Add --batch-size flag for multi-feature batching
cmd.extend(["--batch-size", str(batch_size)]) cmd.extend(["--batch-size", str(batch_size)])
# Apply headless setting to .playwright/cli.config.json so playwright-cli
# picks it up (the only mechanism it supports for headless control)
self._apply_playwright_headless(playwright_headless)
try: try:
# Start subprocess with piped stdout/stderr # Start subprocess with piped stdout/stderr
# Use project_dir as cwd so Claude SDK sandbox allows access to project files # Use project_dir as cwd so Claude SDK sandbox allows access to project files
@@ -409,7 +456,8 @@ class AgentProcessManager:
subprocess_env = { subprocess_env = {
**os.environ, **os.environ,
"PYTHONUNBUFFERED": "1", "PYTHONUNBUFFERED": "1",
-"PLAYWRIGHT_HEADLESS": "true" if playwright_headless else "false",
+"PLAYWRIGHT_CLI_SESSION": f"agent-{self.project_name}-{os.getpid()}",
"NODE_COMPILE_CACHE": "", # Disable V8 compile caching to prevent .node file accumulation in %TEMP%
**api_env, **api_env,
} }
@@ -468,6 +516,15 @@ class AgentProcessManager:
except asyncio.CancelledError: except asyncio.CancelledError:
pass pass
# Kill browser daemons before stopping agent
try:
subprocess.run(
["playwright-cli", "kill-all"],
timeout=5, capture_output=True,
)
except (subprocess.TimeoutExpired, FileNotFoundError, OSError):
pass
# CRITICAL: Kill entire process tree, not just orchestrator # CRITICAL: Kill entire process tree, not just orchestrator
# This ensures all spawned coding/testing agents are also terminated # This ensures all spawned coding/testing agents are also terminated
proc = self.process # Capture reference before async call proc = self.process # Capture reference before async call
@@ -481,6 +538,12 @@ class AgentProcessManager:
self._cleanup_stale_features() self._cleanup_stale_features()
self._remove_lock() self._remove_lock()
# Clean up drain signal file if present
try:
from autoforge_paths import get_pause_drain_path
get_pause_drain_path(self.project_dir).unlink(missing_ok=True)
except Exception:
pass
self.status = "stopped" self.status = "stopped"
self.process = None self.process = None
self.started_at = None self.started_at = None
@@ -541,6 +604,47 @@ class AgentProcessManager:
logger.exception("Failed to resume agent") logger.exception("Failed to resume agent")
return False, f"Failed to resume agent: {e}" return False, f"Failed to resume agent: {e}"
async def graceful_pause(self) -> tuple[bool, str]:
"""Request a graceful pause (drain mode).
Creates a signal file that the orchestrator polls. Running agents
finish their current work before the orchestrator enters a paused state.
Returns:
Tuple of (success, message)
"""
if not self.process or self.status not in ("running",):
return False, "Agent is not running"
try:
from autoforge_paths import get_pause_drain_path
drain_path = get_pause_drain_path(self.project_dir)
drain_path.parent.mkdir(parents=True, exist_ok=True)
drain_path.write_text(str(self.process.pid))
self.status = "pausing"
return True, "Graceful pause requested"
except Exception as e:
logger.exception("Failed to request graceful pause")
return False, f"Failed to request graceful pause: {e}"
async def graceful_resume(self) -> tuple[bool, str]:
"""Resume from a graceful pause by removing the drain signal file.
Returns:
Tuple of (success, message)
"""
if not self.process or self.status not in ("pausing", "paused_graceful"):
return False, "Agent is not in a graceful pause state"
try:
from autoforge_paths import get_pause_drain_path
get_pause_drain_path(self.project_dir).unlink(missing_ok=True)
self.status = "running"
return True, "Agent resumed from graceful pause"
except Exception as e:
logger.exception("Failed to resume from graceful pause")
return False, f"Failed to resume: {e}"
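Review note: the drain-file handshake behind `graceful_pause` and `graceful_resume` is summarised in the sketch below. `DrainController` is a hypothetical stand-in for the relevant part of `AgentProcessManager`, and the drain file location is invented, because the diff does not show what `get_pause_drain_path` returns.

```python
import tempfile
from pathlib import Path


class DrainController:
    """Minimal model of the signal-file state transitions."""

    def __init__(self, drain_path: Path, pid: int) -> None:
        self.drain_path = drain_path
        self.pid = pid
        self.status = "running"

    def graceful_pause(self) -> tuple[bool, str]:
        if self.status != "running":
            return False, "Agent is not running"
        # The orchestrator polls for this file and drains once it appears.
        self.drain_path.parent.mkdir(parents=True, exist_ok=True)
        self.drain_path.write_text(str(self.pid))
        self.status = "pausing"
        return True, "Graceful pause requested"

    def graceful_resume(self) -> tuple[bool, str]:
        if self.status not in ("pausing", "paused_graceful"):
            return False, "Agent is not in a graceful pause state"
        # Removing the file is the resume signal.
        self.drain_path.unlink(missing_ok=True)
        self.status = "running"
        return True, "Agent resumed from graceful pause"


with tempfile.TemporaryDirectory() as tmp:
    drain_file = Path(tmp) / ".autoforge" / ".pause_drain"  # hypothetical path
    controller = DrainController(drain_file, pid=12345)
    resume_too_early = controller.graceful_resume()
    paused = controller.graceful_pause()
    file_content = drain_file.read_text()
    resumed = controller.graceful_resume()
    file_removed = not drain_file.exists()
    final_status = controller.status
```

Resume is accepted from `pausing` as well as `paused_graceful`, so a pause can be cancelled before the agents have finished draining.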
async def healthcheck(self) -> bool: async def healthcheck(self) -> bool:
""" """
Check if the agent process is still alive. Check if the agent process is still alive.
@@ -556,8 +660,14 @@ class AgentProcessManager:
poll = self.process.poll() poll = self.process.poll()
if poll is not None: if poll is not None:
# Process has terminated # Process has terminated
-if self.status in ("running", "paused"):
+if self.status in ("running", "paused", "pausing", "paused_graceful"):
self._cleanup_stale_features() self._cleanup_stale_features()
# Clean up drain signal file if present
try:
from autoforge_paths import get_pause_drain_path
get_pause_drain_path(self.project_dir).unlink(missing_ok=True)
except Exception:
pass
self.status = "crashed" self.status = "crashed"
self._remove_lock() self._remove_lock()
return False return False
@@ -642,8 +752,14 @@ def cleanup_orphaned_locks() -> int:
if not project_path.exists(): if not project_path.exists():
continue continue
# Clean up stale drain signal files
from autoforge_paths import get_autoforge_dir, get_pause_drain_path
drain_file = get_pause_drain_path(project_path)
if drain_file.exists():
drain_file.unlink(missing_ok=True)
logger.info("Removed stale drain signal file for project '%s'", name)
# Check both legacy and new locations for lock files # Check both legacy and new locations for lock files
from autoforge_paths import get_autoforge_dir
lock_locations = [ lock_locations = [
project_path / ".agent.lock", project_path / ".agent.lock",
get_autoforge_dir(project_path) / ".agent.lock", get_autoforge_dir(project_path) / ".agent.lock",
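
The resume and cleanup paths in this file all treat the drain signal file as the single source of truth for "a graceful pause is pending". The following is a minimal sketch of that file-based handshake, assuming the pause request creates the file (the pause body is only partly visible in this diff). The file name `.pause_drain` and the helper names are hypothetical; the real path comes from `get_pause_drain_path`.

```python
import tempfile
from pathlib import Path

# Hypothetical file name, used only for this sketch.
DRAIN_FILE_NAME = ".pause_drain"


def drain_path(project_dir: Path) -> Path:
    """Return the location of the drain signal file for a project."""
    return project_dir / DRAIN_FILE_NAME


def request_pause(project_dir: Path) -> None:
    """Signal a graceful pause by creating the drain file."""
    drain_path(project_dir).touch()


def resume(project_dir: Path) -> None:
    """Clear the signal; missing_ok makes repeated calls harmless."""
    drain_path(project_dir).unlink(missing_ok=True)


def is_draining(project_dir: Path) -> bool:
    """A worker would poll this before picking up new work."""
    return drain_path(project_dir).exists()


def demo() -> list[bool]:
    """Walk through idle -> pause requested -> resumed."""
    with tempfile.TemporaryDirectory() as tmp:
        project = Path(tmp)
        states = [is_draining(project)]
        request_pause(project)
        states.append(is_draining(project))
        resume(project)
        resume(project)  # second call is a no-op, not an error
        states.append(is_draining(project))
        return states
```

Because removal is idempotent, the same `unlink(missing_ok=True)` call can be repeated safely in the resume path, the health check, and the orphaned-lock cleanup.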


@@ -61,7 +61,7 @@ THOUGHT_PATTERNS = [
(re.compile(r'(?:Testing|Verifying|Running tests|Validating)\s+(.+)', re.I), 'testing'), (re.compile(r'(?:Testing|Verifying|Running tests|Validating)\s+(.+)', re.I), 'testing'),
(re.compile(r'(?:Error|Failed|Cannot|Unable to|Exception)\s+(.+)', re.I), 'struggling'), (re.compile(r'(?:Error|Failed|Cannot|Unable to|Exception)\s+(.+)', re.I), 'struggling'),
# Test results # Test results
(re.compile(r'(?:PASS|passed|success)', re.I), 'success'), (re.compile(r'(?:PASS|passed|success)', re.I), 'testing'),
(re.compile(r'(?:FAIL|failed|error)', re.I), 'struggling'), (re.compile(r'(?:FAIL|failed|error)', re.I), 'struggling'),
] ]
@@ -78,6 +78,9 @@ ORCHESTRATOR_PATTERNS = {
'testing_complete': re.compile(r'Feature #(\d+) testing (completed|failed)'), 'testing_complete': re.compile(r'Feature #(\d+) testing (completed|failed)'),
'all_complete': re.compile(r'All features complete'), 'all_complete': re.compile(r'All features complete'),
'blocked_features': re.compile(r'(\d+) blocked by dependencies'), 'blocked_features': re.compile(r'(\d+) blocked by dependencies'),
'drain_start': re.compile(r'Graceful pause requested'),
'drain_complete': re.compile(r'All agents drained'),
'drain_resume': re.compile(r'Resuming from graceful pause'),
} }
@@ -562,6 +565,30 @@ class OrchestratorTracker:
'All features complete!' 'All features complete!'
) )
# Graceful pause (drain mode) events
elif ORCHESTRATOR_PATTERNS['drain_start'].search(line):
self.state = 'draining'
update = self._create_update(
'drain_start',
'Draining active agents...'
)
elif ORCHESTRATOR_PATTERNS['drain_complete'].search(line):
self.state = 'paused'
self.coding_agents = 0
self.testing_agents = 0
update = self._create_update(
'drain_complete',
'All agents drained. Paused.'
)
elif ORCHESTRATOR_PATTERNS['drain_resume'].search(line):
self.state = 'scheduling'
update = self._create_update(
'drain_resume',
'Resuming feature scheduling'
)
return update return update
def _create_update( def _create_update(
@@ -689,15 +716,19 @@ async def poll_progress(websocket: WebSocket, project_name: str, project_dir: Pa
last_in_progress = -1 last_in_progress = -1
last_total = -1 last_total = -1
last_needs_human_input = -1
while True: while True:
try: try:
passing, in_progress, total = count_passing_tests(project_dir) passing, in_progress, total, needs_human_input = count_passing_tests(project_dir)
# Only send if changed # Only send if changed
if passing != last_passing or in_progress != last_in_progress or total != last_total: if (passing != last_passing or in_progress != last_in_progress
or total != last_total or needs_human_input != last_needs_human_input):
last_passing = passing last_passing = passing
last_in_progress = in_progress last_in_progress = in_progress
last_total = total last_total = total
last_needs_human_input = needs_human_input
percentage = (passing / total * 100) if total > 0 else 0 percentage = (passing / total * 100) if total > 0 else 0
await websocket.send_json({ await websocket.send_json({
@@ -706,6 +737,7 @@ async def poll_progress(websocket: WebSocket, project_name: str, project_dir: Pa
"in_progress": in_progress, "in_progress": in_progress,
"total": total, "total": total,
"percentage": round(percentage, 1), "percentage": round(percentage, 1),
"needs_human_input": needs_human_input,
}) })
await asyncio.sleep(2) # Poll every 2 seconds await asyncio.sleep(2) # Poll every 2 seconds
@@ -858,7 +890,7 @@ async def project_websocket(websocket: WebSocket, project_name: str):
# Send initial progress # Send initial progress
count_passing_tests = _get_count_passing_tests() count_passing_tests = _get_count_passing_tests()
passing, in_progress, total = count_passing_tests(project_dir) passing, in_progress, total, needs_human_input = count_passing_tests(project_dir)
percentage = (passing / total * 100) if total > 0 else 0 percentage = (passing / total * 100) if total > 0 else 0
await websocket.send_json({ await websocket.send_json({
"type": "progress", "type": "progress",
@@ -866,6 +898,7 @@ async def project_websocket(websocket: WebSocket, project_name: str):
"in_progress": in_progress, "in_progress": in_progress,
"total": total, "total": total,
"percentage": round(percentage, 1), "percentage": round(percentage, 1),
"needs_human_input": needs_human_input,
}) })
# Keep connection alive and handle incoming messages # Keep connection alive and handle incoming messages
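
The three drain patterns added to `ORCHESTRATOR_PATTERNS` drive a small state machine in the tracker. A minimal sketch of that mapping, assuming only the regexes and state names visible in this diff; the function name `next_state` and the table `NEXT_STATE` are hypothetical, and the real tracker also emits update messages and resets agent counts.

```python
import re

# Regexes copied from the diff above.
DRAIN_PATTERNS = {
    "drain_start": re.compile(r"Graceful pause requested"),
    "drain_complete": re.compile(r"All agents drained"),
    "drain_resume": re.compile(r"Resuming from graceful pause"),
}

# Event -> orchestrator state, as assigned in the tracker branch.
NEXT_STATE = {
    "drain_start": "draining",
    "drain_complete": "paused",
    "drain_resume": "scheduling",
}


def next_state(line: str, current: str) -> str:
    """Return the new state for a log line, or the current one if nothing matches."""
    for event, pattern in DRAIN_PATTERNS.items():
        if pattern.search(line):
            return NEXT_STATE[event]
    return current
```

Since the patterns use `search`, they match anywhere in the line, so any log prefix such as a timestamp does not interfere.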


@@ -54,5 +54,15 @@ REM Install dependencies
echo Installing dependencies... echo Installing dependencies...
pip install -r requirements.txt --quiet pip install -r requirements.txt --quiet
REM Ensure playwright-cli is available for browser automation
where playwright-cli >nul 2>&1
if %ERRORLEVEL% neq 0 (
echo Installing playwright-cli for browser automation...
call npm install -g @playwright/cli >nul 2>&1
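REM "if errorlevel" is evaluated at run time; %ERRORLEVEL% inside a block is expanded when the block is parsed
if errorlevel 1 (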
echo Note: Could not install playwright-cli. Install manually: npm install -g @playwright/cli
)
)
REM Run the app REM Run the app
python start.py python start.py


@@ -390,8 +390,11 @@ def run_agent(project_name: str, project_dir: Path) -> None:
print(f"Location: {project_dir}") print(f"Location: {project_dir}")
print("-" * 50) print("-" * 50)
# Build the command - pass absolute path # Build the command - pass absolute path and model from settings
cmd = [sys.executable, "autonomous_agent_demo.py", "--project-dir", str(project_dir.resolve())] from registry import DEFAULT_MODEL, get_all_settings
settings = get_all_settings()
model = settings.get("api_model") or settings.get("model", DEFAULT_MODEL)
cmd = [sys.executable, "autonomous_agent_demo.py", "--project-dir", str(project_dir.resolve()), "--model", model]
# Run the agent with stderr capture to detect auth errors # Run the agent with stderr capture to detect auth errors
# stdout goes directly to terminal for real-time output # stdout goes directly to terminal for real-time output
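
The model lookup above chains `or` with a `.get` default, which behaves slightly differently for missing keys and empty strings. A minimal sketch of that resolution order, assuming settings are a plain dict of strings; `resolve_model` is a hypothetical helper and the `DEFAULT_MODEL` value here is a placeholder, not the value defined in `registry`.

```python
# Placeholder value for illustration only.
DEFAULT_MODEL = "example-default-model"


def resolve_model(settings: dict) -> str:
    """Prefer api_model, then model, then the default (same expression as the diff)."""
    return settings.get("api_model") or settings.get("model", DEFAULT_MODEL)


# api_model wins when it is set and non-empty.
first = resolve_model({"api_model": "a", "model": "b"})
# An empty api_model is falsy, so the lookup falls through to model.
second = resolve_model({"api_model": "", "model": "b"})
# With neither key present, the default applies.
third = resolve_model({})
# Edge case: a present-but-empty model is returned as-is, not replaced by the default.
fourth = resolve_model({"model": ""})
```

If an empty `model` setting is possible in practice, chaining a second `or DEFAULT_MODEL` would close that last gap.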


@@ -74,5 +74,14 @@ fi
echo "Installing dependencies..." echo "Installing dependencies..."
pip install -r requirements.txt --quiet pip install -r requirements.txt --quiet
# Ensure playwright-cli is available for browser automation
if ! command -v playwright-cli &> /dev/null; then
echo "Installing playwright-cli for browser automation..."
if ! npm install -g @playwright/cli --quiet 2>/dev/null; then
echo "Note: Could not install playwright-cli. Install manually: npm install -g @playwright/cli"
fi
fi
# Run the app # Run the app
python start.py python start.py


@@ -37,5 +37,15 @@ REM Install dependencies
echo Installing dependencies... echo Installing dependencies...
pip install -r requirements.txt --quiet pip install -r requirements.txt --quiet
REM Ensure playwright-cli is available for browser automation
where playwright-cli >nul 2>&1
if %ERRORLEVEL% neq 0 (
echo Installing playwright-cli for browser automation...
call npm install -g @playwright/cli >nul 2>&1
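REM "if errorlevel" is evaluated at run time; %ERRORLEVEL% inside a block is expanded when the block is parsed
if errorlevel 1 (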
echo Note: Could not install playwright-cli. Install manually: npm install -g @playwright/cli
)
)
REM Run the Python launcher REM Run the Python launcher
python "%~dp0start_ui.py" %* python "%~dp0start_ui.py" %*


@@ -80,5 +80,14 @@ fi
echo "Installing dependencies..." echo "Installing dependencies..."
pip install -r requirements.txt --quiet pip install -r requirements.txt --quiet
# Ensure playwright-cli is available for browser automation
if ! command -v playwright-cli &> /dev/null; then
echo "Installing playwright-cli for browser automation..."
if ! npm install -g @playwright/cli --quiet 2>/dev/null; then
echo "Note: Could not install playwright-cli. Install manually: npm install -g @playwright/cli"
fi
fi
# Run the Python launcher # Run the Python launcher
python start_ui.py "$@" python start_ui.py "$@"


@@ -37,11 +37,12 @@ DIR_PATTERNS = [
"mongodb-memory-server*", # MongoDB Memory Server binaries "mongodb-memory-server*", # MongoDB Memory Server binaries
"ng-*", # Angular CLI temp directories "ng-*", # Angular CLI temp directories
"scoped_dir*", # Chrome/Chromium temp directories "scoped_dir*", # Chrome/Chromium temp directories
"node-compile-cache", # Node.js V8 compile cache directory
] ]
# File patterns to clean up (glob patterns) # File patterns to clean up (glob patterns)
FILE_PATTERNS = [ FILE_PATTERNS = [
".78912*.node", # Node.js native module cache (major space consumer, ~7MB each) ".[0-9a-f]*.node", # Node.js/V8 compile cache files (~7MB each, varying hex prefixes)
"claude-*-cwd", # Claude CLI working directory temp files "claude-*-cwd", # Claude CLI working directory temp files
"mat-debug-*.log", # Material/Angular debug logs "mat-debug-*.log", # Material/Angular debug logs
] ]
@@ -122,6 +123,78 @@ def cleanup_stale_temp(max_age_seconds: int = MAX_AGE_SECONDS) -> dict:
return stats return stats
def cleanup_project_screenshots(project_dir: Path, max_age_seconds: int = 300) -> dict:
"""
Clean up stale Playwright CLI artifacts from the project.
The Playwright CLI daemon saves screenshots, snapshots, and other artifacts
to `{project_dir}/.playwright-cli/`. This removes them after they've aged
out (default 5 minutes).
Also cleans up legacy screenshot patterns from the project root (from the
old Playwright MCP server approach).
Args:
project_dir: Path to the project directory.
max_age_seconds: Maximum age in seconds before an artifact is deleted.
Defaults to 5 minutes (300 seconds).
Returns:
Dictionary with cleanup statistics (files_deleted, bytes_freed, errors).
"""
cutoff_time = time.time() - max_age_seconds
stats: dict = {"files_deleted": 0, "bytes_freed": 0, "errors": []}
# Clean up .playwright-cli/ directory (new CLI approach)
playwright_cli_dir = project_dir / ".playwright-cli"
if playwright_cli_dir.exists():
for item in playwright_cli_dir.iterdir():
if not item.is_file():
continue
try:
st = item.stat()  # single stat call for both mtime and size
if st.st_mtime < cutoff_time:
size = st.st_size
item.unlink(missing_ok=True)
if not item.exists():
stats["files_deleted"] += 1
stats["bytes_freed"] += size
logger.debug(f"Deleted playwright-cli artifact: {item}")
except Exception as e:
stats["errors"].append(f"Failed to delete {item}: {e}")
logger.debug(f"Failed to delete artifact {item}: {e}")
# Legacy cleanup: root-level screenshot patterns (from old MCP server approach)
legacy_patterns = [
"feature*-*.png",
"screenshot-*.png",
"step-*.png",
]
for pattern in legacy_patterns:
for item in project_dir.glob(pattern):
if not item.is_file():
continue
try:
st = item.stat()  # single stat call for both mtime and size
if st.st_mtime < cutoff_time:
size = st.st_size
item.unlink(missing_ok=True)
if not item.exists():
stats["files_deleted"] += 1
stats["bytes_freed"] += size
logger.debug(f"Deleted legacy screenshot: {item}")
except Exception as e:
stats["errors"].append(f"Failed to delete {item}: {e}")
logger.debug(f"Failed to delete screenshot {item}: {e}")
if stats["files_deleted"] > 0:
mb_freed = stats["bytes_freed"] / (1024 * 1024)
logger.info(f"Artifact cleanup: {stats['files_deleted']} files, {mb_freed:.1f} MB freed")
return stats
def _get_dir_size(path: Path) -> int: def _get_dir_size(path: Path) -> int:
"""Get total size of a directory in bytes.""" """Get total size of a directory in bytes."""
total = 0 total = 0
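
The new file pattern `.[0-9a-f]*.node` is worth reading closely: in glob syntax the character class constrains only the first character after the leading dot, and `*` then matches anything. A minimal sketch using `fnmatch.fnmatchcase` to show which names it accepts; the sample file names are made up, and the cleanup code itself may match through `Path.glob` rather than `fnmatch`.

```python
from fnmatch import fnmatchcase

# Pattern copied from FILE_PATTERNS in the diff above.
PATTERN = ".[0-9a-f]*.node"


def matches(name: str) -> bool:
    """Case-sensitive glob match against the compile-cache pattern."""
    return fnmatchcase(name, PATTERN)


# Names with a hex-looking prefix match, including the old ".78912" prefix.
old_prefix = matches(".78912abc.node")
other_hex = matches(".f00dcafe.node")
# Only the first character is restricted to [0-9a-f]; the rest is free-form.
loose = matches(".a-not-hex-at-all.node")
# A first character outside the class does not match.
non_hex_start = matches(".gibberish.node")
# Ordinary native modules without the leading dot are left alone.
regular_addon = matches("addon.node")
```

The pattern is therefore broader than "hex prefix" suggests, which is usually acceptable for a temp directory but is a reason to keep the age check in place.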


@@ -25,6 +25,7 @@ from security import (
validate_chmod_command, validate_chmod_command,
validate_init_script, validate_init_script,
validate_pkill_command, validate_pkill_command,
validate_playwright_command,
validate_project_command, validate_project_command,
) )
@@ -923,6 +924,70 @@ pkill_processes:
return passed, failed return passed, failed
def test_playwright_cli_validation():
"""Test playwright-cli subcommand validation."""
print("\nTesting playwright-cli validation:\n")
passed = 0
failed = 0
# Test cases: (command, should_be_allowed, description)
test_cases = [
# Allowed cases
("playwright-cli screenshot", True, "screenshot allowed"),
("playwright-cli snapshot", True, "snapshot allowed"),
("playwright-cli click e5", True, "click with ref"),
("playwright-cli open http://localhost:3000", True, "open URL"),
("playwright-cli -s=agent-1 click e5", True, "session flag with click"),
("playwright-cli close", True, "close browser"),
("playwright-cli goto http://localhost:3000/page", True, "goto URL"),
("playwright-cli fill e3 'test value'", True, "fill form field"),
("playwright-cli console", True, "console messages"),
# Blocked cases
("playwright-cli run-code 'await page.evaluate(() => {})'", False, "run-code blocked"),
("playwright-cli eval 'document.title'", False, "eval blocked"),
("playwright-cli -s=test eval 'document.title'", False, "eval with session flag blocked"),
]
for cmd, should_allow, description in test_cases:
allowed, reason = validate_playwright_command(cmd)
if allowed == should_allow:
print(f" PASS: {cmd!r} ({description})")
passed += 1
else:
expected = "allowed" if should_allow else "blocked"
actual = "allowed" if allowed else "blocked"
print(f" FAIL: {cmd!r} ({description})")
print(f" Expected: {expected}, Got: {actual}")
if reason:
print(f" Reason: {reason}")
failed += 1
# Integration test: verify through the security hook
print("\n Integration tests (via security hook):\n")
# playwright-cli screenshot should be allowed
input_data = {"tool_name": "Bash", "tool_input": {"command": "playwright-cli screenshot"}}
result = asyncio.run(bash_security_hook(input_data))
if result.get("decision") != "block":
print(" PASS: playwright-cli screenshot allowed via hook")
passed += 1
else:
print(f" FAIL: playwright-cli screenshot should be allowed: {result.get('reason')}")
failed += 1
# playwright-cli run-code should be blocked
input_data = {"tool_name": "Bash", "tool_input": {"command": "playwright-cli run-code 'code'"}}
result = asyncio.run(bash_security_hook(input_data))
if result.get("decision") == "block":
print(" PASS: playwright-cli run-code blocked via hook")
passed += 1
else:
print(" FAIL: playwright-cli run-code should be blocked via hook")
failed += 1
return passed, failed
def main(): def main():
print("=" * 70) print("=" * 70)
print(" SECURITY HOOK TESTS") print(" SECURITY HOOK TESTS")
@@ -991,6 +1056,11 @@ def main():
passed += pkill_passed passed += pkill_passed
failed += pkill_failed failed += pkill_failed
# Test playwright-cli validation
pw_passed, pw_failed = test_playwright_cli_validation()
passed += pw_passed
failed += pw_failed
# Commands that SHOULD be blocked # Commands that SHOULD be blocked
# Note: blocklisted commands (sudo, shutdown, dd, aws) are tested in # Note: blocklisted commands (sudo, shutdown, dd, aws) are tested in
# test_blocklist_enforcement(). chmod validation is tested in # test_blocklist_enforcement(). chmod validation is tested in
@@ -1012,6 +1082,9 @@ def main():
# Shell injection attempts # Shell injection attempts
"$(echo pkill) node", "$(echo pkill) node",
'eval "pkill node"', 'eval "pkill node"',
# playwright-cli dangerous subcommands
"playwright-cli run-code 'await page.goto(\"http://evil.com\")'",
"playwright-cli eval 'document.cookie'",
] ]
for cmd in dangerous: for cmd in dangerous:
@@ -1077,6 +1150,12 @@ def main():
"/usr/local/bin/node app.js", "/usr/local/bin/node app.js",
# Combined chmod and init.sh (integration test for both validators) # Combined chmod and init.sh (integration test for both validators)
"chmod +x init.sh && ./init.sh", "chmod +x init.sh && ./init.sh",
# Playwright CLI allowed commands
"playwright-cli open http://localhost:3000",
"playwright-cli screenshot",
"playwright-cli snapshot",
"playwright-cli click e5",
"playwright-cli -s=agent-1 close",
] ]
for cmd in safe: for cmd in safe:
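
The test cases above pin down the expected behaviour of `validate_playwright_command` without showing its body. A minimal sketch of one way to satisfy them, assuming a blocklist of the two subcommands the tests reject; the function name `validate_playwright_command_sketch` and the constant `BLOCKED_SUBCOMMANDS` are hypothetical, and the real validator in `security.py` may use an allowlist or different messages.

```python
import shlex

# Subcommands the tests expect to be rejected; assumed to be the full blocklist.
BLOCKED_SUBCOMMANDS = {"run-code", "eval"}


def validate_playwright_command_sketch(command: str) -> tuple[bool, str]:
    """Return (allowed, reason) for a playwright-cli command string."""
    try:
        tokens = shlex.split(command)
    except ValueError:
        return False, "Could not parse command"
    if not tokens or tokens[0] != "playwright-cli":
        return False, "Not a playwright-cli command"
    # The subcommand is the first token that is not a flag such as -s=agent-1.
    subcommand = next((t for t in tokens[1:] if not t.startswith("-")), None)
    if subcommand is None:
        return False, "Missing subcommand"
    if subcommand in BLOCKED_SUBCOMMANDS:
        return False, f"Subcommand '{subcommand}' is not allowed"
    return True, ""
```

This sketch assumes flags carry their value in the same token (`-s=agent-1`); a space-separated form such as `-s agent-1` would be misread as the subcommand and needs separate handling.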

ui/e2e/tooltip.spec.ts

@@ -0,0 +1,47 @@
import { test, expect } from '@playwright/test'
/**
* E2E tooltip tests for header icon buttons.
*
* Run tests:
* cd ui && npm run test:e2e
* cd ui && npm run test:e2e -- tooltip.spec.ts
*/
test.describe('Header tooltips', () => {
test.setTimeout(30000)
test.beforeEach(async ({ page }) => {
await page.goto('/')
await page.waitForSelector('button:has-text("Select Project")', { timeout: 10000 })
})
async function selectProject(page: import('@playwright/test').Page) {
const projectSelector = page.locator('button:has-text("Select Project")')
if (await projectSelector.isVisible()) {
await projectSelector.click()
const items = page.locator('.neo-dropdown-item')
const itemCount = await items.count()
if (itemCount === 0) return false
await items.first().click()
await expect(projectSelector).not.toBeVisible({ timeout: 5000 }).catch(() => {})
return true
}
return false
}
test('Settings tooltip shows on hover', async ({ page }) => {
const hasProject = await selectProject(page)
if (!hasProject) {
test.skip(true, 'No projects available')
return
}
const settingsButton = page.locator('button[aria-label="Open Settings"]')
await expect(settingsButton).toBeVisible()
await settingsButton.hover()
const tooltip = page.locator('[data-slot="tooltip-content"]', { hasText: 'Settings' })
await expect(tooltip).toBeVisible({ timeout: 2000 })
})
})

ui/package-lock.json (generated)

File diff suppressed because it is too large.

@@ -19,6 +19,7 @@
"@radix-ui/react-separator": "^1.1.8", "@radix-ui/react-separator": "^1.1.8",
"@radix-ui/react-slot": "^1.2.4", "@radix-ui/react-slot": "^1.2.4",
"@radix-ui/react-switch": "^1.2.6", "@radix-ui/react-switch": "^1.2.6",
"@radix-ui/react-tooltip": "^1.2.8",
"@tanstack/react-query": "^5.72.0", "@tanstack/react-query": "^5.72.0",
"@xterm/addon-fit": "^0.11.0", "@xterm/addon-fit": "^0.11.0",
"@xterm/addon-web-links": "^0.12.0", "@xterm/addon-web-links": "^0.12.0",
@@ -32,6 +33,8 @@
"lucide-react": "^0.475.0", "lucide-react": "^0.475.0",
"react": "^19.0.0", "react": "^19.0.0",
"react-dom": "^19.0.0", "react-dom": "^19.0.0",
"react-markdown": "^10.1.0",
"remark-gfm": "^4.0.1",
"tailwind-merge": "^3.4.0" "tailwind-merge": "^3.4.0"
}, },
"devDependencies": { "devDependencies": {


@@ -33,6 +33,7 @@ import type { Feature } from './lib/types'
import { Button } from '@/components/ui/button' import { Button } from '@/components/ui/button'
import { Card, CardContent } from '@/components/ui/card' import { Card, CardContent } from '@/components/ui/card'
import { Badge } from '@/components/ui/badge' import { Badge } from '@/components/ui/badge'
import { TooltipProvider, Tooltip, TooltipTrigger, TooltipContent } from '@/components/ui/tooltip'
const STORAGE_KEY = 'autoforge-selected-project' const STORAGE_KEY = 'autoforge-selected-project'
const VIEW_MODE_KEY = 'autoforge-view-mode' const VIEW_MODE_KEY = 'autoforge-view-mode'
@@ -129,7 +130,8 @@ function App() {
const allFeatures = [ const allFeatures = [
...(features?.pending ?? []), ...(features?.pending ?? []),
...(features?.in_progress ?? []), ...(features?.in_progress ?? []),
...(features?.done ?? []) ...(features?.done ?? []),
...(features?.needs_human_input ?? [])
] ]
const feature = allFeatures.find(f => f.id === nodeId) const feature = allFeatures.find(f => f.id === nodeId)
if (feature) setSelectedFeature(feature) if (feature) setSelectedFeature(feature)
@@ -180,7 +182,7 @@ function App() {
// E : Expand project with AI (when project selected, has spec and has features) // E : Expand project with AI (when project selected, has spec and has features)
if ((e.key === 'e' || e.key === 'E') && selectedProject && hasSpec && features && if ((e.key === 'e' || e.key === 'E') && selectedProject && hasSpec && features &&
(features.pending.length + features.in_progress.length + features.done.length) > 0) { (features.pending.length + features.in_progress.length + features.done.length + (features.needs_human_input?.length || 0)) > 0) {
e.preventDefault() e.preventDefault()
setShowExpandProject(true) setShowExpandProject(true)
} }
@@ -209,8 +211,8 @@ function App() {
setShowKeyboardHelp(true) setShowKeyboardHelp(true)
} }
// R : Open reset modal (when project selected and agent not running) // R : Open reset modal (when project selected and agent not running/draining)
if ((e.key === 'r' || e.key === 'R') && selectedProject && wsState.agentStatus !== 'running') { if ((e.key === 'r' || e.key === 'R') && selectedProject && !['running', 'pausing', 'paused_graceful'].includes(wsState.agentStatus)) {
e.preventDefault() e.preventDefault()
setShowResetModal(true) setShowResetModal(true)
} }
@@ -244,7 +246,7 @@ function App() {
// Combine WebSocket progress with feature data // Combine WebSocket progress with feature data
const progress = wsState.progress.total > 0 ? wsState.progress : { const progress = wsState.progress.total > 0 ? wsState.progress : {
passing: features?.done.length ?? 0, passing: features?.done.length ?? 0,
total: (features?.pending.length ?? 0) + (features?.in_progress.length ?? 0) + (features?.done.length ?? 0), total: (features?.pending.length ?? 0) + (features?.in_progress.length ?? 0) + (features?.done.length ?? 0) + (features?.needs_human_input?.length ?? 0),
percentage: 0, percentage: 0,
} }
@@ -260,18 +262,19 @@ function App() {
<div className="min-h-screen bg-background"> <div className="min-h-screen bg-background">
{/* Header */} {/* Header */}
<header className="sticky top-0 z-50 bg-card/80 backdrop-blur-md text-foreground border-b-2 border-border"> <header className="sticky top-0 z-50 bg-card/80 backdrop-blur-md text-foreground border-b-2 border-border">
<div className="max-w-7xl mx-auto px-4 py-4"> <div className="max-w-7xl mx-auto px-4 py-3">
<div className="flex items-center justify-between"> <TooltipProvider>
{/* Logo and Title */} {/* Row 1: Branding + Project + Utility icons */}
<div className="flex items-center gap-3"> <div className="flex items-center gap-3">
<img src="/logo.png" alt="AutoForge" className="h-9 w-9 rounded-full" /> {/* Logo and Title */}
<h1 className="font-display text-2xl font-bold tracking-tight uppercase"> <div className="flex items-center gap-2 shrink-0">
AutoForge <img src="/logo.png" alt="AutoForge" className="h-9 w-9 rounded-full" />
</h1> <h1 className="font-display text-2xl font-bold tracking-tight uppercase hidden md:block">
</div> AutoForge
</h1>
</div>
{/* Controls */} {/* Project selector */}
<div className="flex items-center gap-4">
<ProjectSelector <ProjectSelector
projects={projects ?? []} projects={projects ?? []}
selectedProject={selectedProject} selectedProject={selectedProject}
@@ -280,94 +283,114 @@ function App() {
onSpecCreatingChange={setIsSpecCreating} onSpecCreatingChange={setIsSpecCreating}
/> />
{selectedProject && ( {/* Spacer */}
<> <div className="flex-1" />
<AgentControl
projectName={selectedProject}
status={wsState.agentStatus}
defaultConcurrency={selectedProjectData?.default_concurrency}
/>
<DevServerControl {/* Ollama Mode Indicator */}
projectName={selectedProject} {selectedProject && settings?.ollama_mode && (
status={wsState.devServerStatus} <div
url={wsState.devServerUrl} className="hidden sm:flex items-center gap-1.5 px-2 py-1 bg-card rounded border-2 border-border shadow-sm"
/> title="Using Ollama local models"
>
<Button <img src="/ollama.png" alt="Ollama" className="w-5 h-5" />
onClick={() => setShowSettings(true)} <span className="text-xs font-bold text-foreground">Ollama</span>
variant="outline" </div>
size="sm"
title="Settings (,)"
aria-label="Open Settings"
>
<Settings size={18} />
</Button>
<Button
onClick={() => setShowResetModal(true)}
variant="outline"
size="sm"
title="Reset Project (R)"
aria-label="Reset Project"
disabled={wsState.agentStatus === 'running'}
>
<RotateCcw size={18} />
</Button>
{/* Ollama Mode Indicator */}
{settings?.ollama_mode && (
<div
className="flex items-center gap-1.5 px-2 py-1 bg-card rounded border-2 border-border shadow-sm"
title="Using Ollama local models"
>
<img src="/ollama.png" alt="Ollama" className="w-5 h-5" />
<span className="text-xs font-bold text-foreground">Ollama</span>
</div>
)}
{/* GLM Mode Badge */}
{settings?.glm_mode && (
<Badge
className="bg-purple-500 text-white hover:bg-purple-600"
title="Using GLM API"
>
GLM
</Badge>
)}
</>
)} )}
{/* Docs link */} {/* GLM Mode Badge */}
<Button {selectedProject && settings?.glm_mode && (
onClick={() => window.open('https://autoforge.cc', '_blank')} <Badge
variant="outline" className="hidden sm:inline-flex bg-purple-500 text-white hover:bg-purple-600"
size="sm" title="Using GLM API"
title="Documentation" >
aria-label="Open Documentation" GLM
> </Badge>
<BookOpen size={18} /> )}
</Button>
{/* Utility icons - always visible */}
<Tooltip>
<TooltipTrigger asChild>
<Button
onClick={() => window.open('https://autoforge.cc', '_blank')}
variant="outline"
size="sm"
aria-label="Open Documentation"
>
<BookOpen size={18} />
</Button>
</TooltipTrigger>
<TooltipContent>Docs</TooltipContent>
</Tooltip>
{/* Theme selector */}
<ThemeSelector <ThemeSelector
themes={themes} themes={themes}
currentTheme={theme} currentTheme={theme}
onThemeChange={setTheme} onThemeChange={setTheme}
/> />
{/* Dark mode toggle - always visible */} <Tooltip>
<Button <TooltipTrigger asChild>
onClick={toggleDarkMode} <Button
variant="outline" onClick={toggleDarkMode}
size="sm" variant="outline"
title="Toggle dark mode" size="sm"
aria-label="Toggle dark mode" aria-label="Toggle dark mode"
> >
{darkMode ? <Sun size={18} /> : <Moon size={18} />} {darkMode ? <Sun size={18} /> : <Moon size={18} />}
</Button> </Button>
</TooltipTrigger>
<TooltipContent>Toggle dark mode</TooltipContent>
</Tooltip>
</div> </div>
</div>
{/* Row 2: Project controls - only when a project is selected */}
{selectedProject && (
<div className="flex items-center gap-3 mt-2 pt-2 border-t border-border/50">
<AgentControl
projectName={selectedProject}
status={wsState.agentStatus}
defaultConcurrency={selectedProjectData?.default_concurrency}
/>
<DevServerControl
projectName={selectedProject}
status={wsState.devServerStatus}
url={wsState.devServerUrl}
/>
<div className="flex-1" />
<Tooltip>
<TooltipTrigger asChild>
<Button
onClick={() => setShowSettings(true)}
variant="outline"
size="sm"
aria-label="Open Settings"
>
<Settings size={18} />
</Button>
</TooltipTrigger>
<TooltipContent>Settings (,)</TooltipContent>
</Tooltip>
<Tooltip>
<TooltipTrigger asChild>
<Button
onClick={() => setShowResetModal(true)}
variant="outline"
size="sm"
aria-label="Reset Project"
disabled={['running', 'pausing', 'paused_graceful'].includes(wsState.agentStatus)}
>
<RotateCcw size={18} />
</Button>
</TooltipTrigger>
<TooltipContent>Reset (R)</TooltipContent>
</Tooltip>
</div>
)}
</TooltipProvider>
</div> </div>
</header> </header>
@@ -421,6 +444,7 @@ function App() {
features.pending.length === 0 && features.pending.length === 0 &&
features.in_progress.length === 0 && features.in_progress.length === 0 &&
features.done.length === 0 && features.done.length === 0 &&
(features.needs_human_input?.length || 0) === 0 &&
wsState.agentStatus === 'running' && ( wsState.agentStatus === 'running' && (
<Card className="p-8 text-center"> <Card className="p-8 text-center">
<CardContent className="p-0"> <CardContent className="p-0">
@@ -436,7 +460,7 @@ function App() {
)} )}
{/* View Toggle - only show when there are features */} {/* View Toggle - only show when there are features */}
{features && (features.pending.length + features.in_progress.length + features.done.length) > 0 && ( {features && (features.pending.length + features.in_progress.length + features.done.length + (features.needs_human_input?.length || 0)) > 0 && (
<div className="flex justify-center"> <div className="flex justify-center">
<ViewToggle viewMode={viewMode} onViewModeChange={setViewMode} /> <ViewToggle viewMode={viewMode} onViewModeChange={setViewMode} />
</div> </div>


@@ -1,8 +1,10 @@
import { useState, useEffect, useRef, useCallback } from 'react' import { useState, useEffect, useRef, useCallback } from 'react'
import { Play, Square, Loader2, GitBranch, Clock } from 'lucide-react' import { Play, Square, Loader2, GitBranch, Clock, Pause, PlayCircle } from 'lucide-react'
import { import {
useStartAgent, useStartAgent,
useStopAgent, useStopAgent,
useGracefulPauseAgent,
useGracefulResumeAgent,
useSettings, useSettings,
useUpdateProjectSettings, useUpdateProjectSettings,
} from '../hooks/useProjects' } from '../hooks/useProjects'
@@ -60,12 +62,14 @@ export function AgentControl({ projectName, status, defaultConcurrency = 3 }: Ag
const startAgent = useStartAgent(projectName) const startAgent = useStartAgent(projectName)
const stopAgent = useStopAgent(projectName) const stopAgent = useStopAgent(projectName)
const gracefulPause = useGracefulPauseAgent(projectName)
const gracefulResume = useGracefulResumeAgent(projectName)
const { data: nextRun } = useNextScheduledRun(projectName) const { data: nextRun } = useNextScheduledRun(projectName)
const [showScheduleModal, setShowScheduleModal] = useState(false) const [showScheduleModal, setShowScheduleModal] = useState(false)
const isLoading = startAgent.isPending || stopAgent.isPending const isLoading = startAgent.isPending || stopAgent.isPending || gracefulPause.isPending || gracefulResume.isPending
const isRunning = status === 'running' || status === 'paused' const isRunning = status === 'running' || status === 'paused' || status === 'pausing' || status === 'paused_graceful'
const isLoadingStatus = status === 'loading' const isLoadingStatus = status === 'loading'
const isParallel = concurrency > 1 const isParallel = concurrency > 1
@@ -81,7 +85,7 @@ export function AgentControl({ projectName, status, defaultConcurrency = 3 }: Ag
return ( return (
<> <>
<div className="flex items-center gap-4"> <div className="flex items-center gap-2 sm:gap-4">
{/* Concurrency slider - visible when stopped */} {/* Concurrency slider - visible when stopped */}
{isStopped && ( {isStopped && (
<div className="flex items-center gap-2"> <div className="flex items-center gap-2">
@@ -126,7 +130,7 @@ export function AgentControl({ projectName, status, defaultConcurrency = 3 }: Ag
</Badge> </Badge>
)} )}
{/* Start/Stop button */} {/* Start/Stop/Pause/Resume buttons */}
{isLoadingStatus ? ( {isLoadingStatus ? (
<Button disabled variant="outline" size="sm"> <Button disabled variant="outline" size="sm">
<Loader2 size={18} className="animate-spin" /> <Loader2 size={18} className="animate-spin" />
@@ -146,19 +150,69 @@ export function AgentControl({ projectName, status, defaultConcurrency = 3 }: Ag
)} )}
</Button> </Button>
) : ( ) : (
<Button <div className="flex items-center gap-1.5">
onClick={handleStop} {/* Pausing indicator */}
disabled={isLoading} {status === 'pausing' && (
variant="destructive" <Badge variant="secondary" className="gap-1 animate-pulse">
size="sm" <Loader2 size={12} className="animate-spin" />
title={yoloMode ? 'Stop Agent (YOLO Mode)' : 'Stop Agent'} Pausing...
> </Badge>
{isLoading ? (
<Loader2 size={18} className="animate-spin" />
) : (
<Square size={18} />
)} )}
</Button>
{/* Paused indicator + Resume button */}
{status === 'paused_graceful' && (
<>
<Badge variant="outline" className="gap-1">
Paused
</Badge>
<Button
onClick={() => gracefulResume.mutate()}
disabled={isLoading}
variant="default"
size="sm"
title="Resume agent"
>
{gracefulResume.isPending ? (
<Loader2 size={18} className="animate-spin" />
) : (
<PlayCircle size={18} />
)}
</Button>
</>
)}
{/* Graceful pause button (only when running normally) */}
{status === 'running' && (
<Button
onClick={() => gracefulPause.mutate()}
disabled={isLoading}
variant="outline"
size="sm"
title="Pause agent (finish current work first)"
>
{gracefulPause.isPending ? (
<Loader2 size={18} className="animate-spin" />
) : (
<Pause size={18} />
)}
</Button>
)}
{/* Stop button (always available) */}
<Button
onClick={handleStop}
disabled={isLoading}
variant="destructive"
size="sm"
title="Stop Agent (immediate)"
>
{stopAgent.isPending ? (
<Loader2 size={18} className="animate-spin" />
) : (
<Square size={18} />
)}
</Button>
</div>
)} )}
{/* Clock button to open schedule modal */} {/* Clock button to open schedule modal */}
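Review note: the new button group in this hunk can be read as a small status-to-controls table. The sketch below restates it as a pure function, which may be easier to unit test than the JSX. The names `AgentRunStatus`, `ControlId`, and `visibleControls` are illustrative assumptions and do not appear in the diff; only the three statuses handled in this branch are modeled.

```typescript
// Illustrative names only: AgentRunStatus, ControlId, and visibleControls
// are assumptions for this sketch, not identifiers from the diff.
type AgentRunStatus = 'running' | 'pausing' | 'paused_graceful'
type ControlId = 'pausing-badge' | 'paused-badge' | 'resume' | 'pause' | 'stop'

// Returns the controls rendered for a given status, in render order.
function visibleControls(runStatus: AgentRunStatus): ControlId[] {
  const controls: ControlId[] = []
  if (runStatus === 'pausing') controls.push('pausing-badge')
  if (runStatus === 'paused_graceful') controls.push('paused-badge', 'resume')
  if (runStatus === 'running') controls.push('pause')
  // The Stop button is rendered unconditionally in this branch.
  controls.push('stop')
  return controls
}
```

For example, `visibleControls('paused_graceful')` lists the Paused badge, the Resume button, and Stop, matching the order of the JSX above.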

@@ -72,9 +72,13 @@ export function AgentMissionControl({
? `${agents.length} ${agents.length === 1 ? 'agent' : 'agents'} active` ? `${agents.length} ${agents.length === 1 ? 'agent' : 'agents'} active`
: orchestratorStatus?.state === 'initializing' : orchestratorStatus?.state === 'initializing'
? 'Initializing' ? 'Initializing'
: orchestratorStatus?.state === 'complete' : orchestratorStatus?.state === 'draining'
? 'Complete' ? 'Draining'
: 'Orchestrating' : orchestratorStatus?.state === 'paused'
? 'Paused'
: orchestratorStatus?.state === 'complete'
? 'Complete'
: 'Orchestrating'
} }
</Badge> </Badge>
</div> </div>
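Review note: the ternary chain grows by one level for each new orchestrator state. One possible alternative, sketched here under assumptions, is a lookup table with a fallback. `ORCHESTRATOR_LABELS` and `orchestratorLabel` are hypothetical names, and the leading branch of the ternary (the agent-count label) lies outside this hunk and is deliberately not modeled.

```typescript
// Hypothetical helper: maps the orchestrator states visible in this hunk
// to their badge text. Anything else falls back to 'Orchestrating'.
const ORCHESTRATOR_LABELS: Record<string, string> = {
  initializing: 'Initializing',
  draining: 'Draining',
  paused: 'Paused',
  complete: 'Complete',
}

function orchestratorLabel(state: string | undefined): string {
  // hasOwnProperty guards against inherited keys such as 'toString'.
  if (state !== undefined && Object.prototype.hasOwnProperty.call(ORCHESTRATOR_LABELS, state)) {
    return ORCHESTRATOR_LABELS[state] ?? 'Orchestrating'
  }
  return 'Orchestrating'
}
```

Adding a state then becomes a one-line table entry rather than another nested conditional.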

@@ -63,7 +63,7 @@ export function AgentThought({ logs, agentStatus }: AgentThoughtProps) {
// Determine if component should be visible // Determine if component should be visible
const shouldShow = useMemo(() => { const shouldShow = useMemo(() => {
if (!thought) return false if (!thought) return false
if (agentStatus === 'running') return true if (agentStatus === 'running' || agentStatus === 'pausing') return true
if (agentStatus === 'paused') { if (agentStatus === 'paused') {
return Date.now() - lastLogTimestamp < IDLE_TIMEOUT return Date.now() - lastLogTimestamp < IDLE_TIMEOUT
} }

@@ -11,6 +11,7 @@ import { Send, Loader2, Wifi, WifiOff, Plus, History } from 'lucide-react'
import { useAssistantChat } from '../hooks/useAssistantChat' import { useAssistantChat } from '../hooks/useAssistantChat'
import { ChatMessage as ChatMessageComponent } from './ChatMessage' import { ChatMessage as ChatMessageComponent } from './ChatMessage'
import { ConversationHistory } from './ConversationHistory' import { ConversationHistory } from './ConversationHistory'
import { QuestionOptions } from './QuestionOptions'
import type { ChatMessage } from '../lib/types' import type { ChatMessage } from '../lib/types'
import { isSubmitEnter } from '../lib/keyboard' import { isSubmitEnter } from '../lib/keyboard'
import { Button } from '@/components/ui/button' import { Button } from '@/components/ui/button'
@@ -52,8 +53,10 @@ export function AssistantChat({
isLoading, isLoading,
connectionStatus, connectionStatus,
conversationId: activeConversationId, conversationId: activeConversationId,
currentQuestions,
start, start,
sendMessage, sendMessage,
sendAnswer,
clearMessages, clearMessages,
} = useAssistantChat({ } = useAssistantChat({
projectName, projectName,
@@ -268,6 +271,16 @@ export function AssistantChat({
</div> </div>
)} )}
{/* Structured questions from assistant */}
{currentQuestions && (
<div className="border-t border-border bg-background">
<QuestionOptions
questions={currentQuestions}
onSubmit={sendAnswer}
/>
</div>
)}
{/* Input area */} {/* Input area */}
<div className="border-t border-border p-4 bg-card"> <div className="border-t border-border p-4 bg-card">
<div className="flex gap-2"> <div className="flex gap-2">
@@ -277,13 +290,13 @@ export function AssistantChat({
onChange={(e) => setInputValue(e.target.value)} onChange={(e) => setInputValue(e.target.value)}
onKeyDown={handleKeyDown} onKeyDown={handleKeyDown}
placeholder="Ask about the codebase..." placeholder="Ask about the codebase..."
disabled={isLoading || isLoadingConversation || connectionStatus !== 'connected'} disabled={isLoading || isLoadingConversation || connectionStatus !== 'connected' || !!currentQuestions}
className="flex-1 resize-none min-h-[44px] max-h-[120px]" className="flex-1 resize-none min-h-[44px] max-h-[120px]"
rows={1} rows={1}
/> />
<Button <Button
onClick={handleSend} onClick={handleSend}
disabled={!inputValue.trim() || isLoading || isLoadingConversation || connectionStatus !== 'connected'} disabled={!inputValue.trim() || isLoading || isLoadingConversation || connectionStatus !== 'connected' || !!currentQuestions}
title="Send message" title="Send message"
> >
{isLoading ? ( {isLoading ? (
@@ -294,7 +307,7 @@ export function AssistantChat({
</Button> </Button>
</div> </div>
<p className="text-xs text-muted-foreground mt-2"> <p className="text-xs text-muted-foreground mt-2">
Press Enter to send, Shift+Enter for new line {currentQuestions ? 'Select an option above to continue' : 'Press Enter to send, Shift+Enter for new line'}
</p> </p>
</div> </div>
</div> </div>
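Review note: the same four-part `disabled` expression now appears on both the textarea and the Send button. A minimal sketch of how it could be factored out follows. `ChatInputState`, `isTextareaDisabled`, `isSendDisabled`, and `inputHint` are hypothetical names, and `hasPendingQuestions` stands in for `!!currentQuestions` from the diff.

```typescript
// Hypothetical shape bundling the values the diff reads from the hook and
// local state. hasPendingQuestions corresponds to !!currentQuestions.
interface ChatInputState {
  isLoading: boolean
  isLoadingConversation: boolean
  connectionStatus: string
  hasPendingQuestions: boolean
  inputValue: string
}

// Free-text input is locked while a structured question awaits an answer.
function isTextareaDisabled(s: ChatInputState): boolean {
  return (
    s.isLoading ||
    s.isLoadingConversation ||
    s.connectionStatus !== 'connected' ||
    s.hasPendingQuestions
  )
}

// The Send button additionally requires non-blank input.
function isSendDisabled(s: ChatInputState): boolean {
  return !s.inputValue.trim() || isTextareaDisabled(s)
}

// Helper text under the input, with both strings taken from the diff.
function inputHint(hasPendingQuestions: boolean): string {
  return hasPendingQuestions
    ? 'Select an option above to continue'
    : 'Press Enter to send, Shift+Enter for new line'
}
```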

@@ -6,7 +6,7 @@
* Manages conversation state with localStorage persistence. * Manages conversation state with localStorage persistence.
*/ */
import { useState, useEffect, useCallback } from 'react' import { useState, useEffect, useCallback, useRef } from 'react'
import { X, Bot } from 'lucide-react' import { X, Bot } from 'lucide-react'
import { AssistantChat } from './AssistantChat' import { AssistantChat } from './AssistantChat'
import { useConversation } from '../hooks/useConversations' import { useConversation } from '../hooks/useConversations'
@@ -20,6 +20,10 @@ interface AssistantPanelProps {
} }
const STORAGE_KEY_PREFIX = 'assistant-conversation-' const STORAGE_KEY_PREFIX = 'assistant-conversation-'
const WIDTH_STORAGE_KEY = 'assistant-panel-width'
const DEFAULT_WIDTH = 400
const MIN_WIDTH = 300
const MAX_WIDTH_VW = 90
function getStoredConversationId(projectName: string): number | null { function getStoredConversationId(projectName: string): number | null {
try { try {
@@ -100,6 +104,49 @@ export function AssistantPanel({ projectName, isOpen, onClose }: AssistantPanelP
setConversationId(id) setConversationId(id)
}, []) }, [])
// Resizable panel width
const [panelWidth, setPanelWidth] = useState<number>(() => {
try {
const stored = localStorage.getItem(WIDTH_STORAGE_KEY)
if (stored) return Math.max(MIN_WIDTH, parseInt(stored, 10))
} catch { /* ignore */ }
return DEFAULT_WIDTH
})
const isResizing = useRef(false)
const handleMouseDown = useCallback((e: React.MouseEvent) => {
e.preventDefault()
isResizing.current = true
const startX = e.clientX
const startWidth = panelWidth
const maxWidth = window.innerWidth * (MAX_WIDTH_VW / 100)
const handleMouseMove = (e: MouseEvent) => {
if (!isResizing.current) return
const delta = startX - e.clientX
const newWidth = Math.min(maxWidth, Math.max(MIN_WIDTH, startWidth + delta))
setPanelWidth(newWidth)
}
const handleMouseUp = () => {
isResizing.current = false
document.removeEventListener('mousemove', handleMouseMove)
document.removeEventListener('mouseup', handleMouseUp)
document.body.style.cursor = ''
document.body.style.userSelect = ''
// Persist width
setPanelWidth((w) => {
localStorage.setItem(WIDTH_STORAGE_KEY, String(w))
return w
})
}
document.body.style.cursor = 'col-resize'
document.body.style.userSelect = 'none'
document.addEventListener('mousemove', handleMouseMove)
document.addEventListener('mouseup', handleMouseUp)
}, [panelWidth])
return ( return (
<> <>
{/* Backdrop - click to close */} {/* Backdrop - click to close */}
@@ -115,17 +162,25 @@ export function AssistantPanel({ projectName, isOpen, onClose }: AssistantPanelP
<div <div
className={` className={`
fixed right-0 top-0 bottom-0 z-50 fixed right-0 top-0 bottom-0 z-50
w-[400px] max-w-[90vw]
bg-card bg-card
border-l border-border border-l border-border
transform transition-transform duration-300 ease-out transform transition-transform duration-300 ease-out
flex flex-col shadow-xl flex flex-col shadow-xl
${isOpen ? 'translate-x-0' : 'translate-x-full'} ${isOpen ? 'translate-x-0' : 'translate-x-full'}
`} `}
style={{ width: `${panelWidth}px`, maxWidth: `${MAX_WIDTH_VW}vw` }}
role="dialog" role="dialog"
aria-label="Project Assistant" aria-label="Project Assistant"
aria-hidden={!isOpen} aria-hidden={!isOpen}
> >
{/* Resize handle */}
<div
className="absolute left-0 top-0 bottom-0 w-1.5 cursor-col-resize z-10 group"
onMouseDown={handleMouseDown}
>
<div className="absolute inset-y-0 left-0 w-0.5 bg-border group-hover:bg-primary transition-colors" />
</div>
{/* Header */} {/* Header */}
<div className="flex items-center justify-between px-4 py-3 border-b border-border bg-primary text-primary-foreground"> <div className="flex items-center justify-between px-4 py-3 border-b border-border bg-primary text-primary-foreground">
<div className="flex items-center gap-2"> <div className="flex items-center gap-2">
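Review note: the width logic in this file can be isolated into pure helpers, sketched below. The constants are copied from the diff; `clampPanelWidth`, `parseStoredWidth`, and `widthAfterDrag` are hypothetical names. One behavior differs on purpose and is an assumption of this sketch: the diff passes `parseInt(stored, 10)` straight into `Math.max`, which yields `NaN` when the stored value is not numeric, whereas the sketch falls back to the default width in that case.

```typescript
// Constants as declared in the diff.
const MIN_WIDTH = 300
const DEFAULT_WIDTH = 400
const MAX_WIDTH_VW = 90

// Clamp a candidate width between MIN_WIDTH and 90% of the viewport.
function clampPanelWidth(width: number, viewportWidth: number): number {
  const maxWidth = viewportWidth * (MAX_WIDTH_VW / 100)
  return Math.min(maxWidth, Math.max(MIN_WIDTH, width))
}

// Read a persisted width. Assumption of this sketch: a non-numeric stored
// value falls back to DEFAULT_WIDTH instead of propagating NaN.
function parseStoredWidth(stored: string | null): number {
  if (!stored) return DEFAULT_WIDTH
  const parsed = parseInt(stored, 10)
  return isNaN(parsed) ? DEFAULT_WIDTH : Math.max(MIN_WIDTH, parsed)
}

// The panel is anchored to the right edge, so dragging the handle to the
// left (a smaller clientX) makes the panel wider.
function widthAfterDrag(
  startWidth: number,
  startX: number,
  currentX: number,
  viewportWidth: number,
): number {
  const delta = startX - currentX
  return clampPanelWidth(startWidth + delta, viewportWidth)
}
```

Keeping the arithmetic outside the mouse handlers also makes it possible to test the clamping without a DOM.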

@@ -7,6 +7,8 @@
import { memo } from 'react' import { memo } from 'react'
import { Bot, User, Info } from 'lucide-react' import { Bot, User, Info } from 'lucide-react'
import ReactMarkdown, { type Components } from 'react-markdown'
import remarkGfm from 'remark-gfm'
import type { ChatMessage as ChatMessageType } from '../lib/types' import type { ChatMessage as ChatMessageType } from '../lib/types'
import { Card } from '@/components/ui/card' import { Card } from '@/components/ui/card'
@@ -14,8 +16,16 @@ interface ChatMessageProps {
message: ChatMessageType message: ChatMessageType
} }
// Module-level regex to avoid recreating on each render // Stable references for memo — avoids re-renders
const BOLD_REGEX = /\*\*(.*?)\*\*/g const remarkPlugins = [remarkGfm]
const markdownComponents: Components = {
a: ({ children, href, ...props }) => (
<a href={href} target="_blank" rel="noopener noreferrer" {...props}>
{children}
</a>
),
}
export const ChatMessage = memo(function ChatMessage({ message }: ChatMessageProps) { export const ChatMessage = memo(function ChatMessage({ message }: ChatMessageProps) {
const { role, content, attachments, timestamp, isStreaming } = message const { role, content, attachments, timestamp, isStreaming } = message
@@ -86,39 +96,11 @@ export const ChatMessage = memo(function ChatMessage({ message }: ChatMessagePro
)} )}
<Card className={`${config.bgColor} px-4 py-3 border ${isStreaming ? 'animate-pulse' : ''}`}> <Card className={`${config.bgColor} px-4 py-3 border ${isStreaming ? 'animate-pulse' : ''}`}>
{/* Parse content for basic markdown-like formatting */}
{content && ( {content && (
<div className={`whitespace-pre-wrap text-sm leading-relaxed ${config.textColor}`}> <div className={`text-sm leading-relaxed ${config.textColor} chat-prose${role === 'user' ? ' chat-prose-user' : ''}`}>
{content.split('\n').map((line, i) => { <ReactMarkdown remarkPlugins={remarkPlugins} components={markdownComponents}>
// Bold text - use module-level regex, reset lastIndex for each line {content}
BOLD_REGEX.lastIndex = 0 </ReactMarkdown>
const parts = []
let lastIndex = 0
let match
while ((match = BOLD_REGEX.exec(line)) !== null) {
if (match.index > lastIndex) {
parts.push(line.slice(lastIndex, match.index))
}
parts.push(
<strong key={`bold-${i}-${match.index}`} className="font-bold">
{match[1]}
</strong>
)
lastIndex = match.index + match[0].length
}
if (lastIndex < line.length) {
parts.push(line.slice(lastIndex))
}
return (
<span key={i}>
{parts.length > 0 ? parts : line}
{i < content.split('\n').length - 1 && '\n'}
</span>
)
})}
</div> </div>
)} )}

@@ -15,7 +15,7 @@ import {
Handle, Handle,
} from '@xyflow/react' } from '@xyflow/react'
import dagre from 'dagre' import dagre from 'dagre'
import { CheckCircle2, Circle, Loader2, AlertTriangle, RefreshCw } from 'lucide-react' import { CheckCircle2, Circle, Loader2, AlertTriangle, RefreshCw, UserCircle } from 'lucide-react'
import type { DependencyGraph as DependencyGraphData, GraphNode, ActiveAgent, AgentMascot, AgentState } from '../lib/types' import type { DependencyGraph as DependencyGraphData, GraphNode, ActiveAgent, AgentMascot, AgentState } from '../lib/types'
import { AgentAvatar } from './AgentAvatar' import { AgentAvatar } from './AgentAvatar'
import { Button } from '@/components/ui/button' import { Button } from '@/components/ui/button'
@@ -93,18 +93,20 @@ class GraphErrorBoundary extends Component<ErrorBoundaryProps, ErrorBoundaryStat
// Custom node component // Custom node component
function FeatureNode({ data }: { data: GraphNode & { onClick?: () => void; agent?: NodeAgentInfo } }) { function FeatureNode({ data }: { data: GraphNode & { onClick?: () => void; agent?: NodeAgentInfo } }) {
const statusColors = { const statusColors: Record<string, string> = {
pending: 'bg-yellow-100 border-yellow-300 dark:bg-yellow-900/30 dark:border-yellow-700', pending: 'bg-yellow-100 border-yellow-300 dark:bg-yellow-900/30 dark:border-yellow-700',
in_progress: 'bg-cyan-100 border-cyan-300 dark:bg-cyan-900/30 dark:border-cyan-700', in_progress: 'bg-cyan-100 border-cyan-300 dark:bg-cyan-900/30 dark:border-cyan-700',
done: 'bg-green-100 border-green-300 dark:bg-green-900/30 dark:border-green-700', done: 'bg-green-100 border-green-300 dark:bg-green-900/30 dark:border-green-700',
blocked: 'bg-red-50 border-red-300 dark:bg-red-900/20 dark:border-red-700', blocked: 'bg-red-50 border-red-300 dark:bg-red-900/20 dark:border-red-700',
needs_human_input: 'bg-amber-100 border-amber-300 dark:bg-amber-900/30 dark:border-amber-700',
} }
const textColors = { const textColors: Record<string, string> = {
pending: 'text-yellow-900 dark:text-yellow-100', pending: 'text-yellow-900 dark:text-yellow-100',
in_progress: 'text-cyan-900 dark:text-cyan-100', in_progress: 'text-cyan-900 dark:text-cyan-100',
done: 'text-green-900 dark:text-green-100', done: 'text-green-900 dark:text-green-100',
blocked: 'text-red-900 dark:text-red-100', blocked: 'text-red-900 dark:text-red-100',
needs_human_input: 'text-amber-900 dark:text-amber-100',
} }
const StatusIcon = () => { const StatusIcon = () => {
@@ -115,6 +117,8 @@ function FeatureNode({ data }: { data: GraphNode & { onClick?: () => void; agent
return <Loader2 size={16} className={`${textColors[data.status]} animate-spin`} /> return <Loader2 size={16} className={`${textColors[data.status]} animate-spin`} />
case 'blocked': case 'blocked':
return <AlertTriangle size={16} className="text-destructive" /> return <AlertTriangle size={16} className="text-destructive" />
case 'needs_human_input':
return <UserCircle size={16} className={textColors[data.status]} />
default: default:
return <Circle size={16} className={textColors[data.status]} /> return <Circle size={16} className={textColors[data.status]} />
} }
@@ -323,6 +327,8 @@ function DependencyGraphInner({ graphData, onNodeClick, activeAgents = [] }: Dep
return '#06b6d4' // cyan-500 return '#06b6d4' // cyan-500
case 'blocked': case 'blocked':
return '#ef4444' // red-500 return '#ef4444' // red-500
case 'needs_human_input':
return '#f59e0b' // amber-500
default: default:
return '#eab308' // yellow-500 return '#eab308' // yellow-500
} }
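Review note: widening `statusColors` and `textColors` to `Record<string, string>` lets any status string index them, so an unrecognized status would produce `undefined` in the class name. A possible guard, sketched under assumptions, is shown below. `STATUS_CLASSES`, `PENDING_CLASSES`, and `classesForStatus` are hypothetical names, and falling back to the pending styling is a choice made for this sketch (it mirrors the yellow `default` case in the switch statements), not something the diff does.

```typescript
// Class strings copied from the statusColors map in the diff.
const PENDING_CLASSES =
  'bg-yellow-100 border-yellow-300 dark:bg-yellow-900/30 dark:border-yellow-700'

const STATUS_CLASSES: Record<string, string> = {
  pending: PENDING_CLASSES,
  in_progress: 'bg-cyan-100 border-cyan-300 dark:bg-cyan-900/30 dark:border-cyan-700',
  done: 'bg-green-100 border-green-300 dark:bg-green-900/30 dark:border-green-700',
  blocked: 'bg-red-50 border-red-300 dark:bg-red-900/20 dark:border-red-700',
  needs_human_input: 'bg-amber-100 border-amber-300 dark:bg-amber-900/30 dark:border-amber-700',
}

// Assumption: unknown statuses reuse the pending styling.
function classesForStatus(nodeStatus: string): string {
  const known = Object.prototype.hasOwnProperty.call(STATUS_CLASSES, nodeStatus)
  return (known ? STATUS_CLASSES[nodeStatus] : undefined) ?? PENDING_CLASSES
}
```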

@@ -0,0 +1,182 @@
import { useState, useEffect } from 'react'
import { Loader2, RotateCcw, Terminal } from 'lucide-react'
import { useQueryClient } from '@tanstack/react-query'
import {
Dialog,
DialogContent,
DialogDescription,
DialogFooter,
DialogHeader,
DialogTitle,
} from '@/components/ui/dialog'
import { Button } from '@/components/ui/button'
import { Input } from '@/components/ui/input'
import { Label } from '@/components/ui/label'
import { useDevServerConfig, useUpdateDevServerConfig } from '@/hooks/useProjects'
import { startDevServer } from '@/lib/api'
interface DevServerConfigDialogProps {
projectName: string
isOpen: boolean
onClose: () => void
autoStartOnSave?: boolean
}
export function DevServerConfigDialog({
projectName,
isOpen,
onClose,
autoStartOnSave = false,
}: DevServerConfigDialogProps) {
const { data: config } = useDevServerConfig(isOpen ? projectName : null)
const updateConfig = useUpdateDevServerConfig(projectName)
const queryClient = useQueryClient()
const [command, setCommand] = useState('')
const [error, setError] = useState<string | null>(null)
const [isSaving, setIsSaving] = useState(false)
// Sync input with config when dialog opens or config loads
useEffect(() => {
if (isOpen && config) {
setCommand(config.custom_command ?? config.effective_command ?? '')
setError(null)
}
}, [isOpen, config])
const hasCustomCommand = !!config?.custom_command
const handleSaveAndStart = async () => {
const trimmed = command.trim()
if (!trimmed) {
setError('Please enter a dev server command.')
return
}
setIsSaving(true)
setError(null)
try {
await updateConfig.mutateAsync(trimmed)
if (autoStartOnSave) {
await startDevServer(projectName)
queryClient.invalidateQueries({ queryKey: ['dev-server-status', projectName] })
}
onClose()
} catch (err) {
setError(err instanceof Error ? err.message : 'Failed to save configuration')
} finally {
setIsSaving(false)
}
}
const handleClear = async () => {
setIsSaving(true)
setError(null)
try {
await updateConfig.mutateAsync(null)
setCommand(config?.detected_command ?? '')
} catch (err) {
setError(err instanceof Error ? err.message : 'Failed to clear configuration')
} finally {
setIsSaving(false)
}
}
return (
<Dialog open={isOpen} onOpenChange={(open) => !open && onClose()}>
<DialogContent className="sm:max-w-lg">
<DialogHeader>
<div className="flex items-center gap-3">
<div className="p-2 rounded-lg bg-primary/10 text-primary">
<Terminal size={20} />
</div>
<DialogTitle>Dev Server Configuration</DialogTitle>
</div>
</DialogHeader>
<DialogDescription asChild>
<div className="space-y-4">
{/* Detection info */}
<div className="rounded-lg border-2 border-border bg-muted/50 p-3 text-sm">
{config?.detected_type ? (
<p>
Detected project type: <strong className="text-foreground">{config.detected_type}</strong>
{config.detected_command && (
<span className="text-muted-foreground"> {config.detected_command}</span>
)}
</p>
) : (
<p className="text-muted-foreground">
No project type detected. Enter a custom command below.
</p>
)}
</div>
{/* Command input */}
<div className="space-y-2">
<Label htmlFor="dev-command" className="text-foreground">Dev server command</Label>
<Input
id="dev-command"
value={command}
onChange={(e) => {
setCommand(e.target.value)
setError(null)
}}
placeholder="npm run dev"
onKeyDown={(e) => {
if (e.key === 'Enter' && !isSaving) {
handleSaveAndStart()
}
}}
/>
<p className="text-xs text-muted-foreground">
Allowed runners: npm, npx, pnpm, yarn, python, uvicorn, flask, poetry, cargo, go
</p>
</div>
{/* Clear custom command button */}
{hasCustomCommand && (
<Button
variant="outline"
size="sm"
onClick={handleClear}
disabled={isSaving}
className="gap-1.5"
>
<RotateCcw size={14} />
Clear custom command (use auto-detection)
</Button>
)}
{/* Error display */}
{error && (
<p className="text-sm font-mono text-destructive">{error}</p>
)}
</div>
</DialogDescription>
<DialogFooter className="gap-2 sm:gap-0">
<Button variant="outline" onClick={onClose} disabled={isSaving}>
Cancel
</Button>
<Button onClick={handleSaveAndStart} disabled={isSaving}>
{isSaving ? (
<>
<Loader2 size={16} className="animate-spin mr-1.5" />
Saving...
</>
) : autoStartOnSave ? (
'Save & Start'
) : (
'Save'
)}
</Button>
</DialogFooter>
</DialogContent>
</Dialog>
)
}
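Review note: the dialog lists the allowed runners as help text but only checks for an empty command before saving. A client-side pre-check could surface an unsupported runner before the request is sent. The sketch below is an assumption about how such a check might look: `ALLOWED_RUNNERS` is copied from the help text, `validateDevCommand` is a hypothetical name, and the diff does not show how the server validates commands, so this is not a substitute for server-side validation.

```typescript
// Runner names copied from the dialog's help text.
const ALLOWED_RUNNERS = [
  'npm', 'npx', 'pnpm', 'yarn', 'python',
  'uvicorn', 'flask', 'poetry', 'cargo', 'go',
]

// Hypothetical pre-check: returns an error message, or null when the
// command looks acceptable. Only the first whitespace-separated token
// (the runner) is inspected.
function validateDevCommand(raw: string): string | null {
  const trimmed = raw.trim()
  if (!trimmed) return 'Please enter a dev server command.'
  const [runner = ''] = trimmed.split(/\s+/)
  if (ALLOWED_RUNNERS.indexOf(runner) === -1) {
    return `Unsupported runner "${runner}".`
  }
  return null
}
```

In `handleSaveAndStart`, the result could be passed to `setError` before calling `updateConfig.mutateAsync`.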

@@ -1,8 +1,10 @@
import { Globe, Square, Loader2, ExternalLink, AlertTriangle } from 'lucide-react' import { useState } from 'react'
import { Globe, Square, Loader2, ExternalLink, AlertTriangle, Settings2 } from 'lucide-react'
import { useMutation, useQueryClient } from '@tanstack/react-query' import { useMutation, useQueryClient } from '@tanstack/react-query'
import type { DevServerStatus } from '../lib/types' import type { DevServerStatus } from '../lib/types'
import { startDevServer, stopDevServer } from '../lib/api' import { startDevServer, stopDevServer } from '../lib/api'
import { Button } from '@/components/ui/button' import { Button } from '@/components/ui/button'
import { DevServerConfigDialog } from './DevServerConfigDialog'
// Re-export DevServerStatus from lib/types for consumers that import from here // Re-export DevServerStatus from lib/types for consumers that import from here
export type { DevServerStatus } export type { DevServerStatus }
@@ -59,17 +61,27 @@ interface DevServerControlProps {
* - Shows loading state during operations * - Shows loading state during operations
* - Displays clickable URL when server is running * - Displays clickable URL when server is running
* - Uses neobrutalism design with cyan accent when running * - Uses neobrutalism design with cyan accent when running
* - Config dialog for setting custom dev commands
*/ */
export function DevServerControl({ projectName, status, url }: DevServerControlProps) { export function DevServerControl({ projectName, status, url }: DevServerControlProps) {
const startDevServerMutation = useStartDevServer(projectName) const startDevServerMutation = useStartDevServer(projectName)
const stopDevServerMutation = useStopDevServer(projectName) const stopDevServerMutation = useStopDevServer(projectName)
const [showConfigDialog, setShowConfigDialog] = useState(false)
const [autoStartOnSave, setAutoStartOnSave] = useState(false)
const isLoading = startDevServerMutation.isPending || stopDevServerMutation.isPending const isLoading = startDevServerMutation.isPending || stopDevServerMutation.isPending
const handleStart = () => { const handleStart = () => {
// Clear any previous errors before starting // Clear any previous errors before starting
stopDevServerMutation.reset() stopDevServerMutation.reset()
startDevServerMutation.mutate() startDevServerMutation.mutate(undefined, {
onError: (err) => {
if (err.message?.includes('No dev command available')) {
setAutoStartOnSave(true)
setShowConfigDialog(true)
}
},
})
} }
const handleStop = () => { const handleStop = () => {
// Clear any previous errors before stopping // Clear any previous errors before stopping
@@ -77,6 +89,19 @@ export function DevServerControl({ projectName, status, url }: DevServerControlP
stopDevServerMutation.mutate() stopDevServerMutation.mutate()
} }
const handleOpenConfig = () => {
setAutoStartOnSave(false)
setShowConfigDialog(true)
}
const handleCloseConfig = () => {
setShowConfigDialog(false)
// Clear the start error if config dialog was opened reactively
if (startDevServerMutation.error?.message?.includes('No dev command available')) {
startDevServerMutation.reset()
}
}
// Server is stopped when status is 'stopped' or 'crashed' (can restart) // Server is stopped when status is 'stopped' or 'crashed' (can restart)
const isStopped = status === 'stopped' || status === 'crashed' const isStopped = status === 'stopped' || status === 'crashed'
// Server is in a running state // Server is in a running state
@@ -84,25 +109,40 @@ export function DevServerControl({ projectName, status, url }: DevServerControlP
// Server has crashed // Server has crashed
const isCrashed = status === 'crashed' const isCrashed = status === 'crashed'
// Hide inline error when config dialog is handling it
const startError = startDevServerMutation.error
const showInlineError = startError && !startError.message?.includes('No dev command available')
return ( return (
<div className="flex items-center gap-2"> <div className="flex items-center gap-2">
{isStopped ? ( {isStopped ? (
<Button <>
onClick={handleStart} <Button
disabled={isLoading} onClick={handleStart}
variant={isCrashed ? "destructive" : "outline"} disabled={isLoading}
size="sm" variant={isCrashed ? "destructive" : "outline"}
title={isCrashed ? "Dev Server Crashed - Click to Restart" : "Start Dev Server"} size="sm"
aria-label={isCrashed ? "Restart Dev Server (crashed)" : "Start Dev Server"} title={isCrashed ? "Dev Server Crashed - Click to Restart" : "Start Dev Server"}
> aria-label={isCrashed ? "Restart Dev Server (crashed)" : "Start Dev Server"}
{isLoading ? ( >
<Loader2 size={18} className="animate-spin" /> {isLoading ? (
) : isCrashed ? ( <Loader2 size={18} className="animate-spin" />
<AlertTriangle size={18} /> ) : isCrashed ? (
) : ( <AlertTriangle size={18} />
<Globe size={18} /> ) : (
)} <Globe size={18} />
</Button> )}
</Button>
<Button
onClick={handleOpenConfig}
variant="ghost"
size="sm"
title="Configure Dev Server"
aria-label="Configure Dev Server"
>
<Settings2 size={16} />
</Button>
</>
) : ( ) : (
<Button <Button
onClick={handleStop} onClick={handleStop}
@@ -139,12 +179,20 @@ export function DevServerControl({ projectName, status, url }: DevServerControlP
</Button> </Button>
)} )}
{/* Error display */} {/* Error display (hide "no dev command" error when config dialog handles it) */}
{(startDevServerMutation.error || stopDevServerMutation.error) && ( {(showInlineError || stopDevServerMutation.error) && (
<span className="text-xs font-mono text-destructive ml-2"> <span className="text-xs font-mono text-destructive ml-2">
{String((startDevServerMutation.error || stopDevServerMutation.error)?.message || 'Operation failed')} {String((showInlineError ? startError : stopDevServerMutation.error)?.message || 'Operation failed')}
</span> </span>
)} )}
{/* Dev Server Config Dialog */}
<DevServerConfigDialog
projectName={projectName}
isOpen={showConfigDialog}
onClose={handleCloseConfig}
autoStartOnSave={autoStartOnSave}
/>
</div> </div>
) )
} }
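Review note: the substring `'No dev command available'` is matched in three places in this file. A sketch of one way to centralize the check and the inline error selection follows. `NO_COMMAND_MARKER`, `ErrorLike`, `isMissingCommandError`, and `inlineErrorText` are hypothetical names; the selection logic is intended to mirror the `showInlineError` expression in the diff.

```typescript
// The marker string the diff matches against the start error message.
const NO_COMMAND_MARKER = 'No dev command available'

type ErrorLike = { message?: string } | null | undefined

// True when the error is the one the config dialog handles.
function isMissingCommandError(err: ErrorLike): boolean {
  const message = err?.message ?? ''
  return message.indexOf(NO_COMMAND_MARKER) !== -1
}

// Text for the inline error span, or null when nothing should be shown.
// A missing-command start error is suppressed because the dialog opens.
function inlineErrorText(startError: ErrorLike, stopError: ErrorLike): string | null {
  const showStartError = !!startError && !isMissingCommandError(startError)
  if (!showStartError && !stopError) return null
  const active = showStartError ? startError : stopError
  return String(active?.message || 'Operation failed')
}
```

Matching on message text is fragile if the server wording changes; a structured error code would be more robust, though the diff gives no indication that one exists.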

@@ -1,4 +1,4 @@
import { CheckCircle2, Circle, Loader2, MessageCircle } from 'lucide-react' import { CheckCircle2, Circle, Loader2, MessageCircle, UserCircle } from 'lucide-react'
import type { Feature, ActiveAgent } from '../lib/types' import type { Feature, ActiveAgent } from '../lib/types'
import { DependencyBadge } from './DependencyBadge' import { DependencyBadge } from './DependencyBadge'
import { AgentAvatar } from './AgentAvatar' import { AgentAvatar } from './AgentAvatar'
@@ -45,7 +45,8 @@ export function FeatureCard({ feature, onClick, isInProgress, allFeatures = [],
cursor-pointer transition-all hover:border-primary py-3 cursor-pointer transition-all hover:border-primary py-3
${isInProgress ? 'animate-pulse' : ''} ${isInProgress ? 'animate-pulse' : ''}
${feature.passes ? 'border-primary/50' : ''} ${feature.passes ? 'border-primary/50' : ''}
${isBlocked && !feature.passes ? 'border-destructive/50 opacity-80' : ''} ${feature.needs_human_input ? 'border-amber-500/50' : ''}
${isBlocked && !feature.passes && !feature.needs_human_input ? 'border-destructive/50 opacity-80' : ''}
${hasActiveAgent ? 'ring-2 ring-primary ring-offset-2' : ''} ${hasActiveAgent ? 'ring-2 ring-primary ring-offset-2' : ''}
`} `}
> >
@@ -105,6 +106,11 @@ export function FeatureCard({ feature, onClick, isInProgress, allFeatures = [],
<CheckCircle2 size={16} className="text-primary" /> <CheckCircle2 size={16} className="text-primary" />
<span className="text-primary font-medium">Complete</span> <span className="text-primary font-medium">Complete</span>
</> </>
) : feature.needs_human_input ? (
<>
<UserCircle size={16} className="text-amber-500" />
<span className="text-amber-500 font-medium">Needs Your Input</span>
</>
) : isBlocked ? ( ) : isBlocked ? (
<> <>
<Circle size={16} className="text-destructive" /> <Circle size={16} className="text-destructive" />
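Review note: the card now has an implicit precedence of complete, then needs-input, then blocked. The sketch below spells that out as pure functions. `FeatureFlags`, `CardState`, `cardState`, and `borderClasses` are hypothetical names; the state after `blocked` is outside this hunk and is represented only as `'other'`.

```typescript
// Minimal subset of the Feature fields this hunk reads.
interface FeatureFlags {
  passes: boolean
  needs_human_input?: boolean
}

type CardState = 'complete' | 'needs_human_input' | 'blocked' | 'other'

// Status label precedence, following the ternary order in the diff.
function cardState(feature: FeatureFlags, isBlocked: boolean): CardState {
  if (feature.passes) return 'complete'
  if (feature.needs_human_input) return 'needs_human_input'
  if (isBlocked) return 'blocked'
  return 'other'
}

// Border classes as composed in the className template. The blocked
// styling is skipped when the feature passes or is waiting on a human.
function borderClasses(feature: FeatureFlags, isBlocked: boolean): string[] {
  const classes: string[] = []
  if (feature.passes) classes.push('border-primary/50')
  if (feature.needs_human_input) classes.push('border-amber-500/50')
  if (isBlocked && !feature.passes && !feature.needs_human_input) {
    classes.push('border-destructive/50 opacity-80')
  }
  return classes
}
```

Note that the first two border classes are not mutually exclusive in the diff: a feature with both `passes` and `needs_human_input` set would receive both.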

@@ -1,7 +1,8 @@
import { useState } from 'react' import { useState } from 'react'
import { X, CheckCircle2, Circle, SkipForward, Trash2, Loader2, AlertCircle, Pencil, Link2, AlertTriangle } from 'lucide-react' import { X, CheckCircle2, Circle, SkipForward, Trash2, Loader2, AlertCircle, Pencil, Link2, AlertTriangle, UserCircle } from 'lucide-react'
import { useSkipFeature, useDeleteFeature, useFeatures } from '../hooks/useProjects' import { useSkipFeature, useDeleteFeature, useFeatures, useResolveHumanInput } from '../hooks/useProjects'
import { EditFeatureForm } from './EditFeatureForm' import { EditFeatureForm } from './EditFeatureForm'
import { HumanInputForm } from './HumanInputForm'
import type { Feature } from '../lib/types' import type { Feature } from '../lib/types'
import { import {
Dialog, Dialog,
@@ -50,10 +51,12 @@ export function FeatureModal({ feature, projectName, onClose }: FeatureModalProp
const deleteFeature = useDeleteFeature(projectName) const deleteFeature = useDeleteFeature(projectName)
const { data: allFeatures } = useFeatures(projectName) const { data: allFeatures } = useFeatures(projectName)
const resolveHumanInput = useResolveHumanInput(projectName)
// Build a map of feature ID to feature for looking up dependency names // Build a map of feature ID to feature for looking up dependency names
const featureMap = new Map<number, Feature>() const featureMap = new Map<number, Feature>()
if (allFeatures) { if (allFeatures) {
;[...allFeatures.pending, ...allFeatures.in_progress, ...allFeatures.done].forEach(f => { ;[...allFeatures.pending, ...allFeatures.in_progress, ...allFeatures.done, ...(allFeatures.needs_human_input || [])].forEach(f => {
featureMap.set(f.id, f) featureMap.set(f.id, f)
}) })
} }
@@ -141,6 +144,11 @@ export function FeatureModal({ feature, projectName, onClose }: FeatureModalProp
<CheckCircle2 size={24} className="text-primary" /> <CheckCircle2 size={24} className="text-primary" />
<span className="font-semibold text-primary">COMPLETE</span> <span className="font-semibold text-primary">COMPLETE</span>
</> </>
) : feature.needs_human_input ? (
<>
<UserCircle size={24} className="text-amber-500" />
<span className="font-semibold text-amber-500">NEEDS YOUR INPUT</span>
</>
) : ( ) : (
<> <>
<Circle size={24} className="text-muted-foreground" /> <Circle size={24} className="text-muted-foreground" />
@@ -152,6 +160,38 @@ export function FeatureModal({ feature, projectName, onClose }: FeatureModalProp
</span> </span>
</div> </div>
{/* Human Input Request */}
{feature.needs_human_input && feature.human_input_request && (
<HumanInputForm
request={feature.human_input_request}
onSubmit={async (fields) => {
setError(null)
try {
await resolveHumanInput.mutateAsync({ featureId: feature.id, fields })
onClose()
} catch (err) {
setError(err instanceof Error ? err.message : 'Failed to submit response')
}
}}
isLoading={resolveHumanInput.isPending}
/>
)}
{/* Previous Human Input Response */}
{feature.human_input_response && !feature.needs_human_input && (
<Alert className="border-green-500 bg-green-50 dark:bg-green-950/20">
<CheckCircle2 className="h-4 w-4 text-green-600" />
<AlertDescription>
<h4 className="font-semibold mb-1 text-green-700 dark:text-green-400">Human Input Provided</h4>
<p className="text-sm text-green-600 dark:text-green-300">
Response submitted{feature.human_input_response.responded_at
? ` at ${new Date(feature.human_input_response.responded_at).toLocaleString()}`
: ''}.
</p>
</AlertDescription>
</Alert>
)}
{/* Description */} {/* Description */}
<div> <div>
<h3 className="font-semibold mb-2 text-sm uppercase tracking-wide text-muted-foreground"> <h3 className="font-semibold mb-2 text-sm uppercase tracking-wide text-muted-foreground">
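Review note: the dependency lookup map must now include the `needs_human_input` bucket, with a fallback for responses that omit it. A minimal standalone sketch of that construction follows. `FeatureLike`, `FeatureBuckets`, and `buildFeatureMap` are hypothetical names, and `FeatureLike` carries only the fields needed for illustration.

```typescript
// Reduced stand-ins for the Feature type and the useFeatures response.
interface FeatureLike {
  id: number
  name: string
}

interface FeatureBuckets {
  pending: FeatureLike[]
  in_progress: FeatureLike[]
  done: FeatureLike[]
  needs_human_input?: FeatureLike[] // may be absent, as the diff's `|| []` implies
}

// Builds an id -> feature map across all buckets, as FeatureModal does.
function buildFeatureMap(buckets: FeatureBuckets | undefined): Map<number, FeatureLike> {
  const featureMap = new Map<number, FeatureLike>()
  if (!buckets) return featureMap
  const all = [
    ...buckets.pending,
    ...buckets.in_progress,
    ...buckets.done,
    ...(buckets.needs_human_input || []),
  ]
  all.forEach((f) => {
    featureMap.set(f.id, f)
  })
  return featureMap
}
```

Without the fourth bucket, a dependency that is currently waiting on human input would be missing from the map and its name could not be resolved.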

@@ -0,0 +1,150 @@
import { useState } from 'react'
import { Loader2, UserCircle, Send } from 'lucide-react'
import type { HumanInputRequest } from '../lib/types'
import { Button } from '@/components/ui/button'
import { Input } from '@/components/ui/input'
import { Textarea } from '@/components/ui/textarea'
import { Label } from '@/components/ui/label'
import { Alert, AlertDescription } from '@/components/ui/alert'
import { Switch } from '@/components/ui/switch'
interface HumanInputFormProps {
request: HumanInputRequest
onSubmit: (fields: Record<string, string | boolean | string[]>) => Promise<void>
isLoading: boolean
}
export function HumanInputForm({ request, onSubmit, isLoading }: HumanInputFormProps) {
const [values, setValues] = useState<Record<string, string | boolean | string[]>>(() => {
const initial: Record<string, string | boolean | string[]> = {}
for (const field of request.fields) {
if (field.type === 'boolean') {
initial[field.id] = false
} else {
initial[field.id] = ''
}
}
return initial
})
const [validationError, setValidationError] = useState<string | null>(null)
const handleSubmit = async () => {
// Validate required fields
for (const field of request.fields) {
if (field.required) {
const val = values[field.id]
if (val === undefined || val === null || val === '') {
setValidationError(`"${field.label}" is required`)
return
}
}
}
setValidationError(null)
await onSubmit(values)
}
return (
<Alert className="border-amber-500 bg-amber-50 dark:bg-amber-950/20">
<UserCircle className="h-5 w-5 text-amber-600" />
<AlertDescription className="space-y-4">
<div>
<h4 className="font-semibold text-amber-700 dark:text-amber-400">Agent needs your help</h4>
<p className="text-sm text-amber-600 dark:text-amber-300 mt-1">
{request.prompt}
</p>
</div>
<div className="space-y-3">
{request.fields.map((field) => (
<div key={field.id} className="space-y-1.5">
<Label htmlFor={`human-input-${field.id}`} className="text-sm font-medium text-foreground">
{field.label}
{field.required && <span className="text-destructive ml-1">*</span>}
</Label>
{field.type === 'text' && (
<Input
id={`human-input-${field.id}`}
value={values[field.id] as string}
onChange={(e) => setValues(prev => ({ ...prev, [field.id]: e.target.value }))}
placeholder={field.placeholder || ''}
disabled={isLoading}
/>
)}
{field.type === 'textarea' && (
<Textarea
id={`human-input-${field.id}`}
value={values[field.id] as string}
onChange={(e) => setValues(prev => ({ ...prev, [field.id]: e.target.value }))}
placeholder={field.placeholder || ''}
disabled={isLoading}
rows={3}
/>
)}
{field.type === 'select' && field.options && (
<div className="space-y-1.5">
{field.options.map((option) => (
<label
key={option.value}
className={`flex items-center gap-2 p-2 rounded-md border cursor-pointer transition-colors
${values[field.id] === option.value
? 'border-primary bg-primary/10'
: 'border-border hover:border-primary/50'}`}
>
<input
type="radio"
name={`human-input-${field.id}`}
value={option.value}
checked={values[field.id] === option.value}
onChange={(e) => setValues(prev => ({ ...prev, [field.id]: e.target.value }))}
disabled={isLoading}
className="accent-primary"
/>
<span className="text-sm">{option.label}</span>
</label>
))}
</div>
)}
{field.type === 'boolean' && (
<div className="flex items-center gap-2">
<Switch
id={`human-input-${field.id}`}
checked={values[field.id] as boolean}
onCheckedChange={(checked) => setValues(prev => ({ ...prev, [field.id]: checked }))}
disabled={isLoading}
/>
<Label htmlFor={`human-input-${field.id}`} className="text-sm">
{values[field.id] ? 'Yes' : 'No'}
</Label>
</div>
)}
</div>
))}
</div>
{validationError && (
<p className="text-sm text-destructive">{validationError}</p>
)}
<Button
onClick={handleSubmit}
disabled={isLoading}
className="w-full"
>
{isLoading ? (
<Loader2 size={16} className="animate-spin" />
) : (
<>
<Send size={16} />
Submit Response
</>
)}
</Button>
</AlertDescription>
</Alert>
)
}
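
The required-field check in `handleSubmit` above can be pulled out into a pure function, which makes it testable without rendering the form. This is a minimal sketch rather than the component's own code: `findMissingRequired` and `FieldSpec` are hypothetical names, the value shape is copied from the form state in this diff, and treating whitespace-only strings as empty is an added assumption (the original compares against `''` only).

```typescript
// Value shape mirrors the form state in this change.
type FieldValue = string | boolean | string[]

// Hypothetical, trimmed-down view of HumanInputField: only what validation needs.
interface FieldSpec {
  id: string
  label: string
  required: boolean
}

// Hypothetical helper: returns the label of the first required field
// that has no value, or null when every required field is filled in.
// Assumption: whitespace-only strings count as empty.
function findMissingRequired(
  fields: FieldSpec[],
  values: Record<string, FieldValue | null | undefined>,
): string | null {
  for (const field of fields) {
    if (!field.required) continue
    const val = values[field.id]
    const isBlankString = typeof val === 'string' && val.trim() === ''
    if (val === undefined || val === null || isBlankString) {
      return field.label
    }
  }
  return null
}
```

As in the original, a boolean field set to `false` counts as answered, so a required switch never blocks submission.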

@@ -13,13 +13,16 @@ interface KanbanBoardProps {
 }
 export function KanbanBoard({ features, onFeatureClick, onAddFeature, onExpandProject, activeAgents = [], onCreateSpec, hasSpec = true }: KanbanBoardProps) {
-const hasFeatures = features && (features.pending.length + features.in_progress.length + features.done.length) > 0
+const hasFeatures = features && (features.pending.length + features.in_progress.length + features.done.length + (features.needs_human_input?.length || 0)) > 0
 // Combine all features for dependency status calculation
 const allFeatures = features
-? [...features.pending, ...features.in_progress, ...features.done]
+? [...features.pending, ...features.in_progress, ...features.done, ...(features.needs_human_input || [])]
 : []
+const needsInputCount = features?.needs_human_input?.length || 0
+const showNeedsInput = needsInputCount > 0
 if (!features) {
 return (
 <div className="grid grid-cols-1 md:grid-cols-3 gap-6">
@@ -40,7 +43,7 @@ export function KanbanBoard({ features, onFeatureClick, onAddFeature, onExpandPr
 }
 return (
-<div className="grid grid-cols-1 md:grid-cols-3 gap-6">
+<div className={`grid grid-cols-1 ${showNeedsInput ? 'md:grid-cols-4' : 'md:grid-cols-3'} gap-6`}>
 <KanbanColumn
 title="Pending"
 count={features.pending.length}
@@ -64,6 +67,17 @@ export function KanbanBoard({ features, onFeatureClick, onAddFeature, onExpandPr
color="progress" color="progress"
onFeatureClick={onFeatureClick} onFeatureClick={onFeatureClick}
/> />
{showNeedsInput && (
<KanbanColumn
title="Needs Input"
count={needsInputCount}
features={features.needs_human_input}
allFeatures={allFeatures}
activeAgents={activeAgents}
color="human_input"
onFeatureClick={onFeatureClick}
/>
)}
<KanbanColumn <KanbanColumn
title="Done" title="Done"
count={features.done.length} count={features.done.length}

@@ -11,7 +11,7 @@ interface KanbanColumnProps {
 features: Feature[]
 allFeatures?: Feature[]
 activeAgents?: ActiveAgent[]
-color: 'pending' | 'progress' | 'done'
+color: 'pending' | 'progress' | 'done' | 'human_input'
 onFeatureClick: (feature: Feature) => void
 onAddFeature?: () => void
 onExpandProject?: () => void
@@ -24,6 +24,7 @@ const colorMap = {
pending: 'border-t-4 border-t-muted', pending: 'border-t-4 border-t-muted',
progress: 'border-t-4 border-t-primary', progress: 'border-t-4 border-t-primary',
done: 'border-t-4 border-t-primary', done: 'border-t-4 border-t-primary',
human_input: 'border-t-4 border-t-amber-500',
} }
export function KanbanColumn({ export function KanbanColumn({

@@ -103,6 +103,10 @@ function getStateAnimation(state: OrchestratorState): string {
return 'animate-working' return 'animate-working'
case 'monitoring': case 'monitoring':
return 'animate-bounce-gentle' return 'animate-bounce-gentle'
case 'draining':
return 'animate-thinking'
case 'paused':
return ''
case 'complete': case 'complete':
return 'animate-celebrate' return 'animate-celebrate'
default: default:
@@ -121,6 +125,10 @@ function getStateGlow(state: OrchestratorState): string {
return 'shadow-[0_0_16px_rgba(124,58,237,0.6)]' return 'shadow-[0_0_16px_rgba(124,58,237,0.6)]'
case 'monitoring': case 'monitoring':
return 'shadow-[0_0_8px_rgba(167,139,250,0.4)]' return 'shadow-[0_0_8px_rgba(167,139,250,0.4)]'
case 'draining':
return 'shadow-[0_0_10px_rgba(251,191,36,0.5)]'
case 'paused':
return ''
case 'complete': case 'complete':
return 'shadow-[0_0_20px_rgba(112,224,0,0.6)]' return 'shadow-[0_0_20px_rgba(112,224,0,0.6)]'
default: default:
@@ -141,6 +149,10 @@ function getStateDescription(state: OrchestratorState): string {
return 'spawning agents' return 'spawning agents'
case 'monitoring': case 'monitoring':
return 'monitoring progress' return 'monitoring progress'
case 'draining':
return 'draining active agents'
case 'paused':
return 'paused'
case 'complete': case 'complete':
return 'all features complete' return 'all features complete'
default: default:

@@ -25,6 +25,10 @@ function getStateText(state: OrchestratorState): string {
return 'Watching progress...' return 'Watching progress...'
case 'complete': case 'complete':
return 'Mission accomplished!' return 'Mission accomplished!'
case 'draining':
return 'Draining agents...'
case 'paused':
return 'Paused'
default: default:
return 'Orchestrating...' return 'Orchestrating...'
} }
@@ -42,6 +46,10 @@ function getStateColor(state: OrchestratorState): string {
return 'text-primary' return 'text-primary'
case 'initializing': case 'initializing':
return 'text-yellow-600 dark:text-yellow-400' return 'text-yellow-600 dark:text-yellow-400'
case 'draining':
return 'text-amber-600 dark:text-amber-400'
case 'paused':
return 'text-muted-foreground'
default: default:
return 'text-muted-foreground' return 'text-muted-foreground'
} }

@@ -55,7 +55,7 @@ export function ProgressDashboard({
 const showThought = useMemo(() => {
 if (!thought) return false
-if (agentStatus === 'running') return true
+if (agentStatus === 'running' || agentStatus === 'pausing') return true
 if (agentStatus === 'paused') {
 return Date.now() - lastLogTimestamp < IDLE_TIMEOUT
 }

@@ -73,16 +73,17 @@ export function ProjectSelector({
 <DropdownMenuTrigger asChild>
 <Button
 variant="outline"
-className="min-w-[200px] justify-between"
+className="min-w-[140px] sm:min-w-[200px] justify-between"
 disabled={isLoading}
+title={selectedProjectData?.path}
 >
 {isLoading ? (
 <Loader2 size={18} className="animate-spin" />
 ) : selectedProject ? (
 <>
-<span className="flex items-center gap-2">
-<FolderOpen size={18} />
-{selectedProject}
+<span className="flex items-center gap-2 truncate">
+<FolderOpen size={18} className="shrink-0" />
+<span className="truncate">{selectedProject}</span>
 </span>
 {selectedProjectData && selectedProjectData.stats.total > 0 && (
 <Badge className="ml-2">{selectedProjectData.stats.percentage}%</Badge>
@@ -101,6 +102,7 @@ export function ProjectSelector({
{projects.map(project => ( {projects.map(project => (
<DropdownMenuItem <DropdownMenuItem
key={project.name} key={project.name}
title={project.path}
className={`flex items-center justify-between cursor-pointer ${ className={`flex items-center justify-between cursor-pointer ${
project.name === selectedProject ? 'bg-primary/10' : '' project.name === selectedProject ? 'bg-primary/10' : ''
}`} }`}

@@ -85,6 +85,7 @@ export function SettingsModal({ isOpen, onClose }: SettingsModalProps) {
const handleSaveCustomBaseUrl = () => { const handleSaveCustomBaseUrl = () => {
if (customBaseUrlInput.trim() && !updateSettings.isPending) { if (customBaseUrlInput.trim() && !updateSettings.isPending) {
updateSettings.mutate({ api_base_url: customBaseUrlInput.trim() }) updateSettings.mutate({ api_base_url: customBaseUrlInput.trim() })
setCustomBaseUrlInput('')
} }
} }
@@ -102,12 +103,12 @@ export function SettingsModal({ isOpen, onClose }: SettingsModalProps) {
 const currentProviderInfo: ProviderInfo | undefined = providers.find(p => p.id === currentProvider)
 const isAlternativeProvider = currentProvider !== 'claude'
 const showAuthField = isAlternativeProvider && currentProviderInfo?.requires_auth
-const showBaseUrlField = currentProvider === 'custom'
+const showBaseUrlField = currentProvider === 'custom' || currentProvider === 'azure'
 const showCustomModelInput = currentProvider === 'custom' || currentProvider === 'ollama'
 return (
 <Dialog open={isOpen} onOpenChange={(open) => !open && onClose()}>
-<DialogContent aria-describedby={undefined} className="sm:max-w-sm max-h-[85vh] overflow-y-auto">
+<DialogContent aria-describedby={undefined} className="sm:max-w-lg max-h-[90vh] overflow-y-auto">
 <DialogHeader>
 <DialogTitle className="flex items-center gap-2">
 Settings
@@ -289,22 +290,38 @@ export function SettingsModal({ isOpen, onClose }: SettingsModalProps) {
 {showBaseUrlField && (
 <div className="space-y-2 pt-1">
 <Label className="text-sm">Base URL</Label>
-<div className="flex gap-2">
-<input
-type="text"
-value={customBaseUrlInput || settings.api_base_url || ''}
-onChange={(e) => setCustomBaseUrlInput(e.target.value)}
-placeholder="https://api.example.com/v1"
-className="flex-1 py-1.5 px-3 text-sm border rounded-md bg-background"
-/>
-<Button
-size="sm"
-onClick={handleSaveCustomBaseUrl}
-disabled={!customBaseUrlInput.trim() || isSaving}
->
-Save
-</Button>
-</div>
+{settings.api_base_url && !customBaseUrlInput && (
+<div className="flex items-center gap-2 text-sm text-muted-foreground">
+<ShieldCheck size={14} className="text-green-500" />
+<span className="truncate">{settings.api_base_url}</span>
+<Button
+variant="ghost"
+size="sm"
+className="h-auto py-0.5 px-2 text-xs shrink-0"
+onClick={() => setCustomBaseUrlInput(settings.api_base_url || '')}
+>
+Change
+</Button>
+</div>
+)}
+{(!settings.api_base_url || customBaseUrlInput) && (
+<div className="flex gap-2">
+<input
+type="text"
+value={customBaseUrlInput}
+onChange={(e) => setCustomBaseUrlInput(e.target.value)}
+placeholder={currentProvider === 'azure' ? 'https://your-resource.services.ai.azure.com/anthropic' : 'https://api.example.com/v1'}
+className="flex-1 py-1.5 px-3 text-sm border rounded-md bg-background"
+/>
+<Button
+size="sm"
+onClick={handleSaveCustomBaseUrl}
+disabled={!customBaseUrlInput.trim() || isSaving}
+>
+Save
+</Button>
+</div>
+)}
 </div>
 )}
 </div>

@@ -1,6 +1,7 @@
import { useState, useRef, useEffect } from 'react' import { useState, useRef, useEffect } from 'react'
import { Palette, Check } from 'lucide-react' import { Palette, Check } from 'lucide-react'
import { Button } from '@/components/ui/button' import { Button } from '@/components/ui/button'
import { Tooltip, TooltipTrigger, TooltipContent } from '@/components/ui/tooltip'
import type { ThemeId, ThemeOption } from '../hooks/useTheme' import type { ThemeId, ThemeOption } from '../hooks/useTheme'
interface ThemeSelectorProps { interface ThemeSelectorProps {
@@ -97,16 +98,20 @@ export function ThemeSelector({ themes, currentTheme, onThemeChange }: ThemeSele
 onMouseEnter={handleMouseEnter}
 onMouseLeave={handleMouseLeave}
 >
-<Button
-variant="outline"
-size="sm"
-title="Theme"
-aria-label="Select theme"
-aria-expanded={isOpen}
-aria-haspopup="true"
->
-<Palette size={18} />
-</Button>
+<Tooltip>
+<TooltipTrigger asChild>
+<Button
+variant="outline"
+size="sm"
+aria-label="Select theme"
+aria-expanded={isOpen}
+aria-haspopup="true"
+>
+<Palette size={18} />
+</Button>
+</TooltipTrigger>
+<TooltipContent>Theme</TooltipContent>
+</Tooltip>
 {/* Dropdown */}
 {isOpen && (

@@ -0,0 +1,65 @@
import * as React from "react"
import * as TooltipPrimitive from "@radix-ui/react-tooltip"
import { cn } from "@/lib/utils"
function TooltipProvider({
delayDuration = 250,
...props
}: React.ComponentProps<typeof TooltipPrimitive.Provider> & {
delayDuration?: number
}) {
return (
<TooltipPrimitive.Provider
data-slot="tooltip-provider"
delayDuration={delayDuration}
{...props}
/>
)
}
function Tooltip({
...props
}: React.ComponentProps<typeof TooltipPrimitive.Root>) {
return <TooltipPrimitive.Root data-slot="tooltip" {...props} />
}
function TooltipTrigger({
...props
}: React.ComponentProps<typeof TooltipPrimitive.Trigger>) {
return <TooltipPrimitive.Trigger data-slot="tooltip-trigger" {...props} />
}
function TooltipContent({
className,
side = "bottom",
align = "center",
sideOffset = 8,
children,
...props
}: React.ComponentProps<typeof TooltipPrimitive.Content>) {
return (
<TooltipPrimitive.Portal>
<TooltipPrimitive.Content
data-slot="tooltip-content"
side={side}
align={align}
sideOffset={sideOffset}
className={cn(
"z-50 overflow-hidden rounded-md border bg-neutral-900 px-3 py-2 text-sm text-white shadow-md leading-tight min-h-7",
"data-[state=delayed-open]:animate-in data-[state=closed]:animate-out data-[state=closed]:fade-out-0 data-[state=delayed-open]:fade-in-0 data-[side=bottom]:slide-in-from-top-2 data-[side=left]:slide-in-from-right-2 data-[side=right]:slide-in-from-left-2 data-[side=top]:slide-in-from-bottom-2",
className
)}
{...props}
>
{children}
<TooltipPrimitive.Arrow
data-slot="tooltip-arrow"
className="fill-neutral-900"
/>
</TooltipPrimitive.Content>
</TooltipPrimitive.Portal>
)
}
export { Tooltip, TooltipContent, TooltipProvider, TooltipTrigger }

@@ -3,7 +3,7 @@
 */
 import { useState, useCallback, useRef, useEffect } from "react";
-import type { ChatMessage, AssistantChatServerMessage } from "../lib/types";
+import type { ChatMessage, AssistantChatServerMessage, SpecQuestion } from "../lib/types";
 type ConnectionStatus = "disconnected" | "connecting" | "connected" | "error";
@@ -17,8 +17,10 @@ interface UseAssistantChatReturn {
isLoading: boolean; isLoading: boolean;
connectionStatus: ConnectionStatus; connectionStatus: ConnectionStatus;
conversationId: number | null; conversationId: number | null;
currentQuestions: SpecQuestion[] | null;
start: (conversationId?: number | null) => void; start: (conversationId?: number | null) => void;
sendMessage: (content: string) => void; sendMessage: (content: string) => void;
sendAnswer: (answers: Record<string, string | string[]>) => void;
disconnect: () => void; disconnect: () => void;
clearMessages: () => void; clearMessages: () => void;
} }
@@ -36,6 +38,7 @@ export function useAssistantChat({
const [connectionStatus, setConnectionStatus] = const [connectionStatus, setConnectionStatus] =
useState<ConnectionStatus>("disconnected"); useState<ConnectionStatus>("disconnected");
const [conversationId, setConversationId] = useState<number | null>(null); const [conversationId, setConversationId] = useState<number | null>(null);
const [currentQuestions, setCurrentQuestions] = useState<SpecQuestion[] | null>(null);
const wsRef = useRef<WebSocket | null>(null); const wsRef = useRef<WebSocket | null>(null);
const currentAssistantMessageRef = useRef<string | null>(null); const currentAssistantMessageRef = useRef<string | null>(null);
@@ -204,6 +207,25 @@ export function useAssistantChat({
break; break;
} }
case "question": {
// Claude is asking structured questions via ask_user tool
setCurrentQuestions(data.questions);
setIsLoading(false);
// Attach questions to the last assistant message for display context
setMessages((prev) => {
const lastMessage = prev[prev.length - 1];
if (lastMessage?.role === "assistant" && lastMessage.isStreaming) {
return [
...prev.slice(0, -1),
{ ...lastMessage, isStreaming: false, questions: data.questions },
];
}
return prev;
});
break;
}
case "conversation_created": { case "conversation_created": {
setConversationId(data.conversation_id); setConversationId(data.conversation_id);
break; break;
@@ -327,6 +349,49 @@ export function useAssistantChat({
[onError], [onError],
); );
const sendAnswer = useCallback(
(answers: Record<string, string | string[]>) => {
if (!wsRef.current || wsRef.current.readyState !== WebSocket.OPEN) {
onError?.("Not connected");
return;
}
// Format answers as display text for user message
const answerParts: string[] = [];
for (const [, value] of Object.entries(answers)) {
if (Array.isArray(value)) {
answerParts.push(value.join(", "));
} else {
answerParts.push(value);
}
}
const displayText = answerParts.join("; ");
// Add user message to chat
setMessages((prev) => [
...prev,
{
id: generateId(),
role: "user",
content: displayText,
timestamp: new Date(),
},
]);
setCurrentQuestions(null);
setIsLoading(true);
// Send structured answer to server
wsRef.current.send(
JSON.stringify({
type: "answer",
answers,
}),
);
},
[onError],
);
const disconnect = useCallback(() => { const disconnect = useCallback(() => {
reconnectAttempts.current = maxReconnectAttempts; // Prevent reconnection reconnectAttempts.current = maxReconnectAttempts; // Prevent reconnection
if (pingIntervalRef.current) { if (pingIntervalRef.current) {
@@ -350,8 +415,10 @@ export function useAssistantChat({
isLoading, isLoading,
connectionStatus, connectionStatus,
conversationId, conversationId,
currentQuestions,
start, start,
sendMessage, sendMessage,
sendAnswer,
disconnect, disconnect,
clearMessages, clearMessages,
}; };
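
The display text that `sendAnswer` builds for the chat bubble follows a simple rule: multi-select answers are joined with commas, and the per-question results are joined with semicolons. A minimal sketch of that formatting as a standalone function, assuming the same `Record<string, string | string[]>` answer shape; `formatAnswersForDisplay` is a hypothetical name that does not appear in the hook:

```typescript
// Answer shape as used by sendAnswer in this change.
type Answers = Record<string, string | string[]>;

// Hypothetical helper: joins array answers with ", " and questions with "; ".
// Question keys are ignored, matching the loop in sendAnswer.
function formatAnswersForDisplay(answers: Answers): string {
  return Object.values(answers)
    .map((value) => (Array.isArray(value) ? value.join(", ") : value))
    .join("; ");
}
```

Because the keys are dropped, the resulting message shows only what the user chose, not which question each choice belongs to; the structured `answers` object sent over the WebSocket keeps that mapping.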

@@ -137,6 +137,7 @@ function isAllComplete(features: FeatureListResponse | undefined): boolean {
return ( return (
features.pending.length === 0 && features.pending.length === 0 &&
features.in_progress.length === 0 && features.in_progress.length === 0 &&
(features.needs_human_input?.length || 0) === 0 &&
features.done.length > 0 features.done.length > 0
) )
} }
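
With this change, a project is no longer "all complete" while any feature is waiting on a human response. A minimal sketch of the rule, assuming a trimmed `FeatureBuckets` shape (a hypothetical stand-in for `FeatureListResponse`) and assuming that an undefined list means "not complete", since the start of the original function is outside this hunk:

```typescript
// Hypothetical, trimmed stand-in for FeatureListResponse; only lengths matter here.
interface FeatureBuckets {
  pending: unknown[]
  in_progress: unknown[]
  done: unknown[]
  needs_human_input?: unknown[] // optional so older payloads still work
}

function isAllComplete(features: FeatureBuckets | undefined): boolean {
  // Assumption: no data yet means not complete.
  if (!features) return false
  return (
    features.pending.length === 0 &&
    features.in_progress.length === 0 &&
    (features.needs_human_input?.length || 0) === 0 &&
    features.done.length > 0
  )
}
```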

@@ -4,7 +4,7 @@
 import { useQuery, useMutation, useQueryClient } from '@tanstack/react-query'
 import * as api from '../lib/api'
-import type { FeatureCreate, FeatureUpdate, ModelsResponse, ProjectSettingsUpdate, ProvidersResponse, Settings, SettingsUpdate } from '../lib/types'
+import type { DevServerConfig, FeatureCreate, FeatureUpdate, ModelsResponse, ProjectSettingsUpdate, ProvidersResponse, Settings, SettingsUpdate } from '../lib/types'
 // ============================================================================
 // Projects
}) })
} }
export function useResolveHumanInput(projectName: string) {
const queryClient = useQueryClient()
return useMutation({
mutationFn: ({ featureId, fields }: { featureId: number; fields: Record<string, string | boolean | string[]> }) =>
api.resolveHumanInput(projectName, featureId, { fields }),
onSuccess: () => {
queryClient.invalidateQueries({ queryKey: ['features', projectName] })
},
})
}
// ============================================================================ // ============================================================================
// Agent // Agent
// ============================================================================ // ============================================================================
@@ -197,6 +209,28 @@ export function useResumeAgent(projectName: string) {
}) })
} }
export function useGracefulPauseAgent(projectName: string) {
const queryClient = useQueryClient()
return useMutation({
mutationFn: () => api.gracefulPauseAgent(projectName),
onSuccess: () => {
queryClient.invalidateQueries({ queryKey: ['agent-status', projectName] })
},
})
}
export function useGracefulResumeAgent(projectName: string) {
const queryClient = useQueryClient()
return useMutation({
mutationFn: () => api.gracefulResumeAgent(projectName),
onSuccess: () => {
queryClient.invalidateQueries({ queryKey: ['agent-status', projectName] })
},
})
}
// ============================================================================ // ============================================================================
// Setup // Setup
// ============================================================================ // ============================================================================
@@ -345,3 +379,36 @@ export function useUpdateSettings() {
}, },
}) })
} }
// ============================================================================
// Dev Server Config
// ============================================================================
// Default config for placeholder (until API responds)
const DEFAULT_DEV_SERVER_CONFIG: DevServerConfig = {
detected_type: null,
detected_command: null,
custom_command: null,
effective_command: null,
}
export function useDevServerConfig(projectName: string | null) {
return useQuery({
queryKey: ['dev-server-config', projectName],
queryFn: () => api.getDevServerConfig(projectName!),
enabled: !!projectName,
staleTime: 30_000,
placeholderData: DEFAULT_DEV_SERVER_CONFIG,
})
}
export function useUpdateDevServerConfig(projectName: string) {
const queryClient = useQueryClient()
return useMutation({
mutationFn: (customCommand: string | null) =>
api.updateDevServerConfig(projectName, customCommand),
onSuccess: () => {
queryClient.invalidateQueries({ queryKey: ['dev-server-config', projectName] })
},
})
}

@@ -33,6 +33,7 @@ interface WebSocketState {
progress: { progress: {
passing: number passing: number
in_progress: number in_progress: number
needs_human_input: number
total: number total: number
percentage: number percentage: number
} }
@@ -60,7 +61,7 @@ const MAX_AGENT_LOGS = 500 // Keep last 500 log lines per agent
 export function useProjectWebSocket(projectName: string | null) {
 const [state, setState] = useState<WebSocketState>({
-progress: { passing: 0, in_progress: 0, total: 0, percentage: 0 },
+progress: { passing: 0, in_progress: 0, needs_human_input: 0, total: 0, percentage: 0 },
 agentStatus: 'loading',
 logs: [],
 isConnected: false,
@@ -107,6 +108,7 @@ export function useProjectWebSocket(projectName: string | null) {
progress: { progress: {
passing: message.passing, passing: message.passing,
in_progress: message.in_progress, in_progress: message.in_progress,
needs_human_input: message.needs_human_input ?? 0,
total: message.total, total: message.total,
percentage: message.percentage, percentage: message.percentage,
}, },
@@ -385,7 +387,7 @@ export function useProjectWebSocket(projectName: string | null) {
 // Reset state when project changes to clear stale data
 // Use 'loading' for agentStatus to show loading indicator until WebSocket provides actual status
 setState({
-progress: { passing: 0, in_progress: 0, total: 0, percentage: 0 },
+progress: { passing: 0, in_progress: 0, needs_human_input: 0, total: 0, percentage: 0 },
 agentStatus: 'loading',
 logs: [],
 isConnected: false,
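
The progress handler above defaults `needs_human_input` to `0` when the server omits it, so an older backend that does not send the field still produces a complete state object. A minimal sketch of that mapping, assuming trimmed message and state shapes; `ProgressMessage`, `ProgressState`, and `toProgressState` are hypothetical names used only for illustration:

```typescript
// Hypothetical, trimmed version of WSProgressMessage: the new count is optional on the wire.
interface ProgressMessage {
  passing: number
  in_progress: number
  total: number
  percentage: number
  needs_human_input?: number
}

// State always carries the count, so consumers never have to handle undefined.
interface ProgressState {
  passing: number
  in_progress: number
  needs_human_input: number
  total: number
  percentage: number
}

function toProgressState(message: ProgressMessage): ProgressState {
  return {
    passing: message.passing,
    in_progress: message.in_progress,
    // `??` only falls back on null/undefined, so a real 0 from the server is kept as 0.
    needs_human_input: message.needs_human_input ?? 0,
    total: message.total,
    percentage: message.percentage,
  }
}
```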

@@ -181,6 +181,17 @@ export async function createFeaturesBulk(
}) })
} }
export async function resolveHumanInput(
projectName: string,
featureId: number,
response: { fields: Record<string, string | boolean | string[]> }
): Promise<Feature> {
return fetchJSON(`/projects/${encodeURIComponent(projectName)}/features/${featureId}/resolve-human-input`, {
method: 'POST',
body: JSON.stringify(response),
})
}
// ============================================================================ // ============================================================================
// Dependency Graph API // Dependency Graph API
// ============================================================================ // ============================================================================
@@ -271,6 +282,18 @@ export async function resumeAgent(projectName: string): Promise<AgentActionRespo
}) })
} }
export async function gracefulPauseAgent(projectName: string): Promise<AgentActionResponse> {
return fetchJSON(`/projects/${encodeURIComponent(projectName)}/agent/graceful-pause`, {
method: 'POST',
})
}
export async function gracefulResumeAgent(projectName: string): Promise<AgentActionResponse> {
return fetchJSON(`/projects/${encodeURIComponent(projectName)}/agent/graceful-resume`, {
method: 'POST',
})
}
// ============================================================================ // ============================================================================
// Spec Creation API // Spec Creation API
// ============================================================================ // ============================================================================
@@ -445,6 +468,16 @@ export async function getDevServerConfig(projectName: string): Promise<DevServer
return fetchJSON(`/projects/${encodeURIComponent(projectName)}/devserver/config`) return fetchJSON(`/projects/${encodeURIComponent(projectName)}/devserver/config`)
} }
export async function updateDevServerConfig(
projectName: string,
customCommand: string | null
): Promise<DevServerConfig> {
return fetchJSON(`/projects/${encodeURIComponent(projectName)}/devserver/config`, {
method: 'PATCH',
body: JSON.stringify({ custom_command: customCommand }),
})
}
// ============================================================================ // ============================================================================
// Terminal API // Terminal API
// ============================================================================ // ============================================================================
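
Each new client function above builds its path by passing the project name through `encodeURIComponent`, which keeps names containing spaces or slashes from breaking the route. A minimal sketch of that path construction for the human-input endpoint; `resolveHumanInputPath` is a hypothetical helper, and the real code inlines the template string in the `fetchJSON` call:

```typescript
// Hypothetical helper: builds the same relative path that resolveHumanInput requests.
function resolveHumanInputPath(projectName: string, featureId: number): string {
  return `/projects/${encodeURIComponent(projectName)}/features/${featureId}/resolve-human-input`
}
```

For example, a project named `my app/1` yields `/projects/my%20app%2F1/features/7/resolve-human-input` for feature `7`, so the slash in the name is not read as a path separator.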

@@ -57,6 +57,26 @@ export interface ProjectPrompts {
coding_prompt: string coding_prompt: string
} }
// Human input types
export interface HumanInputField {
id: string
label: string
type: 'text' | 'textarea' | 'select' | 'boolean'
required: boolean
placeholder?: string
options?: { value: string; label: string }[]
}
export interface HumanInputRequest {
prompt: string
fields: HumanInputField[]
}
export interface HumanInputResponseData {
fields: Record<string, string | boolean | string[]>
responded_at?: string
}
// Feature types // Feature types
export interface Feature { export interface Feature {
id: number id: number
@@ -70,10 +90,13 @@ export interface Feature {
 dependencies?: number[] // Optional for backwards compat
 blocked?: boolean // Computed by API
 blocking_dependencies?: number[] // Computed by API
+needs_human_input?: boolean
+human_input_request?: HumanInputRequest | null
+human_input_response?: HumanInputResponseData | null
 }
 // Status type for graph nodes
-export type FeatureStatus = 'pending' | 'in_progress' | 'done' | 'blocked'
+export type FeatureStatus = 'pending' | 'in_progress' | 'done' | 'blocked' | 'needs_human_input'
 // Graph visualization types
 export interface GraphNode {
@@ -99,6 +122,7 @@ export interface FeatureListResponse {
pending: Feature[] pending: Feature[]
in_progress: Feature[] in_progress: Feature[]
done: Feature[] done: Feature[]
needs_human_input: Feature[]
} }
export interface FeatureCreate { export interface FeatureCreate {
@@ -120,7 +144,7 @@ export interface FeatureUpdate {
 }
 // Agent types
-export type AgentStatus = 'stopped' | 'running' | 'paused' | 'crashed' | 'loading'
+export type AgentStatus = 'stopped' | 'running' | 'paused' | 'crashed' | 'loading' | 'pausing' | 'paused_graceful'
 export interface AgentStatusResponse {
 status: AgentStatus
@@ -216,6 +240,8 @@ export type OrchestratorState =
| 'spawning' | 'spawning'
| 'monitoring' | 'monitoring'
| 'complete' | 'complete'
| 'draining'
| 'paused'
// Orchestrator event for recent activity // Orchestrator event for recent activity
export interface OrchestratorEvent { export interface OrchestratorEvent {
@@ -248,6 +274,7 @@ export interface WSProgressMessage {
in_progress: number in_progress: number
total: number total: number
percentage: number percentage: number
needs_human_input?: number
} }
export interface WSFeatureUpdateMessage { export interface WSFeatureUpdateMessage {
@@ -465,6 +492,11 @@ export interface AssistantChatConversationCreatedMessage {
conversation_id: number conversation_id: number
} }
export interface AssistantChatQuestionMessage {
type: 'question'
questions: SpecQuestion[]
}
export interface AssistantChatPongMessage { export interface AssistantChatPongMessage {
type: 'pong' type: 'pong'
} }
@@ -472,6 +504,7 @@ export interface AssistantChatPongMessage {
export type AssistantChatServerMessage = export type AssistantChatServerMessage =
| AssistantChatTextMessage | AssistantChatTextMessage
| AssistantChatToolCallMessage | AssistantChatToolCallMessage
| AssistantChatQuestionMessage
| AssistantChatResponseDoneMessage | AssistantChatResponseDoneMessage
| AssistantChatErrorMessage | AssistantChatErrorMessage
| AssistantChatConversationCreatedMessage | AssistantChatConversationCreatedMessage
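
Review note: the new `AssistantChatQuestionMessage` joins a union that is discriminated by its `type` field, so consumers can narrow on that field and let the compiler catch unhandled members. A minimal sketch of that pattern, assuming trimmed-down stand-ins for the real types; the fields of `SpecQuestion`, the `content` field on the text message, and the `describeMessage` helper are hypothetical and not taken from this diff:

```typescript
// Hypothetical stand-in; the real SpecQuestion is defined elsewhere in the file.
interface SpecQuestion {
  id: string
  text: string
}

// Assumed shape of the text message; only its `type` tag matters here.
interface AssistantChatTextMessage {
  type: 'text'
  content: string
}

// Matches the interface added in the diff.
interface AssistantChatQuestionMessage {
  type: 'question'
  questions: SpecQuestion[]
}

interface AssistantChatPongMessage {
  type: 'pong'
}

// Reduced version of AssistantChatServerMessage for illustration.
type ServerMessage =
  | AssistantChatTextMessage
  | AssistantChatQuestionMessage
  | AssistantChatPongMessage

// Narrow on the `type` discriminant. The `never` assignment in the default
// branch fails to compile if a union member is added without a matching case.
function describeMessage(msg: ServerMessage): string {
  switch (msg.type) {
    case 'text':
      return `text: ${msg.content}`
    case 'question':
      return `question: ${msg.questions.length} pending`
    case 'pong':
      return 'pong'
    default: {
      const unreachable: never = msg
      return unreachable
    }
  }
}
```

The same exhaustiveness check would apply to `AgentStatus` and `OrchestratorState`, which also gain members in this change, wherever a `switch` over them ends in a `never` default.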

@@ -1271,6 +1271,186 @@
   margin: 2rem 0;
 }

+/* ============================================================================
+   Chat Prose Typography (for markdown in chat bubbles)
+   ============================================================================ */
+
+.chat-prose {
+  line-height: 1.6;
+  color: inherit;
+}
+
+.chat-prose > :first-child {
+  margin-top: 0;
+}
+
+.chat-prose > :last-child {
+  margin-bottom: 0;
+}
+
+.chat-prose h1 {
+  font-size: 1.25rem;
+  font-weight: 700;
+  margin-top: 1.25rem;
+  margin-bottom: 0.5rem;
+}
+
+.chat-prose h2 {
+  font-size: 1.125rem;
+  font-weight: 700;
+  margin-top: 1rem;
+  margin-bottom: 0.5rem;
+}
+
+.chat-prose h3 {
+  font-size: 1rem;
+  font-weight: 600;
+  margin-top: 0.75rem;
+  margin-bottom: 0.375rem;
+}
+
+.chat-prose h4,
+.chat-prose h5,
+.chat-prose h6 {
+  font-size: 0.875rem;
+  font-weight: 600;
+  margin-top: 0.75rem;
+  margin-bottom: 0.25rem;
+}
+
+.chat-prose p {
+  margin-bottom: 0.5rem;
+}
+
+.chat-prose ul,
+.chat-prose ol {
+  margin-bottom: 0.5rem;
+  padding-left: 1.25rem;
+}
+
+.chat-prose ul {
+  list-style-type: disc;
+}
+
+.chat-prose ol {
+  list-style-type: decimal;
+}
+
+.chat-prose li {
+  margin-bottom: 0.25rem;
+}
+
+.chat-prose li > ul,
+.chat-prose li > ol {
+  margin-top: 0.25rem;
+  margin-bottom: 0;
+}
+
+.chat-prose pre {
+  background: var(--muted);
+  border: 1px solid var(--border);
+  border-radius: var(--radius);
+  padding: 0.75rem;
+  overflow-x: auto;
+  margin-bottom: 0.5rem;
+  font-family: var(--font-mono);
+  font-size: 0.75rem;
+  line-height: 1.5;
+}
+
+.chat-prose code:not(pre code) {
+  background: var(--muted);
+  padding: 0.1rem 0.3rem;
+  border-radius: 0.25rem;
+  font-family: var(--font-mono);
+  font-size: 0.75rem;
+}
+
+.chat-prose table {
+  width: 100%;
+  border-collapse: collapse;
+  margin-bottom: 0.5rem;
+  font-size: 0.8125rem;
+}
+
+.chat-prose th {
+  background: var(--muted);
+  font-weight: 600;
+  text-align: left;
+  padding: 0.375rem 0.5rem;
+  border: 1px solid var(--border);
+}
+
+.chat-prose td {
+  padding: 0.375rem 0.5rem;
+  border: 1px solid var(--border);
+}
+
+.chat-prose blockquote {
+  border-left: 3px solid var(--primary);
+  padding-left: 0.75rem;
+  margin-bottom: 0.5rem;
+  font-style: italic;
+  opacity: 0.9;
+}
+
+.chat-prose a {
+  color: var(--primary);
+  text-decoration: underline;
+  text-underline-offset: 2px;
+}
+
+.chat-prose a:hover {
+  opacity: 0.8;
+}
+
+.chat-prose strong {
+  font-weight: 700;
+}
+
+.chat-prose hr {
+  border: none;
+  border-top: 1px solid var(--border);
+  margin: 0.75rem 0;
+}
+
+.chat-prose img {
+  max-width: 100%;
+  border-radius: var(--radius);
+}
+
+/* User message overrides - need contrast against primary-colored bubble */
+.chat-prose-user pre {
+  background: rgb(255 255 255 / 0.15);
+  border-color: rgb(255 255 255 / 0.2);
+}
+
+.chat-prose-user code:not(pre code) {
+  background: rgb(255 255 255 / 0.15);
+}
+
+.chat-prose-user th {
+  background: rgb(255 255 255 / 0.15);
+}
+
+.chat-prose-user th,
+.chat-prose-user td {
+  border-color: rgb(255 255 255 / 0.2);
+}
+
+.chat-prose-user blockquote {
+  border-left-color: rgb(255 255 255 / 0.5);
+}
+
+.chat-prose-user a {
+  color: inherit;
+  text-decoration: underline;
+}
+
+.chat-prose-user hr {
+  border-top-color: rgb(255 255 255 / 0.2);
+}
+
 /* ============================================================================
    Scrollbar Styling
    ============================================================================ */

@@ -36,6 +36,8 @@ export default defineConfig({
             '@radix-ui/react-slot',
             '@radix-ui/react-switch',
           ],
+          // Markdown rendering
+          'vendor-markdown': ['react-markdown', 'remark-gfm'],
           // Icons and utilities
           'vendor-utils': [
             'lucide-react',