- Remove assets/ollama.png (duplicate of ui/public/ollama.png)
- Remove .claude/settings.local.json from tracking
- Add .claude/settings.local.json to .gitignore
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
pkill on BSD systems accepts multiple pattern operands. The previous code
validated only args[-1], so a disallowed process could slip through when
listed before an allowed one (e.g., "pkill node sshd" checked only
"sshd").
Now validates every non-flag argument to ensure no disallowed process
can be targeted. Added tests for multiple pattern scenarios.
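
A minimal sketch of the per-operand check, for illustration only. The
function name validate_pkill_args, the allowed set, and the rule that any
argument starting with "-" is a flag are assumptions, not the project's
actual code.

```python
def validate_pkill_args(args, allowed):
    """Return (ok, reason); every non-flag operand must be an allowed name."""
    # Assumption: anything starting with "-" is a flag. A flag value such as
    # the user in "-u user" is then treated as a pattern and rejected unless
    # it is allowed, which errs on the side of blocking.
    patterns = [arg for arg in args if not arg.startswith("-")]
    if not patterns:
        return False, "pkill requires a process name"
    for name in patterns:
        if name not in allowed:
            return False, f"pkill target not allowed: {name}"
    return True, ""
```

With allowed = {"node"}, the operands ["node", "sshd"] are rejected because
"sshd" is checked as well as "node".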
Addresses CodeRabbit feedback on PR #101.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Address CodeRabbit security feedback - restrict pkill_processes entries
to alphanumeric names with dots, underscores, and hyphens only.
This prevents potential exploitation through regex metacharacters like
'.*' being registered as process names.
Changes:
- Added VALID_PROCESS_NAME_PATTERN regex constant
- Updated both org and project config validation to:
  - Normalize (trim whitespace) process names
  - Reject names with regex metacharacters
  - Reject names with spaces
- Added 3 new tests for regex validation
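
The validation rule can be sketched as below. The exact regex and the helper
name normalize_process_names are assumptions derived from the description
(alphanumerics, dots, underscores, hyphens); the real constant may differ.

```python
import re

# Assumed pattern: letters, digits, dots, underscores, and hyphens only.
VALID_PROCESS_NAME_PATTERN = re.compile(r"[A-Za-z0-9._-]+")


def normalize_process_names(names):
    """Trim each name; raise ValueError for anything outside the safe set."""
    cleaned = []
    for raw in names:
        if not isinstance(raw, str):
            raise ValueError(f"process name must be a string: {raw!r}")
        name = raw.strip()
        # fullmatch rejects empty names, spaces, and metacharacters like "*".
        if not VALID_PROCESS_NAME_PATTERN.fullmatch(name):
            raise ValueError(f"invalid process name: {raw!r}")
        cleaned.append(name)
    return cleaned
```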
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Users can now access the WebUI remotely (e.g., via VS Code tunnels,
remote servers) by specifying a host address:
    python start_ui.py --host 0.0.0.0
    python start_ui.py --host 0.0.0.0 --port 8888
Changes:
- Added --host and --port CLI arguments to start_ui.py
- Security warning displayed when remote access is enabled
- AUTOCODER_ALLOW_REMOTE env var passed to server
- server/main.py conditionally disables localhost middleware
- CORS updated to allow all origins when remote access is enabled
- Browser auto-open disabled for remote hosts
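
A rough sketch of how the flags and the environment variable might fit
together. The defaults, the helper names, and the value "1" for
AUTOCODER_ALLOW_REMOTE are assumptions; only the flag names and the variable
name come from this change.

```python
import argparse

# Hosts treated as local-only in this sketch (an assumption).
LOCAL_HOSTS = {"127.0.0.1", "localhost", "::1"}


def parse_ui_args(argv=None):
    """Parse --host and --port; the defaults here are illustrative."""
    parser = argparse.ArgumentParser(description="Start the web UI (sketch)")
    parser.add_argument("--host", default="127.0.0.1")
    parser.add_argument("--port", type=int, default=8888)
    return parser.parse_args(argv)


def is_remote(host):
    """True when the server would listen on a non-loopback address."""
    return host not in LOCAL_HOSTS


def server_env(host, base_env):
    """Copy base_env and flag remote access for the server process."""
    env = dict(base_env)
    if is_remote(host):
        env["AUTOCODER_ALLOW_REMOTE"] = "1"  # assumed value
    return env
```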
Security considerations documented in warning:
- File system access to project directories
- API can start/stop agents and modify files
- Recommend firewall or VPN for protection
Fixes: leonvanzyl/autocoder#81
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
When running multiple projects simultaneously, UI would show mixed data
because the manager registry used only project_name as key. Projects with
the same name but different paths shared the same manager instance.
Changed manager registries to use composite key (project_name, resolved_path):
- server/services/process_manager.py: AgentProcessManager registry
- server/services/dev_server_manager.py: DevServerProcessManager registry
This ensures that:
- /old/my-app and /new/my-app get separate managers
- Multiple browser tabs viewing different projects stay isolated
- Project renames don't cause callback contamination
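
The composite-key idea, sketched with a stand-in Manager class. The class,
the registry name, and get_manager are hypothetical; the actual registries
live in the two files listed above.

```python
import threading
from pathlib import Path


class Manager:
    """Stand-in for a per-project process manager (hypothetical)."""

    def __init__(self, name, path):
        self.name = name
        self.path = path


_managers = {}  # maps (project_name, resolved_path) to a Manager
_lock = threading.Lock()


def get_manager(project_name, project_dir):
    """Return the manager for this exact project, creating it on first use."""
    resolved = str(Path(project_dir).resolve())
    key = (project_name, resolved)
    with _lock:
        if key not in _managers:
            _managers[key] = Manager(project_name, resolved)
        return _managers[key]
```

Two projects named "my-app" under different directories now map to different
keys, so they no longer share state.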
Fixes: leonvanzyl/autocoder#71
Also fixes: leonvanzyl/autocoder#62 (progress bar sync)
Also fixes: leonvanzyl/autocoder#61 (features missing in kanban)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Remove the testing_in_progress claim/release mechanism from the testing
agent architecture. Multiple testing agents can now test the same feature
concurrently, simplifying the system and eliminating potential stale lock
issues.
Changes:
- parallel_orchestrator.py:
  - Remove claim_feature_for_testing() and release_testing_claim() methods
  - Remove _cleanup_stale_testing_locks() periodic cleanup
  - Replace with simple _get_random_passing_feature() selection
  - Remove startup stale lock cleanup code
  - Remove STALE_TESTING_LOCK_MINUTES constant
  - Remove unused imports (timedelta, text)
- api/database.py:
  - Remove testing_in_progress and last_tested_at columns from Feature model
  - Update to_dict() to exclude these fields
  - Convert _migrate_add_testing_columns() to no-op for backwards compat
- mcp_server/feature_mcp.py:
  - Remove feature_release_testing tool entirely
  - Remove unused datetime import
- prompts.py:
  - Update testing prompt to remove feature_release_testing instruction
  - Testing agents now just verify and exit (no cleanup needed)
- server/websocket.py:
  - Update AgentTracker to use composite keys (feature_id, agent_type)
  - Prevents ghost agent creation from ambiguous [Feature #X] messages
  - Proper separation of coding vs testing agent tracking
Benefits:
- Eliminates artificial bottleneck from claim coordination
- No stale locks to clean up after crashes
- Simpler crash recovery (no testing state to restore)
- Reduced database writes (no claim/release transactions)
- Matches intended design: random, concurrent regression testing
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Previously, agents that completed their work would remain visible in the
Mission Control UI until a manual page refresh. This occurred because
the AgentTracker._handle_agent_complete method silently dropped completion
messages when an agent wasn't tracked (e.g., due to missed start messages
from WebSocket connection issues).
Backend changes:
- Modified _handle_agent_complete in server/websocket.py to always emit
completion messages, even for untracked agents
- Synthetic completions use agentIndex=-1 and agentName='Unknown' as
sentinel values to indicate untracked agents
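
A sketch of the always-emit behavior. The message fields other than
featureId, agentIndex, and agentName, plus the dictionary-based tracker, are
assumptions made for illustration.

```python
def handle_agent_complete(active_agents, feature_id, agent_type, passed):
    """Build a completion message whether or not the agent was tracked."""
    agent = active_agents.pop((feature_id, agent_type), None)
    if agent is None:
        # Untracked agent (e.g. the start message was missed): fall back to
        # the sentinel values so the UI can still remove it by featureId.
        agent = {"agentIndex": -1, "agentName": "Unknown"}
    return {
        "type": "agent_update",  # assumed message type
        "featureId": feature_id,
        "agentIndex": agent["agentIndex"],
        "agentName": agent["agentName"],
        "state": "success" if passed else "error",  # assumed field
    }
```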
Frontend changes:
- Updated useWebSocket.ts to handle synthetic completions by removing
agents by featureId when agentIndex is -1
- Added 30-minute stale agent cleanup as defense-in-depth for users who
leave the UI open for extended periods
- Updated TypeScript types to allow 'Unknown' as valid agent name
Component updates:
- AgentAvatar.tsx: Added UNKNOWN_COLORS and UnknownSVG fallback for
rendering unknown agents with a neutral gray question mark icon
- CelebrationOverlay.tsx, DependencyGraph.tsx: Updated interfaces to
accept 'Unknown' agent names
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add real-time visibility into the parallel orchestrator's decisions
and state in the Mission Control UI. The orchestrator now has its
own avatar ("Maestro") and displays capacity/queue information.
Backend changes (server/websocket.py):
- Add OrchestratorTracker class that parses orchestrator stdout
- Define regex patterns for key orchestrator events (spawn, complete, capacity)
- Track coding/testing agent counts, ready queue, blocked features
- Emit orchestrator_update WebSocket messages
- Reset tracker state when agent stops or crashes
Frontend changes:
- Add OrchestratorState, OrchestratorStatus, OrchestratorEvent types
- Add WSOrchestratorUpdateMessage to WSMessage union
- Handle orchestrator_update in useWebSocket hook
- Create OrchestratorAvatar component (Maestro - robot conductor)
- Create OrchestratorStatusCard with capacity badges and event ticker
- Update AgentMissionControl to show orchestrator above agent cards
- Add conducting/baton-tap CSS animations for Maestro
The orchestrator status card shows:
- Maestro avatar with state-based animations
- Current orchestrator state and message
- Coding agents, testing agents, ready queue badges
- Blocked features count (when > 0)
- Collapsible recent events list
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add cross-platform temporary_home() context manager to handle
environment variable differences between Unix and Windows systems.
Changes:
- Add temporary_home() context manager that handles both HOME (Unix)
and USERPROFILE/HOMEDRIVE/HOMEPATH (Windows) environment variables
- Update test_org_config_loading() to use temporary_home()
- Update test_hierarchy_resolution() to use temporary_home()
- Update test_org_blocklist_enforcement() to use temporary_home()
- Add missing imports: os, contextmanager
Why: The unit tests for org config loading were failing on Windows
because they only set the HOME environment variable, but Windows
uses USERPROFILE instead. The integration tests already had this
fix via a similar context manager.
Result: All 148 unit tests now pass on both Windows and Unix systems.
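
One way such a context manager could look. This is a sketch under stated
assumptions, not the test suite's actual helper: it sets HOME and
USERPROFILE on every platform and sets HOMEDRIVE/HOMEPATH only when the path
has a drive component.

```python
import os
from contextlib import contextmanager

_HOME_VARS = ("HOME", "USERPROFILE", "HOMEDRIVE", "HOMEPATH")


@contextmanager
def temporary_home(path):
    """Point the home-directory variables at path, then restore them."""
    saved = {name: os.environ.get(name) for name in _HOME_VARS}
    home = str(path)
    try:
        os.environ["HOME"] = home
        os.environ["USERPROFILE"] = home
        drive, tail = os.path.splitdrive(home)
        if drive:  # only non-empty for Windows-style paths
            os.environ["HOMEDRIVE"] = drive
            os.environ["HOMEPATH"] = tail
        yield home
    finally:
        # Restore the originals even if the body raised.
        for name, value in saved.items():
            if value is None:
                os.environ.pop(name, None)
            else:
                os.environ[name] = value
```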
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Move mimetypes import to the top of the import block to satisfy
ruff's import sorting rules (I001). The Windows mimetype fix from
PR #82 placed the import after other imports, which violated the
project's linting standards.
Changes:
- Move `import mimetypes` to alphabetically correct position
- Update comment to clarify timing requirement
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Changes:
- Add temporary_home() context manager for safe HOME manipulation
- Handle both Unix (HOME) and Windows (USERPROFILE, HOMEDRIVE, HOMEPATH)
- Update test_org_blocklist_enforcement to use context manager
- Update test_org_allowlist_inheritance to use context manager
Benefits:
- Environment variables always restored, even on exceptions
- Prevents test pollution across test runs
- Cross-platform compatibility (Windows + Unix)
All 9 integration tests passing.
Changes:
- Support path patterns without ./ prefix (e.g., 'scripts/test.sh')
- Reject non-string or empty command names in org config
- Add 8 new test cases (5 for path patterns, 3 for validation)
Details:
- matches_pattern() now treats any pattern with '/' as a path pattern
- load_org_config() validates that cmd['name'] is a non-empty string
- All 148 unit tests + 9 integration tests passing
Security hardening: Prevents invalid command names from reaching
pattern matching logic, reducing attack surface.
Replace agent-initiated feature selection with orchestrator pre-selection
for both coding and testing agents. This ensures Mission Control displays
correct feature numbers for testing agents (previously showed "Feature #0").
Key changes:
MCP Server (mcp_server/feature_mcp.py):
- Add feature_get_by_id tool for agents to fetch assigned feature details
- Remove obsolete tools: feature_get_next, feature_claim_next,
feature_claim_for_testing, feature_get_for_regression
- Remove helper functions and unused imports (text, OperationalError, func)
Orchestrator (parallel_orchestrator.py):
- Change running_testing_agents from list to dict[int, Popen]
- Add claim_feature_for_testing() with random selection
- Add release_testing_claim() method
- Pass --testing-feature-id to spawned testing agents
- Use unified [Feature #X] output format for both agent types
Agent Entry Points:
- autonomous_agent_demo.py: Add --testing-feature-id CLI argument
- agent.py: Pass testing_feature_id to get_testing_prompt()
Prompt Templates:
- coding_prompt.template.md: Update to use feature_get_by_id
- testing_prompt.template.md: Update workflow for pre-assigned features
- prompts.py: Update pre-claimed headers for both agent types
WebSocket (server/websocket.py):
- Simplify tracking with unified [Feature #X] pattern
- Remove testing-specific parsing code
Assistant (server/services/assistant_chat_session.py):
- Update help text with current available tools
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Major refactoring of the parallel orchestrator to run regression testing
agents independently from coding agents. This improves system reliability
and provides better control over testing behavior.
Key changes:
Database & MCP Layer:
- Add testing_in_progress and last_tested_at columns to Feature model
- Add feature_claim_for_testing() for atomic test claim with retry
- Add feature_release_testing() to release claims after testing
- Refactor claim functions to iterative loops (no recursion)
- Add OperationalError retry handling for transient DB errors
- Reduce MAX_CLAIM_RETRIES from 10 to 5
Orchestrator:
- Decouple testing agent lifecycle from coding agents
- Add _maintain_testing_agents() for continuous testing maintenance
- Fix TOCTOU race in _spawn_testing_agent() - hold lock during spawn
- Add _cleanup_stale_testing_locks() with 30-min timeout
- Fix log ordering - start_session() before stale flag cleanup
- Add stale testing_in_progress cleanup on startup
Dead Code Removal:
- Remove count_testing_in_concurrency from entire stack (12+ files)
- Remove ineffective with_for_update() from features router
API & UI:
- Pass testing_agent_ratio via CLI to orchestrator
- Update testing prompt template to use new claim/release tools
- Rename UI label to "Regression Agents" with clearer description
- Add process_utils.py for cross-platform process tree management
Testing agents now:
- Run continuously as long as passing features exist
- Can re-test features multiple times to catch regressions
- Are controlled by fixed count (0-3) via testing_agent_ratio setting
- Have atomic claiming to prevent concurrent testing of same feature
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Changes:
- Increase command limit from 50 to 100 per project
- Add examples/OPTIMIZE_CONFIG.md with optimization strategies
- Update all documentation references (50 → 100)
- Update tests for new limit
Rationale:
- 50 was too restrictive for projects with many tools (Flutter, etc.)
- Users were unknowingly exceeding limit by listing subcommands
- 100 provides headroom while maintaining security
- New guide teaches wildcard optimization (flutter* vs listing each subcommand)
UI feedback idea: Show command count and optimization suggestions
(tracked for Phase 3 or future enhancement)
Add validation to reject bare wildcards for security:
- matches_pattern(): return False if pattern == '*'
- validate_project_command(): reject name == '*' with clear error
- Added 4 new tests for bare wildcard rejection
This prevents a config with '*' from matching every command,
which would be a major security risk.
Tests: 140 unit tests passing (added 4 bare wildcard tests)
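
A sketch of the two checks. Using fnmatch for the non-bare case and the
names matches_pattern_sketch and validate_command_name are assumptions; the
real matches_pattern() and validate_project_command() may match differently.

```python
import fnmatch


def matches_pattern_sketch(command, pattern):
    """Match command against pattern, but never let a bare '*' match."""
    if pattern == "*":
        return False
    # Assumption: shell-style wildcard matching for everything else.
    return fnmatch.fnmatchcase(command, pattern)


def validate_command_name(name):
    """Return (ok, error) for a configured command name."""
    if not isinstance(name, str) or not name.strip():
        return False, "command name must be a non-empty string"
    if name.strip() == "*":
        return False, "bare wildcard '*' is not allowed"
    return True, ""
```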
Changes:
- Revert the incorrect claude_code_sdk import back to claude_agent_sdk in
agent.py (PR #50 introduced an undocumented change to a deprecated package)
- Clear activeAgents and recentActivity in useWebSocket when agent stops
to prevent stale UI state
The claude_code_sdk package is deprecated (last updated Sep 2025) while
claude_agent_sdk is the active, maintained package. The import change in
PR #50 was undocumented and would have caused ImportError since only
claude-agent-sdk is specified in requirements.txt.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Performance improvements:
- Fix N+1 query in get_conversations() using COUNT subquery instead of
len(c.messages) which triggered lazy loading for each conversation
- Add SQLAlchemy engine caching to avoid creating new database connections
on every request
- Add React.memo to ChatMessage component to prevent unnecessary re-renders
during message streaming
- Move BOLD_REGEX to module scope to avoid recreating on each render
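
The N+1 fix can be illustrated with plain sqlite3 rather than SQLAlchemy.
The table and column names are hypothetical; the point is that one query
with a COUNT subquery replaces one extra query per conversation.

```python
import sqlite3


def make_demo_db():
    """Create an in-memory database with an assumed two-table schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE conversations (id INTEGER PRIMARY KEY, title TEXT);
        CREATE TABLE messages (
            id INTEGER PRIMARY KEY,
            conversation_id INTEGER,
            body TEXT
        );
        """
    )
    return conn


def list_conversations(conn):
    """Return (id, title, message_count) rows using a single query."""
    return conn.execute(
        """
        SELECT c.id, c.title,
               (SELECT COUNT(*) FROM messages m
                 WHERE m.conversation_id = c.id) AS message_count
        FROM conversations c
        ORDER BY c.id
        """
    ).fetchall()
```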
Code quality improvements:
- Remove 10+ console.log debug statements from AssistantChat.tsx and
AssistantPanel.tsx that were left from development
- Add user feedback for delete errors in ConversationHistory - dialog now
stays open and shows error message instead of silently failing
- Update ConfirmDialog to accept ReactNode for message prop to support
rich error content
These changes address issues identified in the code review of PR #74
(conversation history feature).
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
This commit fixes several issues identified in the agent scheduling
feature from PR #75:
Frontend Fixes:
- Add day boundary handling in timeUtils.ts for timezone conversions
- Add utcToLocalWithDayShift/localToUTCWithDayShift functions
- Add shiftDaysForward/shiftDaysBackward helpers for bitfield adjustment
- Update ScheduleModal to correctly adjust days_of_week when crossing
day boundaries during UTC conversion (fixes schedules running on
wrong days for users in extreme timezones like UTC+9)
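
The day-boundary logic, sketched in Python rather than the TypeScript used
in timeUtils.ts. The bit layout (bit 0 = Monday through bit 6 = Sunday), the
snake_case names, and the signatures are assumptions.

```python
DAY_MASK = 0b1111111  # seven day bits; bit 0 = Monday (assumed layout)


def shift_days_forward(bits):
    """Rotate the 7-bit day mask one day later (Sunday wraps to Monday)."""
    bits &= DAY_MASK
    return ((bits << 1) | (bits >> 6)) & DAY_MASK


def shift_days_backward(bits):
    """Rotate the 7-bit day mask one day earlier (Monday wraps to Sunday)."""
    bits &= DAY_MASK
    return ((bits >> 1) | ((bits & 1) << 6)) & DAY_MASK


def local_to_utc_with_day_shift(hour, minute, offset_minutes):
    """Convert local wall time to UTC.

    offset_minutes is local minus UTC (e.g. 540 for UTC+9). Returns
    (utc_hour, utc_minute, day_shift) with day_shift in {-1, 0, 1}.
    """
    total = hour * 60 + minute - offset_minutes
    day_shift, remainder = divmod(total, 24 * 60)
    return remainder // 60, remainder % 60, day_shift
```

For example, 03:00 on Monday at UTC+9 is 18:00 UTC on Sunday, so the stored
days_of_week mask has to be shifted backward by one day.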
Backend Fixes:
- Add MAX_SCHEDULES_PER_PROJECT (50) limit to prevent resource exhaustion
- Wire up crash recovery callback in scheduler_service._start_agent()
- Convert schedules.py endpoints to use context manager for DB sessions
- Fix race condition in override creation with atomic delete-then-create
- Replace deprecated datetime.utcnow with datetime.now(timezone.utc)
- Add DB-level CHECK constraints for Schedule model fields
Files Modified:
- api/database.py: Add _utc_now helper, CheckConstraint imports, constraints
- progress.py: Replace deprecated datetime.utcnow
- server/routers/schedules.py: Add context manager, schedule limits
- server/services/assistant_database.py: Replace deprecated datetime.utcnow
- server/services/scheduler_service.py: Wire crash recovery, fix race condition
- ui/src/components/ScheduleModal.tsx: Use day shift functions
- ui/src/lib/timeUtils.ts: Add day boundary handling functions
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
This fix addresses two root causes that led testing agents to
accumulate (10-12 agents) instead of maintaining a 1:1 ratio with
coding agents:
1. Testing agents now exit after one session (agent.py)
   - Added `or agent_type == "testing"` to the exit condition
   - Previously, testing agents never hit the exit condition since
     they're spawned with feature_id=None
2. Testing agents now spawn when coding agents START, not complete
   - Moved spawn logic from _on_agent_complete() to start_feature()
   - Removed the old spawn logic from _on_agent_complete()
   - This ensures proper 1:1 ratio and prevents accumulation
Expected behavior after fix:
- First coding agent: no testing agent (no passing features yet)
- Subsequent coding agents: one testing agent spawns per start
- Each testing agent tests ONE feature then terminates immediately
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Security fixes to restore defense-in-depth after merging PR #78:
**client.py:**
- Revert permission mode from "bypassPermissions" to "acceptEdits"
- Remove redundant web_tools_auto_approve_hook from PreToolUse hooks
- Remove unused import of web_tools_auto_approve_hook
**security.py:**
- Remove web_tools_auto_approve_hook function (was redundant and
returned {} for ALL tools, not just WebFetch/WebSearch)
**server/services/spec_chat_session.py:**
- Restore allowed_tools restriction: [Read, Write, Edit, Glob,
WebFetch, WebSearch]
- Revert permission mode from "bypassPermissions" to "acceptEdits"
- Keeps setting_sources=["project", "user"] for global skills access
**ui/src/components/AgentAvatar.tsx:**
- Remove unused getMascotName export to fix React Fast Refresh warning
- File now only exports AgentAvatar component as expected
The bypassPermissions mode combined with unrestricted tool access in
spec_chat_session.py created a security gap where Bash commands could
execute without validation (sandbox disabled, no bash_security_hook).
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
UI Changes:
- Add "Create Spec with AI" button in empty kanban when project has no spec
- Button opens SpecCreationChat to guide users through spec creation
- Shows in Pending column when has_spec=false and no features exist
Windows Fixes:
- Fix asyncio subprocess NotImplementedError on Windows
- Set WindowsProactorEventLoopPolicy in server/__init__.py
- Remove --reload from uvicorn (incompatible with Windows subprocess)
- Add process cleanup on startup in start_ui.bat
Spec Chat Improvements:
- Enable full tool access (remove allowed_tools restriction)
- Add "user" to setting_sources for global skills access
- Use bypassPermissions mode for auto-approval
- Add WebFetch/WebSearch auto-approve hook
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add comprehensive scheduling system that allows agents to automatically
start and stop during configured time windows, helping users manage
Claude API token limits by running agents during off-hours.
Backend Changes:
- Add Schedule and ScheduleOverride database models for persistent storage
- Implement APScheduler-based SchedulerService with UTC timezone support
- Add schedule CRUD API endpoints (/api/projects/{name}/schedules)
- Add manual override tracking to prevent unwanted auto-start/stop
- Integrate scheduler lifecycle with FastAPI startup/shutdown
- Fix timezone bug: explicitly set timezone=timezone.utc on CronTrigger
to ensure correct UTC scheduling (critical fix)
Frontend Changes:
- Add ScheduleModal component for creating and managing schedules
- Add clock button and schedule status display to AgentControl
- Add timezone utilities for converting between UTC and local time
- Add React Query hooks for schedule data fetching
- Fix 204 No Content handling in fetchJSON for delete operations
- Invalidate nextRun cache when manually stopping agent during window
- Add TypeScript type annotations to Terminal component callbacks
Features:
- Multiple overlapping schedules per project supported
- Auto-start at scheduled time via APScheduler cron jobs
- Auto-stop after configured duration
- Manual start/stop creates persistent overrides in database
- Crash recovery with exponential backoff (max 3 retries)
- Server restart preserves schedules and active overrides
- Times displayed in user's local timezone, stored as UTC
- Immediate start if schedule created during active window
Dependencies:
- Add APScheduler for reliable cron-like scheduling
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Addresses concerns from PR #76 code review:
- Add exception handling for stat() calls to prevent crashes from race
conditions when files are deleted/modified during iteration
- Add 2-second timestamp tolerance for FAT32 filesystem compatibility
(FAT32 has 2-second mtime precision on USB drives/SD cards)
- Add config file checks (package.json, vite.config.ts, tailwind.config.ts,
tsconfig.json, etc.) that also require rebuilds when changed
- Add logging to show which file triggered the rebuild for debugging
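
A sketch of a rebuild check with these safeguards. The function names and
the idea of comparing against a single build marker file are assumptions;
only the 2-second tolerance and the guarded stat() calls come from this
change.

```python
import os

MTIME_TOLERANCE_SECONDS = 2.0  # FAT32 stores mtimes with 2-second precision


def _safe_mtime(path):
    """Return the file's mtime, or None if it vanished or cannot be read."""
    try:
        return os.stat(path).st_mtime
    except OSError:
        return None


def find_rebuild_trigger(sources, build_marker):
    """Return the path that forces a rebuild, or None if the build is fresh."""
    built_at = _safe_mtime(build_marker)
    if built_at is None:
        return str(build_marker)  # no build output yet
    for src in sources:
        mtime = _safe_mtime(src)
        if mtime is None:
            continue  # deleted during iteration; skip instead of crashing
        if mtime > built_at + MTIME_TOLERANCE_SECONDS:
            return str(src)
    return None
```

Returning the triggering path makes it straightforward to log which file
caused the rebuild.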
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
PR #77 introduced a bug where `is_first_run` was used in the
completion detection check, but this variable is only defined when
`agent_type is None`. When the orchestrator runs agents with explicit
`--agent-type` or `--feature-id`, the variable is undefined, causing
a NameError crash.
Changed to use `is_initializer` which is always defined and has the
correct semantic meaning for this check.
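
The failure mode can be reproduced in a few lines. Both functions are
hypothetical reductions: how is_initializer is actually computed and what
the completion check does with it are assumptions.

```python
def completion_check_buggy(agent_type):
    """Uses a name that is bound on only one branch."""
    if agent_type is None:
        is_first_run = True
    # Raises UnboundLocalError (a NameError subclass) when agent_type is set.
    return not is_first_run


def completion_check_fixed(agent_type):
    """Uses a flag that is bound on every path."""
    is_initializer = agent_type == "initializer"  # assumed definition
    return not is_initializer
```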
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>