Changes:
- Add temporary_home() context manager for safe HOME manipulation
- Handle both Unix (HOME) and Windows (USERPROFILE, HOMEDRIVE, HOMEPATH)
- Update test_org_blocklist_enforcement to use context manager
- Update test_org_allowlist_inheritance to use context manager
Benefits:
- Environment variables always restored, even on exceptions
- Prevents test pollution across test runs
- Cross-platform compatibility (Windows + Unix)
All 9 integration tests passing.
Changes:
- Support path patterns without ./ prefix (e.g., 'scripts/test.sh')
- Reject non-string or empty command names in org config
- Add 8 new test cases (5 for path patterns, 3 for validation)
Details:
- matches_pattern() now treats any pattern with '/' as a path pattern
- load_org_config() validates that cmd['name'] is a non-empty string
- All 148 unit tests + 9 integration tests passing
Security hardening: Prevents invalid command names from reaching
pattern matching logic, reducing attack surface.
Changes:
- Increase command limit from 50 to 100 per project
- Add examples/OPTIMIZE_CONFIG.md with optimization strategies
- Update all documentation references (50 → 100)
- Update tests for new limit
Rationale:
- 50 was too restrictive for projects with many tools (Flutter, etc.)
- Users were unknowingly exceeding limit by listing subcommands
- 100 provides headroom while maintaining security
- New guide teaches wildcard optimization (flutter* vs listing each subcommand)
UI feedback idea: Show command count and optimization suggestions
(tracked for Phase 3 or future enhancement)
Add validation to reject bare wildcards for security:
- matches_pattern(): return False if pattern == '*'
- validate_project_command(): reject name == '*' with clear error
- Added 4 new tests for bare wildcard rejection
This prevents a config with from matching every command,
which would be a major security risk.
Tests: 140 unit tests passing (added 4 bare wildcard tests)
Changes:
- Revert incorrect import from claude_code_sdk to claude_agent_sdk in agent.py
(PR #50 introduced an undocumented change to a deprecated package)
- Clear activeAgents and recentActivity in useWebSocket when agent stops
to prevent stale UI state
The claude_code_sdk package is deprecated (last updated Sep 2025) while
claude_agent_sdk is the active, maintained package. The import change in
PR #50 was undocumented and would have caused ImportError since only
claude-agent-sdk is specified in requirements.txt.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Performance improvements:
- Fix N+1 query in get_conversations() using COUNT subquery instead of
len(c.messages) which triggered lazy loading for each conversation
- Add SQLAlchemy engine caching to avoid creating new database connections
on every request
- Add React.memo to ChatMessage component to prevent unnecessary re-renders
during message streaming
- Move BOLD_REGEX to module scope to avoid recreating on each render
Code quality improvements:
- Remove 10+ console.log debug statements from AssistantChat.tsx and
AssistantPanel.tsx that were left from development
- Add user feedback for delete errors in ConversationHistory - dialog now
stays open and shows error message instead of silently failing
- Update ConfirmDialog to accept ReactNode for message prop to support
rich error content
These changes address issues identified in the code review of PR #74
(conversation history feature).
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
This commit fixes several issues identified in the agent scheduling
feature from PR #75:
Frontend Fixes:
- Add day boundary handling in timeUtils.ts for timezone conversions
- Add utcToLocalWithDayShift/localToUTCWithDayShift functions
- Add shiftDaysForward/shiftDaysBackward helpers for bitfield adjustment
- Update ScheduleModal to correctly adjust days_of_week when crossing
day boundaries during UTC conversion (fixes schedules running on
wrong days for users in extreme timezones like UTC+9)
Backend Fixes:
- Add MAX_SCHEDULES_PER_PROJECT (50) limit to prevent resource exhaustion
- Wire up crash recovery callback in scheduler_service._start_agent()
- Convert schedules.py endpoints to use context manager for DB sessions
- Fix race condition in override creation with atomic delete-then-create
- Replace deprecated datetime.utcnow with datetime.now(timezone.utc)
- Add DB-level CHECK constraints for Schedule model fields
Files Modified:
- api/database.py: Add _utc_now helper, CheckConstraint imports, constraints
- progress.py: Replace deprecated datetime.utcnow
- server/routers/schedules.py: Add context manager, schedule limits
- server/services/assistant_database.py: Replace deprecated datetime.utcnow
- server/services/scheduler_service.py: Wire crash recovery, fix race condition
- ui/src/components/ScheduleModal.tsx: Use day shift functions
- ui/src/lib/timeUtils.ts: Add day boundary handling functions
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
This fix addresses two root causes that caused testing agents to
accumulate (10-12 agents) instead of maintaining a 1:1 ratio with
coding agents:
1. Testing agents now exit after one session (agent.py)
- Added `or agent_type == "testing"` to the exit condition
- Previously, testing agents never hit the exit condition since
they're spawned with feature_id=None
2. Testing agents now spawn when coding agents START, not complete
- Moved spawn logic from _on_agent_complete() to start_feature()
- Removed the old spawn logic from _on_agent_complete()
- This ensures proper 1:1 ratio and prevents accumulation
Expected behavior after fix:
- First coding agent: no testing agent (no passing features yet)
- Subsequent coding agents: one testing agent spawns per start
- Each testing agent tests ONE feature then terminates immediately
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Security fixes to restore defense-in-depth after merging PR #78:
**client.py:**
- Revert permission mode from "bypassPermissions" to "acceptEdits"
- Remove redundant web_tools_auto_approve_hook from PreToolUse hooks
- Remove unused import of web_tools_auto_approve_hook
**security.py:**
- Remove web_tools_auto_approve_hook function (was redundant and
returned {} for ALL tools, not just WebFetch/WebSearch)
**server/services/spec_chat_session.py:**
- Restore allowed_tools restriction: [Read, Write, Edit, Glob,
WebFetch, WebSearch]
- Revert permission mode from "bypassPermissions" to "acceptEdits"
- Keeps setting_sources=["project", "user"] for global skills access
**ui/src/components/AgentAvatar.tsx:**
- Remove unused getMascotName export to fix React Fast Refresh warning
- File now only exports AgentAvatar component as expected
The bypassPermissions mode combined with unrestricted tool access in
spec_chat_session.py created a security gap where Bash commands could
execute without validation (sandbox disabled, no bash_security_hook).
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
UI Changes:
- Add "Create Spec with AI" button in empty kanban when project has no spec
- Button opens SpecCreationChat to guide users through spec creation
- Shows in Pending column when has_spec=false and no features exist
Windows Fixes:
- Fix asyncio subprocess NotImplementedError on Windows
- Set WindowsProactorEventLoopPolicy in server/__init__.py
- Remove --reload from uvicorn (incompatible with Windows subprocess)
- Add process cleanup on startup in start_ui.bat
Spec Chat Improvements:
- Enable full tool access (remove allowed_tools restriction)
- Add "user" to setting_sources for global skills access
- Use bypassPermissions mode for auto-approval
- Add WebFetch/WebSearch auto-approve hook
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add comprehensive scheduling system that allows agents to automatically
start and stop during configured time windows, helping users manage
Claude API token limits by running agents during off-hours.
Backend Changes:
- Add Schedule and ScheduleOverride database models for persistent storage
- Implement APScheduler-based SchedulerService with UTC timezone support
- Add schedule CRUD API endpoints (/api/projects/{name}/schedules)
- Add manual override tracking to prevent unwanted auto-start/stop
- Integrate scheduler lifecycle with FastAPI startup/shutdown
- Fix timezone bug: explicitly set timezone=timezone.utc on CronTrigger
to ensure correct UTC scheduling (critical fix)
Frontend Changes:
- Add ScheduleModal component for creating and managing schedules
- Add clock button and schedule status display to AgentControl
- Add timezone utilities for converting between UTC and local time
- Add React Query hooks for schedule data fetching
- Fix 204 No Content handling in fetchJSON for delete operations
- Invalidate nextRun cache when manually stopping agent during window
- Add TypeScript type annotations to Terminal component callbacks
Features:
- Multiple overlapping schedules per project supported
- Auto-start at scheduled time via APScheduler cron jobs
- Auto-stop after configured duration
- Manual start/stop creates persistent overrides in database
- Crash recovery with exponential backoff (max 3 retries)
- Server restart preserves schedules and active overrides
- Times displayed in user's local timezone, stored as UTC
- Immediate start if schedule created during active window
Dependencies:
- Add APScheduler for reliable cron-like scheduling
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Addresses concerns from PR #76 code review:
- Add exception handling for stat() calls to prevent crashes from race
conditions when files are deleted/modified during iteration
- Add 2-second timestamp tolerance for FAT32 filesystem compatibility
(FAT32 has 2-second mtime precision on USB drives/SD cards)
- Add config file checks (package.json, vite.config.ts, tailwind.config.ts,
tsconfig.json, etc.) that also require rebuilds when changed
- Add logging to show which file triggered the rebuild for debugging
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The PR #77 introduced a bug where `is_first_run` was used in the
completion detection check, but this variable is only defined when
`agent_type is None`. When the orchestrator runs agents with explicit
`--agent-type` or `--feature-id`, the variable is undefined causing
a NameError crash.
Changed to use `is_initializer` which is always defined and has the
correct semantic meaning for this check.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Introduce a new testing agent architecture that runs regression tests
independently from coding agents, improving quality assurance in
parallel mode.
Key changes:
Testing Agent System:
- Add testing_prompt.template.md for dedicated testing agent role
- Add feature_mark_failing MCP tool for regression detection
- Add --agent-type flag to select initializer/coding/testing mode
- Remove regression testing from coding prompt (now handled by testing agents)
Parallel Orchestrator Enhancements:
- Add testing agent spawning with configurable ratio (--testing-agent-ratio)
- Add comprehensive debug logging system (DebugLog class)
- Improve database session management to prevent stale reads
- Add engine.dispose() calls to refresh connections after subprocess commits
- Fix f-string linting issues (remove unnecessary f-prefixes)
UI Improvements:
- Add testing agent mascot (Chip) to AgentAvatar
- Enhance AgentCard to display testing agent status
- Add testing agent ratio slider in SettingsModal
- Update WebSocket handling for testing agent updates
- Improve ActivityFeed to show testing agent activity
API & Server Updates:
- Add testing_agent_ratio to settings schema and endpoints
- Update process manager to support testing agent type
- Enhance WebSocket messages for agent_update events
Template Changes:
- Delete coding_prompt_yolo.template.md (consolidated into main prompt)
- Update initializer_prompt.template.md with improved structure
- Streamline coding_prompt.template.md workflow
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The agent was running in an infinite loop when all kanban features were
completed. This happened because:
1. The main loop in agent.py had no completion detection
2. The coding prompt instructs Claude to run regression tests BEFORE
checking for new features
3. feature_get_next() returns "All features passing!" but nothing acted on it
This fix adds three completion checks:
1. Pre-loop check: Exits immediately if project is already 100% complete
when the agent starts (avoids running unnecessary sessions)
2. Post-session check: After each session, checks if all features are now
passing and exits gracefully with a success message
3. Single-feature mode: Exits after one session since the parallel
orchestrator manages spawning new agents for other features
Tested with a project that had 240/240 features passing - agent now exits
immediately with "ALL FEATURES ALREADY COMPLETE!" message.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Enhance the build_frontend() function to detect when source files have
been modified more recently than the newest file in dist/ directory.
This ensures the UI is rebuilt automatically when source code changes,
preventing stale UI from being served after pulling updates or switching
branches.
Changes:
- Find newest modification time among all files in ui/dist/
- Compare each source file in ui/src/ against newest dist file
- Trigger rebuild if any source file is newer than newest dist file
- Handle edge case when dist/ exists but contains no files
- Prevent serving outdated JavaScript bundles after code changes
This fix applies to all UI launch methods (start_ui.sh, start_ui.bat)
since they all call start_ui.py which contains the build logic.
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
The parallel orchestrator was exiting prematurely with "All features
complete!" while pending features remained. This was caused by SQLAlchemy
session caching not seeing commits made by agent subprocesses.
Changes:
- Add session.expire_all() to get_resumable_features() to force fresh reads
- Add session.expire_all() to get_ready_features() to force fresh reads
- Add session.expire_all() to get_all_complete() to force fresh reads
- Add defensive retry logic in run_loop() when no features are ready
but nothing is running - now forces a fresh check before declaring blocked
- Add debug logging to get_all_complete() and get_ready_features() to
track passing/pending/in_progress counts for easier diagnosis
The root cause was cross-process database visibility: when an agent
subprocess committed feature completion, the orchestrator's session
had cached the old state and didn't see the update.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Remove unused imports and organize import statements to pass ruff
linting checks:
- mcp_server/feature_mcp.py: Remove unused imports (are_dependencies_satisfied,
get_blocking_dependencies) and alphabetize import block
- parallel_orchestrator.py: Remove unused imports (time, Awaitable) and
add blank lines between import groups per PEP 8
- server/routers/features.py: Alphabetize imports in dependency resolver
These changes were identified by running `ruff check .` and auto-fixed
with `--fix` flag.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Critical fixes:
- Lock file TOCTOU race condition: Use atomic O_CREAT|O_EXCL for lock creation
- PID reuse vulnerability on Windows: Store PID:CREATE_TIME in lock file to
detect when a different process has reused the same PID
- WAL mode on network drives: Detect network paths (UNC, mapped drives, NFS,
CIFS) and fall back to DELETE journal mode to prevent corruption
High priority fixes:
- JSON migration now preserves dependencies field during legacy migration
- Process tree termination on Windows: Use psutil to kill child processes
recursively to prevent orphaned browser instances
- Retry backoff jitter: Add random 30% jitter to prevent synchronized retries
under high contention with 5 concurrent agents
Files changed:
- server/services/process_manager.py: Atomic lock creation, PID+create_time
- api/database.py: Network filesystem detection for WAL mode fallback
- api/migration.py: Add dependencies field to JSON migration
- parallel_orchestrator.py: _kill_process_tree helper function
- mcp_server/feature_mcp.py: Add jitter to exponential backoff
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Memoize onNodeClick callback in App.tsx to prevent unnecessary re-renders
- Add useRef pattern in DependencyGraph to store callback without triggering
useMemo recalculation when callback identity changes
- Add hash-based change detection to only update ReactFlow state when
actual graph data changes (node status, edges), not on every parent render
- Add GraphErrorBoundary class component to catch ReactFlow rendering errors
and provide a "Reload Graph" recovery button instead of blank screen
- Wrap DependencyGraph with error boundary and resetKey for graceful recovery
The root cause was frequent WebSocket updates during active agent sessions
causing parent re-renders, which created new inline callback functions,
triggering useMemo/useEffect chains that corrupted ReactFlow's internal state
over time (approximately 1 minute of continuous updates).
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Changes:
- Add per-agent log viewer with copy-to-clipboard functionality
- New AgentLogEntry type for structured log entries
- Logs stored per-agent in WebSocket state (up to 500 entries)
- Log modal rendered via React Portal to avoid overflow issues
- Click log icon on agent card to view full activity history
- Fix agents getting stuck in "failed" state
- Wrap client context manager in try/except (agent.py)
- Remove failed agents from UI on error state (useWebSocket.ts)
- Handle permanently failed features in get_all_complete()
- Add friendlier agent state labels
- "Hit an issue" → "Trying plan B..."
- "Retrying..." → "Being persistent..."
- Softer colors (yellow/orange instead of red)
- Add scheduling scores for smarter feature ordering
- compute_scheduling_scores() in dependency_resolver.py
- Features that unblock others get higher priority
- Update CLAUDE.md with parallel mode documentation
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Major feature implementation for parallel agent execution with dependency-aware
scheduling and an engaging multi-agent UI experience.
Backend Changes:
- Add parallel_orchestrator.py for concurrent feature processing
- Add api/dependency_resolver.py with cycle detection (Kahn's algorithm + DFS)
- Add atomic feature_claim_next() with retry limit and exponential backoff
- Fix circular dependency check arguments in 4 locations
- Add AgentTracker class for parsing agent output and emitting updates
- Add browser isolation with --isolated flag for Playwright MCP
- Extend WebSocket protocol with agent_update messages and log attribution
- Add WSAgentUpdateMessage schema with agent states and mascot names
- Fix WSProgressMessage to include in_progress field
New UI Components:
- AgentMissionControl: Dashboard showing active agents with collapsible activity
- AgentCard: Individual agent status with avatar and thought bubble
- AgentAvatar: SVG mascots (Spark, Fizz, Octo, Hoot, Buzz) with animations
- ActivityFeed: Recent activity stream with stable keys (no flickering)
- CelebrationOverlay: Confetti animation with click/Escape dismiss
- DependencyGraph: Interactive node graph visualization with dagre layout
- DependencyBadge: Visual indicator for feature dependencies
- ViewToggle: Switch between Kanban and Graph views
- KeyboardShortcutsHelp: Help overlay accessible via ? key
UI/UX Improvements:
- Celebration queue system to handle rapid success messages
- Accessibility attributes on AgentAvatar (role, aria-label, aria-live)
- Collapsible Recent Activity section with persisted preference
- Agent count display in header
- Keyboard shortcut G to toggle Kanban/Graph view
- Real-time thought bubbles and state animations
Bug Fixes:
- Fix circular dependency validation (swapped source/target arguments)
- Add MAX_CLAIM_RETRIES=10 to prevent stack overflow under contention
- Fix THOUGHT_PATTERNS to match actual [Tool: name] format
- Fix ActivityFeed key prop to prevent re-renders on new items
- Add featureId/agentIndex to log messages for proper attribution
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Fix duplicate onConversationCreated callbacks by tracking activeConversationId
- Fix history loss when switching conversations with Map-based deduplication
- Disable input while conversation is loading to prevent message routing issues
- Gate WebSocket debug logs behind DEV flag (import.meta.env.DEV)
- Downgrade server logging from info to debug level for reduced noise
- Fix .gitignore prefixes for playwright paths (ui/playwright-report/, ui/test-results/)
- Remove debug console.log from ConversationHistory.tsx
- Add staleTime (30s) to single conversation query for better caching
- Increase history message cap from 20 to 35 for better context
- Replace fixed timeouts with condition-based waits in e2e tests
- Add ConversationHistory dropdown component with list of past conversations
- Add useConversations hook for fetching and managing conversations via React Query
- Implement conversation switching with proper state management
- Fix bug where reopening panel showed new greeting instead of resuming conversation
- Fix bug where selecting from history caused conversation ID to revert
- Add server-side history context loading for resumed conversations
- Add Playwright E2E tests for conversation history feature
- Add logging for debugging conversation flow
Key changes:
- AssistantPanel: manages conversation state with localStorage persistence
- AssistantChat: header with [+] New Chat and [History] buttons
- Server: skips greeting for resumed conversations, loads history context on first message
- Fixed race condition in onConversationCreated callback
Complete the defense-in-depth approach from PR #53 by adding explicit
in_progress=False to all remaining feature creation locations. This
ensures consistency with the MCP server pattern and prevents potential
NULL values in the in_progress field.
Changes:
- server/routers/features.py: Add in_progress=False to create_feature()
and create_features_bulk() endpoints
- server/services/expand_chat_session.py: Add in_progress=False to
_create_features_bulk() in the expand chat session
- api/migration.py: Add in_progress field handling in JSON migration,
reading from source data with False as default
This follows up on PR #53 which added nullable=False constraints and
fixed existing NULL values, but only updated the MCP server creation
paths. Now all 6 feature creation locations explicitly set both
passes=False and in_progress=False.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Replace JSON output approach with direct MCP tool call in the
/expand-project CLI command to fix feature persistence issue.
Changes:
- Update expand-project.md to call feature_create_bulk MCP tool
- Remove <features_to_create> JSON tag output (not parsed in CLI)
- Simplify confirmation message before feature creation
Why:
- CLI users running /expand-project had features displayed but never
saved to database because nothing parsed the JSON output
- Web UI is unaffected (uses expand_chat_session.py with its own parser)
- This mirrors how initializer_prompt.template.md already works
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- ChatMessage: use CSS variable syntax for bg-neo-accent and text consistency
- DebugLogViewer: fix info log level to use --color-neo-log-info
- TerminalTabs: use neo-hover-subtle for hover states instead of text color
- globals.css: fix shimmer effect selector to target .neo-progress-fill
- globals.css: fix loading spinner visibility with explicit border color
- globals.css: add will-change for .neo-btn-yolo performance
- App.tsx: group constants after imports
- NewProjectModal: remove redundant styling (neo-card provides these)
- Add tsconfig.tsbuildinfo to .gitignore and remove from tracking
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add the ability for users to edit features that are not yet completed,
allowing them to provide corrections or additional instructions when the
agent is stuck or implementing a feature incorrectly.
Backend changes:
- Add FeatureUpdate schema in server/schemas.py with optional fields
- Add PATCH /api/projects/{project_name}/features/{feature_id} endpoint
- Validate that completed features (passes=True) cannot be edited
Frontend changes:
- Add FeatureUpdate type in ui/src/lib/types.ts
- Add updateFeature() API function in ui/src/lib/api.ts
- Add useUpdateFeature() React Query mutation hook
- Create EditFeatureForm.tsx component with pre-filled form values
- Update FeatureModal.tsx with Edit button for non-completed features
The edit form allows modifying category, name, description, priority,
and test steps. Save button is disabled until changes are detected.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>