autocoder

mirror of https://github.com/leonvanzyl/autocoder.git synced 2026-02-01 15:03:36 +00:00

Author	SHA1	Message	Date
Auto	1607fc8175	feat: add multi-feature batching for coding agents Enable the orchestrator to assign 1-3 features per coding agent subprocess, selected via dependency chain extension + same-category fill. This reduces cold-start overhead and leverages shared context across related features. Orchestrator (parallel_orchestrator.py): - Add batch tracking: _batch_features and _feature_to_primary data structures - Add build_feature_batches() with dependency chain + category fill algorithm - Add start_feature_batch() and _spawn_coding_agent_batch() methods - Update _on_agent_complete() for batch cleanup across all features - Update stop_feature() with _feature_to_primary lookup - Update get_ready_features() to exclude all batch feature IDs - Update main loop to build batches then spawn per available slot CLI and agent layer: - Add --feature-ids (comma-separated) and --batch-size CLI args - Add feature_ids parameter to run_autonomous_agent() with batch prompt selection - Add get_batch_feature_prompt() with sequential workflow instructions WebSocket layer (server/websocket.py): - Add BATCH_CODING_AGENT_START_PATTERN and BATCH_FEATURES_COMPLETE_PATTERN - Add _handle_batch_agent_start() and _handle_batch_agent_complete() methods - Add featureIds field to all agent_update messages - Track current_feature_id updates as agent moves through batch Frontend (React UI): - Add featureIds to ActiveAgent and WSAgentUpdateMessage types - Update KanbanColumn and DependencyGraph agent-feature maps for batch - Update AgentCard to show "Batch: #X, #Y, #Z" with active feature highlight - Add "Features per Agent" segmented control (1-3) in SettingsModal Settings integration (full stack): - Add batch_size to schemas, settings router, agent router, process manager - Default batch_size=3, user-configurable 1-3 via settings UI - batch_size=1 is functionally identical to pre-batching behavior Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-01 16:35:07 +02:00
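The commit above says batches are chosen by "dependency chain extension + same-category fill" but does not show the algorithm. The sketch below is one possible reading of that phrase, not the code in parallel_orchestrator.py. The `Feature` shape, the function name `sketch_feature_batches`, and the exact eligibility rules are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Feature:
    # Hypothetical shape; the real model lives in the project's database layer.
    id: int
    category: str
    depends_on: list = field(default_factory=list)


def sketch_feature_batches(features, passing_ids, batch_size=3):
    """Group not-yet-passing features into batches of at most batch_size IDs."""
    pending = {f.id: f for f in features if f.id not in passing_ids}

    def unmet(feature, satisfied_extra):
        # Dependencies that are neither passing nor covered by satisfied_extra.
        return [d for d in feature.depends_on
                if d not in passing_ids and d not in satisfied_extra]

    assigned = set()
    batches = []
    for primary in pending.values():
        # Only a feature with no unmet dependencies can start a batch.
        if primary.id in assigned or unmet(primary, set()):
            continue
        batch = [primary]
        assigned.add(primary.id)

        # Step 1 (assumed): extend along the dependency chain by adding
        # features that are blocked only by members of this batch.
        extended = True
        while extended and len(batch) < batch_size:
            extended = False
            in_batch = {f.id for f in batch}
            for cand in pending.values():
                if cand.id in assigned:
                    continue
                if in_batch & set(cand.depends_on) and not unmet(cand, in_batch):
                    batch.append(cand)
                    assigned.add(cand.id)
                    extended = True
                    break

        # Step 2 (assumed): fill remaining slots with ready features
        # from the same category as the primary feature.
        for cand in pending.values():
            if len(batch) >= batch_size:
                break
            if (cand.id not in assigned and cand.category == primary.category
                    and not unmet(cand, set())):
                batch.append(cand)
                assigned.add(cand.id)

        batches.append([f.id for f in batch])
    return batches
```

With `batch_size=1` both steps are skipped and every ready feature gets its own batch, which matches the commit's note that a batch size of 1 behaves like the pre-batching orchestrator.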
Auto	94e0b05cb1	refactor: optimize token usage, deduplicate code, fix bugs across agents Token reduction (~40% per session, ~2.3M fewer tokens per 200-feature project): - Agent-type-specific tool lists: coding 9, testing 5, init 5 (was 19 for all) - Right-sized max_turns: coding 300, testing 100 (was 1000 for all) - Trimmed coding prompt template (~150 lines removed) - Streamlined testing prompt with batch support - YOLO mode now strips browser testing instructions from prompt - Added Grep, WebFetch, WebSearch to expand project session Performance improvements: - Rate limit retries start at ~15s with jitter (was fixed 60s) - Post-spawn delay reduced to 0.5s (was 2s) - Orchestrator consolidated to 1 DB query per loop (was 5-7) - Testing agents batch 3 features per session (was 1) - Smart context compaction preserves critical state, discards noise Bug fixes: - Removed ghost feature_release_testing MCP tool (wasted tokens every test session) - Forward all 9 Vertex AI env vars to chat sessions (was missing 3) - Fix DetachedInstanceError risk in test batch ORM access - Prevent duplicate testing of same features in parallel mode Code deduplication: - _get_project_path(): 9 copies -> 1 shared utility (project_helpers.py) - validate_project_name(): 9 copies -> 2 variants in 1 file (validation.py) - ROOT_DIR: 10 copies -> 1 definition (chat_constants.py) - API_ENV_VARS: 4 copies -> 1 source of truth (env_constants.py) Security hardening: - Unified sensitive directory blocklist (14 dirs, was two divergent lists) - Cached get_blocked_paths() for O(1) directory listing checks - Terminal security warning when ALLOW_REMOTE=1 exposes WebSocket - 20 new security tests for EXTRA_READ_PATHS blocking - Extracted _validate_command_list() and _validate_pkill_processes() helpers Type safety: - 87 mypy errors -> 0 across 58 source files - Installed types-PyYAML for proper yaml stub types - Fixed SQLAlchemy Column[T] coercions across all routers Dead code removed: - 13 files deleted (~2,679 lines): unused UI components, debug logs, outdated docs - 7 unused npm packages removed (Radix UI components with 0 imports) - AgentAvatar.tsx reduced from 615 -> 119 lines (SVGs extracted to mascotData.tsx) New CLI options: - --testing-batch-size (1-5) for parallel mode test batching - --testing-feature-ids for direct multi-feature testing Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-01 13:16:24 +02:00
Auto	c4d0c6c9b2	fix: address rate limit detection false positives and reset-time cap - Narrow `\boverloaded\b` regex to require server/api/system context, preventing false positives when Claude discusses method/operator overloading in OOP code (C++, Java, C#, etc.) - Restore 24-hour cap for absolute reset-time delays instead of 1-hour clamp, avoiding unnecessary retry loops when rate limits reset hours in the future - Add test for Retry-After: 0 returning 0 (regression lock for the `is not None` fix) - Add false positive tests for "overloaded" in programming context Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-01 10:39:07 +02:00
cabana8471	89f6721cfa	fix: use clamp_retry_delay() for reset-time delays Use the shared clamp_retry_delay() function (1-hour cap) for parsed reset-time delays instead of a separate 24-hour cap. This aligns with the PR's consistent 1-hour maximum delay objective. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-30 21:47:11 +01:00
cabana8471	88c695259f	fix: address 3 new CodeRabbit review comments 1. agent.py: Reset opposite retry counter when entering rate_limit or error status to prevent mixed events from inflating delays 2. rate_limit_utils.py: Fix parse_retry_after() regex to reject minute/hour units - patterns now require explicit "seconds"/"s" unit or end of string 3. test_rate_limit_utils.py: Add tests for "retry after 5 minutes" and other minute/hour variants to ensure they return None Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-30 21:41:01 +01:00
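Item 2 of the commit above tightens parse_retry_after() so that a delay is accepted only when it carries an explicit "seconds"/"s" unit or ends the string. The sketch below illustrates that rule. The regex and the name `parse_retry_after_sketch` are assumptions and may differ from what rate_limit_utils.py actually contains.

```python
import re
from typing import Optional

# Accept "<n> seconds", "<n> s", or "<n>" at end of string; anything else
# (for example "5 minutes" or "2 hours") does not match.
_RETRY_AFTER = re.compile(
    r"retry[-\s]?after[:\s]+(\d+)\s*(?:seconds?\b|s\b|$)",
    re.IGNORECASE,
)


def parse_retry_after_sketch(message: str) -> Optional[int]:
    """Return the delay in seconds, or None when no usable value is found."""
    match = _RETRY_AFTER.search(message)
    return int(match.group(1)) if match else None


delay = parse_retry_after_sketch("Retry-After: 0")
# Compare against None rather than truthiness so a delay of 0 is honoured.
if delay is not None:
    print(f"server asked us to wait {delay}s")
```

The `is not None` comparison mirrors the fix described in commit ff1a63d104, where a bare truthiness check would have discarded `Retry-After: 0`.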
cabana8471	f018b4c1d8	fix: address PR #109 review feedback from leonvanzyl - BLOCKER: Remove clear_stuck_features import and psutil block (doesn't exist in upstream) - Fix overly broad rate limit patterns to avoid false positives - Remove "please wait", "try again later", "limit reached", "429" (bare) - Convert to regex-based detection with word boundaries - Add patterns for "http 429", "status 429", "error 429" - Add bounds checking (1-3600s) for parsed retry delays - Use is_rate_limit_error() consistently instead of inline pattern matching - Extract backoff functions to rate_limit_utils.py for testability - calculate_rate_limit_backoff() for exponential backoff - calculate_error_backoff() for linear backoff - clamp_retry_delay() for safe range enforcement - Rename test_agent.py to test_rate_limit_utils.py (matches module) - Add comprehensive false-positive tests: - Version numbers (v14.29.0) - Issue/PR numbers (#429) - Line numbers (file.py:429) - Port numbers (4293) - Legitimate wait/retry messages Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-30 21:20:52 +01:00
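The commit above replaces broad substring checks with regex patterns that use word boundaries and require context around "429". The following sketch shows the idea. The pattern list and the name `looks_like_rate_limit` are illustrative assumptions, not the contents of RATE_LIMIT_PATTERNS or is_rate_limit_error() in the repository.

```python
import re

# Hypothetical pattern set: a bare "429" is never enough on its own.
RATE_LIMIT_REGEXES = [
    re.compile(r"\brate[\s_-]?limit(?:ed|s)?\b", re.IGNORECASE),
    re.compile(r"\btoo many requests\b", re.IGNORECASE),
    # "429" must follow "http", "status" or "error" and end on a word boundary.
    re.compile(r"\b(?:http|status|error)[\s:]+429\b", re.IGNORECASE),
]


def looks_like_rate_limit(text: str) -> bool:
    """Return True if any context-aware pattern matches the text."""
    return any(rx.search(text) for rx in RATE_LIMIT_REGEXES)


for sample in ("HTTP 429 Too Many Requests", "Fixed in v14.29.0",
               "see issue #429", "file.py:429", "listening on port 4293"):
    print(sample, looks_like_rate_limit(sample))
```

Only the first sample matches. The other four are the false-positive cases the commit lists: version numbers, issue numbers, line numbers, and port numbers.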
cabana8471	cf8dec9abf	fix: address CodeRabbit review - extract rate limit logic to shared module - Create rate_limit_utils.py with shared constants and functions - Update agent.py to import from shared module - Update test_agent.py to import from shared module (removes duplication) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-27 06:58:56 +01:00
cabana8471	ff1a63d104	fix: address CodeRabbit review feedback - Fix comment: "exponential" -> "linear" for error backoff (30 * retries) - Fix rate limit counter reset: only reset when no rate limit signal detected - Apply exponential backoff to rate limit in response text (not just exceptions) - Use explicit `is not None` check for retry_seconds to handle Retry-After: 0 Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-27 06:32:07 +01:00
cabana8471	bf194ad72f	fix: improve rate limit handling with exponential backoff When Claude API hits rate limits via HTTP 429 exceptions (rather than response text), the agent now properly detects and handles them: - Add RATE_LIMIT_PATTERNS constant for comprehensive detection - Add parse_retry_after() to extract wait times from error messages - Add is_rate_limit_error() helper for pattern matching - Return new "rate_limit" status from exception handler - Implement exponential backoff: 60s → 120s → 240s... (max 1 hour) - Improve generic error backoff: 30s → 60s → 90s... (max 5 minutes) - Expand text-based detection patterns in response handling - Add unit tests for new functions Fixes #41 Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-26 22:56:57 +01:00
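The two schedules in the commit above (60s, 120s, 240s up to 1 hour for rate limits; 30s, 60s, 90s up to 5 minutes for other errors) can be written as small pure functions. This is a minimal sketch. The function names and the retry-counter indexing (0-based for rate limits, 1-based for errors, following the "30 * retries" note in commit ff1a63d104) are assumptions.

```python
def rate_limit_backoff(retries: int, base: int = 60, cap: int = 3600) -> int:
    """Exponential backoff: base * 2**retries seconds, capped at one hour."""
    return min(base * (2 ** retries), cap)


def error_backoff(retries: int, step: int = 30, cap: int = 300) -> int:
    """Linear backoff: step * retries seconds, capped at five minutes."""
    return min(step * retries, cap)


print([rate_limit_backoff(n) for n in range(8)])
# [60, 120, 240, 480, 960, 1920, 3600, 3600]
print([error_backoff(n) for n in range(1, 12)])
# [30, 60, 90, 120, 150, 180, 210, 240, 270, 300, 300]
```

Keeping the schedules free of I/O is what makes them easy to unit test, which is the motivation commit f018b4c1d8 gives for moving them into rate_limit_utils.py.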
Auto	b00eef5eca	refactor: orchestrator pre-selects features for all agents Replace agent-initiated feature selection with orchestrator pre-selection for both coding and testing agents. This ensures Mission Control displays correct feature numbers for testing agents (previously showed "Feature #0"). Key changes: MCP Server (mcp_server/feature_mcp.py): - Add feature_get_by_id tool for agents to fetch assigned feature details - Remove obsolete tools: feature_get_next, feature_claim_next, feature_claim_for_testing, feature_get_for_regression - Remove helper functions and unused imports (text, OperationalError, func) Orchestrator (parallel_orchestrator.py): - Change running_testing_agents from list to dict[int, Popen] - Add claim_feature_for_testing() with random selection - Add release_testing_claim() method - Pass --testing-feature-id to spawned testing agents - Use unified [Feature #X] output format for both agent types Agent Entry Points: - autonomous_agent_demo.py: Add --testing-feature-id CLI argument - agent.py: Pass testing_feature_id to get_testing_prompt() Prompt Templates: - coding_prompt.template.md: Update to use feature_get_by_id - testing_prompt.template.md: Update workflow for pre-assigned features - prompts.py: Update pre-claimed headers for both agent types WebSocket (server/websocket.py): - Simplify tracking with unified [Feature #X] pattern - Remove testing-specific parsing code Assistant (server/services/assistant_chat_session.py): - Update help text with current available tools Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-22 16:24:48 +02:00
Auto	29c6b252a9	fix: correct SDK import and clear stale agent UI on stop Changes: - Revert incorrect import from claude_code_sdk to claude_agent_sdk in agent.py (PR #50 introduced an undocumented change to a deprecated package) - Clear activeAgents and recentActivity in useWebSocket when agent stops to prevent stale UI state The claude_code_sdk package is deprecated (last updated Sep 2025) while claude_agent_sdk is the active, maintained package. The import change in PR #50 was undocumented and would have caused ImportError since only claude-agent-sdk is specified in requirements.txt. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-22 09:39:24 +02:00
Leon van Zyl	a71406c2b5	Merge pull request #50 from kunalnano/fix/agent-completion-exit fix: exit agent loop when all features pass	2026-01-22 09:35:56 +02:00
Auto	33e9f38633	fix: prevent testing agents from running indefinitely This fix addresses two root causes that caused testing agents to accumulate (10-12 agents) instead of maintaining a 1:1 ratio with coding agents: 1. Testing agents now exit after one session (agent.py) - Added `or agent_type == "testing"` to the exit condition - Previously, testing agents never hit the exit condition since they're spawned with feature_id=None 2. Testing agents now spawn when coding agents START, not complete - Moved spawn logic from _on_agent_complete() to start_feature() - Removed the old spawn logic from _on_agent_complete() - This ensures proper 1:1 ratio and prevents accumulation Expected behavior after fix: - First coding agent: no testing agent (no passing features yet) - Subsequent coding agents: one testing agent spawns per start - Each testing agent tests ONE feature then terminates immediately Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-22 08:23:47 +02:00
Auto	6c8b463891	fix: use is_initializer instead of undefined is_first_run variable The PR #77 introduced a bug where `is_first_run` was used in the completion detection check, but this variable is only defined when `agent_type is None`. When the orchestrator runs agents with explicit `--agent-type` or `--feature-id`, the variable is undefined causing a NameError crash. Changed to use `is_initializer` which is always defined and has the correct semantic meaning for this check. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-19 09:03:38 +02:00
Leon van Zyl	0391df8fb2	Merge pull request #77 from rohitpalod/fix/agent-completion-detection fix: add completion detection to prevent infinite loop when all features done	2026-01-19 09:00:30 +02:00
Auto	13128361b0	feat: add dedicated testing agents and enhanced parallel orchestration Introduce a new testing agent architecture that runs regression tests independently from coding agents, improving quality assurance in parallel mode. Key changes: Testing Agent System: - Add testing_prompt.template.md for dedicated testing agent role - Add feature_mark_failing MCP tool for regression detection - Add --agent-type flag to select initializer/coding/testing mode - Remove regression testing from coding prompt (now handled by testing agents) Parallel Orchestrator Enhancements: - Add testing agent spawning with configurable ratio (--testing-agent-ratio) - Add comprehensive debug logging system (DebugLog class) - Improve database session management to prevent stale reads - Add engine.dispose() calls to refresh connections after subprocess commits - Fix f-string linting issues (remove unnecessary f-prefixes) UI Improvements: - Add testing agent mascot (Chip) to AgentAvatar - Enhance AgentCard to display testing agent status - Add testing agent ratio slider in SettingsModal - Update WebSocket handling for testing agent updates - Improve ActivityFeed to show testing agent activity API & Server Updates: - Add testing_agent_ratio to settings schema and endpoints - Update process manager to support testing agent type - Enhance WebSocket messages for agent_update events Template Changes: - Delete coding_prompt_yolo.template.md (consolidated into main prompt) - Update initializer_prompt.template.md with improved structure - Streamline coding_prompt.template.md workflow Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-18 13:49:50 +02:00
Rohit Palod	ffdd97a3f7	fix: add completion detection to prevent infinite loop when all features done The agent was running in an infinite loop when all kanban features were completed. This happened because: 1. The main loop in agent.py had no completion detection 2. The coding prompt instructs Claude to run regression tests BEFORE checking for new features 3. feature_get_next() returns "All features passing!" but nothing acted on it This fix adds three completion checks: 1. Pre-loop check: Exits immediately if project is already 100% complete when the agent starts (avoids running unnecessary sessions) 2. Post-session check: After each session, checks if all features are now passing and exits gracefully with a success message 3. Single-feature mode: Exits after one session since the parallel orchestrator manages spawning new agents for other features Tested with a project that had 240/240 features passing - agent now exits immediately with "ALL FEATURES ALREADY COMPLETE!" message. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-18 12:59:37 +05:30
Auto	bf3a6b0b73	feat: add per-agent logging UI and fix stuck agent issues Changes: - Add per-agent log viewer with copy-to-clipboard functionality - New AgentLogEntry type for structured log entries - Logs stored per-agent in WebSocket state (up to 500 entries) - Log modal rendered via React Portal to avoid overflow issues - Click log icon on agent card to view full activity history - Fix agents getting stuck in "failed" state - Wrap client context manager in try/except (agent.py) - Remove failed agents from UI on error state (useWebSocket.ts) - Handle permanently failed features in get_all_complete() - Add friendlier agent state labels - "Hit an issue" → "Trying plan B..." - "Retrying..." → "Being persistent..." - Softer colors (yellow/orange instead of red) - Add scheduling scores for smarter feature ordering - compute_scheduling_scores() in dependency_resolver.py - Features that unblock others get higher priority - Update CLAUDE.md with parallel mode documentation Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-17 14:11:24 +02:00
Auto	85f6940a54	feat: add concurrent agents with dependency system and delightful UI Major feature implementation for parallel agent execution with dependency-aware scheduling and an engaging multi-agent UI experience. Backend Changes: - Add parallel_orchestrator.py for concurrent feature processing - Add api/dependency_resolver.py with cycle detection (Kahn's algorithm + DFS) - Add atomic feature_claim_next() with retry limit and exponential backoff - Fix circular dependency check arguments in 4 locations - Add AgentTracker class for parsing agent output and emitting updates - Add browser isolation with --isolated flag for Playwright MCP - Extend WebSocket protocol with agent_update messages and log attribution - Add WSAgentUpdateMessage schema with agent states and mascot names - Fix WSProgressMessage to include in_progress field New UI Components: - AgentMissionControl: Dashboard showing active agents with collapsible activity - AgentCard: Individual agent status with avatar and thought bubble - AgentAvatar: SVG mascots (Spark, Fizz, Octo, Hoot, Buzz) with animations - ActivityFeed: Recent activity stream with stable keys (no flickering) - CelebrationOverlay: Confetti animation with click/Escape dismiss - DependencyGraph: Interactive node graph visualization with dagre layout - DependencyBadge: Visual indicator for feature dependencies - ViewToggle: Switch between Kanban and Graph views - KeyboardShortcutsHelp: Help overlay accessible via ? key UI/UX Improvements: - Celebration queue system to handle rapid success messages - Accessibility attributes on AgentAvatar (role, aria-label, aria-live) - Collapsible Recent Activity section with persisted preference - Agent count display in header - Keyboard shortcut G to toggle Kanban/Graph view - Real-time thought bubbles and state animations Bug Fixes: - Fix circular dependency validation (swapped source/target arguments) - Add MAX_CLAIM_RETRIES=10 to prevent stack overflow under contention - Fix THOUGHT_PATTERNS to match actual [Tool: name] format - Fix ActivityFeed key prop to prevent re-renders on new items - Add featureId/agentIndex to log messages for proper attribution Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-17 12:59:42 +02:00
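The commit above mentions cycle detection in api/dependency_resolver.py using Kahn's algorithm. As a generic illustration of that algorithm (not the repository's implementation), the sketch below returns a valid build order or `None` when the dependency graph contains a cycle. The function name and the input shape, a mapping from feature ID to the IDs it depends on, are assumptions.

```python
from collections import deque
from typing import Dict, List, Optional


def topological_order(dependencies: Dict[int, List[int]]) -> Optional[List[int]]:
    """Kahn's algorithm: return an order with dependencies first, or None on a cycle."""
    nodes = set(dependencies)
    for deps in dependencies.values():
        nodes.update(deps)

    indegree = {n: 0 for n in nodes}    # number of unmet dependencies per node
    dependents = {n: [] for n in nodes}  # reverse edges: who is waiting on n
    for feature, deps in dependencies.items():
        for dep in deps:
            indegree[feature] += 1
            dependents[dep].append(feature)

    # Start from features that depend on nothing (sorted for a stable result).
    queue = deque(sorted(n for n in nodes if indegree[n] == 0))
    order = []
    while queue:
        node = queue.popleft()
        order.append(node)
        for nxt in dependents[node]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                queue.append(nxt)

    # Any node never reaching indegree 0 sits on, or behind, a cycle.
    return order if len(order) == len(nodes) else None


print(topological_order({1: [], 2: [1], 3: [2]}))  # [1, 2, 3]
print(topological_order({1: [2], 2: [1]}))         # None
```

Because the check is iterative it cannot hit Python's recursion limit on deep dependency chains, which is one common reason to pair it with, or prefer it over, a recursive DFS.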
Al Sharma	3d97cbf24b	fix: exit agent loop when all features pass Previously, the autonomous agent would continue running indefinitely even after all features passed verification. The agent would enter a verification loop, repeatedly checking 'All features are passing!' without ever exiting. This fix detects the completion message from feature_get_next() and gracefully exits the main loop with a victory banner, preventing unnecessary API calls and resource consumption. Fixes infinite loop when project reaches 100% completion.	2026-01-12 14:34:42 -06:00
Corey Cauble	9c07dd72db	Fixed issues requested by coderabbitai Applied Fixes More flexible string matching: Changed from response.lower().strip().startswith("limit reached") to "limit reached" in response.lower() to handle cases where the message has prefix text or variations in whitespace. Improved regex pattern: Updated to r"(?i)\bresets(?:\s+at)?\s+(\d+)(?::(\d+))?\s(am\|pm)\s\(([^)]+)\)" which now handles: Optional "at" after "resets" (e.g., "resets at 3pm" or "resets 3pm") Flexible whitespace around components Word boundaries to prevent partial matches Timezone sanitization: Added .strip() to tz_name = match.group(4).strip() to remove any leading/trailing whitespace that could cause ZoneInfo() to fail. Safety clamp: Added delay_seconds = min(delta.total_seconds(), 24 * 60 * 60) to ensure the delay never exceeds 24 hours, preventing the agent from being stuck waiting for extremely long periods.	2026-01-09 14:35:20 -08:00
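The commit above describes parsing a "resets at 3pm (Timezone)" message, stripping the timezone name, and clamping the wait to 24 hours. The sketch below follows those steps. The pattern is an adaptation rather than a copy of the one quoted in the commit message (it assumes optional whitespace around am/pm and a plain `am|pm` alternation), and the function name and the range checks are assumptions.

```python
import re
from datetime import datetime, timedelta
from typing import Optional
from zoneinfo import ZoneInfo, ZoneInfoNotFoundError

RESET_RE = re.compile(
    r"\bresets(?:\s+at)?\s+(\d+)(?::(\d+))?\s*(am|pm)\s*\(([^)]+)\)",
    re.IGNORECASE,
)


def seconds_until_reset(response: str, now: Optional[datetime] = None) -> Optional[float]:
    """Return seconds until the advertised reset time, capped at 24 hours.

    `now`, if given, must be timezone-aware; it exists to make testing easier.
    """
    match = RESET_RE.search(response)
    if not match:
        return None
    hour = int(match.group(1))
    minute = int(match.group(2) or 0)
    if not (1 <= hour <= 12 and 0 <= minute <= 59):
        return None  # not a valid 12-hour clock time
    meridiem = match.group(3).lower()
    if meridiem == "pm" and hour != 12:
        hour += 12
    elif meridiem == "am" and hour == 12:
        hour = 0
    try:
        # strip() guards against stray whitespace inside the parentheses
        tz = ZoneInfo(match.group(4).strip())
    except (ZoneInfoNotFoundError, ValueError):
        return None
    now = now.astimezone(tz) if now else datetime.now(tz)
    target = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if target <= now:
        target += timedelta(days=1)  # the reset time already passed today
    return min((target - now).total_seconds(), 24 * 60 * 60)
```

A caller would treat `None` as "no reset time found" and fall back to its normal backoff schedule.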
Corey Cauble	2b2e28a2c5	Enhance limit reached message for better context and display in the UI	2026-01-09 14:30:34 -08:00
Corey Cauble	7f436a467b	Implement reset time parsing for auto-continue Added functionality to parse and handle reset time for auto-continue based on agent response when Limit is reached for Claude Code SDK	2026-01-09 12:34:57 -08:00
Auto	122f03dc21	feat: Add GitHub Actions CI for PR protection - Add CI workflow with Python (ruff lint, security tests) and UI (ESLint, TypeScript, build) jobs - Add ruff, mypy, pytest to requirements.txt - Add pyproject.toml with ruff configuration - Fix import sorting across Python files (ruff --fix) - Fix test_security.py expectations to match actual security policy - Remove invalid 'eof' command from ALLOWED_COMMANDS 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-07 10:35:19 +02:00
Auto	05607b310a	feat: Add YOLO mode for rapid prototyping without browser testing Add a new YOLO (You Only Live Once) mode that skips all browser testing and regression tests for faster feature iteration during prototyping. Changes made: Core YOLO Mode Implementation: - Add --yolo CLI flag to autonomous_agent_demo.py - Update agent.py to accept yolo_mode parameter and select appropriate prompt - Modify client.py to conditionally include Playwright MCP server (excluded in YOLO mode) - Add coding_prompt_yolo.template.md with static analysis only verification - Add get_coding_prompt_yolo() to prompts.py Server/API Updates: - Add AgentStartRequest schema with yolo_mode field - Update AgentStatus to include yolo_mode - Modify process_manager.py to pass --yolo flag to subprocess - Update agent router to accept yolo_mode in start request UI Updates: - Add YOLO toggle button (lightning bolt icon) in AgentControl - Show YOLO mode indicator when agent is running in YOLO mode - Add useAgentStatus hook to track current mode - Update startAgent API to accept yoloMode parameter - Add YOLO toggle in SpecCreationChat completion flow Spec Creation Improvements: - Fix create-spec.md to properly replace [FEATURE_COUNT] placeholder - Add REQUIRED FEATURE COUNT section to initializer_prompt.template.md - Fix spec_chat_session.py to create security settings file for Claude SDK - Delete app_spec.txt before spec creation to allow fresh creation Documentation: - Add YOLO mode section to CLAUDE.md with usage examples - Add checkpoint.md slash command for creating detailed commits 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-02 08:36:58 +02:00
Auto	3a085085e4	fix next feature filtering on in progress	2025-12-30 20:29:27 +02:00
Auto	dd7c1ddd82	init	2025-12-30 11:13:18 +02:00

27 Commits