CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
Project Overview
This is an autonomous coding agent system with a React-based UI. It uses the Claude Agent SDK to build complete applications over multiple sessions using a two-agent pattern:
- Initializer Agent - First session reads an app spec and creates features in a SQLite database
- Coding Agent - Subsequent sessions implement features one by one, marking them as passing
Commands
Quick Start (Recommended)
# Windows - launches CLI menu
start.bat
# macOS/Linux
./start.sh
# Launch Web UI (serves pre-built React app)
start_ui.bat # Windows
./start_ui.sh # macOS/Linux
Python Backend (Manual)
# Create and activate virtual environment
python -m venv venv
venv\Scripts\activate # Windows
source venv/bin/activate # macOS/Linux
# Install dependencies
pip install -r requirements.txt
# Run the main CLI launcher
python start.py
# Run agent directly for a project (use absolute path or registered name)
python autonomous_agent_demo.py --project-dir C:/Projects/my-app
python autonomous_agent_demo.py --project-dir my-app # if registered
# YOLO mode: rapid prototyping without browser testing
python autonomous_agent_demo.py --project-dir my-app --yolo
# Parallel mode: run multiple agents concurrently (1-5 agents)
python autonomous_agent_demo.py --project-dir my-app --parallel --max-concurrency 3
YOLO Mode (Rapid Prototyping)
YOLO mode skips regression and browser testing for faster feature iteration:
# CLI
python autonomous_agent_demo.py --project-dir my-app --yolo
# UI: Toggle the lightning bolt button before starting the agent
What's different in YOLO mode:
- No regression testing (skips feature_get_for_regression)
- No Playwright MCP server (browser automation disabled)
- Features marked passing after lint/type-check succeeds
- Faster iteration for prototyping
What's the same:
- Lint and type-check still run to verify code compiles
- Feature MCP server for tracking progress
- All other development tools available
When to use: Early prototyping when you want to quickly scaffold features without verification overhead. Switch back to standard mode for production-quality development.
React UI (in ui/ directory)
cd ui
npm install
npm run dev # Development server (hot reload)
npm run build # Production build (required for start_ui.bat)
npm run lint # Run ESLint
Note: The start_ui.bat script serves the pre-built UI from ui/dist/. After making UI changes, run npm run build in the ui/ directory.
Architecture
Core Python Modules
- start.py - CLI launcher with project creation/selection menu
- autonomous_agent_demo.py - Entry point for running the agent
- agent.py - Agent session loop using Claude Agent SDK
- client.py - ClaudeSDKClient configuration with security hooks and MCP servers
- security.py - Bash command allowlist validation (ALLOWED_COMMANDS whitelist)
- prompts.py - Prompt template loading with project-specific fallback
- progress.py - Progress tracking, database queries, webhook notifications
- registry.py - Project registry for mapping names to paths (cross-platform)
- parallel_orchestrator.py - Concurrent agent execution with dependency-aware scheduling
- api/dependency_resolver.py - Cycle detection (Kahn's algorithm + DFS) and dependency validation
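To illustrate the cycle detection attributed to api/dependency_resolver.py, here is a minimal sketch of Kahn's algorithm. It is not the repository's code: the function name has_cycle and the dict-of-lists input shape are assumptions made for this example.

```python
from collections import deque


def has_cycle(dependencies: dict[int, list[int]]) -> bool:
    """Return True if the dependency graph contains a cycle.

    `dependencies` maps a feature id to the ids it depends on
    (hypothetical shape, chosen for this sketch).
    """
    # Collect every node, including ones that only appear as a dependency.
    nodes = set(dependencies)
    for deps in dependencies.values():
        nodes.update(deps)

    # in_degree[n] counts the unmet dependencies of n.
    in_degree = {node: 0 for node in nodes}
    dependents: dict[int, list[int]] = {node: [] for node in nodes}
    for node, deps in dependencies.items():
        for dep in deps:
            in_degree[node] += 1
            dependents[dep].append(node)

    # Kahn's algorithm: repeatedly remove nodes with no unmet dependencies.
    queue = deque(node for node, degree in in_degree.items() if degree == 0)
    visited = 0
    while queue:
        current = queue.popleft()
        visited += 1
        for dependent in dependents[current]:
            in_degree[dependent] -= 1
            if in_degree[dependent] == 0:
                queue.append(dependent)

    # Any node that was never removed sits on (or behind) a cycle.
    return visited != len(nodes)
```

A check like this would typically run before a new dependency edge is stored, so that an edge closing a loop can be rejected.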
Project Registry
Projects can be stored in any directory. The registry maps project names to paths using SQLite:
- All platforms: ~/.autocoder/registry.db
The registry uses:
- SQLite database with SQLAlchemy ORM
- POSIX path format (forward slashes) for cross-platform compatibility
- SQLite's built-in transaction handling for concurrency safety
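The name-to-path mapping can be sketched as follows. This is an illustration only: it uses the standard-library sqlite3 module instead of SQLAlchemy to stay self-contained, and the table name projects and the functions open_registry, register_project and resolve_project are hypothetical.

```python
import sqlite3
from pathlib import PurePath


def open_registry(db_path: str) -> sqlite3.Connection:
    """Open (or create) a registry database with one name -> path table."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS projects ("
        "name TEXT PRIMARY KEY, path TEXT NOT NULL)"
    )
    return conn


def register_project(conn: sqlite3.Connection, name: str, project_dir: PurePath) -> None:
    """Store the project path in POSIX form (forward slashes)."""
    # The connection context manager commits on success and rolls back on error.
    with conn:
        conn.execute(
            "INSERT OR REPLACE INTO projects (name, path) VALUES (?, ?)",
            (name, project_dir.as_posix()),
        )


def resolve_project(conn: sqlite3.Connection, name_or_path: str) -> str:
    """Return the registered path for a name, or the input itself if unregistered."""
    row = conn.execute(
        "SELECT path FROM projects WHERE name = ?", (name_or_path,)
    ).fetchone()
    return row[0] if row else name_or_path
```

Storing the as_posix() form means a path registered on Windows, such as C:\Projects\my-app, is saved as C:/Projects/my-app and reads the same on every platform.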
Server API (server/)
The FastAPI server provides REST endpoints for the UI:
- server/routers/projects.py - Project CRUD with registry integration
- server/routers/features.py - Feature management
- server/routers/agent.py - Agent control (start/stop/pause/resume)
- server/routers/filesystem.py - Filesystem browser API with security controls
- server/routers/spec_creation.py - WebSocket for interactive spec creation
Feature Management
Features are stored in SQLite (features.db) via SQLAlchemy. The agent interacts with features through an MCP server:
- mcp_server/feature_mcp.py - MCP server exposing feature management tools
- api/database.py - SQLAlchemy models (Feature table with priority, category, name, description, steps, passes, dependencies)
MCP tools available to the agent:
- feature_get_stats - Progress statistics
- feature_get_next - Get highest-priority pending feature (respects dependencies)
- feature_claim_next - Atomically claim next available feature (for parallel mode)
- feature_get_for_regression - Random passing features for regression testing
- feature_mark_passing - Mark feature complete
- feature_skip - Move feature to end of queue
- feature_create_bulk - Initialize all features (used by initializer)
- feature_add_dependency - Add dependency between features (with cycle detection)
- feature_remove_dependency - Remove a dependency
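The selection rule behind feature_get_next (highest priority, pending, all dependencies passing) can be sketched in a few lines. The Feature dataclass and get_next function below are hypothetical stand-ins, and the sketch assumes that a lower priority number means higher priority, which the source does not state.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Feature:
    id: int
    priority: int  # assumption: lower number = higher priority
    passes: bool = False
    dependencies: list[int] = field(default_factory=list)


def get_next(features: list[Feature]) -> Optional[Feature]:
    """Pick the highest-priority pending feature whose dependencies all pass."""
    passing = {feature.id for feature in features if feature.passes}
    ready = [
        feature
        for feature in features
        if not feature.passes
        and all(dep in passing for dep in feature.dependencies)
    ]
    # default=None covers the case where nothing is pending or everything is blocked.
    return min(ready, key=lambda feature: feature.priority, default=None)
```

A feature with an unmet dependency is skipped even if its priority is higher than that of every ready feature.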
React UI (ui/)
- Tech stack: React 18, TypeScript, TanStack Query, Tailwind CSS v4, Radix UI, dagre (graph layout)
- src/App.tsx - Main app with project selection, kanban board, agent controls
- src/hooks/useWebSocket.ts - Real-time updates via WebSocket (progress, agent status, logs, agent updates)
- src/hooks/useProjects.ts - React Query hooks for API calls
- src/lib/api.ts - REST API client
- src/lib/types.ts - TypeScript type definitions
Key components:
- AgentMissionControl.tsx - Dashboard showing active agents with mascots (Spark, Fizz, Octo, Hoot, Buzz)
- DependencyGraph.tsx - Interactive node graph visualization with dagre layout
- CelebrationOverlay.tsx - Confetti animation on feature completion
- FolderBrowser.tsx - Server-side filesystem browser for project folder selection
Keyboard shortcuts (press ? for help):
- D - Toggle debug panel
- G - Toggle Kanban/Graph view
- N - Add new feature
- A - Toggle AI assistant
- , - Open settings
Project Structure for Generated Apps
Projects can be stored in any directory (registered in ~/.autocoder/registry.db). Each project contains:
- prompts/app_spec.txt - Application specification (XML format)
- prompts/initializer_prompt.md - First session prompt
- prompts/coding_prompt.md - Continuation session prompt
- features.db - SQLite database with feature test cases
- .agent.lock - Lock file to prevent multiple agent instances
- .autocoder/allowed_commands.yaml - Project-specific bash command allowlist (optional)
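One common way a lock file such as .agent.lock can prevent a second agent instance is exclusive file creation. The sketch below shows that technique under stated assumptions: the functions acquire_lock and release_lock are hypothetical, and writing the process id into the file is a choice made for this example, not documented behavior.

```python
import os
from pathlib import Path


def acquire_lock(project_dir: Path) -> bool:
    """Try to create .agent.lock; return False if another instance holds it."""
    lock_path = project_dir / ".agent.lock"
    try:
        # O_EXCL makes creation fail if the file already exists, so the
        # existence check and the create happen as a single atomic step.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as handle:
        handle.write(str(os.getpid()))  # assumption: record the owner's PID
    return True


def release_lock(project_dir: Path) -> None:
    """Remove the lock file; a missing file is not an error."""
    (project_dir / ".agent.lock").unlink(missing_ok=True)
```

A stale lock left behind by a crashed process is not handled here; a fuller version would need some way to detect that the recorded owner is gone.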
Security Model
Defense-in-depth approach configured in client.py:
- OS-level sandbox for bash commands
- Filesystem restricted to project directory only
- Bash commands validated using hierarchical allowlist system
Per-Project Allowed Commands
The agent's bash command access is controlled through a hierarchical configuration system:
Command Hierarchy (highest to lowest priority):
- Hardcoded Blocklist (security.py) - NEVER allowed (dd, sudo, shutdown, etc.)
- Org Blocklist (~/.autocoder/config.yaml) - Cannot be overridden by projects
- Org Allowlist (~/.autocoder/config.yaml) - Available to all projects
- Global Allowlist (security.py) - Default commands (npm, git, curl, etc.)
- Project Allowlist (.autocoder/allowed_commands.yaml) - Project-specific commands
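The priority order above can be read as "blocklists first, then any allowlist". The following sketch shows that reading for exact command names only. The function is_command_allowed and its parameters are hypothetical and do not reflect the actual interface of security.py.

```python
def is_command_allowed(
    command: str,
    *,
    hardcoded_blocklist: set[str],
    org_blocklist: set[str],
    org_allowlist: set[str],
    global_allowlist: set[str],
    project_allowlist: set[str],
) -> bool:
    """Resolve a command name against the five layers, highest priority first."""
    # Blocklists win over every allowlist, so they are checked first.
    if command in hardcoded_blocklist or command in org_blocklist:
        return False
    # Once the blocklists have passed, any allowlist layer may grant access.
    return (
        command in org_allowlist
        or command in global_allowlist
        or command in project_allowlist
    )
```

With this ordering, a project that lists sudo or an org-blocked command in its own allowlist still gets a denial, which matches the "cannot be overridden" rule.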
Project Configuration:
Each project can define custom allowed commands in .autocoder/allowed_commands.yaml:
version: 1
commands:
# Exact command names
- name: swift
description: Swift compiler
# Prefix wildcards (matches swiftc, swiftlint, swiftformat)
- name: swift*
description: All Swift development tools
# Local project scripts
- name: ./scripts/build.sh
description: Project build script
Organization Configuration:
System administrators can set org-wide policies in ~/.autocoder/config.yaml:
version: 1
# Commands available to ALL projects
allowed_commands:
- name: jq
description: JSON processor
# Commands blocked across ALL projects (cannot be overridden)
blocked_commands:
- aws # Prevent accidental cloud operations
- kubectl # Block production deployments
Pattern Matching:
- Exact: swift matches only swift
- Wildcard: swift* matches swift, swiftc, swiftlint, etc.
- Scripts: ./scripts/build.sh matches the script by name from any directory
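The three pattern kinds can be sketched as a single matching function. This is an interpretation, not the code in security.py: matches_pattern is a hypothetical name, "by name from any directory" is read here as a comparison of file names, and rejecting a bare * is an assumption added so the sketch cannot match every command.

```python
import posixpath


def matches_pattern(command: str, pattern: str) -> bool:
    """Check one command against one allowlist pattern."""
    if pattern == "*":
        # Assumption: a bare wildcard is rejected rather than matching everything.
        return False
    if pattern.endswith("*"):
        # Prefix wildcard: swift* matches swift, swiftc, swiftlint, ...
        return command.startswith(pattern[:-1])
    if "/" in pattern:
        # Script path: compare file names only, ignoring the directory part.
        return posixpath.basename(command) == posixpath.basename(pattern)
    # Exact command name.
    return command == pattern
```

For example, matches_pattern("swiftc", "swift") is false while matches_pattern("swiftc", "swift*") is true.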
Limits:
- Maximum 100 commands per project config
- Blocklisted commands (sudo, dd, shutdown, etc.) can NEVER be allowed
- Org-level blocked commands cannot be overridden by project configs
Testing:
# Unit tests (136 tests - fast)
python test_security.py
# Integration tests (9 tests - uses real hooks)
python test_security_integration.py
Files:
- security.py - Command validation logic and hardcoded blocklist
- test_security.py - Unit tests for security system (136 tests)
- test_security_integration.py - Integration tests with real hooks (9 tests)
- TEST_SECURITY.md - Quick testing reference guide
- examples/project_allowed_commands.yaml - Project config example (all commented by default)
- examples/org_config.yaml - Org config example (all commented by default)
- examples/README.md - Comprehensive guide with use cases, testing, and troubleshooting
- PHASE3_SPEC.md - Specification for mid-session approval feature (future enhancement)
Claude Code Integration
- .claude/commands/create-spec.md - /create-spec slash command for interactive spec creation
- .claude/skills/frontend-design/SKILL.md - Skill for distinctive UI design
- .claude/templates/ - Prompt templates copied to new projects
- examples/ - Configuration examples and documentation for security settings
Key Patterns
Prompt Loading Fallback Chain
- Project-specific: {project_dir}/prompts/{name}.md
- Base template: .claude/templates/{name}.template.md
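The fallback chain amounts to "first existing file wins". A minimal sketch follows, assuming a hypothetical load_prompt function that receives both directories as arguments; the real prompts.py may locate the template directory differently.

```python
from pathlib import Path


def load_prompt(name: str, project_dir: Path, templates_dir: Path) -> str:
    """Return the project-specific prompt if present, else the base template."""
    candidates = [
        project_dir / "prompts" / f"{name}.md",   # project-specific override
        templates_dir / f"{name}.template.md",    # base template
    ]
    for candidate in candidates:
        if candidate.is_file():
            return candidate.read_text(encoding="utf-8")
    searched = ", ".join(str(candidate) for candidate in candidates)
    raise FileNotFoundError(f"No prompt named {name!r}; looked in: {searched}")
```

Keeping the candidates in an ordered list makes it easy to see the precedence, and to add a further fallback level later if one is needed.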
Agent Session Flow
- Check if features.db has features (determines initializer vs coding agent)
- Create ClaudeSDKClient with security settings
- Send prompt and stream response
- Auto-continue with 3-second delay between sessions
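The control flow above can be outlined without touching the SDK by passing the session work in as callables. Everything in this sketch is hypothetical scaffolding: run_sessions, has_features and run_session are invented names, and max_sessions is an assumed stop condition that the source does not describe. Only the prompt names and the 3-second delay come from this document.

```python
import time
from typing import Callable

AUTO_CONTINUE_DELAY_SECONDS = 3  # delay between sessions, as stated above


def run_sessions(
    has_features: Callable[[], bool],
    run_session: Callable[[str], None],
    max_sessions: int,
    sleep: Callable[[float], None] = time.sleep,
) -> list[str]:
    """Run up to max_sessions sessions and return the prompt used for each."""
    history: list[str] = []
    for index in range(max_sessions):
        # An empty feature database means the initializer has not run yet.
        prompt_name = "coding_prompt" if has_features() else "initializer_prompt"
        run_session(prompt_name)  # stand-in for: create client, send prompt, stream
        history.append(prompt_name)
        if index < max_sessions - 1:
            sleep(AUTO_CONTINUE_DELAY_SECONDS)
    return history
```

Injecting sleep as a parameter keeps the loop testable without real waiting.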
Real-time UI Updates
The UI receives updates via WebSocket (/ws/projects/{project_name}):
- progress - Test pass counts (passing, in_progress, total)
- agent_status - Running/paused/stopped/crashed
- log - Agent output lines with optional featureId/agentIndex for attribution
- feature_update - Feature status changes
- agent_update - Multi-agent state updates (thinking/working/testing/success/error) with mascot names
Parallel Mode
When running with --parallel, the orchestrator:
- Spawns multiple Claude agents as subprocesses (up to --max-concurrency)
- Each agent claims features atomically via feature_claim_next
- Features blocked by unmet dependencies are skipped
- Browser contexts are isolated per agent using --isolated flag
- AgentTracker parses output and emits agent_update messages for UI
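Atomic claiming is what keeps two agents from working on the same feature. One way to get that property in SQLite is a conditional UPDATE that only succeeds while the row is still unclaimed. The sketch below illustrates that idea with the standard-library sqlite3 module; the claimed_by column and the claim_next function are hypothetical, and dependency checks are left out for brevity.

```python
import sqlite3
from typing import Optional


def claim_next(conn: sqlite3.Connection, agent_id: str) -> Optional[int]:
    """Claim the next pending, unclaimed feature; return its id or None."""
    while True:
        row = conn.execute(
            "SELECT id FROM features "
            "WHERE passes = 0 AND claimed_by IS NULL "
            "ORDER BY priority, id LIMIT 1"
        ).fetchone()
        if row is None:
            return None  # nothing left to claim
        with conn:  # commit on success, roll back on error
            cursor = conn.execute(
                # The IS NULL guard makes the claim a compare-and-set:
                # it changes no rows if another agent got there first.
                "UPDATE features SET claimed_by = ? "
                "WHERE id = ? AND claimed_by IS NULL",
                (agent_id, row[0]),
            )
        if cursor.rowcount == 1:
            return row[0]
        # Lost the race for this row; loop and try the next candidate.
```

The SELECT alone is not safe against races; it is the guarded UPDATE and the rowcount check that decide which agent owns the feature.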
Process Limits (Parallel Mode)
The orchestrator enforces strict bounds on concurrent processes:
- MAX_PARALLEL_AGENTS = 5 - Maximum concurrent coding agents
- MAX_TOTAL_AGENTS = 10 - Hard limit on total agents (coding + testing)
- Testing agents are capped at max_concurrency (same as coding agents)
Expected process count during normal operation:
- 1 orchestrator process
- Up to 5 coding agents
- Up to 5 testing agents
- Total: never exceeds 11 Python processes
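The bound of 11 follows from the two constants: one orchestrator plus at most MAX_TOTAL_AGENTS agents. A small sketch of the arithmetic and of a spawn check is shown below. The two constants come from this document, while the functions max_python_processes and can_spawn_testing_agent are hypothetical and not taken from parallel_orchestrator.py.

```python
MAX_PARALLEL_AGENTS = 5   # maximum concurrent coding agents
MAX_TOTAL_AGENTS = 10     # hard limit on coding + testing agents


def max_python_processes(max_concurrency: int) -> int:
    """Upper bound on Python processes: orchestrator + coding + testing agents."""
    per_kind = min(max_concurrency, MAX_PARALLEL_AGENTS)
    return 1 + min(2 * per_kind, MAX_TOTAL_AGENTS)


def can_spawn_testing_agent(coding: int, testing: int, max_concurrency: int) -> bool:
    """Allow a new testing agent only if both caps still have room."""
    per_kind = min(max_concurrency, MAX_PARALLEL_AGENTS)
    return testing < per_kind and coding + testing < MAX_TOTAL_AGENTS
```

With --max-concurrency 5 this gives 1 + 5 + 5 = 11 processes, and with --max-concurrency 3 it gives 1 + 3 + 3 = 7.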
Stress Test Verification:
# Windows - verify process bounds
# 1. Note baseline count
tasklist | findstr python | find /c /v ""
# 2. Start parallel agent (max concurrency)
python autonomous_agent_demo.py --project-dir test --parallel --max-concurrency 5
# 3. During run - should NEVER exceed baseline + 11
tasklist | findstr python | find /c /v ""
# 4. After stop via UI - should return to baseline
tasklist | findstr python | find /c /v ""
# macOS/Linux - verify process bounds
# 1. Note baseline count
pgrep python | wc -l
# 2. Start parallel agent
python autonomous_agent_demo.py --project-dir test --parallel --max-concurrency 5
# 3. During run - should NEVER exceed baseline + 11
pgrep python | wc -l
# 4. After stop - should return to baseline
pgrep python | wc -l
Log Verification:
# Check spawn vs completion balance
grep "Started testing agent" orchestrator_debug.log | wc -l
grep -E "Testing agent.*(completed|failed)" orchestrator_debug.log | wc -l
# Watch for cap enforcement messages
grep "at max testing agents\|At max total agents" orchestrator_debug.log
Design System
The UI uses a neobrutalism design with Tailwind CSS v4:
- CSS variables defined in ui/src/styles/globals.css via @theme directive
- Custom animations: animate-slide-in, animate-pulse-neo, animate-shimmer
- Color tokens: --color-neo-pending (yellow), --color-neo-progress (cyan), --color-neo-done (green)