mirror of https://github.com/leonvanzyl/autocoder.git synced 2026-02-05 08:23:08 +00:00

Auto c55a1a0182 fix: harden dev server RCE mitigations from PR #153

Address security gaps and improve validation in the dev server command
execution path introduced by PR #153:

Security fixes (critical):
- Add missing shell metacharacters to dangerous_ops blocklist: single &
  (Windows cmd.exe command separator), >, <, ^, %, \n, \r
- The single & gap was a confirmed RCE bypass on Windows where .cmd
  files are always executed via cmd.exe even with shell=False (CPython
  limitation documented in issue #77696)
- Apply validate_custom_command_strict at /start endpoint for
  defense-in-depth against config file tampering

Validation improvements:
- Fix uvicorn --flag=value syntax (split on = before comparing)
- Expand Python support: Django (manage.py), Flask, custom .py scripts
- Add runners: flask, poetry, cargo, go, npx
- Expand npm script allowlist: serve, develop, server, preview
- Reorder PATCH /config validation to run strict check first (fail fast)
- Extract constants: ALLOWED_NPM_SCRIPTS, ALLOWED_PYTHON_MODULES,
  BLOCKED_SHELLS for reuse and testability

Cleanup:
- Remove unused security.py imports from dev_server_manager.py
- Fix deprecated datetime.utcnow() -> datetime.now(timezone.utc)
- Remove unnecessary _remove_lock() in exception handlers where lock
  was never created (Popen failure path)

Tests:
- Add test_devserver_security.py with 78 tests covering valid commands,
  blocked shells, blocked commands, injection attempts, dangerous_ops
  blocking, and constant verification

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

2026-02-05 08:52:47 +02:00

.claude

rebrand: rename AutoCoder to AutoForge across entire codebase

2026-02-04 12:02:06 +02:00

.github/workflows

feat: Add GitHub Actions CI for PR protection

2026-01-07 10:35:19 +02:00

api

rebrand: rename AutoCoder to AutoForge across entire codebase

2026-02-04 12:02:06 +02:00

bin

feat: add npm global package for one-command install

2026-02-04 14:48:00 +02:00

examples

rebrand: rename AutoCoder to AutoForge across entire codebase

2026-02-04 12:02:06 +02:00

lib

feat: add npm global package for one-command install

2026-02-04 14:48:00 +02:00

mcp_server

refactor: optimize token usage, deduplicate code, fix bugs across agents

2026-02-01 13:16:24 +02:00

server

fix: harden dev server RCE mitigations from PR #153

2026-02-05 08:52:47 +02:00

version patch

2026-02-04 15:41:15 +02:00

.env.example

rebrand: rename AutoCoder to AutoForge across entire codebase

2026-02-04 12:02:06 +02:00

.gitignore

refactor: extract docs to standalone site at autoforge.cc

2026-02-04 15:36:55 +02:00

.npmignore

feat: add npm global package for one-command install

2026-02-04 14:48:00 +02:00

agent.py

feat: add multi-feature batching for coding agents

2026-02-01 16:35:07 +02:00

auth.py

fix: consolidate auth error handling and fix start.bat credential check

2026-01-10 12:19:32 +02:00

autoforge_paths.py

rebrand: rename AutoCoder to AutoForge across entire codebase

2026-02-04 12:02:06 +02:00

autonomous_agent_demo.py

fix: add automatic temp folder cleanup at Maestro startup

2026-02-05 00:08:26 +01:00

CLAUDE.md

refactor: extract docs to standalone site at autoforge.cc

2026-02-04 15:36:55 +02:00

client.py

rebrand: rename AutoCoder to AutoForge across entire codebase

2026-02-04 12:02:06 +02:00

env_constants.py

rebrand: rename AutoCoder to AutoForge across entire codebase

2026-02-04 12:02:06 +02:00

LICENSE.md

fix dotenv dependency, also add license agreement

2026-01-06 17:03:35 +02:00

package.json

0.1.1

2026-02-04 15:39:46 +02:00

parallel_orchestrator.py

rebrand: rename AutoCoder to AutoForge across entire codebase

2026-02-04 12:02:06 +02:00

progress.py

rebrand: rename AutoCoder to AutoForge across entire codebase

2026-02-04 12:02:06 +02:00

prompts.py

rebrand: rename AutoCoder to AutoForge across entire codebase

2026-02-04 12:02:06 +02:00

pyproject.toml

feat: Add GitHub Actions CI for PR protection

2026-01-07 10:35:19 +02:00

rate_limit_utils.py

refactor: optimize token usage, deduplicate code, fix bugs across agents

2026-02-01 13:16:24 +02:00

README.md

feat: add npm global package for one-command install

2026-02-04 14:48:00 +02:00

registry.py

rebrand: rename AutoCoder to AutoForge across entire codebase

2026-02-04 12:02:06 +02:00

requirements-prod.txt

feat: add npm global package for one-command install

2026-02-04 14:48:00 +02:00

requirements.txt

refactor: optimize token usage, deduplicate code, fix bugs across agents

2026-02-01 13:16:24 +02:00

security.py

rebrand: rename AutoCoder to AutoForge across entire codebase

2026-02-04 12:02:06 +02:00

start_ui.bat

rebrand: rename AutoCoder to AutoForge across entire codebase

2026-02-04 12:02:06 +02:00

start_ui.py

rebrand: rename AutoCoder to AutoForge across entire codebase

2026-02-04 12:02:06 +02:00

start_ui.sh

rebrand: rename AutoCoder to AutoForge across entire codebase

2026-02-04 12:02:06 +02:00

start.bat

rebrand: rename AutoCoder to AutoForge across entire codebase

2026-02-04 12:02:06 +02:00

start.py

rebrand: rename AutoCoder to AutoForge across entire codebase

2026-02-04 12:02:06 +02:00

start.sh

rebrand: rename AutoCoder to AutoForge across entire codebase

2026-02-04 12:02:06 +02:00

temp_cleanup.py

fix: add automatic temp folder cleanup at Maestro startup

2026-02-05 00:08:26 +01:00

test_client.py

refactor: optimize token usage, deduplicate code, fix bugs across agents

2026-02-01 13:16:24 +02:00

test_dependency_resolver.py

fix: align security_settings with permission_mode + add dependency tests

2026-01-29 08:04:01 +02:00

test_devserver_security.py

fix: harden dev server RCE mitigations from PR #153

2026-02-05 08:52:47 +02:00

test_rate_limit_utils.py

refactor: optimize token usage, deduplicate code, fix bugs across agents

2026-02-01 13:16:24 +02:00

test_security_integration.py

rebrand: rename AutoCoder to AutoForge across entire codebase

2026-02-04 12:02:06 +02:00

test_security.py

rebrand: rename AutoCoder to AutoForge across entire codebase

2026-02-04 12:02:06 +02:00

README.md

AutoForge

A long-running autonomous coding agent powered by the Claude Agent SDK. This tool can build complete applications over multiple sessions using a two-agent pattern (initializer + coding agent). It includes a React-based UI for monitoring progress in real time.

Video Tutorial

Watch the setup and usage guide →

Prerequisites

Node.js 20+ - Required for the CLI
Python 3.11+ - Auto-detected on first run (download)
Claude Code CLI - Install and authenticate (see below)

Claude Code CLI (Required)

macOS / Linux:

curl -fsSL https://claude.ai/install.sh | bash

Windows (PowerShell):

irm https://claude.ai/install.ps1 | iex

Authentication

You need one of the following:

Claude Pro/Max Subscription - Use claude login to authenticate (recommended)
Anthropic API Key - Pay-per-use from https://console.anthropic.com/

Quick Start

Option 1: npm Install (Recommended)

npm install -g autoforge-ai
autoforge

On first run, AutoForge automatically:

Checks for Python 3.11+
Creates a virtual environment at ~/.autoforge/venv/
Installs Python dependencies
Copies a default config file to ~/.autoforge/.env
Starts the server and opens your browser

CLI Commands

autoforge                       Start the server (default)
autoforge config                Open ~/.autoforge/.env in $EDITOR
autoforge config --path         Print config file path
autoforge config --show         Show active configuration values
autoforge --port PORT           Custom port (default: auto from 8888)
autoforge --host HOST           Custom host (default: 127.0.0.1)
autoforge --no-browser          Don't auto-open browser
autoforge --repair              Delete and recreate virtual environment
autoforge --version             Print version
autoforge --help                Show help

Option 2: From Source (Development)

Clone the repository and use the start scripts directly. This is the recommended path if you want to contribute or modify AutoForge itself.

git clone https://github.com/leonvanzyl/autoforge.git
cd autoforge

Web UI:

Platform	Command
Windows	`start_ui.bat`
macOS / Linux	`./start_ui.sh`

This launches the React-based web UI at http://localhost:5173 with:

Project selection and creation
Kanban board view of features
Real-time agent output streaming
Start/pause/stop controls

CLI Mode:

Platform	Command
Windows	`start.bat`
macOS / Linux	`./start.sh`

The start script will:

Check if Claude CLI is installed
Check if you're authenticated (prompt to run claude login if not)
Create a Python virtual environment
Install dependencies
Launch the main menu

Creating or Continuing a Project

You'll see options to:

Create new project - Start a fresh project with AI-assisted spec generation
Continue existing project - Resume work on a previous project

For new projects, you can use the built-in /create-spec command to interactively create your app specification with Claude's help.

How It Works

Two-Agent Pattern

Initializer Agent (First Session): Reads your app specification, creates features in a SQLite database (features.db), sets up the project structure, and initializes git.
Coding Agent (Subsequent Sessions): Picks up where the previous session left off, implements features one by one, and marks them as passing in the database.

Feature Management

Features are stored in SQLite via SQLAlchemy and managed through an MCP server that exposes tools to the agent:

feature_get_stats - Progress statistics
feature_get_next - Get highest-priority pending feature
feature_get_for_regression - Random passing features for regression testing
feature_mark_passing - Mark feature complete
feature_skip - Move feature to end of queue
feature_create_bulk - Initialize all features (used by initializer)
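
The queue behaviour behind these tools can be sketched with the standard-library sqlite3 module. This is a minimal illustration, not the project's implementation: the table layout, the column names (priority, passes) and the rule that a lower priority value is picked first are all assumptions, and the real model lives in api/database.py behind SQLAlchemy.

```python
import sqlite3

# Hypothetical schema for illustration only; the real Feature model is
# defined with SQLAlchemy in api/database.py and may differ.
SCHEMA = """
CREATE TABLE features (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    priority INTEGER NOT NULL,
    passes INTEGER NOT NULL DEFAULT 0
)
"""


def create_bulk(conn, names):
    """Insert features in order; a lower priority value is picked first (assumption)."""
    conn.executemany(
        "INSERT INTO features (name, priority) VALUES (?, ?)",
        [(name, position) for position, name in enumerate(names, start=1)],
    )


def get_next(conn):
    """Return (id, name) of the highest-priority pending feature, or None."""
    return conn.execute(
        "SELECT id, name FROM features WHERE passes = 0 "
        "ORDER BY priority, id LIMIT 1"
    ).fetchone()


def mark_passing(conn, feature_id):
    """Mark a feature as complete."""
    conn.execute("UPDATE features SET passes = 1 WHERE id = ?", (feature_id,))


def skip(conn, feature_id):
    """Move a feature to the end of the queue by giving it the largest priority value."""
    (max_priority,) = conn.execute("SELECT MAX(priority) FROM features").fetchone()
    conn.execute(
        "UPDATE features SET priority = ? WHERE id = ?",
        (max_priority + 1, feature_id),
    )


def get_stats(conn):
    """Return passing/total counts and the percentage of passing features."""
    passing, total = conn.execute(
        "SELECT COALESCE(SUM(passes), 0), COUNT(*) FROM features"
    ).fetchone()
    percentage = round(100 * passing / total, 1) if total else 0.0
    return {"passing": passing, "total": total, "percentage": percentage}


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
create_bulk(conn, ["login form", "signup form", "dashboard"])
skip(conn, 1)                  # "login form" goes to the back of the queue
first = get_next(conn)         # the next pending feature is now "signup form"
mark_passing(conn, first[0])
stats = get_stats(conn)
```

Keeping the queue order in a single integer column means skipping a feature is one UPDATE, and the agent never needs to hold queue state in its context window between sessions.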

Session Management

Each session runs with a fresh context window
Progress is persisted via SQLite database and git commits
The agent auto-continues between sessions (3-second delay)
Press Ctrl+C to pause; run the start script again to resume
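
The auto-continue behaviour can be pictured as a small loop. This is a hedged sketch only: run_one_session is a hypothetical callable that returns True while features remain, the constant name is invented for illustration, and the real loop in agent.py may be structured quite differently.

```python
import time

# The 3-second value comes from the list above; the constant name is an assumption.
AUTO_CONTINUE_DELAY_SECONDS = 3


def run_sessions(run_one_session, delay=AUTO_CONTINUE_DELAY_SECONDS, sleep=time.sleep):
    """Run sessions back to back until no work remains or the user presses Ctrl+C.

    run_one_session is a hypothetical callable that runs one session with a
    fresh context and returns True while pending features remain.
    Returns the number of sessions that ran to completion.
    """
    completed = 0
    try:
        while True:
            more_work = run_one_session()
            completed += 1
            if not more_work:
                break
            sleep(delay)  # short pause before the next session starts
    except KeyboardInterrupt:
        # Ctrl+C pauses the run; progress already lives in SQLite and git,
        # so starting the script again resumes from that state.
        pass
    return completed
```

Because every session starts from persisted state rather than in-memory state, interrupting the loop at any point loses at most the session that was in flight.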

Important Timing Expectations

Note: Building complete applications takes time!

First session (initialization): The agent generates feature test cases. This takes several minutes and may appear to hang - this is normal.
Subsequent sessions: Each coding iteration can take 5-15 minutes depending on complexity.
Full app: Building all features typically requires many hours of total runtime across multiple sessions.

Tip: The feature count in the prompts determines scope. For faster demos, you can modify your app spec to target fewer features (e.g., 20-50 features for a quick demo).

Project Structure

autoforge/
├── bin/                         # npm CLI entry point
├── lib/                         # CLI bootstrap and setup logic
├── start.py                     # CLI menu and project management
├── start_ui.py                  # Web UI backend (FastAPI server launcher)
├── autonomous_agent_demo.py     # Agent entry point
├── agent.py                     # Agent session logic
├── client.py                    # Claude SDK client configuration
├── security.py                  # Bash command allowlist and validation
├── progress.py                  # Progress tracking utilities
├── prompts.py                   # Prompt loading utilities
├── api/
│   └── database.py              # SQLAlchemy models (Feature table)
├── mcp_server/
│   └── feature_mcp.py           # MCP server for feature management tools
├── server/
│   ├── main.py                  # FastAPI REST API server
│   ├── websocket.py             # WebSocket handler for real-time updates
│   ├── schemas.py               # Pydantic schemas
│   ├── routers/                 # API route handlers
│   └── services/                # Business logic services
├── ui/                          # React frontend
│   ├── src/
│   │   ├── App.tsx              # Main app component
│   │   ├── hooks/               # React Query and WebSocket hooks
│   │   └── lib/                 # API client and types
│   ├── package.json
│   └── vite.config.ts
├── .claude/
│   ├── commands/
│   │   └── create-spec.md       # /create-spec slash command
│   ├── skills/                  # Claude Code skills
│   └── templates/               # Prompt templates
├── requirements.txt             # Python dependencies (development)
├── requirements-prod.txt        # Python dependencies (npm install)
├── package.json                 # npm package definition
└── .env                         # Optional configuration

Generated Project Structure

After the agent runs, your project directory will contain:

generations/my_project/
├── features.db               # SQLite database (feature test cases)
├── prompts/
│   ├── app_spec.txt          # Your app specification
│   ├── initializer_prompt.md # First session prompt
│   └── coding_prompt.md      # Continuation session prompt
├── init.sh                   # Environment setup script
├── claude-progress.txt       # Session progress notes
└── [application files]       # Generated application code

Running the Generated Application

After the agent completes (or pauses), you can run the generated application:

cd generations/my_project

# Run the setup script created by the agent
./init.sh

# Or manually (typical for Node.js apps):
npm install
npm run dev

The application will typically be available at http://localhost:3000 or similar.

Security Model

This project uses a defense-in-depth security approach (see security.py and client.py):

OS-level Sandbox: Bash commands run in an isolated environment
Filesystem Restrictions: File operations restricted to the project directory only
Bash Allowlist: Only specific commands are permitted:
- File inspection: ls, cat, head, tail, wc, grep
- Node.js: npm, node
- Version control: git
- Process management: ps, lsof, sleep, pkill (dev processes only)

Commands not in the allowlist are blocked by the security hook.
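
As a rough illustration of how an allowlist check like this can work, the sketch below tokenizes a command line with the standard-library shlex module and verifies the first word of every chained command. It is a simplified assumption-laden example, not the code in security.py: the helper names are hypothetical, the set only mirrors the commands listed above, and the extra rules mentioned in the list (such as pkill being limited to dev processes) are not modelled.

```python
import shlex

# Mirrors the commands listed above; the authoritative set is
# ALLOWED_COMMANDS in security.py and may contain more entries.
ALLOWED_COMMANDS = {
    "ls", "cat", "head", "tail", "wc", "grep",
    "npm", "node", "git",
    "ps", "lsof", "sleep", "pkill",
}

# Operators that start a new command (simplified; redirections are not handled).
SEPARATORS = {"&&", "||", ";", "|", "&"}


def extract_commands(command_line):
    """Return the first word of every command in a chained shell line."""
    lexer = shlex.shlex(command_line, posix=True, punctuation_chars=True)
    commands = []
    expect_command = True
    for token in lexer:
        if token in SEPARATORS:
            expect_command = True
        elif expect_command:
            commands.append(token)
            expect_command = False
    return commands


def is_allowed(command_line):
    """Fail closed: unparseable or empty input is rejected."""
    try:
        commands = extract_commands(command_line)
    except ValueError:  # e.g. an unterminated quote
        return False
    return bool(commands) and all(name in ALLOWED_COMMANDS for name in commands)
```

For example, is_allowed("git status && npm install") is accepted, while is_allowed("ls; rm -rf /") is rejected because rm is not in the set. Checking every command in the chain, rather than only the first word of the line, is what stops an allowed command from being used as a prefix for a blocked one.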

Web UI Development

The React UI is located in the ui/ directory.

Development Mode

cd ui
npm install
npm run dev      # Development server with hot reload

Building for Production

cd ui
npm run build    # Builds to ui/dist/

Note: The start_ui.bat/start_ui.sh scripts serve the pre-built UI from ui/dist/. After making UI changes, run npm run build to see them when using the start scripts.

Tech Stack

React 18 with TypeScript
TanStack Query for data fetching
Tailwind CSS v4 with neobrutalism design
Radix UI components
WebSocket for real-time updates

Real-time Updates

The UI receives live updates via WebSocket (/ws/projects/{project_name}):

progress - Test pass counts
agent_status - Running/paused/stopped/crashed
log - Agent output lines (streamed from subprocess stdout)
feature_update - Feature status changes
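
A consumer of this stream typically routes each message by its kind. The sketch below shows one way to do that in Python; the four message names come from the list above, but the JSON field name "type" and the payload fields used in the usage example are assumptions, not the documented wire format.

```python
import json

# Message kinds taken from the list above.
MESSAGE_TYPES = {"progress", "agent_status", "log", "feature_update"}


def route_message(raw, handlers):
    """Decode one WebSocket text frame and call the matching handler.

    Assumes each frame is a JSON object with a "type" field (hypothetical).
    Returns True if a handler ran, False if the message was ignored.
    """
    message = json.loads(raw)
    kind = message.get("type")
    if kind not in MESSAGE_TYPES:
        return False
    handler = handlers.get(kind)
    if handler is None:
        return False
    handler(message)
    return True


# Usage: collect log lines and ignore everything else.
log_lines = []
handlers = {"log": lambda message: log_lines.append(message.get("line"))}
route_message('{"type": "log", "line": "[Tool: Bash] npm test"}', handlers)
```

Unknown kinds are ignored rather than raising, so a client written this way keeps working if the server later adds new message types.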

Configuration

AutoForge reads configuration from a .env file. The file location depends on how you installed AutoForge:

Install method	Config file location	Edit command
npm (global)	`~/.autoforge/.env`	`autoforge config`
From source	`.env` in the project root	Edit directly

A default config file is created automatically on first run. Use autoforge config to open it in your editor, or autoforge config --show to print the active values.
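
To make the lookup rule in the table concrete, here is a minimal sketch that picks the config path by install method and reads simple KEY=VALUE lines. The function names are hypothetical, and the parser is deliberately naive (no quoting or variable expansion); the project itself may rely on a dotenv library instead.

```python
from pathlib import Path


def config_path(installed_via_npm, project_root="."):
    """Return the .env location for the given install method (see table above)."""
    if installed_via_npm:
        return Path.home() / ".autoforge" / ".env"
    return Path(project_root) / ".env"


def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, so values such as URLs stay intact.
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


sample = """
# Optional settings
API_TIMEOUT_MS=3000000
ANTHROPIC_BASE_URL=http://localhost:11434
"""
settings = parse_env(sample)
```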

N8N Webhook Integration

Add to your .env to send progress notifications to an N8N webhook:

# Optional: N8N webhook for progress notifications
PROGRESS_N8N_WEBHOOK_URL=https://your-n8n-instance.com/webhook/your-webhook-id

When test progress increases, the agent sends:

{
  "event": "test_progress",
  "passing": 45,
  "total": 200,
  "percentage": 22.5,
  "project": "my_project",
  "timestamp": "2025-01-15T14:30:00.000Z"
}
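
The payload above can be reproduced with a few lines of Python, which also shows how the percentage relates to the counts (45 of 200 is 22.5). This is a sketch under stated assumptions: the function names are hypothetical, the one-decimal rounding is inferred from the example, and the agent's actual sending code may differ.

```python
import json
import urllib.request
from datetime import datetime, timezone


def build_progress_payload(passing, total, project, now=None):
    """Build a payload shaped like the example above."""
    now = now or datetime.now(timezone.utc)
    percentage = round(100 * passing / total, 1) if total else 0.0
    # ISO 8601 with millisecond precision and a trailing Z, as in the example.
    timestamp = now.strftime("%Y-%m-%dT%H:%M:%S.") + f"{now.microsecond // 1000:03d}Z"
    return {
        "event": "test_progress",
        "passing": passing,
        "total": total,
        "percentage": percentage,
        "project": project,
        "timestamp": timestamp,
    }


def send_progress(webhook_url, payload, timeout=10):
    """POST the payload as JSON to the URL from PROGRESS_N8N_WEBHOOK_URL."""
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


payload = build_progress_payload(
    45, 200, "my_project",
    now=datetime(2025, 1, 15, 14, 30, tzinfo=timezone.utc),
)
```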

Using GLM Models (Alternative to Claude)

Add these variables to your .env file to use Zhipu AI's GLM models:

ANTHROPIC_BASE_URL=https://api.z.ai/api/anthropic
ANTHROPIC_AUTH_TOKEN=your-zhipu-api-key
API_TIMEOUT_MS=3000000
ANTHROPIC_DEFAULT_SONNET_MODEL=glm-4.7
ANTHROPIC_DEFAULT_OPUS_MODEL=glm-4.7
ANTHROPIC_DEFAULT_HAIKU_MODEL=glm-4.5-air

This routes AutoForge's API requests through Zhipu's Claude-compatible API, allowing you to use GLM-4.7 and other models. This only affects AutoForge - your global Claude Code settings remain unchanged.

Get an API key at: https://z.ai/subscribe

Using Ollama Local Models

Add these variables to your .env file to run agents with local models via Ollama v0.14.0+:

ANTHROPIC_BASE_URL=http://localhost:11434
ANTHROPIC_AUTH_TOKEN=ollama
API_TIMEOUT_MS=3000000
ANTHROPIC_DEFAULT_SONNET_MODEL=qwen3-coder
ANTHROPIC_DEFAULT_OPUS_MODEL=qwen3-coder
ANTHROPIC_DEFAULT_HAIKU_MODEL=qwen3-coder

See CLAUDE.md for recommended models and known limitations.

Using Vertex AI

Add these variables to your .env file to run agents via Google Cloud Vertex AI:

CLAUDE_CODE_USE_VERTEX=1
CLOUD_ML_REGION=us-east5
ANTHROPIC_VERTEX_PROJECT_ID=your-gcp-project-id
ANTHROPIC_DEFAULT_OPUS_MODEL=claude-opus-4-5@20251101
ANTHROPIC_DEFAULT_SONNET_MODEL=claude-sonnet-4-5@20250929
ANTHROPIC_DEFAULT_HAIKU_MODEL=claude-3-5-haiku@20241022

Requires gcloud auth application-default login first. Note the @ separator (not -) in Vertex AI model names.

Customization

Changing the Application

Use the /create-spec command when creating a new project, or manually edit the files in your project's prompts/ directory:

app_spec.txt - Your application specification
initializer_prompt.md - Controls feature generation

Modifying Allowed Commands

Edit security.py to add or remove commands from ALLOWED_COMMANDS.

Troubleshooting

"Claude CLI not found": Install the Claude Code CLI using the instructions in the Prerequisites section.

"Not authenticated with Claude": Run claude login to authenticate. The start script will prompt you to do this automatically.

"Appears to hang on first run": This is normal. The initializer agent is generating detailed test cases, which takes significant time. Watch for [Tool: ...] output to confirm the agent is working.

"Command blocked by security hook": The agent tried to run a command not in the allowlist. This is the security system working as intended. If needed, add the command to ALLOWED_COMMANDS in security.py.

"Python 3.11+ required but not found": Install Python 3.11 or later from python.org. Make sure python3 (or python on Windows) is on your PATH.

"Python venv module not available": On Debian/Ubuntu, the venv module is packaged separately. Install it with sudo apt install python3.XX-venv (replace XX with your Python minor version, e.g., python3.12-venv).

"AutoForge is already running": A server instance is already active. Use the browser URL shown in the terminal, or stop the existing instance with Ctrl+C first.

Virtual environment issues after a Python upgrade: Run autoforge --repair to delete and recreate the virtual environment from scratch.

License
