PR #158 added temp_cleanup.py and its import in autonomous_agent_demo.py
but did not include the file in the package.json "files" array. This
caused ModuleNotFoundError for npm installations since the module was
missing from the published tarball.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Address security gaps and improve validation in the dev server command
execution path introduced by PR #153:
Security fixes (critical):
- Add missing shell metacharacters to dangerous_ops blocklist: single &
(Windows cmd.exe command separator), >, <, ^, %, \n, \r
- The single & gap was a confirmed RCE bypass on Windows where .cmd
files are always executed via cmd.exe even with shell=False (CPython
limitation documented in issue #77696)
- Apply validate_custom_command_strict at /start endpoint for
defense-in-depth against config file tampering
Validation improvements:
- Fix uvicorn --flag=value syntax (split on = before comparing)
- Expand Python support: Django (manage.py), Flask, custom .py scripts
- Add runners: flask, poetry, cargo, go, npx
- Expand npm script allowlist: serve, develop, server, preview
- Reorder PATCH /config validation to run strict check first (fail fast)
- Extract constants: ALLOWED_NPM_SCRIPTS, ALLOWED_PYTHON_MODULES,
BLOCKED_SHELLS for reuse and testability
Cleanup:
- Remove unused security.py imports from dev_server_manager.py
- Fix deprecated datetime.utcnow() -> datetime.now(timezone.utc)
- Remove unnecessary _remove_lock() in exception handlers where lock
was never created (Popen failure path)
Tests:
- Add test_devserver_security.py with 78 tests covering valid commands,
blocked shells, blocked commands, injection attempts, dangerous_ops
blocking, and constant verification
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Problem:
When AutoForge runs agents that use Playwright for browser testing or
mongodb-memory-server for database tests, temporary files accumulate in
the system temp folder (%TEMP% on Windows, /tmp on Linux/macOS). These
files are never cleaned up automatically and can consume hundreds of GB
over time.
Affected temp items:
- playwright_firefoxdev_profile-* (browser profiles)
- playwright-artifacts-* (test artifacts)
- playwright-transform-cache
- mongodb-memory-server* (MongoDB binaries)
- ng-* (Angular CLI temp)
- scoped_dir* (Chrome/Chromium temp)
- .78912*.node (Node.js native module cache, ~7MB each)
- claude-*-cwd (Claude CLI working directory files)
- mat-debug-*.log (Material/Angular debug logs)
Solution:
- New temp_cleanup.py module with cleanup_stale_temp() function
- Called at Maestro (orchestrator) startup in autonomous_agent_demo.py
- Only deletes files/folders older than 1 hour (safe for running processes)
- Runs every time the Play button is clicked or agent auto-restarts
- Reports cleanup stats: dirs deleted, files deleted, MB freed
Why cleanup at Maestro startup:
- Reliable hook point (runs on every agent start, including auto-restart
after rate limits which happens every ~5 hours)
- No need for background timers or scheduled tasks
- Cleanup happens before new temp files are created
Testing:
- Tested on Windows with 958 items in temp folder
- Successfully cleaned 45 dirs, 758 files, freed 415 MB
- Files younger than 1 hour correctly preserved
Closes#155
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Remove embedded documentation system (18 files) from main UI:
- Delete ui/src/components/docs/ (DocsPage, DocsContent, DocsSidebar,
DocsSearch, docsData, and all 13 section components)
- Delete ui/src/hooks/useHashRoute.ts (only used for docs routing)
- Simplify ui/src/main.tsx: remove Router component, render App directly
inside QueryClientProvider (no more hash-based routing)
- Update docs button in App.tsx header to open https://autoforge.cc in
a new tab instead of navigating to #/docs hash route
- Add logo to header
- Add temp-docs/ to .gitignore
- Update CLAUDE.md with current architecture documentation
The documentation has been extracted into a separate repository and
deployed as a standalone Vite + React site at https://autoforge.cc.
This reduces the main UI bundle and decouples docs from app releases.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add a Node.js CLI wrapper that allows installing AutoForge globally via
`npm install -g autoforge-ai` and running it with a single `autoforge`
command. The CLI handles Python detection, venv management, config
loading, and uvicorn server lifecycle automatically.
New files:
- package.json: npm package config with bin entry, files whitelist,
and prepublishOnly script that builds the UI
- bin/autoforge.js: thin entry point that imports lib/cli.js
- lib/cli.js: main CLI module (~790 lines) with cross-platform Python
3.11+ detection, composite venv marker for smart invalidation
(requirements hash + Python version + path), .env config management
at ~/.autoforge/.env, server startup with PID file and port detection,
and signal handling with process tree cleanup
- requirements-prod.txt: runtime-only deps (excludes ruff, mypy, pytest)
- .npmignore: excludes dev files, tests, __pycache__, UI source
Modified files:
- ui/package.json: rename to autoforge-ui to avoid confusion with root
- .gitignore: add *.tgz for npm pack output
- README.md: add npm install as primary quick start method, document
CLI commands, add Ollama/Vertex AI config sections, new troubleshooting
entries for Python/venv issues
- GettingStarted.tsx: add Installation, Quick Start, and CLI Commands
sections to in-app documentation with command reference table
- docsData.ts: add installation and cli-commands sidebar entries
Published as autoforge-ai@0.1.0 on npm.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add max-h-[calc(100vh-2rem)] and overflow-y-auto to the shared
DialogContent component so modals scroll vertically when their
content exceeds the viewport height (e.g., Settings modal when
browser is zoomed in).
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Complete project rebrand from AutoCoder to AutoForge, touching 62 files
across Python backend, FastAPI server, React UI, documentation, config,
and CI/CD.
Key changes:
- Rename autocoder_paths.py -> autoforge_paths.py with backward-compat
migration from .autocoder/ -> .autoforge/ directories
- Update registry.py to migrate ~/.autocoder/ -> ~/.autoforge/ global
config directory with fallback support
- Update security.py with fallback reads from legacy .autocoder/ paths
- Rename .claude/commands and skills from gsd-to-autocoder-spec to
gsd-to-autoforge-spec
- Update all Python modules: client, prompts, progress, agent,
orchestrator, server routers and services
- Update React UI: package.json name, index.html title, localStorage
keys, all documentation sections, component references
- Update start scripts (bat/sh/py), examples, and .env.example
- Update CLAUDE.md and README.md with new branding and paths
- Update test files for new .autoforge/ directory structure
- Transfer git remote from leonvanzyl/autocoder to
AutoForgeAI/autoforge
Backward compatibility preserved: legacy .autocoder/ directories are
auto-detected and migrated on next agent start. Config fallback chain
checks both new and old paths.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The parallel orchestrator was using AUTOCODER_ROOT as the working
directory when spawning coding, batch, and testing agent subprocesses.
This caused the Claude Code CLI to create .claude/ and .claude_worktrees/
directories in the autocoder installation folder instead of the project
directory, scattering output files across multiple locations.
Changed all 3 subprocess spawn sites (coding agent, batch agent, testing
agent) to use self.project_dir as cwd, matching the behavior of the
server's process_manager.py. The subprocess commands already use absolute
paths to autonomous_agent_demo.py, so Python imports are unaffected.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The output reader was stamping every line with the primary feature ID
(e.g., [Feature #24]) even after the agent claimed a new feature in the
batch. Now parses feature_claim_and_get calls in the output stream and
switches the prefix to the newly claimed feature ID, so logs correctly
show [Feature #30] once the agent moves on.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Enable the orchestrator to assign 1-3 features per coding agent subprocess,
selected via dependency chain extension + same-category fill. This reduces
cold-start overhead and leverages shared context across related features.
Orchestrator (parallel_orchestrator.py):
- Add batch tracking: _batch_features and _feature_to_primary data structures
- Add build_feature_batches() with dependency chain + category fill algorithm
- Add start_feature_batch() and _spawn_coding_agent_batch() methods
- Update _on_agent_complete() for batch cleanup across all features
- Update stop_feature() with _feature_to_primary lookup
- Update get_ready_features() to exclude all batch feature IDs
- Update main loop to build batches then spawn per available slot
CLI and agent layer:
- Add --feature-ids (comma-separated) and --batch-size CLI args
- Add feature_ids parameter to run_autonomous_agent() with batch prompt selection
- Add get_batch_feature_prompt() with sequential workflow instructions
WebSocket layer (server/websocket.py):
- Add BATCH_CODING_AGENT_START_PATTERN and BATCH_FEATURES_COMPLETE_PATTERN
- Add _handle_batch_agent_start() and _handle_batch_agent_complete() methods
- Add featureIds field to all agent_update messages
- Track current_feature_id updates as agent moves through batch
Frontend (React UI):
- Add featureIds to ActiveAgent and WSAgentUpdateMessage types
- Update KanbanColumn and DependencyGraph agent-feature maps for batch
- Update AgentCard to show "Batch: #X, #Y, #Z" with active feature highlight
- Add "Features per Agent" segmented control (1-3) in SettingsModal
Settings integration (full stack):
- Add batch_size to schemas, settings router, agent router, process manager
- Default batch_size=3, user-configurable 1-3 via settings UI
- batch_size=1 is functionally identical to pre-batching behavior
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Redesign ProgressDashboard from tall stacked layout to compact inline:
title/badge left, passing/total right, progress bar with percentage below
- Absorb AgentThought functionality directly into ProgressDashboard,
showing the agent's current thought below the progress bar
- Remove standalone AgentThought usage from App.tsx (component now unused)
- Pass logs/agentStatus to ProgressDashboard in single-agent mode only
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Replace the PLAYWRIGHT_HEADLESS environment variable with a global
setting toggle in the Settings modal. The setting is persisted in the
registry DB and injected as an env var into agent subprocesses, so
client.py reads it unchanged.
Backend:
- Add playwright_headless field to SettingsResponse/SettingsUpdate schemas
- Read/write the setting in settings router via existing _parse_bool helper
- Pass playwright_headless from agent router through to process manager
- Inject PLAYWRIGHT_HEADLESS env var into subprocess environment
Frontend:
- Add playwright_headless to Settings/SettingsUpdate TypeScript types
- Add "Headless Browser" Switch toggle below YOLO mode in SettingsModal
- Add default value to DEFAULT_SETTINGS in useProjects
Also fix CSS build warning: change @import url("tw-animate-css") to bare
@import "tw-animate-css" so Tailwind v4 inlines it during compilation
instead of leaving it for Vite/Lightning CSS post-processing.
Remove stale summary.md from previous refactoring session.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add centralized path resolution module (autocoder_paths.py) that
consolidates all autocoder-generated file paths behind a dual-path
strategy: check .autocoder/X first, fall back to root-level X for
backward compatibility, default to .autocoder/X for new projects.
Key changes:
- New autocoder_paths.py with dual-path resolution for features.db,
assistant.db, lock files, settings, prompts dir, and progress cache
- migrate_project_layout() safely moves old-layout projects to new
layout with SQLite WAL flush and integrity verification
- Updated 22 files to delegate path construction to autocoder_paths
- Reset/delete logic cleans both old and new file locations
- Orphan lock cleanup checks both locations per project
- Migration called automatically at agent start in autonomous_agent_demo.py
- Updated markdown commands/skills to reference .autocoder/prompts/
- CLAUDE.md documentation updated with new project structure
Files at project root that remain unchanged:
- CLAUDE.md (Claude SDK reads from cwd via setting_sources=["project"])
- app_spec.txt root copy (agent templates reference it via cat)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Narrow `\boverloaded\b` regex to require server/api/system context,
preventing false positives when Claude discusses method/operator
overloading in OOP code (C++, Java, C#, etc.)
- Restore 24-hour cap for absolute reset-time delays instead of 1-hour
clamp, avoiding unnecessary retry loops when rate limits reset hours
in the future
- Add test for Retry-After: 0 returning 0 (regression lock for the
`is not None` fix)
- Add false positive tests for "overloaded" in programming context
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The onComplete handler in the empty Kanban spec creation flow only
logged errors to console.error, leaving users with no feedback when
the agent failed to start. This wires up the SpecCreationChat
component's built-in error UI (spinner, error banner, retry button)
via initializerStatus, initializerError, and onRetryInitializer props.
Changes:
- Add InitializerStatus type and specInitializerStatus/Error state
- Set status to 'starting' before startAgent call (shows spinner)
- On error, keep spec chat open so the error banner is visible
- On success, close chat and refresh queries (same as before)
- Wire up onRetryInitializer to reset state to idle
- Reset initializer status on cancel/exit for clean re-entry
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Replace ineffective threading.Lock() with atomic SQL operations for
cross-process safety. Key changes:
- Add SQLAlchemy event hooks (do_connect/do_begin) for BEGIN IMMEDIATE
transactions in api/database.py
- Add atomic_transaction() context manager for multi-statement ops
- Convert all feature MCP write operations to atomic UPDATE...WHERE
with compare-and-swap patterns (feature_claim, mark_passing, etc.)
- Add WHERE passes=0 state guard to feature_mark_passing
- Add WAL checkpoint on shutdown and idempotent cleanup() in
parallel_orchestrator.py with async-safe signal handling
- Wrap SQLite connections with contextlib.closing() in progress.py
- Add thread-safe engine cache with double-checked locking in
assistant_database.py
- Migrate to SQLAlchemy 2.0 DeclarativeBase across all modules
Inspired by PR #108 (cabana8471-arch), with fixes for nested
BEGIN EXCLUSIVE bug and missing state guards.
Closes#106
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add explicit session.rollback() in exception handlers for database
context managers in features.py, schedules.py, and database.py get_db()
to prevent SQLAlchemy PendingRollbackError on failed transactions
- Add EXPAND_FEATURE_TOOLS to expand session security settings allow list
so the expand skill can use the MCP tools it references
- Update assistant session prompt to direct the LLM to call MCP tools
directly for feature creation instead of suggesting CLI commands
Cherry-picked fixes from PR #92 (closed) with cleaner implementation.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Trim the env var value and fall back to the default model when the
trimmed result is empty. This prevents invalid empty strings from
being appended to VALID_MODELS.
Addresses CodeRabbit review feedback on PR #147.
When ANTHROPIC_DEFAULT_OPUS_MODEL env var is set to a custom model ID,
that model was not present in VALID_MODELS (derived from AVAILABLE_MODELS),
causing potential validation failures in server/schemas.py validators.
This fix dynamically appends the env-provided DEFAULT_MODEL to VALID_MODELS
when set, ensuring validators accept the runtime default. The merge is
idempotent (only adds if missing) and doesn't alter AVAILABLE_MODELS semantics.
Addresses CodeRabbit review feedback on PR #147.
When creating a spec from an empty Kanban board (via "Create Spec" button),
the agent was not automatically starting after clicking "Continue to Project".
Root cause: The SpecCreationChat component in App.tsx had an onComplete handler
that only closed the chat and refreshed queries, but did not call startAgent().
This was different from the NewProjectModal flow which correctly started the agent.
Changes:
- Add startAgent import to App.tsx
- Update onComplete handler to call startAgent() with yoloMode and maxConcurrency
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Root cause: NewProjectModal was rendered inside ProjectSelector (in header),
causing the fixed inset-0 container to be constrained by the dropdown DOM tree.
Changes:
- NewProjectModal.tsx: Use createPortal to render chat screen at document.body level
- SpecCreationChat.tsx: Change h-full to h-screen for explicit viewport height
- SpecCreationChat.tsx: Add min-h-0 to messages area for proper flexbox scrolling
This fixes the chat screen not displaying full-screen when creating a new project
with Claude.
Use the shared clamp_retry_delay() function (1-hour cap) for parsed
reset-time delays instead of a separate 24-hour cap. This aligns with
the PR's consistent 1-hour maximum delay objective.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
1. agent.py: Reset opposite retry counter when entering rate_limit or error
status to prevent mixed events from inflating delays
2. rate_limit_utils.py: Fix parse_retry_after() regex to reject minute/hour
units - patterns now require explicit "seconds"/"s" unit or end of string
3. test_rate_limit_utils.py: Add tests for "retry after 5 minutes" and other
minute/hour variants to ensure they return None
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Rename compute_mode -> convert_model_for_vertex for clarity
- Move `import re` to module top-level (stdlib convention)
- Use greedy regex quantifier for more readable pattern matching
- Restore PEP 8 double blank line between top-level definitions
- Add test_client.py with 10 unit tests covering:
- Vertex disabled (env unset, "0", empty)
- Standard conversions (Opus, Sonnet, Haiku)
- Edge cases (already-converted, non-Claude, no date suffix, empty)
Follow-up improvements from PR #129 review.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>