Kacper a65b16cbae feat: implement modular provider architecture with Codex CLI support
Implements a flexible provider pattern that supports both the Claude Agent SDK
and the OpenAI Codex CLI, enabling future expansion to other AI providers
(Cursor, OpenCode, etc.) with minimal changes.

## Architecture Changes

### New Provider System
- Created provider abstraction layer with BaseProvider interface
- Model-based routing: model prefix determines provider
  - `gpt-*`, `o*` → CodexProvider (subprocess CLI)
  - `claude-*`, `opus/sonnet/haiku` → ClaudeProvider (SDK)
- Providers implement common ExecuteOptions interface
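The prefix routing above can be sketched as a small resolver; the function and type names here are illustrative, not the actual `provider-factory.ts` API:

```typescript
// Hypothetical sketch of model-prefix routing: gpt-* and o<digit>* models
// go to the Codex CLI provider, everything else to the Claude SDK provider.
type ProviderKind = "codex" | "claude";

function resolveProvider(model: string): ProviderKind {
  // `o1`, `o3-mini`, etc. match /^o\d/; "opus"/"sonnet" do not.
  if (model.startsWith("gpt-") || /^o\d/.test(model)) return "codex";
  return "claude";
}
```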

### New Files Created
- `providers/types.ts` - Shared interfaces (ExecuteOptions, ProviderMessage, etc.)
- `providers/base-provider.ts` - Abstract base class
- `providers/claude-provider.ts` - Claude Agent SDK wrapper
- `providers/codex-provider.ts` - Codex CLI subprocess executor
- `providers/codex-cli-detector.ts` - Installation & auth detection
- `providers/codex-config-manager.ts` - TOML config management
- `providers/provider-factory.ts` - Model-based provider routing
- `lib/subprocess-manager.ts` - Reusable subprocess utilities

## Features Implemented

### Codex CLI Integration
- Spawns Codex CLI as subprocess with JSONL output
- Converts Codex events to Claude SDK-compatible format
- Supports both `codex login` and OPENAI_API_KEY auth methods
- Handles: reasoning, messages, commands, todos, file changes
- Extracts text from content blocks for non-vision CLI
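The event conversion can be pictured as a per-line JSONL mapper; the event names and the `ProviderMessage` shape below are assumptions for illustration, not the provider's real types:

```typescript
// Hedged sketch: parse one JSONL line emitted by the Codex CLI and map it
// onto a Claude-SDK-style message the rest of the app can consume.
interface ProviderMessage {
  role: "assistant";
  type: "text" | "reasoning" | "command";
  content: string;
}

function convertCodexEvent(jsonlLine: string): ProviderMessage | null {
  const event = JSON.parse(jsonlLine);
  switch (event.type) {
    case "agent_message":
      return { role: "assistant", type: "text", content: event.text ?? "" };
    case "agent_reasoning":
      return { role: "assistant", type: "reasoning", content: event.text ?? "" };
    case "exec_command":
      return { role: "assistant", type: "command", content: event.command ?? "" };
    default:
      return null; // ignore event kinds we don't map
  }
}
```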

### Conversation History
- Added conversationHistory support to ExecuteOptions
- ClaudeProvider: yields previous messages to SDK
- CodexProvider: prepends history as text context
- Follow-up prompts maintain full conversation context
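The Codex-side "prepend history as text context" strategy might look like the following sketch (identifiers are assumptions, not the real `ExecuteOptions` fields):

```typescript
// Illustrative sketch: flatten prior turns into a text preamble so a
// CLI-based provider without native history support keeps context.
interface HistoryTurn {
  role: "user" | "assistant";
  text: string;
}

function buildCodexPrompt(history: HistoryTurn[], prompt: string): string {
  if (history.length === 0) return prompt;
  const context = history
    .map((t) => `${t.role === "user" ? "User" : "Assistant"}: ${t.text}`)
    .join("\n");
  return `Previous conversation:\n${context}\n\nCurrent request:\n${prompt}`;
}
```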

### Image Upload Support
- Images embedded as base64 for vision models
- Image paths appended to prompt text for Read tool access
- Auto-mode: copies images to feature folder
- Follow-up: combines original + new images
- Updates feature.json with image metadata
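Appending image paths for the non-vision CLI could be sketched like this (the function name and wording are hypothetical):

```typescript
// Sketch: append absolute image paths to the prompt text so a non-vision
// CLI can open them via its Read tool instead of receiving base64 blocks.
function appendImagePaths(prompt: string, imagePaths: string[]): string {
  if (imagePaths.length === 0) return prompt;
  const list = imagePaths.map((p) => `- ${p}`).join("\n");
  return `${prompt}\n\nAttached images (use the Read tool to view):\n${list}`;
}
```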

### Session Model Persistence
- Added `model` field to Session and SessionMetadata
- Sessions remember model preference across interactions
- API endpoints accept model parameter
- Auto-mode respects feature's model setting
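The persistence rule can be summarized as a fallback chain; the field names follow the description above, but the default model constant is an assumption:

```typescript
// Sketch: resolve the model for a turn. An explicit request wins, then
// the session's stored preference, then an assumed application default.
interface SessionMetadata {
  id: string;
  model?: string; // remembered across interactions
}

const DEFAULT_MODEL = "claude-sonnet-4-5"; // hypothetical default

function effectiveModel(session: SessionMetadata, requested?: string): string {
  return requested ?? session.model ?? DEFAULT_MODEL;
}
```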

## Modified Files

### Services
- `agent-service.ts`:
  - Added conversation history building
  - Uses ProviderFactory instead of direct SDK calls
  - Appends image paths to prompts
  - Added model parameter and persistence

- `auto-mode-service.ts`:
  - Removed OpenAI model block restriction
  - Uses ProviderFactory for all models
  - Added image support in buildFeaturePrompt
  - Follow-up: loads context, copies images, updates feature.json
  - Returns to waiting_approval after follow-up

### Routes
- `agent.ts`: Added model parameter to /send endpoint
- `sessions.ts`: Added model field to create/update
- `models.ts`: Added Codex models (gpt-5.2, gpt-5.1-codex*)

### Configuration
- `.env.example`: Added OPENAI_API_KEY and CODEX_CLI_PATH
- `.gitignore`: Added provider-specific ignores

## Bug Fixes
- Fixed image path resolution (relative → absolute)
- Fixed Codex empty prompt when images attached
- Fixed follow-up status management (in_progress → waiting_approval)
- Fixed follow-up images not appearing in prompt text
- Removed OpenAI model restrictions in auto-mode

## Testing Notes
- Codex CLI authentication verified with both methods
- Image uploads work for both Claude (vision) and Codex (Read tool)
- Follow-up prompts maintain full context
- Conversation history persists across turns
- Model switching works per-session

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

# Automaker

Automaker is an autonomous AI development studio that helps you build software faster using AI-powered agents. It provides a visual Kanban board interface to manage features, automatically assigns AI agents to implement them, and tracks progress through an intuitive workflow from backlog to verified completion.


> [!CAUTION]
> **Security Disclaimer**
>
> This software uses AI-powered tooling that has access to your operating system and can read, modify, and delete files. Use at your own risk.
>
> We have reviewed this codebase for security vulnerabilities, but you assume all risk when running this software. You should review the code yourself before running it.
>
> We do not recommend running Automaker directly on your local computer due to the risk of AI agents having access to your entire file system. Please sandbox this application using Docker or a virtual machine.
>
> Read the full disclaimer.


## Getting Started

### Prerequisites

### Quick Start

```bash
# 1. Clone the repo
git clone git@github.com:AutoMaker-Org/automaker.git
cd automaker

# 2. Install dependencies
npm install

# 3. Get your Claude Code OAuth token
claude setup-token
# ⚠️ This prints your token - don't share your screen!

# 4. Set the token and run
export CLAUDE_CODE_OAUTH_TOKEN="sk-ant-oat01-..."
npm run dev:electron
```

## How to Run

### Development Modes

Automaker can be run in several modes:

```bash
# Standard development mode
npm run dev:electron

# With DevTools open automatically
npm run dev:electron:debug

# For WSL (Windows Subsystem for Linux)
npm run dev:electron:wsl

# For WSL with GPU acceleration
npm run dev:electron:wsl:gpu
```

### Web Browser Mode

```bash
# Run in web browser (http://localhost:3007)
npm run dev:web
# or
npm run dev
```

### Building for Production

```bash
# Build Next.js app
npm run build

# Build Electron app for distribution
npm run build:electron
```

### Running Production Build

```bash
# Start production Next.js server
npm run start
```

### Testing

```bash
# Run tests headless
npm run test

# Run tests with browser visible
npm run test:headed
```

### Linting

```bash
# Run ESLint
npm run lint
```

## Authentication Options

Automaker supports multiple authentication methods (in order of priority):

| Method | Environment Variable | Description |
| --- | --- | --- |
| OAuth Token (env) | `CLAUDE_CODE_OAUTH_TOKEN` | From `claude setup-token`; uses your Claude subscription |
| OAuth Token (stored) | | Stored in app credentials file |
| API Key (stored) | | Anthropic API key stored in app |
| API Key (env) | `ANTHROPIC_API_KEY` | Pay-per-use API key |

**Recommended:** Use `CLAUDE_CODE_OAUTH_TOKEN` if you have a Claude subscription.
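The priority order above amounts to a simple fallback chain; the sketch below is illustrative, and the credential-store field names are assumptions, not Automaker's actual internals:

```typescript
// Sketch of credential resolution in priority order:
// env OAuth token > stored OAuth token > stored API key > env API key.
interface CredentialStore {
  oauthToken?: string;
  apiKey?: string;
}

function resolveCredential(
  env: Record<string, string | undefined>,
  store: CredentialStore
): string | undefined {
  return (
    env.CLAUDE_CODE_OAUTH_TOKEN ?? // 1. OAuth token from environment
    store.oauthToken ??            // 2. OAuth token stored in app
    store.apiKey ??                // 3. API key stored in app
    env.ANTHROPIC_API_KEY          // 4. API key from environment
  );
}
```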

### Persistent Setup (Optional)

Add to your `~/.bashrc` or `~/.zshrc`:

```bash
export CLAUDE_CODE_OAUTH_TOKEN="YOUR_TOKEN_HERE"
```

Then restart your terminal or run `source ~/.bashrc`.

## Features

- 📋 **Kanban Board** - Visual drag-and-drop board to manage features through backlog, in progress, waiting approval, and verified stages
- 🤖 **AI Agent Integration** - Automatic AI agent assignment to implement features when moved to "In Progress"
- 🧠 **Multi-Model Support** - Choose from multiple AI models including Claude Opus, Sonnet, and more
- 💭 **Extended Thinking** - Enable extended thinking modes for complex problem-solving
- 📡 **Real-time Agent Output** - View live agent output, logs, and file diffs as features are being implemented
- 🔍 **Project Analysis** - AI-powered project structure analysis to understand your codebase
- 📁 **Context Management** - Add context files to help AI agents understand your project better
- 💡 **Feature Suggestions** - AI-generated feature suggestions based on your project
- 🖼️ **Image Support** - Attach images and screenshots to feature descriptions
- **Concurrent Processing** - Configure concurrency to process multiple features simultaneously
- 🧪 **Test Integration** - Automatic test running and verification for implemented features
- 🔀 **Git Integration** - View git diffs and track changes made by AI agents
- 👤 **AI Profiles** - Create and manage different AI agent profiles for various tasks
- 💬 **Chat History** - Keep track of conversations and interactions with AI agents
- ⌨️ **Keyboard Shortcuts** - Efficient navigation and actions via keyboard shortcuts
- 🎨 **Dark/Light Theme** - Beautiful UI with theme support
- 🖥️ **Cross-Platform** - Desktop application built with Electron for Windows, macOS, and Linux

## Tech Stack

## Learn More

To learn more about Next.js, take a look at the following resources:

## License

See LICENSE for details.
