Codebase Concerns
Analysis Date: 2026-01-27
Tech Debt
Loose Type Safety in Error Handling:
- Issue: Multiple uses of `as any` type assertions bypass TypeScript safety, particularly in error context handling and provider responses
- Files: `apps/server/src/providers/claude-provider.ts` (lines 318-322), `apps/server/src/lib/error-handler.ts`, `apps/server/src/routes/settings/routes/update-global.ts`
- Impact: Errors could have unchecked properties; refactoring becomes risky without compiler assistance
- Fix approach: Replace `as any` with proper type guards and discriminated unions; create helper functions for safe property access
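The fix approach above can be sketched as follows. This is a minimal illustration, not the project's actual types: `ProviderErrorContext`, `ParsedError`, and `parseError` are hypothetical names, and the real error shapes in `claude-provider.ts` may differ.

```typescript
// Hypothetical error shape; the real provider error types may differ.
interface ProviderErrorContext {
  code: string;
  statusCode?: number;
}

// Type guard: narrows `unknown` to an indexable object without `as any`.
function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null;
}

// Type guard for the hypothetical provider error shape.
function isProviderErrorContext(value: unknown): value is ProviderErrorContext {
  if (!isRecord(value)) return false;
  if (typeof value["code"] !== "string") return false;
  const status = value["statusCode"];
  return status === undefined || typeof status === "number";
}

// Discriminated union: callers must branch on `kind` before reading fields.
type ParsedError =
  | { kind: "provider"; context: ProviderErrorContext }
  | { kind: "unknown"; message: string };

function parseError(err: unknown): ParsedError {
  if (isProviderErrorContext(err)) {
    return { kind: "provider", context: err };
  }
  const message = err instanceof Error ? err.message : String(err);
  return { kind: "unknown", message };
}
```

Because `ParsedError` is a discriminated union, the compiler rejects any access to `context` until the caller has checked `kind === "provider"`, which is the safety that `as any` removes.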
Missing Test Coverage for Critical Services:
- Issue: Several core services explicitly excluded from test coverage thresholds due to integration complexity
- Files: `apps/server/vitest.config.ts` (line 22), explicitly excluded: `claude-usage-service.ts`, `mcp-test-service.ts`, `cli-provider.ts`, `cursor-provider.ts`
- Impact: Usage tracking, MCP integration, and CLI detection could break undetected; regression detection is limited
- Fix approach: Create integration test fixtures for CLI providers; mock MCP SDK for mcp-test-service tests; add usage tracking unit tests with mocked API calls
Unused/Stub TODO Item Processing:
- Issue: TodoWrite tool implementation exists but is partially integrated; tool name constants scattered across codex provider
- Files: `apps/server/src/providers/codex-tool-mapping.ts`, `apps/server/src/providers/codex-provider.ts`
- Impact: Todo list updates may not synchronize properly with all providers; unclear which providers support TodoWrite
- Fix approach: Consolidate tool name constants; add provider capability flags for todo support
Electron `electron.ts` Size and Complexity:
- Issue: Single 3741-line file handles all Electron IPC, native bindings, and communication
- Files: `apps/ui/src/lib/electron.ts`
- Impact: Difficult to test; hard to isolate bugs; changes require full testing of all features; potential memory overhead from monolithic file
- Fix approach: Split by responsibility (IPC, window management, file operations, debug tools); create separate bridge layers
Known Bugs
API Key Management Incomplete for Gemini:
- Symptoms: Gemini API key verification endpoint not implemented despite other providers having verification
- Files: `apps/ui/src/components/views/settings-view/api-keys/hooks/use-api-key-management.ts` (line 122)
- Trigger: User tries to verify Gemini API key in settings
- Workaround: Key verification skipped for Gemini; settings page still accepts and stores key
Orphaned Feature Detection Vulnerable to False Positives:
- Symptoms: Features are incorrectly marked as orphaned when the branch matching logic doesn't account for all scenarios
- Files: `apps/server/src/services/auto-mode-service.ts` (lines 5714-5773)
- Trigger: Features whose branches were manually switched or rebased
- Workaround: Manual cleanup via feature deletion; branch comparison is basic name matching only
Terminal Themes Incomplete:
- Symptoms: Light terminal themes (solarizedlight, github) map to the same generic lightTheme; no dedicated implementations exist
- Files: `apps/ui/src/config/terminal-themes.ts` (lines 593-594)
- Trigger: User selects solarizedlight or github terminal theme
- Workaround: Uses generic light theme instead of specific scheme; visual appearance doesn't match expectation
Security Considerations
Process Environment Variable Exposure:
- Risk: Child processes inherit all parent `process.env` including sensitive credentials (API keys, tokens)
- Files: `apps/server/src/providers/cursor-provider.ts` (line 993), `apps/server/src/providers/codex-provider.ts` (line 1099)
- Current mitigation: Dotenv provides isolation at app startup; selective env passing to some providers
- Recommendations: Use explicit allowlists for env vars passed to child processes (only pass REQUIRED_KEYS); audit all spawn calls for env handling; document which providers need which credentials
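An explicit allowlist for child process environments might look like the sketch below. `BASE_ALLOWED_KEYS` and `buildChildEnv` are hypothetical names, and which variables each provider actually needs is an assumption that would have to be audited per provider.

```typescript
type Env = Record<string, string | undefined>;

// Hypothetical baseline; the variables a given CLI really needs must be audited.
const BASE_ALLOWED_KEYS: readonly string[] = ["PATH", "HOME", "LANG"];

// Copies only allowlisted variables from the parent environment.
// Anything not named here (other providers' API keys, tokens) is dropped.
function buildChildEnv(parentEnv: Env, requiredKeys: readonly string[]): Record<string, string> {
  const childEnv: Record<string, string> = {};
  for (const key of [...BASE_ALLOWED_KEYS, ...requiredKeys]) {
    const value = parentEnv[key];
    if (value !== undefined) {
      childEnv[key] = value;
    }
  }
  return childEnv;
}
```

The result would be passed as the `env` option of Node's `child_process.spawn`, replacing the default behaviour of inheriting the full `process.env`.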
Unvalidated Provider Tool Input:
- Risk: Tool input from CLI providers (Cursor, Copilot, Codex) is partially validated through Record<string, unknown> patterns; execution context could be escaped
- Files: `apps/server/src/providers/codex-provider.ts` (lines 506-543), `apps/server/src/providers/tool-normalization.ts`
- Current mitigation: Status enums validated; tool names checked against allow-lists in some providers
- Recommendations: Implement comprehensive schema validation for all tool inputs before execution; use zod or similar for runtime validation; add security tests for injection patterns
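As an illustration of validating tool input before execution, the sketch below hand-rolls a strict validator for one hypothetical tool shape (`CommandToolInput`). The field names are assumptions, not the providers' real schemas; a schema library such as zod could express the same checks more compactly.

```typescript
type ValidationResult<T> =
  | { ok: true; value: T }
  | { ok: false; errors: string[] };

// Hypothetical input shape for a shell-style tool.
interface CommandToolInput {
  command: string;
  timeoutMs?: number;
}

function validateCommandToolInput(input: unknown): ValidationResult<CommandToolInput> {
  if (typeof input !== "object" || input === null || Array.isArray(input)) {
    return { ok: false, errors: ["input must be an object"] };
  }
  const record = input as Record<string, unknown>;
  const errors: string[] = [];

  // Reject unknown fields instead of silently passing them through.
  const allowed = new Set(["command", "timeoutMs"]);
  for (const key of Object.keys(record)) {
    if (!allowed.has(key)) errors.push(`unexpected field: ${key}`);
  }

  const command = record["command"];
  if (typeof command !== "string" || command.length === 0) {
    errors.push("command must be a non-empty string");
  }

  const timeoutMs = record["timeoutMs"];
  if (
    timeoutMs !== undefined &&
    (typeof timeoutMs !== "number" || !Number.isFinite(timeoutMs) || timeoutMs <= 0)
  ) {
    errors.push("timeoutMs must be a positive number");
  }

  if (errors.length > 0 || typeof command !== "string") {
    return { ok: false, errors };
  }
  return {
    ok: true,
    value: { command, ...(typeof timeoutMs === "number" ? { timeoutMs } : {}) },
  };
}
```

Rejecting unexpected fields is a deliberate choice here: it keeps unreviewed data from reaching the execution layer even when the known fields are valid.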
API Key Storage in Settings Files:
- Risk: API keys stored in plaintext in `~/.automaker/settings.json` and `data/settings.json`; file permissions may not be restricted
- Files: `apps/server/src/services/settings-service.ts`, uses `atomicWriteJson` without file permission enforcement
- Current mitigation: Limited by file system permissions; Electron mode has single-user access
- Recommendations: Encrypt sensitive settings fields (apiKeys, tokens); use OS credential stores (Keychain/Credential Manager) for production; add file permission checks on startup
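A startup permission check could be built on a small helper like the one below. It is a sketch for POSIX mode bits only; `isTooPermissive` and `OWNER_ONLY_MODE` are hypothetical names, and Windows ACLs would need a different approach.

```typescript
// Owner read/write only; a typical target mode for files holding secrets.
const OWNER_ONLY_MODE = 0o600;

// Returns true when group or other have any permission bit set.
// Works on the raw `mode` value from a stat call (file-type bits are ignored).
function isTooPermissive(mode: number): boolean {
  return (mode & 0o077) !== 0;
}
```

In Node, the mode would come from `fs.statSync(path).mode`, and a too-permissive file could be tightened with `fs.chmodSync(path, 0o600)`.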
Performance Bottlenecks
Synchronous Feature Loading at Startup:
- Problem: All features are loaded synchronously at project load; this blocks the UI for projects with 1000+ features
- Files: `apps/server/src/services/feature-loader.ts` (line 230 Promise.all, but synchronous enumeration)
- Cause: Feature directory walk and JSON parsing is not paginated or lazy-loaded
- Improvement path: Implement lazy loading with pagination (load first 50, fetch more on scroll); add caching layer with TTL; move to background indexing; add feature count limits with warnings
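The pagination part of the improvement path can be sketched as a pure function over feature IDs, so that only the IDs on the requested page have their JSON read and parsed. `IdPage` and `selectPage` are hypothetical names, and the page size of 50 follows the figure mentioned above.

```typescript
interface IdPage {
  ids: string[];
  total: number;
  // Offset to request next, or null when this is the last page.
  nextOffset: number | null;
}

// Selects one page of feature IDs; the caller loads only these features.
function selectPage(allIds: readonly string[], offset: number, limit: number = 50): IdPage {
  const start = Math.max(0, Math.floor(offset));
  const size = Math.max(1, Math.floor(limit));
  const ids = allIds.slice(start, start + size);
  const end = start + ids.length;
  return {
    ids,
    total: allIds.length,
    nextOffset: end < allIds.length ? end : null,
  };
}
```

Enumerating directory names is cheap compared with parsing every feature file, so this keeps the initial load proportional to the page size rather than the project size.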
Auto-Mode Concurrency at Max Can Exceed Rate Limits:
- Problem: maxConcurrency = 10 can quickly exhaust Claude API rate limits if all features execute simultaneously
- Files: `apps/server/src/services/auto-mode-service.ts` (line 2931 Promise.all for concurrent agents)
- Cause: No adaptive backoff; no API usage tracking before queuing; hint mentions reducing concurrency but doesn't enforce it
- Improvement path: Integrate with claude-usage-service to check remaining quota before starting features; implement exponential backoff on 429 errors; add per-model rate limit tracking
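Exponential backoff on 429 responses could compute its delays as in the sketch below. The defaults (1 s base, 60 s cap) and the names `backoffDelayMs` and `BackoffOptions` are assumptions; the random source is injectable so the behaviour can be tested deterministically.

```typescript
interface BackoffOptions {
  baseMs?: number;
  maxMs?: number;
  // Injectable for tests; defaults to Math.random.
  random?: () => number;
}

// HTTP 429 is the standard "Too Many Requests" status.
function isRateLimited(status: number): boolean {
  return status === 429;
}

// Delay before retry number `attempt` (0-based), doubling each time up to a cap.
// Uses "equal jitter": half the delay is fixed, half is randomized, so that
// concurrent agents do not all retry at the same instant.
function backoffDelayMs(attempt: number, options: BackoffOptions = {}): number {
  const baseMs = options.baseMs ?? 1000;
  const maxMs = options.maxMs ?? 60000;
  const random = options.random ?? Math.random;
  const capped = Math.min(maxMs, baseMs * Math.pow(2, Math.max(0, attempt)));
  return Math.floor(capped / 2 + random() * (capped / 2));
}
```

A caller would check `isRateLimited(response.status)`, wait `backoffDelayMs(attempt)` milliseconds, and retry with `attempt + 1`, giving up after a bounded number of attempts.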
Terminal Session Memory Leak Risk:
- Problem: Terminal sessions accumulate in memory; expired sessions not cleaned up reliably
- Files: `apps/server/src/routes/terminal/common.ts` (line 66 cleanup runs every 5 minutes, but only for tokens)
- Cause: Cleanup interval is arbitrary; session map not bounded; no session lifespan limit
- Improvement path: Implement LRU eviction with max session count; reduce cleanup interval to 1 minute; add memory usage monitoring; auto-close idle sessions after 30 minutes
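LRU eviction with a maximum session count can be sketched on top of a `Map`, which preserves insertion order. `LruSessionStore` is a hypothetical class, not the existing session map; the `onEvict` callback is where a real implementation would close the underlying terminal process.

```typescript
class LruSessionStore<T> {
  private readonly sessions = new Map<string, T>();
  private readonly maxSessions: number;
  private readonly onEvict: (id: string, session: T) => void;

  constructor(maxSessions: number, onEvict: (id: string, session: T) => void = () => {}) {
    this.maxSessions = maxSessions;
    this.onEvict = onEvict;
  }

  // Reading a session marks it as most recently used.
  get(id: string): T | undefined {
    const session = this.sessions.get(id);
    if (session === undefined) return undefined;
    this.sessions.delete(id);
    this.sessions.set(id, session);
    return session;
  }

  // Inserting may evict the least recently used sessions to respect the bound.
  set(id: string, session: T): void {
    this.sessions.delete(id);
    this.sessions.set(id, session);
    while (this.sessions.size > this.maxSessions) {
      const oldest = this.sessions.keys().next();
      if (oldest.done) break;
      const evicted = this.sessions.get(oldest.value) as T;
      this.sessions.delete(oldest.value);
      this.onEvict(oldest.value, evicted);
    }
  }

  get size(): number {
    return this.sessions.size;
  }
}
```

This bounds memory by count only; idle timeouts would still need a separate timer that checks a last-activity timestamp per session.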
Large File Content Loading Without Limits:
- Problem: File content loaded entirely into memory; `describe-file.ts` truncates at 50KB but loads all content first
- Files: `apps/server/src/routes/context/routes/describe-file.ts` (line 128)
- Cause: Synchronous file read; no streaming; no check before reading large files
- Improvement path: Check file size before reading; stream large files; add file size warnings; implement chunked processing for analysis
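Checking the size before reading can be reduced to a small decision function, sketched below. `planRead` and `MAX_DESCRIBE_BYTES` are hypothetical names; the 50 KB limit mirrors the truncation threshold described above.

```typescript
interface ReadPlan {
  bytesToRead: number;
  truncated: boolean;
}

// Mirrors the 50KB truncation limit; the constant name is hypothetical.
const MAX_DESCRIBE_BYTES = 50 * 1024;

// Decides how many bytes to read given the file size reported by a stat call,
// so that oversized files are never loaded in full.
function planRead(fileSizeBytes: number, maxBytes: number = MAX_DESCRIBE_BYTES): ReadPlan {
  if (fileSizeBytes <= maxBytes) {
    return { bytesToRead: fileSizeBytes, truncated: false };
  }
  return { bytesToRead: maxBytes, truncated: true };
}
```

In Node, the size would come from `fs.promises.stat(path)`, and for non-empty files the read could be limited with `fs.createReadStream(path, { start: 0, end: bytesToRead - 1 })`, since `end` is inclusive.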
Fragile Areas
Provider Factory Model Resolution:
- Files: `apps/server/src/providers/provider-factory.ts`, `apps/server/src/providers/simple-query-service.ts`
- Why fragile: Each provider interprets model strings differently; no central registry; model aliases resolved at multiple layers (model-resolver, provider-specific maps, CLI validation)
- Safe modification: Add integration tests for each model alias per provider; create model capability matrix; centralize model validation before dispatch
- Test coverage: No dedicated tests; relies on E2E; no isolated unit tests for model resolution
WebSocket Session Authentication:
- Files: `apps/server/src/lib/auth.ts` (line 40 setInterval), `apps/server/src/index.ts` (token validation per message)
- Why fragile: Session tokens generated and validated at multiple points; no single source of truth; expiration is not atomic
- Safe modification: Add tests for token expiration edge cases; ensure cleanup removes all references; log all auth failures
- Test coverage: Auth middleware tested, but not session lifecycle
Auto-Mode Feature State Machine:
- Files: `apps/server/src/services/auto-mode-service.ts` (lines 465-600)
- Why fragile: Multiple states (running, queued, completed, error) managed across different methods; no explicit state transition validation; error recovery is defensive (catches all, logs, continues)
- Safe modification: Create explicit state enum with valid transitions; add invariant checks; unit test state transitions with all error cases
- Test coverage: Gaps in error recovery paths; no tests for concurrent state changes
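Explicit transition validation for the four states named above could look like the following sketch. The transition table is an assumption for illustration; the transitions the real service permits (for example, whether `error` can return to `queued`) would need to be confirmed against `auto-mode-service.ts`.

```typescript
type FeatureState = "queued" | "running" | "completed" | "error";

// Assumed transition table; the real service may allow a different set.
const VALID_TRANSITIONS: Record<FeatureState, readonly FeatureState[]> = {
  queued: ["running", "error"],
  running: ["completed", "error"],
  completed: [],
  error: ["queued"],
};

function canTransition(from: FeatureState, to: FeatureState): boolean {
  return VALID_TRANSITIONS[from].indexOf(to) !== -1;
}

// Invariant check: fail loudly instead of silently entering an invalid state.
function transition(from: FeatureState, to: FeatureState): FeatureState {
  if (!canTransition(from, to)) {
    throw new Error(`Invalid feature state transition: ${from} -> ${to}`);
  }
  return to;
}
```

Keeping the table as data makes it straightforward to unit test every pair of states, including the ones that must be rejected.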
Scaling Limits
Feature Count Scalability:
- Current capacity: ~1000 features tested; UI performance degrades beyond that unless results are paginated
- Limit: 10K+ features cause >5s load times; memory usage ~100MB for metadata alone
- Scaling path: Implement feature database instead of file-per-feature; add ElasticSearch indexing for search; paginate API responses (50 per page); add feature archiving
Concurrent Auto-Mode Executions:
- Current capacity: maxConcurrency = 10 features; limited by Claude API rate limits
- Limit: Rate limit hits at ~4-5 simultaneous features with extended context (100K+ tokens)
- Scaling path: Implement token usage budgeting before feature start; queue features with estimated token cost; add provider-specific rate limit handling
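Token budgeting before feature start can be sketched as a FIFO admission check. `QueuedFeature`, `admitWithinBudget`, and the idea that each feature carries an `estimatedTokens` value are assumptions; how to estimate token cost is outside this sketch.

```typescript
interface QueuedFeature {
  id: string;
  estimatedTokens: number;
}

interface Admission {
  admitted: QueuedFeature[];
  deferred: QueuedFeature[];
}

// Admits features in queue order until the token budget or the concurrency
// limit is reached. Once one feature is deferred, all later ones are deferred
// too, so queue order is preserved.
function admitWithinBudget(
  queue: readonly QueuedFeature[],
  remainingTokens: number,
  maxConcurrency: number
): Admission {
  const admitted: QueuedFeature[] = [];
  const deferred: QueuedFeature[] = [];
  let budget = remainingTokens;
  let blocked = false;
  for (const feature of queue) {
    const fits =
      !blocked && admitted.length < maxConcurrency && feature.estimatedTokens <= budget;
    if (fits) {
      admitted.push(feature);
      budget -= feature.estimatedTokens;
    } else {
      blocked = true;
      deferred.push(feature);
    }
  }
  return { admitted, deferred };
}
```

Strict FIFO is the simplest policy; letting smaller features skip ahead would use the budget more fully at the cost of possibly starving large features.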
Terminal Session Count:
- Current capacity: ~100 active terminal sessions per server
- Limit: Memory grows unbounded; no session count limit enforced
- Scaling path: Add max session count with least-recently-used eviction; implement session federation for distributed setup
Worktree Disk Usage:
- Current capacity: 10K worktrees (~20GB with typical repos)
- Limit: `.worktrees` directory grows without cleanup; old worktrees accumulate
- Scaling path: Add worktree TTL (delete if not used for 30 days); implement cleanup job; add quota warnings at 50/80% disk
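The TTL rule in the scaling path can be sketched as a pure selection step that a cleanup job would run periodically. `WorktreeInfo` and `findStaleWorktrees` are hypothetical names, and how "last used" is tracked (for example, from directory modification time) is an assumption.

```typescript
interface WorktreeInfo {
  path: string;
  // Timestamp of last use in milliseconds since the epoch.
  lastUsedMs: number;
}

const DAY_MS = 24 * 60 * 60 * 1000;

// Returns worktrees that have been unused for strictly longer than the TTL.
function findStaleWorktrees(
  worktrees: readonly WorktreeInfo[],
  nowMs: number,
  ttlDays: number = 30
): WorktreeInfo[] {
  const cutoff = nowMs - ttlDays * DAY_MS;
  return worktrees.filter((worktree) => worktree.lastUsedMs < cutoff);
}
```

The cleanup job itself would then remove each selected path, for example with `git worktree remove`, ideally after confirming the worktree has no uncommitted changes.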
Dependencies at Risk
node-pty Beta Version:
- Risk: `node-pty@1.1.0-beta41` used for terminal emulation; beta status indicates possible instability
- Impact: Terminal features could break on minor platform changes; no guarantees on bug fixes
- Migration plan: Monitor releases for stable version; pin to specific commit if needed; test extensively on target platforms (macOS, Linux, Windows)
@anthropic-ai/claude-agent-sdk 0.1.x:
- Risk: Pre-1.0 version; SDK API may change in future releases; limited version stability guarantees
- Impact: Breaking changes could require significant refactoring; feature additions in SDK may not align with Automaker roadmap
- Migration plan: Pin to specific 0.1.x version; review SDK changelogs before upgrades; maintain SDK compatibility tests; consider fallback implementation for critical paths
@openai/codex-sdk 0.77.x:
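- Risk: Pre-1.0 SDK (0.77.x) with no API stability guarantees; OpenAI deprecated the original Codex API models in 2023, so long-term support for this SDK should be confirmed rather than assumed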
- Impact: Codex provider could become non-functional; error messages may not be actionable
- Migration plan: Monitor OpenAI roadmap for migration path; implement fallback to Claude for Codex requests; add deprecation warning in UI
Express 5.2.x Recent Major Version:
- Risk: Express 5 is no longer a release candidate (5.0.0 shipped as stable in 2024 after a long pre-release period), but it is a recent major version with breaking changes relative to Express 4
- Impact: Middleware written for Express 4 may have compatibility issues; upgrades still need regression testing
- Migration plan: Maintain compatibility layer for Express 5 API; test with latest major before release; document any version-specific workarounds
Missing Critical Features
Persistent Session Storage:
- Problem: Agent conversation sessions stored only in-memory; restart loses all chat history
- Blocks: Long-running analysis across server restarts; session recovery not possible
- Impact: Users must re-run entire analysis if server restarts; lost productivity
Rate Limit Awareness:
- Problem: No tracking of API usage relative to rate limits before executing features
- Blocks: Predictable concurrent feature execution; users frequently hit rate limits unexpectedly
- Impact: Feature execution fails with cryptic rate limit errors; poor user experience
Feature Dependency Visualization:
- Problem: Dependency-resolver package exists but no UI to visualize or manage dependencies
- Blocks: Users cannot plan feature order; complex dependencies not visible
- Impact: Features implemented in wrong order; blocking dependencies missed
Test Coverage Gaps
CLI Provider Integration:
- What's not tested: Actual CLI execution paths; environment setup; error recovery from CLI crashes
- Files: `apps/server/src/providers/cli-provider.ts`, `apps/server/src/lib/cli-detection.ts`
- Risk: Changes to CLI handling could break silently; detection logic not validated on target platforms
- Priority: High - affects all CLI-based providers (Cursor, Copilot, Codex)
Cursor Provider Platform-Specific Paths:
- What's not tested: Windows/Linux Cursor installation detection; version directory parsing; APPDATA environment variable handling
- Files: `apps/server/src/providers/cursor-provider.ts` (lines 267-498)
- Risk: Platform-specific bugs not caught; Cursor detection fails on non-standard installations
- Priority: High - Cursor is primary provider; platform differences critical
Event Hook System State Changes:
- What's not tested: Concurrent hook execution; cleanup on server shutdown; webhook delivery retries
- Files: `apps/server/src/services/event-hook-service.ts` (line 248 Promise.allSettled)
- Risk: Hooks may not execute in expected order; memory not cleaned up; webhooks lost on failure
- Priority: Medium - affects automation workflows
Error Classification for New Providers:
- What's not tested: Each provider's unique error patterns mapped to ErrorType enum; new provider errors not classified
- Files: `apps/server/src/lib/error-handler.ts` (lines 58-80), each provider error mapping
- Risk: User sees generic "unknown error" instead of actionable message; categorization regresses with new providers
- Priority: Medium - impacts user experience
Feature State Corruption Scenarios:
- What's not tested: Concurrent feature updates; partial writes with power loss; JSON parsing recovery
- Files: `apps/server/src/services/feature-loader.ts`, `@automaker/utils` (atomicWriteJson)
- Risk: Feature data corrupted on concurrent access; recovery incomplete; no validation before use
- Priority: High - data loss risk
Concerns audit: 2026-01-27