automaker/.planning/codebase/CONCERNS.md
Shirone c30cde242a docs: map existing codebase
2026-01-27 13:48:24 +01:00

Codebase Concerns

Analysis Date: 2026-01-27

Tech Debt

Loose Type Safety in Error Handling:

  • Issue: Multiple uses of as any type assertions bypass TypeScript safety, particularly in error context handling and provider responses
  • Files: apps/server/src/providers/claude-provider.ts (lines 318-322), apps/server/src/lib/error-handler.ts, apps/server/src/routes/settings/routes/update-global.ts
  • Impact: Errors could have unchecked properties; refactoring becomes risky without compiler assistance
  • Fix approach: Replace as any with proper type guards and discriminated unions; create helper functions for safe property access
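
A minimal sketch of the type-guard approach, assuming a hypothetical ProviderErrorLike shape (the real error fields in claude-provider.ts and error-handler.ts are not shown in this document):

```typescript
// Hypothetical error shape, for illustration only.
interface ProviderErrorLike {
  message: string;
  code?: string;
  status?: number;
}

// Narrow an unknown value to an indexable object without `as any`.
function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === 'object' && value !== null;
}

// Type guard: checks every field before the compiler trusts the shape.
function isProviderErrorLike(value: unknown): value is ProviderErrorLike {
  if (!isRecord(value)) return false;
  if (typeof value['message'] !== 'string') return false;
  if (value['code'] !== undefined && typeof value['code'] !== 'string') return false;
  if (value['status'] !== undefined && typeof value['status'] !== 'number') return false;
  return true;
}

// Safe accessor: returns undefined instead of reading unchecked properties.
function getErrorStatus(error: unknown): number | undefined {
  return isProviderErrorLike(error) ? error.status : undefined;
}
```

Call sites can then branch on getErrorStatus(err) and keep compiler checking through refactors.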

Missing Test Coverage for Critical Services:

  • Issue: Several core services explicitly excluded from test coverage thresholds due to integration complexity
  • Files: apps/server/vitest.config.ts (line 22), explicitly excluded: claude-usage-service.ts, mcp-test-service.ts, cli-provider.ts, cursor-provider.ts
  • Impact: Usage tracking, MCP integration, and CLI detection could break undetected; regression detection is limited
  • Fix approach: Create integration test fixtures for CLI providers; mock MCP SDK for mcp-test-service tests; add usage tracking unit tests with mocked API calls

Unused/Stub TODO Item Processing:

  • Issue: TodoWrite tool implementation exists but is partially integrated; tool name constants scattered across codex provider
  • Files: apps/server/src/providers/codex-tool-mapping.ts, apps/server/src/providers/codex-provider.ts
  • Impact: Todo list updates may not synchronize properly with all providers; unclear which providers support TodoWrite
  • Fix approach: Consolidate tool name constants; add provider capability flags for todo support

electron.ts Size and Complexity:

  • Issue: Single 3741-line file handles all Electron IPC, native bindings, and communication
  • Files: apps/ui/src/lib/electron.ts
  • Impact: Difficult to test; hard to isolate bugs; changes require full testing of all features; potential memory overhead from monolithic file
  • Fix approach: Split by responsibility (IPC, window management, file operations, debug tools); create separate bridge layers

Known Bugs

API Key Management Incomplete for Gemini:

  • Symptoms: Gemini API key verification endpoint not implemented despite other providers having verification
  • Files: apps/ui/src/components/views/settings-view/api-keys/hooks/use-api-key-management.ts (line 122)
  • Trigger: User tries to verify Gemini API key in settings
  • Workaround: Key verification skipped for Gemini; settings page still accepts and stores key

Orphaned Feature Detection Prone to False Positives:

  • Symptoms: Features are incorrectly marked as orphaned because the branch-matching logic doesn't account for all scenarios
  • Files: apps/server/src/services/auto-mode-service.ts (lines 5714-5773)
  • Trigger: Features whose branches were manually switched or rebased
  • Workaround: Manual cleanup via feature deletion; branch comparison is basic name matching only

Terminal Themes Incomplete:

  • Symptoms: Light themes (solarizedlight, github) map to the same generic lightTheme; no dedicated implementations exist
  • Files: apps/ui/src/config/terminal-themes.ts (lines 593-594)
  • Trigger: User selects solarizedlight or github terminal theme
  • Workaround: Uses generic light theme instead of specific scheme; visual appearance doesn't match expectation

Security Considerations

Process Environment Variable Exposure:

  • Risk: Child processes inherit all parent process.env including sensitive credentials (API keys, tokens)
  • Files: apps/server/src/providers/cursor-provider.ts (line 993), apps/server/src/providers/codex-provider.ts (line 1099)
  • Current mitigation: dotenv loads variables into process.env at app startup, which does not by itself isolate them from child processes; some providers pass a selective env
  • Recommendations: Use explicit allowlists for env vars passed to child processes (only pass REQUIRED_KEYS); audit all spawn calls for env handling; document which providers need which credentials
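
The allowlist recommendation could look roughly like this. BASE_ENV_KEYS and buildChildEnv are hypothetical names, and the set of variables each provider actually needs is an assumption to be confirmed during the audit:

```typescript
// Hypothetical baseline; adjust per platform and provider.
const BASE_ENV_KEYS: readonly string[] = ['PATH', 'HOME', 'LANG'];

// Copy only allowlisted variables from the parent environment.
function buildChildEnv(
  parentEnv: Record<string, string | undefined>,
  extraKeys: readonly string[] = []
): Record<string, string> {
  const childEnv: Record<string, string> = {};
  for (const key of [...BASE_ENV_KEYS, ...extraKeys]) {
    const value = parentEnv[key];
    if (value !== undefined) childEnv[key] = value;
  }
  return childEnv;
}
```

The result can be passed as the env option of Node's child_process.spawn, for example `spawn(cmd, args, { env: buildChildEnv(process.env, ['OPENAI_API_KEY']) })`, so that credentials for other providers never reach the child.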

Unvalidated Provider Tool Input:

  • Risk: Tool input from CLI providers (Cursor, Copilot, Codex) is partially validated through Record<string, unknown> patterns; execution context could be escaped
  • Files: apps/server/src/providers/codex-provider.ts (lines 506-543), apps/server/src/providers/tool-normalization.ts
  • Current mitigation: Status enums validated; tool names checked against allow-lists in some providers
  • Recommendations: Implement comprehensive schema validation for all tool inputs before execution; use zod or similar for runtime validation; add security tests for injection patterns
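
As an illustration of validating tool input before execution, here is a dependency-free sketch. The TodoItemInput shape and its status values are assumptions; in practice a zod schema, as recommended above, would replace the hand-written checks:

```typescript
// Hypothetical tool input shape, for illustration only.
interface TodoItemInput {
  content: string;
  status: string;
}

type ValidationResult<T> = { ok: true; value: T } | { ok: false; error: string };

const ALLOWED_KEYS = ['content', 'status'];
const ALLOWED_STATUSES = ['pending', 'in_progress', 'completed'];

function validateTodoItem(input: unknown): ValidationResult<TodoItemInput> {
  if (typeof input !== 'object' || input === null || Array.isArray(input)) {
    return { ok: false, error: 'input must be a plain object' };
  }
  const record = input as Record<string, unknown>;
  // Reject unknown keys so unexpected fields never reach execution.
  for (const key of Object.keys(record)) {
    if (ALLOWED_KEYS.indexOf(key) === -1) {
      return { ok: false, error: 'unexpected key: ' + key };
    }
  }
  const content = record['content'];
  const status = record['status'];
  if (typeof content !== 'string' || content.length === 0) {
    return { ok: false, error: 'content must be a non-empty string' };
  }
  if (typeof status !== 'string' || ALLOWED_STATUSES.indexOf(status) === -1) {
    return { ok: false, error: 'status is not an allowed value' };
  }
  return { ok: true, value: { content, status } };
}
```

Returning a discriminated result rather than throwing lets each provider decide how to report a rejected tool call.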

API Key Storage in Settings Files:

  • Risk: API keys stored in plaintext in ~/.automaker/settings.json and data/settings.json; file permissions may not be restricted
  • Files: apps/server/src/services/settings-service.ts (uses atomicWriteJson without file permission enforcement)
  • Current mitigation: Limited by file system permissions; Electron mode has single-user access
  • Recommendations: Encrypt sensitive settings fields (apiKeys, tokens); use OS credential stores (Keychain/Credential Manager) for production; add file permission checks on startup
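
A startup permission check might be built on a helper like the one below. This is only a sketch: isModeTooPermissive is a hypothetical name, and POSIX mode bits are not meaningful on Windows, so the check would need to be skipped there:

```typescript
// True when group or other have any access to the file (any of the 0o077 bits set).
function isModeTooPermissive(mode: number): boolean {
  return (mode & 0o077) !== 0;
}

// Human-readable permission bits, e.g. "644", for use in a warning message.
function formatPermissionBits(mode: number): string {
  return (mode & 0o777).toString(8);
}
```

The mode value would come from Node's fs.statSync(path).mode, and a settings file that fails the check could be tightened with fs.chmodSync(path, 0o600).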

Performance Bottlenecks

Synchronous Feature Loading at Startup:

  • Problem: All features are loaded at project load time; this blocks the UI for projects with 1000+ features
  • Files: apps/server/src/services/feature-loader.ts (line 230 uses Promise.all, but directory enumeration is synchronous)
  • Cause: Feature directory walk and JSON parsing is not paginated or lazy-loaded
  • Improvement path: Implement lazy loading with pagination (load first 50, fetch more on scroll); add caching layer with TTL; move to background indexing; add feature count limits with warnings
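
The pagination half of this improvement path can be sketched as follows. The Page type and paginate function are hypothetical, and this only slices an already-enumerated list; lazy reading of feature JSON from disk would be a separate change:

```typescript
interface Page<T> {
  items: T[];
  page: number;
  pageSize: number;
  totalItems: number;
  totalPages: number;
}

// Return one 1-indexed page of items; defaults to the 50-per-page size suggested above.
function paginate<T>(all: readonly T[], page: number, pageSize: number = 50): Page<T> {
  if (!Number.isInteger(page) || page < 1) {
    throw new RangeError('page must be a positive integer');
  }
  if (!Number.isInteger(pageSize) || pageSize < 1) {
    throw new RangeError('pageSize must be a positive integer');
  }
  const start = (page - 1) * pageSize;
  return {
    items: all.slice(start, start + pageSize),
    page,
    pageSize,
    totalItems: all.length,
    totalPages: Math.ceil(all.length / pageSize),
  };
}
```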

Auto-Mode Concurrency at Max Can Exceed Rate Limits:

  • Problem: maxConcurrency = 10 can quickly exhaust Claude API rate limits if all features execute simultaneously
  • Files: apps/server/src/services/auto-mode-service.ts (line 2931 Promise.all for concurrent agents)
  • Cause: No adaptive backoff; no API usage tracking before queuing; hint mentions reducing concurrency but doesn't enforce it
  • Improvement path: Integrate with claude-usage-service to check remaining quota before starting features; implement exponential backoff on 429 errors; add per-model rate limit tracking
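
Exponential backoff on 429 errors could be sketched like this. It assumes rate-limit errors expose a numeric status of 429, which may not match what the providers actually throw; all names here are hypothetical:

```typescript
// Assumption: a rate-limit error carries `status === 429`.
function isRateLimitError(error: unknown): boolean {
  return (
    typeof error === 'object' &&
    error !== null &&
    (error as { status?: unknown }).status === 429
  );
}

// Delay doubles with each attempt and is capped at maxMs.
function backoffDelayMs(attempt: number, baseMs: number = 1000, maxMs: number = 60000): number {
  return Math.min(maxMs, baseMs * Math.pow(2, attempt));
}

// Retry an operation on rate-limit errors only; other errors propagate immediately.
async function withRateLimitRetry<T>(
  operation: () => Promise<T>,
  maxRetries: number = 5,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms))
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await operation();
    } catch (error) {
      if (!isRateLimitError(error) || attempt >= maxRetries) throw error;
      await sleep(backoffDelayMs(attempt));
    }
  }
}
```

The sketch omits jitter for brevity; adding a random component to the delay would help keep concurrent agents from retrying in lockstep.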

Terminal Session Memory Leak Risk:

  • Problem: Terminal sessions accumulate in memory; expired sessions not cleaned up reliably
  • Files: apps/server/src/routes/terminal/common.ts (line 66 cleanup runs every 5 minutes, but only for tokens)
  • Cause: Cleanup interval is arbitrary; session map not bounded; no session lifespan limit
  • Improvement path: Implement LRU eviction with max session count; reduce cleanup interval to 1 minute; add memory usage monitoring; auto-close idle sessions after 30 minutes
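
A bounded session map with least-recently-used eviction might look like the sketch below. LruSessionMap is a hypothetical class, not the structure used in common.ts; it relies on the fact that a JavaScript Map iterates in insertion order:

```typescript
class LruSessionMap<V> {
  private readonly entries = new Map<string, V>();

  constructor(
    private readonly maxSessions: number,
    // Called for each evicted session, e.g. to kill the underlying PTY.
    private readonly onEvict: (id: string, session: V) => void = () => {}
  ) {}

  get(id: string): V | undefined {
    if (!this.entries.has(id)) return undefined;
    const session = this.entries.get(id) as V;
    // Re-insert so this entry becomes the most recently used.
    this.entries.delete(id);
    this.entries.set(id, session);
    return session;
  }

  set(id: string, session: V): void {
    this.entries.delete(id);
    this.entries.set(id, session);
    // Evict from the front (least recently used) until within the limit.
    while (this.entries.size > this.maxSessions) {
      const oldestId = this.entries.keys().next().value as string;
      const oldest = this.entries.get(oldestId) as V;
      this.entries.delete(oldestId);
      this.onEvict(oldestId, oldest);
    }
  }

  get size(): number {
    return this.entries.size;
  }
}
```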

Large File Content Loading Without Limits:

  • Problem: File content loaded entirely into memory; describe-file.ts truncates at 50KB but loads all content first
  • Files: apps/server/src/routes/context/routes/describe-file.ts (line 128)
  • Cause: Synchronous file read; no streaming; no check before reading large files
  • Improvement path: Check file size before reading; stream large files; add file size warnings; implement chunked processing for analysis

Fragile Areas

Provider Factory Model Resolution:

  • Files: apps/server/src/providers/provider-factory.ts, apps/server/src/providers/simple-query-service.ts
  • Why fragile: Each provider interprets model strings differently; no central registry; model aliases resolved at multiple layers (model-resolver, provider-specific maps, CLI validation)
  • Safe modification: Add integration tests for each model alias per provider; create model capability matrix; centralize model validation before dispatch
  • Test coverage: No dedicated tests; relies on E2E; no isolated unit tests for model resolution

WebSocket Session Authentication:

  • Files: apps/server/src/lib/auth.ts (line 40 setInterval), apps/server/src/index.ts (token validation per message)
  • Why fragile: Session tokens generated and validated at multiple points; no single source of truth; expiration is not atomic
  • Safe modification: Add tests for token expiration edge cases; ensure cleanup removes all references; log all auth failures
  • Test coverage: Auth middleware tested, but not session lifecycle
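
One way to make expiration edge cases testable is to keep token state in a single store with an injectable clock. This is a hypothetical sketch, not the structure of auth.ts:

```typescript
class SessionTokenStore {
  // token -> expiry timestamp in milliseconds
  private readonly expiresAtByToken = new Map<string, number>();

  constructor(
    private readonly ttlMs: number,
    // Injectable clock so tests can control time.
    private readonly now: () => number = () => Date.now()
  ) {}

  issue(token: string): void {
    this.expiresAtByToken.set(token, this.now() + this.ttlMs);
  }

  // Validation and expiry happen in one place, so a token cannot be
  // considered valid by one code path and expired by another.
  isValid(token: string): boolean {
    const expiresAt = this.expiresAtByToken.get(token);
    if (expiresAt === undefined) return false;
    if (this.now() >= expiresAt) {
      this.expiresAtByToken.delete(token);
      return false;
    }
    return true;
  }

  // Periodic cleanup; returns how many tokens were removed.
  purgeExpired(): number {
    const current = this.now();
    let removed = 0;
    this.expiresAtByToken.forEach((expiresAt, token) => {
      if (current >= expiresAt) {
        this.expiresAtByToken.delete(token);
        removed++;
      }
    });
    return removed;
  }
}
```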

Auto-Mode Feature State Machine:

  • Files: apps/server/src/services/auto-mode-service.ts (lines 465-600)
  • Why fragile: Multiple states (running, queued, completed, error) managed across different methods; no explicit state transition validation; error recovery is defensive (catches all, logs, continues)
  • Safe modification: Create explicit state enum with valid transitions; add invariant checks; unit test state transitions with all error cases
  • Test coverage: Gaps in error recovery paths; no tests for concurrent state changes
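
The explicit-transition idea can be sketched with a lookup table. The states come from the list above, but the allowed transitions are an assumption and would need to be checked against auto-mode-service.ts:

```typescript
type FeatureState = 'queued' | 'running' | 'completed' | 'error';

// Assumed transition table; the real service may allow different moves.
const VALID_TRANSITIONS: Record<FeatureState, readonly FeatureState[]> = {
  queued: ['running', 'error'],
  running: ['completed', 'error'],
  completed: [],
  error: ['queued'], // retry re-queues the feature
};

function canTransition(from: FeatureState, to: FeatureState): boolean {
  return VALID_TRANSITIONS[from].indexOf(to) !== -1;
}

// Invariant check: fail loudly instead of silently entering an invalid state.
function transition(from: FeatureState, to: FeatureState): FeatureState {
  if (!canTransition(from, to)) {
    throw new Error('Invalid feature state transition: ' + from + ' -> ' + to);
  }
  return to;
}
```

With the table in one place, unit tests can enumerate every pair of states and assert which ones are rejected.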

Scaling Limits

Feature Count Scalability:

  • Current capacity: ~1000 features tested; UI performance degrades beyond that, and pagination becomes necessary
  • Limit: 10K+ features cause >5s load times; memory usage ~100MB for metadata alone
  • Scaling path: Implement feature database instead of file-per-feature; add Elasticsearch indexing for search; paginate API responses (50 per page); add feature archiving

Concurrent Auto-Mode Executions:

  • Current capacity: maxConcurrency = 10 features; limited by Claude API rate limits
  • Limit: Rate limit hits at ~4-5 simultaneous features with extended context (100K+ tokens)
  • Scaling path: Implement token usage budgeting before feature start; queue features with estimated token cost; add provider-specific rate limit handling
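
Token budgeting before feature start could be approximated as below. The QueuedFeature shape and the idea that a remaining-token figure is available up front are assumptions; how estimates are produced is out of scope here:

```typescript
interface QueuedFeature {
  id: string;
  estimatedTokens: number;
}

// Pick features in queue order while respecting both the token budget
// and the concurrency limit. Features that do not fit are skipped so a
// smaller feature later in the queue can still start.
function selectFeaturesWithinBudget(
  queue: readonly QueuedFeature[],
  remainingTokens: number,
  maxConcurrency: number
): QueuedFeature[] {
  const selected: QueuedFeature[] = [];
  let budget = remainingTokens;
  for (const feature of queue) {
    if (selected.length >= maxConcurrency) break;
    if (feature.estimatedTokens > budget) continue;
    selected.push(feature);
    budget -= feature.estimatedTokens;
  }
  return selected;
}
```

Skipping oversized features favors throughput; a strict first-in-first-out policy would stop at the first feature that does not fit.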

Terminal Session Count:

  • Current capacity: ~100 active terminal sessions per server
  • Limit: Memory grows unbounded; no session count limit enforced
  • Scaling path: Add max session count with least-recently-used eviction; implement session federation for distributed setup

Worktree Disk Usage:

  • Current capacity: 10K worktrees (~20GB with typical repos)
  • Limit: .worktrees directory grows without cleanup; old worktrees accumulate
  • Scaling path: Add worktree TTL (delete if not used for 30 days); implement cleanup job; add quota warnings at 50/80% disk
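
The TTL rule could be expressed as a pure selection step that a cleanup job then acts on. WorktreeInfo and its lastUsedAt field are hypothetical; how last use is tracked (directory mtime, a metadata file, and so on) is left open:

```typescript
interface WorktreeInfo {
  path: string;
  lastUsedAt: Date;
}

const THIRTY_DAYS_MS = 30 * 24 * 60 * 60 * 1000;

// Return worktrees unused for longer than the TTL; the caller decides
// whether to delete them or only warn.
function findStaleWorktrees(
  worktrees: readonly WorktreeInfo[],
  now: Date,
  ttlMs: number = THIRTY_DAYS_MS
): WorktreeInfo[] {
  return worktrees.filter(
    (worktree) => now.getTime() - worktree.lastUsedAt.getTime() > ttlMs
  );
}
```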

Dependencies at Risk

node-pty Beta Version:

  • Risk: node-pty@1.1.0-beta41 used for terminal emulation; beta status indicates possible instability
  • Impact: Terminal features could break on minor platform changes; no guarantees on bug fixes
  • Migration plan: Monitor releases for stable version; pin to specific commit if needed; test extensively on target platforms (macOS, Linux, Windows)

@anthropic-ai/claude-agent-sdk 0.1.x:

  • Risk: Pre-1.0 version; SDK API may change in future releases; limited version stability guarantees
  • Impact: Breaking changes could require significant refactoring; feature additions in SDK may not align with Automaker roadmap
  • Migration plan: Pin to specific 0.1.x version; review SDK changelogs before upgrades; maintain SDK compatibility tests; consider fallback implementation for critical paths

@openai/codex-sdk 0.77.x:

  • Risk: Pre-1.0 SDK; OpenAI deprecated its earlier Codex models, so long-term support for the SDK is uncertain and it may be archived or unsupported
  • Impact: Codex provider could become non-functional; error messages may not be actionable
  • Migration plan: Monitor OpenAI roadmap for migration path; implement fallback to Claude for Codex requests; add deprecation warning in UI

Express 5.2.x Maturity:

  • Risk: Express 5 reached a stable release in 2024, but the 5.x line is still relatively new compared with Express 4; full middleware compatibility is not guaranteed
  • Impact: Minor version updates could include breaking changes; middleware compatibility issues possible
  • Migration plan: Maintain compatibility layer for Express 5 API; test with latest major before release; document any version-specific workarounds

Missing Critical Features

Persistent Session Storage:

  • Problem: Agent conversation sessions stored only in-memory; restart loses all chat history
  • Blocks: Long-running analysis across server restarts; session recovery not possible
  • Impact: Users must re-run entire analysis if server restarts; lost productivity

Rate Limit Awareness:

  • Problem: No tracking of API usage relative to rate limits before executing features
  • Blocks: Predictable concurrent feature execution; users frequently hit rate limits unexpectedly
  • Impact: Feature execution fails with cryptic rate limit errors; poor user experience

Feature Dependency Visualization:

  • Problem: A dependency-resolver package exists, but there is no UI to visualize or manage dependencies
  • Blocks: Users cannot plan feature order; complex dependencies not visible
  • Impact: Features implemented in wrong order; blocking dependencies missed

Test Coverage Gaps

CLI Provider Integration:

  • What's not tested: Actual CLI execution paths; environment setup; error recovery from CLI crashes
  • Files: apps/server/src/providers/cli-provider.ts, apps/server/src/lib/cli-detection.ts
  • Risk: Changes to CLI handling could break silently; detection logic not validated on target platforms
  • Priority: High - affects all CLI-based providers (Cursor, Copilot, Codex)

Cursor Provider Platform-Specific Paths:

  • What's not tested: Windows/Linux Cursor installation detection; version directory parsing; APPDATA environment variable handling
  • Files: apps/server/src/providers/cursor-provider.ts (lines 267-498)
  • Risk: Platform-specific bugs not caught; Cursor detection fails on non-standard installations
  • Priority: High - Cursor is primary provider; platform differences critical

Event Hook System State Changes:

  • What's not tested: Concurrent hook execution; cleanup on server shutdown; webhook delivery retries
  • Files: apps/server/src/services/event-hook-service.ts (line 248 Promise.allSettled)
  • Risk: Hooks may not execute in expected order; memory not cleaned up; webhooks lost on failure
  • Priority: Medium - affects automation workflows

Error Classification for New Providers:

  • What's not tested: Each provider's unique error patterns mapped to ErrorType enum; new provider errors not classified
  • Files: apps/server/src/lib/error-handler.ts (lines 58-80), each provider error mapping
  • Risk: User sees generic "unknown error" instead of actionable message; categorization regresses with new providers
  • Priority: Medium - impacts user experience

Feature State Corruption Scenarios:

  • What's not tested: Concurrent feature updates; partial writes with power loss; JSON parsing recovery
  • Files: apps/server/src/services/feature-loader.ts, @automaker/utils (atomicWriteJson)
  • Risk: Feature data corrupted on concurrent access; recovery incomplete; no validation before use
  • Priority: High - data loss risk

Concerns audit: 2026-01-27