mirror of https://github.com/AutoMaker-Org/automaker.git synced 2026-02-01 08:13:37 +00:00

Shirone c30cde242a docs: map existing codebase

- STACK.md - Technologies and dependencies
- ARCHITECTURE.md - System design and patterns
- STRUCTURE.md - Directory layout
- CONVENTIONS.md - Code style and patterns
- TESTING.md - Test structure
- INTEGRATIONS.md - External services
- CONCERNS.md - Technical debt and issues

2026-01-27 13:48:24 +01:00

Codebase Concerns

Analysis Date: 2026-01-27

Tech Debt

Loose Type Safety in Error Handling:

Issue: Multiple uses of as any type assertions bypass TypeScript safety, particularly in error context handling and provider responses
Files: apps/server/src/providers/claude-provider.ts (lines 318-322), apps/server/src/lib/error-handler.ts, apps/server/src/routes/settings/routes/update-global.ts
Impact: Errors could have unchecked properties; refactoring becomes risky without compiler assistance
Fix approach: Replace as any with proper type guards and discriminated unions; create helper functions for safe property access
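
As a minimal sketch of the type-guard approach, the following narrows an unknown caught value without as any. The ProviderErrorLike shape and both function names are hypothetical; the actual error fields used in claude-provider.ts and error-handler.ts are not shown in this document.

```typescript
// Hypothetical error shape, for illustration only.
interface ProviderErrorLike {
  message: string;
  code?: string;
  status?: number;
}

// Type guard: narrows `unknown` to ProviderErrorLike without `as any`.
function isProviderErrorLike(value: unknown): value is ProviderErrorLike {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v["message"] === "string" &&
    (v["code"] === undefined || typeof v["code"] === "string") &&
    (v["status"] === undefined || typeof v["status"] === "number")
  );
}

// Safe accessor for catch sites: returns undefined when the shape does not match.
function getErrorStatus(err: unknown): number | undefined {
  return isProviderErrorLike(err) ? err.status : undefined;
}
```

With a guard like this, a catch block can branch on getErrorStatus(err) === 429 and the compiler still checks every property access.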

Missing Test Coverage for Critical Services:

Issue: Several core services explicitly excluded from test coverage thresholds due to integration complexity
Files: apps/server/vitest.config.ts (line 22), explicitly excluded: claude-usage-service.ts, mcp-test-service.ts, cli-provider.ts, cursor-provider.ts
Impact: Usage tracking, MCP integration, and CLI detection could break undetected; regression detection is limited
Fix approach: Create integration test fixtures for CLI providers; mock MCP SDK for mcp-test-service tests; add usage tracking unit tests with mocked API calls

Unused/Stub TODO Item Processing:

Issue: TodoWrite tool implementation exists but is partially integrated; tool name constants scattered across codex provider
Files: apps/server/src/providers/codex-tool-mapping.ts, apps/server/src/providers/codex-provider.ts
Impact: Todo list updates may not synchronize properly with all providers; unclear which providers support TodoWrite
Fix approach: Consolidate tool name constants; add provider capability flags for todo support

electron.ts Size and Complexity:

Issue: Single 3741-line file handles all Electron IPC, native bindings, and communication
Files: apps/ui/src/lib/electron.ts
Impact: Difficult to test; hard to isolate bugs; changes require full testing of all features; potential memory overhead from monolithic file
Fix approach: Split by responsibility (IPC, window management, file operations, debug tools); create separate bridge layers

Known Bugs

API Key Management Incomplete for Gemini:

Symptoms: Gemini API key verification endpoint not implemented despite other providers having verification
Files: apps/ui/src/components/views/settings-view/api-keys/hooks/use-api-key-management.ts (line 122)
Trigger: User tries to verify Gemini API key in settings
Workaround: Key verification skipped for Gemini; settings page still accepts and stores key

Orphaned Features Detection Vulnerable to False Positives:

Symptoms: Features can be incorrectly marked as orphaned when the branch matching logic doesn't account for all scenarios
Files: apps/server/src/services/auto-mode-service.ts (lines 5714-5773)
Trigger: Features whose branch was manually switched or rebased
Workaround: Manual cleanup via feature deletion; branch comparison is basic name matching only

Terminal Themes Incomplete:

Symptoms: Light themes (solarizedlight, github) map to the same generic lightTheme; no dedicated implementations
Files: apps/ui/src/config/terminal-themes.ts (lines 593-594)
Trigger: User selects solarizedlight or github terminal theme
Workaround: Uses generic light theme instead of specific scheme; visual appearance doesn't match expectation

Security Considerations

Process Environment Variable Exposure:

Risk: Child processes inherit all parent process.env including sensitive credentials (API keys, tokens)
Files: apps/server/src/providers/cursor-provider.ts (line 993), apps/server/src/providers/codex-provider.ts (line 1099)
Current mitigation: Selective env passing to some providers; dotenv only loads variables into process.env at app startup and does not by itself isolate them from child processes
Recommendations: Use explicit allowlists for env vars passed to child processes (only pass REQUIRED_KEYS); audit all spawn calls for env handling; document which providers need which credentials
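
The allowlist recommendation could look roughly like the sketch below. The key names in BASE_KEYS and PROVIDER_KEYS are assumptions for illustration; which variables each provider really needs is exactly what the audit suggests documenting.

```typescript
type Env = Record<string, string | undefined>;

// Hypothetical allowlists; the real per-provider requirements are not listed here.
const BASE_KEYS = ["PATH", "HOME"];
const PROVIDER_KEYS: Record<string, string[]> = {
  codex: ["OPENAI_API_KEY"],
  cursor: ["CURSOR_API_KEY"],
};

// Build a child environment that contains only allowlisted variables that are set.
function buildChildEnv(provider: string, parentEnv: Env): Record<string, string> {
  const allowed = [...BASE_KEYS, ...(PROVIDER_KEYS[provider] ?? [])];
  const childEnv: Record<string, string> = {};
  for (const key of allowed) {
    const value = parentEnv[key];
    if (value !== undefined) childEnv[key] = value;
  }
  return childEnv;
}
```

The result would be passed as the env option of child_process.spawn in place of the inherited process.env, so a Codex child process never sees unrelated credentials.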

Unvalidated Provider Tool Input:

Risk: Tool input from CLI providers (Cursor, Copilot, Codex) is partially validated through Record<string, unknown> patterns; execution context could be escaped
Files: apps/server/src/providers/codex-provider.ts (lines 506-543), apps/server/src/providers/tool-normalization.ts
Current mitigation: Status enums validated; tool names checked against allow-lists in some providers
Recommendations: Implement comprehensive schema validation for all tool inputs before execution; use zod or similar for runtime validation; add security tests for injection patterns
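
A rough illustration of validating tool input before execution follows. It uses hand-written checks to stay dependency-free; a schema library such as zod, as recommended above, would express the same rules declaratively. The RunCommandInput shape and validateRunCommandInput are hypothetical and do not correspond to a real tool in the codebase.

```typescript
// Discriminated result so callers must handle the failure branch explicitly.
type Validation<T> = { ok: true; value: T } | { ok: false; error: string };

// Hypothetical input shape for an illustrative "run command" tool.
interface RunCommandInput {
  command: string;
  timeoutMs?: number;
}

function validateRunCommandInput(input: unknown): Validation<RunCommandInput> {
  if (typeof input !== "object" || input === null || Array.isArray(input)) {
    return { ok: false, error: "input must be an object" };
  }
  const raw = input as Record<string, unknown>;

  // Reject unknown fields instead of silently passing them through.
  const allowedKeys = new Set(["command", "timeoutMs"]);
  for (const key of Object.keys(raw)) {
    if (!allowedKeys.has(key)) return { ok: false, error: `unexpected field: ${key}` };
  }

  const command = raw["command"];
  if (typeof command !== "string" || command.length === 0) {
    return { ok: false, error: "command must be a non-empty string" };
  }

  const rawTimeout = raw["timeoutMs"];
  if (rawTimeout !== undefined) {
    if (typeof rawTimeout !== "number" || !Number.isFinite(rawTimeout) || rawTimeout <= 0) {
      return { ok: false, error: "timeoutMs must be a positive number" };
    }
    return { ok: true, value: { command, timeoutMs: rawTimeout } };
  }
  return { ok: true, value: { command } };
}
```

Rejecting unexpected fields is a deliberate choice here: a strict schema fails closed when a provider starts sending input the server has not been taught to handle.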

API Key Storage in Settings Files:

Risk: API keys stored in plaintext in ~/.automaker/settings.json and data/settings.json; file permissions may not be restricted
Files: apps/server/src/services/settings-service.ts, uses atomicWriteJson without file permission enforcement
Current mitigation: Limited by file system permissions; Electron mode has single-user access
Recommendations: Encrypt sensitive settings fields (apiKeys, tokens); use OS credential stores (Keychain/Credential Manager) for production; add file permission checks on startup
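
The startup permission check could be as small as the sketch below, which flags a settings file whose POSIX mode grants any access to group or others. isTooPermissive is a hypothetical helper name, not an existing function in settings-service.ts.

```typescript
// True when any group/other permission bit is set, i.e. the mode is looser than 0600.
function isTooPermissive(mode: number): boolean {
  return (mode & 0o077) !== 0;
}
```

In Node, the mode would come from fs.statSync(path).mode and could be tightened with fs.chmodSync(path, 0o600). These mode bits are only meaningful on POSIX systems, so a Windows build would need a different check.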

Performance Bottlenecks

Synchronous Feature Loading at Startup:

Problem: All features loaded synchronously at project load; blocks UI with 1000+ features
Files: apps/server/src/services/feature-loader.ts (line 230 uses Promise.all, but the feature enumeration itself is synchronous)
Cause: Feature directory walk and JSON parsing is not paginated or lazy-loaded
Improvement path: Implement lazy loading with pagination (load first 50, fetch more on scroll); add caching layer with TTL; move to background indexing; add feature count limits with warnings
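
One way to picture the pagination step: page over the list of feature IDs first, and parse JSON only for the IDs on the current page. The sketch below shows just the paging arithmetic; Page and paginate are illustrative names, and the default of 50 follows the figure suggested above.

```typescript
interface Page<T> {
  items: T[];
  total: number;
  // Offset to request next, or null when this is the last page.
  nextOffset: number | null;
}

function paginate<T>(all: readonly T[], offset: number, limit = 50): Page<T> {
  // Clamp the offset so out-of-range requests return an empty page instead of throwing.
  const start = Math.max(0, Math.min(offset, all.length));
  const end = Math.min(start + limit, all.length);
  return {
    items: all.slice(start, end),
    total: all.length,
    nextOffset: end < all.length ? end : null,
  };
}
```

The UI would request offset 0 on load and follow nextOffset as the user scrolls.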

Auto-Mode Concurrency at Max Can Exceed Rate Limits:

Problem: maxConcurrency = 10 can quickly exhaust Claude API rate limits if all features execute simultaneously
Files: apps/server/src/services/auto-mode-service.ts (line 2931 Promise.all for concurrent agents)
Cause: No adaptive backoff; no API usage tracking before queuing; hint mentions reducing concurrency but doesn't enforce it
Improvement path: Integrate with claude-usage-service to check remaining quota before starting features; implement exponential backoff on 429 errors; add per-model rate limit tracking
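
The backoff part of this path can be sketched as a pure delay calculation. The base delay, the cap, and both function names are assumptions; the sketch also assumes the caller can read a Retry-After value (in seconds) from the 429 response when the provider sends one.

```typescript
// Exponential backoff with a cap: baseMs * 2^attempt, never more than maxMs.
function backoffDelayMs(attempt: number, baseMs = 1000, maxMs = 60000): number {
  return Math.min(maxMs, baseMs * Math.pow(2, attempt));
}

// Prefer the server-provided Retry-After (seconds) when present, else fall back to backoff.
function retryDelayMs(attempt: number, retryAfterSeconds?: number): number {
  if (retryAfterSeconds !== undefined && retryAfterSeconds >= 0) {
    return retryAfterSeconds * 1000;
  }
  return backoffDelayMs(attempt);
}
```

In practice some random jitter is usually added to the computed delay so that ten concurrent agents do not all retry at the same instant.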

Terminal Session Memory Leak Risk:

Problem: Terminal sessions accumulate in memory; expired sessions not cleaned up reliably
Files: apps/server/src/routes/terminal/common.ts (line 66 cleanup runs every 5 minutes, but only for tokens)
Cause: Cleanup interval is arbitrary; session map not bounded; no session lifespan limit
Improvement path: Implement LRU eviction with max session count; reduce cleanup interval to 1 minute; add memory usage monitoring; auto-close idle sessions after 30 minutes
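
A bounded session map with least-recently-used eviction might look like the following. It is a sketch that relies on Map preserving insertion order; LruSessionStore and its onEvict callback are hypothetical and not part of the existing terminal code.

```typescript
class LruSessionStore<V> {
  private readonly sessions = new Map<string, V>();
  private readonly maxSessions: number;
  private readonly onEvict: (id: string, session: V) => void;

  constructor(maxSessions: number, onEvict: (id: string, session: V) => void = () => {}) {
    this.maxSessions = maxSessions;
    this.onEvict = onEvict;
  }

  get(id: string): V | undefined {
    const session = this.sessions.get(id);
    if (session === undefined) return undefined;
    // Re-insert so this entry becomes the most recently used.
    this.sessions.delete(id);
    this.sessions.set(id, session);
    return session;
  }

  set(id: string, session: V): void {
    this.sessions.delete(id);
    this.sessions.set(id, session);
    // Evict from the front of the Map (least recently used) until within bounds.
    while (this.sessions.size > this.maxSessions) {
      const oldest = this.sessions.entries().next();
      if (oldest.done) break;
      const [oldId, oldSession] = oldest.value;
      this.sessions.delete(oldId);
      this.onEvict(oldId, oldSession);
    }
  }

  get size(): number {
    return this.sessions.size;
  }
}
```

The onEvict hook is where the underlying terminal process would be closed, so an evicted session does not keep running without a map entry pointing at it.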

Large File Content Loading Without Limits:

Problem: File content loaded entirely into memory; describe-file.ts truncates at 50KB but loads all content first
Files: apps/server/src/routes/context/routes/describe-file.ts (line 128)
Cause: Synchronous file read; no streaming; no check before reading large files
Improvement path: Check file size before reading; stream large files; add file size warnings; implement chunked processing for analysis

Fragile Areas

Provider Factory Model Resolution:

Files: apps/server/src/providers/provider-factory.ts, apps/server/src/providers/simple-query-service.ts
Why fragile: Each provider interprets model strings differently; no central registry; model aliases resolved at multiple layers (model-resolver, provider-specific maps, CLI validation)
Safe modification: Add integration tests for each model alias per provider; create model capability matrix; centralize model validation before dispatch
Test coverage: No dedicated tests; relies on E2E; no isolated unit tests for model resolution

WebSocket Session Authentication:

Files: apps/server/src/lib/auth.ts (line 40 setInterval), apps/server/src/index.ts (token validation per message)
Why fragile: Session tokens generated and validated at multiple points; no single source of truth; expiration is not atomic
Safe modification: Add tests for token expiration edge cases; ensure cleanup removes all references; log all auth failures
Test coverage: Auth middleware tested, but not session lifecycle

Auto-Mode Feature State Machine:

Files: apps/server/src/services/auto-mode-service.ts (lines 465-600)
Why fragile: Multiple states (running, queued, completed, error) managed across different methods; no explicit state transition validation; error recovery is defensive (catches all, logs, continues)
Safe modification: Create explicit state enum with valid transitions; add invariant checks; unit test state transitions with all error cases
Test coverage: Gaps in error recovery paths; no tests for concurrent state changes
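
The explicit-transition idea can be sketched with a lookup table. A string union stands in for the enum mentioned above, and the transition rules shown (including error back to queued for retries) are assumptions, since the service's actual rules are not documented here.

```typescript
type FeatureState = "queued" | "running" | "completed" | "error";

// Assumed transition table, for illustration only.
const VALID_TRANSITIONS: Record<FeatureState, readonly FeatureState[]> = {
  queued: ["running", "error"],
  running: ["completed", "error"],
  completed: [],
  error: ["queued"], // retry
};

function canTransition(from: FeatureState, to: FeatureState): boolean {
  return VALID_TRANSITIONS[from].includes(to);
}

// Invariant check: throws instead of silently accepting an invalid state change.
function transition(from: FeatureState, to: FeatureState): FeatureState {
  if (!canTransition(from, to)) {
    throw new Error(`Invalid feature state transition: ${from} -> ${to}`);
  }
  return to;
}
```

Routing every state change through one function like transition gives unit tests a single place to exercise all valid and invalid pairs.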

Scaling Limits

Feature Count Scalability:

Current capacity: ~1000 features tested; beyond that, UI performance degrades and pagination is required
Limit: 10K+ features cause >5s load times; memory usage ~100MB for metadata alone
Scaling path: Implement feature database instead of file-per-feature; add ElasticSearch indexing for search; paginate API responses (50 per page); add feature archiving

Concurrent Auto-Mode Executions:

Current capacity: maxConcurrency = 10 features; limited by Claude API rate limits
Limit: Rate limit hits at ~4-5 simultaneous features with extended context (100K+ tokens)
Scaling path: Implement token usage budgeting before feature start; queue features with estimated token cost; add provider-specific rate limit handling
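
Token budgeting before feature start could be approximated with a simple reservation counter, as sketched below. TokenBudget is a hypothetical class, and the estimated token cost per feature is assumed to come from elsewhere, for example a heuristic on context size.

```typescript
class TokenBudget {
  private reserved = 0;
  private readonly limit: number;

  constructor(limit: number) {
    this.limit = limit;
  }

  // Reserve tokens for a feature; returns false when it should stay queued.
  tryReserve(estimatedTokens: number): boolean {
    if (this.reserved + estimatedTokens > this.limit) return false;
    this.reserved += estimatedTokens;
    return true;
  }

  // Give the reservation back once the feature finishes or fails.
  release(estimatedTokens: number): void {
    this.reserved = Math.max(0, this.reserved - estimatedTokens);
  }

  get available(): number {
    return this.limit - this.reserved;
  }
}
```

With a per-minute limit as the budget, two features with 100K-token contexts would be admitted against a 250K limit while a third waits, whatever maxConcurrency allows.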

Terminal Session Count:

Current capacity: ~100 active terminal sessions per server
Limit: Memory grows unbounded; no session count limit enforced
Scaling path: Add max session count with least-recently-used eviction; implement session federation for distributed setup

Worktree Disk Usage:

Current capacity: 10K worktrees (~20GB with typical repos)
Limit: .worktrees directory grows without cleanup; old worktrees accumulate
Scaling path: Add worktree TTL (delete if not used for 30 days); implement cleanup job; add quota warnings at 50/80% disk
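
The TTL rule reduces to a filter over last-used timestamps. In the sketch below, WorktreeInfo and findStaleWorktrees are illustrative names, and how lastUsedMs is obtained (for example from directory modification times) is left as an assumption.

```typescript
interface WorktreeInfo {
  path: string;
  // Last time the worktree was used, in milliseconds since the Unix epoch.
  lastUsedMs: number;
}

const DAY_MS = 24 * 60 * 60 * 1000;

// Return worktrees that have not been used within the TTL window.
function findStaleWorktrees(
  worktrees: readonly WorktreeInfo[],
  nowMs: number,
  ttlDays = 30,
): WorktreeInfo[] {
  const cutoff = nowMs - ttlDays * DAY_MS;
  return worktrees.filter((w) => w.lastUsedMs < cutoff);
}
```

A cleanup job would call this periodically and remove each returned path with git worktree remove, ideally after confirming the worktree has no uncommitted changes.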

Dependencies at Risk

node-pty Beta Version:

Risk: node-pty@1.1.0-beta41 used for terminal emulation; beta status indicates possible instability
Impact: Terminal features could break on minor platform changes; no guarantees on bug fixes
Migration plan: Monitor releases for stable version; pin to specific commit if needed; test extensively on target platforms (macOS, Linux, Windows)

@anthropic-ai/claude-agent-sdk 0.1.x:

Risk: Pre-1.0 version; SDK API may change in future releases; limited version stability guarantees
Impact: Breaking changes could require significant refactoring; feature additions in SDK may not align with Automaker roadmap
Migration plan: Pin to specific 0.1.x version; review SDK changelogs before upgrades; maintain SDK compatibility tests; consider fallback implementation for critical paths

@openai/codex-sdk 0.77.x:

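Risk: Pre-1.0 SDK whose API may change between releases; note that the legacy Codex completion models deprecated by OpenAI are a separate product from the Codex agent this SDK drives
Impact: A breaking SDK change could make the Codex provider non-functional; error messages may not be actionable
Migration plan: Pin to a specific 0.77.x version; review SDK changelogs before upgrades; implement fallback to Claude for Codex requests; surface a warning in the UI if the provider becomes unavailable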

Express 5.2.x:

Risk: Express 5 is a stable release line (5.0.0 shipped in 2024), but it introduced breaking changes relative to Express 4, notably stricter route path matching and the removal of long-deprecated APIs
Impact: Middleware and route patterns written for Express 4 may be incompatible
Migration plan: Verify middleware compatibility with Express 5; test with the latest 5.x release before shipping; document any version-specific workarounds

Missing Critical Features

Persistent Session Storage:

Problem: Agent conversation sessions stored only in-memory; restart loses all chat history
Blocks: Long-running analysis across server restarts; session recovery not possible
Impact: Users must re-run entire analysis if server restarts; lost productivity

Rate Limit Awareness:

Problem: No tracking of API usage relative to rate limits before executing features
Blocks: Predictable concurrent feature execution; users frequently hit rate limits unexpectedly
Impact: Feature execution fails with cryptic rate limit errors; poor user experience

Feature Dependency Visualization:

Problem: Dependency-resolver package exists but no UI to visualize or manage dependencies
Blocks: Users cannot plan feature order; complex dependencies not visible
Impact: Features implemented in wrong order; blocking dependencies missed

Test Coverage Gaps

CLI Provider Integration:

What's not tested: Actual CLI execution paths; environment setup; error recovery from CLI crashes
Files: apps/server/src/providers/cli-provider.ts, apps/server/src/lib/cli-detection.ts
Risk: Changes to CLI handling could break silently; detection logic not validated on target platforms
Priority: High - affects all CLI-based providers (Cursor, Copilot, Codex)

Cursor Provider Platform-Specific Paths:

What's not tested: Windows/Linux Cursor installation detection; version directory parsing; APPDATA environment variable handling
Files: apps/server/src/providers/cursor-provider.ts (lines 267-498)
Risk: Platform-specific bugs not caught; Cursor detection fails on non-standard installations
Priority: High - Cursor is primary provider; platform differences critical

Event Hook System State Changes:

What's not tested: Concurrent hook execution; cleanup on server shutdown; webhook delivery retries
Files: apps/server/src/services/event-hook-service.ts (line 248 Promise.allSettled)
Risk: Hooks may not execute in expected order; memory not cleaned up; webhooks lost on failure
Priority: Medium - affects automation workflows

Error Classification for New Providers:

What's not tested: Each provider's unique error patterns mapped to ErrorType enum; new provider errors not classified
Files: apps/server/src/lib/error-handler.ts (lines 58-80), each provider error mapping
Risk: User sees generic "unknown error" instead of actionable message; categorization regresses with new providers
Priority: Medium - impacts user experience

Feature State Corruption Scenarios:

What's not tested: Concurrent feature updates; partial writes with power loss; JSON parsing recovery
Files: apps/server/src/services/feature-loader.ts, @automaker/utils (atomicWriteJson)
Risk: Feature data corrupted on concurrent access; recovery incomplete; no validation before use
Priority: High - data loss risk

Concerns audit: 2026-01-27
