refactor: optimize token usage, deduplicate code, fix bugs across agents

Token reduction (~40% per session, ~2.3M fewer tokens per 200-feature project):
- Agent-type-specific tool lists: coding 9, testing 5, init 5 (was 19 for all)
- Right-sized max_turns: coding 300, testing 100 (was 1000 for all)
- Trimmed coding prompt template (~150 lines removed)
- Streamlined testing prompt with batch support
- YOLO mode now strips browser testing instructions from prompt
- Added Grep, WebFetch, WebSearch to expand project session

Performance improvements:
- Rate limit retries start at ~15s with jitter (was fixed 60s)
- Post-spawn delay reduced to 0.5s (was 2s)
- Orchestrator consolidated to 1 DB query per loop (was 5-7)
- Testing agents batch 3 features per session (was 1)
- Smart context compaction preserves critical state, discards noise

Bug fixes:
- Removed ghost feature_release_testing MCP tool (wasted tokens every test session)
- Forward all 9 Vertex AI env vars to chat sessions (was missing 3)
- Fix DetachedInstanceError risk in test batch ORM access
- Prevent duplicate testing of same features in parallel mode

Code deduplication:
- _get_project_path(): 9 copies -> 1 shared utility (project_helpers.py)
- validate_project_name(): 9 copies -> 2 variants in 1 file (validation.py)
- ROOT_DIR: 10 copies -> 1 definition (chat_constants.py)
- API_ENV_VARS: 4 copies -> 1 source of truth (env_constants.py)

Security hardening:
- Unified sensitive directory blocklist (14 dirs, was two divergent lists)
- Cached get_blocked_paths() for O(1) directory listing checks
- Terminal security warning when ALLOW_REMOTE=1 exposes WebSocket
- 20 new security tests for EXTRA_READ_PATHS blocking
- Extracted _validate_command_list() and _validate_pkill_processes() helpers

Type safety:
- 87 mypy errors -> 0 across 58 source files
- Installed types-PyYAML for proper yaml stub types
- Fixed SQLAlchemy Column[T] coercions across all routers

Dead code removed:
- 13 files deleted (~2,679 lines): unused UI components, debug logs, outdated docs
- 7 unused npm packages removed (Radix UI components with 0 imports)
- AgentAvatar.tsx reduced from 615 -> 119 lines (SVGs extracted to mascotData.tsx)

New CLI options:
- --testing-batch-size (1-5) for parallel mode test batching
- --testing-feature-ids for direct multi-feature testing

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
This commit is contained in:
Auto
2026-02-01 13:16:24 +02:00
parent dc5bcc4ae9
commit 94e0b05cb1
57 changed files with 1974 additions and 4300 deletions

View File

@@ -25,25 +25,13 @@ from .assistant_database import (
create_conversation,
get_messages,
)
from .chat_constants import API_ENV_VARS, ROOT_DIR
# Load environment variables from .env file if present
load_dotenv()
logger = logging.getLogger(__name__)
# Root directory of the project
ROOT_DIR = Path(__file__).parent.parent.parent
# Environment variables to pass through to Claude CLI for API configuration
API_ENV_VARS = [
"ANTHROPIC_BASE_URL",
"ANTHROPIC_AUTH_TOKEN",
"API_TIMEOUT_MS",
"ANTHROPIC_DEFAULT_SONNET_MODEL",
"ANTHROPIC_DEFAULT_OPUS_MODEL",
"ANTHROPIC_DEFAULT_HAIKU_MODEL",
]
# Read-only feature MCP tools
READONLY_FEATURE_MCP_TOOLS = [
"mcp__features__feature_get_stats",
@@ -215,7 +203,7 @@ class AssistantChatSession:
# Create a new conversation if we don't have one
if is_new_conversation:
conv = create_conversation(self.project_dir, self.project_name)
self.conversation_id = conv.id
self.conversation_id = int(conv.id) # type coercion: Column[int] -> int
yield {"type": "conversation_created", "conversation_id": self.conversation_id}
# Build permissions list for assistant access (read + feature management)
@@ -270,7 +258,11 @@ class AssistantChatSession:
system_cli = shutil.which("claude")
# Build environment overrides for API configuration
sdk_env = {var: os.getenv(var) for var in API_ENV_VARS if os.getenv(var)}
sdk_env: dict[str, str] = {}
for var in API_ENV_VARS:
value = os.getenv(var)
if value:
sdk_env[var] = value
# Determine model from environment or use default
# This allows using alternative APIs (e.g., GLM via z.ai) that may not support Claude model names
@@ -286,7 +278,7 @@ class AssistantChatSession:
# This avoids Windows command line length limit (~8191 chars)
setting_sources=["project"],
allowed_tools=[*READONLY_BUILTIN_TOOLS, *ASSISTANT_FEATURE_TOOLS],
mcp_servers=mcp_servers,
mcp_servers=mcp_servers, # type: ignore[arg-type] # SDK accepts dict config at runtime
permission_mode="bypassPermissions",
max_turns=100,
cwd=str(self.project_dir.resolve()),
@@ -312,6 +304,8 @@ class AssistantChatSession:
greeting = f"Hello! I'm your project assistant for **{self.project_name}**. I can help you understand the codebase, explain features, and answer questions about the project. What would you like to know?"
# Store the greeting in the database
# conversation_id is guaranteed non-None here (set on line 206 above)
assert self.conversation_id is not None
add_message(self.project_dir, self.conversation_id, "assistant", greeting)
yield {"type": "text", "content": greeting}