Compare commits


6 Commits

Author SHA1 Message Date
Auto
79d02a1410 refactor: improve Vertex AI model conversion and add tests
- Rename compute_mode -> convert_model_for_vertex for clarity
- Move `import re` to module top-level (stdlib convention)
- Use greedy regex quantifier for more readable pattern matching
- Restore PEP 8 double blank line between top-level definitions
- Add test_client.py with 10 unit tests covering:
  - Vertex disabled (env unset, "0", empty)
  - Standard conversions (Opus, Sonnet, Haiku)
  - Edge cases (already-converted, non-Claude, no date suffix, empty)

Follow-up improvements from PR #129 review.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-30 10:20:03 +02:00
Leon van Zyl
813fcde18b Merge pull request #129 from derhally/zd/vertex-support
feat: Adds Vertex AI support for Claude models
2026-01-30 10:16:18 +02:00
Auto
b693de2999 fix: improve parallel orchestrator agent tracking clarity and cleanup
- Add comment on running_coding_agents explaining why feature_id keying
  is safe (start_feature checks for duplicates before spawning), since
  the sister dict running_testing_agents required PID keying to avoid
  overwrites from concurrent same-feature testing
- Clear running_testing_agents dict in stop_all() after killing
  processes so get_status() doesn't report stale agent counts while
  _on_agent_complete callbacks are still in flight

Follow-up to PR #130 (runaway testing agent spawn fix).

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-30 09:07:40 +02:00
Leon van Zyl
21fe28f51d Merge pull request #130 from ipodishima/fix/too_many_agents_spawned
fix: prevent runaway testing agent spawning (critical)
2026-01-30 09:03:52 +02:00
Marian Paul
80b6af7b2b fix: prevent runaway testing agent spawning (critical)
running_testing_agents was keyed by feature_id, so when multiple agents
tested the same feature, each spawn overwrote the previous dict entry.
The count stayed at 1 regardless of how many processes were actually
running, causing the maintain loop to spawn agents indefinitely (~130+).

Re-key the dict by PID so each agent gets a unique entry and the
existing max-agent guards work correctly. Also check the return value
of _spawn_testing_agent() to break the loop on failure.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-29 15:02:08 +01:00
Zeid Derhally
099f52b19c Add support for using vertex AI with claude models
2026-01-29 06:13:34 -05:00
4 changed files with 182 additions and 18 deletions

View File

@@ -22,6 +22,18 @@
# Example: EXTRA_READ_PATHS=/Volumes/Data/dev,/Users/shared/libs
# EXTRA_READ_PATHS=
# Google Cloud Vertex AI Configuration (Optional)
# To use Claude via Vertex AI on Google Cloud Platform, uncomment and set these variables.
# Requires: gcloud CLI installed and authenticated (run: gcloud auth application-default login)
# Note: Use @ instead of - in model names (e.g., claude-opus-4-5@20251101)
#
# CLAUDE_CODE_USE_VERTEX=1
# CLOUD_ML_REGION=us-east5
# ANTHROPIC_VERTEX_PROJECT_ID=your-gcp-project-id
# ANTHROPIC_DEFAULT_OPUS_MODEL=claude-opus-4-5@20251101
# ANTHROPIC_DEFAULT_SONNET_MODEL=claude-sonnet-4-5@20250929
# ANTHROPIC_DEFAULT_HAIKU_MODEL=claude-3-5-haiku@20241022
# GLM/Alternative API Configuration (Optional)
# To use Zhipu AI's GLM models instead of Claude, uncomment and set these variables.
# This only affects AutoCoder - your global Claude Code settings remain unchanged.

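The Vertex AI block above only declares the variables; for illustration, here is a minimal standalone sketch of how a launcher could sanity-check that the configuration is complete and that model overrides use the @-style names. This is not part of the PR, and check_vertex_env is a hypothetical helper name.

import os

def check_vertex_env() -> list[str]:
    """Return a list of problems with the Vertex AI environment, if any."""
    problems = []
    if os.getenv("CLAUDE_CODE_USE_VERTEX") != "1":
        return problems  # Vertex mode disabled; nothing to check
    for var in ("CLOUD_ML_REGION", "ANTHROPIC_VERTEX_PROJECT_ID"):
        if not os.getenv(var):
            problems.append(f"{var} must be set when CLAUDE_CODE_USE_VERTEX=1")
    for var in ("ANTHROPIC_DEFAULT_OPUS_MODEL",
                "ANTHROPIC_DEFAULT_SONNET_MODEL",
                "ANTHROPIC_DEFAULT_HAIKU_MODEL"):
        model = os.getenv(var, "")
        if model and "@" not in model:
            problems.append(f"{var} should use @ before the date (e.g., claude-opus-4-5@20251101)")
    return problems

if __name__ == "__main__":
    for problem in check_vertex_env():
        print(f"WARNING: {problem}")
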
View File

@@ -7,6 +7,7 @@ Functions for creating and configuring the Claude Agent SDK client.
import json
import os
import re
import shutil
import sys
from pathlib import Path
@@ -31,7 +32,7 @@ DEFAULT_PLAYWRIGHT_HEADLESS = True
DEFAULT_PLAYWRIGHT_BROWSER = "firefox"
# Environment variables to pass through to Claude CLI for API configuration
-# These allow using alternative API endpoints (e.g., GLM via z.ai) without
+# These allow using alternative API endpoints (e.g., GLM via z.ai, Vertex AI) without
# affecting the user's global Claude Code settings
API_ENV_VARS = [
"ANTHROPIC_BASE_URL", # Custom API endpoint (e.g., https://api.z.ai/api/anthropic)
@@ -40,6 +41,10 @@ API_ENV_VARS = [
"ANTHROPIC_DEFAULT_SONNET_MODEL", # Model override for Sonnet
"ANTHROPIC_DEFAULT_OPUS_MODEL", # Model override for Opus
"ANTHROPIC_DEFAULT_HAIKU_MODEL", # Model override for Haiku
# Vertex AI configuration
"CLAUDE_CODE_USE_VERTEX", # Enable Vertex AI mode (set to "1")
"CLOUD_ML_REGION", # GCP region (e.g., us-east5)
"ANTHROPIC_VERTEX_PROJECT_ID", # GCP project ID
]
# Extra read paths for cross-project file access (read-only)
@@ -64,6 +69,35 @@ EXTRA_READ_PATHS_BLOCKLIST = {
".netrc",
}
def convert_model_for_vertex(model: str) -> str:
    """
    Convert model name format for Vertex AI compatibility.

    Vertex AI uses @ to separate model name from version (e.g., claude-opus-4-5@20251101)
    while the Anthropic API uses - (e.g., claude-opus-4-5-20251101).

    Args:
        model: Model name in Anthropic format (with hyphens)

    Returns:
        Model name in Vertex AI format (with @ before date) if Vertex AI is enabled,
        otherwise returns the model unchanged.
    """
    # Only convert if Vertex AI is enabled
    if os.getenv("CLAUDE_CODE_USE_VERTEX") != "1":
        return model

    # Pattern: claude-{name}-{version}-{date} -> claude-{name}-{version}@{date}
    # Example: claude-opus-4-5-20251101 -> claude-opus-4-5@20251101
    # The date is always 8 digits at the end
    match = re.match(r'^(claude-.+)-(\d{8})$', model)
    if match:
        base_name, date = match.groups()
        return f"{base_name}@{date}"

    # If already in @ format or doesn't match expected pattern, return as-is
    return model
def get_playwright_headless() -> bool:
"""
@@ -400,14 +434,19 @@ def create_client(
if value:
sdk_env[var] = value
-# Detect alternative API mode (Ollama or GLM)
+# Detect alternative API mode (Ollama, GLM, or Vertex AI)
base_url = sdk_env.get("ANTHROPIC_BASE_URL", "")
-is_alternative_api = bool(base_url)
+is_vertex = sdk_env.get("CLAUDE_CODE_USE_VERTEX") == "1"
+is_alternative_api = bool(base_url) or is_vertex
is_ollama = "localhost:11434" in base_url or "127.0.0.1:11434" in base_url
+model = convert_model_for_vertex(model)
if sdk_env:
print(f" - API overrides: {', '.join(sdk_env.keys())}")
-if is_ollama:
+if is_vertex:
+project_id = sdk_env.get("ANTHROPIC_VERTEX_PROJECT_ID", "unknown")
+region = sdk_env.get("CLOUD_ML_REGION", "unknown")
+print(f" - Vertex AI Mode: Using GCP project '{project_id}' with model '{model}' in region '{region}'")
+elif is_ollama:
print(" - Ollama Mode: Using local models")
elif "ANTHROPIC_BASE_URL" in sdk_env:
print(f" - GLM Mode: Using {sdk_env['ANTHROPIC_BASE_URL']}")
@@ -486,7 +525,7 @@ def create_client(
# Enable extended context beta for better handling of long sessions.
# This provides up to 1M tokens of context with automatic compaction.
# See: https://docs.anthropic.com/en/api/beta-headers
-# Disabled for alternative APIs (Ollama, GLM) as they don't support Claude-specific betas.
+# Disabled for alternative APIs (Ollama, GLM, Vertex AI) as they don't support this beta.
betas=[] if is_alternative_api else ["context-1m-2025-08-07"],
# Note on context management:
# The Claude Agent SDK handles context management automatically through the

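As a quick illustration of the pattern used in convert_model_for_vertex above (a standalone sketch, not part of the diff): the greedy .+ still yields the intended split because the anchored trailing (\d{8})$ group forces the match to break at the last hyphen before an 8-digit date, and non-matching names fall through unchanged.

import re

VERTEX_MODEL_RE = re.compile(r'^(claude-.+)-(\d{8})$')

for name in ("claude-opus-4-5-20251101",    # hyphen before date -> converted
             "claude-3-5-haiku-20241022",   # hyphen before date -> converted
             "claude-opus-4-5@20251101",    # already Vertex format -> no match, unchanged
             "gpt-4o"):                     # not a Claude model -> no match, unchanged
    m = VERTEX_MODEL_RE.match(name)
    print(f"{m.group(1)}@{m.group(2)}" if m else name)
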
View File

@@ -169,9 +169,11 @@ class ParallelOrchestrator:
# Thread-safe state
self._lock = threading.Lock()
# Coding agents: feature_id -> process
+# Safe to key by feature_id because start_feature() checks for duplicates before spawning
self.running_coding_agents: dict[int, subprocess.Popen] = {}
-# Testing agents: feature_id -> process (feature being tested)
-self.running_testing_agents: dict[int, subprocess.Popen] = {}
+# Testing agents: pid -> (feature_id, process)
+# Keyed by PID (not feature_id) because multiple agents can test the same feature
+self.running_testing_agents: dict[int, tuple[int, subprocess.Popen]] = {}
# Legacy alias for backward compatibility
self.running_agents = self.running_coding_agents
self.abort_events: dict[int, threading.Event] = {}
@@ -429,7 +431,10 @@ class ParallelOrchestrator:
# Spawn outside lock (I/O bound operation)
print(f"[DEBUG] Spawning testing agent ({spawn_index}/{desired})", flush=True)
-self._spawn_testing_agent()
+success, msg = self._spawn_testing_agent()
+if not success:
+debug_log.log("TESTING", f"Spawn failed, stopping: {msg}")
+return
def start_feature(self, feature_id: int, resume: bool = False) -> tuple[bool, str]:
"""Start a single coding agent for a feature.
@@ -611,8 +616,9 @@ class ParallelOrchestrator:
debug_log.log("TESTING", f"FAILED to spawn testing agent: {e}")
return False, f"Failed to start testing agent: {e}"
-# Register process with feature ID (same pattern as coding agents)
-self.running_testing_agents[feature_id] = proc
+# Register process by PID (not feature_id) to avoid overwrites
+# when multiple agents test the same feature
+self.running_testing_agents[proc.pid] = (feature_id, proc)
testing_count = len(self.running_testing_agents)
# Start output reader thread with feature ID (same as coding agents)
@@ -795,11 +801,8 @@ class ParallelOrchestrator:
"""
if agent_type == "testing":
with self._lock:
-# Remove from dict by finding the feature_id for this proc
-for fid, p in list(self.running_testing_agents.items()):
-if p is proc:
-del self.running_testing_agents[fid]
-break
+# Remove by PID
+self.running_testing_agents.pop(proc.pid, None)
status = "completed" if return_code == 0 else "failed"
print(f"Feature #{feature_id} testing {status}", flush=True)
@@ -898,12 +901,17 @@ class ParallelOrchestrator:
with self._lock:
testing_items = list(self.running_testing_agents.items())
-for feature_id, proc in testing_items:
+for pid, (feature_id, proc) in testing_items:
result = kill_process_tree(proc, timeout=5.0)
debug_log.log("STOP", f"Killed testing agent for feature #{feature_id} (PID {proc.pid})",
debug_log.log("STOP", f"Killed testing agent for feature #{feature_id} (PID {pid})",
status=result.status, children_found=result.children_found,
children_terminated=result.children_terminated, children_killed=result.children_killed)
+# Clear dict so get_status() doesn't report stale agents while
+# _on_agent_complete callbacks are still in flight.
+with self._lock:
+self.running_testing_agents.clear()
async def run_loop(self):
"""Main orchestration loop."""
self.is_running = True

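To make the keying bug fixed in PR #130 concrete, here is a toy sketch (not the orchestrator code; the names and PIDs are made up) showing why a feature_id-keyed registry undercounts concurrent testers of the same feature while a PID-keyed registry does not.

# Toy illustration: three testing agents for the same feature.
from types import SimpleNamespace

procs = [SimpleNamespace(pid=pid) for pid in (101, 102, 103)]  # stand-ins for Popen objects
feature_id = 7

by_feature: dict[int, object] = {}
by_pid: dict[int, tuple[int, object]] = {}

for proc in procs:
    by_feature[feature_id] = proc          # each spawn overwrites the previous entry
    by_pid[proc.pid] = (feature_id, proc)  # each spawn gets its own entry

print(len(by_feature))  # 1 -> max-agent guard never triggers, spawning continues
print(len(by_pid))      # 3 -> guard sees the real number of running agents
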
test_client.py (new file, 105 lines added)
View File

@@ -0,0 +1,105 @@
#!/usr/bin/env python3
"""
Client Utility Tests
====================

Tests for the client module utility functions.

Run with: python test_client.py
"""

import os
import unittest

from client import convert_model_for_vertex


class TestConvertModelForVertex(unittest.TestCase):
    """Tests for convert_model_for_vertex function."""

    def setUp(self):
        """Save original env state."""
        self._orig_vertex = os.environ.get("CLAUDE_CODE_USE_VERTEX")

    def tearDown(self):
        """Restore original env state."""
        if self._orig_vertex is None:
            os.environ.pop("CLAUDE_CODE_USE_VERTEX", None)
        else:
            os.environ["CLAUDE_CODE_USE_VERTEX"] = self._orig_vertex

    # --- Vertex AI disabled (default) ---

    def test_returns_model_unchanged_when_vertex_disabled(self):
        os.environ.pop("CLAUDE_CODE_USE_VERTEX", None)
        self.assertEqual(
            convert_model_for_vertex("claude-opus-4-5-20251101"),
            "claude-opus-4-5-20251101",
        )

    def test_returns_model_unchanged_when_vertex_set_to_zero(self):
        os.environ["CLAUDE_CODE_USE_VERTEX"] = "0"
        self.assertEqual(
            convert_model_for_vertex("claude-opus-4-5-20251101"),
            "claude-opus-4-5-20251101",
        )

    def test_returns_model_unchanged_when_vertex_set_to_empty(self):
        os.environ["CLAUDE_CODE_USE_VERTEX"] = ""
        self.assertEqual(
            convert_model_for_vertex("claude-sonnet-4-5-20250929"),
            "claude-sonnet-4-5-20250929",
        )

    # --- Vertex AI enabled: standard conversions ---

    def test_converts_opus_model(self):
        os.environ["CLAUDE_CODE_USE_VERTEX"] = "1"
        self.assertEqual(
            convert_model_for_vertex("claude-opus-4-5-20251101"),
            "claude-opus-4-5@20251101",
        )

    def test_converts_sonnet_model(self):
        os.environ["CLAUDE_CODE_USE_VERTEX"] = "1"
        self.assertEqual(
            convert_model_for_vertex("claude-sonnet-4-5-20250929"),
            "claude-sonnet-4-5@20250929",
        )

    def test_converts_haiku_model(self):
        os.environ["CLAUDE_CODE_USE_VERTEX"] = "1"
        self.assertEqual(
            convert_model_for_vertex("claude-3-5-haiku-20241022"),
            "claude-3-5-haiku@20241022",
        )

    # --- Vertex AI enabled: already converted or non-matching ---

    def test_already_vertex_format_unchanged(self):
        os.environ["CLAUDE_CODE_USE_VERTEX"] = "1"
        self.assertEqual(
            convert_model_for_vertex("claude-opus-4-5@20251101"),
            "claude-opus-4-5@20251101",
        )

    def test_non_claude_model_unchanged(self):
        os.environ["CLAUDE_CODE_USE_VERTEX"] = "1"
        self.assertEqual(
            convert_model_for_vertex("gpt-4o"),
            "gpt-4o",
        )

    def test_model_without_date_suffix_unchanged(self):
        os.environ["CLAUDE_CODE_USE_VERTEX"] = "1"
        self.assertEqual(
            convert_model_for_vertex("claude-opus-4-5"),
            "claude-opus-4-5",
        )

    def test_empty_string_unchanged(self):
        os.environ["CLAUDE_CODE_USE_VERTEX"] = "1"
        self.assertEqual(convert_model_for_vertex(""), "")


if __name__ == "__main__":
    unittest.main()