Files
autocoder/server/services/chat_constants.py
Auto 4f102e7bc2 fix: resolve false-positive rate limit and one-message-behind in chat sessions
The Claude Code CLI v2.1.45+ emits a `rate_limit_event` message type that
the Python SDK v0.1.19 cannot parse, raising MessageParseError. Two bugs
resulted:

1. **False-positive rate limit**: check_rate_limit_error() matched
   "rate_limit" in the exception string "Unknown message type:
   rate_limit_event" via both an explicit type check and a regex fallback,
   triggering 15-19s backoff + query re-send on every session.

2. **One-message-behind**: The MessageParseError killed the
   receive_response() async generator, but the CLI subprocess was still
   alive with buffered response data. Catching and returning meant the
   response was never consumed. The next send_message() would read the
   previous response first, creating a one-behind offset.

Changes:

- chat_constants.py: check_rate_limit_error() now returns (False, None)
  for any MessageParseError, blocking both false-positive paths. Added
  safe_receive_response() helper that retries receive_response() on
  MessageParseError — the SDK's decoupled producer/consumer architecture
  (anyio memory channel) allows the new generator to continue reading
  remaining messages without data loss. Removed calculate_rate_limit_backoff
  re-export and MAX_CHAT_RATE_LIMIT_RETRIES constant.

- spec_chat_session.py, assistant_chat_session.py, expand_chat_session.py:
  Replaced retry-with-backoff loops with safe_receive_response() wrapper.
  Removed asyncio.sleep backoff, query re-send, and rate_limited yield.
  Cleaned up unused imports (asyncio, calculate_rate_limit_backoff,
  MAX_CHAT_RATE_LIMIT_RETRIES).

- agent.py: Added inner retry loop around receive_response() with same
  MessageParseError skip-and-restart pattern. Removed early-return that
  truncated responses.

- types.ts: Removed SpecChatRateLimitedMessage,
  AssistantChatRateLimitedMessage, and their union entries.

- useSpecChat.ts, useAssistantChat.ts, useExpandChat.ts: Removed dead
  'rate_limited' case handlers.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 13:00:16 +02:00

112 lines
4.3 KiB
Python

"""
Chat Session Constants
======================
Shared constants for all chat session types (assistant, spec, expand).
The canonical ``API_ENV_VARS`` list lives in ``env_constants.py`` at the
project root and is re-exported here for convenience so that existing
imports (``from .chat_constants import API_ENV_VARS``) continue to work.
"""
import logging
import sys
from pathlib import Path
from typing import Any, AsyncGenerator
# -------------------------------------------------------------------
# Root directory of the autoforge project (repository root).
# Used throughout the server package whenever the repo root is needed.
# -------------------------------------------------------------------
ROOT_DIR = Path(__file__).parent.parent.parent
# Ensure the project root is on sys.path so we can import env_constants
# from the root-level module without requiring a package install.
_root_str = str(ROOT_DIR)
if _root_str not in sys.path:
sys.path.insert(0, _root_str)
# -------------------------------------------------------------------
# Environment variables forwarded to Claude CLI subprocesses.
# Single source of truth lives in env_constants.py at the project root.
# Re-exported here so existing ``from .chat_constants import API_ENV_VARS``
# imports continue to work unchanged.
# -------------------------------------------------------------------
from env_constants import API_ENV_VARS # noqa: E402, F401
from rate_limit_utils import is_rate_limit_error, parse_retry_after # noqa: E402, F401
logger = logging.getLogger(__name__)
def check_rate_limit_error(exc: Exception) -> tuple[bool, int | None]:
"""Inspect an exception and determine if it represents a rate-limit.
Returns ``(is_rate_limit, retry_seconds)``. ``retry_seconds`` is the
parsed Retry-After value when available, otherwise ``None`` (caller
should use exponential backoff).
"""
# MessageParseError = unknown CLI message type (e.g. "rate_limit_event").
# These are informational events, NOT actual rate limit errors.
# The word "rate_limit" in the type name would false-positive the regex.
if type(exc).__name__ == "MessageParseError":
return False, None
# For all other exceptions: match error text against known rate-limit patterns
exc_str = str(exc)
if is_rate_limit_error(exc_str):
retry = parse_retry_after(exc_str)
return True, retry
return False, None
async def safe_receive_response(client: Any, log: logging.Logger) -> AsyncGenerator:
"""Wrap ``client.receive_response()`` to skip ``MessageParseError``.
The Claude Code CLI may emit message types (e.g. ``rate_limit_event``)
that the installed Python SDK does not recognise, causing
``MessageParseError`` which kills the async generator. The CLI
subprocess is still alive and the SDK uses a buffered memory channel,
so we restart ``receive_response()`` to continue reading remaining
messages without losing data.
"""
max_retries = 50
retries = 0
while True:
try:
async for msg in client.receive_response():
yield msg
return # Normal completion
except Exception as exc:
if type(exc).__name__ == "MessageParseError":
retries += 1
if retries > max_retries:
log.error(f"Too many unrecognized CLI messages ({retries}), stopping")
return
log.warning(f"Ignoring unrecognized message from Claude CLI: {exc}")
continue
raise
async def make_multimodal_message(content_blocks: list[dict]) -> AsyncGenerator[dict, None]:
"""Yield a single multimodal user message in Claude Agent SDK format.
The Claude Agent SDK's ``query()`` method accepts either a plain string
or an ``AsyncIterable[dict]`` for custom message formats. This helper
wraps a list of content blocks (text and/or images) in the expected
envelope.
Args:
content_blocks: List of content-block dicts, e.g.
``[{"type": "text", "text": "..."}, {"type": "image", ...}]``.
Yields:
A single dict representing the user message.
"""
yield {
"type": "user",
"message": {"role": "user", "content": content_blocks},
"parent_tool_use_id": None,
"session_id": "default",
}