Compare commits
29 Commits
feature/mo ... feature/re
| Author | SHA1 | Date |
|---|---|---|
| | cba0536c45 | |
| | dba8b1e6c8 | |
| | 6bdcf4ccc2 | |
| | add5cfb6c2 | |
| | c984b57585 | |
| | 258ef787c7 | |
| | 2c3f89cf53 | |
| | a67fce3991 | |
| | d3856c0cf9 | |
| | 2cad9e93b8 | |
| | d6be620cec | |
| | 57a7da14a3 | |
| | 84cb9a2009 | |
| | ac51db990c | |
| | ae88d63c7c | |
| | dd29cf895f | |
| | 56ab2ee309 | |
| | d0d164e8ea | |
| | ca1b9a5fba | |
| | 4482853222 | |
| | 4dc73a31eb | |
| | 329b5d9b9b | |
| | 0da9cf156d | |
| | d810e2f57e | |
| | 81514b0676 | |
| | a3ec2c223d | |
| | ee8b82947d | |
| | 1aa6dbe51a | |
| | 80d9298b34 | |
@@ -6,4 +6,6 @@ screenshoots
.DS_Store
.vscode
.idea
.env
.env
.blog
docs
247 README.md
@@ -2,7 +2,6 @@

> This is a tool for routing Claude Code requests to different models, and you can customize any request.



## Usage
@@ -25,32 +24,216 @@ npm install -g @musistudio/claude-code-router

ccr code
```

## Plugin [Beta]

The plugin system allows users to rewrite Claude Code prompts and define a custom router. The plugin path is `$HOME/.claude-code-router/plugins`. Currently, there are two demos available:

1. [custom router](https://github.com/musistudio/claude-code-router/blob/dev/custom-prompt/plugins/deepseek.js)
2. [rewrite prompt](https://github.com/musistudio/claude-code-router/blob/dev/custom-prompt/plugins/gemini.js)

You need to move them to the `$HOME/.claude-code-router/plugins` directory and configure `usePlugin` in `$HOME/.claude-code-router/config.json`, like this:

4. Configure routing [optional]

Set up your `~/.claude-code-router/config.json` file like this:

```json
{
  "usePlugin": "gemini",
  "LOG": true,
  "OPENAI_API_KEY": "sk-xxx",
  "OPENAI_BASE_URL": "https://api.deepseek.com",
  "OPENAI_MODEL": "deepseek-chat",
  "Providers": [
    {
      "name": "openrouter",
      "api_base_url": "https://openrouter.ai/api/v1",
      "api_key": "sk-xxx",
      "models": [
        "google/gemini-2.5-pro-preview",
        "anthropic/claude-sonnet-4",
        "anthropic/claude-3.5-sonnet",
        "anthropic/claude-3.7-sonnet:thinking"
      ]
    },
    {
      "name": "deepseek",
      "api_base_url": "https://api.deepseek.com",
      "api_key": "sk-xxx",
      "models": ["deepseek-reasoner"]
    },
    {
      "name": "ollama",
      "api_base_url": "http://localhost:11434/v1",
      "api_key": "ollama",
      "models": ["qwen2.5-coder:latest"]
    }
  ],
  "Router": {
    "background": "ollama,qwen2.5-coder:latest",
    "think": "deepseek,deepseek-reasoner",
    "longContext": "openrouter,google/gemini-2.5-pro-preview"
  }
}
```

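Each `Router` value is a plain `provider,model` string. As a minimal sketch of how such a string could be resolved against the `Providers` list (this is an illustration, not the project's actual code; `resolveRoute` and the inline provider list are hypothetical):

```javascript
// Hypothetical sketch: resolve a Router value such as
// "ollama,qwen2.5-coder:latest" against the Providers list from config.json.
const providers = [
  { name: "ollama", api_base_url: "http://localhost:11434/v1", models: ["qwen2.5-coder:latest"] },
  { name: "deepseek", api_base_url: "https://api.deepseek.com", models: ["deepseek-reasoner"] },
];

function resolveRoute(route) {
  // Split on the first comma only: model names may contain "/" or ":",
  // but the provider name never contains a comma.
  const commaIndex = route.indexOf(",");
  const providerName = route.slice(0, commaIndex);
  const model = route.slice(commaIndex + 1);
  const provider = providers.find((p) => p.name === providerName);
  if (!provider || !provider.models.includes(model)) {
    throw new Error(`Unknown route: ${route}`);
  }
  return { provider, model };
}

console.log(resolveRoute("ollama,qwen2.5-coder:latest"));
```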
## Features

- [x] Plugins
- [ ] Support change models
- [ ] Support scheduled tasks

- `background`

This model will be used to handle some background tasks ([background-token-usage](https://docs.anthropic.com/en/docs/claude-code/costs#background-token-usage)). Based on my tests, it doesn’t require high intelligence. I’m using the qwen-coder-2.5:7b model running locally on my MacBook Pro M1 (32GB) via Ollama.

If your computer can’t run Ollama, you can also use some free models, such as qwen-coder-2.5:3b.

- `think`

This model will be used when enabling Claude Code to perform reasoning. However, reasoning budget control has not yet been implemented (since the DeepSeek-R1 model does not support it), so there is currently no difference between the UltraThink and Think modes.

It is worth noting that Plan Mode also uses this model to achieve better planning results.

Note: the reasoning process via the official DeepSeek API may be very slow, so you may need to wait for an extended period of time.

- `longContext`

This model will be used when the context length exceeds 32K (this value may be modified in the future). You can route the request to a model that performs well with long contexts (I’ve chosen google/gemini-2.5-pro-preview). This scenario has not been thoroughly tested yet, so if you encounter any issues, please submit an issue.

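The long-context switch above can be sketched as a simple threshold check. This illustration uses a crude 4-characters-per-token estimate instead of the project's real tiktoken-based count; `pickRoute`, `estimateTokens`, and the inline config are hypothetical names:

```javascript
// Illustrative sketch: choose a Router slot by estimated context length.
// The real router counts tokens with tiktoken; here we approximate with
// ~4 characters per token, a common rough heuristic.
const LONG_CONTEXT_THRESHOLD = 32 * 1024; // 32K tokens

function estimateTokens(messages) {
  const text = messages
    .map((m) => (typeof m.content === "string" ? m.content : JSON.stringify(m.content)))
    .join("");
  return Math.ceil(text.length / 4);
}

function pickRoute(messages, routerConfig) {
  if (estimateTokens(messages) > LONG_CONTEXT_THRESHOLD) {
    return routerConfig.longContext;
  }
  return routerConfig.default;
}

const routerConfig = {
  default: "deepseek,deepseek-chat",
  longContext: "openrouter,google/gemini-2.5-pro-preview",
};

// A short prompt stays on the default route.
console.log(pickRoute([{ role: "user", content: "hi" }], routerConfig));
```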
- model command

You can also switch models within Claude Code by using the `/model` command. The format is `provider,model`, like this:

`/model openrouter,anthropic/claude-3.5-sonnet`

This will use the anthropic/claude-3.5-sonnet model provided by OpenRouter to handle all subsequent tasks.

## Features

- [x] Support change models
- [x] GitHub Actions
- [ ] More robust plugin support
- [ ] More detailed logs

## Plugins

You can modify or enhance Claude Code’s functionality by installing plugins.

### Plugin Mechanism

Plugins are loaded from the `~/.claude-code-router/plugins/` directory. Each plugin is a JavaScript file that exports functions corresponding to specific "hooks" in the request lifecycle. The system overrides Node.js's module loading to allow plugins to import a special `claude-code-router` module, providing access to utilities like `streamOpenAIResponse`, `log`, and `createClient`.

### Plugin Hooks

Plugins can implement various hooks to modify behavior at different stages:

- `beforeRouter`: Executed before routing.
- `afterRouter`: Executed after routing.
- `beforeTransformRequest`: Executed before transforming the request.
- `afterTransformRequest`: Executed after transforming the request.
- `beforeTransformResponse`: Executed before transforming the response.
- `afterTransformResponse`: Executed after transforming the response.

### Enabling Plugins

To use a plugin:

1. Place your plugin's JavaScript file (e.g., `my-plugin.js`) in the `~/.claude-code-router/plugins/` directory.
2. Specify the plugin name (without the `.js` extension) in your `~/.claude-code-router/config.json` file using the `usePlugins` option:

```json
// ~/.claude-code-router/config.json
{
  ...,
  "usePlugins": ["my-plugin", "another-plugin"],

  // or use plugins for a specific provider
  "Providers": [
    {
      "name": "gemini",
      "api_base_url": "https://generativelanguage.googleapis.com/v1beta/openai/",
      "api_key": "xxx",
      "models": ["gemini-2.5-flash"],
      "usePlugins": ["gemini"]
    }
  ]
}
```

### Available Plugins

Currently, the following plugins are available:

- **notebook-tools-filter**

  This plugin filters out tool calls related to Jupyter notebooks (.ipynb files). You can use it if your work does not involve Jupyter.

- **gemini**

  Adds support for the Google Gemini API endpoint: `https://generativelanguage.googleapis.com/v1beta/openai/`.

- **toolcall-improvement**

  If your LLM doesn’t handle tool usage well (for example, always returning code as plain text instead of modifying files — such as with deepseek-v3), you can use this plugin.

  This plugin simply adds the following system prompt. If you have a better prompt, you can modify it.

```markdown
## **Important Instruction:**

You must use tools as frequently and accurately as possible to help the user solve their problem.
Prioritize tool usage whenever it can enhance accuracy, efficiency, or the quality of the response.
```

## GitHub Actions

You just need to install `Claude Code Actions` in your repository according to the [official documentation](https://docs.anthropic.com/en/docs/claude-code/github-actions). For `ANTHROPIC_API_KEY`, you can use any string. Then, modify your `.github/workflows/claude.yaml` file to include claude-code-router, like this:

```yaml
name: Claude Code

on:
  issue_comment:
    types: [created]
  pull_request_review_comment:
    types: [created]
  issues:
    types: [opened, assigned]
  pull_request_review:
    types: [submitted]

jobs:
  claude:
    if: |
      (github.event_name == 'issue_comment' && contains(github.event.comment.body, '@claude')) ||
      (github.event_name == 'pull_request_review_comment' && contains(github.event.comment.body, '@claude')) ||
      (github.event_name == 'pull_request_review' && contains(github.event.review.body, '@claude')) ||
      (github.event_name == 'issues' && (contains(github.event.issue.body, '@claude') || contains(github.event.issue.title, '@claude')))
    runs-on: ubuntu-latest
    permissions:
      contents: read
      pull-requests: read
      issues: read
      id-token: write
    steps:
      - name: Checkout repository
        uses: actions/checkout@v4
        with:
          fetch-depth: 1

      - name: Prepare Environment
        run: |
          curl -fsSL https://bun.sh/install | bash
          mkdir -p $HOME/.claude-code-router
          cat << 'EOF' > $HOME/.claude-code-router/config.json
          {
            "log": true,
            "OPENAI_API_KEY": "${{ secrets.OPENAI_API_KEY }}",
            "OPENAI_BASE_URL": "https://api.deepseek.com",
            "OPENAI_MODEL": "deepseek-chat"
          }
          EOF
        shell: bash

      - name: Start Claude Code Router
        run: |
          nohup ~/.bun/bin/bunx @musistudio/claude-code-router@1.0.8 start &
        shell: bash

      - name: Run Claude Code
        id: claude
        uses: anthropics/claude-code-action@beta
        env:
          ANTHROPIC_BASE_URL: http://localhost:3456
        with:
          anthropic_api_key: "test"
```

You can modify the contents of `$HOME/.claude-code-router/config.json` as needed.

GitHub Actions support allows you to trigger Claude Code at specific times, which opens up some interesting possibilities.

For example, between 00:30 and 08:30 Beijing Time, using the official DeepSeek API:

- The `deepseek-v3` model costs only 50% of the normal price.

- The `deepseek-r1` model costs just 25% of the normal price.

So maybe in the future, I’ll describe detailed tasks for Claude Code ahead of time and let it run during these discounted hours to reduce costs?

## Some tips:

Now you can use deepseek-v3 models directly without using any plugins.

If you’re using the DeepSeek API provided by the official website, you might encounter an “exceeding context” error after several rounds of conversation (since the official API only supports a 64K context window). In this case, you’ll need to discard the previous context and start fresh. Alternatively, you can use ByteDance’s DeepSeek API, which offers a 128K context window and supports KV cache.


@@ -59,7 +242,31 @@ Note: claude code consumes a huge amount of tokens, but thanks to DeepSeek’s l

An interesting observation: based on my testing, including a lot of context information helps narrow the performance gap between these LLM models. For instance, when I used Claude-4 in VSCode Copilot to handle a Flutter issue, it messed up the files in three rounds of conversation, and I had to roll everything back. However, when I used claude code with DeepSeek, after three or four rounds of conversation, I finally managed to complete my task—and the cost was less than 1 RMB!

## Some articles:

1. [Project Motivation and Principles](blog/en/project-motivation-and-how-it-works.md) ([中文版看这里](blog/zh/项目初衷及原理.md))

## Buy me a coffee

If you find this project helpful, you can choose to sponsor the author with a cup of coffee. Please provide your GitHub information so I can add you to the sponsor list below.

[Buy me a coffee](http://paypal.me/musistudio1999)

[](https://ko-fi.com/F1F31GN2GM)

<table>
  <tr>
    <td><img src="/blog/images/alipay.jpg" width="200" /></td>
    <td><img src="/blog/images/wechat.jpg" width="200" /></td>
  </tr>
</table>

## Sponsors

Thanks to the following sponsors:

@Simon Leischnig (If you see this, feel free to contact me and I can update it with your GitHub information)
[@duanshuaimin](https://github.com/duanshuaimin)
[@vrgitadmin](https://github.com/vrgitadmin)
@*o (contact me via the email on my homepage to update your GitHub username)
@\*\*聪 (contact me via the email on my homepage to update your GitHub username)
@*说 (contact me via the email on my homepage to update your GitHub username)
@\*更 (contact me via the email on my homepage to update your GitHub username)

103 blog/en/project-motivation-and-how-it-works.md Normal file
@@ -0,0 +1,103 @@

# Project Motivation and Principles

As early as the day after Claude Code was released (2025-02-25), I began and completed a reverse engineering attempt of the project. At that time, using Claude Code required registering for an Anthropic account, applying for a waitlist, and waiting for approval. However, due to well-known reasons, Anthropic blocks users from mainland China, making it impossible for me to use the service through normal means. Based on known information, I discovered the following:

1. Claude Code is installed via npm, so it's very likely developed with Node.js.
2. Node.js offers various debugging methods: simple `console.log` usage, launching with `--inspect` to hook into Chrome DevTools, or even debugging obfuscated code using `d8`.

My goal was to use Claude Code without an Anthropic account. I didn’t need the full source code—just a way to intercept the requests Claude Code makes to Anthropic’s models and reroute them to my own custom endpoint. So I started the reverse engineering process:

1. First, install Claude Code:

```bash
npm install -g @anthropic-ai/claude-code
```

2. After installation, the project is located at `~/.nvm/versions/node/v20.10.0/lib/node_modules/@anthropic-ai/claude-code` (this may vary depending on your Node version manager and version).

3. Open the package.json to analyze the entry point:

```json
{
  "name": "@anthropic-ai/claude-code",
  "version": "1.0.24",
  "main": "sdk.mjs",
  "types": "sdk.d.ts",
  "bin": {
    "claude": "cli.js"
  },
  "engines": {
    "node": ">=18.0.0"
  },
  "type": "module",
  "author": "Boris Cherny <boris@anthropic.com>",
  "license": "SEE LICENSE IN README.md",
  "description": "Use Claude, Anthropic's AI assistant, right from your terminal. Claude can understand your codebase, edit files, run terminal commands, and handle entire workflows for you.",
  "homepage": "https://github.com/anthropics/claude-code",
  "bugs": {
    "url": "https://github.com/anthropics/claude-code/issues"
  },
  "scripts": {
    "prepare": "node -e \"if (!process.env.AUTHORIZED) { console.error('ERROR: Direct publishing is not allowed.\\nPlease use the publish-external.sh script to publish this package.'); process.exit(1); }\"",
    "preinstall": "node scripts/preinstall.js"
  },
  "dependencies": {},
  "optionalDependencies": {
    "@img/sharp-darwin-arm64": "^0.33.5",
    "@img/sharp-darwin-x64": "^0.33.5",
    "@img/sharp-linux-arm": "^0.33.5",
    "@img/sharp-linux-arm64": "^0.33.5",
    "@img/sharp-linux-x64": "^0.33.5",
    "@img/sharp-win32-x64": "^0.33.5"
  }
}
```

The key entry is `"claude": "cli.js"`. Opening cli.js, you'll see the code is minified and obfuscated. But using WebStorm’s `Format File` feature, you can reformat it for better readability:

![webstorm-formate-file](../images/webstorm-formate-file.png)

Now you can begin understanding Claude Code’s internal logic and prompt structure by reading the code. To dig deeper, you can insert `console.log` statements or launch in debug mode with Chrome DevTools using:

```bash
NODE_OPTIONS="--inspect-brk=9229" claude
```

This command starts Claude Code in debug mode and opens port 9229. Visit chrome://inspect/ in Chrome and click inspect to begin debugging:

![chrome-inspect](../images/chrome-inspect.png)
![chrome-devtools](../images/chrome-devtools.png)

By searching for the keyword api.anthropic.com, you can easily locate where Claude Code makes its API calls. From the surrounding code, it's clear that `baseURL` can be overridden with the `ANTHROPIC_BASE_URL` environment variable, and `apiKey` and `authToken` can be configured similarly:

![search](../images/search.png)

So far, we’ve discovered some key information:

1. Environment variables can override Claude Code's `baseURL` and `apiKey`.

2. Claude Code adheres to the Anthropic API specification.

Therefore, we need:

1. A service to convert OpenAI API–compatible requests into Anthropic API format.

2. To set the environment variables before launching Claude Code so requests are redirected to this service.

Thus, `claude-code-router` was born. This project uses `Express.js` to implement the `/v1/messages` endpoint. It leverages middlewares to transform request/response formats and supports request rewriting (useful for prompt tuning per model).

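The heart of that conversion is mapping message shapes between the two API specs. The sketch below shows one direction in a much-simplified form (an Anthropic-style `/v1/messages` body in, an OpenAI-style chat body out); `anthropicToOpenAI` is a hypothetical name, and real requests also carry tools, streaming options, and richer content blocks:

```javascript
// Simplified illustration, not the project's actual middleware: convert an
// Anthropic /v1/messages body into an OpenAI /chat/completions body.
function anthropicToOpenAI(body) {
  const messages = [];
  if (body.system) {
    // Anthropic keeps the system prompt as a top-level field;
    // OpenAI expects it as the first message.
    messages.push({ role: "system", content: body.system });
  }
  for (const msg of body.messages) {
    // Anthropic content may be a string or an array of content blocks.
    const content = Array.isArray(msg.content)
      ? msg.content.filter((b) => b.type === "text").map((b) => b.text).join("\n")
      : msg.content;
    messages.push({ role: msg.role, content });
  }
  return { model: body.model, max_tokens: body.max_tokens, messages };
}

const openaiBody = anthropicToOpenAI({
  model: "deepseek-chat",
  max_tokens: 1024,
  system: "You are a coding assistant.",
  messages: [{ role: "user", content: [{ type: "text", text: "hello" }] }],
});
console.log(openaiBody.messages.length); // 2
```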
Back in February, the full DeepSeek model series had poor support for Function Calling, so I initially used `qwen-max`. It worked well—but without KV cache support, it consumed a large number of tokens and couldn’t provide the native `Claude Code` experience.

So I experimented with a Router-based mode using a lightweight model to dispatch tasks. The architecture included four roles: `router`, `tool`, `think`, and `coder`. Each request passed through a free lightweight model that would decide whether the task involved reasoning, coding, or tool usage. Reasoning and coding tasks looped until a tool was invoked to apply changes. However, the lightweight model lacked the capability to route tasks accurately, and architectural issues prevented it from effectively driving Claude Code.

Everything changed at the end of May when the official Claude Code was launched, and the `DeepSeek-R1` model (released 2025-05-28) added Function Call support. I redesigned the system. With the help of AI pair programming, I fixed earlier request/response transformation issues—especially the handling of models that return JSON instead of Function Call outputs.

This time, I used the `DeepSeek-V3` model. It performed better than expected: supporting most tool calls, handling task decomposition and stepwise planning, and—most importantly—costing less than one-tenth the price of Claude 3.5 Sonnet.

The official Claude Code organizes agents differently from the beta version, so I restructured my Router mode to include four roles: the default model, `background`, `think`, and `longContext`.

- The default model handles general tasks and acts as a fallback.

- The `background` model manages lightweight background tasks. According to Anthropic, Claude Haiku 3.5 is often used here, so I routed this to a local `ollama` service.

- The `think` model is responsible for reasoning and planning mode tasks. I use `DeepSeek-R1` here, though it doesn’t support reasoning-budget control, so `Think` and `UltraThink` behave identically.

- The `longContext` model handles long-context scenarios. The router uses `tiktoken` to calculate token lengths in real time, and if the context exceeds 32K, it switches to this model to compensate for DeepSeek's long-context limitations.

This describes the evolution and reasoning behind the project. By cleverly overriding environment variables, we can forward and modify requests without altering Claude Code’s source—allowing us to benefit from official updates while using our own models and custom prompts.

This project offers a practical approach to running Claude Code under Anthropic’s regional restrictions, balancing `cost`, `performance`, and `customizability`. That said, the official `Max Plan` still offers the best experience if available.

BIN blog/images/alipay.jpg Normal file (binary file not shown; size: 332 KiB)
BIN blog/images/chrome-devtools.png Normal file (binary file not shown; size: 915 KiB)
BIN blog/images/chrome-inspect.png Normal file (binary file not shown; size: 240 KiB)
BIN blog/images/search.png Normal file (binary file not shown; size: 984 KiB)
BIN blog/images/webstorm-formate-file.png Normal file (binary file not shown; size: 1012 KiB)
BIN blog/images/wechat.jpg Normal file (binary file not shown; size: 109 KiB)
96 blog/zh/项目初衷及原理.md Normal file
@@ -0,0 +1,96 @@

# Project Motivation and Principles

As early as the day after Claude Code was released (2025-02-25), I attempted and completed a reverse engineering of the project. At the time, using Claude Code required registering an Anthropic account, applying for the waitlist, and waiting for approval. But for well-known reasons, Anthropic blocks users in mainland China, so I had no normal way to use it. From the available information, I found:

1. Claude Code is installed via npm, so it is very likely developed with Node.js.
2. Node.js offers many debugging options: you can simply use `console.log` to get the information you want, use `--inspect` to attach it to `Chrome Devtools`, or even use `d8` to debug encrypted and obfuscated code.

Since my goal was to use `Claude Code` without an Anthropic account, I did not need the full source code; I only needed to forward the requests `Claude Code` sends to Anthropic's models to my own custom endpoint. So I began my reverse engineering process:

1. First, install `Claude Code`:

```bash
npm install -g @anthropic-ai/claude-code
```

2. After installation, the project is placed in `~/.nvm/versions/node/v20.10.0/lib/node_modules/@anthropic-ai/claude-code`. I use `nvm` as my Node version manager and currently run `node-v20.10.0`, so this path will vary from person to person.
3. After locating the project path, you can analyze the package entry through package.json, which contains:

```json
{
  "name": "@anthropic-ai/claude-code",
  "version": "1.0.24",
  "main": "sdk.mjs",
  "types": "sdk.d.ts",
  "bin": {
    "claude": "cli.js"
  },
  "engines": {
    "node": ">=18.0.0"
  },
  "type": "module",
  "author": "Boris Cherny <boris@anthropic.com>",
  "license": "SEE LICENSE IN README.md",
  "description": "Use Claude, Anthropic's AI assistant, right from your terminal. Claude can understand your codebase, edit files, run terminal commands, and handle entire workflows for you.",
  "homepage": "https://github.com/anthropics/claude-code",
  "bugs": {
    "url": "https://github.com/anthropics/claude-code/issues"
  },
  "scripts": {
    "prepare": "node -e \"if (!process.env.AUTHORIZED) { console.error('ERROR: Direct publishing is not allowed.\\nPlease use the publish-external.sh script to publish this package.'); process.exit(1); }\"",
    "preinstall": "node scripts/preinstall.js"
  },
  "dependencies": {},
  "optionalDependencies": {
    "@img/sharp-darwin-arm64": "^0.33.5",
    "@img/sharp-darwin-x64": "^0.33.5",
    "@img/sharp-linux-arm": "^0.33.5",
    "@img/sharp-linux-arm64": "^0.33.5",
    "@img/sharp-linux-x64": "^0.33.5",
    "@img/sharp-win32-x64": "^0.33.5"
  }
}
```

The entry we are looking for is `"claude": "cli.js"`. Opening cli.js, we find the code has been minified and obfuscated. No matter: WebStorm's `Format File` feature can reformat it and make the code a little more readable, like this:

![webstorm-formate-file](../images/webstorm-formate-file.png)

Now you can read parts of the code to understand how `Claude Code`'s built-in tools and prompts work. You can also add `console.log` in key places to get more information, or use `Chrome Devtools` for breakpoint debugging by starting `Claude Code` with the following command:

```bash
NODE_OPTIONS="--inspect-brk=9229" claude
```

This command starts `Claude Code` in debug mode with the debug port set to `9229`. Visiting `chrome://inspect/` in Chrome then shows the current `Claude Code` process; click `inspect` to start debugging.

![chrome-inspect](../images/chrome-inspect.png)
![chrome-devtools](../images/chrome-devtools.png)

By searching for the key string `api.anthropic.com`, it is easy to find where `Claude Code` sends its requests. Reading the surrounding code, it is clear that the `baseURL` here can be overridden with the `ANTHROPIC_BASE_URL` environment variable, and the same applies to `apiKey` and `authToken`.

![search](../images/search.png)

So far, we have obtained the key information:

1. Environment variables can override `Claude Code`'s `BaseURL` and `apiKey` configuration.

2. `Claude Code` follows the [Anthropic API](https://docs.anthropic.com/en/api/overview) specification.

So we need to:

1. Implement a service that converts the `OpenAI API` specification into the `Anthropic API` format.

2. Write the environment variables before starting `Claude Code` so the `baseURL` points to that service.

Thus, `claude-code-router` was born. The project uses `Express.js` as the HTTP server, implements the `/v1/messages` endpoint, and uses `middlewares` to handle request/response format conversion and request rewriting (which can be used to rewrite Claude Code's prompts and tune them for individual models).

Back in February, the whole `DeepSeek` model series had poor `Function Call` support, which made it impossible to use `DeepSeek` models directly, so at the time I chose the `qwen-max` model. Everything worked well, but `qwen-max` does not support `KV Cache`, which meant I consumed a large number of tokens without getting the native `Claude Code` experience.

So I then tried a `Router` mode: use a small model to dispatch tasks across four models, namely `router`, `tool`, `think`, and `coder`. Every request first passed through a free small model, which decided whether the task required thinking, coding, or tool calls, and then dispatched it. Thinking and coding tasks were invoked in a loop until a tool finally wrote or modified files. In practice, the free small model was not capable enough to dispatch tasks well, and the overall agent design had flaws, so it could not drive `Claude Code` effectively.

It was not until the end of May, when `Claude Code` was officially released and the whole `DeepSeek` model series (R1 as of 05-28) supported `Function Call`, that I began redesigning the project. Pair programming with AI, I fixed the earlier request and response conversion problems, where in some scenarios the model output a JSON response instead of a `Function Call`. This time I used the `DeepSeek-v3` model directly, and it worked better than I expected: it completes the vast majority of tool calls and supports solving tasks with step-by-step planning. Most importantly, `DeepSeek`'s price is less than one-tenth of `claude Sonnet 3.5`. The officially released `Claude Code` also organizes its agents differently from the beta, so after analyzing `Claude Code`'s request calls, I reorganized the `Router` mode. It still uses four models: the default model, `background`, `think`, and `longContext`.

- The default model is the final fallback and handles everyday tasks.

- `background` handles some background tasks. According to Anthropic, the `Claude Haiku 3.5` model is mainly used for small tasks such as haiku generation and conversation summarization, so I routed it to a local `ollama` service.

- The `think` model is used when `Claude Code` thinks or runs in `Plan Mode`. I use `DeepSeek-R1` here; since it does not support reasoning cost control, `Think` and `UltraThink` follow the same logic.

- `longContext` handles long-context scenarios. The project uses tiktoken to compute the context length of each request in real time and switches to this model when the context exceeds 32K, to make up for `DeepSeek`'s weaker long-context handling.

That is the project's history along with some of my thinking. By cleverly using environment variable overrides, requests can be forwarded and modified without touching `Claude Code`'s source code, which means you can keep receiving Anthropic's updates while using your own models and custom prompts. This project is just one way to use `Claude Code` with a balance of cost and performance while Anthropic blocks users in mainland China. If you can, the official Max Plan still offers the best experience.

1377 package-lock.json generated Normal file
File diff suppressed because it is too large.
@@ -1,13 +1,12 @@
{
  "name": "@musistudio/claude-code-router",
  "version": "1.0.3",
  "version": "1.0.9",
  "description": "Use Claude Code without an Anthropics account and route it to another LLM provider",
  "bin": {
    "ccr": "./dist/cli.js"
  },
  "scripts": {
    "build": "esbuild src/cli.ts --bundle --platform=node --outfile=dist/cli.js",
    "buildserver": "esbuild src/index.ts --bundle --platform=node --outfile=dist/index.js"
    "build": "esbuild src/cli.ts --bundle --platform=node --outfile=dist/cli.js && cp node_modules/tiktoken/tiktoken_bg.wasm dist/tiktoken_bg.wasm"
  },
  "keywords": [
    "claude",

33 plugins/gemini.js Normal file
@@ -0,0 +1,33 @@

```js
module.exports = {
  afterTransformRequest(req, res) {
    if (Array.isArray(req.body.tools)) {
      // rewrite tools definition
      req.body.tools.forEach((tool) => {
        if (tool.function.name === "BatchTool") {
          // HACK: Gemini does not support objects with empty properties
          tool.function.parameters.properties.invocations.items.properties.input.type =
            "number";
          return;
        }
        Object.keys(tool.function.parameters.properties).forEach((key) => {
          const prop = tool.function.parameters.properties[key];
          if (
            prop.type === "string" &&
            !["enum", "date-time"].includes(prop.format)
          ) {
            delete prop.format;
          }
        });
      });
    }
    if (req.body?.messages?.length) {
      req.body.messages.forEach((message) => {
        if (message.content === null) {
          if (message.tool_calls) {
            message.content = JSON.stringify(message.tool_calls);
          }
        }
      });
    }
  },
};
```

12 plugins/notebook-tools-filter.js Normal file
@@ -0,0 +1,12 @@

```js
module.exports = {
  beforeRouter(req, res) {
    if (req?.body?.tools?.length) {
      req.body.tools = req.body.tools.filter(
        (tool) =>
          !["NotebookRead", "NotebookEdit", "mcp__ide__executeCode"].includes(
            tool.name
          )
      );
    }
  },
};
```

10 plugins/toolcall-improvement.js Normal file
@@ -0,0 +1,10 @@

```js
module.exports = {
  afterTransformRequest(req, res) {
    if (req?.body?.tools?.length) {
      req.body.messages.push({
        role: "system",
        content: `## **Important Instruction:** \nYou must use tools as frequently and accurately as possible to help the user solve their problem.\nPrioritize tool usage whenever it can enhance accuracy, efficiency, or the quality of the response. `,
      });
    }
  },
};
```

30 src/cli.ts
@@ -1,15 +1,17 @@
|
||||
#!/usr/bin/env node
|
||||
import { run } from "./index";
|
||||
import { closeService } from "./utils/close";
|
||||
import { showStatus } from "./utils/status";
|
||||
import { executeCodeCommand } from "./utils/codeCommand";
|
||||
import { cleanupPidFile, isServiceRunning } from "./utils/processCheck";
|
||||
import { version } from "../package.json";
|
||||
import { spawn } from "child_process";
import { PID_FILE, REFERENCE_COUNT_FILE } from "./constants";
import { existsSync, readFileSync } from "fs";

const command = process.argv[2];

const HELP_TEXT = `
Usage: claude-code [command]
Usage: ccr [command]

Commands:
start Start service
@@ -20,8 +22,8 @@ Commands:
-h, help Show help information

Example:
claude-code start
claude-code code "Write a Hello World"
ccr start
ccr code "Write a Hello World"
`;

async function waitForService(
@@ -43,10 +45,6 @@ async function waitForService(
return false;
}

import { spawn } from "child_process";
import { PID_FILE, REFERENCE_COUNT_FILE } from "./constants";
import { existsSync, readFileSync } from "fs";

async function main() {
switch (command) {
case "start":
@@ -80,15 +78,23 @@ async function main() {
case "code":
if (!isServiceRunning()) {
console.log("Service not running, starting service...");
spawn("ccr", ["start"], {
const startProcess = spawn("ccr", ["start"], {
detached: true,
stdio: "ignore",
}).unref();
});

startProcess.on("error", (error) => {
console.error("Failed to start service:", error);
process.exit(1);
});

startProcess.unref();

if (await waitForService()) {
executeCodeCommand(process.argv.slice(3));
} else {
console.error(
"Service startup timeout, please manually run claude-code start to start the service"
"Service startup timeout, please manually run `ccr start` to start the service"
);
process.exit(1);
}
@@ -98,7 +104,7 @@ async function main() {
break;
case "-v":
case "version":
console.log(`claude-code version: ${version}`);
console.log(`claude-code-router version: ${version}`);
break;
case "-h":
case "help":
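The `code` path above bumps a reference count (`REFERENCE_COUNT_FILE`, `incrementReferenceCount`) so the background service knows how many Claude sessions still depend on it. A minimal file-backed counter in that spirit — the names and the temp-file path here are assumptions for the sketch, not the tool's actual constants:

```typescript
import { readFileSync, writeFileSync, existsSync } from "fs";
import { tmpdir } from "os";
import { join } from "path";

// Hypothetical location; the real tool keeps its counter next to its PID file.
const REF_FILE = join(tmpdir(), "ccr-ref-count-demo");

function readCount(): number {
  if (!existsSync(REF_FILE)) return 0;
  const n = parseInt(readFileSync(REF_FILE, "utf-8"), 10);
  return Number.isNaN(n) ? 0 : n;
}

// Called when a new `ccr code` session attaches to the running service.
function incrementReferenceCount(): number {
  const next = readCount() + 1;
  writeFileSync(REF_FILE, String(next));
  return next;
}

// Called when a session exits; clamped at zero so a stray decrement
// cannot drive the counter negative.
function decrementReferenceCount(): number {
  const next = Math.max(0, readCount() - 1);
  writeFileSync(REF_FILE, String(next));
  return next;
}
```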
36
src/index.ts
@@ -3,7 +3,6 @@ import { writeFile } from "fs/promises";
import { getOpenAICommonOptions, initConfig, initDir } from "./utils";
import { createServer } from "./server";
import { formatRequest } from "./middlewares/formatRequest";
import { rewriteBody } from "./middlewares/rewriteBody";
import { router } from "./middlewares/router";
import OpenAI from "openai";
import { streamOpenAIResponse } from "./utils/stream";
@@ -14,6 +13,11 @@ import {
} from "./utils/processCheck";
import { LRUCache } from "lru-cache";
import { log } from "./utils/log";
import {
loadPlugins,
PLUGINS,
usePluginMiddleware,
} from "./middlewares/plugin";

async function initializeClaudeConfig() {
const homeDir = process.env.HOME;
@@ -44,6 +48,7 @@ interface ModelProvider {
api_base_url: string;
api_key: string;
models: string[];
usePlugins?: string[];
}

async function run(options: RunOptions = {}) {
@@ -56,6 +61,7 @@ async function run(options: RunOptions = {}) {
await initializeClaudeConfig();
await initDir();
const config = await initConfig();
await loadPlugins(config.usePlugins || []);

const Providers = new Map<string, ModelProvider>();
const providerCache = new LRUCache<string, OpenAI>({
@@ -63,7 +69,7 @@ async function run(options: RunOptions = {}) {
ttl: 2 * 60 * 60 * 1000,
});

function getProviderInstance(providerName: string): OpenAI {
async function getProviderInstance(providerName: string): Promise<OpenAI> {
const provider: ModelProvider | undefined = Providers.get(providerName);
if (provider === undefined) {
throw new Error(`Provider ${providerName} not found`);
@@ -77,6 +83,10 @@ async function run(options: RunOptions = {}) {
});
providerCache.set(provider.name, openai);
}
const plugins = provider.usePlugins || [];
if (plugins.length > 0) {
await loadPlugins(plugins.map((name) => `${providerName},${name}`));
}
return openai;
}

@@ -127,27 +137,39 @@ async function run(options: RunOptions = {}) {

const server = await createServer(servicePort);
server.useMiddleware((req, res, next) => {
console.log("Middleware triggered for request:", req.body.model);
req.config = config;
next();
});
server.useMiddleware(rewriteBody);
server.useMiddleware(usePluginMiddleware("beforeRouter"));
if (
config.Router?.background &&
config.Router?.think &&
config?.Router?.longContext
) {
server.useMiddleware(router);
} else {
server.useMiddleware((req, res, next) => {
req.provider = "default";
req.body.model = config.OPENAI_MODEL;
next();
});
}
server.useMiddleware(usePluginMiddleware("afterRouter"));
server.useMiddleware(usePluginMiddleware("beforeTransformRequest"));
server.useMiddleware(formatRequest);
server.useMiddleware(usePluginMiddleware("afterTransformRequest"));

server.app.post("/v1/messages", async (req, res) => {
try {
const provider = getProviderInstance(req.provider || "default");
const provider = await getProviderInstance(req.provider || "default");
log("final request body:", req.body);
const completion: any = await provider.chat.completions.create(req.body);
await streamOpenAIResponse(res, completion, req.body.model, req.body);
await streamOpenAIResponse(req, res, completion);
} catch (e) {
console.error("Error in OpenAI API call:", e);
log("Error in OpenAI API call:", e);
res.status(500).json({
error: e.message,
});
}
});
server.start();
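The `getProviderInstance` change above makes client creation async so provider-scoped plugins can be loaded lazily, while client instances themselves are cached (the real code uses `lru-cache` with a 2-hour TTL). A dependency-free sketch of the cache-or-create pattern, with a plain object standing in for the `OpenAI` client:

```typescript
// Minimal sketch of the provider-instance cache: one client per provider
// name, created lazily and reused. A Map stands in for lru-cache here,
// and a plain config object stands in for `new OpenAI(...)`.
interface ModelProvider {
  name: string;
  api_base_url: string;
  api_key: string;
  models: string[];
}

const providers = new Map<string, ModelProvider>();
const clientCache = new Map<string, { baseURL: string; apiKey: string }>();

function getProviderInstance(name: string) {
  const provider = providers.get(name);
  if (provider === undefined) {
    throw new Error(`Provider ${name} not found`);
  }
  let client = clientCache.get(name);
  if (!client) {
    // Stand-in for constructing the real SDK client.
    client = { baseURL: provider.api_base_url, apiKey: provider.api_key };
    clientCache.set(name, client);
  }
  return client;
}
```

Repeated lookups return the same instance, so per-provider connection state survives across requests until the entry is evicted.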
@@ -1,7 +1,6 @@
import { Request, Response, NextFunction } from "express";
import { MessageCreateParamsBase } from "@anthropic-ai/sdk/resources/messages";
import OpenAI from "openai";
import { streamOpenAIResponse } from "../utils/stream";
import { log } from "../utils/log";

export const formatRequest = async (
@@ -19,7 +18,7 @@ export const formatRequest = async (
tools,
stream,
}: MessageCreateParamsBase = req.body;
log("formatRequest: ", req.body);
log("beforeTransformRequest: ", req.body);
try {
// @ts-ignore
const openAIMessages = Array.isArray(messages)
@@ -50,6 +49,7 @@ export const formatRequest = async (

anthropicMessage.content.forEach((contentPart) => {
if (contentPart.type === "text") {
if (contentPart.text.includes("(no content)")) return;
textContent +=
(typeof contentPart.text === "string"
? contentPart.text
@@ -112,17 +112,18 @@ export const formatRequest = async (
});

const trimmedUserText = userTextMessageContent.trim();
// @ts-ignore
openAiMessagesFromThisAnthropicMessage.push(
// @ts-ignore
...subsequentToolMessages
);

if (trimmedUserText.length > 0) {
openAiMessagesFromThisAnthropicMessage.push({
role: "user",
content: trimmedUserText,
});
}
// @ts-ignore
openAiMessagesFromThisAnthropicMessage.push(
// @ts-ignore
...subsequentToolMessages
);
} else {
// Fallback for other roles (e.g. system, or custom roles if they were to appear here with array content)
// This will combine all text parts into a single message for that role.
@@ -180,30 +181,9 @@ export const formatRequest = async (
res.setHeader("Cache-Control", "no-cache");
res.setHeader("Connection", "keep-alive");
req.body = data;
console.log(JSON.stringify(data.messages, null, 2));
log("afterTransformRequest: ", req.body);
} catch (error) {
console.error("Error in request processing:", error);
const errorCompletion: AsyncIterable<OpenAI.Chat.Completions.ChatCompletionChunk> =
{
async *[Symbol.asyncIterator]() {
yield {
id: `error_${Date.now()}`,
created: Math.floor(Date.now() / 1000),
model,
object: "chat.completion.chunk",
choices: [
{
index: 0,
delta: {
content: `Error: ${(error as Error).message}`,
},
finish_reason: "stop",
},
],
};
},
};
await streamOpenAIResponse(res, errorCompletion, model, req.body);
log("Error in TransformRequest:", error);
}
next();
};
106
src/middlewares/plugin.ts
Normal file
@@ -0,0 +1,106 @@
import Module from "node:module";
import { streamOpenAIResponse } from "../utils/stream";
import { log } from "../utils/log";
import { PLUGINS_DIR } from "../constants";
import path from "node:path";
import { access } from "node:fs/promises";
import { OpenAI } from "openai";
import { createClient } from "../utils";
import { Response } from "express";

// @ts-ignore
const originalLoad = Module._load;
// @ts-ignore
Module._load = function (request, parent, isMain) {
if (request === "claude-code-router") {
return {
streamOpenAIResponse,
log,
OpenAI,
createClient,
};
}
return originalLoad.call(this, request, parent, isMain);
};

export type PluginHook =
| "beforeRouter"
| "afterRouter"
| "beforeTransformRequest"
| "afterTransformRequest"
| "beforeTransformResponse"
| "afterTransformResponse";

export interface Plugin {
beforeRouter?: (req: any, res: Response) => Promise<any>;
afterRouter?: (req: any, res: Response) => Promise<any>;

beforeTransformRequest?: (req: any, res: Response) => Promise<any>;
afterTransformRequest?: (req: any, res: Response) => Promise<any>;

beforeTransformResponse?: (
req: any,
res: Response,
data?: { completion: any }
) => Promise<any>;
afterTransformResponse?: (
req: any,
res: Response,
data?: { completion: any; transformedCompletion: any }
) => Promise<any>;
}

export const PLUGINS = new Map<string, Plugin>();

const loadPlugin = async (pluginName: string) => {
const filePath = pluginName.split(",").pop();
const pluginPath = path.join(PLUGINS_DIR, `${filePath}.js`);
try {
await access(pluginPath);
const plugin = require(pluginPath);
if (
[
"beforeRouter",
"afterRouter",
"beforeTransformRequest",
"afterTransformRequest",
"beforeTransformResponse",
"afterTransformResponse",
].some((key) => key in plugin)
) {
PLUGINS.set(pluginName, plugin);
log(`Plugin ${pluginName} loaded successfully.`);
} else {
throw new Error(`Plugin ${pluginName} does not export a function.`);
}
} catch (e) {
console.error(`Failed to load plugin ${pluginName}:`, e);
throw e;
}
};

export const loadPlugins = async (pluginNames: string[]) => {
console.log("Loading plugins:", pluginNames);
for (const file of pluginNames) {
await loadPlugin(file);
}
};

export const usePluginMiddleware = (type: PluginHook) => {
return async (req: any, res: Response, next: any) => {
for (const [name, plugin] of PLUGINS.entries()) {
if (name.includes(",") && !name.startsWith(`${req.provider},`)) {
continue;
}
if (plugin[type]) {
try {
await plugin[type](req, res);
log(`Plugin ${name} executed hook: ${type}`);
} catch (error) {
log(`Error in plugin ${name} during hook ${type}:`, error);
}
}
}
next();
};
};
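`usePluginMiddleware` above skips provider-scoped plugins that don't match the request's provider. The scoping rule reduces to one predicate: a plugin registered under a plain name such as `"gemini"` is global, while one registered as `"deepseek,gemini"` only fires for requests routed to `deepseek`. A sketch of just that rule:

```typescript
// Scoping rule from the middleware above: names containing a comma are
// "<provider>,<plugin>" and apply only to that provider; all other names
// are global. `provider` may be undefined for unrouted requests.
function pluginApplies(name: string, provider?: string): boolean {
  if (!name.includes(",")) return true;       // global plugin
  return name.startsWith(`${provider},`);     // provider-scoped plugin
}
```

Note that with `provider` undefined the template literal becomes `"undefined,"`, so scoped plugins simply never match an unrouted request — the same behavior the middleware's `startsWith` check has.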
@@ -1,45 +0,0 @@
import { Request, Response, NextFunction } from "express";
import Module from "node:module";
import { streamOpenAIResponse } from "../utils/stream";
import { log } from "../utils/log";
import { PLUGINS_DIR } from "../constants";
import path from "node:path";
import { access } from "node:fs/promises";
import { OpenAI } from "openai";
import { createClient } from "../utils";

// @ts-ignore
const originalLoad = Module._load;
// @ts-ignore
Module._load = function (request, parent, isMain) {
if (request === "claude-code-router") {
return {
streamOpenAIResponse,
log,
OpenAI,
createClient,
};
}
return originalLoad.call(this, request, parent, isMain);
};

export const rewriteBody = async (
req: Request,
res: Response,
next: NextFunction
) => {
if (!req.config.usePlugins) {
return next();
}
for (const plugin of req.config.usePlugins) {
const pluginPath = path.join(PLUGINS_DIR, `${plugin.trim()}.js`);
try {
await access(pluginPath);
const rewritePlugin = require(pluginPath);
await rewritePlugin(req, res);
} catch (e) {
console.error(e);
}
}
next();
};
@@ -6,6 +6,14 @@ import { log } from "../utils/log";
const enc = get_encoding("cl100k_base");

const getUseModel = (req: Request, tokenCount: number) => {
const [provider, model] = req.body.model.split(",");
if (provider && model) {
return {
provider,
model,
};
}

// if tokenCount is greater than 32K, use the long context model
if (tokenCount > 1000 * 32) {
log("Using long context model due to token count:", tokenCount);
@@ -33,16 +41,11 @@ const getUseModel = (req: Request, tokenCount: number) => {
model,
};
}
const [provider, model] = req.body.model.split(",");
if (provider && model) {
return {
provider,
model,
};
}
const [defaultProvider, defaultModel] =
req.config.Router!.default?.split(",");
return {
provider: "default",
model: req.config.OPENAI_MODEL,
provider: defaultProvider || "default",
model: defaultModel || req.config.OPENAI_MODEL,
};
};

@@ -67,7 +70,11 @@ export const router = async (
JSON.stringify(contentPart.input)
).length;
} else if (contentPart.type === "tool_result") {
tokenCount += enc.encode(contentPart.content || "").length;
tokenCount += enc.encode(
typeof contentPart.content === "string"
? contentPart.content
: JSON.stringify(contentPart.content)
).length;
}
});
}
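The reordering in `getUseModel` above changes the selection precedence: an explicit `provider,model` string in the request now wins before the long-context check, and the default route falls back through `Router.default` to `OPENAI_MODEL`. A self-contained sketch of that precedence — the signature is simplified for the example (plain arguments instead of the Express `Request`):

```typescript
// Precedence sketch: explicit "provider,model" override, then the
// long-context route past ~32K tokens, then Router.default, then the
// configured fallback model.
interface RouterConfig {
  default?: string;      // e.g. "deepseek,deepseek-chat"
  longContext?: string;  // e.g. "gemini,gemini-2.5-pro"
}

function getUseModel(
  requestModel: string,
  tokenCount: number,
  cfg: RouterConfig,
  fallbackModel: string
) {
  const [provider, model] = requestModel.split(",");
  if (provider && model) return { provider, model }; // explicit override wins

  if (tokenCount > 1000 * 32 && cfg.longContext) {
    const [p, m] = cfg.longContext.split(",");
    return { provider: p, model: m };
  }

  const [dp, dm] = (cfg.default ?? "").split(",");
  return { provider: dp || "default", model: dm || fallbackModel };
}
```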
@@ -9,6 +9,12 @@ export async function executeCodeCommand(args: string[] = []) {
// Set environment variables
const env = {
...process.env,
HTTPS_PROXY: undefined,
HTTP_PROXY: undefined,
ALL_PROXY: undefined,
https_proxy: undefined,
http_proxy: undefined,
all_proxy: undefined,
DISABLE_PROMPT_CACHING: "1",
ANTHROPIC_AUTH_TOKEN: "test",
ANTHROPIC_BASE_URL: `http://127.0.0.1:3456`,
@@ -19,10 +25,11 @@ export async function executeCodeCommand(args: string[] = []) {
incrementReferenceCount();

// Execute claude command
const claudeProcess = spawn("claude", args, {
const claudePath = process.env.CLAUDE_PATH || "claude";
const claudeProcess = spawn(claudePath, args, {
env,
stdio: "inherit",
shell: true,
shell: true
});

claudeProcess.on("error", (error) => {
@@ -8,6 +8,7 @@ import {
HOME_DIR,
PLUGINS_DIR,
} from "../constants";
import crypto from "node:crypto";

export function getOpenAICommonOptions(): ClientOptions {
const options: ClientOptions = {};
@@ -90,3 +91,7 @@ export const createClient = (options: ClientOptions) => {
});
return client;
};

export const sha256 = (data: string | Buffer): string => {
return crypto.createHash("sha256").update(data).digest("hex");
};
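The new `sha256` helper above is a thin wrapper over `node:crypto` that returns a lowercase hex digest; the standard FIPS test vector for `"abc"` confirms the shape:

```typescript
import crypto from "node:crypto";

// Same shape as the helper added in the diff: one-shot SHA-256, hex output.
const sha256 = (data: string | Buffer): string =>
  crypto.createHash("sha256").update(data).digest("hex");
```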
@@ -11,6 +11,7 @@ if (!fs.existsSync(HOME_DIR)) {

export function log(...args: any[]) {
// Check if logging is enabled via environment variable
// console.log(...args); // Log to console for immediate feedback
const isLogEnabled = process.env.LOG === "true";

if (!isLogEnabled) {

@@ -14,13 +14,13 @@ export function showStatus() {
console.log(`📄 PID File: ${info.pidFile}`);
console.log('');
console.log('🚀 Ready to use! Run the following commands:');
console.log(' claude-code-router code # Start coding with Claude');
console.log(' claude-code-router close # Stop the service');
console.log(' ccr code # Start coding with Claude');
console.log(' ccr close # Stop the service');
} else {
console.log('❌ Status: Not Running');
console.log('');
console.log('💡 To start the service:');
console.log(' claude-code-router start');
console.log(' ccr start');
}

console.log('');
@@ -1,6 +1,13 @@
import { Response } from "express";
import { OpenAI } from "openai";
import { Request, Response } from "express";
import { log } from "./log";
import { PLUGINS } from "../middlewares/plugin";
import { sha256 } from ".";

declare module "express" {
interface Request {
provider?: string;
}
}

interface ContentBlock {
type: string;
@@ -42,39 +49,100 @@ interface MessageEvent {
}

export async function streamOpenAIResponse(
req: Request,
res: Response,
completion: any,
model: string,
body: any
_completion: any
) {
const write = (data: string) => {
log("response: ", data);
res.write(data);
let completion = _completion;
res.locals.completion = completion;

for (const [name, plugin] of PLUGINS.entries()) {
if (name.includes(",") && !name.startsWith(`${req.provider},`)) {
continue;
}
if (plugin.beforeTransformResponse) {
const result = await plugin.beforeTransformResponse(req, res, {
completion,
});
if (result) {
completion = result;
}
}
}
const write = async (data: string) => {
let eventData = data;
for (const [name, plugin] of PLUGINS.entries()) {
if (name.includes(",") && !name.startsWith(`${req.provider},`)) {
continue;
}
if (plugin.afterTransformResponse) {
const hookResult = await plugin.afterTransformResponse(req, res, {
completion: res.locals.completion,
transformedCompletion: eventData,
});
if (typeof hookResult === "string") {
eventData = hookResult;
}
}
}
if (eventData) {
log("response: ", eventData);
res.write(eventData);
}
};
const messageId = "msg_" + Date.now();
if (!body.stream) {
res.json({
if (!req.body.stream) {
let content: any = [];
if (completion.choices[0].message.content) {
content = [{ text: completion.choices[0].message.content, type: "text" }];
} else if (completion.choices[0].message.tool_calls) {
content = completion.choices[0].message.tool_calls.map((item: any) => {
return {
type: "tool_use",
id: item.id,
name: item.function?.name,
input: item.function?.arguments
? JSON.parse(item.function.arguments)
: {},
};
});
}

const result = {
id: messageId,
type: "message",
role: "assistant",
// @ts-ignore
content: completion.choices[0].message.content || completion.choices[0].message.tool_calls?.map((item) => {
return {
type: 'tool_use',
id: item.id,
name: item.function?.name,
input: item.function?.arguments ? JSON.parse(item.function.arguments) : {},
};
}) || '',
stop_reason: completion.choices[0].finish_reason === 'tool_calls' ? "tool_use" : "end_turn",
content: content,
stop_reason:
completion.choices[0].finish_reason === "tool_calls"
? "tool_use"
: "end_turn",
stop_sequence: null,
usage: {
input_tokens: 100,
output_tokens: 50,
},
});
res.end();
return;
};
try {
res.locals.transformedCompletion = result;
for (const [name, plugin] of PLUGINS.entries()) {
if (name.includes(",") && !name.startsWith(`${req.provider},`)) {
continue;
}
if (plugin.afterTransformResponse) {
const hookResult = await plugin.afterTransformResponse(req, res, {
completion: res.locals.completion,
transformedCompletion: res.locals.transformedCompletion,
});
if (hookResult) {
res.locals.transformedCompletion = hookResult;
}
}
}
res.json(res.locals.transformedCompletion);
res.end();
return;
} catch (error) {
log("Error sending response:", error);
res.status(500).send("Internal Server Error");
}
}

let contentBlockIndex = 0;
@@ -88,7 +156,7 @@ export async function streamOpenAIResponse(
type: "message",
role: "assistant",
content: [],
model,
model: req.body.model,
stop_reason: null,
stop_sequence: null,
usage: { input_tokens: 1, output_tokens: 1 },
@@ -99,6 +167,8 @@ export async function streamOpenAIResponse(
let isToolUse = false;
let toolUseJson = "";
let hasStartedTextBlock = false;
let currentToolCallId: string | null = null;
let toolCallJsonMap = new Map<string, string>();

try {
for await (const chunk of completion) {
@@ -106,62 +176,98 @@ export async function streamOpenAIResponse(
const delta = chunk.choices[0].delta;

if (delta.tool_calls && delta.tool_calls.length > 0) {
const toolCall = delta.tool_calls[0];
// Handle each tool call in the current chunk
for (const [index, toolCall] of delta.tool_calls.entries()) {
// Generate a stable ID for this tool call position
const toolCallId = toolCall.id || `tool_${index}`;

if (!isToolUse) {
// Start new tool call block
isToolUse = true;
const toolBlock: ContentBlock = {
type: "tool_use",
id: `toolu_${Date.now()}`,
name: toolCall.function?.name,
input: {},
};
// If this position doesn't have an active tool call, start a new one
if (!toolCallJsonMap.has(`${index}`)) {
// End previous tool call if one was active
if (isToolUse && currentToolCallId) {
const contentBlockStop: MessageEvent = {
type: "content_block_stop",
index: contentBlockIndex,
};
write(
`event: content_block_stop\ndata: ${JSON.stringify(
contentBlockStop
)}\n\n`
);
}

const toolBlockStart: MessageEvent = {
type: "content_block_start",
index: contentBlockIndex,
content_block: toolBlock,
};
// Start new tool call block
isToolUse = true;
currentToolCallId = `${index}`;
contentBlockIndex++;
toolCallJsonMap.set(`${index}`, ""); // Initialize JSON accumulator for this tool call

currentContentBlocks.push(toolBlock);
const toolBlock: ContentBlock = {
type: "tool_use",
id: toolCallId, // Use the original ID if available
name: toolCall.function?.name,
input: {},
};

write(
`event: content_block_start\ndata: ${JSON.stringify(
toolBlockStart
)}\n\n`
);
toolUseJson = "";
}
const toolBlockStart: MessageEvent = {
type: "content_block_start",
index: contentBlockIndex,
content_block: toolBlock,
};

// Stream tool call JSON
if (toolCall.function?.arguments) {
const jsonDelta: MessageEvent = {
type: "content_block_delta",
index: contentBlockIndex,
delta: {
type: "input_json_delta",
partial_json: toolCall.function?.arguments,
},
};
currentContentBlocks.push(toolBlock);

toolUseJson += toolCall.function.arguments;

try {
const parsedJson = JSON.parse(toolUseJson);
currentContentBlocks[contentBlockIndex].input = parsedJson;
} catch (e) {
log(e);
// JSON not yet complete, continue accumulating
write(
`event: content_block_start\ndata: ${JSON.stringify(
toolBlockStart
)}\n\n`
);
}

write(
`event: content_block_delta\ndata: ${JSON.stringify(jsonDelta)}\n\n`
);
// Stream tool call JSON for this position
if (toolCall.function?.arguments) {
const jsonDelta: MessageEvent = {
type: "content_block_delta",
index: contentBlockIndex,
delta: {
type: "input_json_delta",
partial_json: toolCall.function.arguments,
},
};

// Accumulate JSON for this specific tool call position
const currentJson = toolCallJsonMap.get(`${index}`) || "";
const newJson = currentJson + toolCall.function.arguments;
toolCallJsonMap.set(`${index}`, newJson);

// Try to parse accumulated JSON
if (isValidJson(newJson)) {
try {
const parsedJson = JSON.parse(newJson);
const blockIndex = currentContentBlocks.findIndex(
(block) => block.type === "tool_use" && block.id === toolCallId
);
if (blockIndex !== -1) {
currentContentBlocks[blockIndex].input = parsedJson;
}
} catch (e) {
log("JSON parsing error (continuing to accumulate):", e);
}
}

write(
`event: content_block_delta\ndata: ${JSON.stringify(
jsonDelta
)}\n\n`
);
}
}
} else if (delta.content) {
// Handle regular text content
if (isToolUse) {
} else if (delta.content || chunk.choices[0].finish_reason) {
// Handle regular text content or completion
if (
isToolUse &&
(delta.content || chunk.choices[0].finish_reason === "tool_calls")
) {
log("Tool call ended here:", delta);
// End previous tool call block
const contentBlockStop: MessageEvent = {
@@ -176,10 +282,10 @@ export async function streamOpenAIResponse(
);
contentBlockIndex++;
isToolUse = false;
currentToolCallId = null;
toolUseJson = ""; // Reset for safety
}

if (!delta.content) continue;

// If text block not yet started, send content_block_start
if (!hasStartedTextBlock) {
const textBlock: ContentBlock = {
@@ -269,15 +375,33 @@ export async function streamOpenAIResponse(
);
}

// Close last content block
const contentBlockStop: MessageEvent = {
type: "content_block_stop",
index: contentBlockIndex,
};
// Close last content block if any is open
if (isToolUse || hasStartedTextBlock) {
const contentBlockStop: MessageEvent = {
type: "content_block_stop",
index: contentBlockIndex,
};

write(
`event: content_block_stop\ndata: ${JSON.stringify(contentBlockStop)}\n\n`
);
write(
`event: content_block_stop\ndata: ${JSON.stringify(contentBlockStop)}\n\n`
);
}

res.locals.transformedCompletion = currentContentBlocks;
for (const [name, plugin] of PLUGINS.entries()) {
if (name.includes(",") && !name.startsWith(`${req.provider},`)) {
continue;
}
if (plugin.afterTransformResponse) {
const hookResult = await plugin.afterTransformResponse(req, res, {
completion: res.locals.completion,
transformedCompletion: res.locals.transformedCompletion,
});
if (hookResult) {
res.locals.transformedCompletion = hookResult;
}
}
}

// Send message_delta event with appropriate stop_reason
const messageDelta: MessageEvent = {
@@ -285,12 +409,12 @@ export async function streamOpenAIResponse(
delta: {
stop_reason: isToolUse ? "tool_use" : "end_turn",
stop_sequence: null,
content: currentContentBlocks,
content: res.locals.transformedCompletion,
},
usage: { input_tokens: 100, output_tokens: 150 },
};
if (!isToolUse) {
log("body: ", body, "messageDelta: ", messageDelta);
log("body: ", req.body, "messageDelta: ", messageDelta);
}

write(`event: message_delta\ndata: ${JSON.stringify(messageDelta)}\n\n`);
@@ -303,3 +427,42 @@ export async function streamOpenAIResponse(
write(`event: message_stop\ndata: ${JSON.stringify(messageStop)}\n\n`);
res.end();
}

// Add helper function at the top of the file
function isValidJson(str: string): boolean {
// Check if the string contains both opening and closing braces/brackets
const hasOpenBrace = str.includes("{");
const hasCloseBrace = str.includes("}");
const hasOpenBracket = str.includes("[");
const hasCloseBracket = str.includes("]");

// Check if we have matching pairs
if ((hasOpenBrace && !hasCloseBrace) || (!hasOpenBrace && hasCloseBrace)) {
return false;
}
if (
(hasOpenBracket && !hasCloseBracket) ||
(!hasOpenBracket && hasCloseBracket)
) {
return false;
}

// Count nested braces/brackets
let braceCount = 0;
let bracketCount = 0;

for (const char of str) {
if (char === "{") braceCount++;
if (char === "}") braceCount--;
if (char === "[") bracketCount++;
if (char === "]") bracketCount--;

// If we ever go negative, the JSON is invalid
if (braceCount < 0 || bracketCount < 0) {
return false;
}
}

// All braces/brackets should be matched
return braceCount === 0 && bracketCount === 0;
}
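The `isValidJson` helper above gates parsing while tool-call arguments stream in fragment by fragment: the accumulated string is only handed to `JSON.parse` once braces and brackets balance. A condensed restatement of the counting core, driven the way the streaming loop uses it — note it is a balance heuristic, not a parser, so braces inside JSON string values would confuse it:

```typescript
// Condensed form of the balance check: never-negative nesting counters
// that must both end at zero.
function isValidJson(str: string): boolean {
  let brace = 0;
  let bracket = 0;
  for (const ch of str) {
    if (ch === "{") brace++;
    if (ch === "}") brace--;
    if (ch === "[") bracket++;
    if (ch === "]") bracket--;
    if (brace < 0 || bracket < 0) return false; // closer before opener
  }
  return brace === 0 && bracket === 0;
}

// Fragments as a model might stream them for one tool call's arguments:
const fragments = ['{"file_path": ', '"/tmp/demo.ts", ', '"limit": 100}'];
let accumulated = "";
const parsedAt: number[] = [];
fragments.forEach((frag, i) => {
  accumulated += frag;
  if (isValidJson(accumulated)) parsedAt.push(i); // parse only when balanced
});
```

Only the final fragment closes the object, so the parse attempt fires exactly once, at the last chunk.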