# Claude Code Router
> A tool for routing Claude Code requests to different models, with support for customizing any request.

## Usage

1. Install Claude Code

```shell
npm install -g @anthropic-ai/claude-code
```

2. Install Claude Code Router

```shell
npm install -g @musistudio/claude-code-router
```

3. Start Claude Code via claude-code-router

```shell
ccr code
```

4. Configure routing

Set up your `~/.claude-code-router/config.json` file like this:
```json
{
  "Providers": [
    {
      "name": "openrouter",
      // IMPORTANT: api_base_url must be a complete (full) URL.
      "api_base_url": "https://openrouter.ai/api/v1/chat/completions",
      "api_key": "sk-xxx",
      "models": [
        "google/gemini-2.5-pro-preview",
        "anthropic/claude-sonnet-4",
        "anthropic/claude-3.5-sonnet",
        "anthropic/claude-3.7-sonnet:thinking"
      ],
      "transformer": {
        "use": ["openrouter"]
      }
    },
    {
      "name": "deepseek",
      // IMPORTANT: api_base_url must be a complete (full) URL.
      "api_base_url": "https://api.deepseek.com/chat/completions",
      "api_key": "sk-xxx",
      "models": ["deepseek-chat", "deepseek-reasoner"],
      "transformer": {
        "use": ["deepseek"],
        "deepseek-chat": {
          // Enhance tool usage for the deepseek-chat model with the tooluse transformer.
          "use": ["tooluse"]
        }
      }
    },
    {
      "name": "ollama",
      // IMPORTANT: api_base_url must be a complete (full) URL.
      "api_base_url": "http://localhost:11434/v1/chat/completions",
      "api_key": "ollama",
      "models": ["qwen2.5-coder:latest"]
    },
    {
      "name": "gemini",
      // IMPORTANT: api_base_url must be a complete (full) URL.
      "api_base_url": "https://generativelanguage.googleapis.com/v1beta/models/",
      "api_key": "sk-xxx",
      "models": ["gemini-2.5-flash", "gemini-2.5-pro"],
      "transformer": {
        "use": ["gemini"]
      }
    },
    {
      "name": "volcengine",
      // IMPORTANT: api_base_url must be a complete (full) URL.
      "api_base_url": "https://ark.cn-beijing.volces.com/api/v3/chat/completions",
      "api_key": "sk-xxx",
      "models": ["deepseek-v3-250324", "deepseek-r1-250528"],
      "transformer": {
        "use": ["deepseek"]
      }
    },
    {
      "name": "siliconflow",
      // IMPORTANT: api_base_url must be a complete (full) URL.
      "api_base_url": "https://api.siliconflow.cn/v1/chat/completions",
      "api_key": "sk-xxx",
      "models": ["moonshotai/Kimi-K2-Instruct"],
      "transformer": {
        "use": [
          [
            "maxtoken",
            {
              "max_tokens": 16384 // max_tokens cap for SiliconFlow
            }
          ]
        ]
      }
    }
  ],
  "Router": {
    "default": "deepseek,deepseek-chat", // IMPORTANT: OPENAI_MODEL has been deprecated
    "background": "ollama,qwen2.5-coder:latest",
    "think": "deepseek,deepseek-reasoner",
    "longContext": "openrouter,google/gemini-2.5-pro-preview"
  }
}
```

- `background`

  This model will be used to handle some background tasks ([background-token-usage](https://docs.anthropic.com/en/docs/claude-code/costs#background-token-usage)). Based on my tests, it doesn't require high intelligence. I'm using the qwen2.5-coder:7b model running locally on my MacBook Pro M1 (32GB) via Ollama.

  If your computer can't run Ollama, you can also use some free models, such as qwen2.5-coder:3b.

- `think`

  This model will be used when enabling Claude Code to perform reasoning. However, reasoning budget control has not yet been implemented (since the DeepSeek-R1 model does not support it), so there is currently no difference between the UltraThink and Think modes.

  It is worth noting that Plan Mode also uses this model to achieve better planning results.

  Note: The reasoning process via the official DeepSeek API may be very slow, so you may need to wait for an extended period of time.

- `longContext`

  This model will be used when the context length exceeds 32K (this value may be modified in the future). You can route such requests to a model that performs well with long contexts (I've chosen google/gemini-2.5-pro-preview). This scenario has not been thoroughly tested yet, so if you encounter any issues, please submit an issue.

- `/model` command

  You can also switch models within Claude Code by using the `/model` command. The format is `provider,model`, like this:

  `/model openrouter,anthropic/claude-3.5-sonnet`

  This will use the anthropic/claude-3.5-sonnet model provided by OpenRouter to handle all subsequent tasks.
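As an illustration of how these `provider,model` route strings relate to the `Providers` list, here is a minimal sketch. The names (`Provider`, `resolveRoute`) are my own for illustration, not the router's actual internals.

```typescript
// Illustrative sketch only: resolve a "provider,model" route string
// (as used in the Router section and the /model command) against the
// configured Providers list. Not the project's real code.
interface Provider {
  name: string;
  models: string[];
}

function resolveRoute(
  route: string,
  providers: Provider[]
): { provider: string; model: string } {
  // Split on the first comma only: model IDs such as
  // "anthropic/claude-3.7-sonnet:thinking" contain slashes and colons,
  // so everything after the first comma is the model name.
  const comma = route.indexOf(",");
  const providerName = route.slice(0, comma);
  const model = route.slice(comma + 1);
  const provider = providers.find((p) => p.name === providerName);
  if (!provider || !provider.models.includes(model)) {
    throw new Error(`unknown route: ${route}`);
  }
  return { provider: providerName, model };
}
```

With the config above, `resolveRoute("openrouter,anthropic/claude-3.5-sonnet", providers)` would select the OpenRouter provider and its anthropic/claude-3.5-sonnet model.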

5. About transformers

`transformer` is used to convert requests and responses for different vendors; you can configure a different transformer for each vendor.

For example, in the following case we use the `openrouter` transformer for the OpenRouter vendor. For models other than Claude, this transformer removes the `cache_control` parameter (mainly used to adapt Claude's prompt cache) from the request; in the response, it adapts the reasoning field.
```json
{
  "name": "openrouter",
  "api_base_url": "https://openrouter.ai/api/v1/chat/completions",
  "api_key": "",
  "models": [
    "google/gemini-2.5-pro-preview",
    "anthropic/claude-sonnet-4",
    "anthropic/claude-3.5-sonnet",
    "anthropic/claude-3.7-sonnet:thinking",
    "deepseek/deepseek-chat-v3-0324"
  ],
  "transformer": {
    "use": ["openrouter"]
  }
}
```

You can also configure transformers for individual models of the same vendor. For instance, in the following example we use the `deepseek` transformer for the DeepSeek vendor. This transformer caps `max_tokens` at `8192` in the request, and in the response it adapts the `reasoning_content` field. Additionally, for the `deepseek-chat` model we add the `tooluse` transformer, which optimizes tool calls for the `deepseek-v3` model using the `tool_choice` parameter (mainly because deepseek-r1 does not support the `tool_choice` parameter).
```json
{
  "name": "deepseek",
  "api_base_url": "https://api.deepseek.com/chat/completions",
  "api_key": "",
  "models": [
    "deepseek-chat",
    "deepseek-reasoner"
  ],
  "transformer": {
    "use": ["deepseek"],
    "deepseek-chat": {
      "use": ["tooluse"]
    }
  }
}
```
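To make the `maxtoken` option concrete, here is a minimal sketch of what such a transformer plausibly does to an outgoing request. The behavior is inferred from the config (capping `max_tokens` at the configured limit); the actual implementation may differ.

```typescript
// Assumed behavior of a "maxtoken"-style transformer: clamp the
// request's max_tokens to the configured limit (e.g. 16384 for
// SiliconFlow). A sketch, not the project's actual transformer code.
interface ChatRequest {
  model: string;
  max_tokens?: number;
}

function applyMaxToken(req: ChatRequest, limit: number): ChatRequest {
  return {
    ...req,
    // If the client asked for more tokens than the provider allows
    // (or set no value at all), clamp to the provider's limit.
    max_tokens: Math.min(req.max_tokens ?? limit, limit),
  };
}
```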

Currently, the following transformers are available:

- deepseek
- gemini
- maxtoken
- openrouter
- tooluse
- gemini-cli (experimental, unofficial support: https://gist.github.com/musistudio/1c13a65f35916a7ab690649d3df8d1cd)

You can configure custom transformers in the `config.json` file using the `transformers` field, for example:
```json
{
  "transformers": [
    {
      "path": "$HOME/.claude-code-router/plugins/gemini-cli.js",
      "options": {
        "project": "xxx"
      }
    }
  ]
}
```

## Features

- [x] Model switching
- [x] GitHub Actions
- [ ] More detailed logs
- [ ] Image support
- [ ] Web search support

## GitHub Actions

You just need to install `Claude Code Actions` in your repository, following the [official documentation](https://docs.anthropic.com/en/docs/claude-code/github-actions). For `ANTHROPIC_API_KEY`, you can use any string. Then modify your `.github/workflows/claude.yaml` file to include claude-code-router, like this:

```yaml
name: Claude Code

on:
  issue_comment:
    types: [created]
  pull_request_review_comment:
    types: [created]
  issues:
    types: [opened, assigned]
  pull_request_review:
    types: [submitted]

jobs:
  claude:
    if: |
      (github.event_name == 'issue_comment' && contains(github.event.comment.body, '@claude')) ||
      (github.event_name == 'pull_request_review_comment' && contains(github.event.comment.body, '@claude')) ||
      (github.event_name == 'pull_request_review' && contains(github.event.review.body, '@claude')) ||
      (github.event_name == 'issues' && (contains(github.event.issue.body, '@claude') || contains(github.event.issue.title, '@claude')))
    runs-on: ubuntu-latest
    permissions:
      contents: read
      pull-requests: read
      issues: read
      id-token: write
    steps:
      - name: Checkout repository
        uses: actions/checkout@v4
        with:
          fetch-depth: 1

      - name: Prepare Environment
        run: |
          curl -fsSL https://bun.sh/install | bash
          mkdir -p $HOME/.claude-code-router
          cat << 'EOF' > $HOME/.claude-code-router/config.json
          {
            "log": true,
            "OPENAI_API_KEY": "${{ secrets.OPENAI_API_KEY }}",
            "OPENAI_BASE_URL": "https://api.deepseek.com",
            "OPENAI_MODEL": "deepseek-chat"
          }
          EOF
        shell: bash

      - name: Start Claude Code Router
        run: |
          nohup ~/.bun/bin/bunx @musistudio/claude-code-router@1.0.8 start &
        shell: bash

      - name: Run Claude Code
        id: claude
        uses: anthropics/claude-code-action@beta
        env:
          ANTHROPIC_BASE_URL: http://localhost:3456
        with:
          anthropic_api_key: "test"
```

You can modify the contents of `$HOME/.claude-code-router/config.json` as needed.

GitHub Actions support allows you to trigger Claude Code at specific times, which opens up some interesting possibilities.

For example, between 00:30 and 08:30 Beijing time, the official DeepSeek API offers off-peak pricing:

- The `deepseek-v3` model costs only 50% of the normal price.

- The `deepseek-r1` model costs just 25% of the normal price.

So maybe in the future, I'll describe detailed tasks for Claude Code ahead of time and let it run during these discounted hours to reduce costs?
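If you do schedule runs for the discount window, the time check is simple arithmetic. A sketch, assuming the window is 00:30-08:30 UTC+8 (i.e. 16:30-00:30 UTC); verify the current hours against DeepSeek's pricing page:

```typescript
// Check whether a UTC timestamp falls inside DeepSeek's off-peak window,
// 00:30-08:30 Beijing time (UTC+8). A guard for scheduled jobs; the exact
// discount hours are an assumption based on the text above.
function inOffPeakWindow(utc: Date): boolean {
  // Minutes since midnight, Beijing time (UTC+8), wrapped to one day.
  const beijingMinutes =
    (utc.getUTCHours() * 60 + utc.getUTCMinutes() + 8 * 60) % (24 * 60);
  return beijingMinutes >= 30 && beijingMinutes < 8 * 60 + 30;
}
```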
## Some tips

Now you can use deepseek-v3 models directly without using any plugins.

If you're using the DeepSeek API provided by the official website, you might encounter an "exceeding context" error after several rounds of conversation (since the official API only supports a 64K context window). In this case, you'll need to discard the previous context and start fresh. Alternatively, you can use ByteDance's DeepSeek API, which offers a 128K context window and supports KV cache.
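As a rough guard against that error, you can estimate token usage before sending. The heuristic below assumes roughly 4 characters per token, which is only a ballpark for English text, not an exact tokenizer:

```typescript
// Rough heuristic: warn when a conversation approaches the official
// DeepSeek API's 64K-token context window. The 4-chars-per-token ratio
// is an assumption, not DeepSeek's actual tokenization.
function nearContextLimit(
  messages: string[],
  limitTokens = 64_000,
  warnRatio = 0.9
): boolean {
  const chars = messages.reduce((sum, m) => sum + m.length, 0);
  const approxTokens = chars / 4;
  return approxTokens > limitTokens * warnRatio;
}
```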

|
||
|
||
Note: claude code consumes a huge amount of tokens, but thanks to DeepSeek’s low cost, you can use claude code at a fraction of Claude’s price, and you don’t need to subscribe to the Claude Max plan.
|
||
|
||
Some interesting points: Based on my testing, including a lot of context information can help narrow the performance gap between these LLM models. For instance, when I used Claude-4 in VSCode Copilot to handle a Flutter issue, it messed up the files in three rounds of conversation, and I had to roll everything back. However, when I used claude code with DeepSeek, after three or four rounds of conversation, I finally managed to complete my task—and the cost was less than 1 RMB!
|
||
|
||
## Some articles

1. [Project Motivation and Principles](blog/en/project-motivation-and-how-it-works.md) ([项目初衷及原理](blog/zh/项目初衷及原理.md))
2. [Maybe We Can Do More with the Router](blog/en/maybe-we-can-do-more-with-the-route.md) ([或许我们能在 Router 中做更多事情](blog/zh/或许我们能在Router中做更多事情.md))

## Buy me a coffee

If you find this project helpful, you can choose to sponsor the author with a cup of coffee. Please provide your GitHub information so I can add you to the sponsor list below.

[![ko-fi](https://ko-fi.com/img/githubbutton_sm.svg)](https://ko-fi.com/F1F31GN2GM)
<table>
  <tr>
    <td><img src="/blog/images/alipay.jpg" width="200" /></td>
    <td><img src="/blog/images/wechat.jpg" width="200" /></td>
  </tr>
</table>

## Sponsors

Thanks to the following sponsors for supporting the continued development of this project:

@Simon Leischnig (If you see this, feel free to contact me and I can update it with your GitHub information)
[@duanshuaimin](https://github.com/duanshuaimin)
[@vrgitadmin](https://github.com/vrgitadmin)
@\*o (contact me via the email on my profile page to update your GitHub username)
[@ceilwoo](https://github.com/ceilwoo)
@\*说 (contact me via the email on my profile page to update your GitHub username)
@\*更 (contact me via the email on my profile page to update your GitHub username)
@K\*g (contact me via the email on my profile page to update your GitHub username)
@R\*R (contact me via the email on my profile page to update your GitHub username)
[@bobleer](https://github.com/bobleer)
@\*苗 (contact me via the email on my profile page to update your GitHub username)
@\*划 (contact me via the email on my profile page to update your GitHub username)
[@Clarence-pan](https://github.com/Clarence-pan)
[@carter003](https://github.com/carter003)
@S\*r (contact me via the email on my profile page to update your GitHub username)
@\*晖 (contact me via the email on my profile page to update your GitHub username)
@\*敏 (contact me via the email on my profile page to update your GitHub username)
@Z\*z (contact me via the email on my profile page to update your GitHub username)
@\*然 (contact me via the email on my profile page to update your GitHub username)
[@cluic](https://github.com/cluic)
@\*苗 (contact me via the email on my profile page to update your GitHub username)
[@PromptExpert](https://github.com/PromptExpert)
@\*应 (contact me via the email on my profile page to update your GitHub username)