Add an .env.example file and provide more detailed instructions.
README.md
@@ -8,6 +8,8 @@
## Implemented
- [x] Normal Mode and Router Mode
- [x] Using the qwen2.5-coder-3b-instruct model as the routing dispatcher (since it’s currently free on Alibaba Cloud’s official website)
- [x] Using the qwen-max-0125 model as the tool invoker
@@ -42,6 +44,7 @@ npm i
```shell
# Alternatively, you can create an .env file in the repo directory
# You can refer to the .env.example file to create the .env file
## disable router
ENABLE_ROUTER=false
@@ -79,3 +82,22 @@ export ANTHROPIC_BASE_URL="http://127.0.0.1:3456"
export API_TIMEOUT_MS=600000
claude
```
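For reference, here is a minimal sketch of what the newly added .env.example might contain, based only on the variables mentioned in this README; the actual file may include more options or different defaults.

```shell
# .env.example: copy to .env and fill in the values for your provider.
# Sketch based on the variables documented in this README; the real file may differ.

# true enables Router Mode; false runs Normal Mode with a single model.
ENABLE_ROUTER=false

# Router Mode only: the four models described in the sections below.
ROUTER_AGENT_MODEL=
TOOL_AGENT_MODEL=
CODER_AGENT_MODEL=
THINK_AGENT_MODEL=
```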
## Normal Mode
The initial version uses a single model to accomplish all tasks. This model must support function calling and allow for a sufficiently large tool description length, ideally greater than 1754. If the model does not support KV Cache, it will consume a significant number of tokens.
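In terms of configuration, Normal Mode corresponds to the ENABLE_ROUTER=false setting shown in the snippet above; a minimal sketch:

```shell
# Normal Mode: a single model handles routing, tool calls, and coding.
# It must support function calling and a tool description length greater than 1754,
# and ideally KV Cache, to keep token consumption down.
ENABLE_ROUTER=false
```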

## Router Mode
This mode uses multiple models to handle different tasks. It requires setting ENABLE_ROUTER to true and configuring four models: ROUTER_AGENT_MODEL, TOOL_AGENT_MODEL, CODER_AGENT_MODEL, and THINK_AGENT_MODEL.

ROUTER_AGENT_MODEL does not require high intelligence and is only responsible for request routing; a small model is sufficient for this task (testing has shown that the qwen-coder-3b model performs well).
TOOL_AGENT_MODEL must support function calling and allow for a sufficiently large tool description length, ideally greater than 1754. If this model does not support KV Cache, it will consume a significant number of tokens.

CODER_AGENT_MODEL and THINK_AGENT_MODEL can use the DeepSeek series of models.

The purpose of Router Mode is to separate tool invocation from coding tasks, enabling the use of reasoning models such as R1 that do not support function calling.
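Putting this together, a Router Mode configuration might look like the sketch below. The variable names come from this README; the model identifiers are illustrative placeholders based on the models mentioned here (qwen2.5-coder-3b-instruct as the dispatcher, qwen-max-0125 as the tool invoker, DeepSeek-series models for coding and reasoning) and should be replaced with whatever your provider exposes.

```shell
# Router Mode: split responsibilities across four models
# (example identifiers; adjust to your provider).
ENABLE_ROUTER=true

# Request routing only; a small model is enough.
ROUTER_AGENT_MODEL=qwen2.5-coder-3b-instruct

# Must support function calling and a large tool description length.
TOOL_AGENT_MODEL=qwen-max-0125

# Coding and reasoning tasks; the README suggests DeepSeek-series models here.
CODER_AGENT_MODEL=deepseek-chat
THINK_AGENT_MODEL=deepseek-reasoner
```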
