update readme

Former-commit-id: 06bcbb901f69265632892a5fcbc956b8be1153da
2023-08-07 15:02:02 +08:00
parent 370f817549
commit 34a2bddfcd
5 changed files with 39 additions and 30 deletions
--- a/README.md
+++ b/README.md
@@ -41,16 +41,21 @@
 [23/05/31] Now we support training the **BLOOM & BLOOMZ** models in this repo. Try `--model_name_or_path bigscience/bloomz-7b1-mt` and `--lora_target query_key_value` arguments to use the BLOOMZ model.

 ## Supported Models
-| model                                                       | model size                  | model_name_or_path             | lora_target       | template |
-|-------------------------------------------------------------|-----------------------------|--------------------------------|-------------------|----------|
-| [LLaMA](https://github.com/facebookresearch/llama)          | 7B/13B/33B/65B              | -                              | q_proj,v_proj     | default  |
-| [LLaMA-2](https://huggingface.co/meta-llama)                | 7B/13B/70B                  | meta-llama/Llama-2-7b-hf       | q_proj,v_proj     | llama2   |
-| [BLOOM](https://huggingface.co/bigscience/bloom)            | 560M/1.1B/1.7B/3B/7.1B/176B | bigscience/bloom-7b1           | query_key_value   | default  |
-| [BLOOMZ](https://huggingface.co/bigscience/bloomz)          | 560M/1.1B/1.7B/3B/7.1B/176B | bigscience/bloomz-7b1-mt       | query_key_value   | default  |
-| [Falcon](https://huggingface.co/tiiuae/falcon-7b)           | 7B/40B                      | tiiuae/falcon-7b               | query_key_value   | default  |
-| [Baichuan](https://huggingface.co/baichuan-inc/baichuan-7B) | 7B/13B                      | baichuan-inc/Baichuan-13B-Chat | W_pack            | baichuan |
-| [InternLM](https://github.com/InternLM/InternLM)            | 7B                          | internlm/internlm-7b           | q_proj,v_proj     | intern   |
-| [Qwen](https://github.com/QwenLM/Qwen-7B)                   | 7B                          | Qwen/Qwen-7B-Chat              | c_attn            | chatml   |
+
+| Model                                                    | Model size                  | Default module    | Template |
+| -------------------------------------------------------- | --------------------------- | ----------------- | -------- |
+| [LLaMA](https://github.com/facebookresearch/llama)       | 7B/13B/33B/65B              | q_proj,v_proj     | -        |
+| [LLaMA-2](https://huggingface.co/meta-llama)             | 7B/13B/70B                  | q_proj,v_proj     | llama2   |
+| [BLOOM](https://huggingface.co/bigscience/bloom)         | 560M/1.1B/1.7B/3B/7.1B/176B | query_key_value   | -        |
+| [BLOOMZ](https://huggingface.co/bigscience/bloomz)       | 560M/1.1B/1.7B/3B/7.1B/176B | query_key_value   | -        |
+| [Falcon](https://huggingface.co/tiiuae/falcon-7b)        | 7B/40B                      | query_key_value   | -        |
+| [Baichuan](https://github.com/baichuan-inc/baichuan-13B) | 7B/13B                      | W_pack            | baichuan |
+| [InternLM](https://github.com/InternLM/InternLM)         | 7B                          | q_proj,v_proj     | intern   |
+| [Qwen](https://github.com/QwenLM/Qwen-7B)                | 7B                          | c_attn            | chatml   |
+| [XVERSE](https://github.com/xverse-ai/XVERSE-13B)        | 13B                         | q_proj,v_proj     | -        |
+
+> * **Default module** is the default value for the `--lora_target` argument. Run `python src/train_bash.py -h` to see all available options.
+> * For the "base" models, the `--template` argument can be chosen from `default`, `alpaca`, `vicuna`, etc.

 ## Supported Training Approaches