support badam for all stages

Former-commit-id: 7a1380646119bfe6855f73dd90570defcea05281

Author: hiyouga
Date: 2024-04-16 17:44:48 +08:00
Parent: 42084e08ae
Commit: a4167fd925

9 changed files with 61 additions and 28 deletions

examples/README.md

@@ -3,7 +3,7 @@ We provide diverse examples about fine-tuning LLMs.
```
examples/
├── lora_single_gpu/
-│ ├── pretrain.sh: Do pre-training using LoRA
+│ ├── pretrain.sh: Do continuous pre-training using LoRA
│ ├── sft.sh: Do supervised fine-tuning using LoRA
│ ├── reward.sh: Do reward modeling using LoRA
│ ├── ppo.sh: Do PPO training using LoRA
@@ -34,6 +34,8 @@ examples/
└── extras/
├── galore/
│ └── sft.sh: Fine-tune model with GaLore
+├── badam/
+│ └── sft.sh: Fine-tune model with BAdam
├── loraplus/
│ └── sft.sh: Fine-tune model using LoRA+
├── llama_pro/

examples/README_zh.md

@@ -3,7 +3,7 @@
```
examples/
├── lora_single_gpu/
-│ ├── pretrain.sh: Do pre-training using LoRA
+│ ├── pretrain.sh: Do continuous pre-training using LoRA
│ ├── sft.sh: Do supervised fine-tuning using LoRA
│ ├── reward.sh: Do reward modeling using LoRA
│ ├── ppo.sh: Do PPO training using LoRA
@@ -34,6 +34,8 @@ examples/
└── extras/
├── galore/
│ └── sft.sh: Fine-tune model with GaLore
+├── badam/
+│ └── sft.sh: Fine-tune model with BAdam
├── loraplus/
│ └── sft.sh: Fine-tune model using LoRA+
├── llama_pro/
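
Since the commit title says BAdam is now supported for all stages, the new extras/badam/sft.sh presumably wires BAdam into the standard SFT launcher. Below is a minimal sketch of what such a script might look like, assuming LLaMA-Factory's src/train_bash.py entry point and flag names like --use_badam and --badam_switch_mode; the actual arguments in the committed script may differ.

```
# Hypothetical sketch of extras/badam/sft.sh. The entry point and the
# BAdam flags (--use_badam, --badam_switch_mode, --badam_switch_block_every)
# are assumptions based on LLaMA-Factory's BAdam integration, not taken
# verbatim from this commit.
CUDA_VISIBLE_DEVICES=0 python ../../src/train_bash.py \
    --stage sft \
    --do_train \
    --model_name_or_path meta-llama/Llama-2-7b-hf \
    --dataset alpaca_gpt4_en \
    --template default \
    --finetuning_type full \
    --use_badam \
    --badam_switch_mode descending \
    --badam_switch_block_every 50 \
    --output_dir ../../saves/LLaMA2-7B/badam/sft \
    --per_device_train_batch_size 1 \
    --gradient_accumulation_steps 8 \
    --learning_rate 1e-5 \
    --num_train_epochs 3.0 \
    --pure_bf16
```

BAdam updates one block of full-rank parameters at a time, which is why the sketch pairs it with --finetuning_type full rather than LoRA; the switch flags control how training cycles through blocks.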