Merge branch 'hiyouga:main' into pixtral-patch
Former-commit-id: 28696e2f945a9f55e4ca9e9dc5ebd8af9df45d8b
This commit is contained in:
22
README.md
22
README.md
@@ -75,6 +75,8 @@ Compared to ChatGLM's [P-Tuning](https://github.com/THUDM/ChatGLM2-6B/tree/main/
|
||||
|
||||
## Changelog
|
||||
|
||||
[24/10/09] We supported downloading pre-trained models and datasets from the **[Modelers Hub](https://modelers.cn/models)**. See [this tutorial](#download-from-modelers-hub) for usage.
|
||||
|
||||
[24/09/19] We support fine-tuning the **[Qwen2.5](https://qwenlm.github.io/blog/qwen2.5/)** models.
|
||||
|
||||
[24/08/30] We support fine-tuning the **[Qwen2-VL](https://qwenlm.github.io/blog/qwen2-vl/)** models. Thank [@simonJJJ](https://github.com/simonJJJ)'s PR.
|
||||
@@ -133,7 +135,7 @@ Compared to ChatGLM's [P-Tuning](https://github.com/THUDM/ChatGLM2-6B/tree/main/
|
||||
|
||||
[23/12/12] We supported fine-tuning the latest MoE model **[Mixtral 8x7B](https://huggingface.co/mistralai/Mixtral-8x7B-v0.1)** in our framework. See hardware requirement [here](#hardware-requirement).
|
||||
|
||||
[23/12/01] We supported downloading pre-trained models and datasets from the **[ModelScope Hub](https://modelscope.cn/models)** for Chinese mainland users. See [this tutorial](#download-from-modelscope-hub) for usage.
|
||||
[23/12/01] We supported downloading pre-trained models and datasets from the **[ModelScope Hub](https://modelscope.cn/models)**. See [this tutorial](#download-from-modelscope-hub) for usage.
|
||||
|
||||
[23/10/21] We supported **[NEFTune](https://arxiv.org/abs/2310.05914)** trick for fine-tuning. Try `neftune_noise_alpha: 5` argument to activate NEFTune.
|
||||
|
||||
@@ -364,7 +366,7 @@ cd LLaMA-Factory
|
||||
pip install -e ".[torch,metrics]"
|
||||
```
|
||||
|
||||
Extra dependencies available: torch, torch-npu, metrics, deepspeed, liger-kernel, bitsandbytes, hqq, eetq, gptq, awq, aqlm, vllm, galore, badam, adam-mini, qwen, modelscope, quality
|
||||
Extra dependencies available: torch, torch-npu, metrics, deepspeed, liger-kernel, bitsandbytes, hqq, eetq, gptq, awq, aqlm, vllm, galore, badam, adam-mini, qwen, modelscope, openmind, quality
|
||||
|
||||
> [!TIP]
|
||||
> Use `pip install --no-deps -e .` to resolve package conflicts.
|
||||
@@ -416,7 +418,7 @@ Download the pre-built Docker images: [32GB](http://mirrors.cn-central-221.ovaij
|
||||
|
||||
### Data Preparation
|
||||
|
||||
Please refer to [data/README.md](data/README.md) for checking the details about the format of dataset files. You can either use datasets on HuggingFace / ModelScope hub or load the dataset in local disk.
|
||||
Please refer to [data/README.md](data/README.md) for checking the details about the format of dataset files. You can either use datasets on HuggingFace / ModelScope / Modelers hub or load the dataset in local disk.
|
||||
|
||||
> [!NOTE]
|
||||
> Please update `data/dataset_info.json` to use your custom dataset.
|
||||
@@ -484,6 +486,7 @@ docker build -f ./docker/docker-cuda/Dockerfile \
|
||||
docker run -dit --gpus=all \
|
||||
-v ./hf_cache:/root/.cache/huggingface \
|
||||
-v ./ms_cache:/root/.cache/modelscope \
|
||||
-v ./om_cache:/root/.cache/openmind \
|
||||
-v ./data:/app/data \
|
||||
-v ./output:/app/output \
|
||||
-p 7860:7860 \
|
||||
@@ -508,6 +511,7 @@ docker build -f ./docker/docker-npu/Dockerfile \
|
||||
docker run -dit \
|
||||
-v ./hf_cache:/root/.cache/huggingface \
|
||||
-v ./ms_cache:/root/.cache/modelscope \
|
||||
-v ./om_cache:/root/.cache/openmind \
|
||||
-v ./data:/app/data \
|
||||
-v ./output:/app/output \
|
||||
-v /usr/local/dcmi:/usr/local/dcmi \
|
||||
@@ -541,6 +545,7 @@ docker build -f ./docker/docker-rocm/Dockerfile \
|
||||
docker run -dit \
|
||||
-v ./hf_cache:/root/.cache/huggingface \
|
||||
-v ./ms_cache:/root/.cache/modelscope \
|
||||
-v ./om_cache:/root/.cache/openmind \
|
||||
-v ./data:/app/data \
|
||||
-v ./output:/app/output \
|
||||
-v ./saves:/app/saves \
|
||||
@@ -561,6 +566,7 @@ docker exec -it llamafactory bash
|
||||
|
||||
- `hf_cache`: Utilize Hugging Face cache on the host machine. Reassignable if a cache already exists in a different directory.
|
||||
- `ms_cache`: Similar to Hugging Face cache but for ModelScope users.
|
||||
- `om_cache`: Similar to Hugging Face cache but for Modelers users.
|
||||
- `data`: Place datasets on this dir of the host machine so that they can be selected on LLaMA Board GUI.
|
||||
- `output`: Set export dir to this location so that the merged result can be accessed directly on the host machine.
|
||||
|
||||
@@ -585,6 +591,16 @@ export USE_MODELSCOPE_HUB=1 # `set USE_MODELSCOPE_HUB=1` for Windows
|
||||
|
||||
Train the model by specifying a model ID of the ModelScope Hub as the `model_name_or_path`. You can find a full list of model IDs at [ModelScope Hub](https://modelscope.cn/models), e.g., `LLM-Research/Meta-Llama-3-8B-Instruct`.
|
||||
|
||||
### Download from Modelers Hub
|
||||
|
||||
You can also use Modelers Hub to download models and datasets.
|
||||
|
||||
```bash
|
||||
export USE_OPENMIND_HUB=1 # `set USE_OPENMIND_HUB=1` for Windows
|
||||
```
|
||||
|
||||
Train the model by specifying a model ID of the Modelers Hub as the `model_name_or_path`. You can find a full list of model IDs at [Modelers Hub](https://modelers.cn/models), e.g., `TeleAI/TeleChat-7B-pt`.
|
||||
|
||||
### Use W&B Logger
|
||||
|
||||
To use [Weights & Biases](https://wandb.ai) for logging experimental results, you need to add the following arguments to yaml files.
|
||||
|
||||
Reference in New Issue
Block a user