support ORPO
Former-commit-id: f44a4c27e2461cdaa1b16865f597a31033c0e6d9
This commit is contained in:
@@ -34,6 +34,8 @@ If you are using a custom dataset, please provide your dataset definition in the
|
||||
|
||||
Given above, you can use the custom dataset via specifying `--dataset dataset_name`.
|
||||
|
||||
----
|
||||
|
||||
Currently we support dataset in **alpaca** or **sharegpt** format, the dataset in alpaca format should follow the below format:
|
||||
|
||||
```json
|
||||
@@ -84,6 +86,10 @@ For the preference datasets, the `response` column should be a string list whose
|
||||
}
|
||||
```
|
||||
|
||||
Remember to set `"ranking": true` for the preference datasets.
|
||||
|
||||
----
|
||||
|
||||
The dataset in sharegpt format should follow the below format:
|
||||
|
||||
```json
|
||||
|
||||
@@ -34,6 +34,8 @@
|
||||
|
||||
添加后可通过指定 `--dataset 数据集名称` 参数使用自定义数据集。
|
||||
|
||||
----
|
||||
|
||||
该项目目前支持两种格式的数据集:**alpaca** 和 **sharegpt**,其中 alpaca 格式的数据集按照以下方式组织:
|
||||
|
||||
```json
|
||||
@@ -84,6 +86,10 @@
|
||||
}
|
||||
```
|
||||
|
||||
添加偏好数据集需要额外指定 `"ranking": true`。
|
||||
|
||||
----
|
||||
|
||||
而 sharegpt 格式的数据集按照以下方式组织:
|
||||
|
||||
```json
|
||||
|
||||
Reference in New Issue
Block a user