update template
Former-commit-id: a95f3a4d62de1073a78125401cf4289ec0523156
@@ -17,14 +17,15 @@ If you are using a custom dataset, please provide your dataset definition in the

 where the `prompt` and `response` columns should contain non-empty values. The `query` column will be concatenated with the `prompt` column and used as input for the model. The `history` column should contain a list where each element is a string tuple representing a query-response pair.

-For Reward-Modeling(rm) dataset, the first n examples represent chosen examples and the last n examples represent rejected examples.
+For datasets used in reward modeling or DPO training, the `response` column should be a string list, with the preferred answers appearing first, for example:
+
 ```json
 {
-  "instruction": "Question?",
-  "input": "",
-  "output": [
-    "chosen answer",
-    "rejected answer"
-  ]
+  "instruction": "Question",
+  "input": "",
+  "output": [
+    "Chosen answer",
+    "Rejected answer"
+  ]
 }
 ```
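
The record format introduced by this change can be sanity-checked with a few lines of Python. This is only a minimal sketch, not the project's loader: the helper name `split_preference_pair` is hypothetical, the mapping of `instruction`/`input`/`output` onto the `prompt`/`query`/`response` columns is assumed from the example, and joining `instruction` and `input` with a newline is an assumption, since the text only says the two are concatenated.

```python
import json


def split_preference_pair(record):
    """Return (prompt, chosen, rejected) for one preference record.

    Hypothetical helper for illustration; it is not part of the repository.
    """
    output = record["output"]
    # The response must be a list of strings with the preferred answer first.
    if not isinstance(output, list) or len(output) < 2:
        raise ValueError("`output` must be a list: [chosen, rejected]")
    if not all(isinstance(item, str) and item for item in output[:2]):
        raise ValueError("chosen and rejected answers must be non-empty strings")

    instruction = record["instruction"]
    extra = record.get("input", "")
    # Assumption: the two fields are joined with a newline when `input` is set.
    prompt = instruction if not extra else instruction + "\n" + extra
    return prompt, output[0], output[1]


record = json.loads(
    '{"instruction": "Question", "input": "", '
    '"output": ["Chosen answer", "Rejected answer"]}'
)
prompt, chosen, rejected = split_preference_pair(record)
print(prompt, chosen, rejected, sep=" | ")
```

Running a check like this over a custom dataset before training would catch records where `output` is still a plain string or where the answers are listed in the wrong order of length, though it cannot tell whether the first answer is actually the preferred one.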