add rm dataset explanation

Signed-off-by: Peter Pan <Peter.Pan@daocloud.io>

Former-commit-id: 1efb95025be6501f1b30b20e7c711d3590b5d1ee
Peter Pan
2023-08-22 01:30:57 -04:00
parent 9c0622de13
commit 5cac87d317
2 changed files with 25 additions and 0 deletions

@@ -16,3 +16,15 @@ If you are using a custom dataset, please provide your dataset definition in the
```
where the `prompt` and `response` columns should contain non-empty values. The `query` column will be concatenated with the `prompt` column and used as input for the model. The `history` column should contain a list where each element is a string tuple representing a query-response pair.
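For a reward modeling (rm) dataset, the first n examples represent chosen examples and the last n examples represent rejected examples. Within each record, the `output` list gives the chosen answer first and the rejected answer second: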
```json
{
  "instruction": "Question?",
  "input": "",
  "output": [
    "chosen answer",
    "rejected answer"
  ]
}
```
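
To illustrate the record layout above, here is a minimal sketch that splits one reward modeling record into a prompt, a chosen answer, and a rejected answer. The helper name `split_pairwise` and the newline used to join `instruction` and `input` are assumptions made for illustration; they are not taken from the project's code.

```python
import json


def split_pairwise(record):
    """Return (prompt, chosen, rejected) for one reward modeling record.

    Hypothetical helper: assumes `output` holds exactly two answers,
    with the chosen answer first and the rejected answer second.
    """
    outputs = record["output"]
    if not isinstance(outputs, list) or len(outputs) != 2:
        raise ValueError("`output` must be a list of [chosen, rejected]")
    chosen, rejected = outputs
    if not chosen or not rejected:
        raise ValueError("chosen and rejected answers must be non-empty")

    prompt = record["instruction"]
    # Assumption: a non-empty `input` is appended to the instruction
    # with a newline separator.
    if record.get("input"):
        prompt = prompt + "\n" + record["input"]
    return prompt, chosen, rejected


record = json.loads("""
{
  "instruction": "Question?",
  "input": "",
  "output": ["chosen answer", "rejected answer"]
}
""")

prompt, chosen, rejected = split_pairwise(record)
print(prompt, chosen, rejected, sep=" | ")
```

Validating the list length up front makes a malformed record fail early instead of silently pairing the wrong answers.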