update template

Former-commit-id: a95f3a4d62de1073a78125401cf4289ec0523156
2023-08-22 19:46:09 +08:00
parent f55907dbea
commit 6310613699
6 changed files with 38 additions and 20 deletions
--- a/data/README_zh.md
+++ b/data/README_zh.md
@@ -17,15 +17,15 @@

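 The `prompt` and `response` columns should be non-empty strings. The content of the `query` column is concatenated with the `prompt` column to form the model input. The `history` column should be a list in which each element is a pair of strings representing the user request and the model response, respectively.

-For reward model (rm) datasets, the first N outputs are the `chosen` data and the last N outputs are the `rejected` data, for example:
+For reward model or DPO training datasets, the `response` column should be a list of strings, with better answers placed earlier, for example: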
+
 ```json
 {
-    "instruction": "Question?",
-    "input": "",
-    "output": [
-       "chosen answer",
-       "rejected answer"
-    ]
+  "instruction": "Question",
+  "input": "",
+  "output": [
+    "Chosen answer",
+    "Rejected answer"
+  ]
 }
-
 ```
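
The ordering rule in the updated README text (earlier answers are better) can be sketched in Python as follows. This is only an illustration, not the repository's actual data loader: the function name `preference_pairs` is hypothetical, and it assumes the ranked answers are stored under the `output` key as in the JSON example above.

```python
import json
from itertools import combinations


def preference_pairs(record):
    """Return (better, worse) answer pairs from a ranked `output` list.

    Hypothetical helper: assumes answers are ordered best-first.
    """
    answers = record.get("output")
    if not isinstance(answers, list) or len(answers) < 2:
        raise ValueError("`output` must be a list of at least two strings")
    if not all(isinstance(a, str) and a for a in answers):
        raise ValueError("every answer in `output` must be a non-empty string")
    # combinations() preserves input order, so the first item of each
    # pair always ranks higher than the second.
    return list(combinations(answers, 2))


record = json.loads(
    '{"instruction": "Question", "input": "", '
    '"output": ["Chosen answer", "Rejected answer"]}'
)
pairs = preference_pairs(record)
```

For the two-answer record above, `pairs` holds a single (chosen, rejected) tuple; a list with more answers would yield one pair for every higher-ranked/lower-ranked combination.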