6 Commits

Author SHA1 Message Date
hiyouga
a4167fd925 support badam for all stages
Former-commit-id: 7a1380646119bfe6855f73dd90570defcea05281
2024-04-16 17:44:48 +08:00
hiyouga
1dc963caa6 fix #3083
Former-commit-id: ff9a3f73961a362d0ddc22079f80a85465fffda8
2024-04-01 22:53:52 +08:00
hiyouga
be0a807e8c fix ORPO loss
Former-commit-id: 5544ddde9087f00f9e20b78d0079f20c2f5d1604
2024-04-01 14:42:41 +08:00
hiyouga
52d402e2a9 fix IPO and ORPO loss
Former-commit-id: fc27955732aedbb12003faf19b760e2768b228f2
2024-04-01 14:37:53 +08:00
hiyouga
00e17a377c use log1p in orpo loss
https://github.com/huggingface/trl/pull/1491

Former-commit-id: 3b15d495264b00a4f8716bafea334778874963d7
2024-03-31 19:27:08 +08:00
hiyouga
d764cd8736 support ORPO
Former-commit-id: f44a4c27e2461cdaa1b16865f597a31033c0e6d9
2024-03-31 18:29:50 +08:00