Commit Graph

  • 1348f7d860 fix resize vocab at inference #3022 hiyouga 2024-04-03 18:14:24 +08:00
  • f6530222f7 fix #3116 hiyouga 2024-04-03 14:47:59 +08:00
  • a74a7585e0 update vllm example hiyouga 2024-04-02 22:45:20 +08:00
  • 5bf0cca2b8 update readme hiyouga 2024-04-02 22:17:48 +08:00
  • 755b6511ff update examples hiyouga 2024-04-02 21:09:25 +08:00
  • 35621c6089 add zh readme hiyouga 2024-04-02 20:58:45 +08:00
  • 38b59664e6 update examples hiyouga 2024-04-02 20:51:21 +08:00
  • 933a084999 update examples hiyouga 2024-04-02 20:41:49 +08:00
  • c1510d19c7 update readme hiyouga 2024-04-02 20:37:37 +08:00
  • 2074cf99fb update readme hiyouga 2024-04-02 20:22:11 +08:00
  • b12176d818 simplify readme hiyouga 2024-04-02 20:07:43 +08:00
  • 117b67ea30 add moe aux loss control #3085 hiyouga 2024-04-02 14:26:31 +08:00
  • 03e20bb5c6 fix #3022 hiyouga 2024-04-02 13:58:39 +08:00
  • 0c4a1381a4 Update SECURITY.md hiyouga 2024-04-01 23:30:03 +08:00
  • 9e14501edb set dev version hiyouga 2024-04-01 23:24:08 +08:00
  • 1dc963caa6 fix #3083 hiyouga 2024-04-01 22:53:52 +08:00
  • 85726c91ce add qwen1.5 moe hiyouga 2024-04-01 21:49:40 +08:00
  • 40211db275 fix #3077 hiyouga 2024-04-01 21:35:18 +08:00
  • e7f13098c6 support infer 4bit model on GPUs #3023 hiyouga 2024-04-01 17:34:04 +08:00
  • 61eb3a3d46 update webui hiyouga 2024-04-01 16:23:28 +08:00
  • be0a807e8c fix ORPO loss hiyouga 2024-04-01 14:42:41 +08:00
  • 52d402e2a9 fix IPO and ORPO loss hiyouga 2024-04-01 14:37:53 +08:00
  • c5a46f9113 fix plots hiyouga 2024-03-31 19:43:48 +08:00
  • 00e17a377c use log1p in orpo loss hiyouga 2024-03-31 19:27:08 +08:00
  • 9abd83adb1 update readme hiyouga 2024-03-31 18:46:34 +08:00
  • f0d2afcf90 Merge pull request #3066 from hiyouga/orpo hoshi-hiyouga 2024-03-31 18:42:48 +08:00
  • 1aba442bcd support orpo in webui hiyouga 2024-03-31 18:34:59 +08:00
  • d764cd8736 support ORPO hiyouga 2024-03-31 18:29:50 +08:00
  • 526111a303 tiny fix hiyouga 2024-03-31 00:10:29 +08:00
  • b8364046df Merge pull request #3057 from marko1616/bugfix/lora-model-merge hoshi-hiyouga 2024-03-31 00:07:20 +08:00
  • 1f617c6e08 fix blank line contains whitespace marko1616 2024-03-30 23:46:55 +08:00
  • a6858a36c0 Fix Llama model save for full param train marko1616 2024-03-30 23:45:04 +08:00
  • 6198121923 support save args in webui #2807 #3046 hiyouga 2024-03-30 23:09:12 +08:00
  • b0efebf853 upgrade gradio to 4.21.0 hiyouga 2024-03-30 20:37:08 +08:00
  • fbd0584391 release v0.6.1 (tag: v0.6.1) hiyouga 2024-03-29 11:36:08 +08:00
  • 50224b09cc update readme hiyouga 2024-03-28 22:02:32 +08:00
  • 32dcc5a491 add project hiyouga 2024-03-28 20:24:27 +08:00
  • 9408366a36 fix #2982 hiyouga 2024-03-28 20:22:31 +08:00
  • f0e564beaa update readme hiyouga 2024-03-28 18:35:11 +08:00
  • 14b75a0b93 fix #3010 hiyouga 2024-03-28 18:31:17 +08:00
  • 59e6ebf039 update trainers hiyouga 2024-03-28 18:16:27 +08:00
  • 7cdc16abdf Supports custom data set sampling quantity zhangzc 2024-03-27 14:22:50 +08:00
  • dc540dfaa8 fix ds optimizer hoshi-hiyouga 2024-03-26 23:39:56 +08:00
  • 587e65e442 fix #2981 hiyouga 2024-03-26 17:53:04 +08:00
  • a916688723 fix bug hiyouga 2024-03-26 17:30:12 +08:00
  • 3336422760 fix #2961 hiyouga 2024-03-26 17:26:14 +08:00
  • 04423b916f release v0.6.0 (real) (tag: v0.6.0) hiyouga 2024-03-25 23:37:48 +08:00
  • bf8d2f8eda tiny fix hiyouga 2024-03-25 23:28:52 +08:00
  • 2a5d02fd0f update readme hiyouga 2024-03-25 23:06:13 +08:00
  • ea550ed9e0 Merge pull request #2967 from Tsumugii24/main hoshi-hiyouga 2024-03-25 23:02:22 +08:00
  • 02665cd42b Update README.md Tsumugii24 2024-03-25 22:54:38 +08:00
  • 0c6a94e66d Update README_zh.md Tsumugii24 2024-03-25 22:54:26 +08:00
  • ebd6bc2604 add arg check hiyouga 2024-03-25 22:42:58 +08:00
  • daab85e3e6 release v0.6.0 hiyouga 2024-03-25 22:38:56 +08:00
  • 769d81a83d Update README_zh.md Tsumugii24 2024-03-25 22:31:03 +08:00
  • ac2a401b1d Merge pull request #2963 from rkinas/patch-1 hoshi-hiyouga 2024-03-25 21:49:34 +08:00
  • bb53c18153 Update requirements.txt Remek Kinas 2024-03-25 14:30:58 +01:00
  • 04e0fe9147 tiny fix hiyouga 2024-03-25 21:18:08 +08:00
  • 39f75c7001 Merge pull request #2945 from marko1616/bugfix/lora-model-merge hoshi-hiyouga 2024-03-25 13:36:08 +08:00
  • 7f99cb1817 pass ruff check marko1616 2024-03-24 16:12:10 +08:00
  • c555b2cce3 fix Llama lora merge crash marko1616 2024-03-24 03:06:11 +08:00
  • 2eba1c6851 fix Llama lora merge crash marko1616 2024-03-24 02:55:23 +08:00
  • edeed55664 fix Llama lora merge crash marko1616 2024-03-24 02:44:35 +08:00
  • 92248f9cb2 fix #2936 hiyouga 2024-03-24 00:43:21 +08:00
  • c548ad5e69 fix #2928 hiyouga 2024-03-24 00:34:54 +08:00
  • a57d839e1d fix #2941 hiyouga 2024-03-24 00:28:44 +08:00
  • d88a34bc79 Merge pull request #2919 from 0xez/main hoshi-hiyouga 2024-03-22 12:12:24 +08:00
  • 60cbc9d0e5 Update README_zh.md, fix the release date of the paper 0xez 2024-03-22 10:41:17 +08:00
  • d5005e766f Update README.md, fix the release date of the paper 0xez 2024-03-21 22:14:48 +08:00
  • 4d0753cffe move file hiyouga 2024-03-21 17:05:17 +08:00
  • 1cf0f11840 add citation hiyouga 2024-03-21 17:04:10 +08:00
  • 052e8b2cc6 paper release hiyouga 2024-03-21 13:49:17 +08:00
  • 8963e89633 update readme hiyouga 2024-03-21 00:48:42 +08:00
  • 935ee0a023 support fsdp + qlora hiyouga 2024-03-21 00:36:06 +08:00
  • 5ed234ca63 add orca_dpo_pairs dataset hiyouga 2024-03-20 20:09:06 +08:00
  • 04884a0911 Merge pull request #2905 from SirlyDreamer/main hoshi-hiyouga 2024-03-20 18:09:54 +08:00
  • c7af26a9e3 fix #2777 #2895 hiyouga 2024-03-20 17:59:45 +08:00
  • d8073488be fix #2346 hiyouga 2024-03-20 17:56:33 +08:00
  • 6fc2d7e063 Follow HF_ENDPOINT environment variable SirlyDreamer 2024-03-20 08:31:30 +00:00
  • e93c7cdb80 Updated README with new information khazic 2024-03-20 14:38:08 +08:00
  • c32d6c8250 Updated README with new information khazic 2024-03-20 14:21:16 +08:00
  • 757158da63 Updated README with new information 刘一博 2024-03-20 14:11:28 +08:00
  • ffdacaa618 fix packages hiyouga 2024-03-17 22:32:03 +08:00
  • e194efab10 fix patcher hiyouga 2024-03-15 19:18:42 +08:00
  • 772fc2eac7 Merge pull request #2849 from S3Studio/DockerizeSupport hoshi-hiyouga 2024-03-15 19:16:02 +08:00
  • ed020579dc fix export hiyouga 2024-03-15 15:06:30 +08:00
  • 096869c7b6 Use official Nvidia base image S3Studio 2024-03-14 18:03:33 +08:00
  • c6873211e9 improve Docker build and runtime parameters S3Studio 2024-03-12 14:05:10 +08:00
  • 623ee1bd88 tiny fix hiyouga 2024-03-14 21:19:06 +08:00
  • aabe90343e fix export hiyouga 2024-03-14 18:17:01 +08:00
  • 764cfb506d fix bug hiyouga 2024-03-13 23:55:31 +08:00
  • 249ad56075 fix bug hiyouga 2024-03-13 23:43:42 +08:00
  • 46f99ff277 improve lora+ impl. hiyouga 2024-03-13 23:32:51 +08:00
  • 73f4513c84 Merge pull request #2830 from qibaoyuan/lora_plus hoshi-hiyouga 2024-03-13 20:15:46 +08:00
  • 3c91e86268 [FEATURE]: ADD LORA+ ALGORITHM 齐保元 2024-03-13 19:43:27 +08:00
  • 42473ec150 fix #2817 hiyouga 2024-03-13 12:42:03 +08:00
  • 6a4e4b9c5b fix #2802 hiyouga 2024-03-13 12:33:45 +08:00
  • 9a784fb4f3 fix kv cache hiyouga 2024-03-13 01:21:50 +08:00
  • 43fd80a1aa support QDoRA hiyouga 2024-03-12 22:12:42 +08:00
  • e6ab1a57ea patch for gemma cpt hiyouga 2024-03-12 21:21:54 +08:00