Commit Graph

  • f6a53d83c8 update readme hiyouga 2024-04-22 00:51:35 +08:00
  • 4ec56dd958 update readme hiyouga 2024-04-22 00:42:25 +08:00
  • ba06eb65ca update readme and examples hiyouga 2024-04-22 00:37:32 +08:00
  • be716972fe remove extras hiyouga 2024-04-22 00:35:41 +08:00
  • 719585a128 update readme hiyouga 2024-04-22 00:21:01 +08:00
  • 348f29aa50 set dev version hiyouga 2024-04-21 23:14:30 +08:00
  • c8fe3f544b release v0.6.3 (tag: v0.6.3) hiyouga 2024-04-21 23:13:23 +08:00
  • 0f1ad7140f fix #3366 hiyouga 2024-04-21 21:34:25 +08:00
  • 233e167f68 fix optimizers hiyouga 2024-04-21 20:40:54 +08:00
  • 1d341dcd83 fix #3365 hiyouga 2024-04-21 19:20:18 +08:00
  • d16561e7a4 fix bug in galore optimizer hiyouga 2024-04-21 18:53:22 +08:00
  • f8e219dc81 fix mod stuff hiyouga 2024-04-21 18:11:10 +08:00
  • 3365cc8cf0 Merge pull request #3338 from astramind-ai/main hoshi-hiyouga 2024-04-21 18:05:52 +08:00
  • 3a5e68b7d9 fix #3348 hoshi-hiyouga 2024-04-20 10:34:09 +08:00
  • 0cb596fee1 add dpo mix dataset hiyouga 2024-04-20 01:31:38 +08:00
  • b3b5b530d1 fix #3352 hiyouga 2024-04-19 22:40:01 +08:00
  • 9225c15c88 fix llama3 template hiyouga 2024-04-19 15:46:51 +08:00
  • abd9fed445 fix small typo Marco 2024-04-18 20:33:29 +02:00
  • 44cda2eece Added Mixture of Depths Marco 2024-04-18 20:31:24 +02:00
  • 8397808d1d support llama3 hoshi-hiyouga 2024-04-19 01:13:50 +08:00
  • 9e1bd6420d fix #3324 hiyouga 2024-04-18 15:34:45 +08:00
  • 619264c854 tiny fix hiyouga 2024-04-18 00:22:17 +08:00
  • 1ebac62e3d update readme hiyouga 2024-04-17 23:40:49 +08:00
  • ce9bdb3509 add mixtral 8x22B models hiyouga 2024-04-17 23:35:59 +08:00
  • 0c8d6369ac add CodeQwen models hiyouga 2024-04-17 23:27:22 +08:00
  • bee796f6b5 fix #3316 hiyouga 2024-04-17 22:54:34 +08:00
  • 9f6349a333 fix #3317 hiyouga 2024-04-17 22:17:19 +08:00
  • 171a029c5e lint hiyouga 2024-04-16 18:21:09 +08:00
  • eaefaa0fe0 Merge pull request #3291 from codemayq/main hoshi-hiyouga 2024-04-16 18:12:09 +08:00
  • d301f0a64b Update parser.py hiyouga 2024-04-16 18:09:31 +08:00
  • 0a1578e4e3 update readme and gradio version hiyouga 2024-04-16 18:09:16 +08:00
  • a4167fd925 support badam for all stages hiyouga 2024-04-16 17:44:48 +08:00
  • 42084e08ae Merge pull request #3287 from Ledzy/badam hoshi-hiyouga 2024-04-16 17:32:16 +08:00
  • 9d23f5dc89 Update utils.py hoshi-hiyouga 2024-04-16 17:30:12 +08:00
  • 5978427ae0 Update trainer.py hoshi-hiyouga 2024-04-16 17:29:52 +08:00
  • c7c216069c Update utils.py hoshi-hiyouga 2024-04-16 17:29:30 +08:00
  • cde9d1b917 Update patcher.py hoshi-hiyouga 2024-04-16 17:29:19 +08:00
  • 96213f04b0 Update adapter.py hoshi-hiyouga 2024-04-16 17:28:12 +08:00
  • 7ecea08b9b Update parser.py hoshi-hiyouga 2024-04-16 17:27:25 +08:00
  • 191971865d Update parser.py hoshi-hiyouga 2024-04-16 17:27:02 +08:00
  • ff4f587dd9 Update finetuning_args.py hoshi-hiyouga 2024-04-16 17:26:30 +08:00
  • de728d0371 Update sft.sh hoshi-hiyouga 2024-04-16 17:25:40 +08:00
  • d08e09642d Update requirements.txt hoshi-hiyouga 2024-04-16 17:10:17 +08:00
  • 351493b183 Update setup.py hoshi-hiyouga 2024-04-16 17:10:02 +08:00
  • 86ab47e121 remove badam from core requirements Jonery 2024-04-16 12:25:50 +08:00
  • 6dd6b3e396 resolve gradient checkpointing issue. Jonery 2024-04-16 12:05:27 +08:00
  • 5f1418a68b add check codingma 2024-04-16 10:56:39 +08:00
  • 7b97a79efc support for previewing custom dataset in directory format codingma 2024-04-16 10:43:14 +08:00
  • ce4f653121 add empty template hiyouga 2024-04-16 03:10:02 +08:00
  • b053c6454e update readme hiyouga 2024-04-16 02:36:54 +08:00
  • ebf0f4a77c update readme hiyouga 2024-04-16 02:35:36 +08:00
  • efa808069a support unsloth 2024.4 hiyouga 2024-04-16 00:25:03 +08:00
  • b5c5283dd6 add codegemma hiyouga 2024-04-16 00:11:15 +08:00
  • b638c65519 support cohere commandR #3184 hiyouga 2024-04-15 23:26:42 +08:00
  • d4d471450f Feature BAdam Jonery 2024-04-15 23:15:27 +08:00
  • 3144bdec2c Merge pull request #3254 from marko1616/feature/Add-support-for-CohereForAI/c4ai-command-r-plus hoshi-hiyouga 2024-04-15 22:59:35 +08:00
  • c6d6c4c209 Update template.py hoshi-hiyouga 2024-04-15 22:58:01 +08:00
  • f5f1589662 Update constants.py hoshi-hiyouga 2024-04-15 22:56:55 +08:00
  • 276f2cb24e update examples hiyouga 2024-04-15 22:14:34 +08:00
  • 952b785bb3 change default_system accroding to official template marko1616 2024-04-15 20:45:46 +08:00
  • 72dd676208 Revert "Add support for function call(Not strictly following origin)" marko1616 2024-04-15 20:27:09 +08:00
  • dfaa31e991 Add support for function call(Not strictly following origin) marko1616 2024-04-15 20:16:52 +08:00
  • 86556b1c74 Merge pull request #3261 from khazic/main hoshi-hiyouga 2024-04-15 16:30:57 +08:00
  • 0c80751e87 Merge pull request #3276 from liu-zichen/fix_mixtral hoshi-hiyouga 2024-04-15 15:38:16 +08:00
  • 9338f878a3 fix #3273 hiyouga 2024-04-15 15:32:58 +08:00
  • fde3d91242 fix: mixtral output_router_logits liuzc 2024-04-15 12:11:49 +08:00
  • 19adfb88a9 Upgrade README.md khazic 2024-04-13 20:50:49 +08:00
  • daaafa900a Added specimens for single-card full parameter prediction khazic 2024-04-13 20:45:19 +08:00
  • 0dcc9e0bca Typo fix marko1616 2024-04-13 17:30:21 +08:00
  • aeec78b35c Typo fix marko1616 2024-04-13 07:52:11 +08:00
  • c991654cb4 Add c4ai-command-r-plus link marko1616 2024-04-13 07:32:40 +08:00
  • f328413646 Add template&support(Not tested) marko1616 2024-04-13 04:31:33 +08:00
  • 106a0104da fix #3247 hiyouga 2024-04-12 17:41:33 +08:00
  • 5486ea09e3 fix model card hiyouga 2024-04-12 17:11:59 +08:00
  • 31bbbb6d13 fix #3238 hiyouga 2024-04-12 14:28:11 +08:00
  • 1a77de82fa set dev version hiyouga 2024-04-11 20:27:34 +08:00
  • 7468f2535c release v0.6.2 (tag: v0.6.2) hiyouga 2024-04-11 20:08:51 +08:00
  • 38e4f22605 Merge branch 'main' of https://github.com/hiyouga/LLaMA-Factory hiyouga 2024-04-10 23:58:18 +08:00
  • 2bc2fe7b5e fix #3225 hiyouga 2024-04-10 23:57:59 +08:00
  • 6d0140d8a0 Merge pull request #3201 from kno10/patch-1 and fix #3200 hoshi-hiyouga 2024-04-10 00:58:48 +08:00
  • 7856f98965 Update adapter.py hoshi-hiyouga 2024-04-10 00:57:51 +08:00
  • e25ddef08c Update adapter.py hoshi-hiyouga 2024-04-10 00:57:30 +08:00
  • 95a4589bbf Pass additional_target to unsloth Erich Schubert 2024-04-09 17:53:40 +02:00
  • 566d71b7a9 fix quant infer and qwen2moe hiyouga 2024-04-09 17:12:59 +08:00
  • 6030a4a720 tiny fix hiyouga 2024-04-08 21:28:39 +08:00
  • 5dc0cb94d4 Merge pull request #3161 from hiyouga/feature/add-mediatek-model hoshi-hiyouga 2024-04-08 20:56:51 +08:00
  • 325dafcbb0 add empty line codingma 2024-04-07 18:28:08 +08:00
  • 1a8a8b8651 rename template to breeze codingma 2024-04-07 18:27:20 +08:00
  • 61a495cb1e Merge pull request #3160 from sliderSun/main hoshi-hiyouga 2024-04-07 18:00:40 +08:00
  • 75866aa020 rename template to breeze codingma 2024-04-07 11:39:54 +08:00
  • 9e4fda326d support https://github.com/hiyouga/LLaMA-Factory/issues/3152 codingma 2024-04-07 11:34:01 +08:00
  • 1131ddfaff fix spell error sliderSun 2024-04-07 10:59:15 +08:00
  • 9f437b5c43 support Qwen1.5-32B sliderSun 2024-04-07 10:56:03 +08:00
  • 0cc03d3f05 support Qwen1.5-32B sliderSun 2024-04-07 10:26:13 +08:00
  • 04fc2f78bf update readme hiyouga 2024-04-07 00:48:24 +08:00
  • 3ac333fc6a update examples hiyouga 2024-04-04 14:48:21 +08:00
  • a246ac1914 tiny fix hiyouga 2024-04-04 02:19:03 +08:00
  • 48ceac845c back to gradio 4.21 and fix chat hiyouga 2024-04-04 02:07:20 +08:00
  • b1986a06b9 fix bug in latest gradio hiyouga 2024-04-04 00:55:31 +08:00
  • 43d134ba29 fix requires for windows hiyouga 2024-04-03 21:56:43 +08:00