Kingsley
|
833f6027b1
|
[fix] fit neat_packing & mrope model packing (#10283)
Co-authored-by: Yaowei Zheng <hiyouga@buaa.edu.cn>
|
2026-03-20 16:50:11 +08:00 |
|
Kingsley
|
a3d44e3152
|
[mca] support qwen3.5 (#10265)
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
|
2026-03-10 10:55:16 +08:00 |
|
Parag Ekbote
|
eb976d75a2
|
[tracker] Add Trackio Integration for LlamaFactory (#10165)
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
|
2026-03-03 17:19:37 +08:00 |
|
Junyou Su
|
675ce8cc7f
|
[algo] add ASFT (#10174)
|
2026-02-12 13:12:14 +08:00 |
|
浮梦
|
bf04ca6af8
|
[deps] adapt to transformers v5 (#10147)
Co-authored-by: frozenleaves <frozen@Mac.local>
Co-authored-by: hiyouga <hiyouga@buaa.edu.cn>
|
2026-02-02 12:07:19 +08:00 |
|
xvxuopop
|
762b480131
|
[feature] support using ray.remote to start distributed training. (#10109)
|
2026-01-28 16:05:29 +08:00 |
|
Meng WANG
|
e70651ac58
|
[feat] support all_exhausted_without_replacement in datasets.interleave_datasets (#10112)
|
2026-01-20 15:54:07 +08:00 |
|
Kingsley
|
db2f794f7b
|
[misc] update mcore related docker and mca supported models (#10114)
|
2026-01-19 14:55:16 +08:00 |
|
Yaowei Zheng
|
d7d734d54c
|
[misc] fix fp8 (#9742)
|
2026-01-09 16:17:26 +08:00 |
|
Yaowei Zheng
|
4c1eb922e2
|
[misc] fix parser (#9730)
|
2026-01-07 17:36:08 +08:00 |
|
yanglele
|
e944dc442c
|
[feature] add support for EAFT loss (#9720)
Co-authored-by: Yaowei Zheng <hiyouga@buaa.edu.cn>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
|
2026-01-06 23:07:12 +08:00 |
|
Yaowei Zheng
|
8600530002
|
[misc] lint (#9710)
|
2026-01-04 13:47:56 +08:00 |
|
Santosh Bhavani
|
355d5c5e5a
|
[fix] fp8: add Transformer Engine backend support (#9705)
Co-authored-by: Yaowei Zheng <hiyouga@buaa.edu.cn>
|
2026-01-01 10:18:02 +08:00 |
|
浮梦
|
16735b9e35
|
[v1] Refactor kernel plugin (#9669)
Co-authored-by: frozenleaves <frozen@Mac.local>
|
2025-12-31 18:26:48 +08:00 |
|
Copilot
|
eceec8ab69
|
[deps] goodbye python 3.9 (#9677)
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: hiyouga <16256802+hiyouga@users.noreply.github.com>
Co-authored-by: hiyouga <hiyouga@buaa.edu.cn>
|
2025-12-27 02:50:44 +08:00 |
|
ZIYI ZENG
|
b0d49e137f
|
[misc] Support split eval_dataset when explicitly setting "predict_with_generate" (#9604)
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
|
2025-12-20 01:46:00 +08:00 |
|
浮梦
|
2b6f16f261
|
[model] temporarily support npu fused options on v0, powered by v1 kernels (#9520)
Co-authored-by: frozenleaves <frozen@Mac.local>
|
2025-11-27 02:08:36 +08:00 |
|
Yaowei Zheng
|
eaf963f67f
|
[model] update kt code (#9406)
|
2025-11-05 15:27:22 +08:00 |
|
Peilin Li
|
934b3084ee
|
[train] KTransformers SFT as backend engine for LLaMA-Factory (#9400)
Co-authored-by: jimmy128 <jimmy128@noreply.gitcode.com>
Co-authored-by: Yaowei Zheng <hiyouga@buaa.edu.cn>
|
2025-11-04 15:54:12 +08:00 |
|
Yaowei Zheng
|
3ae15da9c0
|
[misc] lint code (#9395)
|
2025-11-03 22:08:59 +08:00 |
|
Kingsley
|
13170577b2
|
[feat] support megatron-LM training by mcore_adapter (#9237)
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Yaowei Zheng <hiyouga@buaa.edu.cn>
|
2025-10-26 16:21:30 +08:00 |
|
Yaowei Zheng
|
d9d67ba62d
|
[misc] fix import error (#9299)
|
2025-10-17 17:46:27 +08:00 |
|
Yaowei Zheng
|
47a7dc1698
|
[deps] upgrade vllm (#9293)
|
2025-10-16 23:20:26 +08:00 |
|
Ximing Xing
|
c867e28093
|
[model] adds semantic initialization support for special tokens (#9267)
Co-authored-by: ximingxing <ximingxing@tencent.com>
|
2025-10-14 17:00:48 +08:00 |
|
Ben Feuer
|
1c44b60e3e
|
[feat] fp8 training (#8960)
Co-authored-by: Benjamin Feuer <penfever@gmail.com>
Co-authored-by: Yaowei Zheng <hiyouga@buaa.edu.cn>
|
2025-10-01 14:32:53 +08:00 |
|
Yaowei Zheng
|
6ffebe5ff7
|
[data] fix qwen omni plugin (#9204)
Co-authored-by: kingsley <kingsleydodonow@gmail.com>
|
2025-09-28 01:02:29 +08:00 |
|
Yaowei Zheng
|
2c31279316
|
[assets] update wechat (#8962)
|
2025-08-19 02:55:09 +08:00 |
|
Zeju Qiu
|
003a2acb1a
|
[feature] adding orthogonal finetuning (OFT) to llama factory (#8623)
Co-authored-by: Zeju <zqiu@g003.internal.cluster.is.localnet>
Co-authored-by: Zeju <zqiu@login2.is.localnet>
Co-authored-by: Yaowei Zheng <hiyouga@buaa.edu.cn>
|
2025-08-18 18:22:47 +08:00 |
|
XLXW
|
1ada15981a
|
[feature] add support for dft loss (#8917)
|
2025-08-15 23:29:57 +08:00 |
|
Kingsley
|
936f4fd78e
|
[feature] Support MPO (#8930)
|
2025-08-15 15:09:59 +08:00 |
|
Yaowei Zheng
|
e695fdfa70
|
[model] add qwen3 nothink (#8869)
|
2025-08-11 23:17:32 +08:00 |
|
Yaowei Zheng
|
dc61e78e77
|
[hparams] fix data args (#8863)
|
2025-08-08 15:35:50 +08:00 |
|
kahlun
|
8a5d6c8a74
|
[data-loader] Allow dataset_dir to accept a dict for in-memory dataset_info (#8845)
|
2025-08-07 16:26:59 +08:00 |
|
Yaowei Zheng
|
a416ab48d8
|
[deps] upgrade vllm to 0.10.0 (#8787)
|
2025-07-30 22:26:38 +08:00 |
|
Yaowei Zheng
|
4b0ec83928
|
[deps] bump transformers to 4.49.0 (#8564)
|
2025-07-07 20:31:50 +08:00 |
|
Ze-Yi LIN
|
16f13d304b
|
[tracking] fix swanlab hparams (#8532)
Co-authored-by: Yaowei Zheng <hiyouga@buaa.edu.cn>
|
2025-07-02 22:08:44 +08:00 |
|
Kingsley
|
bede213da7
|
[assets] update readme (#8519)
|
2025-07-02 15:38:38 +08:00 |
|
Injae Ryou
|
a5a93597b1
|
[parser] update config loading to use OmegaConf #7793 (#8505)
|
2025-07-01 21:05:13 +08:00 |
|
Yaowei Zheng
|
c6c764388c
|
[assets] update readme (#8396)
|
2025-06-17 16:15:20 +08:00 |
|
Yaowei Zheng
|
3a3bae1cfe
|
[data] fix qwen2vl pos ids (#8387)
|
2025-06-17 00:48:54 +08:00 |
|
Yaowei Zheng
|
9a2d1dec62
|
[assets] update wechat (#8385)
|
2025-06-16 18:23:22 +08:00 |
|
Aman Gupta
|
8e4ac78607
|
[trainer] Add LD-DPO objective (#8362)
|
2025-06-12 16:10:38 +08:00 |
|
hoshi-hiyouga
|
2bf8e993ab
|
[data] fix shared file system (#8179)
|
2025-05-27 18:36:03 +08:00 |
|
hoshi-hiyouga
|
9ae17cd173
|
[deps] update to transformers 4.52 (#8125)
|
2025-05-21 05:16:18 +08:00 |
|
hoshi-hiyouga
|
9b5baa97f0
|
[data] qwen3 fixes (#8109)
|
2025-05-20 02:00:30 +08:00 |
|
hoshi-hiyouga
|
45030ff803
|
[model] switch to gptqmodel (#8108)
|
2025-05-19 22:25:40 +08:00 |
|
Saiya
|
ab41f7956c
|
[infer] support lora adapter for SGLang backend (#8067)
|
2025-05-16 23:33:47 +08:00 |
|
hoshi-hiyouga
|
13b05e74f1
|
[hparam] add enable think argument (#7928)
|
2025-04-30 17:21:30 +08:00 |
|
hoshi-hiyouga
|
73198a6645
|
[misc] fix uv (#7913)
|
2025-04-30 07:45:03 +08:00 |
|
hoshi-hiyouga
|
d4ee44bdef
|
[data] add eval_on_each_dataset arg (#7912)
|
2025-04-30 06:56:43 +08:00 |
|