LlamaFactory

mirror of https://github.com/hiyouga/LlamaFactory.git synced 2026-03-24 00:53:07 +00:00

Author	SHA1	Message	Date
Yaowei Zheng	eaf963f67f	[model] update kt code (#9406)	2025-11-05 15:27:22 +08:00
Peilin Li	934b3084ee	[train] KTransformers SFT as backend engine for LLaMA-Factory (#9400) Co-authored-by: jimmy128 <jimmy128@noreply.gitcode.com> Co-authored-by: Yaowei Zheng <hiyouga@buaa.edu.cn>	2025-11-04 15:54:12 +08:00
Yaowei Zheng	3ae15da9c0	[misc] lint code (#9395)	2025-11-03 22:08:59 +08:00
Kingsley	13170577b2	[feat] support megatron-LM training by mcore_adapter (#9237) Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Co-authored-by: Yaowei Zheng <hiyouga@buaa.edu.cn>	2025-10-26 16:21:30 +08:00
Yaowei Zheng	d9d67ba62d	[misc] fix import error (#9299)	2025-10-17 17:46:27 +08:00
Yaowei Zheng	47a7dc1698	[deps] upgrade vllm (#9293)	2025-10-16 23:20:26 +08:00
Ximing Xing	c867e28093	[model] adds semantic initialization support for special tokens (#9267) Co-authored-by: ximingxing <ximingxing@tencent.com>	2025-10-14 17:00:48 +08:00
Ben Feuer	1c44b60e3e	[feat] fp8 training (#8960) Co-authored-by: Benjamin Feuer <penfever@gmail.com> Co-authored-by: Yaowei Zheng <hiyouga@buaa.edu.cn>	2025-10-01 14:32:53 +08:00
Yaowei Zheng	6ffebe5ff7	[data] fix qwen omni plugin (#9204) Co-authored-by: kingsley <kingsleydodonow@gmail.com>	2025-09-28 01:02:29 +08:00
Yaowei Zheng	2c31279316	[assets] update wechat (#8962)	2025-08-19 02:55:09 +08:00
Zeju Qiu	003a2acb1a	[feature] adding orthogononal finetuning (OFT) to llama factory (#8623) Co-authored-by: Zeju <zqiu@g003.internal.cluster.is.localnet> Co-authored-by: Zeju <zqiu@login2.is.localnet> Co-authored-by: Yaowei Zheng <hiyouga@buaa.edu.cn>	2025-08-18 18:22:47 +08:00
XLXW	1ada15981a	[feature] add support for dft loss (#8917)	2025-08-15 23:29:57 +08:00
Kingsley	936f4fd78e	[feature] Support MPO (#8930)	2025-08-15 15:09:59 +08:00
Yaowei Zheng	e695fdfa70	[model] add qwen3 nothink (#8869)	2025-08-11 23:17:32 +08:00
Yaowei Zheng	dc61e78e77	[hparams] fix data args (#8863)	2025-08-08 15:35:50 +08:00
kahlun	8a5d6c8a74	[data-loader] Allow `dataset_dir` to accept a dict for in-memory dataset_info (#8845)	2025-08-07 16:26:59 +08:00
Yaowei Zheng	a416ab48d8	[deps] upgrade vllm to 0.10.0 (#8787)	2025-07-30 22:26:38 +08:00
Yaowei Zheng	4b0ec83928	[deps] bump transformers to 4.49.0 (#8564)	2025-07-07 20:31:50 +08:00
Ze-Yi LIN	16f13d304b	[tracking] fix swanlab hparams (#8532) Co-authored-by: Yaowei Zheng <hiyouga@buaa.edu.cn>	2025-07-02 22:08:44 +08:00
Kingsley	bede213da7	[assets] update readme (#8519)	2025-07-02 15:38:38 +08:00
Injae Ryou	a5a93597b1	[parser] update config loading to use OmegaConf #7793 (#8505)	2025-07-01 21:05:13 +08:00
Yaowei Zheng	c6c764388c	[assets] update readme (#8396)	2025-06-17 16:15:20 +08:00
Yaowei Zheng	3a3bae1cfe	[data] fix qwen2vl pos ids (#8387)	2025-06-17 00:48:54 +08:00
Yaowei Zheng	9a2d1dec62	[assets] update wechat (#8385)	2025-06-16 18:23:22 +08:00
Aman Gupta	8e4ac78607	[trainer] Add LD-DPO objective (#8362)	2025-06-12 16:10:38 +08:00
hoshi-hiyouga	2bf8e993ab	[data] fix shared file system (#8179)	2025-05-27 18:36:03 +08:00
hoshi-hiyouga	9ae17cd173	[deps] update to transformers 4.52 (#8125)	2025-05-21 05:16:18 +08:00
hoshi-hiyouga	9b5baa97f0	[data] qwen3 fixes (#8109)	2025-05-20 02:00:30 +08:00
hoshi-hiyouga	45030ff803	[model] switch to gptqmodel (#8108)	2025-05-19 22:25:40 +08:00
Saiya	ab41f7956c	[infer] support lora adapter for SGLang backend (#8067)	2025-05-16 23:33:47 +08:00
hoshi-hiyouga	13b05e74f1	[hparam] add enable think argument (#7928)	2025-04-30 17:21:30 +08:00
hoshi-hiyouga	73198a6645	[misc] fix uv (#7913)	2025-04-30 07:45:03 +08:00
hoshi-hiyouga	d4ee44bdef	[data] add eval_on_each_dataset arg (#7912)	2025-04-30 06:56:43 +08:00
Eric Tang	ef03832cd4	[ray] add storage filesystem to ray config (#7854)	2025-04-27 22:12:40 +08:00
Kingsley	fa0eb91f1f	[data] fix internvl plugin (#7817)	2025-04-23 00:58:22 +08:00
hoshi-hiyouga	fddcd43c88	[trainer] support early stop (#7797)	2025-04-22 01:59:33 +08:00
hoshi-hiyouga	b07628dea5	[example] add bash usage (#7794)	2025-04-22 00:25:51 +08:00
Juanxi Tian	12ada72ed4	[trainer] Add Muon Optimizer (#7749) Co-authored-by: hoshi-hiyouga <hiyouga@buaa.edu.cn>	2025-04-21 23:38:37 +08:00
hoshi-hiyouga	416853dd25	[parser] support omegaconf (#7793)	2025-04-21 23:30:30 +08:00
flashJd	0ac641326b	[misc] fix new tokens adding (#7253) Co-authored-by: hoshi-hiyouga <hiyouga@buaa.edu.cn>	2025-04-21 23:19:02 +08:00
hoshi-hiyouga	d222f63cb7	[infer] set env for vllm ascend (#7745)	2025-04-17 01:08:55 +08:00
hoshi-hiyouga	3df021d4d7	[deps] upgrade vllm (#7728)	2025-04-15 14:57:40 +08:00
hoshi-hiyouga	7c61b35106	[misc] upgrade cli (#7714)	2025-04-14 15:41:22 +08:00
Eric Tang	bb8d79bae2	[ray] allow for specifying ray.init kwargs (i.e. runtime_env) (#7647) * ray init kwargs * Update trainer_utils.py * fix ray args --------- Co-authored-by: hoshi-hiyouga <hiyouga@buaa.edu.cn>	2025-04-10 11:31:05 +08:00
hoshi-hiyouga	c3c0efbaa0	[misc] fix packing and eval plot (#7623)	2025-04-07 18:20:57 +08:00
hoshi-hiyouga	5e22597ff1	[infer] vllm video/audio inference (#7566)	2025-04-02 02:27:04 +08:00
hoshi-hiyouga	2bfcad2394	[model] fix kv cache (#7564)	2025-04-01 23:07:46 +08:00
Billy Cao	00409ff28a	[data] shard the dataset to allow multiprocessing when streaming is enabled (#7530) * Shard the dataset when streaming to allow multiprocessing * Allow user to not set dataset_shards to ensure backward compatibility	2025-04-01 15:36:23 +08:00
Kingsley	7eed496336	[model] add Qwen2.5-Omni model (#7537) * preserve image_sizes * preserve image_sizes * init plugin * support audio-text2text lora * nit * support image/video-text2text, audio-text2text * remove args * remove lines * add docs && nit * remove some comments * fix && add merge part script * add license	2025-03-31 20:39:35 +08:00
Xu-pixel	b578a7d5b6	[3rdparty] support swanlab lark notification (#7481)	2025-03-27 01:52:01 +08:00

217 Commits