Commit Graph

  • 70653026f5 [fix] make position_id_per_seconds configurable for Qwen2OmniPlugin (#10281) (main) LincolnBurrows2017 2026-03-16 19:42:38 +08:00
  • 246192abd2 [data] correct gpt_oss template format_assistant (#10269) Ruijie Hou 2026-03-10 21:36:38 +08:00
  • 0258dc14d0 [docker] update npu docker (#10268) 浮梦 2026-03-10 19:37:27 +08:00
  • 3045adf0ba [fix] fallback to audio_processor when feature_extractor is missing (#10267) xxddccaa 2026-03-10 19:36:41 +08:00
  • a3d44e3152 [mca] support qwen3.5 (#10265) Kingsley 2026-03-10 10:55:16 +08:00
  • edeb953bc7 [data] convert filter() to list in read_cloud_json to fix broken empty-check (#10260) JiangNan 2026-03-09 17:12:53 +08:00
  • d045794387 [docs] fix Python version requirement from 3.10 to >=3.11.0 (#10259) yizhouChen 2026-03-09 16:44:07 +08:00
  • 9501c3308a [train] fix compatibility issue with HuggingFace Dataset Column when sav… (#10254) pyx 2026-03-06 18:44:57 +08:00
  • 0ee1c42c2b [v1] Support meta loading for full and free (#10236) jiaqiw09 2026-03-05 23:15:27 +08:00
  • 3061f48d55 [ray] fix get ray head ip (#10252) SnowCharm 2026-03-05 23:14:38 +08:00
  • 2d9bd2aa14 [fix] qwen3.5 projector path (#10242) LittleYanlin 2026-03-04 01:31:09 +08:00
  • c0245c43fc [model] support Qwen3.5 all series models (#10237) Hertz 2026-03-03 17:34:59 +08:00
  • eb976d75a2 [tracker] Add Trackio Integration for LlamaFactory (#10165) Parag Ekbote 2026-03-03 14:49:37 +05:30
  • b5cb7cb0e6 [misc] fix constants (#10232) Yaowei Zheng 2026-03-02 11:10:48 +08:00
  • 0779846513 [infer] support mixed multimodal payloads (#10225) Philip Ottesen 2026-02-28 13:26:53 +01:00
  • 45d335c709 [v1] add seed for training and fix gradient checkpointing (#10211) jiaqiw09 2026-02-28 18:16:06 +08:00
  • 816480012f [fix] register visual part for Qwen3.5 (#10227) Kingsley 2026-02-28 16:39:24 +08:00
  • d3bf882e87 [docker] upgrade to ROCm 7.2 base image, drop PyTorch reinstall (#10223) Mikko Tukiainen 2026-02-27 14:16:33 +02:00
  • 589da21d32 [model] support Aeva (#10214) 娄宗志 2026-02-26 23:03:13 +08:00
  • 122cd46084 [model] update constants (#10220) Yaowei Zheng 2026-02-26 21:13:56 +08:00
  • 2b8b871475 [model] Adapt Qwen3.5 (#10213) 浮梦 2026-02-26 20:45:02 +08:00
  • aab9b400bb [model] Add DeepSpeed Z3 leaf module for Qwen3-Next (#10194) Shanay Mehta 2026-02-24 17:24:37 +05:30
  • 50599c719b [misc] remove safe_serialization arg for transformers v5 compatibility (#10208) P. Clawmogorov 2026-02-24 04:14:19 +01:00
  • a0f3ad0cee [mca] update supported models (#10196) Kingsley 2026-02-20 22:02:49 +08:00
  • f80e15dbb4 [ci] fix ut huggingface hub 429 error when transformers>=5.0.0 (#10155) jiaqiw09 2026-02-12 22:14:10 +08:00
  • 991267fd3b [v1] support quantization (#10161) sunyi0505 2026-02-12 20:37:41 +08:00
  • 5c52afa30d [v1] support deepspeed (#10181) 浮梦 2026-02-12 17:24:30 +08:00
  • 675ce8cc7f [algo] add ASFT (#10174) Junyou Su 2026-02-12 13:12:14 +08:00
  • ab073f4c13 [v1] add LoRA/Freeze support and merge workflow (#10157) jiaqiw09 2026-02-12 13:02:09 +08:00
  • 184304b5b4 [model] add liger kernel support for Qwen3-Next (#10176) Shanay Mehta 2026-02-10 19:17:48 +05:30
  • d3ebd5678d [model] support GLM-OCR SFT (#10183) Xue Yadong 2026-02-10 21:41:01 +08:00
  • 1d5e8ebcd0 [v1] init commit for v1 docs (#10145) 浮梦 2026-02-09 19:43:55 +08:00
  • ea644d04ec [model] support GLM-4.7-Flash SFT (#10173) Shanay Mehta 2026-02-09 08:10:44 +05:30
  • 92fa3df4c4 [trainer] add dpo/kto fsdp fsdp2 support (#10127) Username_Full 2026-02-04 23:27:12 +08:00
  • 8bedfafa4e [model] support MiniCPM-o-4.5 (#10163) Hertz 2026-02-04 23:21:27 +08:00
  • 1a02717fa8 [assets] update readme (#10159) Yaowei Zheng 2026-02-03 19:11:15 +08:00
  • e7cb145f5d [logging] Fix race condition in LoggerHandler during multi-GPU training (#10156) ゆり 2026-02-03 11:14:07 +08:00
  • b53d7037c2 [model] support youtu-vl model (#10152) Hertz 2026-02-02 21:42:43 +08:00
  • bf04ca6af8 [deps] adapt to transformers v5 (#10147) 浮梦 2026-02-02 12:07:19 +08:00
  • 762b480131 [feature] support using ray.remote to start distributed training. (#10109) xvxuopop 2026-01-28 16:05:29 +08:00
  • 9640f79ae5 [fix] add visual.pos_embed to Qwen3-VL visual model keys (#10139) Jewon Lee 2026-01-27 17:33:01 +09:00
  • 7ef19eea00 [v0] Fix reward model training safetensors saving (#10137) jiaqiw09 2026-01-27 16:27:14 +08:00
  • f9f11dcb97 [v1] support training with fsdp2 (#9773) 浮梦 2026-01-25 19:41:58 +08:00
  • 641bfdd482 chore: Update outdated GitHub Actions versions (#10123) Pádraic Slattery 2026-01-25 12:12:39 +01:00
  • e70651ac58 [feat] support all_exhausted_without_replacement in datasets.interleave_datasets (#10112) Meng WANG 2026-01-20 15:54:07 +08:00
  • db2f794f7b [misc] update mcore related docker and mca supported models (#10114) Kingsley 2026-01-19 14:55:16 +08:00
  • 44eadbda1c [v1] fix kernel moe patch (#9867) jiaqiw09 2026-01-17 09:24:54 +08:00
  • 9829ae0a77 [ci] using mp to run kernel test (#9754) 浮梦 2026-01-13 19:43:59 +08:00
  • 958b9c3468 [v1] add sft (#9752) Yaowei Zheng 2026-01-12 03:15:01 +08:00
  • 4d3621e3d3 [model] fixed&added Hunyuan models (#9750) Hertz 2026-01-12 01:15:00 +08:00
  • a296723697 [v1] upgrade batching (#9751) Yaowei Zheng 2026-01-12 00:21:36 +08:00
  • 15b87f3125 [model] support HY-MT model (#9746) Hertz 2026-01-11 16:25:56 +08:00
  • 9f73a6eb23 [deps] fix package (#9745) Yaowei Zheng 2026-01-10 04:27:53 +08:00
  • b2effbd77c [v1] add batch generator (#9744) Yaowei Zheng 2026-01-10 04:24:09 +08:00
  • d7d734d54c [misc] fix fp8 (#9742) Yaowei Zheng 2026-01-09 16:17:26 +08:00
  • 8abb8fb533 [v1] use async streamer (#9741) Yaowei Zheng 2026-01-09 16:07:40 +08:00
  • 766d5ae6ad [ci] fix workflow (#9738) Yaowei Zheng 2026-01-09 14:48:16 +08:00
  • 5cccaeec82 [model] clean obsolete models (#9736) Yaowei Zheng 2026-01-09 14:08:18 +08:00
  • 5fb5d7ebd3 [model] support for microsoft's Phi-4-mini (#9734) Jackey 2026-01-09 12:24:45 +08:00
  • 03a70ba8dd [fix] correct ktransformers example config paths and templates (#9732) Peilin Li 2026-01-08 10:52:50 +08:00
  • 5cfd804b59 [refactor] rename lfm template to lfm2 and add LFM 2.5 to README (#9731) Vo Van Phuc 2026-01-07 18:25:04 +07:00
  • 4c1eb922e2 [misc] fix parser (#9730) Yaowei Zheng 2026-01-07 17:36:08 +08:00
  • 958fb523a2 [model] support LiquidAI's LFM2.5-VL vision-language model (#9729) Vo Van Phuc 2026-01-07 16:20:29 +07:00
  • b4e051bea4 [model] support for LiquidAI's LFM2.5 (Liquid Foundation Models) (#9726) Vo Van Phuc 2026-01-07 13:14:47 +07:00
  • d43e1007e8 [ci] improve cuda ci cache (#9725) 浮梦 2026-01-07 12:34:40 +08:00
  • f89d9367e5 [assets] update README.md (#9724) Xunpeng Xiao 2026-01-07 12:11:50 +08:00
  • d22de0d4bf [v1] add renderer ut (#9722) Yaowei Zheng 2026-01-07 02:06:07 +08:00
  • ea0b4e2466 [v1] add cli sampler (#9721) Yaowei Zheng 2026-01-06 23:31:27 +08:00
  • e944dc442c [feature] add support for EAFT loss (#9720) yanglele 2026-01-06 23:07:12 +08:00
  • 68119e5522 [misc] Add a PyTorch version warning for Conv3D. (#9715) Xunpeng Xiao 2026-01-05 13:26:29 +08:00
  • f60a6e3d01 [v1] add init plugin (#9716) Yaowei Zheng 2026-01-04 20:51:46 +08:00
  • 81b8a50aa5 [deps] Update pyproject.toml and requirements (#9714) jiaqiw09 2026-01-04 19:52:16 +08:00
  • 8600530002 [misc] lint (#9710) Yaowei Zheng 2026-01-04 13:47:56 +08:00
  • 9ae62c6fc0 [model] support Youtu-LLM-2B (#9707) Hertz 2026-01-04 13:17:57 +08:00
  • 0087bc253b [misc] Compatible with an empty architectures field in config.json (#9709) Xunpeng Xiao 2026-01-04 12:11:35 +08:00
  • 355d5c5e5a [fix] fp8: add Transformer Engine backend support (#9705) Santosh Bhavani 2025-12-31 18:18:02 -08:00
  • 6fe6bd290b [misc] set dev version (#9703) Yaowei Zheng 2025-12-31 23:41:40 +08:00
  • 95ac3f2373 [release] Bye 2025 (#9702) (tag: v0.9.4) Yaowei Zheng 2025-12-31 22:22:40 +08:00
  • 000526908a [core deps] upgrade TRL to be between 0.18 and 0.24 (#9617) Username_Full 2025-12-31 20:54:27 +08:00
  • c8d7e85b3e [fix] Fix prediction metrics in scripts/vllm_infer.py to match Transformers (#9701) fivehaitao 2025-12-31 18:30:00 +08:00
  • 16735b9e35 [v1] Refactor kernel plugin (#9669) 浮梦 2025-12-31 18:26:48 +08:00
  • 4e1d69579a [data] add DLR-Web dataset for supervised fine-tuning (#9696) Weize Liu 2025-12-30 07:50:38 -05:00
  • 1857fbdd6b [ci] add cuda workflow (#9682) 浮梦 2025-12-29 20:03:00 +08:00
  • bb1ba31005 [misc] lint mca code (#9692) Kingsley 2025-12-29 11:44:38 +08:00
  • e97d0474fb [ci] Fix NPU device condition in docker workflow (#9688) Copilot 2025-12-28 20:04:59 +08:00
  • 3f0c3dc84d [assets] fix installation (#9687) Yaowei Zheng 2025-12-28 19:29:28 +08:00
  • c107cc22d0 [model] support MiniMax-M1&M2 series (#9680) Hertz 2025-12-28 19:02:05 +08:00
  • 7ef1fba34a [version] fix gradio (#9685) Yaowei Zheng 2025-12-28 05:00:51 +08:00
  • eceec8ab69 [deps] goodbye python 3.9 (#9677) Copilot 2025-12-27 02:50:44 +08:00
  • b44f651e09 [ci] fix docker (#9678) Yaowei Zheng 2025-12-27 02:43:46 +08:00
  • 55590f5ece [misc] fix ci with uv (#9676) Yaowei Zheng 2025-12-27 01:39:13 +08:00
  • a1b1931b4a [breaking] migrate from setuptools to uv (#9673) Copilot 2025-12-26 22:47:23 +08:00
  • 3c17f2722c [model] Update ernie_vl to adapt new version (#9665) Xunpeng Xiao 2025-12-26 19:57:49 +08:00
  • a882e2d5fc [assets] Add GitHub Copilot instructions for repository (#9675) Copilot 2025-12-26 17:32:48 +08:00
  • a754604c11 [misc] fix accelerator (#9661) Yaowei Zheng 2025-12-25 02:11:04 +08:00
  • 6a2eafbae3 [feat] Models trained and inferred with Mxfp4 are dequantized by default (#9652) Xunpeng Xiao 2025-12-24 00:26:40 +08:00
  • 84485406b7 [ci] disable pip cache for ci (#9654) Yaowei Zheng 2025-12-23 18:37:40 +08:00
  • 1c8a42d2f8 [v1&WIP] dataloader init (#9645) Kingsley 2025-12-23 16:29:47 +08:00
  • 7901b2f32e [model] efficient tuning for gpt-oss (#9354) thulyubh22 2025-12-23 16:28:38 +08:00
  • 1f1f5a7d1b [ci] remove docker cache (#9640) Yaowei Zheng 2025-12-22 01:03:10 +08:00