Commit Graph

  • 9620825892 [scripts] add video params for vllm infer (#7992) Kingsley 2025-05-09 21:16:52 +08:00
  • 26cbb03a5f [data] Avoid repetitive tool description warp (#8000) yunhao-tech 2025-05-09 21:16:37 +08:00
  • 5f4b793e04 [docs] add GraphGen (#7974) tpoisonooo 2025-05-07 18:23:11 +08:00
  • 994ab6424a [misc] update liger kernel patch (#7966) hoshi-hiyouga 2025-05-06 20:32:16 +02:00
  • aa9ed4db59 [example] update examples (#7964) hoshi-hiyouga 2025-05-06 17:24:25 +02:00
  • ef86a53063 [model] add mimo7b (#7946) Kingsley 2025-05-06 23:10:30 +08:00
  • bf0286e1e3 [misc] fix qwen2 omni (#7962) hoshi-hiyouga 2025-05-06 15:39:13 +02:00
  • ce7032e1b3 [model] add qwen2 omni 3b (#7945) hoshi-hiyouga 2025-05-03 16:36:51 +08:00
  • 5763017cea [assets] Warp Support README Update (#7887) Eric Chen 2025-05-01 12:08:48 -04:00
  • 13b05e74f1 [hparam] add enable think argument (#7928) hoshi-hiyouga 2025-04-30 17:21:30 +08:00
  • c566e39b7d [data] fix base plugin (#7924) hoshi-hiyouga 2025-04-30 16:28:05 +08:00
  • 052ca871bd [data] optimize qwen3 loss computation (#7923) hoshi-hiyouga 2025-04-30 16:18:00 +08:00
  • 73198a6645 [misc] fix uv (#7913) hoshi-hiyouga 2025-04-30 07:45:03 +08:00
  • d4ee44bdef [data] add eval_on_each_dataset arg (#7912) hoshi-hiyouga 2025-04-30 06:56:43 +08:00
  • 6d2cde43e7 [data] replace eos token for base models (#7911) hoshi-hiyouga 2025-04-30 06:52:28 +08:00
  • 11295cdea0 [data] improve mm plugin (#7910) hoshi-hiyouga 2025-04-30 06:34:28 +08:00
  • 98f23c6584 [model] add qwen3 (#7885) hoshi-hiyouga 2025-04-29 09:34:05 +08:00
  • db9559456c [data] fix qwen2.5 omni template (#7883) Kingsley 2025-04-29 00:58:23 +08:00
  • 3ae5da2a04 [model] fix dsv3 leaf node (#7879) hoshi-hiyouga 2025-04-28 18:11:09 +08:00
  • d173cb50f5 [data] fix qwen2 omni plugin (#7875) hoshi-hiyouga 2025-04-28 14:22:41 +08:00
  • df27d7e48a [trainer] make projector trainable in freeze training (#7872) zhaop-l 2025-04-28 13:19:37 +08:00
  • bb5b83352b [data] fix minicpmo vllm infer (#7870) hoshi-hiyouga 2025-04-28 01:59:53 +08:00
  • 1157f4e246 fix attn patch for kimivl (#7867) Kingsley 2025-04-27 23:12:28 +08:00
  • ef03832cd4 [ray] add storage filesystem to ray config (#7854) Eric Tang 2025-04-27 07:12:40 -07:00
  • 2233b739fa [model] fix vit gradient checkpointing (#7830) hoshi-hiyouga 2025-04-23 22:48:48 +08:00
  • 091d2539e8 Merge commit from fork hoshi-hiyouga 2025-04-23 16:38:27 +08:00
  • c1a7f2ebb2 [model] fix moe zero3 (#7826) hoshi-hiyouga 2025-04-23 15:30:49 +08:00
  • fa0eb91f1f [data] fix internvl plugin (#7817) Kingsley 2025-04-23 00:58:22 +08:00
  • 49f9ed0232 [assets] update model readme (#7804) hoshi-hiyouga 2025-04-22 16:43:56 +08:00
  • 2a564c25d1 [model] add arch check for InternVL (#7803) Kingsley 2025-04-22 16:38:05 +08:00
  • 7500e761d3 [misc] update internvl constants (#7801) Kingsley 2025-04-22 15:53:08 +08:00
  • fddcd43c88 [trainer] support early stop (#7797) hoshi-hiyouga 2025-04-22 01:59:33 +08:00
  • 0e4ce039ee [data] improve mmplugin (#7795) hoshi-hiyouga 2025-04-22 01:25:33 +08:00
  • b07628dea5 [example] add bash usage (#7794) hoshi-hiyouga 2025-04-22 00:25:51 +08:00
  • 12ada72ed4 [trainer] Add Muon Optimizer (#7749) Juanxi Tian 2025-04-21 23:38:37 +08:00
  • 416853dd25 [parser] support omegaconf (#7793) hoshi-hiyouga 2025-04-21 23:30:30 +08:00
  • bd7bc31c79 [data] Fix wrong position ids with packed attention masks (#7754) Changrui Chen 2025-04-21 16:19:36 +01:00
  • 0ac641326b [misc] fix new tokens adding (#7253) flashJd 2025-04-21 23:19:02 +08:00
  • c5ba9106ec [model] fix gemma3 export (#7786) ddddng 2025-04-21 23:07:11 +08:00
  • 3b2d3794a5 [misc] fix bug in constant (#7765) Sachin Beldona 2025-04-21 10:06:31 -05:00
  • b605c20768 [assets] update wechat (#7792) hoshi-hiyouga 2025-04-21 21:29:42 +08:00
  • 39169986ef [trainer] fix pt loss (#7748) hoshi-hiyouga 2025-04-17 03:15:35 +08:00
  • 86ebb219d6 [breaking] bump transformers to 4.45.0 & improve ci (#7746) hoshi-hiyouga 2025-04-17 02:36:48 +08:00
  • d222f63cb7 [infer] set env for vllm ascend (#7745) hoshi-hiyouga 2025-04-17 01:08:55 +08:00
  • 2e518f255f [model] support intern-VL 2.5-3 series (#7258) Kingsley 2025-04-17 00:31:30 +08:00
  • 8f88a4e6a4 [misc] improve entrypoint (#7345) ENg-122 2025-04-16 21:48:23 +08:00
  • b9263ff5ac [infer] support vllm-ascend (#7739) leo-pony 2025-04-16 20:06:47 +08:00
  • ee2ab093a7 [api] fix chat messages (#7732) hoshi-hiyouga 2025-04-15 16:39:08 +08:00
  • 3df021d4d7 [deps] upgrade vllm (#7728) hoshi-hiyouga 2025-04-15 14:57:40 +08:00
  • e252abf051 [docker] patch docker-rocm (#7725) Joe Schoonover 2025-04-15 01:36:39 -04:00
  • 1134baeedd [assets] update model readme (#7724) hoshi-hiyouga 2025-04-15 00:41:09 +08:00
  • 2101399c94 [model] Support Kimi_VL thinking/instruct (#7719) Kingsley 2025-04-15 00:21:58 +08:00
  • 3f91a95250 [misc] fix env vars (#7715) hoshi-hiyouga 2025-04-14 16:04:04 +08:00
  • 7c61b35106 [misc] upgrade cli (#7714) hoshi-hiyouga 2025-04-14 15:41:22 +08:00
  • f518bfba5b [deps] upgrade transformers (#7704) hoshi-hiyouga 2025-04-13 18:11:34 +08:00
  • 8162f94db5 [model] add GLM-4-0414 (#7695) Yuxuan Zhang 2025-04-13 17:10:45 +08:00
  • 1f0c52b73c [deps] fix uv conflicts (#7686) hoshi-hiyouga 2025-04-11 18:02:24 +08:00
  • a8caf09c7f [data] support for specifying a dataset in cloud storage (#7567) Eric Tang 2025-04-09 20:31:35 -07:00
  • bb8d79bae2 [ray] allow for specifying ray.init kwargs (i.e. runtime_env) (#7647) Eric Tang 2025-04-09 20:31:05 -07:00
  • 1c436c9f25 [bugfix] enable_gemma_liger_kernel (#7660) Dain Kim 2025-04-10 12:27:30 +09:00
  • 1b0934bccb [misc] fix cuda warn on intel GPU (#7655) jilongW 2025-04-09 21:37:54 +08:00
  • 4eec541857 [data] add coig-p dataset (#7657) hoshi-hiyouga 2025-04-09 21:18:25 +08:00
  • 89a4f9ec7f [assets] update readme (#7654) hoshi-hiyouga 2025-04-09 18:27:38 +08:00
  • 1abd71b551 [assets] update readme (#7644) hoshi-hiyouga 2025-04-09 01:06:06 +08:00
  • 349c56c51c [data] Fix bugs of use_audio_in_video in Qwen2.5 Omni (#7638) Kingsley 2025-04-08 18:40:10 +08:00
  • acb09fa3a3 [trainer] fix key error (#7635) Shawn Tao 2025-04-08 18:39:50 +08:00
  • f75b91077b [sglang] support transformers 4.51.0 (#7639) Adarsh Shirawalmath 2025-04-08 16:09:23 +05:30
  • c3c0efbaa0 [misc] fix packing and eval plot (#7623) hoshi-hiyouga 2025-04-07 18:20:57 +08:00
  • 5115dc8c7f [assets] update readme (#7612) hoshi-hiyouga 2025-04-06 13:58:49 +08:00
  • 831e7f1cfd [model] add llama4 (#7611) hoshi-hiyouga 2025-04-06 13:42:31 +08:00
  • d4cfa9507e [data] fix qwen2.5 omni plugin (#7578) Kingsley 2025-04-02 23:58:39 +08:00
  • d32c6c014d [data] fix qwen2.5 omni plugin (#7573) Kingsley 2025-04-02 21:28:52 +08:00
  • 7b9deb9410 [trainer] fix batch processing in PPO trainer (#7576) gechengze 2025-04-02 21:17:48 +08:00
  • 5e22597ff1 [infer] vllm video/audio inference (#7566) hoshi-hiyouga 2025-04-02 02:27:04 +08:00
  • 2bfcad2394 [model] fix kv cache (#7564) hoshi-hiyouga 2025-04-01 23:07:46 +08:00
  • a13b1bb49a [model] fix use_cache patching for gemma3 multimodal (#7500) Yu Shi Jie 2025-04-01 04:06:48 -04:00
  • d10467d178 [data] specify position_ids in PackedSupervisedDatasetProcessor for neat_packing (#7318) Ritesh Goru 2025-04-01 13:33:13 +05:30
  • aac70663fd [webui] fix launch with proxy (#7332) taoharry 2025-04-01 15:52:56 +08:00
  • 00409ff28a [data] shard the dataset to allow multiprocessing when streaming is enabled (#7530) Billy Cao 2025-04-01 15:36:23 +08:00
  • d70b3b4bc5 [trainer] new kto mismatch pair creation strategy (#7509) Hao 2025-04-01 15:21:53 +08:00
  • e76eba051d [data] fix qwen2.5 omni collator (#7553) hoshi-hiyouga 2025-04-01 00:15:12 +08:00
  • 7eed496336 [model] add Qwen2.5-Omni model (#7537) Kingsley 2025-03-31 20:39:35 +08:00
  • 0f8296626a [deps] pin pydantic to 2.10.6 (#7546) hoshi-hiyouga 2025-03-31 14:42:28 +08:00
  • 8da1d2fa71 [data] fix pixtral plugin (#7505) Kingsley 2025-03-27 17:06:40 +08:00
  • b578a7d5b6 [3rdparty] support swanlab lark notification (#7481) Xu-pixel 2025-03-27 01:52:01 +08:00
  • 24afceddb7 [trainer] fix wsd scheduler (#7304) Kdump 2025-03-26 15:25:02 +08:00
  • 0583d06676 [model] add qwen2vl 32b & upgrade peft (#7469) hoshi-hiyouga 2025-03-25 12:15:58 +08:00
  • ec6a261568 [model] fix lora on quant models (#7456) GuoCoder 2025-03-25 11:59:46 +08:00
  • 6b3b97c738 [misc] update liger-kernel's monkey patch (#7453) Xiaosu Zhu 2025-03-25 11:58:52 +08:00
  • 6d3748f727 [misc] enable liger kernel for gemma3 text and paligemma (#7466) AbdelKarim ELJANDOUBI 2025-03-25 02:27:43 +01:00
  • 7c890170e3 [misc] enable liger kernel for gemma3 (#7462) Kenny Lam 2025-03-24 11:09:59 +00:00
  • ca42c0c406 [assets] fix gemma3 readme (#7449) hoshi-hiyouga 2025-03-24 10:31:25 +08:00
  • 7203365b80 [trainer] fix vlm loss for transformers 4.49 (#7448) hoshi-hiyouga 2025-03-24 10:24:05 +08:00
  • 3612946dd9 [docker] upgrade to torch 2.6 (#7442) rumichi 2025-03-23 22:18:08 +09:00
  • 3aa4f32e9c [misc] fix ci (#7441) hoshi-hiyouga 2025-03-23 21:09:35 +08:00
  • 304796b803 [misc] fix license (#7440) hoshi-hiyouga 2025-03-23 19:31:56 +08:00
  • 7cfd6e4bb0 [scripts] support compute score on vllm's predictions (#7419) SnowFox4004 2025-03-23 19:21:01 +08:00
  • 05b19d6952 [deps] upgrade transformers to 4.50.0 (#7437) hoshi-hiyouga 2025-03-23 17:44:27 +08:00
  • 919415dba9 [deps] upgrade vllm to 0.8 (#7436) hoshi-hiyouga 2025-03-23 14:32:22 +08:00
  • a959c2a509 [misc] fix sglang deps (#7432) Guo, Quan 2025-03-23 14:07:10 +08:00