[inference] support sglang backend (#7278)

* Mimic SGLang offline Engine

* Add more tests and args

* Pass all current tests

* Clean Code

* fix sample_params

* clean code

* Fix Stream Chat

* change sglang from engine mode to server mode

* fix

* Fix Review Issues

* Use SGLang Built-In Utilities

* Fix test SGLang

* Fix some doc issues

* fix sglang engine

* add readme

---------

Co-authored-by: Jin Pan <jpan236@wisc.edu>
Co-authored-by: hiyouga <hiyouga@buaa.edu.cn>
Qiaolin Yu
2025-03-14 16:37:58 -04:00
committed by GitHub
parent 93e6184cbe
commit a44a53ebec
15 changed files with 433 additions and 27 deletions

@@ -54,6 +54,7 @@ extra_require = {
     "awq": ["autoawq"],
     "aqlm": ["aqlm[gpu]>=1.1.0"],
     "vllm": ["vllm>=0.4.3,<=0.7.3"],
+    "sglang": ["sglang>=0.4.4"],
     "galore": ["galore-torch"],
     "apollo": ["apollo-torch"],
     "badam": ["badam>=1.2.1"],
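
The hunk above adds an optional `sglang` extra with a minimum version of 0.4.4. As a rough illustration of what that constraint means at runtime, the sketch below checks whether an installed `sglang` distribution satisfies it. The helper names (`parse_version`, `meets_minimum`, `sglang_available`) are hypothetical and not part of this commit. The version parsing is a deliberate simplification that only understands plain dotted release numbers, so it is not a full PEP 440 implementation.

```python
from importlib import metadata


def parse_version(version: str) -> tuple[int, ...]:
    """Parse the leading numeric components of a dotted version string.

    Simplified on purpose: parsing stops at the first component that is not
    purely numeric, so "0.4.4.post1" becomes (0, 4, 4).
    """
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def meets_minimum(installed: str, minimum: str = "0.4.4") -> bool:
    """Return True if `installed` is at least `minimum` (release numbers only)."""
    return parse_version(installed) >= parse_version(minimum)


def sglang_available(minimum: str = "0.4.4") -> bool:
    """Hypothetical helper: True if an sglang distribution is installed and new enough."""
    try:
        installed = metadata.version("sglang")
    except metadata.PackageNotFoundError:
        # The optional extra was not installed.
        return False
    return meets_minimum(installed, minimum)


if __name__ == "__main__":
    print("sglang backend usable:", sglang_available())
```

A check like this could guard an optional import so that users who did not install the extra get a clear message instead of an `ImportError`. For strict PEP 440 semantics (pre-releases, epochs, local versions), a dedicated version-parsing library would be the safer choice.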