Pull requests: vllm-project/vllm
[Core][Bugfix] Use correct device to initialize GPU data during CUDA-graph-capture
#10608
opened Nov 24, 2024 by
IdoAsraff
[Bug]: Authorization ignored when root_path is set
frontend
#10606
opened Nov 24, 2024 by
chaunceyjiang
[Kernel] Remove hard-dependencies of Speculative decode to CUDA workers
#10587
opened Nov 23, 2024 by
xuechendi
[Kernel] Tuning fused moe for qwen2-57b in GTX 4090 (tp4pp2)
#10586
opened Nov 23, 2024 by
BBuf
[ Kernels ] [ AMD ] Add Fused MoE Configs
#10574
opened Nov 22, 2024 by
robertgshaw2-neuralmagic
Draft
[Hardware][Intel-Gaudi] Enable LoRA support for Intel Gaudi (HPU)
#10565
opened Nov 22, 2024 by
SanjuCSudhakaran
[Model] Added GLM-4 series hf format model support vllm==0.6.4
#10561
opened Nov 22, 2024 by
sixsixcoder
[Benchmark] Benchmark structured output with datasets
#10557
opened Nov 22, 2024 by
xuechendi
[Docs] Add dedicated tool calling page to docs
documentation
Improvements or additions to documentation
#10554
opened Nov 21, 2024 by
mgoin
[Misc] Enable vLLM to Dynamically Load LoRA from a Remote Server
frontend
#10546
opened Nov 21, 2024 by
angkywilliam
[Model]: Add support for Aria model
documentation
Improvements or additions to documentation
#10514
opened Nov 21, 2024 by
xffxff
[core] overhaul memory profiling and fix backward compatibility
needs-rebase
#10511
opened Nov 21, 2024 by
youkaichao
[Model] Add OLMo November 2024 model
documentation
Improvements or additions to documentation
#10503
opened Nov 20, 2024 by
2015aroras
[Core] Implement disagg prefill by StatelessProcessGroup
ci/build
ready
ONLY add when PR is ready to merge/full CI is needed
#10502
opened Nov 20, 2024 by
KuntaiDu