Pull requests: vllm-project/vllm
Pull requests list
[ROCm][Perf] Tune fused_moe and add int4 w4a16 config for amd
rocm
Related to AMD ROCm
#31328
opened Dec 25, 2025 by
yuttian1
[ROCm] Migrate xgrammar to upstream release in rocm-test.txt
ci/build
rocm
Related to AMD ROCm
#31327
opened Dec 25, 2025 by
AndreasKaratzas
[Feat] allow inplace loading lora
frontend
#31326
opened Dec 25, 2025 by
Jackmin801
Draft
1 of 5 tasks
[CI] Fix flaky vision beam search test with flexible semantic validation
#31324
opened Dec 24, 2025 by
AndreasKaratzas
[ROCm][CI] Add TorchCodec source build for transcription tests
ci/build
rocm
Related to AMD ROCm
#31323
opened Dec 24, 2025 by
AndreasKaratzas
Support LoRA for PLaMo 2/3
documentation
Improvements or additions to documentation
#31322
opened Dec 24, 2025 by
Alnusjaponica
4 of 5 tasks
[MoE Refactor] AITER Mixtral Fix
rocm
Related to AMD ROCm
#31321
opened Dec 24, 2025 by
robertgshaw2-redhat
5 tasks
[Code Quality] Add missing return type annotations to misc modules
multi-modality
Related to multi-modality (#4194)
#31320
opened Dec 24, 2025 by
yurekami
[Code Quality] Add missing return type annotations to core modules
#31318
opened Dec 24, 2025 by
yurekami
[Bugfix] Suppress UserWarning for non-writable buffer in binary2tensor
#31314
opened Dec 24, 2025 by
yurekami
fix(ray): correct misleading warning message for multi-node clusters
v1
#31301
opened Dec 24, 2025 by
yurekami
[Bugfix][Hardware][AMD] Use dynamic WARP_SIZE in sampler vectorized_process
rocm
Related to AMD ROCm
#31295
opened Dec 24, 2025 by
c0de128
2 tasks
[Bugfix][Hardware][AMD] Fix uninitialized Qlocal registers in ROCm attention kernel
rocm
Related to AMD ROCm
#31293
opened Dec 24, 2025 by
c0de128
fix(config): validate skip_tokenizer_init is not used with multimodal models
ready
ONLY add when PR is ready to merge/full CI is needed
#31291
opened Dec 24, 2025 by
yurekami
3 tasks
fix: handle None tokenizer in multimodal processor initialization
multi-modality
Related to multi-modality (#4194)
#31290
opened Dec 24, 2025 by
yurekami
2 tasks
fix(spec_decode): sync ngram draft tokens across TP ranks
speculative-decoding
v1
#31288
opened Dec 24, 2025 by
yurekami
3 tasks
[Bugfix] Preserve original tokenizer class name for transformers compatibility
#31287
opened Dec 24, 2025 by
yurekami
1 of 2 tasks
fix(rocm): add early return in get_flash_attn_version for ROCm
ready
ONLY add when PR is ready to merge/full CI is needed
rocm
Related to AMD ROCm
#31286
opened Dec 24, 2025 by
rabi
[Doc] Add GPT-OSS (openai) tool parser documentation
documentation
Improvements or additions to documentation
gpt-oss
Related to GPT-OSS models
tool-calling
#31284
opened Dec 24, 2025 by
yurekami
1 of 2 tasks
[Feature] Make EngineCore shutdown timeout configurable via environment variable
v1
#31283
opened Dec 24, 2025 by
yurekami
1 of 3 tasks
[Bugfix][Hardware][AMD] Fix last_page_len calculation in AITER MLA decode
rocm
Related to AMD ROCm
v1
#31282
opened Dec 24, 2025 by
c0de128
2 tasks
[Bugfix] Disable FlashInfer MoE in batch invariant mode for determinism
#31279
opened Dec 24, 2025 by
yurekami
1 of 3 tasks
Support ViT SP parallelism in the encode section of qwen2.5vl/qwen3vl
qwen
Related to Qwen models
#31277
opened Dec 24, 2025 by
ninjazwen
5 tasks