Insights: huggingface/transformers
Overview
14 Pull requests merged by 11 people
-
[docs] Redesign
#31757 merged
Mar 3, 2025 -
Remove unused code
#36459 merged
Mar 3, 2025 -
[Style] fix E721 warnings
#36474 merged
Mar 3, 2025 -
Fix edge case for continue_final_message
#36404 merged
Mar 3, 2025 -
Fix pipeline+peft interaction
#36480 merged
Mar 3, 2025 -
chore: fix message descriptions in arguments and comments
#36504 merged
Mar 3, 2025 -
Fix some typos in docs
#36502 merged
Mar 3, 2025 -
fix torch_dtype, contiguous, and load_state_dict regression
#36512 merged
Mar 3, 2025 -
Fix kwargs UserWarning in SamImageProcessor
#36479 merged
Mar 3, 2025 -
Check `TRUST_REMOTE_CODE` for `RealmRetriever` for security
#36511 merged
Mar 3, 2025 -
Fix loading zero3 weights
#36455 merged
Mar 3, 2025 -
Fix _load_state_dict_into_meta_model with device_map=None
#36488 merged
Mar 2, 2025 -
Fix couples of issues from #36335
#36453 merged
Mar 1, 2025 -
Add Got-OCR 2 Fast image processor and refactor slow one
#36185 merged
Mar 1, 2025
17 Pull requests opened by 15 people
-
avoid errors when the size of `input_ids` passed to `PrefixConstrainedLogitsProcessor` is zero
#36489 opened
Mar 1, 2025 -
Allow OOV Image Token for LLaVa Next Variants
#36491 opened
Mar 2, 2025 -
Make SamVisionEncoder public for better accessibility
#36493 opened
Mar 2, 2025 -
Add an event related to forward in the TrainerCallback
#36496 opened
Mar 2, 2025 -
Add support for seed in `DataCollatorForLanguageModeling`
#36497 opened
Mar 2, 2025 -
[WIP] Add EVEv2 model
#36498 opened
Mar 2, 2025 -
Export base streamer.
#36500 opened
Mar 2, 2025 -
Refactor object-detection models
#36514 opened
Mar 3, 2025 -
Update recommended reviewers
#36515 opened
Mar 3, 2025 -
Make ViTPooler configurable
#36517 opened
Mar 3, 2025 -
enable/disable compile for quants methods
#36519 opened
Mar 3, 2025 -
guard torch version for uint16
#36520 opened
Mar 3, 2025 -
Add aya
#36521 opened
Mar 3, 2025 -
[docs] Serving LLMs
#36522 opened
Mar 3, 2025 -
[docs] Fix quantization overview
#36523 opened
Mar 3, 2025 -
chore: Fix typos in docs and examples
#36524 opened
Mar 4, 2025 -
chore: enhance messages in docstrings
#36525 opened
Mar 4, 2025
18 Issues closed by 8 people
-
tokenizers.apply_chat_template with continue_final_message=True with </think> token
#36440 closed
Mar 3, 2025 -
tokenizers.apply_chat_template with `continue_final_message=True` with trailing spaces in input
#35433 closed
Mar 3, 2025 -
Confusing behavior when loading PEFT models with pipeline
#36473 closed
Mar 3, 2025 -
`_load_state_dict_into_meta_model` - `'NoneType' object has no attribute 'load_state_dict'`
#36495 closed
Mar 3, 2025 -
GRPO Reward Weight Scheduler
#36490 closed
Mar 3, 2025 -
please support register_full_backward_pre_hook and register_full_backward_hook
#36507 closed
Mar 3, 2025 -
Some Whisper beam search output (sequences_scores, etc.) is lost in _stack_split_outputs
#32373 closed
Mar 3, 2025 -
[DEV Testing] Issues with `test_modeling_common`
#35857 closed
Mar 3, 2025 -
[BUG] npu zero3: training a custom model fails with "Function SumBackward0 returned an invalid gradient at index 0"
#36387 closed
Mar 3, 2025 -
Load siglip2 error
#36475 closed
Mar 3, 2025 -
`padding_side` is of type `bool` when it should be `Literal['right', 'left']`
#36252 closed
Mar 3, 2025 -
Add Wan model into Transformers
#36494 closed
Mar 2, 2025 -
Bug introduced in `_load_state_dict_into_meta_model` and `to` `v4.49.0`..`v4.50.0.dev`
#36441 closed
Mar 1, 2025 -
`model.config.to_diff_dict()` delivers different result to `model.save_pretrained()`
#35426 closed
Mar 1, 2025 -
AttributeError: 'Config' object has no attribute '_get_non_default_generation_parameters'
#35543 closed
Mar 1, 2025 -
Prompt_ids feature causing repetitions and hallucinations
#35603 closed
Mar 1, 2025 -
convert_llama_weight_to_hf.py
#35820 closed
Mar 1, 2025
11 Issues opened by 11 people
-
GraniteMoe’s implementation is not compatible with HF’s peft
#36518 opened
Mar 3, 2025 -
Object detection tutorial uses buggy dataset, may lead to crash during training
#36516 opened
Mar 3, 2025 -
model.generate function is not compatible with custom position_ids
#36510 opened
Mar 3, 2025 -
Can not use prompt tuning inference
#36509 opened
Mar 3, 2025 -
huggingface model unsupported torch backward hook
#36508 opened
Mar 3, 2025 -
model from_pretrained bug in 4.50.dev0 in these days
#36506 opened
Mar 3, 2025 -
add a param to control cache in streamer when return output
#36505 opened
Mar 3, 2025 -
TypeError: object of type 'IterableDataset' has no len()
#36501 opened
Mar 3, 2025 -
Support Distill Depth Anything
#36499 opened
Mar 2, 2025 -
Error at scatter num_items_in_batch in ddp/dp
#36492 opened
Mar 2, 2025
69 Unresolved conversations
Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.
-
`GPT2Model` StaticCache support
#35761 commented on
Mar 3, 2025 • 6 new comments -
Fix sdpa in sam and refactor relative position embeddings
#36422 commented on
Mar 4, 2025 • 5 new comments -
Integrate SwanLab for offline/online experiment tracking and local visualization
#36433 commented on
Mar 3, 2025 • 3 new comments -
Fix Batch Size Mismatch When Using `crops_n_layers` in `mask-generation` Pipeline #35530
#35627 commented on
Mar 3, 2025 • 3 new comments -
Customize docstrings fast image processor
#36466 commented on
Mar 3, 2025 • 2 new comments -
Add FAST
#35476 commented on
Mar 2, 2025 • 1 new comment -
🚧 [WiP] Add Janus model
#36053 commented on
Mar 4, 2025 • 1 new comment -
Support QuestionAnswering Module for ModernBert based models.
#35566 commented on
Mar 3, 2025 • 1 new comment -
Fix model saving bug post training with tensor parallel in Accelerate
#36434 commented on
Mar 3, 2025 • 1 new comment -
[ModernBERT] Add CausalLM functionality to ModernBERT
#35946 commented on
Mar 3, 2025 • 0 new comments -
Add generation config validation using Pydantic
#35910 commented on
Mar 3, 2025 • 0 new comments -
Introduce modular files for speech models
#35902 commented on
Mar 3, 2025 • 0 new comments -
Add Doge model
#35891 commented on
Mar 2, 2025 • 0 new comments -
Adds GGUF support for Gemma models
#35887 commented on
Mar 3, 2025 • 0 new comments -
Add padding-free to bamba
#35861 commented on
Mar 3, 2025 • 0 new comments -
Github action for auto-assigning reviewers
#35846 commented on
Mar 3, 2025 • 0 new comments -
Add MultipleChoice & QuestionAnswering heads to ModernBERT
#35825 commented on
Mar 3, 2025 • 0 new comments -
Add StyleTTS 2
#35790 commented on
Mar 3, 2025 • 0 new comments -
Add AIMv2 to Transformers
#35550 commented on
Mar 3, 2025 • 0 new comments -
🔴 Video processors as a separate class
#35206 commented on
Mar 3, 2025 • 0 new comments -
Add Phi-3.5-vision
#36036 commented on
Mar 3, 2025 • 0 new comments -
Proper performant flex attention implementation
#36103 commented on
Mar 3, 2025 • 0 new comments -
add DeepSpeed tensor parallel initialization.
#36114 commented on
Mar 3, 2025 • 0 new comments -
Try working around the processor registration bugs
#36184 commented on
Mar 3, 2025 • 0 new comments -
Add evolla rebase main
#36232 commented on
Mar 4, 2025 • 0 new comments -
Add support for DeepseekAI's DeepseekVL
#36248 commented on
Mar 2, 2025 • 0 new comments -
[docs] Update README
#36265 commented on
Mar 3, 2025 • 0 new comments -
Remove remote code warning
#36285 commented on
Mar 3, 2025 • 0 new comments -
Fix ONNX export for sequence classification head
#36332 commented on
Mar 3, 2025 • 0 new comments -
Handle DAC conversion when using weight_norm with newer PyTorch versions
#36393 commented on
Mar 2, 2025 • 0 new comments -
HPU support
#36424 commented on
Mar 3, 2025 • 0 new comments -
Avoid error when using `torch.compile` and `DataCollatorWithFlattening`
#36450 commented on
Mar 2, 2025 • 0 new comments -
Fix fp16 ONNX export for RT-DETR and RT-DETRv2
#36460 commented on
Mar 3, 2025 • 0 new comments -
Fix incorrect attention mask truncate in WhisperFlashAttention2
#36477 commented on
Mar 3, 2025 • 0 new comments -
Sanitize Model Module Names to Follow Python Conventions
#36478 commented on
Mar 3, 2025 • 0 new comments -
Mask2FormerImageProcessor support overlapping features
#35536 commented on
Mar 3, 2025 • 0 new comments -
model.parameters() return [Parameter containing: tensor([], device='cuda:0', dtype=torch.bfloat16, requires_grad=True)] when using zero3
#35994 commented on
Mar 3, 2025 • 0 new comments -
Qwen2VLForConditionalGeneration doesn't work with MPS devices
#36413 commented on
Mar 3, 2025 • 0 new comments -
Support H100 training with FP8 in Trainer and Deepspeed
#25333 commented on
Mar 2, 2025 • 0 new comments -
Support SDPA & Flash Attention 2 for LayoutLMv3
#35467 commented on
Mar 2, 2025 • 0 new comments -
LayerDrop broken in various Flax models (Whisper/BART/more...)
#35468 commented on
Mar 2, 2025 • 0 new comments -
Memory Access out of bounds in mra/cuda_kernel.cu::index_max_cuda_kernel()
#35507 commented on
Mar 2, 2025 • 0 new comments -
Very slow to load deep seekv3 int4 model and device_map="auto" "sequential" bug
#35522 commented on
Mar 2, 2025 • 0 new comments -
adalomo and deepspeed zero3 offload error
#35977 commented on
Mar 2, 2025 • 0 new comments -
llama code break with torch compile
#36484 commented on
Mar 2, 2025 • 0 new comments -
Speed up image processors - cast to array before BatchFeature
#31205 commented on
Mar 2, 2025 • 0 new comments -
A warning message showing that `MultiScaleDeformableAttention.so` is not found in `/root/.cache/torch_extensions` if `ninja` is installed with `transformers`
#35349 commented on
Mar 2, 2025 • 0 new comments -
[Whisper] TypeError: '<=' not supported between instances of 'NoneType' and 'float'
#33552 commented on
Mar 1, 2025 • 0 new comments -
Saving model with shared tensors fails on cpu but succeeds on gpu
#33688 commented on
Mar 1, 2025 • 0 new comments -
Tokenizer does not split text according to newly added input tokens
#35447 commented on
Mar 1, 2025 • 0 new comments -
Can't use Trainer on mps device
#35954 commented on
Mar 1, 2025 • 0 new comments -
DeepSeek V3 Support
#35425 commented on
Mar 1, 2025 • 0 new comments -
Samhq model addition
#35147 commented on
Mar 3, 2025 • 0 new comments -
Add TimesFM Time Series Forecasting Model
#34082 commented on
Mar 3, 2025 • 0 new comments -
TypeError: CustomTrainer.compute_loss() got an unexpected keyword argument 'num_items_in_batch'
#36331 commented on
Mar 4, 2025 • 0 new comments -
Groq inference provider
#36353 commented on
Mar 3, 2025 • 0 new comments -
Dtensor support requires torch>=2.5.1
#36472 commented on
Mar 3, 2025 • 0 new comments -
Gemma2 (quantized) inference is broken - torch._dynamo.exc.UserError: Dynamic control flow is not supported at the moment.
#36485 commented on
Mar 3, 2025 • 0 new comments -
The output tensor's data type is not torch.long when the input text is empty.
#36277 commented on
Mar 3, 2025 • 0 new comments -
Possible bug when using cosine lr scheduler with gradient accumulation
#35484 commented on
Mar 3, 2025 • 0 new comments -
Accidentally allocating 2x memory in new caching_allocator_warmup
#36483 commented on
Mar 3, 2025 • 0 new comments -
Misleading documentation for `is_decoder` configuration parameter
#36482 commented on
Mar 3, 2025 • 0 new comments -
Add type checking to CI
#36481 commented on
Mar 3, 2025 • 0 new comments -
ViTPose tutorial fails
#36454 commented on
Mar 3, 2025 • 0 new comments -
SAM mask-generation - crops_n_layers
#35530 commented on
Mar 3, 2025 • 0 new comments -
Enhance the memory efficiency of loading large models (400B) to prevent out-of-memory errors when using tensor parallelism.
#36467 commented on
Mar 3, 2025 • 0 new comments -
Add EVEv2 : an Encoder-free VLM
#36379 commented on
Mar 3, 2025 • 0 new comments -
Inference with FSDP during training affects checkpoints
#34530 commented on
Mar 3, 2025 • 0 new comments -
Recomputed tensor size does not match when using activation checkpointing when using FSDP and accelerate
#34928 commented on
Mar 3, 2025 • 0 new comments