Pulse · huggingface/transformers

December 4, 2024 – December 11, 2024

102 Unresolved conversations

Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.

Add LightGlue model
#31718 commented on Dec 9, 2024 • 54 new comments
Efficient Inference Kernel for SpQR
#34976 commented on Dec 11, 2024 • 23 new comments
[Whisper] 🚨 Fix whisper decoding 🚨
#34135 commented on Dec 11, 2024 • 18 new comments
Enhanced Installation Section in README.md
#35094 commented on Dec 5, 2024 • 12 new comments
Add Zamba2
#34517 commented on Dec 7, 2024 • 12 new comments
[WIP] Refactoring of ImageProcessorFast
#35069 commented on Dec 11, 2024 • 12 new comments
Run model as compressed/uncompressed mode
#34719 commented on Dec 11, 2024 • 10 new comments
HIGGS Quantization Support
#34997 commented on Dec 9, 2024 • 10 new comments
[GGUF] Refactor and decouple gguf checkpoint loading logic
#34385 commented on Dec 10, 2024 • 8 new comments
Add dithering to the `Speech2TextFeatureExtractor` API.
#34638 commented on Dec 10, 2024 • 6 new comments
Universal Speculative Decoding `CandidateGenerator`
#35029 commented on Dec 5, 2024 • 5 new comments
Aggregate test summary files in CircleCI workflow runs
#34989 commented on Dec 10, 2024 • 4 new comments
add bnb support for Ascend NPU
#31512 commented on Dec 11, 2024 • 4 new comments
Add diffllama
#34083 commented on Dec 10, 2024 • 4 new comments
Fix : Falcon processor doesn't account for a layout difference of qkv between transformers and GGUF
#35088 commented on Dec 10, 2024 • 3 new comments
Enable different torch dtype in sub models
#34873 commented on Dec 10, 2024 • 3 new comments
Fix case of nested tensors in BatchMixFeature
#35063 commented on Dec 9, 2024 • 2 new comments
[whisper] added dropping of attention weights after DTW calculations related to word timestamps if these weights are not requested in the output
#33732 commented on Dec 9, 2024 • 2 new comments
Add ColPali to 🤗 transformers
#33736 commented on Dec 10, 2024 • 2 new comments
Add support for Apple's Depth-Pro
#34583 commented on Dec 6, 2024 • 2 new comments
Output dicts support in text generation pipeline
#35092 commented on Dec 7, 2024 • 2 new comments
Add TextNet
#34979 commented on Dec 8, 2024 • 1 new comment
enable StaticCache for assisted generation
#34797 commented on Dec 11, 2024 • 0 new comments
FEAT : Adding VPTQ quantization method to HFQuantizer
#34770 commented on Dec 6, 2024 • 0 new comments
Update config validation
#34726 commented on Dec 9, 2024 • 0 new comments
Add GOT-OCR 2.0 to Transformers
#34721 commented on Dec 5, 2024 • 0 new comments
Add TimesFM Time Series Forecasting Model
#34082 commented on Dec 10, 2024 • 0 new comments
change bnb tests
#34713 commented on Dec 9, 2024 • 0 new comments
Past Keys Output now working with output router logits
#34707 commented on Dec 6, 2024 • 0 new comments
VLMs: major clean up 🧼
#34502 commented on Dec 10, 2024 • 0 new comments
LLaVA-NeXT: add new model checkpoints
#34195 commented on Dec 11, 2024 • 0 new comments
Modular phi
#34361 commented on Dec 11, 2024 • 0 new comments
[FEAT] Compatibility with dduf format from diffusers
#35093 commented on Dec 11, 2024 • 0 new comments
[Clean-up] Planned removal of the `max_size` argument
#35090 commented on Dec 6, 2024 • 0 new comments
Fix : model used to test ggml conversion of Falcon-7b is incorrect
#35083 commented on Dec 10, 2024 • 0 new comments
[setup] migrate setup script to `pyproject.toml` (reland #22539)
#35077 commented on Dec 6, 2024 • 0 new comments
Use AMD CI workflow defined in hf-workflows
#35058 commented on Dec 6, 2024 • 0 new comments
Add: num_additional_image_tokens to models
#35052 commented on Dec 9, 2024 • 0 new comments
Enable gptqmodel
#35012 commented on Dec 10, 2024 • 0 new comments
switch from `training_args.bin` `training_args.json`
#35010 commented on Dec 12, 2024 • 0 new comments
Refactoring `AssistedCandidateGenerator` for Improved Modularity and Reusability
#35009 commented on Dec 10, 2024 • 0 new comments
Make `test_generate_with_static_cache` even less flaky
#34995 commented on Dec 10, 2024 • 0 new comments
Deprecate _is_quantized_training_enabled
#34991 commented on Dec 10, 2024 • 0 new comments
[ `Core`] Refactor modeling code
#34987 commented on Dec 11, 2024 • 0 new comments
Add the Bamba Model
#34982 commented on Dec 11, 2024 • 0 new comments
[`ESM`] Add support for sdpa.
#34954 commented on Dec 9, 2024 • 0 new comments
Add sdpa for Beit
#34941 commented on Dec 11, 2024 • 0 new comments
Implement AsyncTextIteratorStreamer for asynchronous streaming
#34931 commented on Dec 9, 2024 • 0 new comments
[tests] fix "Tester object has no attribute '_testMethodName'"
#34910 commented on Dec 10, 2024 • 0 new comments
Add Relation DETR
#34900 commented on Dec 9, 2024 • 0 new comments
[WIP] Add flex attention for gpt2
#34861 commented on Dec 6, 2024 • 0 new comments
Add Flex Attention for Mistral along with refactoring
#34845 commented on Dec 5, 2024 • 0 new comments
[`GPTQ`, `CompressedTensors`] Fix unsafe imports and metadata check
#34815 commented on Dec 8, 2024 • 0 new comments
Adding support for OpenLMForCausalLM from DataComp
#34081 commented on Dec 5, 2024 • 0 new comments
The dot in the model name when using auto_map will cause a path parsing error.
#35082 commented on Dec 10, 2024 • 0 new comments
Add Flax diverse group search
#25355 commented on Dec 9, 2024 • 0 new comments
Resuming from checkpoint runs into OOM
#30822 commented on Dec 9, 2024 • 0 new comments
NaN model parameter found in meta-llama/Llama-3.2-11B-Vision under 4.46.1 version
#34602 commented on Dec 9, 2024 • 0 new comments
Trying to train a model using automatic1111. Error - Exception training model: 'module 'transformers.integrations' has no attribute 'deepspeed''.
#34427 commented on Dec 9, 2024 • 0 new comments
`dataloader_persistent_workers=True` causes fork-bomb due to repeated creation of `eval_dataloader`
#28469 commented on Dec 9, 2024 • 0 new comments
bus error on version 4.43.0 with pretrained community CLIP model - MacOS
#33357 commented on Dec 8, 2024 • 0 new comments
Accelerate x Trainer issue tracker:
#33345 commented on Dec 8, 2024 • 0 new comments
Passing nn.Parameter values within the model architecture as deep copies.
#34643 commented on Dec 8, 2024 • 0 new comments
ValueError: Architecture deepseek2 not supported
#34335 commented on Dec 7, 2024 • 0 new comments
Does per_device_train_batch_size have a loss error similar to that of GA?
#34579 commented on Dec 7, 2024 • 0 new comments
Enhancing Hugging Face Models with Tensor Parallelism for Large-Scale Model Support 🚀
#32470 commented on Dec 7, 2024 • 0 new comments
xpu device is not used running pipeline(device_map="auto")
#31922 commented on Dec 6, 2024 • 0 new comments
Is there a way to find the earliest version of transformers that has a certain model?
#35097 commented on Dec 6, 2024 • 0 new comments
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cuda:1! (when checking argument for argument mat1 in method wrapper_CUDA_addmm)
#34695 commented on Dec 6, 2024 • 0 new comments
How to specific customized force_token_ids in whisper
#34107 commented on Dec 6, 2024 • 0 new comments
Unexpected output of _flash_attention_forward() for cross attention
#35032 commented on Dec 6, 2024 • 0 new comments
Duplicate ZeRo 3 Global Step Checkpoint Saves
#34534 commented on Dec 6, 2024 • 0 new comments
ValueError: You are trying to save a non-contiguous tensor in MT5 fine-tuning
#34623 commented on Dec 6, 2024 • 0 new comments
Make it possible to save and evaluate checkpoint on CTRL+C / `KeyboardInterrupt` with Hugging Face Trainer
#35033 commented on Dec 6, 2024 • 0 new comments
The same situation as #31377 occurred when using Qwen/Qwen2-VL-7B-Instruct
#33399 commented on Dec 5, 2024 • 0 new comments
When extending embeddings, multivariate distribution isn't correctly estimated even when the calculated sigma matrix is symmetric and positive definite
#35075 commented on Dec 5, 2024 • 0 new comments
Documentation for SWAG contradicts itself when constructing the first sentence.
#35095 commented on Dec 5, 2024 • 0 new comments
AssertionError for Pytorch PiPPy example
#34600 commented on Dec 5, 2024 • 0 new comments
Add Molmo (7B-D, 7B-O, 70B)
#33962 commented on Dec 11, 2024 • 0 new comments
#33512 handle last element out of range error
#33625 commented on Dec 9, 2024 • 0 new comments
[WIP] - Enable speculative decoding with batch size >1
#32189 commented on Dec 11, 2024 • 0 new comments
[docs] Redesign
#31757 commented on Dec 11, 2024 • 0 new comments
Support Kosmos-2.5
#31711 commented on Dec 6, 2024 • 0 new comments
Implement SuperGlue model
#29886 commented on Dec 6, 2024 • 0 new comments
Fix model code to accurately convert fairseq wav2vec2 model
#28250 commented on Dec 12, 2024 • 0 new comments
Add Model Support for xLSTM
#27011 commented on Dec 11, 2024 • 0 new comments
Confusing error message
#34658 commented on Dec 11, 2024 • 0 new comments
AutoModelForDepthEstimation/DepthAnythingDepthEstimationHead unexpected behavior in JIT
#34679 commented on Dec 11, 2024 • 0 new comments
Discrepancy in Training Loss Behavior with Gradient Accumulation using DeepSpeed
#34694 commented on Dec 11, 2024 • 0 new comments
trainer resume from checkpoint，the learning rate is not the same as retraining,learning rate is discontinuous
#34053 commented on Dec 11, 2024 • 0 new comments
[i18n-ar] Translating docs to Arabic (العربية)
#32435 commented on Dec 10, 2024 • 0 new comments
Verify interpolation of image processors
#28180 commented on Dec 10, 2024 • 0 new comments
Bug in running facebook/wav2vec2-xlsr-53-espeak-cv-ft
#35064 commented on Dec 10, 2024 • 0 new comments
rework `test_multi_gpu_data_parallel_forward`
#31087 commented on Dec 10, 2024 • 0 new comments
Padding error when using Universal Assisted Generation with ASR pipeline
#34639 commented on Dec 10, 2024 • 0 new comments
Silent failure in generation parameters
#33690 commented on Dec 10, 2024 • 0 new comments
BarkProcessor voice_preset doesn't work
#34634 commented on Dec 10, 2024 • 0 new comments
about gradient accumulation
#34648 commented on Dec 10, 2024 • 0 new comments
Neftune computation is probably wrong with packed training
#34659 commented on Dec 10, 2024 • 0 new comments
FlaxWhisperForConditionalGeneration Out Of Memory Error
#34668 commented on Dec 10, 2024 • 0 new comments
Vision Encoder-Decoder fails with LLaMA decoder due to missing cross-attention implementation
#34674 commented on Dec 10, 2024 • 0 new comments
tokenizer.json modified after tokenizer.save_pretrained of OLMO models
#34744 commented on Dec 10, 2024 • 0 new comments

Overview

1 Release published by 1 person

49 Pull requests merged by 36 people

48 Pull requests opened by 30 people

39 Issues closed by 18 people

32 Issues opened by 27 people
