tokenizer.json modified after tokenizer.save_pretrained of OLMO models #34744

### System Info

- `transformers` version: 4.45.0
- Platform: Linux-6.8.0-48-generic-x86_64-with-glibc2.39
- Python version: 3.10.15
- Huggingface_hub version: 0.26.2
- Safetensors version: 0.4.5
- Accelerate version: 1.0.1
- Accelerate config:    not found
- PyTorch version (GPU?): 2.4.0+rocm6.1 (True)
- Tensorflow version (GPU?): not installed (NA)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
- Using distributed or parallel set-up in script?: <fill in>
- Using GPU in script?: <fill in>
- GPU type: AMD Instinct MI250X/MI250

### Who can help?

@ArthurZucker and @itazap

### Information

- [ ] The official example scripts
- [X] My own modified scripts

### Tasks

- [ ] An officially supported task in the `examples` folder (such as GLUE/SQuAD, ...)
- [X] My own task or dataset (give details below)

### Reproduction

When I load the tokenizer of an OLMo model and save it again, the saved `tokenizer.json` differs from the original, particularly under the `merges` key.

![image](https://github.com/user-attachments/assets/ad6e82c6-89bc-402a-b956-b106c4fd74de)

Code to reproduce:

```python
from transformers import AutoTokenizer

# Load the tokenizer from the Hub, then write it back out unchanged.
tokenizer = AutoTokenizer.from_pretrained("allenai/OLMo-1B-0724-hf")
tokenizer.save_pretrained("saved_tokenizer")
```
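To narrow down where the two files diverge, a small comparison helper may be useful. The sketch below is not part of the original report: the function names are hypothetical, and it assumes (unconfirmed here) that the difference could be purely a serialization change, with each merge stored either as a `"left right"` string or as a `[left, right]` pair. If the merges turn out to be equivalent after normalization, the change is in formatting only and not in the tokenizer's behavior.

```python
import json
from pathlib import Path


def load_json(path):
    """Load a tokenizer.json file into a dict."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def normalize_merges(merges):
    """Return merges as a list of (left, right) tuples.

    Assumption: each entry is either a "left right" string or a
    two-element [left, right] list.
    """
    normalized = []
    for entry in merges:
        if isinstance(entry, str):
            left, right = entry.split(" ", 1)
        else:
            left, right = entry
        normalized.append((left, right))
    return normalized


def differing_keys(original, saved, prefix=""):
    """Recursively list dotted key paths whose values differ."""
    if isinstance(original, dict) and isinstance(saved, dict):
        paths = []
        for key in sorted(set(original) | set(saved)):
            path = f"{prefix}.{key}" if prefix else key
            if key not in original or key not in saved:
                # Key exists in only one of the two files.
                paths.append(path)
            else:
                paths.extend(differing_keys(original[key], saved[key], path))
        return paths
    return [] if original == saved else [prefix]


def merges_equivalent(original, saved):
    """Check whether both files describe the same merges, ignoring format."""
    return normalize_merges(original["model"]["merges"]) == normalize_merges(
        saved["model"]["merges"]
    )


# Example usage (both paths are placeholders for local copies of the files):
# original = load_json("original/tokenizer.json")
# saved = load_json("saved_tokenizer/tokenizer.json")
# print(differing_keys(original, saved))
# print(merges_equivalent(original, saved))
```

`differing_keys` reports only the key paths that changed, which keeps the output readable even though the `merges` list itself is very long.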

### Expected behavior

The original `tokenizer.json` and the saved `tokenizer.json` should be the same.
