This is the official benchmark platform accompanying the paper CL-MASR: A Continual Learning Benchmark for Multilingual ASR.
It includes scripts to train Whisper and WavLM-based ASR systems on a subset of 20 languages selected from Common Voice 13 in a continual learning fashion using a handful of methods including rehearsal-based, architecture-based, and regularization-based approaches.
The goal is to continually learn new languages while limiting forgetting of the previously learned ones. An ideal method should achieve both positive forward transfer (i.e. improve performance on new tasks by leveraging shared knowledge from previous tasks) and positive backward transfer (i.e. improve performance on previous tasks by leveraging shared knowledge from new tasks).
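To make the transfer notions concrete, the snippet below computes backward and forward transfer from a matrix of per-language scores, following the standard continual-learning definitions (the benchmark's exact metrics are WER-based and may differ in sign conventions); all numbers and the `ref` baseline here are hypothetical:

```python
# R[i][j] = performance on language j after learning languages 0..i
# (higher is better; hypothetical numbers for a 3-language sequence)
R = [
    [80.0, None, None],
    [75.0, 82.0, None],
    [70.0, 78.0, 85.0],
]
T = len(R)

# Backward transfer: how learning later languages changed earlier ones
# (negative values indicate forgetting)
bwt = sum(R[T - 1][j] - R[j][j] for j in range(T - 1)) / (T - 1)

# Forward transfer: performance right after learning each language,
# relative to a per-language reference (hypothetical joint-training scores)
ref = [82.0, 83.0, 85.0]
fwt = sum(R[j][j] - ref[j] for j in range(T)) / T

print(bwt, fwt)  # -7.0 -1.0
```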
The following algorithms have been implemented so far:
- Rehearsal-based
  - Experience Replay (ER)
  - Averaged Gradient Episodic Memory (A-GEM)
  - Dark Experience Replay (DER) (task-incremental variant)
- Architecture-based
  - Progressive Neural Networks (PNN)
  - Piggyback (PB)
  - Learning to Prompt (L2P) (task-aware variant)
- Regularization-based
  - Elastic Weight Consolidation (EWC) (online variant)
  - Learning without Forgetting (LwF) (online variant)
  - Memory Aware Synapses (MAS)
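As a concrete illustration of the rehearsal-based family, the sketch below shows the core idea of experience replay: keep a fixed-size buffer of past utterances, filled via reservoir sampling, and mix a few of them into every batch of the new language. This is a schematic sketch, not the benchmark's implementation; `ReplayBuffer` and the placeholder examples are hypothetical.

```python
import random

class ReplayBuffer:
    """Fixed-size buffer using reservoir sampling over a stream of examples."""

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.data = []
        self.seen = 0
        self.rng = random.Random(seed)

    def add(self, example):
        self.seen += 1
        if len(self.data) < self.capacity:
            self.data.append(example)
        else:
            # Keep each seen example with probability capacity / seen
            idx = self.rng.randrange(self.seen)
            if idx < self.capacity:
                self.data[idx] = example

    def sample(self, k):
        return self.rng.sample(self.data, min(k, len(self.data)))

# Usage: interleave replayed examples with each batch of the new language
buffer = ReplayBuffer(capacity=100)
for step in range(1000):
    new_example = ("utterance", step)  # placeholder for (audio, transcript)
    batch = [new_example] + buffer.sample(3)
    # ... train on `batch` ...
    buffer.add(new_example)
```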
## ⚡ Dataset [download]
The dataset used for the CL-MASR benchmark is extracted from Common Voice 13 (see reference paper). Each of the 20 languages in the dataset includes approximately 10 hours of training material, with an additional 1 hour for validation and 1 hour for testing.
Download the dataset from here and extract it to a data folder of your choice (CL-MASR by default).
To set up the benchmark, clone the benchmark repository and install SpeechBrain:

```shell
git clone https://github.com/speechbrain/benchmarks.git
cd benchmarks
git submodule update --init --recursive
cd speechbrain
pip install -r requirements.txt
pip install -e .
```

Navigate to `<path-to-repository>/benchmarks/CL_MASR/<model>`, open a terminal and run:
```shell
python train_<cl-method>.py hparams/train_<cl-method>.yaml --data_folder <path-to-data-folder>
```

NOTE: in order to reproduce the experiments with WavLM large, you need to download the checkpoint pretrained on the base languages from here.
NOTE: to profile the model (optional), install ptflops and torchinfo as additional dependencies.
NOTE: multi-GPU training is currently not supported.
Navigate to `<path-to-repository>/benchmarks/CL_MASR`, open a terminal and run:
```shell
python analyze_logs.py <path-to-folder-containing-model-logs>
```

This command will recursively retrieve and analyze all log files named according to the format `<cl-method>_base=<comma-separated-base-locales>_new=<comma-separated-new-locales>.txt` (this is the default naming convention followed in all the training scripts).
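For reference, the naming convention can be matched with a small regex. This is a hypothetical helper written against the format stated above (`parse_log_name` and the locale codes are illustrative, not part of the benchmark):

```python
import re

# Matches <cl-method>_base=<comma-separated-base-locales>_new=<comma-separated-new-locales>.txt
LOG_PATTERN = re.compile(
    r"^(?P<method>.+?)_base=(?P<base>[^_]+)_new=(?P<new>[^.]+)\.txt$"
)

def parse_log_name(filename):
    """Return (cl_method, base_locales, new_locales), or None if no match."""
    match = LOG_PATTERN.match(filename)
    if match is None:
        return None
    return (
        match["method"],
        match["base"].split(","),
        match["new"].split(","),
    )

print(parse_log_name("er_base=en,zh-CN,de_new=rw,eo,kab.txt"))
# ('er', ['en', 'zh-CN', 'de'], ['rw', 'eo', 'kab'])
```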
You can find the resulting performance metric summaries and plots in <path-to-folder-containing-model-logs>.
See the help (`python analyze_logs.py -h`) for advanced configuration options.
NOTE: make sure to specify the `--im_refs` and `--fwt_refs` arguments corresponding to the given model (they default to Whisper large-v2).
NOTE: to plot the results (optional), install matplotlib and/or plotly as additional dependencies.
| Release | Hyperparameters | Average AWER | Average BWT | Average IM | Average FWT | Logs | GPUs |
|---|---|---|---|---|---|---|---|
| 07-06-23 | whisper/hparams/train_ft.yaml | 98.50 | -84.58 | -4.16 | -0.83 | Link | 1xV100 32GB |
| 07-06-23 | whisper/hparams/train_er.yaml | 50.83 | -13.20 | -0.81 | -4.17 | Link | 1xV100 32GB |
| 07-06-23 | whisper/hparams/train_agem.yaml | 81.08 | -55.85 | 0.20 | -5.19 | Link | 1xV100 32GB |
| 01-10-23 | whisper/hparams/train_der.yaml | 67.84 | -41.28 | -4.29 | - | Not available | 1xV100 32GB |
| 07-06-23 | whisper/hparams/train_pnn.yaml | 44.12 | 0.00 | 3.18 | -8.16 | Link | 1xV100 32GB |
| 07-06-23 | whisper/hparams/train_pb.yaml | 43.95 | 0.00 | 3.51 | -8.50 | Link | 1xV100 32GB |
| 01-10-23 | whisper/hparams/train_l2p.yaml | 114.65 | 0.00 | 110.50 | - | Not available | 1xV100 32GB |
| 07-06-23 | whisper/hparams/train_ewc.yaml | 98.04 | -68.32 | 2.87 | -7.85 | Link | 1xV100 32GB |
| 07-06-23 | whisper/hparams/train_lwf.yaml | 95.76 | -77.50 | 0.00 | -4.98 | Link | 1xV100 32GB |
| 01-10-23 | whisper/hparams/train_mas.yaml | 68.08 | -0.58 | 38.62 | - | Not available | 1xV100 32GB |
| 07-06-23 | wavlm/hparams/train_ft.yaml | 91.61 | -54.67 | -10.19 | -0.21 | Link | 1xV100 32GB |
| 07-06-23 | wavlm/hparams/train_er.yaml | 60.79 | -8.96 | -7.62 | -2.77 | Link | 1xV100 32GB |
| 07-06-23 | wavlm/hparams/train_agem.yaml | 72.54 | 13.59 | 35.29 | -45.69 | Link | 1xV100 32GB |
| 01-10-23 | wavlm/hparams/train_der.yaml | 71.22 | -16.64 | -3.21 | - | Not available | 1xV100 32GB |
| 07-06-23 | wavlm/hparams/train_pnn.yaml | 66.07 | 0.00 | 12.95 | -23.34 | Link | 1xV100 32GB |
| 07-06-23 | wavlm/hparams/train_pb.yaml | 61.87 | 0.00 | 2.75 | -13.15 | Link | 1xV100 32GB |
| 01-10-23 | wavlm/hparams/train_l2p.yaml | 92.72 | 0.00 | 52.11 | - | Not available | 1xV100 32GB |
| 07-06-23 | wavlm/hparams/train_ewc.yaml | 86.98 | -39.54 | -4.26 | -6.13 | Link | 1xV100 32GB |
| 07-06-23 | wavlm/hparams/train_lwf.yaml | 87.17 | -26.03 | 10.42 | -20.82 | Link | 1xV100 32GB |
| 01-10-23 | wavlm/hparams/train_mas.yaml | 83.06 | -1.37 | 33.22 | - | Not available | 1xV100 32GB |
Raw experiment logs are available here. We do not include the checkpoints due to storage limits (each experiment with Whisper large-v2 generates ~125 GB of checkpoint data).
Analyses generated via analyze_logs.py are available here.
All the experiments were run on 5 CentOS Linux machines, each with an Intel(R) Xeon(R) Silver 4216 Cascade Lake CPU (32 cores @ 2.10 GHz), 64 GB of RAM, and an NVIDIA Tesla V100 SXM2 with 32 GB of memory (CUDA Toolkit 11.4). With this hardware configuration, approximately 10 days are necessary to complete all the experiments.
If you use the CL-MASR benchmark, please cite:
```bibtex
@article{dellalibera2024clmasr,
    author = {{Della Libera}, Luca and Mousavi, Pooneh and Zaiem, Salah and Subakan, Cem and Ravanelli, Mirco},
    journal = {IEEE/ACM Transactions on Audio, Speech, and Language Processing},
    title = {{CL-MASR}: A Continual Learning Benchmark for Multilingual {ASR}},
    year = {2024},
    volume = {32},
    pages = {4931--4944},
    doi = {10.1109/TASLP.2024.3487410}
}
```

If you use SpeechBrain, please cite the reference paper:
```bibtex
@article{ravanelli2024open,
    author = {Mirco Ravanelli and Titouan Parcollet and Adel Moumen and Sylvain de Langen and Cem Subakan and Peter Plantinga and Yingzhi Wang and Pooneh Mousavi and Luca {Della Libera} and Artem Ploujnikov and Francesco Paissan and Davide Borra and Salah Zaiem and Zeyu Zhao and Shucong Zhang and Georgios Karakasidis and Sung-Lin Yeh and Pierre Champion and Aku Rouhe and Rudolf Braun and Florian Mai and Juan Zuluaga-Gomez and Seyed Mahed Mousavi and Andreas Nautsch and Ha Nguyen and Xuechen Liu and Sangeet Sagar and Jarod Duret and Salima Mdhaffar and Ga{{\"e}}lle Laperri{{\`e}}re and Mickael Rouvier and Renato De Mori and Yannick Est{{\`e}}ve},
    title = {Open-Source Conversational {AI} with {SpeechBrain} 1.0},
    journal = {Journal of Machine Learning Research},
    year = {2024},
    volume = {25},
    number = {333},
    pages = {1--11},
    url = {http://jmlr.org/papers/v25/24-0991.html}
}
```

```bibtex
@article{ravanelli2021speechbrain,
    author = {Mirco Ravanelli and Titouan Parcollet and Peter Plantinga and Aku Rouhe and Samuele Cornell and Loren Lugosch and Cem Subakan and Nauman Dawalatabad and Abdelwahab Heba and Jianyuan Zhong and Ju-Chieh Chou and Sung-Lin Yeh and Szu-Wei Fu and Chien-Feng Liao and Elena Rastorgueva and François Grondin and William Aris and Hwidong Na and Yan Gao and Renato De Mori and Yoshua Bengio},
    title = {{SpeechBrain}: A General-Purpose Speech Toolkit},
    journal = {arXiv preprint arXiv:2106.04624},
    year = {2021},
    url = {https://arxiv.org/abs/2106.04624},
}
```