CTC Beam Search - Torchaudio wrapper - CTC Prefix Beam Search and CTC Beam Search from scratch + kenlm #2011
Some updates regarding this PR:
I changed the following recipes: CommonVoice, Aishell, LibriSpeech and Switchboard. I can't change the other recipes due to the argument
One note: Aishell with beam search does not lead to any improvement. I tried TorchAudio/CTCBeamSearch/CTCPrefixBeamsearch with and without kenLM and it doesn't yield any gains.
Done.
Done. I am currently running the recipe tests. I will report shortly if everything went fine.
Everything went fine!
I ran the recipe tests again and they all seem to work. The only tests that fail are the direct recipes in timers-and-such, SLURP and fluent. This is expected because they use the pattern: which calls code that is not compatible with this version. When we release the new version, we will have to update the HF repo accordingly, which will fix these issues. I think we can finally merge this PR. This is amazing work @Adel-Moumen and a fundamental step toward SpeechBrain 1.0!
The goal of this PR is to add CTC frame-synchronous beam search to SpeechBrain. It supports kenLM scorers and works out of the box with a SentencePiece tokenizer or a CTCLabelEncoder.
Some parts of the code are taken and adapted from PyCTCDecode (see: https://github.com/kensho-technologies/pyctcdecode).
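For readers unfamiliar with the technique, here is a minimal sketch of CTC prefix beam search in plain Python. It is not the code in this PR: the function name `ctc_prefix_beam_search`, its parameters and the toy input are all hypothetical, and kenLM scoring is left out entirely. It only illustrates the core idea of tracking, for each prefix, the probability of ending in blank versus non-blank.

```python
import math
from collections import defaultdict

NEG_INF = float("-inf")


def logsumexp(*xs):
    """Numerically stable log(sum(exp(x))) that tolerates -inf inputs."""
    m = max(xs)
    if m == NEG_INF:
        return NEG_INF
    return m + math.log(sum(math.exp(x - m) for x in xs))


def ctc_prefix_beam_search(log_probs, beam_size=4, blank=0):
    """Hypothetical helper, not the SpeechBrain API.

    log_probs: list of T frames, each a list of V log-probabilities.
    Returns a list of (token_ids, log_score), best hypothesis first.
    """
    # prefix -> (log p of paths ending in blank, log p ending in non-blank)
    beams = {(): (0.0, NEG_INF)}
    for frame in log_probs:
        next_beams = defaultdict(lambda: (NEG_INF, NEG_INF))
        for prefix, (p_b, p_nb) in beams.items():
            for token, lp in enumerate(frame):
                if token == blank:
                    # Blank keeps the prefix unchanged.
                    nb, nnb = next_beams[prefix]
                    next_beams[prefix] = (logsumexp(nb, p_b + lp, p_nb + lp), nnb)
                elif prefix and token == prefix[-1]:
                    # Repeated token without a blank in between collapses.
                    nb, nnb = next_beams[prefix]
                    next_beams[prefix] = (nb, logsumexp(nnb, p_nb + lp))
                    # It only extends the prefix if a blank separated them.
                    ext = prefix + (token,)
                    nb, nnb = next_beams[ext]
                    next_beams[ext] = (nb, logsumexp(nnb, p_b + lp))
                else:
                    # A new token always extends the prefix.
                    ext = prefix + (token,)
                    nb, nnb = next_beams[ext]
                    next_beams[ext] = (nb, logsumexp(nnb, p_b + lp, p_nb + lp))
        # Keep the beam_size prefixes with the highest total probability.
        ranked = sorted(
            next_beams.items(), key=lambda kv: logsumexp(*kv[1]), reverse=True
        )
        beams = dict(ranked[:beam_size])
    return [(list(p), logsumexp(p_b, p_nb)) for p, (p_b, p_nb) in beams.items()]


# Toy input (made up): 2 frames, vocabulary {0: blank, 1: "a"}.
frames = [[math.log(0.6), math.log(0.4)], [math.log(0.6), math.log(0.4)]]
best_tokens, best_score = ctc_prefix_beam_search(frames)[0]
print(best_tokens, math.exp(best_score))
```

On this toy input, greedy decoding would pick blank at both frames and output the empty string (probability 0.36), whereas the prefix `[1]` sums over three alignments (0.24 + 0.24 + 0.16 = 0.64) and wins. A language-model scorer would typically add a weighted LM log-probability whenever a prefix is extended; that part is not shown here.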
How to test it?
Download our pretrained CTC wav2vec2 model from our Dropbox
Run the recipe
# Make sure that the `output_folder` name matches the pretrained folder.
python3 train_with_wav2vec.py hparams/train_hf_wav2vec.yaml --data_folder=path
Results
Please see the updated README.md in the PR.
To do: