CTC Beam Search - Torchaudio wrapper - CTC Prefix Beam Search and CTC Beam Search from scratch + kenlm by Adel-Moumen · Pull Request #2011 · speechbrain/speechbrain

Adel-Moumen · 2023-05-31T13:52:11Z

The goal of this PR is to add CTC frame-synchronous beam search to SpeechBrain. It supports KenLM scorers and works out of the box with a SentencePiece tokenizer or a CTCLabelEncoder.

Some parts of the code are taken and modified from PyCTCDecode (see: https://github.com/kensho-technologies/pyctcdecode).
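For readers unfamiliar with the technique, here is a minimal sketch of CTC prefix beam search in plain Python. It is not the code from this PR or from PyCTCDecode: the function name `ctc_prefix_beam_search`, its parameters, and the toy input are assumptions made for illustration, and it leaves out LM scoring, pruning thresholds, and batching.

```python
import math
from collections import defaultdict

NEG_INF = float("-inf")


def logsumexp(*xs):
    """Numerically stable log(sum(exp(x))) over a few scalars."""
    m = max(xs)
    if m == NEG_INF:
        return NEG_INF
    return m + math.log(sum(math.exp(x - m) for x in xs))


def ctc_prefix_beam_search(log_probs, beam_size=4, blank=0):
    """Illustrative decoder. log_probs is a list of T frames, each a list of
    V log-probabilities. Returns (best token sequence, its log-probability)."""
    # Each prefix keeps two scores: log p(paths ending in blank) and
    # log p(paths ending in a non-blank token).
    beams = {(): (0.0, NEG_INF)}
    for frame in log_probs:
        next_beams = defaultdict(lambda: (NEG_INF, NEG_INF))
        for prefix, (p_b, p_nb) in beams.items():
            for token, lp in enumerate(frame):
                if token == blank:
                    # A blank keeps the prefix unchanged.
                    nb, nnb = next_beams[prefix]
                    next_beams[prefix] = (logsumexp(nb, p_b + lp, p_nb + lp), nnb)
                    continue
                new_prefix = prefix + (token,)
                nb, nnb = next_beams[new_prefix]
                if prefix and prefix[-1] == token:
                    # A repeated token extends the prefix only after a blank...
                    next_beams[new_prefix] = (nb, logsumexp(nnb, p_b + lp))
                    # ...otherwise it collapses into the same prefix.
                    sb, snb = next_beams[prefix]
                    next_beams[prefix] = (sb, logsumexp(snb, p_nb + lp))
                else:
                    next_beams[new_prefix] = (nb, logsumexp(nnb, p_b + lp, p_nb + lp))
        # Keep only the beam_size most probable prefixes.
        ranked = sorted(next_beams.items(), key=lambda kv: logsumexp(*kv[1]), reverse=True)
        beams = dict(ranked[:beam_size])
    best_prefix, (p_b, p_nb) = max(beams.items(), key=lambda kv: logsumexp(*kv[1]))
    return list(best_prefix), logsumexp(p_b, p_nb)


# Toy input: 2 frames, vocabulary {0: blank, 1: "a"}, p(blank)=0.6, p(a)=0.4.
frames = [[math.log(0.6), math.log(0.4)]] * 2
tokens, score = ctc_prefix_beam_search(frames, beam_size=4, blank=0)
```

On this toy input, greedy decoding picks blank in both frames and outputs an empty sequence (probability 0.36), while the prefix search merges the three alignments of "a" (`a-`, `-a`, `aa`) for a total of 0.64 and returns `[1]`.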

How to test it?

Download our pretrained CTC wav2vec2 model from our Dropbox

cd recipes/LibriSpeech/ASR/CTC/
wget -O wav2vec.zip https://www.dropbox.com/sh/qj2ps85g8oiicrj/AAAxlkQw5Pfo0M9EyHMi8iAra?dl=1
unzip wav2vec.zip -d wav2vec2_ctc

Run the recipe

# Make sure that the `output_folder` name matches the pretrained folder.
python3 train_with_wav2vec.py hparams/train_hf_wav2vec.yaml --data_folder=path

Results

Please see the updated README.md in the PR.

To do:

Adel-Moumen · 2023-08-24T13:27:51Z

Some updates regarding this PR:

I think we can now extend the beamsearcher to all the other recipes using CTC (e.g., whisper could be the first one).

I changed the following recipes: CommonVoice, Aishell, LibriSpeech and Switchboard. I can't change the other recipes because of the `space_token` argument. On Media, for instance, I don't know what this value should be, and I prefer not to change anything in case it harms the results.
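To illustrate why the space token matters, here is a small sketch, assuming the decoder needs word boundaries (for example, to score whole words with a word-level n-gram LM). The helper `tokens_to_words` is hypothetical and is not part of SpeechBrain; it only shows how the same grouping logic depends on which symbol marks a space in a given tokenizer.

```python
def tokens_to_words(tokens, space_token=" "):
    """Hypothetical helper: join decoded tokens and split them into words
    at every occurrence of space_token."""
    text = "".join(tokens).replace(space_token, " ")
    return text.split()


# Character-level tokens, where the space is a plain " ".
char_words = tokens_to_words(["h", "i", " ", "y", "o"], space_token=" ")

# SentencePiece-style tokens, where "▁" marks the start of a word.
sp_words = tokens_to_words(["▁he", "llo", "▁wor", "ld"], space_token="▁")
```

With the wrong `space_token`, the tokens would be glued into one long string and no word boundary would be found, which is the kind of silent degradation the comment above wants to avoid on datasets whose tokenization is not known.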

One note: on Aishell, beam search does not lead to any improvement. I tried the TorchAudio/CTCBeamSearch/CTCPrefixBeamsearch decoders with and without kenLM, and none of them yields any gains.

We can add some full-inference recipe tests for the CTC beamsearchers. As mentioned privately, we can take this opportunity to make the inference tests faster by downloading a version of LibriSpeech test-clean with only a few sentences.

Done.

I would retrain wav2vec2 ctc, due to the dimensionality issues that we have with the previous checkpoints.

Done.

I am currently running the recipe tests. I will report shortly if everything went fine.

Adel-Moumen · 2023-08-25T15:39:37Z

I am currently running the recipe tests. I will report shortly if everything went fine.

Everything went fine!

mravanelli · 2023-09-20T22:46:33Z

I ran the recipe tests again and all seem to work. The only tests that fail are the direct recipes in timers and such, SLURM, fluent. This is expected because they use the pattern:

asr_model: !apply:speechbrain.pretrained.EncoderDecoderASR.from_hparams
    source: speechbrain/asr-crdnn-rnnlm-librispeech
    run_opts: {"device":"cuda:0"}

which calls code that is not compatible with this version. When we release the new version, we will have to modify the HF repo accordingly, and the issues will be fixed.

I think we can finally merge this PR. This is amazing work, @Adel-Moumen, and a fundamental step toward SpeechBrain 1.0!

BeamSearchDecoderCTC vanilla impl

6812a1a

Adel-Moumen changed the base branch from develop to ctc-prefix-beamsearch May 31, 2023 13:57

Adel-Moumen added 7 commits May 31, 2023 16:36

add lm

4fe9d7f

lm

c16a1d7

remove imports / add name author

f27df33

remove prune history

6e8332c

remove lm_start_states

8939d85

updt

5915568

update add ctc prefix

c02f6dc

Adel-Moumen assigned TParcollet Jun 20, 2023

Adel-Moumen added 20 commits June 21, 2023 10:57

class

5fe3607

dataclasses

b82274d

sentencepiece

1b1ae10

partial decode step

0cf5197

remove unnecessary args

a980fe2

ctc cases

d169913

ctc beam search 1.90% wer working

a588ad6

multithreading

a145334

online decoding

d046171

lm added

950795e

change pool

748186a

refacto

f7ad195

refacto -> folder ctc

1986853

start prefix bs

6e112f3

move generic objs in utils.py

119dcb9

ctc prefix bs

54e0cc1

updt prefix bs

1a2da69

need to fix repeated char

1490eb1

prefix bs working

36c420f

1.90% wer

bffb839

Adel-Moumen and others added 6 commits August 16, 2023 15:00

Merge remote-tracking branch 'upstream/develop' into ctc-frame-sync-bs

8e9426a

Merge branch 'unstable-v0.6' into ctc-frame-sync-bs

1b2abe4

recipe test for wav2vec

b5870a5

aishell

ec18ee3

Merge branch 'speechbrain:develop' into ctc-frame-sync-bs

d43822f

commonvoice

1a40741

Adel-Moumen changed the base branch from unstable-v0.6 to develop August 22, 2023 12:49

Adel-Moumen changed the base branch from develop to unstable-v0.6 August 22, 2023 12:49

Adel-Moumen added 6 commits August 22, 2023 14:56

switchboard

4cb9fd3

update readme

a2a9e0e

update readme

11fb434

update lionk in test file

285c2b0

remove unused space token

b43e6d0

update torchaudio

10af44f

Adel-Moumen added 4 commits August 24, 2023 17:24

remove deprecated language model path

1ccd6a8

fix merge

92e5221

fix vocab

7b88d09

fix switchboard

cca0f1b

Adel-Moumen and others added 7 commits August 25, 2023 18:12

commit

7e7e184

fix conflict

d15400b

Merge remote-tracking branch 'upstream/develop' into ctc-frame-sync-bs

543cabd

Merge remote-tracking branch 'upstream/develop' into ctc-frame-sync-bs

cdd6689

fix test

9852242

Merge branch 'unstable-v0.6' into ctc-frame-sync-bs

d265243

fix style

ada4e9d

remove unsued hparam

06d05ec

mravanelli merged commit 2a51a4c into speechbrain:unstable-v0.6 Sep 20, 2023

mravanelli merged 171 commits into speechbrain:unstable-v0.6 from Adel-Moumen:ctc-frame-sync-bs