Fix dynamic batching #2174

Merged
mravanelli merged 17 commits into unstable-v0.6 from revert-2173-revert-2170-fix-dynamic-batching on Sep 28, 2023

Conversation

@Adel-Moumen Adel-Moumen commented Sep 26, 2023

Collaborator

This PR fixes dynamic batching in SpeechBrain. As explained in #1881, some parameters are not actually passed to the DynamicBatchSampler object.

This PR is also a cleanup of #1891.

@Adel-Moumen

Collaborator Author

Before:

speechbrain.dataio.sampler - DynamicBatchSampler: Generating dynamic batches
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 0 with boundary 0.0-3.5 and batch_size 173: Num Examples 11505.0, Num Full Batches 52.000, Pad Factor 96.701.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 1 with boundary 3.5-4.4 and batch_size 135: Num Examples 7920.0, Num Full Batches 52.000, Pad Factor 24.668.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 2 with boundary 4.4-5.2 and batch_size 115: Num Examples 6010.0, Num Full Batches 48.000, Pad Factor 15.465.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 3 with boundary 5.2-5.8 and batch_size 102: Num Examples 5231.0, Num Full Batches 48.000, Pad Factor 11.521.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 4 with boundary 5.8-6.4 and batch_size 93: Num Examples 4479.0, Num Full Batches 45.000, Pad Factor 9.315.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 5 with boundary 6.4-6.9 and batch_size 86: Num Examples 3914.0, Num Full Batches 43.000, Pad Factor 7.799.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 6 with boundary 6.9-7.4 and batch_size 80: Num Examples 3923.0, Num Full Batches 46.000, Pad Factor 6.829.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 7 with boundary 7.4-7.9 and batch_size 75: Num Examples 3677.0, Num Full Batches 46.000, Pad Factor 6.068.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 8 with boundary 7.9-8.3 and batch_size 71: Num Examples 3484.0, Num Full Batches 47.000, Pad Factor 5.420.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 9 with boundary 8.3-8.8 and batch_size 68: Num Examples 3433.0, Num Full Batches 48.000, Pad Factor 5.026.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 10 with boundary 8.8-9.2 and batch_size 65: Num Examples 3516.0, Num Full Batches 52.000, Pad Factor 4.617.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 11 with boundary 9.2-9.6 and batch_size 62: Num Examples 3434.0, Num Full Batches 53.000, Pad Factor 4.254.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 12 with boundary 9.6-10.0 and batch_size 59: Num Examples 3696.0, Num Full Batches 60.000, Pad Factor 4.029.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 13 with boundary 10.0-10.4 and batch_size 57: Num Examples 3716.0, Num Full Batches 63.000, Pad Factor 3.775.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 14 with boundary 10.4-10.8 and batch_size 55: Num Examples 4004.0, Num Full Batches 70.000, Pad Factor 3.543.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 15 with boundary 10.8-11.2 and batch_size 53: Num Examples 4395.0, Num Full Batches 80.000, Pad Factor 3.420.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 16 with boundary 11.2-11.5 and batch_size 52: Num Examples 4897.0, Num Full Batches 92.000, Pad Factor 3.219.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 17 with boundary 11.5-11.9 and batch_size 50: Num Examples 5769.0, Num Full Batches 112.000, Pad Factor 3.117.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 18 with boundary 11.9-12.3 and batch_size 48: Num Examples 6887.0, Num Full Batches 138.000, Pad Factor 2.980.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 19 with boundary 12.3-12.6 and batch_size 47: Num Examples 7934.0, Num Full Batches 164.000, Pad Factor 2.854.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 20 with boundary 12.6-13.0 and batch_size 46: Num Examples 9430.0, Num Full Batches 201.000, Pad Factor 2.736.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 21 with boundary 13.0-13.3 and batch_size 45: Num Examples 11227.0, Num Full Batches 246.000, Pad Factor 2.661.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 22 with boundary 13.3-13.7 and batch_size 43: Num Examples 13532.0, Num Full Batches 304.000, Pad Factor 2.591.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 23 with boundary 13.7-14.0 and batch_size 42: Num Examples 15688.0, Num Full Batches 362.000, Pad Factor 2.490.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 24 with boundary 14.0-14.4 and batch_size 41: Num Examples 17933.0, Num Full Batches 424.000, Pad Factor 2.428.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 25 with boundary 14.4-14.7 and batch_size 40: Num Examples 19787.0, Num Full Batches 480.000, Pad Factor 2.370.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 26 with boundary 14.7-15.1 and batch_size 39: Num Examples 20922.0, Num Full Batches 519.000, Pad Factor 2.282.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 27 with boundary 15.1-15.4 and batch_size 38: Num Examples 21439.0, Num Full Batches 544.000, Pad Factor 2.230.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 28 with boundary 15.4-15.8 and batch_size 38: Num Examples 20907.0, Num Full Batches 543.000, Pad Factor 2.181.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 29 with boundary 15.8-16.1 and batch_size 37: Num Examples 15911.0, Num Full Batches 422.000, Pad Factor 2.136.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 30 with boundary 16.1-16.5 and batch_size 36: Num Examples 7192.0, Num Full Batches 194.000, Pad Factor 2.091.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 31 with boundary 16.5-16.8 and batch_size 35: Num Examples 3720.0, Num Full Batches 103.000, Pad Factor 2.046.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 32 with boundary 16.8-17.1 and batch_size 34: Num Examples 1487.0, Num Full Batches 41.000, Pad Factor 2.008.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 33 with boundary 17.1-17.5 and batch_size 34: Num Examples 122.0, Num Full Batches 3.000, Pad Factor 1.828.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 34 with boundary 17.5-17.8 and batch_size 33: Num Examples 12.0, Num Full Batches 0.000, Pad Factor 1.613.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 35 with boundary 17.8-18.2 and batch_size 32: Num Examples 6.0, Num Full Batches 0.000, Pad Factor 1.224.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 36 with boundary 18.2-18.5 and batch_size 32: Num Examples 9.0, Num Full Batches 0.000, Pad Factor 1.445.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 37 with boundary 18.5-18.9 and batch_size 31: Num Examples 9.0, Num Full Batches 0.000, Pad Factor 1.124.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 38 with boundary 18.9-19.2 and batch_size 31: Num Examples 9.0, Num Full Batches 0.000, Pad Factor 1.413.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 39 with boundary 19.2-19.6 and batch_size 30: Num Examples 5.0, Num Full Batches 0.000, Pad Factor 1.448.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 40 with boundary 19.6-19.9 and batch_size 30: Num Examples 14.0, Num Full Batches 0.000, Pad Factor 1.417.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 41 with boundary 19.9-20.3 and batch_size 29: Num Examples 7.0, Num Full Batches 0.000, Pad Factor 1.369.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 42 with boundary 20.3-20.6 and batch_size 29: Num Examples 4.0, Num Full Batches 0.000, Pad Factor 0.637.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 43 with boundary 20.6-21.0 and batch_size 28: Num Examples 5.0, Num Full Batches 0.000, Pad Factor 1.108.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 44 with boundary 21.0-21.3 and batch_size 28: Num Examples 4.0, Num Full Batches 0.000, Pad Factor 1.090.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 45 with boundary 21.3-21.7 and batch_size 27: Num Examples 7.0, Num Full Batches 0.000, Pad Factor 1.346.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 46 with boundary 21.7-22.0 and batch_size 27: Num Examples 1.0, Num Full Batches 0.000, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 47 with boundary 22.0-22.4 and batch_size 26: Num Examples 2.0, Num Full Batches 0.000, Pad Factor 0.920.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 48 with boundary 22.4-22.8 and batch_size 26: Num Examples 6.0, Num Full Batches 0.000, Pad Factor 1.242.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 49 with boundary 22.8-23.1 and batch_size 25: Num Examples 2.0, Num Full Batches 0.000, Pad Factor 0.459.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 50 with boundary 23.1-23.5 and batch_size 25: Num Examples 1.0, Num Full Batches 0.000, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 51 with boundary 23.5-23.9 and batch_size 25: Num Examples 2.0, Num Full Batches 0.000, Pad Factor 0.421.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 52 with boundary 23.9-24.2 and batch_size 24: Num Examples 2.0, Num Full Batches 0.000, Pad Factor 0.042.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 53 with boundary 24.2-24.6 and batch_size 24: Num Examples 2.0, Num Full Batches 0.000, Pad Factor 0.839.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 54 with boundary 24.6-25.0 and batch_size 24: Num Examples 2.0, Num Full Batches 0.000, Pad Factor 0.121.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 55 with boundary 25.0-25.3 and batch_size 23: Num Examples 2.0, Num Full Batches 0.000, Pad Factor 0.994.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 56 with boundary 25.3-25.7 and batch_size 23: Num Examples 0.0, Num Full Batches 0.000, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 57 with boundary 25.7-26.1 and batch_size 22: Num Examples 2.0, Num Full Batches 0.000, Pad Factor 0.291.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 58 with boundary 26.1-26.5 and batch_size 22: Num Examples 1.0, Num Full Batches 0.000, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 59 with boundary 26.5-26.9 and batch_size 22: Num Examples 1.0, Num Full Batches 0.000, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 60 with boundary 26.9-27.2 and batch_size 22: Num Examples 1.0, Num Full Batches 0.000, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 61 with boundary 27.2-27.6 and batch_size 21: Num Examples 0.0, Num Full Batches 0.000, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 62 with boundary 27.6-28.0 and batch_size 21: Num Examples 1.0, Num Full Batches 0.000, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 63 with boundary 28.0-28.4 and batch_size 21: Num Examples 0.0, Num Full Batches 0.000, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 64 with boundary 28.4-28.8 and batch_size 20: Num Examples 0.0, Num Full Batches 0.000, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 65 with boundary 28.8-29.2 and batch_size 20: Num Examples 0.0, Num Full Batches 0.000, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 66 with boundary 29.2-29.6 and batch_size 20: Num Examples 0.0, Num Full Batches 0.000, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 67 with boundary 29.6-30.0 and batch_size 19: Num Examples 1.0, Num Full Batches 0.000, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 68 with boundary 30.0-30.4 and batch_size 19: Num Examples 0.0, Num Full Batches 0.000, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 69 with boundary 30.4-30.9 and batch_size 19: Num Examples 0.0, Num Full Batches 0.000, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 70 with boundary 30.9-31.3 and batch_size 19: Num Examples 0.0, Num Full Batches 0.000, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 71 with boundary 31.3-31.7 and batch_size 18: Num Examples 0.0, Num Full Batches 0.000, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 72 with boundary 31.7-32.1 and batch_size 18: Num Examples 0.0, Num Full Batches 0.000, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 73 with boundary 32.1-32.5 and batch_size 18: Num Examples 0.0, Num Full Batches 0.000, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 74 with boundary 32.5-33.0 and batch_size 18: Num Examples 0.0, Num Full Batches 0.000, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 75 with boundary 33.0-33.4 and batch_size 17: Num Examples 0.0, Num Full Batches 0.000, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 76 with boundary 33.4-33.9 and batch_size 17: Num Examples 0.0, Num Full Batches 0.000, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 77 with boundary 33.9-34.3 and batch_size 17: Num Examples 0.0, Num Full Batches 0.000, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 78 with boundary 34.3-34.7 and batch_size 17: Num Examples 0.0, Num Full Batches 0.000, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 79 with boundary 34.7-35.2 and batch_size 17: Num Examples 0.0, Num Full Batches 0.000, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 80 with boundary 35.2-35.6 and batch_size 16: Num Examples 0.0, Num Full Batches 0.000, Pad Factor 0.000

....

speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 191 with boundary 236.7-248.9 and batch_size 2: Num Examples 0.0, Num Full Batches 0.000, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 192 with boundary 248.9-263.1 and batch_size 2: Num Examples 0.0, Num Full Batches 0.000, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 193 with boundary 263.1-279.7 and batch_size 2: Num Examples 0.0, Num Full Batches 0.000, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 194 with boundary 279.7-299.6 and batch_size 2: Num Examples 0.0, Num Full Batches 0.000, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 195 with boundary 299.6-324.2 and batch_size 1: Num Examples 0.0, Num Full Batches 0.000, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 196 with boundary 324.2-356.1 and batch_size 1: Num Examples 0.0, Num Full Batches 0.000, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 197 with boundary 356.1-400.0 and batch_size 1: Num Examples 0.0, Num Full Batches 0.000, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 198 with boundary 400.0-467.6 and batch_size 1: Num Examples 0.0, Num Full Batches 0.000, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 199 with boundary 467.6-600.0 and batch_size 1: Num Examples 0.0, Num Full Batches 0.000, Pad Factor 0.000.

After:

speechbrain.dataio.sampler - DynamicBatchSampler: Generating dynamic batches
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 0 with boundary 0.0-3.5 and batch_size 32: Num Examples 11505, Num Full Batches 52, Pad Factor 96.701.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 1 with boundary 3.5-4.4 and batch_size 32: Num Examples 7920, Num Full Batches 52, Pad Factor 24.668.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 2 with boundary 4.4-5.2 and batch_size 32: Num Examples 6010, Num Full Batches 48, Pad Factor 15.465.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 3 with boundary 5.2-5.8 and batch_size 32: Num Examples 5231, Num Full Batches 48, Pad Factor 11.521.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 4 with boundary 5.8-6.4 and batch_size 32: Num Examples 4479, Num Full Batches 45, Pad Factor 9.315.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 5 with boundary 6.4-6.9 and batch_size 32: Num Examples 3914, Num Full Batches 43, Pad Factor 7.799.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 6 with boundary 6.9-7.4 and batch_size 32: Num Examples 3923, Num Full Batches 46, Pad Factor 6.829.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 7 with boundary 7.4-7.9 and batch_size 32: Num Examples 3677, Num Full Batches 46, Pad Factor 6.068.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 8 with boundary 7.9-8.3 and batch_size 32: Num Examples 3484, Num Full Batches 47, Pad Factor 5.420.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 9 with boundary 8.3-8.8 and batch_size 32: Num Examples 3433, Num Full Batches 48, Pad Factor 5.026.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 10 with boundary 8.8-9.2 and batch_size 32: Num Examples 3516, Num Full Batches 52, Pad Factor 4.617.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 11 with boundary 9.2-9.6 and batch_size 32: Num Examples 3434, Num Full Batches 53, Pad Factor 4.254.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 12 with boundary 9.6-10.0 and batch_size 32: Num Examples 3696, Num Full Batches 60, Pad Factor 4.029.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 13 with boundary 10.0-10.4 and batch_size 32: Num Examples 3716, Num Full Batches 63, Pad Factor 3.775.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 14 with boundary 10.4-10.8 and batch_size 32: Num Examples 4004, Num Full Batches 70, Pad Factor 3.543.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 15 with boundary 10.8-11.2 and batch_size 32: Num Examples 4395, Num Full Batches 80, Pad Factor 3.420.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 16 with boundary 11.2-11.5 and batch_size 32: Num Examples 4897, Num Full Batches 92, Pad Factor 3.219.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 17 with boundary 11.5-11.9 and batch_size 32: Num Examples 5769, Num Full Batches 112, Pad Factor 3.117.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 18 with boundary 11.9-12.3 and batch_size 32: Num Examples 6887, Num Full Batches 138, Pad Factor 2.980.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 19 with boundary 12.3-12.6 and batch_size 32: Num Examples 7934, Num Full Batches 164, Pad Factor 2.854.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 20 with boundary 12.6-13.0 and batch_size 32: Num Examples 9430, Num Full Batches 201, Pad Factor 2.736.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 21 with boundary 13.0-13.3 and batch_size 32: Num Examples 11227, Num Full Batches 246, Pad Factor 2.661.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 22 with boundary 13.3-13.7 and batch_size 32: Num Examples 13532, Num Full Batches 304, Pad Factor 2.591.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 23 with boundary 13.7-14.0 and batch_size 32: Num Examples 15688, Num Full Batches 362, Pad Factor 2.490.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 24 with boundary 14.0-14.4 and batch_size 32: Num Examples 17933, Num Full Batches 424, Pad Factor 2.428.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 25 with boundary 14.4-14.7 and batch_size 32: Num Examples 19787, Num Full Batches 480, Pad Factor 2.370.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 26 with boundary 14.7-15.1 and batch_size 32: Num Examples 20922, Num Full Batches 519, Pad Factor 2.282.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 27 with boundary 15.1-15.4 and batch_size 32: Num Examples 21439, Num Full Batches 544, Pad Factor 2.230.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 28 with boundary 15.4-15.8 and batch_size 32: Num Examples 20907, Num Full Batches 543, Pad Factor 2.181.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 29 with boundary 15.8-16.1 and batch_size 32: Num Examples 15911, Num Full Batches 422, Pad Factor 2.136.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 30 with boundary 16.1-16.5 and batch_size 32: Num Examples 7192, Num Full Batches 194, Pad Factor 2.091.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 31 with boundary 16.5-16.8 and batch_size 32: Num Examples 3720, Num Full Batches 103, Pad Factor 2.046.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 32 with boundary 16.8-17.1 and batch_size 32: Num Examples 1487, Num Full Batches 41, Pad Factor 2.008.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 33 with boundary 17.1-17.5 and batch_size 32: Num Examples 122, Num Full Batches 3, Pad Factor 1.828.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 34 with boundary 17.5-17.8 and batch_size 32: Num Examples 12, Num Full Batches 0, Pad Factor 1.613.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 35 with boundary 17.8-18.2 and batch_size 32: Num Examples 6, Num Full Batches 0, Pad Factor 1.224.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 36 with boundary 18.2-18.5 and batch_size 32: Num Examples 9, Num Full Batches 0, Pad Factor 1.445.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 37 with boundary 18.5-18.9 and batch_size 31: Num Examples 9, Num Full Batches 0, Pad Factor 1.124.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 38 with boundary 18.9-19.2 and batch_size 31: Num Examples 9, Num Full Batches 0, Pad Factor 1.413.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 39 with boundary 19.2-19.6 and batch_size 30: Num Examples 5, Num Full Batches 0, Pad Factor 1.448.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 40 with boundary 19.6-19.9 and batch_size 30: Num Examples 14, Num Full Batches 0, Pad Factor 1.417.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 41 with boundary 19.9-20.3 and batch_size 29: Num Examples 7, Num Full Batches 0, Pad Factor 1.369.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 42 with boundary 20.3-20.6 and batch_size 29: Num Examples 4, Num Full Batches 0, Pad Factor 0.637.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 43 with boundary 20.6-21.0 and batch_size 28: Num Examples 5, Num Full Batches 0, Pad Factor 1.108.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 44 with boundary 21.0-21.3 and batch_size 28: Num Examples 4, Num Full Batches 0, Pad Factor 1.090.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 45 with boundary 21.3-21.7 and batch_size 27: Num Examples 7, Num Full Batches 0, Pad Factor 1.346.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 46 with boundary 21.7-22.0 and batch_size 27: Num Examples 1, Num Full Batches 0, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 47 with boundary 22.0-22.4 and batch_size 26: Num Examples 2, Num Full Batches 0, Pad Factor 0.920.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 48 with boundary 22.4-22.8 and batch_size 26: Num Examples 6, Num Full Batches 0, Pad Factor 1.242.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 49 with boundary 22.8-23.1 and batch_size 25: Num Examples 2, Num Full Batches 0, Pad Factor 0.459.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 50 with boundary 23.1-23.5 and batch_size 25: Num Examples 1, Num Full Batches 0, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 51 with boundary 23.5-23.9 and batch_size 25: Num Examples 2, Num Full Batches 0, Pad Factor 0.421.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 52 with boundary 23.9-24.2 and batch_size 24: Num Examples 2, Num Full Batches 0, Pad Factor 0.042.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 53 with boundary 24.2-24.6 and batch_size 24: Num Examples 2, Num Full Batches 0, Pad Factor 0.839.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 54 with boundary 24.6-25.0 and batch_size 24: Num Examples 2, Num Full Batches 0, Pad Factor 0.121.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 55 with boundary 25.0-25.3 and batch_size 23: Num Examples 2, Num Full Batches 0, Pad Factor 0.994.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 56 with boundary 25.3-25.7 and batch_size 23: Num Examples 0, Num Full Batches 0, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 57 with boundary 25.7-26.1 and batch_size 22: Num Examples 2, Num Full Batches 0, Pad Factor 0.291.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 58 with boundary 26.1-26.5 and batch_size 22: Num Examples 1, Num Full Batches 0, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 59 with boundary 26.5-26.9 and batch_size 22: Num Examples 1, Num Full Batches 0, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 60 with boundary 26.9-27.2 and batch_size 22: Num Examples 1, Num Full Batches 0, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 61 with boundary 27.2-27.6 and batch_size 21: Num Examples 0, Num Full Batches 0, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 62 with boundary 27.6-28.0 and batch_size 21: Num Examples 1, Num Full Batches 0, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 63 with boundary 28.0-28.4 and batch_size 21: Num Examples 0, Num Full Batches 0, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 64 with boundary 28.4-28.8 and batch_size 20: Num Examples 0, Num Full Batches 0, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 65 with boundary 28.8-29.2 and batch_size 20: Num Examples 0, Num Full Batches 0, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 66 with boundary 29.2-29.6 and batch_size 20: Num Examples 0, Num Full Batches 0, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 67 with boundary 29.6-30.0 and batch_size 19: Num Examples 1, Num Full Batches 0, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 68 with boundary 30.0-30.4 and batch_size 19: Num Examples 0, Num Full Batches 0, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 69 with boundary 30.4-30.9 and batch_size 19: Num Examples 0, Num Full Batches 0, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 70 with boundary 30.9-31.3 and batch_size 19: Num Examples 0, Num Full Batches 0, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 71 with boundary 31.3-31.7 and batch_size 18: Num Examples 0, Num Full Batches 0, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 72 with boundary 31.7-32.1 and batch_size 18: Num Examples 0, Num Full Batches 0, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 73 with boundary 32.1-32.5 and batch_size 18: Num Examples 0, Num Full Batches 0, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 74 with boundary 32.5-33.0 and batch_size 18: Num Examples 0, Num Full Batches 0, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 75 with boundary 33.0-33.4 and batch_size 17: Num Examples 0, Num Full Batches 0, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 76 with boundary 33.4-33.9 and batch_size 17: Num Examples 0, Num Full Batches 0, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 77 with boundary 33.9-34.3 and batch_size 17: Num Examples 0, Num Full Batches 0, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 78 with boundary 34.3-34.7 and batch_size 17: Num Examples 0, Num Full Batches 0, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 79 with boundary 34.7-35.2 and batch_size 17: Num Examples 0, Num Full Batches 0, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 80 with boundary 35.2-35.6 and batch_size 16: Num Examples 0, Num Full Batches 0, Pad Factor 0.000.
....


speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 160 with boundary 104.3-106.1 and batch_size 5: Num Examples 0, Num Full Batches 0, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 161 with boundary 106.1-108.0 and batch_size 5: Num Examples 0, Num Full Batches 0, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 162 with boundary 108.0-110.0 and batch_size 5: Num Examples 0, Num Full Batches 0, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 163 with boundary 110.0-112.1 and batch_size 5: Num Examples 0, Num Full Batches 0, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 164 with boundary 112.1-114.2 and batch_size 5: Num Examples 0, Num Full Batches 0, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 165 with boundary 114.2-116.4 and batch_size 5: Num Examples 0, Num Full Batches 0, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 166 with boundary 116.4-118.7 and batch_size 5: Num Examples 0, Num Full Batches 0, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 167 with boundary 118.7-121.1 and batch_size 4: Num Examples 0, Num Full Batches 0, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 168 with boundary 121.1-123.6 and batch_size 4: Num Examples 0, Num Full Batches 0, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 169 with boundary 123.6-126.2 and batch_size 4: Num Examples 0, Num Full Batches 0, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 170 with boundary 126.2-128.9 and batch_size 4: Num Examples 0, Num Full Batches 0, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 171 with boundary 128.9-131.7 and batch_size 4: Num Examples 0, Num Full Batches 0, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 172 with boundary 131.7-134.7 and batch_size 4: Num Examples 0, Num Full Batches 0, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 173 with boundary 134.7-137.8 and batch_size 4: Num Examples 0, Num Full Batches 0, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 174 with boundary 137.8-141.0 and batch_size 4: Num Examples 0, Num Full Batches 0, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 175 with boundary 141.0-144.4 and batch_size 4: Num Examples 0, Num Full Batches 0, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 176 with boundary 144.4-148.0 and batch_size 4: Num Examples 0, Num Full Batches 0, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 177 with boundary 148.0-151.8 and batch_size 3: Num Examples 0, Num Full Batches 0, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 178 with boundary 151.8-155.8 and batch_size 3: Num Examples 0, Num Full Batches 0, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 179 with boundary 155.8-160.1 and batch_size 3: Num Examples 0, Num Full Batches 0, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 180 with boundary 160.1-164.6 and batch_size 3: Num Examples 0, Num Full Batches 0, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 181 with boundary 164.6-169.5 and batch_size 3: Num Examples 0, Num Full Batches 0, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 182 with boundary 169.5-174.7 and batch_size 3: Num Examples 0, Num Full Batches 0, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 183 with boundary 174.7-180.2 and batch_size 3: Num Examples 0, Num Full Batches 0, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 184 with boundary 180.2-186.3 and batch_size 3: Num Examples 0, Num Full Batches 0, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 185 with boundary 186.3-192.8 and batch_size 3: Num Examples 0, Num Full Batches 0, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 186 with boundary 192.8-199.9 and batch_size 3: Num Examples 0, Num Full Batches 0, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 187 with boundary 199.9-207.7 and batch_size 2: Num Examples 0, Num Full Batches 0, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 188 with boundary 207.7-216.3 and batch_size 2: Num Examples 0, Num Full Batches 0, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 189 with boundary 216.3-225.9 and batch_size 2: Num Examples 0, Num Full Batches 0, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 190 with boundary 225.9-236.7 and batch_size 2: Num Examples 0, Num Full Batches 0, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 191 with boundary 236.7-248.9 and batch_size 2: Num Examples 0, Num Full Batches 0, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 192 with boundary 248.9-263.1 and batch_size 2: Num Examples 0, Num Full Batches 0, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 193 with boundary 263.1-279.7 and batch_size 2: Num Examples 0, Num Full Batches 0, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 194 with boundary 279.7-299.6 and batch_size 2: Num Examples 0, Num Full Batches 0, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 195 with boundary 299.6-324.2 and batch_size 1: Num Examples 0, Num Full Batches 0, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 196 with boundary 324.2-356.1 and batch_size 1: Num Examples 0, Num Full Batches 0, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 197 with boundary 356.1-400.0 and batch_size 1: Num Examples 0, Num Full Batches 0, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 198 with boundary 400.0-467.6 and batch_size 1: Num Examples 0, Num Full Batches 0, Pad Factor 0.000.
speechbrain.dataio.sampler - DynamicBatchSampler: Bucket 199 with boundary 467.6-600.0 and batch_size 1: Num Examples 0, Num Full Batches 0, Pad Factor 0.000. 
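
For readers of the log above: the logged `batch_size` values are consistent with a simple rule, namely the number of examples of the bucket's upper boundary length that fit into a fixed length budget. The sketch below only checks that arithmetic. The budget of 600 is an assumption inferred from the last bucket's upper boundary, and `batch_size_for_bucket` is a hypothetical helper, not a function taken from the sampler's source.

```python
def batch_size_for_bucket(upper_boundary: float, max_batch_length: float) -> int:
    """Number of examples of length `upper_boundary` that fit in the budget (at least 1)."""
    return max(1, int(max_batch_length / upper_boundary))


# Assumption: inferred from the upper boundary of the last bucket in the log.
MAX_BATCH_LENGTH = 600.0

# (upper boundary, logged batch_size) pairs copied from the log above.
LOGGED = [
    (126.2, 4),  # Bucket 169
    (148.0, 4),  # Bucket 176
    (151.8, 3),  # Bucket 177
    (199.9, 3),  # Bucket 186
    (207.7, 2),  # Bucket 187
    (299.6, 2),  # Bucket 194
    (324.2, 1),  # Bucket 195
    (600.0, 1),  # Bucket 199
]

for upper, logged_size in LOGGED:
    computed = batch_size_for_bucket(upper, MAX_BATCH_LENGTH)
    print(f"upper={upper:6.1f}  logged={logged_size}  computed={computed}")
```

The boundaries in the log are rounded to one decimal, so a bucket sitting exactly on a rounding edge could differ by one from this rule.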

@Adel-Moumen Adel-Moumen self-assigned this Sep 26, 2023
@Adel-Moumen Adel-Moumen marked this pull request as ready for review September 26, 2023 16:01
@Adel-Moumen

Collaborator Author

I had to skip some tests because of the latest merges into the develop and unstable branches. Unfortunately, the HF hub is not yet compatible with our new CTC/attention joint decoding, so some tests fail with the new interface.

@mravanelli mravanelli left a comment

Collaborator

Hi @Adel-Moumen, thank you for this PR. Here are my comments:

  • I left some minor suggestions directly in the code. Please take a look.
  • This functionality is important and can play a crucial role in scaling up SpeechBrain in the future. I suggest designing a few unit tests to make sure everything works (and keeps working as expected in the future).
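
A minimal sketch of the kind of invariants such unit tests could check. It uses a hypothetical pure-Python stand-in (`make_batches`) rather than SpeechBrain's `DynamicBatchSampler`, so the function name, its arguments, and the bucketing rule are all assumptions made for illustration.

```python
import bisect
from collections import defaultdict


def make_batches(lengths, boundaries, max_batch_length):
    """Hypothetical stand-in for a dynamic batch sampler.

    Assigns each example to the first bucket whose upper boundary is >= its
    length, then cuts every bucket into batches whose size is capped by
    max_batch_length / (bucket upper boundary).
    """
    buckets = defaultdict(list)
    for idx, length in enumerate(lengths):
        bucket_id = bisect.bisect_left(boundaries, length)
        if bucket_id == len(boundaries):
            raise ValueError(f"example {idx} is longer than the last boundary")
        buckets[bucket_id].append(idx)

    batches = []
    for bucket_id in sorted(buckets):
        size = max(1, int(max_batch_length / boundaries[bucket_id]))
        members = buckets[bucket_id]
        for start in range(0, len(members), size):
            batches.append(members[start:start + size])
    return batches


LENGTHS = [1.0, 2.5, 3.0, 7.9, 8.0, 12.0, 30.0, 59.0, 60.0]
BOUNDARIES = [5.0, 10.0, 20.0, 60.0]
MAX_BATCH_LENGTH = 60.0


def test_every_example_appears_exactly_once():
    batches = make_batches(LENGTHS, BOUNDARIES, MAX_BATCH_LENGTH)
    seen = sorted(i for batch in batches for i in batch)
    assert seen == list(range(len(LENGTHS)))


def test_padded_batch_fits_budget():
    batches = make_batches(LENGTHS, BOUNDARIES, MAX_BATCH_LENGTH)
    for batch in batches:
        # Cost after padding every example to the longest one in the batch.
        padded = len(batch) * max(LENGTHS[i] for i in batch)
        assert padded <= MAX_BATCH_LENGTH
```

The same two checks (no example dropped or duplicated, no padded batch over the length budget) could be pointed at the real sampler once its constructor arguments are fixed.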

Comment thread recipes/LibriSpeech/ASR/transformer/hparams/branchformer_large.yaml Outdated
Comment thread recipes/LibriSpeech/ASR/transformer/hparams/branchformer_large.yaml Outdated
Comment thread recipes/LibriSpeech/ASR/transformer/hparams/conformer_large.yaml Outdated
Comment thread recipes/LibriSpeech/ASR/transformer/hparams/conformer_large.yaml Outdated
Comment thread recipes/LibriSpeech/ASR/transformer/hparams/conformer_small.yaml Outdated
Comment thread recipes/LibriSpeech/ASR/transformer/hparams/conformer_small.yaml Outdated
Comment thread recipes/LibriSpeech/self-supervised-learning/wav2vec2/hparams/wav2vec2_base.yaml Outdated
Comment thread recipes/LibriSpeech/self-supervised-learning/wav2vec2/hparams/wav2vec2_base.yaml Outdated
Comment thread speechbrain/pretrained/interfaces.py
Comment thread speechbrain/utils/text_to_sequence.py
@mravanelli

Collaborator

Thank you @Adel-Moumen. I'm fine with this PR now.

@mravanelli mravanelli merged commit 32608ab into unstable-v0.6 Sep 28, 2023
@mravanelli mravanelli deleted the revert-2173-revert-2170-fix-dynamic-batching branch September 28, 2023 16:05
mravanelli added a commit that referenced this pull request Jan 7, 2024
* Refactor HF interface, adapt recipes

* Refactor HF interface, adapt recipes

* Refactor HF interface, adapt recipes

* Refactor HF interface, adapt recipes

* Refactor HF interface, adapt recipes

* Refactor HF interface, adapt recipes

* Refactor HF interface, adapt recipes

* Refactor HF interface, adapt recipes

* Refactor HF interface, adapt recipes

* Refactor HF interface, adapt recipes

* Refactor HF interface, adapt recipes

* Refactor HF interface, adapt recipes

* Refactor HF interface, adapt recipes

* Refactor HF interface, adapt recipes

* Refactor HF interface, adapt recipes

* Refactor HF interface, adapt recipes

* Refactor HF interface, adapt recipes

* Refactor HF interface, adapt recipes

* Refactor HF interface, adapt recipes

* Refactor HF interface, adapt recipes

* Refactor HF interface, adapt recipes

* Refactor HF interface, adapt recipes

* Refactor HF interface, adapt recipes

* Refactor HF interface, adapt recipes

* Refactor HF interface, adapt recipes

* Refactor HF interface, adapt recipes

* Refactor HF interface, adapt recipes

* Refactor HF interface, adapt recipes

* Refactor HF interface, adapt recipes

* Refactor HF interface, adapt recipes

* Refactor HF interface, adapt recipes

* Refactor HF interface, adapt recipes

* Refactor HF interface, adapt recipes

* Refactor HF interface, adapt recipes

* Refactor HF interface, adapt recipes

* Refactor HF interface, adapt recipes

* Refactor HF interface, adapt recipes

* Refactor HF interface, adapt recipes

* Refactor HF interface, adapt recipes

* Refactor HF interface, adapt recipes

* Refactor HF interface, adapt recipes

* Refactor HF interface, adapt recipes

* Refactor HF interface, adapt recipes

* rename HF's files

* fix docstrings

* fix args docstrings

* fix docstrings

* change classes' names

* Refactor HF interface, adapt recipes

* Refactor HF interface, adapt recipes

* Refactor HF interface, adapt recipes

* Refactor HF interface, adapt recipes

* Refactor HF interface, adapt recipes

* Refactor HF interface, adapt recipes

* Refactor HF interface, adapt recipes

* Refactor HF interface, adapt recipes

* Refactor HF interface, adapt recipes

* Refactor HF interface, adapt recipes

* Refactor HF interface, adapt recipes

* Refactor HF interface, adapt recipes

* Refactor HF interface, adapt recipes

* Refactor HF interface, adapt recipes

* Refactor HF interface, adapt recipes

* Refactor HF interface, adapt recipes

* Refactor HF interface, adapt recipes

* Refactor HF interface, adapt recipes

* Refactor HF interface, adapt recipes

* Refactor HF interface, adapt recipes

* Refactor HF interface, adapt recipes

* Refactor HF interface, adapt recipes

* Refactor HF interface, adapt recipes

* Refactor HF interface, adapt recipes

* Refactor HF interface, adapt recipes

* Refactor HF interface, adapt recipes

* Refactor HF interface, adapt recipes

* Refactor HF interface, adapt recipes

* Refactor HF interface, adapt recipes

* Refactor HF interface, adapt recipes

* Refactor HF interface, adapt recipes

* Refactor HF interface, adapt recipes

* Refactor HF interface, adapt recipes

* Refactor HF interface, adapt recipes

* Refactor HF interface, adapt recipes

* Refactor HF interface, adapt recipes

* Refactor HF interface, adapt recipes

* Refactor HF interface, adapt recipes

* Refactor HF interface, adapt recipes

* Refactor HF interface, adapt recipes

* Refactor HF interface, adapt recipes

* Refactor HF interface, adapt recipes

* Refactor HF interface, adapt recipes

* Fix docstrings

* commonvoice

* switchboard

* update readme

* update readme

* update lionk in test file

* remove unused space token

* update torchaudio

* remove deprecated language model path

* fix merge

* fix vocab

* fix switchboard

* commit

* fix test

* fix style

* remove unsued hparam

* fix consistancy blank_skip_threshold

* text frames

* CTCPrefixBeamSearcher timestamps

* pre-commit

* test

* test 2

* fix prints

* update ctcprefixbeamsearch timestamps

* remove frames from prefix bs

* Revert "remove frames from prefix bs"

This reverts commit 30900d9.

* remove prefix bs

* Revert "remove prefix bs"

This reverts commit 2f0c3cd.

* Revert "update ctcprefixbeamsearch timestamps"

This reverts commit ce09e19.

* Revert "fix prints"

This reverts commit bf36037.

* Revert "test 2"

This reverts commit 84cda94.

* Revert "test"

This reverts commit f17349f.

* Revert "pre-commit"

This reverts commit 4e1cf0d.

* Revert "CTCPrefixBeamSearcher timestamps"

This reverts commit c3d3cf7.

* Revert "text frames"

This reverts commit e67c761.

* Revert "fix consistancy blank_skip_threshold"

This reverts commit f97a391.

* Update ctc.py

* arg / timestamps

* precommit

* timesteps -> text_frames

* ls seq2seq

* transformer ls

* fix naming

* librispeech

* aishell

* fix linter

* precommit

* switchboard

* timit

* Dynamic batching fixed

* authors

* fix conformer large

* indent

* Revert "Fix dynamic batching" (#2173)

* update doctest skip

* Fix dynamic batching (#2174)

* Revert "Revert "Fix dynamic batching" (#2173)"

This reverts commit faa5e76.

* Update interfaces.py

* Update interfaces.py

* Update text_to_sequence.py

* fix w2v

* aishell

* cv

* ls transformer

* ls ssl

* switchboard

* timit

* precommit

* fix indent

* fix arg

* unit test sorting

* unittests

* remove if main

* Small fixes in averaging checkpoints (#2181)

* add ckpt avg unittest

* avoid hard-coding number of averages

* last fixes

* fix recipe test

* fix recipe test

* convert print into logger

* fix transducer recipe

* remove typing

* fix merge

* precommit

* Update LibriSpeech.csv

* update to new dynamic batching args

* Update unstable branch with new commits  (#2196)

* hyper branch/conf -former fixes

* remove ctc.py from doctest

* get back ctc.py

* remove doctest for torchaudio

* adapt gpt recipe

* adapt gpt recipe

* small follow up fix on openrir

* remove doc test (for now)

* fix issue greedy search

* docstring

* pre-commit

* Fix issues unstable (#2216)

Thank you @Adel-Moumen! I did the tests again and everything works now. As for your points on the recipe tests, I agree. We can eventually do that in another PR.

* Fix missing file / import in huggingface_transformers (#2224)

* init/imports

* comment

* add partial import

* wav2vec -> wav2vec2

* fix ci

* Text based HF (#2214)

* add mbart

* Add tristage scheduler

* Add mbart beam search

* Add IWLST recipes

* Add new models' inteference interface

* Add info of new models

* Add nllb scores

* Add new models' info

* Add test info IWSLT recipe

* Add test info IWSLT recipe

* add docstrings for S2STransformerBeamSearcher

* Update IWSLT recipes

* Update IWSLT recipes

* fix doctest

* add requirements

* add protobuf

* fix doctest

* small fixes

* Add protobuf install

* Minor reform

* Remove protobuf

* Fix docstings

* Fix docstrings

* minor reform

* remove labse

* change authorship

* remove comments

* minor changes

* change authorship

* Fix recipe test

* add info

* Update README.md

* Update README.md

* change recipe structure

---------

Co-authored-by: Mirco Ravanelli <[email protected]>
Co-authored-by: Adel Moumen <[email protected]>

* Neural LM Rescoring (#2187)

* baserescorerinterface

* add rescorers

* first attempt

* update code

* 1.57 wer

* update

* update code

* update code

* docstring example rnn

* updata loader

* docstring example

* tests

* docstring example

* update

* tmpdir

* change path

* update doc

* docstring

* docstring args

* doctest

* fix docstring example

* unnittest

* interface

* yamls update

* full_infernece tests

* model link

* readme

* yaml/inference tests

* update res

* fix wav2vec with wav2vec2

---------

Co-authored-by: Mirco Ravanelli <[email protected]>

* Add wrappers for Encodec and Vocos vocoders (#2231)

* Add wrappers for Encodec and Vocos from Huggingface

* Encodec: Add a comment

* Encodec/Vocos: Add examples, restructure, fix masks

* Vocos: Add a comment about the open pull request

* Encodec/Vocos: Add the ability to customize save_path, fix a log message

* Encodec/Vocos: Cosmetic changes

* Vocos: Cosmetic changes

* Encodec/Vocos: Remove the mandatory Vocos requirement

* Vocos: Remove vocos from __init__.py

* fix init

* Vocos: Add a check for vocos in conftest.py

* Vocos/Encodec: Update documentation, add bandwidth control

* Fix old path in conftest.py

* Cosmetic changes

* Encodec/Vocos: Add support for embedding vectors

* Encodec: Update example

* Encodec/Vocodec: Add automatic reshaping, minor cosmetic changes

---------

Co-authored-by: flexthink <[email protected]>
Co-authored-by: Mirco Ravanelli <[email protected]>

* Semantically-Aligned Multimodal Utterance-level (SAMU) pre-training (#2223)

* add mbart

* Add tristage scheduler

* Add mbart beam search

* Add IWLST recipes

* Add new models' inteference interface

* Add info of new models

* Add nllb scores

* Add new models' info

* Add test info IWSLT recipe

* Add test info IWSLT recipe

* add docstrings for S2STransformerBeamSearcher

* Update IWSLT recipes

* Update IWSLT recipes

* fix doctest

* add requirements

* add protobuf

* fix doctest

* small fixes

* Add protobuf install

* Minor reform

* Remove protobuf

* Fix docstings

* Fix docstrings

* minor reform

* remove labse

* Add attention pooling

* Add labse

* Add info about SAMU

* add iwslt recipes with samu

* fix recipe test

* fix comments

* fix recipe test

* change recipe structure

* fix test recipe

* Add new recipes

* minor doctest change

* minor doctest change

* small changes

* add dropbox links

---------

Co-authored-by: Mirco Ravanelli <[email protected]>

* fix norm (#2237)

* Discrete SSL (#2233)

* clustering training recipies for LibriSpeech for different SSL model

* add Discrete Hubert Model

* load from HF, fix minor issues

* fix hyper-param value

* fix precommit

* fix flake8

* fix batch_size and n_clus values in hyperparams

* fix typos

* fix typo and some cleaning

* fix precommit

* fix device incompatibility and memroty issue

* use fit instead of partial fit

* add README file

* add test recipies

* remove unused fields from hparams

* fix precommmit-yamllint - extra whitespace

* add docstring for load_kmeans for Discrete_hubert.py

* add discrete wavlm, wav2vec

* avoid docstring testing for discrete_ssl models

* fix docstring failed issue

* add discrete_interface to conftest.py

* fix precommit

* Fixes for Encodec (#2240)

* Add wrappers for Encodec and Vocos from Huggingface

* Encodec: Add a comment

* Encodec/Vocos: Add examples, restructure, fix masks

* Vocos: Add a comment about the open pull request

* Encodec/Vocos: Add the ability to customize save_path, fix a log message

* Encodec/Vocos: Cosmetic changes

* Vocos: Cosmetic changes

* Encodec/Vocos: Remove the mandatory Vocos requirement

* Vocos: Remove vocos from __init__.py

* fix init

* Vocos: Add a check for vocos in conftest.py

* Vocos/Encodec: Update documentation, add bandwidth control

* Fix old path in conftest.py

* Cosmetic changes

* Encodec/Vocos: Add support for embedding vectors

* Encodec: Update example

* Encodec/Vocodec: Add automatic reshaping, minor cosmetic changes

* Encodec: Decoupled token extraction, fixed CPU/GPU issues

* Encodec: Add renormalization

---------

Co-authored-by: flexthink <[email protected]>
Co-authored-by: Mirco Ravanelli <[email protected]>

* Refactoring of the 'fit_batch' function (#2010)

* add dataclass

* turn False

* remove valid_step

* update core.py

* update core.py

* update core.py

* precommit

* self.autocast + GradScaler enabled

* freeze opt

* naming

* update core.py

* comments

* example transducer conformer

* update core.py

* small changes

* naming + skip_grad_nans

* doc

* check

* support cpu training

* precision + doctrsting

* name

* change w2v

* restore ckpt

* remove file

* remove casting

* tests

* whisper + fix tests

* seq2seq ls

* update transducer / transformer

* remove on_optimizers_step_end + comments

* update check yaml

* remove default arg

* add precision in yamls

* add precision inside of the yamls

* ckpt and scaler

* run_opt outside brain + test

* several recipe updates

* improve w2v fit_batch fn

* add arg

* update name

* timit

* context manager

* on_fit_batch_start

* update CV

* should_step with noam

* add flag precision

* naming

* aishell

* aishell

* update recipes

* so many recipes 0.0

* update recipes

* last recipes

* zero_grad

* fix grad_accumulation_factor

* update recipes

* update auto_mix_prec flag

* remove opt flag test

* librispeech

* cv ssl

* audio mnist / realm

* voicebank

* fix rescuespeech

* fix lr annealing

* libritts

* multiwoz

* slurp nlu

* should_step

* update yamls

* update yaml

* update batch smpler tedlium

* remove fit batch

* precision flag

* update sampler

* add precision inside of the yamls

* run_opt outside brain + test

* fix auto_mix_prec flag

* docstring

* grad acc

* failing test

* update unittests

* update jarod's pr

* fix removed avg_checkpoint param

* update path

* fix some recipe tests

* update samu recipe

* fix hifigan/IWSLT

* tedlium

---------

Co-authored-by: Mirco Ravanelli <[email protected]>

* Refactor Augmentation (#2206)

* update

* update

* change folder

* remove unnecesary file

* update folder structure

* add noise, add rev

* augmenter refactor

* refactor augment + example in templace

* fix tests + linters

* address comments

* supporting variable-length augmentations in augmenter (e.g., speed change)

* lib refactor (splitting time and freq augmentations)

* fine tune freq drop

* refactor of specaugment (freq-domain) - part 1

* converted specaument (freq domain)

* refactor random shift

* implemented cutcat, swap, and random selection

* extended unittests + small fixes

* improvements and fixes in augment

* plugged feature augmentation + various fixes and improvements

* add sum_batch noise (similat to babble) + various fixes

* add drop bit resolution

* added coded augmentation

* added more unittests

* restore all augmentations

* making AddReveb more similar to AddNoise

* fix device mismatch + fix last batch management

* add workes to speed up AddNoise and AddRev

* improve comments in template yaml

* speed up template (sorting dev and test)

* extend augmenter by adding activation provability

* implemented enable augmentation flag (useful of hparam tuning) + other improvements

* plugged coded augment

* fixed coded augment

* remove old files

* fix integration test

* remove knowledge distill TIMIT reicpes. Too many yaml files to maintain

* convert TIMIT

* fix recipe

* converted templates using EnvCorr

* converted voxceleb

* converted GSC + fixes on voxceleb

* convrted UrbanSound8k

* converted voicebank

* converted other recipes

* converted CommonLanguage, VoxLingua, timers-and-such

* converted all recipes using envcorr

* CommonVoice

* REAL-M

* Aishell1Mix

* LibriMix

* converted all recipes!

* fix linters - part1

* fix linters - part2

* add a note in the template regarding augmentation

* fix docstring tests

* fix yamls

* remove coded tests from docstring

* revised coded tests

* fix identation in codec.py

* try to fix doc issue

* revise lib header in codec.oy

* fix doc

* fix doc attempt

* rename sections

* fix doc

* fix (most) recipe tests

* fix other recipe tests

* address comments

* fix yaml

* fix

* convert recipe

* fix recipes

* fix aug in rescoring recipes

* Delete tmpdir_vocoder directory

* Refactor Inference (files and folders) (#2252)

* refactor inference files and folders

* fix some tests

* fix some tests

* fix doctest

* import lib

* small fixes

* Fix beam search (#2253)

* fix starting pos prefix_length

* block path ctc + fix default value to the old one

* fix issue with score being -inf

* remoev print

* precommit

* Fix ctc beam search (#2263)

* fix logprobs / space_token / warnings

* fix space_token

* pre-commit

* space_token

* simplify parameters

* simplify yamls

* remove comma

* update beam search

* fix vocab/str (#2265)

* Fix blank index ctc (#2266)

* update blank_index

* whisper

* revert change

* mistake

* Cv unstable merge (#2254)

* add fr preproccesing to Common_voice_prepare.py

* add CV , CTC, new languages

* fix precommit and test

* add transducer recipie

* add transformer recipies

* update augmentation of CTC recipies

* update seq-to-seq recipies

* fix whisper HF interface bug. (return str insted of list)

* fix recipe tests

* add fr preproccesing to Common_voice_prepare.py

* add CV , CTC, new languages

* fix precommit and test

* add transducer recipie

* add transformer recipies

* update augmentation of CTC recipies

* update seq-to-seq recipies

* fix whisper HF interface bug. (return str insted of list)

* fix recipe tests

* modify beamsearch for CTC: ar.es.pt and zh-CN

* fix interface conflict

* fix transducer interface bug

---------

Co-authored-by: Mirco Ravanelli <[email protected]>

* Add warnings and fix numba (#2271)

* upperbound torch/tochaudio + remove opt dependancy

* add back automix/bf flags

* linters

* oops

* transformers back

* test requirements

* Fix Bug: CommonVoice Transformer Bug loading correct optimizer (#2278)

* fix trnsfrm bug to load correct opt:adam vs sgd

* add  data_root to the path of common_voice_prepare.py

* add epoch/_counter pretrainer to fr and it recepie

* revert releative path change

* fix opt bug without the need to add epoch_ckpt

* add log and delete launch file

* update the log message

* update WeightedSSLModel (#2272)

* update WeightedSSLModel

* requirements.txt

* fix pre-commit

* Sg/dac (#2246)

* introducing DAC

* lint errors

* black

* documenttion

* remove unused init file

* Fixing tests

* More doc strings

* More doc strings

* PR review

* PR review

* PR review

* Update dac.py

* Update dac.py

* Update dac.py

* make doctests smaller to avoid memory issues in CI

* even smaller tests

---------

Co-authored-by: Shubham Gupta <[email protected]>
Co-authored-by: Mirco Ravanelli <[email protected]>

* add quantization recipies fro IEMCAP, CV, LibriSpeech and LJSpeech (#2255)

* add quantization recipies fro IEMCAP, CV, LibriSpeech and LJSpeech

* update discrete_ssl models

* add iemocap_prepare to main folder + add test

* ix test for iemocap

* fik typos

* fix test recepies,  minor dormat editting

* fix typo in coomonvoice.csv

* fix typo in yaml file

* fix doctests (those that we do not run in the CI)

---------

Co-authored-by: Mirco Ravanelli <[email protected]>

* change emdedding type from long to float to vaoid getting al zeros embedding (#2292)

* Update CVSS (#2285)

* Update CVSS

* Update train_fr-en.yaml

* Update train_fr-en.yaml

* Update HF interface (#2293)

* RNN Tranducer Numba Loss: Add FP16 and BF16 support (code from Samsung AI Cambridge) (#2296)

* Make lobes use fp32 when AMP is active (#2295)

* Added utils.autocast with a fwd_default_precision function

* Decorate all lobes to require float32 precision in AMP

* Fix trailing space in docstring

* Less confusing doc for fwd_default_precision

* Be explicit that only fp inputs are affected by fwd_default_precision

* Typo in docstring

* Remove dtype annotation that is broken for some reason

* Precommit checks will be the end of me

* Fix tests

* Add docstring to precision wrapper function

* Fix style check again..

* adding support for fp16 transducer loss numba

* adding support for fp16 transducer loss numba

* fix fp16 transducer recipe

* add note on half precision

---------

Co-authored-by: asu <[email protected]>
Co-authored-by: Titouan Parcollet/Embedded AI /SRUK/Engineer/Samsung Electronics <[email protected]>
Co-authored-by: Mirco Ravanelli <[email protected]>

* Fix recipe tests for TransformerASR (#2282)

* fix position embedding (#2283)

* fix position embedding

* use speechbrain internal postional encoding and generate mask from sequence lengths

* call mask function from core for tacotron

* minor fix

* fix device

* reduce training epochs

* update links

---------

Co-authored-by: Mirco Ravanelli <[email protected]>

* Gradscaler flags (#2281)

* add flags for gradscaler

* add check_loss_isfinite

* update dict

* typo

* remove default

* better message

* fix pre-commit

* remove checks

* remove new arguments

---------

Co-authored-by: Mirco Ravanelli <[email protected]>

* add llama2 recipies (#2299)

* add llama2 recipies

* fix symbolic links

* fix  bug

* remove unneccary input in docstring

* fix typo

* cleaning llama2 recepies

* update readme

* update interface and add licence to readme

* fic doc string

* fix precommit

* fix extra-dependency

* remove  commented lines

* inter epoch checkpoint

* minor fixes

* add extra req info in llama.py

* fix linters

---------

Co-authored-by: Mirco Ravanelli <[email protected]>

* small fixes

* make all recipes cpu-compliant + make recipe tests passing on both cpu and gpu

* fix some broken links

* remove link to private HF repo

* remove link to private HF repo

* fix libritts recipe test

* fix ljspeech recipe test

* Streamable Conformer-Transducer ASR model for LibriSpeech (#2140)

* Introduce DCT+DCConv logic

* DDP fix?

* Batch of changes and things brought back

* Streaming fixes (successfully trains)

* WIP streaming code

* WIP functional streaming code

* Fix left context

* Fix formatting

* Cleanups and docs in streaming utils

* Better comment hparams, change seed back to orig, improve naming

* uncomment averaging stuff; it was some ipython issue

* Remove pin_memory as it was not beneficial

* More cleanups, comments on context stuff

* More comments and TODOs

* encode_streaming docstring

* Dirty TransducerBeamSearcher change for streaming GS

* Fix precommit

* Fix encoders that do not support chunk_size

* Pre-commit again

* Make chunk_size type consistent

* Fix formatting of doctest in split_wav_lens

* Remove outdated TODO

* Add hasattr streaming to retain model backcompat

* Cleanup doc and naming for transducer_greedy_decode

* Cite paper for chunked attention

* Remove lost comment

* Update comment in self-attention

* Don't apply masked fill fix in the non-bool mask case

* Added TODO README update

* Revert change to custom_tgt_module; patching model instead

* Remove added entry in README

* Fix streaming conformer conv mismatch

* More conformer conv adjustments

* Adjust context size

* Remove outdated comment

* Fixed causal conformer decoder

* Fix linting

* Gate `custom_tgt_module` creation behind the presence of decoder layers

* Re-enable checkpoint averaging

* Change averaged ckpt count to 10

* Add new model results to README

* WIP refactor: Introduce DCTConfig dataclass

* Improved notice in README

* Formatting and linting fixes

* Attempt at fixing circular import?

* utils can't depend on core it seems; move dct

* Whoops, missed file

* Add DCT test, fix issues

* Remove now obsolete yaml variables for streaming

* Formatting

* Add dummy dct_config parameter to keep unsupported encoders working

* Linting fix

* Fix typo

* Add note on runtime autocast accuracy

* Fix very bad typo from refactor in YAML

* Fix hasattr streaming check

* Remove legacy comment

* Fix left context size calculation in new mask code

* Fix causal models in TransformerASR

* Remove comment on high-level inference code

* YAML formatting + commenting dynchunktrain stuff

* Remove outdated comment about DCConv left contexts

* Remove commented out debug prints from TransformerASR

* Move DCT into utils again

* Rename all(?) mentions of DCT to explicit dynamic chunk training

* Clarify padding logic

* Remove now-useless _do_conv, fix horrible formatting

* Slightly fix formatting further

* Add docstrings to forward_streaming methods

* Add a reference on Dynamic Chunk Training

* Rework conformer docstring docs

* Update conformer author list, fix doc formatting for authors

* Fix trailing whitespace in conformer

* Improved comments in Conformer.forward

* Added random dynchunktrain sampler example

* More explicit names for mask functions in TransformerASR

* Added docstring example on encode_streaming

* Pre-commit fix

* Fix typo in conformer

* Initial streaming integration test

* Precommit fix

* Fix indent in YAML

* More consistent spelling in streaming integration test

* Update CommonVoice.csv

* Add KenLM n-gram training recepie (#2304)

* add kenlm training

* fix precommit

* update readmefile with new result

* fix pre-commit

* fix typo

* fix commit reviews

* fix bug in testing

* add docstring and fix indentation

* fix bug in ASR interface

* change encoderasr interface to support ctc beam

* add suppourt fro kenlm in enoderasr interface

* fix typo

* little changes in REAMDE files to improve clarity)

* use binaries sources in bashrc

* fix trailing-whitespace

---------

Co-authored-by: Mirco Ravanelli <[email protected]>

* Create Performance file (automatically) (#2314)

* add performance readme builder

* update recipe csv files

* update README files

* add not in prerelease test

* added performance.md

* fix linters

* update info in README

* Llama2 interface bug (#2318)

* fix llama2 interface bug

* fix minor bug

* update multiwox.csv with correct db and HF link

* New README file (#2315)

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Optimize masked Dynamic Chunk Convolution (#2308)

* Reorganized some conformer convolution module to be faster

* Completely get rid of the list of slices in the conformer conv module

* Fix linter check

* Remove unused variable

* More unused variables..

* Remove unused import

* Add conformer streaming code path test

* Fix test formatting

* small fixes in tests

* Update RNNLM.yaml

* BayesSpeech (#2326)

* Create train_bayesspeech.py

* Create bayesspeech.yaml

* Update README.md

* Update LibriSpeech.csv

* add extra-req

---------

Co-authored-by: Mirco Ravanelli <[email protected]>

* adding new controllable exp scheduler

* adding new controllable exp scheduler

* update performance file

* Update PERFORMANCE.md

* Update README.md

---------

Co-authored-by: mhn226 <[email protected]>
Co-authored-by: Adel Moumen <[email protected]>
Co-authored-by: Adel Moumen <[email protected]>
Co-authored-by: Ha Nguyen <[email protected]>
Co-authored-by: flexthink <[email protected]>
Co-authored-by: flexthink <[email protected]>
Co-authored-by: Pooneh Mousavi <[email protected]>
Co-authored-by: shubham-gupta-30 <[email protected]>
Co-authored-by: Shubham Gupta <[email protected]>
Co-authored-by: Parcollet Titouan <[email protected]>
Co-authored-by: asu <[email protected]>
Co-authored-by: Titouan Parcollet/Embedded AI /SRUK/Engineer/Samsung Electronics <[email protected]>
Co-authored-by: Luca Della Libera <[email protected]>
Co-authored-by: Yingzhi WANG <[email protected]>
Co-authored-by: BenoitWang <[email protected]>