Q-ASR is a quantization scheme for ASR models that does not require any training/validation data and only uses integer-only computation. Please also check our paper: paper link
You can find detailed installation guides from the NeMo repo.
- Create a Conda virtual environment
conda create -n qasr python=3.8
conda activate qasr
- Install NeMo (Q-ASR)
pip install Cython
git clone https://github.com/kssteven418/Q-ASR.git
cd Q-ASR
./reinstall.sh
Q-ASR is evaluated on the Librispeech dataset, which can be downloaded and preprocessed using the script provided by NeMo.
You can find the script in Q-ASR/scripts/get_librispeech_data.py
.
Run the script using the following command.
# in Q-ASR/scripts
python get_librispeech_data.py --data_sets {dataset} --data_root {DIR}
{datasets}
can be one of the following: {dev_clean, dev_other, train_clean_100, train_clean_360, train_other_500, test_clean, test_other}
.
You can also concatenate multiple items with comma(,) to process multiple datasets (e.g., dev_clean,dev_other
),
or use ALL
to process all.
After processing dev_clean
, for example, the preprocessed datasets will be stored at {DIR}/LibriSpeech/dev-clean-processed
.
Additionally, a manifest file is generated at {DIR}/dev_clean.json
.
This is a json file that links the preprocessed audio files in {DIR}/LibriSpeech/dev-clean-processed
with the corresponding text labels.
Therefore, make sure not to move the preprocessed audio files to another directory unless you modify the manifest file accordingly
(otherwise, the manifest file will not locate the audio files).
Q-ASR consists of 2 steps: (1) Synthetic data generation, and (2) Calibration and evaluation, each of which can be run with the python scripts
synthesize.py
and inference.py
in Q-ASR/examples/asr/quantization
.
Run the following command for synthetic data generation.
# in Q-ASR/examples/asr/quantization
python synthesize.py --asr_model {model_name} --dataset {path_to_manifest} \
--num_batch {num_batch} --batch_size {batch_size} \
--seq_len {seq_len} --train_iter {train_iter} --lr {lr} \
--dump_path {dump_path} --dump_prefix {dump_prefix}
For instance,
python synthesize.py --asr_model QuartzNet15x5Base-En --dataset {DIR}/dev_clean.json \
--num_batch 20 --batch_size 8 \
--seq_len 500 --train_iter 200 --lr 0.05 \
--dump_path dump --dump_prefix quartznet
Note that {DIR}/dev_clean.json
is the manifest file (generated from the preprocessing step) for the target evaluation dataset.
Please use the flag -h
to see the details for the input arguments.
The resulting dataset is stored at {dump_path}
and will be loaded in the following calibration/evaluation step.
After generating the synthetic data, run the following command to calibrate and evaluate the quantized model.
# in Q-ASR/examples/asr/quantization
python inference.py --asr_model {model_name} --dataset {path_to_manifest} \
--load {load} --weight_bit {wb} --act_bit {ab} --percentile {p}
For instance,
python inference.py --asr_model QuartzNet15x5Base-En --dataset {DIR}/dev_clean.json \
--load dump/quartznet_nb20_iter200_lr0.050.pkl \
--weight_bit 6 --act_bit 6 --percentile 99.996
Similarly, {DIR}/dev_clean.json
is the manifest file (generated from the preprocessing step) for the target evaluation dataset.
Please use the flag -h
to see the details for the input arguments.
You can also use --dynamic
flag to perform dynamic quantization, instead of using --load
flag to load the synthetic dataset.