jjwang/kaldi-audio-server

An Audio Server based on Kaldi

This project shows how to build an automatic speech recognition (ASR) service on top of Kaldi, a widely used speech recognition toolkit.

Currently, two server-side programs are provided: audio-server-online2-nnet2 (out-of-date) and audio-server-online2-nnet3. They are built on Kaldi's online2 decoding with nnet2 and nnet3 (chain) models, and can serve multiple TCP connections at the same time. More features that may be useful for a practical system, as well as a client-side demo, will be added later.

To start, install Kaldi from the main Kaldi repository on GitHub; running make ext may also be required. Then build audio-server-online2-nnet2 (out-of-date) and audio-server-online2-nnet3 against Kaldi.

To test, you can use any online2 nnet2 or nnet3 (chain) model trained with Kaldi, such as the Api.ai Kaldi speech recognition model or the Librispeech TDNN models with silence probability.

A simple example using an nnet3 model

  • . ./path.sh
  • nohup audio-server-online2-nnet3 --acoustic-scale=1.0 --beam=11.0 --frame-subsampling-factor=3 --lattice-beam=4.0 --max-active=5000 --config=exp/api.ai-model/conf/online.conf --word-symbol-table=exp/api.ai-model/words.txt data/lang_nosp/phones/align_lexicon.int exp/api.ai-model/final.mdl exp/api.ai-model/HCLG.fst &
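
Because the server is started in the background with nohup, a wrong or missing path is easy to overlook. The following is a minimal sketch of a pre-flight check, assuming bash. The helper name check_model_files is hypothetical and is not part of this repository or of Kaldi; the paths in the usage comment are the ones from the nnet3 command above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of this repository or of Kaldi):
# print every path that is missing or unreadable to stderr and
# return non-zero if at least one path failed the check.
check_model_files() {
  local missing=0 f
  for f in "$@"; do
    if [ ! -r "$f" ]; then
      echo "missing: $f" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage with the files from the nnet3 example (commented out so that
# sourcing this file has no side effects):
# check_model_files \
#   exp/api.ai-model/conf/online.conf \
#   exp/api.ai-model/words.txt \
#   data/lang_nosp/phones/align_lexicon.int \
#   exp/api.ai-model/final.mdl \
#   exp/api.ai-model/HCLG.fst || exit 1
```

The check only confirms that the files are readable; it does not verify that the model, graph and configuration actually belong together.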

A simple example using an nnet2 model (out-of-date)

  • nohup audio-server-online2-nnet2 --config=exp/nnet2_online/nnet_ms_a_online/conf/online_nnet2_decoding.conf --word-symbol-table=data/lang/words.txt data/lang/phones/align_lexicon.int exp/nnet2_online/nnet_ms_a_online/final.mdl exp/nnet2_online/nnet_ms_a_online/graph_test/HCLG.fst &

Test

  • online-audio-client localhost 5010 'scp:data/test_clean_example/wav.scp'
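
A server launched in the background may not accept connections immediately, so a client started right afterwards can fail to connect. The following is a minimal sketch of a wait loop, assuming bash (it relies on bash's /dev/tcp redirection) and the host and port from the test command above. The helper name wait_for_port is hypothetical and is not part of this repository or of Kaldi.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of this repository or of Kaldi):
# try to open a TCP connection to host:port, retrying once per second,
# and return 0 on the first success or 1 when all attempts are used up.
wait_for_port() {
  local host=$1 port=$2 attempts=${3:-30} i
  for ((i = 0; i < attempts; i++)); do
    # bash opens a TCP connection for /dev/tcp/HOST/PORT redirections;
    # the subshell exits right away, which closes the connection again.
    if (exec 3<>"/dev/tcp/$host/$port") 2>/dev/null; then
      return 0
    fi
    sleep 1
  done
  return 1
}

# Usage (commented out so that sourcing this file has no side effects):
# wait_for_port localhost 5010 60 &&
#   online-audio-client localhost 5010 'scp:data/test_clean_example/wav.scp'
```

Note that the probe opens and closes a real connection, so the server may see it as a client that sent no audio.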


