Skip to content

Latest commit

 

History

History
 
 

Folders and files

NameName
Last commit message
Last commit date

parent directory

..
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Techmo Sarmata ASR - python client.

Stream audio from wave or microphone to the ASR engine and prints all reponses:

  • status,
  • transcription,
  • semantic interpretation,
  • confidence, (when applicable).

Dependencies (listed in requirements.txt):

  • grpcio
  • grpcio-tools
  • protobuf
  • pydub
  • pyaudio

pyaudio requires development packages of Python and PortAudio:

sudo apt-get install python3.5-dev portaudio19-dev

To regenerate sources from .proto, run (virtualenv must be enabled before):

./make_proto.sh

This might be required when using other gRPC or Protocol Buffers version.

To run:

  • Use python 3.5 with virtual environment and install required packages.
virtualenv -p python3.5 venv
source venv/bin/activate
pip install -r requirements.txt

Run (wave):

python sarmata_client.py --service-address ip:port --grammar grammar_file.xml --wave-path audio.wav

Run (microphone):

python sarmata_client.py --service-address ip:port --grammar grammar_file.xml --mic

There is a possibility to define grammar before recognitions:

python sarmata_client.py --service-address ip:port --define-grammar --grammar grammar_file.xml --grammar-name simple_grammar

and then run recognition:

python sarmata_client.py --service-address ip:port --grammar-name simple_grammar --wave-path audio.wav

or

python sarmata_client.py --service-address ip:port --grammar-name simple_grammar --mic

To delete a cached grammar, run with --define-grammar and --grammar-name options only (no --grammar option):

python sarmata_client.py --service-address ip:port --define-grammar --grammar-name grammar_to_delete

Usage:

usage: sarmata_client.py [-h] --service-address ADDRESS [--define-grammar]
                         [--grammar-name GRAMMAR_NAME] [--grammar GRAMMAR]
                         [--wave-path WAVE] [--mic]
                         [--service-settings SERVICE_SETTINGS]
                         [--max-alternatives MAX_ALTERNATIVES]
                         [--no-match-threshold NO_MATCH_THRESHOLD]
                         [--speech-complete-timeout SPEECH_COMPLETE_TIMEOUT]
                         [--speech-incomplete-timeout SPEECH_INCOMPLETE_TIMEOUT]
                         [--no-input-timeout NO_INPUT_TIMEOUT]
                         [--recognition-timeout RECOGNITION_TIMEOUT]

optional arguments:
  -h, --help            show this help message and exit
  --service-address ADDRESS
                        IP address and port (address:port) of a service the
                        client will connect to.
  --define-grammar      Defines a new grammar to be cached by the service
                        under ID given by `--grammar-name` option from data
                        given by `--grammar` option.
  --grammar-name GRAMMAR_NAME
                        Name (ID) of the grammar in the service's grammar
                        cache.
  --grammar GRAMMAR     SRGS grammar file (ABNF or XML format accepted).
  --wave-path WAVE      Path to wave file with speech to be recognized. Should
                        be mono, 8kHz or 16kHz.
  --mic                 Use microphone as an audio source (instead of wave
                        file).
  --service-settings SERVICE_SETTINGS
                        Semicolon-separated list of key=value pairs defining
                        settings to be sent to service via gRPC request.
  --max-alternatives MAX_ALTERNATIVES
                        Maximum number of recognition hypotheses to be
                        returned.
  --no-match-threshold NO_MATCH_THRESHOLD
                        Confidence acceptance threshold.
  --speech-complete-timeout SPEECH_COMPLETE_TIMEOUT
                        MRCPv2 Speech-Complete-Timeout in milliseconds.
  --speech-incomplete-timeout SPEECH_INCOMPLETE_TIMEOUT
                        MRCPv2 Speech-Incomplete-Timeout in milliseconds.
  --no-input-timeout NO_INPUT_TIMEOUT
                        MRCPv2 No-Input-Timeout in milliseconds.
  --recognition-timeout RECOGNITION_TIMEOUT
                        MRCPv2 Recognition-Timeout in milliseconds.