SpatialCodec: Neural Spatial Speech Coding

This repository contains the inference models for our paper, SpatialCodec: Neural Spatial Speech Coding: https://arxiv.org/abs/2309.07432. Audio demos are available at https://xzwy.github.io/SpatialCodecDemo/.

General Pipeline

Inference SpatialCodec + Sub-band Codec

python ./SpatialCodec/mimo_inference.py --input_dir $input_dir --ref_ckpt_dir $ref_ckpt_dir --spatial_ckpt_dir $spatial_ckpt_dir --output_dir $output_dir
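
The command above expects four shell variables to be set beforehand. The following is a minimal sketch of one way to do that, assuming you run it from the repository root. Every path is a placeholder chosen for illustration and is not shipped with the repository, and the helper name `run_spatialcodec` is hypothetical.

```shell
# All paths below are placeholders (assumptions); point them at your own data and checkpoints.
input_dir="./data/input"                  # directory holding the input audio (assumed layout)
ref_ckpt_dir="./checkpoints/ref"          # checkpoint directory passed to --ref_ckpt_dir
spatial_ckpt_dir="./checkpoints/spatial"  # checkpoint directory passed to --spatial_ckpt_dir
output_dir="./results/spatialcodec"       # directory passed to --output_dir

# Hypothetical wrapper around the documented command; it runs only when called.
run_spatialcodec() {
  python ./SpatialCodec/mimo_inference.py \
    --input_dir "$input_dir" \
    --ref_ckpt_dir "$ref_ckpt_dir" \
    --spatial_ckpt_dir "$spatial_ckpt_dir" \
    --output_dir "$output_dir"
}
```

After adjusting the paths, calling `run_spatialcodec` issues the same command as shown above, with each variable quoted so that paths containing spaces are passed intact.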

Inference MIMO E2E

python ./MIMO_SPATIAL_CODEC/mimo_inference.py --input_dir $input_dir --ckpt_dir $ckpt_dir --output_dir $output_dir
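
The MIMO E2E command takes a single checkpoint directory. As a sketch only, it can be wrapped in a small function that takes the three directories as positional arguments. The function name `run_mimo_e2e` and its argument order are assumptions made for illustration and are not part of the repository.

```shell
# Hypothetical helper: run_mimo_e2e <input_dir> <ckpt_dir> <output_dir>
run_mimo_e2e() {
  # Refuse to run unless exactly three directories are given.
  if [ "$#" -ne 3 ]; then
    echo "usage: run_mimo_e2e <input_dir> <ckpt_dir> <output_dir>" >&2
    return 2
  fi
  # Forward the arguments to the documented inference script.
  python ./MIMO_SPATIAL_CODEC/mimo_inference.py \
    --input_dir "$1" \
    --ckpt_dir "$2" \
    --output_dir "$3"
}
```

For example, `run_mimo_e2e ./data/input ./checkpoints/mimo_e2e ./results/mimo_e2e`, run from the repository root, would invoke the script with those placeholder paths.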

Checkpoints

The checkpoints can be accessed here: https://drive.google.com/drive/folders/1iHVpJj8HieIOAZYCUyYFihuFxeCara1u?usp=sharing

Citations

If you use SpatialCodec in your research, please consider citing:

@misc{xu2023spatialcodec,
      title={SpatialCodec: Neural Spatial Speech Coding}, 
      author={Zhongweiyang Xu and Yong Xu and Vinay Kothapally and Heming Wang and Muqiao Yang and Dong Yu},
      year={2023},
      eprint={2309.07432},
      archivePrefix={arXiv},
      primaryClass={cs.SD}
}
