State-of-the-art discrete acoustic codec models at 40 tokens per second for audio language modeling
semantic
text-to-speech
codec
acoustic
dac
speech-representation
audio-representation
encodec
soundstream
music-representation-learning
gpt4o
speech-language-model
Updated Dec 2, 2024 - Python