A Survey of Spoken Dialogue Models (60 pages)
streaming duplex speech moshi speech-representation encodec gpt-4o speech-language-model spoken-dialogue-models modal-alignment intreaction mini-omni llama-omni wavtokenizer
Updated Nov 28, 2024