A Survey of Spoken Dialogue Models (60 pages)
streaming
duplex
speech
moshi
speech-representation
encodec
gpt-4o
speech-language-model
spoken-dialogue-models
modal-alignment
interaction
mini-omni
llama-omni
wavtokenizer
Updated Nov 28, 2024