LTX-2: Efficient Joint Audio-Visual Foundation Model
LTX-2 is an open-source audiovisual diffusion model that generates synchronized video and audio content using a dual-stream transformer architecture with cross-modal attention and classifier-free guidance.
- 29 authors
