TTS Generation Web UI (Bark, MusicGen + AudioGen, Tortoise, RVC, Vocos, Demucs, SeamlessM4T, MAGNet, StyleTTS2, MMS, Stable Audio, Mars5, F5-TTS, ParlerTTS)
Updated Nov 23, 2024 - TypeScript
(Windows/Linux/macOS) Local WebUI with neural network models (Text, Image, Video, 3D, Audio) in Python (Gradio interface). Translated into 3 languages.
ONNX-compatible Fast SeamlessM4T—Massively Multilingual & Multimodal Machine Translation
SeamlessM4t-Translator: uses Facebook's SeamlessM4T model as the backend to handle speech-to-speech (S2ST), speech-to-text (S2TT), text-to-speech (T2ST), and text-to-text (T2TT) translation queries.
EchoSight is a tool that helps visually impaired individuals by audibly describing images taken with a Raspberry Pi Camera or inputted via image path or URL across different operating systems.
Turn any LLM into Jarvis
Automatic speech recognition (ASR)
How I used SeamlessM4T large to reach the top 5 of the Mozilla Common Voice competition hosted on Zindi
Just run it as is. Note: after installing the package, remember to restart the kernel.
Translation from one language to another without a speech intermediate