Self-host the powerful Chatterbox TTS model. This server offers a user-friendly Web UI, flexible API endpoints (including OpenAI-compatible ones), predefined voices, voice cloning, and large audiobook-scale text processing. Runs with GPU acceleration on NVIDIA (CUDA) and AMD (ROCm), or on CPU.
Fine-tuning toolkit for Chatterbox TTS & Chatterbox TURBO models. Supports 23 languages with smart vocabulary extension. Features offline preprocessing, automatic VAD trimming, and voice cloning capabilities. Train custom TTS models with your own dataset in LJSpeech or file-based format.
Real-time conversational AI with voice cloning and emotion detection. Analyzes conversation context to deliver dramatically expressive responses using your cloned voice. Built with FastRTC and Chatterbox TTS for natural, emotionally-aware voice interactions.
I'm sorry, Dave. I'm afraid I can't let that spam call through. — Local AI answering service with voice cloning, call recording, and sub-second latency. Runs entirely on your GPU. No cloud. No per-minute billing. Just HAL.
A unified Text-to-Speech gateway combining multiple TTS providers (Kokoro ONNX, Chatterbox TTS, OpenAI Edge TTS) with a modern React frontend and production-ready Docker deployment.
Meet VoiceFlow 🎙️🔊, your production-ready microservices platform for all things AI speech! It's designed to make high-performance voice processing a breeze, letting you effortlessly transcribe audio to text and convert text into natural-sounding speech. 🚀
A Next-Generation Neural TTS Engine. High-quality, human-like voice synthesis with low latency and multi-language support. Built for developers and creators.
A fully local voice agent powered by DeepSeek R1, Faster Whisper, and Chatterbox TTS. Orchestrated with LangGraph for seamless voice-to-voice interaction without cloud dependencies.
A web API for speech-to-text (STT) and text-to-speech (TTS) that integrates with existing engines, supporting real-time audio streaming and modular engine selection.