The Open Source Voice Agent Platform
Orchestrate ultra-low latency AI pipelines for real-time conversations over WebRTC.
demo_stimm.mp4
🔊 Sound on! Watch the ultra-low latency (< 1s) in action.
A modular, real-time AI voice assistant platform built with Python (FastAPI) and Next.js. This project provides a flexible infrastructure for creating, managing, and interacting with voice agents using various LLM, TTS, and STT providers.
Features · Why Stimm? · Use Cases · How it Works · Quick Start · Docs
- Real-time Voice Interaction: Low-latency voice conversations using WebRTC and WebSocket transports.
- SIP Telephony Integration: Connect incoming phone calls to AI agents via SIP protocol.
- Modular AI Providers:
- LLM: Support for Groq, Mistral, OpenRouter, and local Llama.cpp.
- TTS: Deepgram, ElevenLabs, Async.ai, and local Kokoro.
- STT: Deepgram and local Whisper.
- Administrable RAG Configurations: Create and manage multiple RAG configurations with Qdrant and per‑agent knowledge bases.
- Agent Management: Admin interface to configure and manage multiple agents with different personalities and provider settings.
- Modern Frontend: Responsive web interface built with Next.js 16 and Tailwind CSS.
- Robust Infrastructure: Dockerized deployment with Traefik reverse proxy, PostgreSQL for data persistence, and Alembic for migrations.
- Voice Activity Detection: Integrated Silero VAD for accurate speech detection.
- Ultra-low latency: An optimized Silero VAD and LiveKit's real-time media pipeline keep response times short.
- Provider-agnostic (LLM, TTS, STT): choose any AI stack.
- Trimmable dependencies: Heavy build-time dependencies (like pyaudio) are optional, keeping the core image lean and secure.
- Lightweight core: Pure ONNX Runtime inference for fast installation.
- Scalable architecture: Docker, Traefik, Postgres, and a technical foundation designed for production deployment.
- Customer support voicebots: Handle common queries automatically.
- Interactive phone-based assistants (SIP): Connect traditional telephony to AI.
- Real-time agent demos: Perfect for AI research and prototyping.
- On-premise conversational agents: Deploy securely with AGPL-friendly terms.
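One way to picture the provider-agnostic design listed above is a small registry keyed by provider kind and name, from which each agent picks its own LLM, TTS, and STT. This is a minimal sketch under stated assumptions, not Stimm's actual code: `ProviderRegistry`, `AgentConfig`, `build_agent`, and the placeholder factories are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

# Hypothetical illustration only: none of these names come from the Stimm codebase.
Factory = Callable[[], object]


class ProviderRegistry:
    """Maps (kind, name) pairs such as ("llm", "groq") to factory callables."""

    def __init__(self) -> None:
        self._factories: Dict[Tuple[str, str], Factory] = {}

    def register(self, kind: str, name: str, factory: Factory) -> None:
        self._factories[(kind, name)] = factory

    def create(self, kind: str, name: str) -> object:
        try:
            return self._factories[(kind, name)]()
        except KeyError:
            raise ValueError(f"unknown {kind} provider: {name}") from None


@dataclass(frozen=True)
class AgentConfig:
    """Per-agent provider selection (field names are assumptions)."""

    llm: str
    tts: str
    stt: str


def build_agent(registry: ProviderRegistry, config: AgentConfig) -> Dict[str, object]:
    """Instantiate one provider of each kind for the given agent."""
    return {
        "llm": registry.create("llm", config.llm),
        "tts": registry.create("tts", config.tts),
        "stt": registry.create("stt", config.stt),
    }


registry = ProviderRegistry()
# Placeholder factories standing in for real provider clients.
registry.register("llm", "groq", lambda: "groq-llm-client")
registry.register("tts", "kokoro", lambda: "kokoro-tts-client")
registry.register("stt", "whisper", lambda: "whisper-stt-client")

agent = build_agent(registry, AgentConfig(llm="groq", tts="kokoro", stt="whisper"))
```

Swapping a provider then means changing one string in the agent's configuration rather than touching the code that uses it.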
graph LR
%% High Contrast Theme
classDef person fill:#FF007F,stroke:#fff,stroke-width:4px,color:white,font-weight:bold;
classDef transport fill:#00B0FF,stroke:#fff,stroke-width:2px,color:white,font-weight:bold;
classDef input fill:#7F00FF,stroke:#fff,stroke-width:2px,color:white,font-weight:bold;
classDef brain fill:#FFD700,stroke:#FF8C00,stroke-width:3px,color:black,font-weight:bold,stroke-dasharray: 5 5;
classDef output fill:#00E676,stroke:#fff,stroke-width:2px,color:black,font-weight:bold;
%% Multi-platform User
User([👤 📱 💻 📞 User]):::person
%% LiveKit Layer
subgraph "📡 LiveKit Infrastructure"
direction TB
Room[🔄 Real-time Room <br/> WebRTC / SIP]:::transport
end
%% Stimm Core Layer
subgraph "⚡ Stimm Core"
direction TB
Hear[👂 Hear & Transcribe]:::input
Think(🧠 Think & Retrieve):::brain
Speak[🗣️ Speak & Respond]:::output
end
%% Connections
User <==>|Audio Stream| Room
Room ==>|Raw Audio| Hear
Hear ==>|Text| Think
Think ==>|Text| Speak
Speak ==>|Synthesized Audio| Room
%% Link Styles
linkStyle default stroke-width:3px,fill:none,stroke:#666
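The Hear, Think, and Speak stages in the diagram can be read as three asynchronous functions chained together for each conversational turn. The following is a rough sketch of that data flow only; the type aliases, `handle_turn`, and the stub stages are hypothetical and stand in for whatever STT, LLM, and TTS providers are configured.

```python
import asyncio
from typing import Awaitable, Callable

# Hypothetical stage signatures; Stimm's real interfaces may differ.
Transcribe = Callable[[bytes], Awaitable[str]]   # Hear: raw audio -> text
Respond = Callable[[str], Awaitable[str]]        # Think: text -> reply text
Synthesize = Callable[[str], Awaitable[bytes]]   # Speak: reply text -> audio


async def handle_turn(
    audio: bytes, hear: Transcribe, think: Respond, speak: Synthesize
) -> bytes:
    """Run one conversational turn through the three stages in order."""
    text = await hear(audio)
    reply = await think(text)
    return await speak(reply)


# Stub stages so the sketch runs without any provider credentials.
async def fake_stt(audio: bytes) -> str:
    return audio.decode("utf-8")


async def fake_llm(text: str) -> str:
    return f"You said: {text}"


async def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")


result = asyncio.run(handle_turn(b"hello", fake_stt, fake_llm, fake_tts))
```

A low-latency implementation would likely stream partial results between stages instead of awaiting each one in full; the sketch runs them sequentially only to keep the flow visible.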
Get Stimm up and running in minutes:
# Clone the repository
git clone https://github.com/stimm-ai/stimm.git
cd stimm
# Set up environment (copies .env.example files)
chmod +x scripts/setup_env.sh
./scripts/setup_env.sh
# Start all services with Docker Compose
docker-compose up --build
Once the services are running, open your browser to:
- Frontend Admin: http://front.localhost/agent/admin (or http://localhost:3000/agent/admin)
- API Documentation: http://api.localhost/docs (or http://localhost:8001/docs)
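If you prefer to confirm from a script that both endpoints respond, a small check along these lines may help. It is a sketch only: the `ENDPOINTS` mapping reuses the localhost URLs listed above, while `is_up` and `check_all` are hypothetical helpers and not part of Stimm.

```python
import urllib.error
import urllib.request
from typing import Callable, Dict

# URLs taken from the list above; adjust them if you changed the default ports.
ENDPOINTS = {
    "frontend": "http://localhost:3000/agent/admin",
    "api_docs": "http://localhost:8001/docs",
}


def is_up(url: str, opener: Callable = urllib.request.urlopen, timeout: float = 3.0) -> bool:
    """Return True if the URL answers with an HTTP status below 400."""
    try:
        with opener(url, timeout=timeout) as response:
            return response.status < 400
    except (urllib.error.URLError, OSError):
        # Covers refused connections, timeouts, and HTTP error responses.
        return False


def check_all(opener: Callable = urllib.request.urlopen) -> Dict[str, bool]:
    """Probe every known endpoint and report which ones responded."""
    return {name: is_up(url, opener) for name, url in ENDPOINTS.items()}
```

Calling `check_all()` with no arguments probes the real endpoints; the `opener` parameter exists so the check can be exercised without a running stack.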
For detailed instructions, refer to the Full Documentation or check the guides below:
- Configuration: Configure providers (LLM, TTS, STT).
- Web Interface: Manage agents and RAG via the UI.
- SIP Integration: Connect phone numbers to your agents.
- Development: Set up a local environment.
For local development, see the Development Guide in the documentation.
# Start supporting services (PostgreSQL, Qdrant, Traefik, LiveKit, Redis, SIP)
docker compose up -d postgres qdrant traefik livekit redis sip
# Install Python dependencies (--all-extras already includes the audio extra used for local CLI audio/testing)
uv sync --group dev --group docs --all-extras
# Set up environment files and Python path (optional)
./scripts/setup_env.sh
# Run backend locally
uv run python -m src.main
# In another terminal, run frontend
cd src/front
npm install
npm run dev
Stimm follows a modern security-first approach using open-source SAST (Static Application Security Testing) and SCA (Software Composition Analysis) tools integrated into the development workflow.
- Semgrep: Multi-language security scanner for finding complex vulnerabilities.
- Bandit: Python-specific security scanner for common vulnerabilities.
- ESLint SonarJS: Advanced code quality and security rules for the React frontend, providing SonarQube-like analysis.
- pip-audit: Checks Python dependencies for known vulnerabilities.
- npm audit: Checks JavaScript dependencies for known vulnerabilities.
Developers can run the entire suite of security, quality, and formatting checks with a single command:
# Run all checks on all files in the repository
pre-commit run --all-files
This command runs Ruff (lint/format), Prettier, Bandit, Semgrep, pip-audit, npm audit, and frontend-security-lint in sequence. It is highly recommended to run this before submitting any pull request.
- Ruff: Extremely fast Python linter and formatter.
- ESLint & Prettier: Frontend linting and formatting.
We welcome contributions! Please read our Contributing Guide for details on how to submit pull requests, report issues, and our code of conduct.
By contributing, you agree to the Contributor License Agreement (CLA).
Stimm is open-source software licensed under the GNU Affero General Public License v3.0 (AGPL v3). See the LICENSE file for details.
Trademark Notice: The name "Stimm" and the Stimm logo are exclusive trademarks of the project maintainers and are not covered by the open-source license. Derivative works must remove the logo and change the name to avoid suggesting official affiliation.
Built with LiveKit
Stimm relies on LiveKit for high-performance real-time media transport (WebRTC).
Disclaimer: Stimm is an independent project and is not affiliated with, endorsed by, or sponsored by LiveKit.