Stars
litagin02 / Style-Bert-VITS2
Forked from fishaudio/Bert-VITS2
Style-Bert-VITS2: Bert-VITS2 with more controllable voice styles.
Repository for a research project on audio watermarking
Download a YouTube video (or supply your own) and generate dual-language subtitles with OpenAI Whisper and a translation API (GPT)
This repo is a pipeline of VITS finetuning for fast speaker adaptation TTS, and many-to-many voice conversion
[NO LONGER MAINTAINED] Command-line utility for auto-generating subtitles for any video file
Learn Python with Colaboratory (colab.research.google.com)
ManaTTS is the largest open Persian speech dataset with 86+ hours of transcribed audio. Includes data collection pipeline and tools. Suitable for Persian text-to-speech models.
Turn PDFs and EPUBs into audiobooks, subtitles or videos into dubbed videos (including translation), and more. For free. Pandrator uses local models, notably XTTS, including voice-cloning (instant,…
A program to dub non-English media with modern AI speech synthesis, diarization, and voice cloning!
Noise removal/reduction for audio files in Python. De-noising uses wavelets, with thresholding done by the VisuShrink technique
A neural word aligner based on multilingual BERT
A Telegram Bot that automatically reacts to posts in Telegram Channels, groups, and private messages, developed as a server-less application.✨
Fine-Tuning your VITS model using a pre-trained model
MARS5 speech model (TTS) from CAMB.AI
Modern spell checking library - accurate, fast, multi-language
Create different voices for the Espeak synthesizer. New version restored and improved, but the documentation has not yet been restored.
Talking Head (3D): A JavaScript class for real-time lip-sync using Ready Player Me full-body 3D avatars.
Data and code for grapheme-to-phoneme transducers in lots of languages
Next-generation TTS model using flow-matching and DiT, inspired by Stable Diffusion 3
Everything about note management. All in Zotero.
[CVPR 2022--Oral] Restormer: Efficient Transformer for High-Resolution Image Restoration. SOTA for motion deblurring, image deraining, denoising (Gaussian/real data), and defocus deblurring.
Document Image Enhancement with GANs - TPAMI journal
datagym-ru / tg_tqdm
Forked from ermakovpetr/tg_tqdm
Extension for tqdm progressbar in Telegram