vits

TTS system

An end-to-end text-to-speech system that generates more natural-sounding audio than current two-stage models.

VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech

GitHub

7k stars

54 watching

1k forks

Language: Python

last commit: over 2 years ago

deep-learning, pytorch, speech-synthesis, text-to-speech, tts

jaywalnut310.github.io/vits-demo/index.html

Related projects:

Repository	Description	Stars
mozilla/tts	An open-source project providing a suite of deep learning models and tools for advanced text-to-speech synthesis.	9,466
jasonppy/voicecraft	A neural codec model for speech editing and text-to-speech synthesis in real-time, using a few seconds of reference audio.	7,744
rvc-boss/gpt-sovits	An AI system for generating human-like voices from text inputs, using deep learning techniques and pre-trained models.	36,977
coqui-ai/tts	A deep learning toolkit for generating human-like speech from text	36,118
plachtaa/vall-e-x	A research implementation of Microsoft's VALL-E X zero-shot TTS model for multilingual text-to-speech synthesis and voice cloning	7,719
neonbjb/tortoise-tts	An open-source text-to-speech system trained with an emphasis on audio quality	13,373
camb-ai/mars5-tts	A deep learning-based speech synthesis model that generates natural-sounding audio with controlled prosody.	2,551
oxford-cs-deepnlp-2017/lectures	An open-source repository containing lecture slides and course materials for an advanced natural language processing course.	15,702
mubertai/mubert-text-to-music	Generates music based on user input prompts using the Mubert API	2,738
metavoiceio/metavoice-src	A deep learning model for generating human-like speech	3,936
matlab-deep-learning/wav2vec-2.0	Enables speech-to-text transcription using a pre-trained neural network model in MATLAB.	7
coqui-ai/stt	A toolkit for building and deploying speech-to-text models using deep learning techniques	2,302
facebookresearch/fairseq	A toolkit for training custom sequence-to-sequence models for various NLP tasks	30,675
ai-forever/kandinsky-2	A multilingual text2image latent diffusion model with improved aesthetics and controllability	2,774
suno-ai/bark	A text-to-audio model that generates realistic speech and other audio	36,433