GPT-SoVITS

Voice Generator

An AI system for generating human-like voices from text inputs, using deep learning techniques and pre-trained models.

As little as 1 minute of voice data can be used to train a good TTS model (few-shot voice cloning).

GitHub

37k stars

218 watching

4k forks

Language: Python

Last commit: 9 months ago

Linked from 1 awesome list

text-to-speech, tts, vits, voice-clone, voice-cloneai, voice-cloning

Backlinks from these awesome lists:

hannibal046/awesome-llm

Related projects:

Repository	Description	Stars
jasonppy/voicecraft	A neural codec model for real-time speech editing and text-to-speech synthesis, using a few seconds of reference audio.	7,744
metavoiceio/metavoice-src	A deep learning model for generating human-like speech	3,936
neonbjb/tortoise-tts	An open-source multi-voice text-to-speech system trained with an emphasis on audio quality	13,373
coqui-ai/tts	A deep learning toolkit for generating human-like speech from text	36,118
plachtaa/vall-e-x	A research implementation of Microsoft's VALL-E X zero-shot TTS model for multilingual text-to-speech synthesis and voice cloning	7,719
coqui-ai/stt	A toolkit for building and deploying speech-to-text models using deep learning techniques	2,302
mozilla/tts	An open-source project providing a suite of deep learning models and tools for advanced text-to-speech synthesis.	9,466
mshumer/gpt-prompt-engineer	A tool for automating the process of generating and ranking effective prompts for AI models like GPT-4, GPT-3.5-Turbo, or Claude 3 Opus.	9,411
tensorspeech/tensorflowtts	Real-time speech synthesis using state-of-the-art architectures	3,855
openai/whisper	A general-purpose speech recognition system trained on large-scale weak supervision	72,752
instruction-tuning-with-gpt-4/gpt-4-llm	This project generates instruction-following data using GPT-4 to fine-tune large language models for real-world tasks.	4,244
jaywalnut310/vits	An end-to-end text-to-speech model that aims to generate more natural-sounding audio than two-stage systems	6,947
minimaxir/gpt-2-simple	A tool for retraining and fine-tuning the OpenAI GPT-2 text generation model on new datasets.	3,398
camb-ai/mars5-tts	A deep learning-based speech synthesis model that generates natural-sounding audio with controlled prosody.	2,551
rhasspy/piper	A fast local neural text-to-speech system optimized for small devices	7,002