EmotiVoice
TTS Engine
An open-source text-to-speech engine with emotion synthesis and multiple voice options
EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine
8k stars
64 watching
639 forks
Language: Python
Last commit: 5 months ago
Tags: ai, deep-learning, emotion, emotivoice, multi-speaker, prompt, python, pytorch, speech, speech-synthesis, style, text-to-speech, tts
Related projects:
Repository | Description | Stars |
---|---|---|
metavoiceio/metavoice-src | A deep learning model for generating human-like speech | 3,936 |
plachtaa/vall-e-x | A research implementation of Microsoft's VALL-E X zero-shot TTS model for multilingual text-to-speech synthesis and voice cloning | 7,719 |
picovoice/rhino | A deep learning-based speech-to-intent engine for on-device voice interaction | 633 |
rvc-boss/gpt-sovits | An AI system for generating human-like voices from text inputs, using deep learning techniques and pre-trained models | 36,977 |
coqui-ai/tts | A deep learning toolkit for generating human-like speech from text | 36,118 |
netease-youdao/bcembedding | Provides bilingual and crosslingual retrieval models for semantic search and question-answering in multiple languages | 1,528 |
jasonppy/voicecraft | A neural codec model for real-time speech editing and text-to-speech synthesis using a few seconds of reference audio | 7,744 |
mumble-voip/mumble | High-quality, low-latency voice chat software with support for multiple platforms and plugins | 6,484 |
openai-translator/openai-translator | A multi-platform translator and text processing tool leveraging ChatGPT API | 24,004 |
mozilla/tts | An open-source project providing a suite of deep learning models and tools for advanced text-to-speech synthesis | 9,466 |
mediar-ai/screenpipe | A platform for building and deploying AI agents with full context using screen recordings, allowing for 24/7 monitoring and control | 11,060 |
anse-app/chatgpt-demo | A demo project showcasing integration with the OpenAI GPT-3.5 Turbo API for generating chatbot-like responses | 8,014 |
damo-nlp-sg/video-llama | An audio-visual language model designed to understand and respond to video content with improved instruction-following capabilities | 2,842 |
suno-ai/bark | A text-to-audio model that generates realistic speech and other audio | 36,433 |
weechat/weechat | An extensible chat client with modular architecture and support for multiple protocols. | 2,997 |