VoiceCraft

Speech editor

A neural codec model for real-time speech editing and text-to-speech synthesis, using a few seconds of reference audio.

Zero-Shot Speech Editing and Text-to-Speech in the Wild

GitHub

8k stars

89 watching

755 forks

Language: Jupyter Notebook

Last commit: about 2 years ago

Related projects:

Repository	Description	Stars
rvc-boss/gpt-sovits	An AI system for generating human-like voices from text inputs, using deep learning techniques and pre-trained models.	36,977
neonbjb/tortoise-tts	An open-source text-to-speech system focused on high-quality audio output	13,373
metavoiceio/metavoice-src	A deep learning model for generating human-like speech	3,936
coqui-ai/tts	A deep learning toolkit for generating human-like speech from text	36,118
mozilla/tts	An open-source project providing a suite of deep learning models and tools for advanced text-to-speech synthesis.	9,466
suno-ai/bark	A text-to-audio model that generates realistic speech and other audio	36,433
tensorspeech/tensorflowtts	Real-time speech synthesis using state-of-the-art architectures	3,855
mshumer/gpt-prompt-engineer	A tool for automating the process of generating and ranking effective prompts for AI models like GPT-4, GPT-3.5-Turbo, or Claude 3 Opus.	9,411
camb-ai/mars5-tts	A deep learning-based speech synthesis model that generates natural-sounding audio with controlled prosody.	2,551
facebookresearch/audiocraft	A deep learning library for generating high-quality audio	21,134
huggingface/text-generation-inference	A toolkit for deploying and serving Large Language Models (LLMs) for high-performance text generation	9,456
jackmort/chatgpt.nvim	A plugin for Neovim that integrates with the ChatGPT API to generate natural language responses and assist with coding tasks.	3,825
jaywalnut310/vits	An end-to-end text-to-speech system designed to generate more natural-sounding audio than existing models	6,947
pipecat-ai/pipecat	A modular framework for building conversational AI applications with real-time voice and multimodal interactions.	3,825
haoheliu/audioldm	A Python-based audio generation tool that can produce speech, sound effects, music, and more, using text as input or guided by user description.	2,483