VoiceCraft

Speech editor

A neural codec model for real-time speech editing and text-to-speech synthesis, using only a few seconds of reference audio.

Zero-Shot Speech Editing and Text-to-Speech in the Wild

GitHub

8k stars
88 watching
746 forks
Language: Jupyter Notebook
Last commit: 5 months ago

Related projects:

Repository | Description | Stars
rvc-boss/gpt-sovits | An AI system for generating human-like voices from text inputs, using deep learning techniques and pre-trained models. | 35,728
neonbjb/tortoise-tts | A multi-voice text-to-speech system trained on high-quality data. | 13,225
metavoiceio/metavoice-src | A deep learning model for generating human-like speech. | 3,891
coqui-ai/tts | A deep learning toolkit for generating human-like speech from text. | 35,453
mozilla/tts | An open-source project providing a suite of deep learning models and tools for advanced text-to-speech synthesis. | 9,401
suno-ai/bark | A text-to-audio model that generates realistic speech and other audio. | 36,126
tensorspeech/tensorflowtts | Real-time speech synthesis using state-of-the-art architectures. | 3,839
mshumer/gpt-prompt-engineer | A tool for automating the process of generating and ranking effective prompts for AI models like GPT-4, GPT-3.5-Turbo, or Claude 3 Opus. | 9,368
camb-ai/mars5-tts | A deep learning-based speech synthesis model that generates natural-sounding audio with controlled prosody. | 2,530
facebookresearch/audiocraft | A deep learning library for generating high-quality audio. | 20,969
huggingface/text-generation-inference | A toolkit for deploying and serving Large Language Models. | 9,106
jackmort/chatgpt.nvim | A plugin for Neovim that integrates with the ChatGPT API to generate natural language responses and assist with coding tasks. | 3,779
jaywalnut310/vits | An end-to-end text-to-speech system that generates more natural audio than existing models. | 6,860
pipecat-ai/pipecat | A framework for building conversational AI agents with voice and multimodal interactions. | 3,383
haoheliu/audioldm | An AI-powered tool for generating various types of audio content from text input or existing audio files. | 2,446