VoiceCraft
Speech editor
A neural codec model for real-time speech editing and text-to-speech synthesis, using only a few seconds of reference audio.
Zero-Shot Speech Editing and Text-to-Speech in the Wild
8k stars
89 watching
755 forks
Language: Jupyter Notebook
Last commit: 8 months ago
Related projects:
| Description | Stars |
|---|---|
| An AI system for generating human-like voices from text inputs, using deep learning techniques and pre-trained models. | 36,977 |
| An open-source text-to-speech system with high-quality audio output | 13,373 |
| A deep learning model for generating human-like speech | 3,936 |
| A deep learning toolkit for generating human-like speech from text | 36,118 |
| An open-source project providing a suite of deep learning models and tools for advanced text-to-speech synthesis. | 9,466 |
| A text-to-audio model that generates realistic speech and other audio | 36,433 |
| Real-time speech synthesis using state-of-the-art architectures | 3,855 |
| A tool for automating the process of generating and ranking effective prompts for AI models like GPT-4, GPT-3.5-Turbo, or Claude 3 Opus. | 9,411 |
| A deep learning-based speech synthesis model that generates natural-sounding audio with controlled prosody. | 2,551 |
| A deep learning library for generating high-quality audio | 21,134 |
| A toolkit for deploying and serving Large Language Models (LLMs) for high-performance text generation | 9,456 |
| A plugin for Neovim that integrates with the ChatGPT API to generate natural language responses and assist with coding tasks. | 3,825 |
| An end-to-end text-to-speech system that generates more natural audio than existing models | 6,947 |
| A modular framework for building conversational AI applications with real-time voice and multimodal interactions. | 3,825 |
| A Python-based audio generation tool that can produce speech, sound effects, music, and more, using text as input or guided by user description. | 2,483 |