bark

Audio generator

A text-to-audio model that generates realistic speech and other audio

🔊 Text-Prompted Generative Audio Model

GitHub

36k stars
332 watching
4k forks
Language: Jupyter Notebook
last commit: 3 months ago
Linked from 3 awesome lists


Backlinks from these awesome lists:

Related projects:

Repository Description Stars
camb-ai/mars5-tts A deep learning-based speech synthesis model that generates natural-sounding audio with controlled prosody. 2,530
jasonppy/voicecraft A neural codec model for speech editing and text-to-speech synthesis in real-time, using few seconds of reference audio. 7,638
speechbrain/speechbrain A PyTorch-based toolkit for building conversational AI systems with advanced speech and text processing capabilities. 8,922
haoheliu/audioldm An AI-powered tool for generating various types of audio content from text input or existing audio files 2,446
mozilla/tts An open-source project providing a suite of deep learning models and tools for advanced text-to-speech synthesis. 9,401
mshumer/gpt-prompt-engineer A tool for automating the process of generating and ranking effective prompts for AI models like GPT-4, GPT-3.5-Turbo, or Claude 3 Opus. 9,368
coqui-ai/tts A deep learning toolkit for generating human-like speech from text 35,453
aigc-audio/audiogpt An audio processing toolkit that provides pre-trained models and tools for tasks like speech synthesis, music generation, sound detection, and talking head creation. 10,025
bigscience-workshop/promptsource A toolkit for creating and using natural language prompts to enable large language models to generalize to new tasks. 2,696
coqui-ai/stt A toolkit for building and deploying speech-to-text models using deep learning techniques 2,283
plachtaa/vall-e-x A research implementation of Microsoft's VALL-E X zero-shot TTS model for multilingual text-to-speech synthesis and voice cloning 7,670
hahahumble/speechgpt An application that enables users to converse with ChatGPT via speech and text interfaces. 2,746
huggingface/transformers A collection of pre-trained machine learning models for various natural language and computer vision tasks, enabling developers to fine-tune and deploy these models on their own projects. 135,022
lucidrains/musiclm-pytorch Implementation of Google's MusicLM model for music generation using attention networks and text-conditioning. 3,166
nvidia/waveglow Generates high-quality speech from mel-spectrograms using a flow-based network architecture 2,285