bark

Audio generator

A text-to-audio model that generates realistic speech and other audio

🔊 Text-Prompted Generative Audio Model

36k stars

330 watching

4k forks

Language: Jupyter Notebook

last commit: almost 2 years ago

Linked from 3 awesome lists

Backlinks from these awesome lists:

Related projects:

Repository	Description	Stars
camb-ai/mars5-tts	A deep learning-based speech synthesis model that generates natural-sounding audio with controlled prosody.	2,551
jasonppy/voicecraft	A neural codec model for speech editing and text-to-speech synthesis in real-time, using few seconds of reference audio.	7,744
speechbrain/speechbrain	A PyTorch-based toolkit for building conversational AI systems with advanced speech and text processing capabilities.	9,066
haoheliu/audioldm	A Python-based audio generation tool that can produce speech, sound effects, music, and more, using text as input or guided by user description.	2,483
mozilla/tts	An open-source project providing a suite of deep learning models and tools for advanced text-to-speech synthesis.	9,466
mshumer/gpt-prompt-engineer	A tool for automating the process of generating and ranking effective prompts for AI models like GPT-4, GPT-3.5-Turbo, or Claude 3 Opus.	9,411
coqui-ai/tts	A deep learning toolkit for generating human-like speech from text	36,118
aigc-audio/audiogpt	An audio processing toolkit that provides pre-trained models and tools for tasks like speech synthesis, music generation, sound detection, and talking head creation.	10,061
bigscience-workshop/promptsource	A toolkit for creating and using natural language prompts to enable large language models to generalize to new tasks.	2,718
coqui-ai/stt	A toolkit for building and deploying speech-to-text models using deep learning techniques	2,302
plachtaa/vall-e-x	A research implementation of Microsoft's VALL-E X zero-shot TTS model for multilingual text-to-speech synthesis and voice cloning	7,719
hahahumble/speechgpt	An application that enables users to converse with ChatGPT via speech and text interfaces.	2,752
huggingface/transformers	A collection of pre-trained machine learning models for various natural language and computer vision tasks, enabling developers to fine-tune and deploy these models on their own projects.	136,357
lucidrains/musiclm-pytorch	Implementation of Google's MusicLM model for music generation using attention networks and text-conditioning.	3,189
nvidia/waveglow	Generates high-quality speech from mel-spectrograms using a flow-based network architecture	2,294