AudioLDM

Audio Generator

A Python-based audio generation tool that produces speech, sound effects, music, and more from text descriptions.

AudioLDM: Generate speech, sound effects, music and beyond, with text.

GitHub: 2k stars, 42 watching, 225 forks

Language: Python

Last commit: over 1 year ago

Topic: audio-generation

Website: audioldm.github.io/

Related projects:

Repository	Description	Stars
grame-cncm/faust	A functional programming language for real-time signal processing and synthesis	2,605
suno-ai/bark	A text-to-audio model that generates realistic speech and other audio	36,433
aigc-audio/audiogpt	An audio processing toolkit that provides pre-trained models and tools for tasks like speech synthesis, music generation, sound detection, and talking head creation.	10,061
lucidrains/musiclm-pytorch	Implementation of Google's MusicLM model for music generation using attention networks and text-conditioning.	3,189
facebookresearch/audiocraft	A deep learning library for generating high-quality audio	21,134
ibm/max-audio-sample-generator	A tool to generate audio samples based on input commands and lo-fi instrumental music tracks.	22
jiaaro/pydub	A Python library for manipulating and editing audio files	9,024
superkogito/pydiogment	A Python library for generating multiple audio files based on a starting mono audio file with various effects such as speed change, tone alteration and noise addition.	83
jasonppy/voicecraft	A neural codec model for speech editing and text-to-speech synthesis in real time, using a few seconds of reference audio.	7,744
juandagilc/audio-effects	A collection of audio effects plugins implemented from a book and contributing examples	757
oobabooga/text-generation-webui	A web-based interface for generating text using large language models	41,123
mubertai/mubert-text-to-music	Generates music based on user input prompts using the Mubert API	2,738
gl326/bard-audio	An audio engine for Game Maker Studio 2 designed to facilitate good audio implementation.	38
rustaudio/cpal	A cross-platform audio I/O library in pure Rust	2,772
nvidia/waveglow	Generates high-quality speech from mel-spectrograms using a flow-based network architecture	2,294