AudioLDM

Audio Generator

A Python-based tool that generates speech, sound effects, music, and other audio from text prompts.

AudioLDM: Generate speech, sound effects, music and beyond, with text.

GitHub

Stars: 2k
Watching: 42
Forks: 225
Language: Python
Last commit: about 1 month ago
Topic: audio-generation

Related projects:

| Repository | Description | Stars |
| --- | --- | --- |
| grame-cncm/faust | A functional programming language for real-time signal processing and synthesis | 2,605 |
| suno-ai/bark | A text-to-audio model that generates realistic speech and other audio | 36,433 |
| aigc-audio/audiogpt | An audio processing toolkit that provides pre-trained models and tools for tasks like speech synthesis, music generation, sound detection, and talking head creation | 10,061 |
| lucidrains/musiclm-pytorch | Implementation of Google's MusicLM model for music generation using attention networks and text-conditioning | 3,189 |
| facebookresearch/audiocraft | A deep learning library for generating high-quality audio | 21,134 |
| ibm/max-audio-sample-generator | A tool to generate audio samples based on input commands and lo-fi instrumental music tracks | 22 |
| jiaaro/pydub | A Python library for manipulating and editing audio files | 9,024 |
| superkogito/pydiogment | A Python library for generating multiple audio files from a starting mono audio file, with effects such as speed change, tone alteration, and noise addition | 83 |
| jasonppy/voicecraft | A neural codec model for speech editing and text-to-speech synthesis in real-time, using a few seconds of reference audio | 7,744 |
| juandagilc/audio-effects | A collection of audio effects plugins implemented from a book and contributing examples | 757 |
| oobabooga/text-generation-webui | A web-based interface for generating text using large language models | 41,123 |
| mubertai/mubert-text-to-music | Generates music based on user input prompts using the Mubert API | 2,738 |
| gl326/bard-audio | An audio engine for Game Maker Studio 2 designed to facilitate good audio implementation | 38 |
| rustaudio/cpal | A cross-platform audio I/O library in pure Rust | 2,772 |
| nvidia/waveglow | Generates high-quality speech from mel-spectrograms using a flow-based network architecture | 2,294 |