AudioLDM

Audio generator

An AI-powered tool for generating audio content, such as speech, sound effects, and music, from text input or existing audio files.

AudioLDM: Generate speech, sound effects, music and beyond, with text.

GitHub

Stars: 2k
Watching: 42
Forks: 224
Language: Python
Last commit: about 1 month ago
Topic: audio-generation

Related projects:

Repository | Description | Stars
grame-cncm/faust | A functional programming language for real-time signal processing and synthesis | 2,586
suno-ai/bark | A text-to-audio model that generates realistic speech and other audio | 36,126
aigc-audio/audiogpt | An audio processing toolkit that provides pre-trained models and tools for tasks such as speech synthesis, music generation, sound detection, and talking head creation | 10,032
lucidrains/musiclm-pytorch | An implementation of Google's MusicLM model for music generation using attention networks and text conditioning | 3,166
facebookresearch/audiocraft | A deep learning library for generating high-quality audio | 21,018
ibm/max-audio-sample-generator | A tool to generate audio samples based on input commands and lo-fi instrumental music tracks | 21
jiaaro/pydub | A Python library for manipulating and editing audio files | 8,952
superkogito/pydiogment | A Python library that generates multiple audio files from a starting mono audio file by applying effects such as speed change, tone alteration, and noise addition | 83
jasonppy/voicecraft | A neural codec model for speech editing and text-to-speech synthesis in real time, using a few seconds of reference audio | 7,650
juandagilc/audio-effects | A collection of audio effect plugins implemented from a book's explanations and example code | 748
oobabooga/text-generation-webui | A web-based interface for generating text using large language models | 40,673
mubertai/mubert-text-to-music | Generates music from user input prompts using the Mubert API | 2,733
gl326/bard-audio | An audio engine for Game Maker Studio 2 designed to facilitate good audio implementation | 38
rustaudio/cpal | A cross-platform audio I/O library in pure Rust | 2,718
nvidia/waveglow | Generates high-quality speech from mel-spectrograms using a flow-based network architecture | 2,285