audiocraft
Audio generator
A deep learning library for generating high-quality audio
AudioCraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor/tokenizer, along with MusicGen, a simple and controllable music generation language model (LM) with textual and melodic conditioning.
21k stars
211 watching
2k forks
Language: Python
Last commit: 3 months ago
Linked from 2 awesome lists
Related projects:
Repository | Description | Stars |
---|---|---|
pytorch/audio | A PyTorch module providing tools and functions for audio signal processing | 2,561 |
lucidrains/musiclm-pytorch | An implementation of Google's MusicLM model for music generation using attention networks and text conditioning. | 3,189 |
archinetai/audio-diffusion-pytorch | An audio generation library that uses diffusion models to produce high-quality audio samples from noise or text input | 1,975 |
libaudioflux/audioflux | A library of deep learning tools for extracting features from audio signals. | 2,940 |
enhuiz/vall-e | An implementation of VALL-E in PyTorch for text-to-speech synthesis | 2,970 |
deepsound-project/samplernn-pytorch | An implementation of an audio generation model using PyTorch | 290 |
tyiannak/pyaudioanalysis | A comprehensive Python library for feature extraction, classification, segmentation, and applications of audio data. | 5,918 |
facebookresearch/encodec | A deep learning-based audio codec that supports high-fidelity neural audio compression. | 3,536 |
spotify/pedalboard | A Python library for processing and manipulating audio data | 5,286 |
pyannote/pyannote-audio | A PyTorch-based toolkit for speaker diarization and speech activity detection. | 6,508 |
facebookresearch/audio2photoreal | Generating photorealistic avatars from audio | 2,715 |
haoheliu/audioldm | A Python-based audio generation tool that can produce speech, sound effects, music, and more, using text as input or guided by user description. | 2,483 |
jasonppy/voicecraft | A neural codec model for speech editing and text-to-speech synthesis in real time, using a few seconds of reference audio. | 7,744 |
aigc-audio/audiogpt | An audio processing toolkit that provides pre-trained models and tools for tasks like speech synthesis, music generation, sound detection, and talking head creation. | 10,061 |
nvidia/waveglow | Generates high-quality speech from mel-spectrograms using a flow-based network architecture | 2,294 |