AudioGPT
Audio toolkit
An audio processing toolkit that provides pre-trained models and tools for tasks like speech synthesis, music generation, sound detection, and talking head creation.
AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head
10k stars
135 watching
862 forks
Language: Python
Last commit: 5 months ago
Linked from 2 awesome lists
Topics: audiogpt, music, sound, speech, talking-head
Related projects:
Repository | Description | Stars |
---|---|---|
hahahumble/speechgpt | An application that enables users to converse with ChatGPT via speech and text interfaces. | 2,746 |
haoheliu/audioldm | An AI-powered tool for generating various types of audio content from text input or existing audio files. | 2,446 |
speechbrain/speechbrain | A PyTorch-based toolkit for building conversational AI systems with advanced speech and text processing capabilities. | 8,922 |
nvidia/waveglow | Generates high-quality speech from mel-spectrograms using a flow-based network architecture. | 2,285 |
facebookresearch/audiocraft | A deep learning library for generating high-quality audio. | 20,969 |
pyannote/pyannote-audio | A PyTorch-based toolkit for speaker diarization and speech activity detection. | 6,333 |
archinetai/audio-diffusion-pytorch | An audio generation library that uses diffusion models to produce high-quality audio samples from noise or text input. | 1,961 |
pytorch/audio | A PyTorch module providing tools and functions for audio signal processing. | 2,538 |
futantan/opengpt | An open-source platform to create and run AI-powered chat applications. | 3,934 |
waylaidwanderer/node-chatgpt-api | Provides client-side access to ChatGPT and Bing AI APIs using Node.js. | 4,204 |
cogentapps/chat-with-gpt | An open-source ChatGPT app with added features and customization options. | 2,322 |
open-mmlab/mmagic | A toolkit for building and experimenting with generative AI models for image and video generation, restoration, enhancement, and other tasks. | 6,945 |
williamfzc/chat-gpt-ppt | Automates the creation of PowerPoint presentations using ChatGPT as a backend. | 906 |
suno-ai/bark | A text-to-audio model that generates realistic speech and other audio. | 36,126 |