AudioGPT

Audio toolkit

An audio processing toolkit that provides pre-trained models and tools for tasks like speech synthesis, music generation, sound detection, and talking head creation.

AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head

GitHub

10k stars
135 watching
862 forks
Language: Python
Last commit: 5 months ago
Linked from 2 awesome lists

audiogpt, music, sound, speech, talking-head

Related projects:

Repository | Description | Stars
hahahumble/speechgpt | An application that enables users to converse with ChatGPT via speech and text interfaces | 2,746
haoheliu/audioldm | An AI-powered tool for generating various types of audio content from text input or existing audio files | 2,446
speechbrain/speechbrain | A PyTorch-based toolkit for building conversational AI systems with advanced speech and text processing capabilities | 8,922
nvidia/waveglow | Generates high-quality speech from mel-spectrograms using a flow-based network architecture | 2,285
facebookresearch/audiocraft | A deep learning library for generating high-quality audio | 20,969
pyannote/pyannote-audio | A PyTorch-based toolkit for speaker diarization and speech activity detection | 6,333
archinetai/audio-diffusion-pytorch | An audio generation library that uses diffusion models to produce high-quality audio samples from noise or text input | 1,961
pytorch/audio | A PyTorch module providing tools and functions for audio signal processing | 2,538
futantan/opengpt | An open-source platform to create and run AI-powered chat applications | 3,934
waylaidwanderer/node-chatgpt-api | Provides client-side access to ChatGPT and Bing AI APIs using Node.js | 4,204
cogentapps/chat-with-gpt | An open-source ChatGPT app with added features and customization options | 2,322
open-mmlab/mmagic | A toolkit for building and experimenting with generative AI models for image and video generation, restoration, enhancement, and other tasks | 6,945
williamfzc/chat-gpt-ppt | Automates the creation of PowerPoint presentations using ChatGPT as a backend | 906
suno-ai/bark | A text-to-audio model that generates realistic speech and other audio | 36,126