gazelle

Audio responder

An implementation of a joint speech-language model that responds directly to audio input

Joint speech-language model - respond directly to audio!

GitHub

357 stars

15 watching

34 forks

Language: Python

last commit: about 1 year ago

Linked from 1 awesome list

audio, llm, multimodal, speech

tincans.ai

Backlinks from these awesome lists:

amrzv/awesome-colab-notebooks

Related projects:

Repository	Description	Stars
qwenlm/qwen2-audio	An audio-language model that can analyze or respond to speech instructions based on audio input	1,306
microsoft/pengi	An Audio Language Model framework that uses transfer learning to generate text from audio inputs	295
soerenab/audiomnist	An implementation of a deep learning framework that classifies audio signals and uses Explainable AI (XAI) techniques to offer insight into the model's decision-making process	351
balavenkatesh3322/audio-pretrained-model	A collection of pre-trained audio and speech models for various applications	183
bytedance/salmonn	A large language model enabling speech, audio event perception and music inputs to achieve multilingual capabilities	1,091
yongxuustc/dcase2017_task4_cvssp	A system for audio classification and detection using machine learning models	4
yuangongnd/ltu	An audio and speech large language model implementation with pre-trained models, datasets, and inference options	396
laion-ai/clap	A library for learning audio embeddings from text and audio data using contrastive language-audio pretraining	1,457
kinwaicheuk/nnaudio	An audio processing toolkit using PyTorch convolutional neural networks to generate spectrograms from raw audio data	1,036
ibm/max-audio-classifier	Identifies sounds in short audio clips using machine learning and PCA transformation	154
awni/speech	A PyTorch implementation of end-to-end speech recognition models.	756
keunwoochoi/auralisation	Reconstructs audio features learned by convolutional neural networks into audible sounds	42
gen2brain/malgo	Provides a set of audio APIs for Go programming language	305
picovoice/rhino	A deep learning-based speech-to-intent engine for on-device voice interaction	633
chrisguttandin/standardized-audio-context	A cross-browser wrapper for the Web Audio API aiming to closely follow the standard.	687