Qwen2-Audio

Audio Response Model

An audio-language model that can analyze audio input or respond to spoken instructions.

The official repository for the Qwen2-Audio chat and pretrained large audio-language models, proposed by Alibaba Cloud.

GitHub

1k stars

33 watching

91 forks

Language: Python

last commit: almost 2 years ago

Related projects:

Repository	Description	Stars
qwenlm/qwen-audio	A multimodal audio language model developed by Alibaba Cloud that supports various tasks and languages	1,515
microsoft/pengi	An Audio Language Model framework that uses transfer learning to generate text from audio inputs	295
balavenkatesh3322/audio-pretrained-model	A collection of pre-trained audio and speech models for various applications	183
tincans-ai/gazelle	An implementation of a joint speech language model that responds directly to audio input	357
yuangongnd/ltu	An audio and speech large language model implementation with pre-trained models, datasets, and inference options	396
qwenlm/qwen	This repository provides large language models and chat capabilities based on pre-trained Chinese models.	14,797
soerenab/audiomnist	A deep learning framework for classifying audio signals, with insight into the model's decision-making through explainable AI techniques	351
keunwoochoi/auralisation	Reconstructs audio features learned by convolutional neural networks into audible sounds	42
qwenlm/qwen2-vl	A multimodal large language model series developed by the Qwen team to understand and process images, videos, and text.	3,613
vivo-ai-lab/bluelm	Develops and releases large language models trained on vast amounts of data for various applications, including natural language understanding, text generation, and more.	864
gen2brain/malgo	Provides a set of audio APIs for the Go programming language	305
yongxuustc/dcase2017_task4_cvssp	A system for audio classification and detection using machine learning models	4
ebu/libadm	An ITU-R BS.2076 conformant XML library for audio definition model creation and modification	39
matlab-deep-learning/wav2vec-2.0	Enables speech-to-text transcription using a pre-trained neural network model in MATLAB.	7
flagai-open/aquila2	Provides pre-trained language models and tools for fine-tuning and evaluation	439