Qwen2-Audio

Audio Response Model

An audio-language model that accepts audio input and can either analyze it or respond to spoken instructions.

The official repository for the Qwen2-Audio chat and pretrained large audio-language models, proposed by Alibaba Cloud.

GitHub

1k stars
31 watching
82 forks
Language: Python
Last commit: 3 months ago

Related projects:

| Repository | Description | Stars |
| --- | --- | --- |
| qwenlm/qwen-audio | A multimodal audio language model developed by Alibaba Cloud that supports various tasks and languages | 1,486 |
| microsoft/pengi | An Audio Language Model framework that uses transfer learning to generate text from audio inputs | 290 |
| balavenkatesh3322/audio-pretrained-model | A collection of pre-trained audio and speech models for various applications | 182 |
| tincans-ai/gazelle | An implementation of a joint speech language model that responds directly to audio input | 355 |
| yuangongnd/ltu | An audio and speech large language model implementation with pre-trained models, datasets, and inference options | 385 |
| qwenlm/qwen | This repository provides large language models and chat capabilities based on pre-trained Chinese models | 14,164 |
| soerenab/audiomnist | This project provides an implementation of a deep learning framework to classify audio signals and offers insights into the model's decision-making process using Explainable Artificial Intelligence (AI) techniques | 347 |
| keunwoochoi/auralisation | Reconstructs audio features learned by convolutional neural networks into audible sounds | 42 |
| qwenlm/qwen2-vl | A multimodal large language model series developed by the Qwen team to understand and process images, videos, and text | 3,093 |
| vivo-ai-lab/bluelm | Develops and releases large language models trained on vast amounts of data for various applications, including natural language understanding, text generation, and more | 852 |
| gen2brain/malgo | Provides a set of audio APIs for Go programming language | 301 |
| yongxuustc/dcase2017_task4_cvssp | A system for audio classification and detection using machine learning models | 4 |
| ebu/libadm | An ITU-R BS.2076 conformant XML library for audio definition model creation and modification | 39 |
| matlab-deep-learning/wav2vec-2.0 | Enables speech-to-text transcription using a pre-trained neural network model in MATLAB | 8 |
| flagai-open/aquila2 | Provides pre-trained language models and tools for fine-tuning and evaluation | 437 |