Qwen2-Audio
Audio Response Model
An audio-language model that can analyze or respond to speech instructions based on audio input
The official repository for the Qwen2-Audio chat and pretrained large audio-language models proposed by Alibaba Cloud.
1k stars
31 watching
82 forks
Language: Python
last commit: 3 months ago
Related projects:
Repository | Description | Stars |
---|---|---|
qwenlm/qwen-audio | A multimodal audio language model developed by Alibaba Cloud that supports various tasks and languages | 1,486 |
microsoft/pengi | An Audio Language Model framework that uses transfer learning to generate text from audio inputs | 290 |
balavenkatesh3322/audio-pretrained-model | A collection of pre-trained audio and speech models for various applications | 182 |
tincans-ai/gazelle | An implementation of a joint speech language model that responds directly to audio input | 355 |
yuangongnd/ltu | An audio and speech large language model implementation with pre-trained models, datasets, and inference options | 385 |
qwenlm/qwen | This repository provides large language models and chat capabilities based on pre-trained Chinese models. | 14,164 |
soerenab/audiomnist | This project provides an implementation of a deep learning framework to classify audio signals and offers insights into the model's decision-making process using explainable AI techniques. | 347 |
keunwoochoi/auralisation | Reconstructs audio features learned by convolutional neural networks into audible sounds | 42 |
qwenlm/qwen2-vl | A multimodal large language model series developed by the Qwen team to understand and process images, videos, and text. | 3,093 |
vivo-ai-lab/bluelm | Develops and releases large language models trained on vast amounts of data for various applications, including natural language understanding, text generation, and more. | 852 |
gen2brain/malgo | Provides a set of audio APIs for the Go programming language | 301 |
yongxuustc/dcase2017_task4_cvssp | A system for audio classification and detection using machine learning models | 4 |
ebu/libadm | An ITU-R BS.2076 conformant XML library for audio definition model creation and modification | 39 |
matlab-deep-learning/wav2vec-2.0 | Enables speech-to-text transcription using a pre-trained neural network model in MATLAB. | 8 |
flagai-open/aquila2 | Provides pre-trained language models and tools for fine-tuning and evaluation | 437 |