Qwen-Audio

Audio Model

A multimodal audio language model developed by Alibaba Cloud that supports various tasks and languages

The official repository for Qwen-Audio (通义千问-Audio), the chat and pretrained large audio language models proposed by Alibaba Cloud.

GitHub

1k stars
25 watching
107 forks
Language: Python
last commit: 5 months ago

Related projects:

Repository | Description | Stars
qwenlm/qwen2-audio | An audio-language model that can analyze or respond to speech instructions based on audio input | 1,229
microsoft/pengi | An audio language model framework that uses transfer learning to generate text from audio inputs | 290
balavenkatesh3322/audio-pretrained-model | A collection of pre-trained audio and speech models for various applications | 182
yuangongnd/ltu | An audio and speech large language model implementation with pre-trained models, datasets, and inference options | 385
vivo-ai-lab/bluelm | Large language models trained on large-scale data for applications including natural language understanding and text generation | 852
yunwentechnology/unilm | Pre-trained models for natural language understanding and generation tasks using the UniLM architecture | 438
jthorborg/ape | An Audio Programming Environment with support for AU and DSP plugins | 14
keunwoochoi/auralisation | Reconstructs audio features learned by convolutional neural networks into audible sounds | 42
soerenab/audiomnist | A deep learning framework for classifying audio signals, with insight into the model's decision-making through explainable AI techniques | 347
xverse-ai/xverse-moe-a36b | Large multilingual language models with a mixture-of-experts architecture | 36
yfzhang114/slime | Large multimodal models for high-resolution understanding and analysis of text, images, and other data types | 137
ebu/libadm | An ITU-R BS.2076 conformant XML library for audio definition model creation and modification | 39
langboat/mengzi3 | 8B and 13B language models based on the Llama architecture with multilingual capabilities | 2,032
x-d-lab/mindchat | An AI model for providing emotional support and psychological assessment through conversational interfaces | 609
bobazooba/xllm-demo | A demo project showcasing customization possibilities of an XLLM library | 9