Qwen-Audio

Audio Model

A multimodal audio language model developed by Alibaba Cloud that supports various tasks and languages

The official repository for Qwen-Audio (通义千问-Audio), the pretrained and chat large audio language models proposed by Alibaba Cloud.

GitHub

2k stars
26 watching
111 forks
Language: Python
last commit: 7 months ago

Related projects:

| Repository | Description | Stars |
| --- | --- | --- |
| qwenlm/qwen2-audio | An audio-language model that can analyze or respond to speech instructions based on audio input | 1,306 |
| microsoft/pengi | An Audio Language Model framework that uses transfer learning to generate text from audio inputs | 295 |
| balavenkatesh3322/audio-pretrained-model | A collection of pre-trained audio and speech models for various applications | 183 |
| yuangongnd/ltu | An audio and speech large language model implementation with pre-trained models, datasets, and inference options | 396 |
| vivo-ai-lab/bluelm | Develops and releases large language models trained on large-scale data for applications including natural language understanding and text generation | 864 |
| yunwentechnology/unilm | Provides pre-trained models and tools for natural language understanding (NLU) and generation (NLG) tasks in Chinese | 439 |
| jthorborg/ape | An Audio Programming Environment with support for AU and DSP plugins | 14 |
| keunwoochoi/auralisation | Reconstructs audio features learned by convolutional neural networks into audible sounds | 42 |
| soerenab/audiomnist | A deep learning framework for classifying audio signals, with insight into the model's decision-making through explainable AI (XAI) techniques | 351 |
| xverse-ai/xverse-moe-a36b | Develops and publishes large multilingual language models with a mixture-of-experts architecture | 37 |
| yfzhang114/slime | Develops large multimodal models for high-resolution understanding and analysis of text, images, and other data types | 143 |
| ebu/libadm | An ITU-R BS.2076 conformant XML library for audio definition model creation and modification | 39 |
| langboat/mengzi3 | 8B and 13B language models based on the Llama architecture with multilingual capabilities | 2,031 |
| x-d-lab/mindchat | Provides a suite of AI-powered models for mental health support and evaluation | 625 |
| bobazooba/xllm-demo | A demo project showcasing the customization possibilities of the XLLM library | 9 |