Qwen-Audio
Audio Model
A multimodal audio language model developed by Alibaba Cloud that supports various tasks and languages
The official repo of Qwen-Audio (通义千问-Audio) chat & pretrained large audio language model proposed by Alibaba Cloud.
1k stars
25 watching
107 forks
Language: Python
last commit: 5 months ago
Related projects:
Repository | Description | Stars |
---|---|---|
qwenlm/qwen2-audio | An audio-language model that can analyze or respond to speech instructions based on audio input | 1,229 |
microsoft/pengi | An Audio Language Model framework that uses transfer learning to generate text from audio inputs | 290 |
balavenkatesh3322/audio-pretrained-model | A collection of pre-trained audio and speech models for various applications | 182 |
yuangongnd/ltu | An audio and speech large language model implementation with pre-trained models, datasets, and inference options | 385 |
vivo-ai-lab/bluelm | Develops and releases large language models trained on large-scale data for applications such as natural language understanding and text generation | 852 |
yunwentechnology/unilm | This project provides pre-trained models for natural language understanding and generation tasks using the UniLM architecture. | 438 |
jthorborg/ape | An Audio Programming Environment with support for AU and DSP plugins | 14 |
keunwoochoi/auralisation | Reconstructs audio features learned by convolutional neural networks into audible sounds | 42 |
soerenab/audiomnist | Implements a deep learning framework for classifying audio signals and offers insights into the model's decision-making process using explainable AI (XAI) techniques | 347 |
xverse-ai/xverse-moe-a36b | Develops and publishes large multilingual language models with a mixture-of-experts architecture | 36 |
yfzhang114/slime | Develops large multimodal models for high-resolution understanding and analysis of text, images, and other data types. | 137 |
ebu/libadm | An ITU-R BS.2076 conformant XML library for audio definition model creation and modification | 39 |
langboat/mengzi3 | 8B and 13B language models based on the Llama architecture, with multilingual capabilities | 2,032 |
x-d-lab/mindchat | An AI model for providing emotional support and psychological assessment through conversational interfaces | 609 |
bobazooba/xllm-demo | A demo project showcasing customization possibilities of an XLLM library | 9 |