Qwen-Audio

Audio Model

A multimodal audio language model developed by Alibaba Cloud that supports various tasks and languages

The official repository for Qwen-Audio (通义千问-Audio), the chat and pretrained large audio language models proposed by Alibaba Cloud.

GitHub

2k stars

26 watching

111 forks

Language: Python

Last commit: about 1 year ago

Related projects:

Repository	Description	Stars
qwenlm/qwen2-audio	An audio-language model that can analyze or respond to speech instructions based on audio input	1,306
microsoft/pengi	An Audio Language Model framework that uses transfer learning to generate text from audio inputs	295
balavenkatesh3322/audio-pretrained-model	A collection of pre-trained audio and speech models for various applications	183
yuangongnd/ltu	An audio and speech large language model implementation with pre-trained models, datasets, and inference options	396
vivo-ai-lab/bluelm	Develops and releases large language models trained on large-scale data for applications including natural language understanding and text generation.	864
yunwentechnology/unilm	This project provides pre-trained models and tools for natural language understanding (NLU) and generation (NLG) tasks in Chinese.	439
jthorborg/ape	An Audio Programming Environment with support for AU and DSP plugins	14
keunwoochoi/auralisation	Reconstructs audio features learned by convolutional neural networks into audible sounds	42
soerenab/audiomnist	Implements a deep learning framework for classifying audio signals and uses explainable AI (XAI) techniques to give insight into the model's decision-making process.	351
xverse-ai/xverse-moe-a36b	Develops and publishes large multilingual language models with an advanced mixture-of-experts (MoE) architecture.	37
yfzhang114/slime	Develops large multimodal models for high-resolution understanding and analysis of text, images, and other data types.	143
ebu/libadm	An ITU-R BS.2076 conformant XML library for audio definition model creation and modification	39
langboat/mengzi3	8B and 13B language models based on the Llama architecture, with multilingual capabilities.	2,031
x-d-lab/mindchat	Provides a suite of AI-powered models for mental health support and evaluation	625
bobazooba/xllm-demo	A demo project showcasing customization possibilities of an XLLM library	9