Qwen2-Audio
Audio Response Model
An audio-language model that analyzes audio input and responds to spoken instructions
The official repository for Qwen2-Audio, the chat and pretrained large audio-language models proposed by Alibaba Cloud.
1k stars
33 watching
91 forks
Language: Python
last commit: about 1 year ago

Related projects:
| Repository | Description | Stars |
|---|---|---|
| | A multimodal audio language model developed by Alibaba Cloud that supports various tasks and languages | 1,515 |
| | An Audio Language Model framework that uses transfer learning to generate text from audio inputs | 295 |
| | A collection of pre-trained audio and speech models for various applications | 183 |
| | An implementation of a joint speech language model that responds directly to audio input | 357 |
| | An audio and speech large language model implementation with pre-trained models, datasets, and inference options | 396 |
| | Provides large language models and chat capabilities based on pre-trained Chinese models | 14,797 |
| | A deep learning framework that classifies audio signals and uses explainable AI techniques to give insight into the model's decisions | 351 |
| | Reconstructs audio features learned by convolutional neural networks into audible sounds | 42 |
| | A multimodal large language model series developed by the Qwen team to understand and process images, videos, and text. | 3,613 |
| | Develops and releases large language models trained on vast amounts of data for various applications, including natural language understanding, text generation, and more. | 864 |
| | Provides a set of audio APIs for Go programming language | 305 |
| | A system for audio classification and detection using machine learning models | 4 |
| | An ITU-R BS.2076 conformant XML library for audio definition model creation and modification | 39 |
| | Enables speech-to-text transcription using a pre-trained neural network model in MATLAB | 7 |
| | Provides pre-trained language models and tools for fine-tuning and evaluation | 439 |