Pengi
Audio Model
An Audio Language Model framework that uses transfer learning to generate text from audio inputs
An Audio Language model for Audio Tasks
290 stars
14 watching
16 forks
Language: Python
Last commit: 7 months ago
Related projects:
Repository | Description | Stars |
---|---|---|
yuangongnd/ltu | An audio and speech large language model implementation with pre-trained models, datasets, and inference options | 385 |
balavenkatesh3322/audio-pretrained-model | A collection of pre-trained audio and speech models for various applications | 182 |
qwenlm/qwen-audio | A multimodal audio language model developed by Alibaba Cloud that supports various tasks and languages | 1,486 |
ibm/max-audio-classifier | Identifies sounds in short audio clips using machine learning and PCA transformation | 153 |
yongxuustc/dcase2017_task4_cvssp | A system for audio classification and detection using machine learning models | 4 |
elanmart/psmm | An implementation of a neural network model for character-level language modeling. | 50 |
qwenlm/qwen2-audio | An audio-language model that can analyze or respond to speech instructions based on audio input | 1,229 |
awni/speech | A PyTorch implementation of end-to-end speech recognition models. | 754 |
openai/finetune-transformer-lm | Code and model for improving language understanding through generative pre-training with a transformer-based architecture. | 2,160 |
jordipons/music-audio-tagging-at-scale-models | Research on end-to-end learning for music audio tagging using large datasets and different front-end paradigms. | 148 |
microsoft/mpnet | A pre-training method for language understanding models that combines masked and permuted language modeling, with code for implementation and fine-tuning. | 288 |
jthorborg/ape | An Audio Programming Environment with support for AU and DSP plugins | 14 |
keunwoochoi/auralisation | Reconstructs audio features learned by convolutional neural networks into audible sounds | 42 |
cpjku/madmom | A Python audio signal processing library used in music information retrieval tasks. | 1,347 |
ynop/audiomate | A Python library for handling audio datasets, providing tools for accessing, manipulating, and preparing data for machine learning tasks. | 131 |