Pengi

Audio Model

An Audio Language Model framework that uses transfer learning to generate text from audio inputs

An Audio Language model for Audio Tasks

295 stars

14 watching

17 forks

Language: Python

Last commit: over 2 years ago

Repository: microsoft/Pengi

arxiv.org/abs/2305.11834

Related projects:

Repository	Description	Stars
yuangongnd/ltu	An audio and speech large language model implementation with pre-trained models, datasets, and inference options	396
balavenkatesh3322/audio-pretrained-model	A collection of pre-trained audio and speech models for various applications	183
qwenlm/qwen-audio	A multimodal audio language model developed by Alibaba Cloud that supports various tasks and languages	1,515
ibm/max-audio-classifier	Identifies sounds in short audio clips using machine learning and PCA transformation	154
yongxuustc/dcase2017_task4_cvssp	A system for audio classification and detection using machine learning models	4
elanmart/psmm	An implementation of a neural network model for character-level language modeling	50
qwenlm/qwen2-audio	An audio-language model that can analyze or respond to speech instructions based on audio input	1,306
awni/speech	A PyTorch implementation of end-to-end speech recognition models	756
openai/finetune-transformer-lm	Code and model for improving language understanding through generative pre-training with a transformer-based architecture	2,167
jordipons/music-audio-tagging-at-scale-models	Research on end-to-end learning for music audio tagging using large datasets and different front-end paradigms	149
microsoft/mpnet	A pre-training method for language understanding models that combines masked and permuted language modeling, with code for implementation and fine-tuning	288
jthorborg/ape	An Audio Programming Environment with support for AU and DSP plugins	14
keunwoochoi/auralisation	Reconstructs audio features learned by convolutional neural networks into audible sounds	42
cpjku/madmom	A Python audio signal processing library used in music information retrieval tasks	1,366
ynop/audiomate	A Python library for handling audio datasets, providing tools for accessing, manipulating, and preparing data for machine learning tasks	133