Pengi

Audio Model

An Audio Language Model framework that uses transfer learning to generate text from audio inputs

An Audio Language Model for Audio Tasks

GitHub

Stars: 290
Watching: 14
Forks: 16
Language: Python
Last commit: 7 months ago

Related projects:

| Repository | Description | Stars |
| --- | --- | --- |
| yuangongnd/ltu | An audio and speech large language model implementation with pre-trained models, datasets, and inference options | 385 |
| balavenkatesh3322/audio-pretrained-model | A collection of pre-trained audio and speech models for various applications | 182 |
| qwenlm/qwen-audio | A multimodal audio language model developed by Alibaba Cloud that supports various tasks and languages | 1,486 |
| ibm/max-audio-classifier | Identifies sounds in short audio clips using machine learning and PCA transformation | 153 |
| yongxuustc/dcase2017_task4_cvssp | A system for audio classification and detection using machine learning models | 4 |
| elanmart/psmm | An implementation of a neural network model for character-level language modeling | 50 |
| qwenlm/qwen2-audio | An audio-language model that can analyze or respond to speech instructions based on audio input | 1,229 |
| awni/speech | A PyTorch implementation of end-to-end speech recognition models | 754 |
| openai/finetune-transformer-lm | Code and model for improving language understanding through generative pre-training with a transformer-based architecture | 2,160 |
| jordipons/music-audio-tagging-at-scale-models | Research on end-to-end learning for music audio tagging using large datasets and different front-end paradigms | 148 |
| microsoft/mpnet | A method for pre-training language understanding models by combining masked and permuted techniques, with code for implementation and fine-tuning | 288 |
| jthorborg/ape | An Audio Programming Environment with support for AU and DSP plugins | 14 |
| keunwoochoi/auralisation | Reconstructs audio features learned by convolutional neural networks into audible sounds | 42 |
| cpjku/madmom | A Python audio signal processing library used in music information retrieval tasks | 1,347 |
| ynop/audiomate | A Python library for handling audio datasets, providing tools for accessing, manipulating, and preparing data for machine learning tasks | 131 |