awesome-python-scientific-audio
Audio analysis library
A curated collection of Python packages and tools for scientific research in audio and music applications
Python for Scientific Audio / Audio Related Packages | |||
audiolazy | 691 | over 2 years ago | Expressive Digital Signal Processing (DSP) package for Python |
audioread | 489 | 9 months ago | Cross-library (GStreamer + Core Audio + MAD + FFmpeg) audio decoding |
mutagen | Reads and writes all kinds of audio metadata for various formats | ||
pyAV | PyAV is a Pythonic binding for FFmpeg or Libav | ||
(Py)Soundfile | Library based on libsndfile, CFFI, and NumPy | ||
pySox | 519 | 5 months ago | Wrapper for sox |
stempeg | 96 | 3 months ago | read/write of STEMS multistream audio |
tinytag | 703 | 18 days ago | Reads music metadata of MP3, OGG, FLAC and Wave files |
acoustics | useful tools for acousticians | ||
AudioTK | 252 | almost 4 years ago | DSP filter toolbox (lots of filters) |
AudioTSM | real-time audio time-scale modification procedures | ||
Gammatone | 220 | over 1 year ago | Gammatone filterbank implementation |
pyFFTW | Wrapper for FFTW(3) | ||
NSGT | Non-stationary Gabor transform, constant-Q | ||
matchering | 1,759 | 15 days ago | Automated reference audio mastering |
MDCT | 51 | over 2 years ago | MDCT transform |
pydub | Manipulate audio with a simple and easy high level interface | ||
pytftb | Implementation of the MATLAB Time-Frequency Toolbox | ||
pyroomacoustics | 1,460 | 4 days ago | Room Acoustics Simulation (RIR generator) |
PyRubberband | 166 | about 2 months ago | Wrapper for rubberband to do pitch-shifting and time-stretching |
PyWavelets | Discrete Wavelet Transform in Python | ||
Resampy | Sample rate conversion | ||
SFS-Python | Sound Field Synthesis Toolbox | ||
sound_field_analysis | Analyze, visualize and process sound field data recorded by spherical microphone arrays | ||
STFT | Standalone package for Short-Time Fourier Transform | ||
aubio | Feature extractor, written in C, Python interface | ||
audioFlux | 2,915 | 6 months ago | A library for audio and music analysis, feature extraction |
audiolazy | 691 | over 2 years ago | Realtime Audio Processing lib, general purpose |
essentia | Music related low level and high level feature extractor, C++ based, includes Python bindings | ||
python_speech_features | 2,376 | about 3 years ago | Common speech features for ASR |
pyYAAFE | 244 | over 3 years ago | Python bindings for YAAFE feature extractor |
speechpy | 880 | about 3 years ago | Library for Speech Processing and Recognition, mostly feature extraction for now |
spafe | 458 | 5 months ago | Python library for features extraction from audio files |
audiomentations | 1,873 | 8 days ago | Audio Data Augmentation |
muda | Musical Data Augmentation | ||
pydiogment | 83 | over 1 year ago | Audio Data Augmentation |
aeneas | Forced aligner, based on MFCC+DTW, 35+ languages | ||
deepspeech | 25,358 | 3 months ago | Pretrained automatic speech recognition |
gentle | 1,453 | 7 months ago | Forced-aligner built on Kaldi |
Parselmouth | 1,066 | 23 days ago | Python interface to the phonetics and speech analysis, synthesis, and manipulation software |
persephone | Automatic phoneme transcription tool | ||
pyannote.audio | 6,333 | 10 days ago | Neural building blocks for speaker diarization |
pyAudioAnalysis | 5,885 | 8 months ago | Feature Extraction, Classification, Diarization |
py-webrtcvad | 2,066 | 5 months ago | Interface to the WebRTC Voice Activity Detector |
pypesq | 356 | over 1 year ago | Wrapper for the PESQ score calculation |
pystoi | 326 | 11 months ago | Short Term Objective Intelligibility measure (STOI) |
PyWorldVocoder | 725 | about 1 year ago | Wrapper for Morise's World Vocoder |
Montreal Forced Aligner | Forced aligner, based on Kaldi (HMM), English (others can be trained) | ||
SIDEKIT | Speaker and Language recognition | ||
SpeechRecognition | 8,440 | 12 days ago | Wrapper for several ASR engines and APIs, online and offline |
sed_eval | Evaluation toolbox for Sound Event Detection | ||
cochlea | 108 | 4 months ago | Inner ear models |
Brian2 | Spiking neural networks simulator, includes cochlea model | ||
Loudness | 36 | over 5 years ago | Perceived loudness, includes Zwicker, Moore/Glasberg model |
pyloudnorm | Audio loudness meter and normalization, implements ITU-R BS.1770-4 | ||
Sound Field Synthesis Toolbox | Sound Field Synthesis Toolbox | ||
commonfate | 17 | over 4 years ago | Common Fate Model and Transform |
NTFLib | 47 | about 9 years ago | Sparse Beta-Divergence Tensor Factorization |
NUSSL | Holistic source separation framework including DSP methods and deep learning methods | ||
NIMFA | Several flavors of non-negative-matrix factorization | ||
Catchy | 21 | almost 8 years ago | Corpus Analysis Tools for Computational Hook Discovery |
chord-detection | 110 | over 1 year ago | Algorithms for chord detection and key estimation |
Madmom | MIR packages with strong focus on beat detection, onset detection and chord recognition | ||
mir_eval | Common scores for various MIR tasks. Also includes bss_eval implementation | ||
msaf | Music Structure Analysis Framework | ||
librosa | General audio and music analysis | ||
Kapre | 922 | about 1 year ago | Keras Audio Preprocessors |
TorchAudio | 2,538 | 6 days ago | PyTorch Audio Loaders |
nnAudio | 1,032 | 9 months ago | Accelerated audio processing using 1D convolution networks in PyTorch |
Music21 | Toolkit for Computer-Aided Musicology | ||
Mido | Realtime MIDI wrapper | ||
mingus | 862 | 7 months ago | Advanced music theory and notation package with MIDI file and playback support |
Pretty-MIDI | Utility functions for handling MIDI data in a nice/intuitive way | ||
Jupylet | 230 | 10 months ago | Subtractive, additive, FM, and sample-based sound synthesis |
PYO | Realtime audio dsp engine | ||
python-sounddevice | 1,052 | 20 days ago | PortAudio wrapper providing realtime audio I/O with NumPy |
ReTiSAR | 70 | 12 months ago | Binaural rendering of streamed or IR-based high-order spherical microphone array signals |
TimeSide (Beta) | 371 | about 1 month ago | high level audio analysis, imaging, transcoding, streaming and labelling |
beets | Music library manager and tagger | ||
musdb | Parse and process the MUSDB18 dataset | ||
medleydb | Parse audio + annotations | ||
Soundcloud API | 105 | 10 months ago | Wrapper for the Soundcloud API |
Youtube-Downloader | Download YouTube videos (and the audio) | ||
audiomate | 131 | over 1 year ago | Loading different types of audio datasets |
mirdata | Common loaders for Music Information Retrieval (MIR) datasets | ||
VamPy Host | Interface compiled vamp plugins | ||
Python for Scientific Audio / Tutorials | |||
Whirlwind Tour Of Python | fast-paced introduction to Python essentials, aimed at researchers and developers | ||
Introduction to Numpy and Scipy | Highly recommended tutorial, covers large parts of the scientific Python ecosystem | ||
Numpy for MATLAB® Users | Short overview of equivalent Python functions for switchers | ||
MIR Notebooks | Collection of instructional IPython notebooks for music information retrieval (MIR) | ||
Selected Topics in Audio Signal Processing | 64 | about 3 years ago | Exercises as IPython notebooks |
Live-coding a music synthesizer | Live-coding video showing how to use the SoundDevice library to reproduce realistic sounds. | ||
Python for Scientific Audio / Books | |||
Python Data Science Handbook | 43,214 | 5 months ago | Jake Vanderplas, Excellent Book and accompanying tutorial notebooks |
Fundamentals of Music Processing | Meinard Müller, comes with Python exercises | ||
Python for Scientific Audio / Scientific Papers | |||
Python for audio signal processing | John C. Glover, Victor Lazzarini and Joseph Timoney, Linux Audio Conference 2011 | ||
librosa: Audio and Music Signal Analysis in Python | Brian McFee, Colin Raffel, Dawen Liang, Daniel P.W. Ellis, Matt McVicar, Eric Battenberg, Oriol Nieto, SciPy 2015 | ||
pyannote.audio: neural building blocks for speaker diarization | Hervé Bredin, Ruiqing Yin, Juan Manuel Coria, Gregory Gelly, Pavel Korshunov, Marvin Lavechin, Diego Fustes, Hadrien Titeux, Wassim Bouaziz, Marie-Philippe Gill, ICASSP 2020 | ||
Python for Scientific Audio / Other Resources | |||
Coursera Course | Audio Signal Processing, Python-based course from UPF of Barcelona and Stanford University | ||
Digital Signal Processing Course | Masters Course Material (University of Rostock) with many Python examples | ||
Slack Channel | Music Information Retrieval Community |