python_speech_features
Speech feature extractor
A Python library that computes speech features commonly used in automatic speech recognition (ASR) systems, including Mel-frequency cepstral coefficients (MFCCs) and filterbank energies.
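To make the idea behind filterbank energies and MFCCs concrete, the sketch below computes the edge and center frequencies of a mel-spaced filterbank using only the standard library. It is an illustration, not code from python_speech_features: the function names, the 26-filter default, the 0 to 8000 Hz range, and the `2595 * log10(1 + f / 700)` form of the mel scale are all assumptions chosen for the example.

```python
import math


def hz_to_mel(hz):
    """Convert Hz to mel using the common 2595 * log10(1 + f / 700) form (assumed here)."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filter_points(num_filters=26, low_hz=0.0, high_hz=8000.0):
    """Return num_filters + 2 frequencies (Hz) spaced evenly on the mel scale.

    Triangular filter i would start at points[i], peak at points[i + 1],
    and end at points[i + 2]. All defaults are illustrative assumptions.
    """
    low_mel = hz_to_mel(low_hz)
    high_mel = hz_to_mel(high_hz)
    step = (high_mel - low_mel) / (num_filters + 1)
    return [mel_to_hz(low_mel + i * step) for i in range(num_filters + 2)]


points = mel_filter_points()
# Spacing between neighbouring points grows with frequency,
# so low frequencies get finer resolution than high ones.
first_gap = points[1] - points[0]
last_gap = points[-1] - points[-2]
```

A filterbank energy is then the power spectrum of a frame weighted by one such triangular filter and summed; MFCCs are typically obtained by taking the logarithm of those energies and applying a discrete cosine transform.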
2k stars
87 watching
617 forks
Language: Python
Last commit: over 3 years ago
Linked from 1 awesome list
Related projects:
Repository | Description | Stars |
---|---|---|
jameslyons/matlab_speech_features | A set of speech feature extraction functions for various machine learning applications. | 43 |
astorfi/speechpy | Provides tools and libraries for extracting speech features from audio data. | 881 |
tyiannak/pyaudioanalysis | A comprehensive Python library for feature extraction, classification, segmentation, and applications of audio data. | 5,918 |
pytorch/audio | A PyTorch module providing tools and functions for audio signal processing. | 2,561 |
superkogito/spafe | A Python library for extracting audio features from mono audio files using various filter banks and spectrogram algorithms. | 461 |
rf5/transfusion-asr | An ASR project that uses diffusion models to transcribe speech. | 76 |
peak1995/speech-enhancement-dsp | MATLAB implementations of traditional speech enhancement techniques, including spectral subtraction, Wiener filtering, and Kalman filtering. | 84 |
awni/speech | A PyTorch implementation of end-to-end speech recognition models. | 756 |
bmcfee/pyrubberband | Provides a lightweight Python wrapper for audio processing tasks. | 167 |
belangeo/pyo | A Python module for digital signal processing and audio synthesis, allowing users to create complex audio chains in real-time. | 1,329 |
iver56/audiomentations | A library for audio data augmentation used in machine learning. | 1,903 |
cpjku/madmom | A Python audio signal processing library used in music information retrieval tasks. | 1,366 |
linto-ai/whisper-timestamped | An extension to the Whisper speech recognition model that adds word-level timestamps and confidence scores. | 2,121 |
vocalpy/vak | A Python framework for training and applying neural networks to acoustic communication research. | 78 |
r9y9/tacotron_pytorch | An implementation of the Tacotron speech synthesis model using PyTorch. | 309 |