python_speech_features
Speech feature extractor
A Python library that computes speech features commonly used in Automatic Speech Recognition (ASR) systems, including MFCCs (mel-frequency cepstral coefficients) and filterbank energies.
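To make the term "filterbank energies" more concrete, here is a minimal standard-library sketch of how mel filterbank center frequencies are commonly laid out. It is not this library's code: the function names, the 26-filter default, the 0 to 8000 Hz range, and the choice of the 2595 * log10(1 + f / 700) mel formula are all assumptions made for illustration.

```python
import math


def hz_to_mel(hz):
    """Convert a frequency in Hz to mels (assumed formula, see lead-in)."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel: convert mels back to Hz."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_center_frequencies(num_filters=26, low_hz=0.0, high_hz=8000.0):
    """Return the center frequency in Hz of each triangular mel filter.

    The filters are spaced evenly on the mel axis, so they are packed
    densely at low frequencies and spread out at high frequencies.
    """
    low_mel = hz_to_mel(low_hz)
    high_mel = hz_to_mel(high_hz)
    # num_filters + 2 evenly spaced mel points: the first and last are the
    # outer edges of the bank, the points in between are the filter centers.
    step = (high_mel - low_mel) / (num_filters + 1)
    points = [mel_to_hz(low_mel + i * step) for i in range(num_filters + 2)]
    return points[1:-1]


centers = mel_center_frequencies()
```

A filterbank energy is then the power spectrum of one audio frame weighted by one of these triangular filters and summed; MFCCs are typically obtained by taking the logarithm of those energies and applying a discrete cosine transform.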
2k stars
87 watching
618 forks
Language: Python
Last commit: about 3 years ago
Linked from 1 awesome list
Related projects:
Repository | Description | Stars |
---|---|---|
jameslyons/matlab_speech_features | A set of speech feature extraction functions for various machine learning applications. | 43 |
astorfi/speechpy | Provides tools and libraries for extracting speech features from audio data. | 880 |
tyiannak/pyaudioanalysis | A comprehensive Python library for feature extraction, classification, segmentation, and applications of audio data. | 5,885 |
pytorch/audio | A PyTorch library providing tools and functions for audio signal processing. | 2,545 |
superkogito/spafe | A Python library for extracting audio features from mono audio files using various filter banks and spectrogram algorithms. | 458 |
rf5/transfusion-asr | An ASR project that uses diffusion models to transcribe speech. | 75 |
peak1995/speech-enhancement-dsp | This repository provides MATLAB implementations of traditional speech enhancement techniques including spectral subtraction, Wiener filtering, and Kalman filtering. | 82 |
awni/speech | A PyTorch implementation of end-to-end speech recognition models. | 754 |
bmcfee/pyrubberband | A lightweight Python wrapper around the Rubber Band library for time-stretching and pitch-shifting audio. | 166 |
belangeo/pyo | A Python module providing a wide range of signal processing primitives and tools for creating complex audio manipulations. | 1,322 |
iver56/audiomentations | A library for audio data augmentation in machine learning. | 1,873 |
cpjku/madmom | A Python audio signal processing library used in music information retrieval tasks. | 1,347 |
linto-ai/whisper-timestamped | An extension of the Whisper model that predicts word-level timestamps and confidence scores. | 2,045 |
vocalpy/vak | A Python framework for training and applying neural networks in acoustic communication research. | 78 |
r9y9/tacotron_pytorch | An implementation of Tacotron speech synthesis model using PyTorch. | 309 |