espresso

ASR toolkit

An end-to-end neural speech recognition toolkit based on PyTorch and fairseq.

Espresso: A Fast End-to-End Neural Speech Recognition Toolkit

GitHub

941 stars
42 watching
116 forks
Language: Python
Last commit: 5 months ago
Tags: asr, end-to-end, fairseq, kaldi, python, pytorch, speech-recognition

Related projects:

| Repository | Description | Stars |
| --- | --- | --- |
| asyml/texar-pytorch | A toolkit providing easy-to-use machine learning modules and functionalities for natural language processing and text generation tasks. | 745 |
| arjo129/uspeech | A toolkit for speech recognition on Arduino using C++. | 473 |
| awni/speech | A PyTorch implementation of end-to-end speech recognition models. | 756 |
| linto-ai/whisper-timestamped | An extension to the Whisper speech recognition model that adds word-level timestamps and confidence scores. | 2,121 |
| thecodrr/vspeech | Provides an interface to Mozilla's DeepSpeech TensorFlow-based Speech-to-Text library using V bindings. | 49 |
| fuelen/owl | A toolkit for building and customizing command-line user interfaces in Elixir. | 439 |
| seannaren/deepspeech.pytorch | A deep learning-based speech recognition system built on top of PyTorch Lightning. | 2,109 |
| artfwo/aiosc | A minimalistic Open Sound Control communication module using asyncio for network operations. | 36 |
| kinwaicheuk/nnaudio | An audio processing toolkit using PyTorch convolutional neural networks to generate spectrograms from raw audio data. | 1,036 |
| abitdodgy/gibran | A natural language processing toolkit with tokenization and Levenshtein distance functionality. | 65 |
| misaogura/flashtorch | A toolkit for visualizing neural network behavior in PyTorch. | 737 |
| kaiyangzhou/dassl.pytorch | A PyTorch toolbox for supporting research and development of domain adaptation, generalization, and semi-supervised learning methods in computer vision. | 1,236 |
| marl/pysox | A Python wrapper around an audio signal processing library. | 519 |
| openseg-group/openseg.pytorch | Provides a PyTorch implementation of several computer vision tasks including object detection, segmentation and parsing. | 1,191 |
| flagai-open/aquila2 | Provides pre-trained language models and tools for fine-tuning and evaluation. | 439 |