espresso

ASR toolkit

An end-to-end neural speech recognition toolkit based on PyTorch and fairseq.

Espresso: A Fast End-to-End Neural Speech Recognition Toolkit

GitHub

941 stars

42 watching

116 forks

Language: Python

Last commit: 11 months ago

Topics: asr, end-to-end, fairseq, kaldi, python, pytorch, speech-recognition

Related projects:

Repository	Description	Stars
asyml/texar-pytorch	A toolkit providing easy-to-use machine learning modules and functionalities for natural language processing and text generation tasks.	745
arjo129/uspeech	A toolkit for speech recognition on Arduino using C++.	473
awni/speech	A PyTorch implementation of end-to-end speech recognition models.	756
linto-ai/whisper-timestamped	An extension to the Whisper speech recognition model that adds word-level timestamps and confidence scores.	2,121
thecodrr/vspeech	Provides an interface to Mozilla's DeepSpeech TensorFlow-based Speech-to-Text library using V bindings.	49
fuelen/owl	A toolkit for building and customizing command-line user interfaces in Elixir.	439
seannaren/deepspeech.pytorch	A deep learning-based speech recognition system built on top of PyTorch Lightning.	2,109
artfwo/aiosc	A minimalistic Open Sound Control communication module using asyncio for network operations.	36
kinwaicheuk/nnaudio	An audio processing toolkit using PyTorch convolutional neural networks to generate spectrograms from raw audio data.	1,036
abitdodgy/gibran	A natural language processing toolkit with tokenization and Levenshtein distance functionality.	65
misaogura/flashtorch	A toolkit for visualizing neural network behavior in PyTorch.	737
kaiyangzhou/dassl.pytorch	A PyTorch toolbox for supporting research and development of domain adaptation, generalization, and semi-supervised learning methods in computer vision.	1,236
marl/pysox	A Python wrapper around an audio signal processing library.	519
openseg-group/openseg.pytorch	Provides a PyTorch implementation of several computer vision tasks including object detection, segmentation and parsing.	1,191
flagai-open/aquila2	Provides pre-trained language models and tools for fine-tuning and evaluation.	439