wav2letter

ASR toolkit

An open-source toolkit for automatic speech recognition using deep learning and end-to-end approaches.

Facebook AI Research's Automatic Speech Recognition Toolkit

6k stars
246 watching
1k forks
Language: C++
Last commit: 4 months ago
Linked from 3 awesome lists

Tags: cpp, deep-learning, end-to-end, speech-recognition, wav2letter

Related projects:

Repository | Description | Stars
flashlight/flashlight | A C++ machine learning library with autograd support and high-performance defaults for efficient computation. | 5,285
m-bain/whisperx | An automatic speech recognition system with word-level timestamps and speaker diarization. | 12,489
aofdev/vue-pwa-speech | Enables synchronous speech recognition with Google Cloud Speech API on a Progressive Web App. | 99
mahmoudashraf97/whisper-diarization | Automates speaker diarization from audio recordings using OpenAI Whisper ASR and additional neural networks. | 3,718
seannaren/deepspeech.torch | A speech recognition system based on the DeepSpeech2 architecture. | 259
huggingface/distil-whisper | A machine learning model that uses audio input to generate text transcriptions at high speeds and with good accuracy. | 3,613
shashikg/whispers2t | An optimized speech-to-text pipeline designed to improve inference speed and accuracy. | 310
donnyyou/torchcv | A comprehensive PyTorch-based framework for computer vision tasks. | 2,250
aofdev/vue-speech-streaming | A Vue2 project providing streaming speech recognition with Google Cloud Speech API. | 73
ggerganov/whisper.cpp | A high-performance implementation of the OpenAI Whisper ASR model in C++. | 35,706
speechbrain/speechbrain | A PyTorch-based toolkit for building conversational AI systems with advanced speech and text processing capabilities. | 8,922
purfview/whisper-standalone-win | Executable standalone versions of Whisper and Faster-Whisper speech recognition tools. | 1,326
linto-ai/whisper-timestamped | An extension of the Whisper model to predict word timestamps and confidence scores with improved accuracy. | 2,045
facebookresearch/laser | A library for calculating and using multilingual sentence embeddings. | 3,599
xenova/whisper-web | An open-source speech recognition system built using machine learning models and JavaScript. | 2,578