wav2letter

ASR toolkit

An open-source toolkit for automatic speech recognition using deep learning and end-to-end training.

Facebook AI Research's Automatic Speech Recognition Toolkit

GitHub

6k stars
247 watching
1k forks
Language: C++
Last commit: about 2 months ago
Linked from 3 awesome lists

Tags: cpp, deep-learning, end-to-end, speech-recognition, wav2letter

Related projects:

| Repository | Description | Stars |
| --- | --- | --- |
| flashlight/flashlight | A C++ machine learning library with autograd support and high-performance defaults for efficient computation. | 5,300 |
| m-bain/whisperx | An automatic speech recognition system with word-level timestamps and speaker diarization. | 12,894 |
| aofdev/vue-pwa-speech | Enables synchronous speech recognition with Google Cloud Speech API on a Progressive Web App. | 99 |
| mahmoudashraf97/whisper-diarization | Automates speaker diarization from audio recordings using OpenAI Whisper ASR and additional neural networks. | 3,874 |
| seannaren/deepspeech.torch | A speech recognition system based on the DeepSpeech2 architecture. | 259 |
| huggingface/distil-whisper | A machine learning model that uses audio input to generate text transcriptions at high speeds and with good accuracy. | 3,644 |
| shashikg/whispers2t | An optimized speech-to-text pipeline designed to improve inference speed and accuracy. | 330 |
| donnyyou/torchcv | A comprehensive PyTorch-based framework for computer vision tasks. | 2,249 |
| aofdev/vue-speech-streaming | A Vue2 project providing streaming speech recognition with Google Cloud Speech API. | 73 |
| ggerganov/whisper.cpp | A high-performance inference implementation of an automatic speech recognition model in C++. | 36,332 |
| speechbrain/speechbrain | A PyTorch-based toolkit for building conversational AI systems with advanced speech and text processing capabilities. | 9,066 |
| purfview/whisper-standalone-win | Provides standalone executables for OpenAI's Whisper & Faster-Whisper speech recognition and transcription tools. | 1,405 |
| linto-ai/whisper-timestamped | An extension to the Whisper speech recognition model that adds word-level timestamps and confidence scores. | 2,121 |
| facebookresearch/laser | A library for calculating and using multilingual sentence embeddings. | 3,604 |
| xenova/whisper-web | An open-source speech recognition system built using machine learning models and JavaScript. | 2,651 |