wav2letter

ASR toolkit

An open-source toolkit for automatic speech recognition using deep learning and end-to-end training.

Facebook AI Research's Automatic Speech Recognition Toolkit

GitHub

6k stars

247 watching

1k forks

Language: C++

Last commit: 9 months ago

Linked from 3 awesome lists

Topics: cpp, deep-learning, end-to-end, speech-recognition, wav2letter

github.com/facebookresearch/wav2letter/wiki

Related projects:

Repository	Description	Stars
flashlight/flashlight	A C++ machine learning library with autograd support and high-performance defaults for efficient computation.	5,300
m-bain/whisperx	An automatic speech recognition system with word-level timestamps and speaker diarization.	12,894
aofdev/vue-pwa-speech	A Progressive Web App that performs synchronous speech recognition with the Google Cloud Speech API.	99
mahmoudashraf97/whisper-diarization	Automates speaker diarization from audio recordings using OpenAI Whisper ASR and additional neural networks.	3,874
seannaren/deepspeech.torch	A speech recognition system based on the DeepSpeech2 architecture.	259
huggingface/distil-whisper	A machine learning model that uses audio input to generate text transcriptions at high speeds and with good accuracy.	3,644
shashikg/whispers2t	An optimized speech-to-text pipeline designed to improve inference speed and accuracy.	330
donnyyou/torchcv	A comprehensive PyTorch-based framework for computer vision tasks.	2,249
aofdev/vue-speech-streaming	A Vue 2 project providing streaming speech recognition with the Google Cloud Speech API.	73
ggerganov/whisper.cpp	A high-performance C++ inference implementation of an automatic speech recognition model.	36,332
speechbrain/speechbrain	A PyTorch-based toolkit for building conversational AI systems with advanced speech and text processing capabilities.	9,066
purfview/whisper-standalone-win	Provides standalone executables for OpenAI's Whisper and Faster-Whisper speech recognition and transcription tools.	1,405
linto-ai/whisper-timestamped	An extension to the Whisper speech recognition model that adds word-level timestamps and confidence scores.	2,121
facebookresearch/laser	A library for calculating and using multilingual sentence embeddings.	3,604
xenova/whisper-web	An open-source speech recognition system built using machine learning models and JavaScript.	2,651