wav2letter
ASR toolkit
An open-source toolkit for automatic speech recognition using deep learning and end-to-end training.
Facebook AI Research's Automatic Speech Recognition Toolkit
6k stars
247 watching
1k forks
Language: C++
last commit: about 2 months ago
Linked from 3 awesome lists
cpp, deep-learning, end-to-end, speech-recognition, wav2letter
Related projects:
Repository | Description | Stars |
---|---|---|
flashlight/flashlight | A C++ machine learning library with autograd support and high-performance defaults for efficient computation. | 5,300 |
m-bain/whisperx | An automatic speech recognition system with word-level timestamps and speaker diarization. | 12,894 |
aofdev/vue-pwa-speech | Enables synchronous speech recognition with the Google Cloud Speech API in a Progressive Web App. | 99 |
mahmoudashraf97/whisper-diarization | Automates speaker diarization from audio recordings using OpenAI Whisper ASR and additional neural networks. | 3,874 |
seannaren/deepspeech.torch | A speech recognition system based on the DeepSpeech2 architecture. | 259 |
huggingface/distil-whisper | A machine learning model that uses audio input to generate text transcriptions at high speeds and with good accuracy. | 3,644 |
shashikg/whispers2t | An optimized speech-to-text pipeline designed to improve inference speed and accuracy. | 330 |
donnyyou/torchcv | A comprehensive PyTorch-based framework for computer vision tasks. | 2,249 |
aofdev/vue-speech-streaming | A Vue2 project providing streaming speech recognition with the Google Cloud Speech API. | 73 |
ggerganov/whisper.cpp | A high-performance inference implementation of an automatic speech recognition model in C++. | 36,332 |
speechbrain/speechbrain | A PyTorch-based toolkit for building conversational AI systems with advanced speech and text processing capabilities. | 9,066 |
purfview/whisper-standalone-win | Provides standalone executables for OpenAI's Whisper and Faster-Whisper speech recognition and transcription tools. | 1,405 |
linto-ai/whisper-timestamped | An extension to the Whisper speech recognition model that adds word-level timestamps and confidence scores. | 2,121 |
facebookresearch/laser | A library for calculating and using multilingual sentence embeddings. | 3,604 |
xenova/whisper-web | An open-source speech recognition system built using machine learning models and JavaScript. | 2,651 |