whisperX

ASR system

An automatic speech recognition system with word-level timestamps and speaker diarization.

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

GitHub

12k stars
139 watching
1k forks
Language: Python
Last commit: 3 months ago
Linked from 1 awesome list

Tags: asr, speech, speech-recognition, speech-to-text, whisper

Related projects:

Repository | Description | Stars
mahmoudashraf97/whisper-diarization | Automates speaker diarization from audio recordings using OpenAI Whisper ASR and additional neural networks | 3,718
huggingface/distil-whisper | A distilled version of the Whisper model that transcribes audio to text at high speed and with good accuracy | 3,613
ggerganov/whisper.cpp | A high-performance implementation of the OpenAI Whisper ASR model in C++ | 35,706
openai/whisper | A general-purpose speech recognition system trained on large-scale weak supervision | 71,257
vaibhavs10/insanely-fast-whisper | A command-line tool for fast audio transcription using the Whisper model | 7,731
systran/faster-whisper | A fast reimplementation of the Whisper model built on the CTranslate2 inference engine | 12,506
linto-ai/whisper-timestamped | An extension of the Whisper model that predicts word timestamps and confidence scores with improved accuracy | 2,045
const-me/whisper | An implementation of OpenAI's Whisper ASR model using DirectCompute for GPGPU inference | 8,460
purfview/whisper-standalone-win | Standalone executable versions of the Whisper and Faster-Whisper speech recognition tools | 1,326
mybigday/whisper.rn | A React Native binding for the Whisper automatic speech recognition model | 395
ochen1/insanely-fast-whisper-cli | A command-line interface for fast and accurate automatic speech recognition using optimized Whisper inference | 322
arthurfdlr/whisper-youtube | Transcribes YouTube videos using OpenAI's Whisper speech recognition model | 362
flashlight/wav2letter | An open-source toolkit for automatic speech recognition using deep learning and end-to-end approaches | 6,390
macoron/whisper.unity | Provides a high-performance speech recognition system for Unity3D applications | 433
xenova/whisper-web | An open-source speech recognition app that runs Whisper in the browser using JavaScript | 2,578