whisper-timestamped

Timestamped ASR

An extension to the Whisper speech recognition model that adds word-level timestamps and confidence scores.

Multilingual Automatic Speech Recognition with word-level timestamps and confidence

GitHub

2k stars

31 watching

162 forks

Language: Python

Last commit: over 1 year ago

Linked from 1 awesome list

asr, attention-is-all-you-need, attention-mechanism, attention-model, attention-network, attention-seq2seq, attention-visualization, deep-learning, machine-learning, multilingual-models, python, python3, pytorch, speaker-diarization, speech, speech-processing, speech-recognition, speech-to-text, transformers, whisper

Backlinks from these awesome lists:

sindresorhus/awesome-whisper

Related projects:

Repository	Description	Stars
bnosac/audio.whisper	Provides an R interface to the Whisper Automatic Speech Recognition model	119
mybigday/whisper.rn	A React Native binding of Whisper's automatic speech recognition model	408
ochen1/insanely-fast-whisper-cli	A command-line interface for fast and accurate automatic speech recognition using optimized Whisper inference	328
shashikg/whispers2t	An optimized speech-to-text pipeline designed to improve inference speed and accuracy	330
arthurfdlr/whisper-youtube	Transcribes YouTube videos using OpenAI's Whisper speech recognition model	369
purfview/whisper-standalone-win	Provides standalone executables for OpenAI's Whisper & Faster-Whisper speech recognition and transcription tools	1,405
yuangongnd/whisper-at	An audio processing model that adds audio event tagging capabilities to an existing speech recognition system with minimal additional computational cost	343
srijith-rkr/kaust-whisper-adapter	A tool for fine-tuning the OpenAI Whisper speech recognition model using residual adapters and parameter-efficient learning methods	32
macoron/whisper.unity	Provides a high-performance speech recognition system for Unity3D applications	445
collabora/whisperlive	A real-time transcription application built on Whisper's speech-to-text functionality	2,186
m-bain/whisperx	An automatic speech recognition system with word-level timestamps and speaker diarization	12,894
rf5/transfusion-asr	An ASR project that uses diffusion models to transcribe speech	76
ggerganov/whisper.spm	A Swift package for a C implementation of a speech recognition system	169
chengsokdara/use-whisper	A React hook that enables real-time speech-to-text functionality using the OpenAI Whisper API	738
sandrohanea/whisper.net	A .NET implementation of OpenAI Whisper models for speech-to-text transcription	601