whisper-timestamped
Timestamped ASR
An extension to the Whisper speech recognition model that adds word-level timestamps and confidence scores.
Multilingual Automatic Speech Recognition with word-level timestamps and confidence
2k stars
31 watching
162 forks
Language: Python
Last commit: 3 months ago
Linked from 1 awesome list
asr, attention-is-all-you-need, attention-mechanism, attention-model, attention-network, attention-seq2seq, attention-visualization, deep-learning, machine-learning, multilingual-models, python, python3, pytorch, speaker-diarization, speech, speech-processing, speech-recognition, speech-to-text, transformers, whisper
Related projects:
Repository | Description | Stars |
---|---|---|
| Provides an R interface to the Whisper Automatic Speech Recognition model | 119 |
| A React Native binding of Whisper's automatic speech recognition model | 408 |
| A command-line interface for fast and accurate automatic speech recognition using an optimized Whisper implementation | 328 |
| An optimized speech-to-text pipeline designed to improve inference speed and accuracy | 330 |
| Transcribes YouTube videos using OpenAI's Whisper speech recognition model | 369 |
| Provides standalone executables for OpenAI's Whisper & Faster-Whisper speech recognition and transcription tools | 1,405 |
| An audio processing model that adds audio event tagging capabilities to an existing speech recognition system with minimal additional computational cost. | 343 |
| A tool for fine-tuning the OpenAI Whisper speech recognition model using residual adapters and parameter-efficient learning methods. | 32 |
| Provides a high-performance speech recognition system for Unity3D applications. | 445 |
| A real-time transcription application built on Whisper's speech-to-text functionality | 2,186 |
| An automatic speech recognition system with word-level timestamps and speaker diarization. | 12,894 |
| An ASR project that uses diffusion models to transcribe speech | 76 |
| A Swift package wrapping a C implementation of a speech recognition system | 169 |
| A React hook that enables real-time speech-to-text functionality using the OpenAI Whisper API | 738 |
| A .NET implementation of OpenAI Whisper models for speech recognition (speech-to-text conversion). | 601 |