whispering
Transcription tool
An open-source tool for real-time audio and image transcription with support for multiple languages and various applications
Whispering Tiger: OpenAI's Whisper (and other models) with OSC and WebSocket support, allowing live transcription and translation in VRChat and overlays in most streaming applications.
404 stars
12 watching
30 forks
Language: Python
Last commit: 2 months ago
Linked from 1 awesome list
Related projects:
| Description | Stars |
|---|---|
| An implementation of Whisper's speech-to-text functionality in a real-time transcription application | 2,186 |
| A service for transcribing and processing audio files using OpenAI Whisper, providing both GUI and API options. | 1,854 |
| A React hook that enables real-time speech-to-text functionality using the OpenAI Whisper API | 738 |
| A tool for fine-tuning the OpenAI Whisper speech recognition model using residual adapters and parameter-efficient learning methods. | 32 |
| A .NET implementation of OpenAI Whisper models for speech recognition and speech-to-text conversion. | 601 |
| An AI-powered speech recognition and translation tool that utilizes CTranslate2 and Faster-whisper implementations for faster and more efficient processing. | 938 |
| Provides standalone executables for OpenAI's Whisper & Faster-Whisper speech recognition and transcription tools | 1,405 |
| An optimized speech-to-text pipeline designed to improve inference speed and accuracy | 330 |
| An AI-powered audio and video transcription tool with cross-platform support for desktop devices. | 1,390 |
| An iOS application that assists users in transcribing audio files for writing or language learning purposes. | 7 |
| A tool for transcribing written text into the International Phonetic Alphabet (IPA) format. | 668 |
| A wiki-like application for collaborative transcription of handwritten documents from scanned pages. | 171 |
| Transcribes YouTube videos using OpenAI's Whisper speech recognition model | 369 |
| A tool to securely share sensitive information through encrypted links with expiration dates and limited access attempts. | 385 |
| An audio processing model that adds audio event tagging capabilities to an existing speech recognition system with minimal additional computational cost. | 343 |