transfusion-asr
Speech Transcription Tool
An ASR project that uses diffusion models to transcribe speech
Training code and models for "Transcribing Speech with Multinomial Diffusion".
76 stars
8 watching
5 forks
Language: Python
Last commit: over 1 year ago
Linked from 1 awesome list
Tags: asr, binomial-distribution, diffusion, discrete-diffusion, pytorch, speech-recognition
Related projects:
| Repository | Description | Stars |
|---|---|---|
| | An optimized speech-to-text pipeline designed to improve inference speed and accuracy | 330 |
| | Provides an R interface to the Whisper automatic speech recognition model | 119 |
| | An extension to the Whisper speech recognition model that adds word-level timestamps and confidence scores | 2,121 |
| | An implementation of Whisper's speech-to-text functionality in a real-time transcription application | 2,186 |
| | Transcribes YouTube videos using OpenAI's Whisper speech recognition model | 369 |
| | Enables speech-to-text transcription using a pre-trained Deep Speech model in MATLAB | 7 |
| | A tool for phonetic transcription of languages with close-to-phonetic writing systems | 10 |
| | Provides pre-trained ASR models for efficient inference using TFLite | 11 |
| | Developing low-resource speech command recognition systems using adversarial reprogramming and transfer learning | 18 |
| | A toolkit for creating and manipulating state-of-the-art diffusion models in PyTorch | 8 |
| | An online transcription tool for a specific application, allowing users to input audio or video and receive a written text summary | 2 |
| | A tool for fine-tuning the OpenAI Whisper speech recognition model using residual adapters and parameter-efficient learning methods | 32 |
| | A PyTorch implementation of end-to-end speech recognition models | 756 |
| | Software that allows video translation with synchronized audio, utilizing speech-to-text and text-to-speech technologies | 924 |
| | A React Native binding of Whisper's automatic speech recognition model | 408 |