STT
STT toolkit
A toolkit for building and deploying speech-to-text models using deep learning techniques
🐸STT - The deep learning toolkit for Speech-to-Text. Training and deploying STT models has never been so easy.
2k stars
62 watching
278 forks
Language: C++
last commit: 9 months ago
Linked from 1 awesome list
asr, automatic-speech-recognition, deep-learning, speech-recognition, speech-recognition-api, speech-recognizer, speech-to-text, stt, tensorflow, voice-recognition
Related projects:
Repository | Description | Stars |
---|---|---|
coqui-ai/tts | A deep learning toolkit for generating human-like speech from text | 35,453 |
conchylicultor/deepqa | A deep learning-based chatbot model using TensorFlow and RNNs to generate responses to user queries. | 2,934 |
rvc-boss/gpt-sovits | An AI system for generating human-like voices from text inputs, using deep learning techniques and pre-trained models. | 35,728 |
openvinotoolkit/open_model_zoo | A collection of pre-trained deep learning models and demo applications for accelerating inference tasks | 4,098 |
tensorspeech/tensorflowtts | Real-time speech synthesis using state-of-the-art architectures | 3,839 |
microsoft/deepspeed | A deep learning optimization library that makes distributed training and inference easy, efficient, and effective. | 35,463 |
eleutherai/gpt-neox | Provides a framework for training large-scale language models on GPUs with advanced features and optimizations. | 6,941 |
deci-ai/super-gradients | A unified library for building and fine-tuning state-of-the-art computer vision models | 4,590 |
huggingface/transformers | A collection of pre-trained machine learning models for various natural language and computer vision tasks, enabling developers to fine-tune and deploy these models on their own projects. | 135,022 |
replicate/cog | A tool for packaging and deploying machine learning models in a standard, production-ready container environment. | 8,081 |
openvinotoolkit/openvino | A toolkit for optimizing and deploying artificial intelligence models in various applications | 7,279 |
dmlc/gluon-cv | A toolkit for building and deploying deep learning models in computer vision | 5,833 |
jasonppy/voicecraft | A neural codec model for speech editing and text-to-speech synthesis in real time, using a few seconds of reference audio. | 7,638 |
thudm/cogvlm | Develops a state-of-the-art visual language model with applications in image understanding and dialogue systems. | 6,080 |
mozilla/tts | An open-source project providing a suite of deep learning models and tools for advanced text-to-speech synthesis. | 9,401 |