STT

STT toolkit

A toolkit for building and deploying speech-to-text models using deep learning techniques

🐸STT - The deep learning toolkit for Speech-to-Text. Training and deploying STT models has never been so easy.

GitHub

2k stars
62 watching
278 forks
Language: C++
Last commit: 9 months ago
Linked from 1 awesome list

Topics: asr, automatic-speech-recognition, deep-learning, speech-recognition, speech-recognition-api, speech-recognizer, speech-to-text, stt, tensorflow, voice-recognition

Related projects:

Repository | Description | Stars
coqui-ai/tts | A deep learning toolkit for generating human-like speech from text | 35,453
conchylicultor/deepqa | A deep learning-based chatbot model using TensorFlow and RNNs to generate responses to user queries. | 2,934
rvc-boss/gpt-sovits | An AI system for generating human-like voices from text inputs, using deep learning techniques and pre-trained models. | 35,728
openvinotoolkit/open_model_zoo | A collection of pre-trained deep learning models and demo applications for accelerating inference tasks | 4,098
tensorspeech/tensorflowtts | Real-time speech synthesis using state-of-the-art architectures | 3,839
microsoft/deepspeed | A deep learning optimization library that makes distributed training and inference easy, efficient, and effective. | 35,463
eleutherai/gpt-neox | Provides a framework for training large-scale language models on GPUs with advanced features and optimizations. | 6,941
deci-ai/super-gradients | A unified library for building and fine-tuning state-of-the-art computer vision models | 4,590
huggingface/transformers | A collection of pre-trained machine learning models for various natural language and computer vision tasks, enabling developers to fine-tune and deploy these models on their own projects. | 135,022
replicate/cog | A tool for packaging and deploying machine learning models in a standard, production-ready container environment. | 8,081
openvinotoolkit/openvino | A toolkit for optimizing and deploying artificial intelligence models in various applications | 7,279
dmlc/gluon-cv | A toolkit for building and deploying deep learning models in computer vision | 5,833
jasonppy/voicecraft | A neural codec model for speech editing and text-to-speech synthesis in real-time, using few seconds of reference audio. | 7,638
thudm/cogvlm | Develops a state-of-the-art visual language model with applications in image understanding and dialogue systems. | 6,080
mozilla/tts | An open-source project providing a suite of deep learning models and tools for advanced text-to-speech synthesis. | 9,401