speechbrain
Conversational AI toolkit
A PyTorch-based toolkit for building conversational AI systems with advanced speech and text processing capabilities.
A PyTorch-based Speech Toolkit
9k stars
134 watching
1k forks
Language: Python
last commit: over 1 year ago
Topics: asr, audio, audio-processing, deep-learning, huggingface, language-model, pytorch, speaker-diarization, speaker-recognition, speaker-verification, speech-enhancement, speech-processing, speech-recognition, speech-separation, speech-to-text, speech-toolkit, speechrecognition, spoken-language-understanding, transformers, voice-recognition
Related projects:
| Repository | Description | Stars |
|---|---|---|
| | Develops state-of-the-art speech recognition systems using PyTorch and Kaldi toolkits | 2,370 |
| | A toolkit for speaker diarization using PyTorch and speech activity detection. | 6,508 |
| | A collection of pre-trained machine learning models for various natural language and computer vision tasks, enabling developers to fine-tune and deploy these models on their own projects. | 136,357 |
| | An application that enables users to converse with ChatGPT via speech and text interfaces. | 2,752 |
| | A PyTorch module providing tools and functions for audio signal processing | 2,561 |
| | A text-to-audio model that generates realistic speech and other audio | 36,433 |
| | Real-time speech synthesis using state-of-the-art architectures | 3,855 |
| | An automatic speech recognition system with word-level timestamps and speaker diarization. | 12,894 |
| | An open-source project providing a suite of deep learning models and tools for advanced text-to-speech synthesis. | 9,466 |
| | A deep learning-based speech recognition system built on top of PyTorch Lightning. | 2,109 |
| | An audio processing toolkit that provides pre-trained models and tools for tasks like speech synthesis, music generation, sound detection, and talking head creation. | 10,061 |
| | A comprehensive machine learning framework that provides a wide range of algorithms and data structures for tasks such as classification, regression, clustering, and visualization. | 6,066 |
| | A general-purpose speech recognition system trained on large-scale weak supervision | 72,752 |
| | A toolkit for end-to-end speech processing with deep learning and Kaldi-style data processing | 8,596 |
| | An implementation of Google's 2018 BERT model in PyTorch, allowing pre-training and fine-tuning for natural language processing tasks | 6,251 |