speechbrain

Conversational AI toolkit

A PyTorch-based toolkit for building conversational AI systems with advanced speech and text processing capabilities.

A PyTorch-based Speech Toolkit

GitHub

9k stars
134 watching
1k forks
Language: Python
last commit: over 1 year ago
asr, audio, audio-processing, deep-learning, huggingface, language-model, pytorch, speaker-diarization, speaker-recognition, speaker-verification, speech-enhancement, speech-processing, speech-recognition, speech-separation, speech-to-text, speech-toolkit, speechrecognition, spoken-language-understanding, transformers, voice-recognition

Related projects:

Repository | Description | Stars
mravanelli/pytorch-kaldi | Develops state-of-the-art speech recognition systems using the PyTorch and Kaldi toolkits | 2,370
pyannote/pyannote-audio | A PyTorch toolkit for speaker diarization and speech activity detection | 6,508
huggingface/transformers | A collection of pre-trained machine learning models for natural language and computer vision tasks that developers can fine-tune and deploy in their own projects | 136,357
hahahumble/speechgpt | An application that lets users converse with ChatGPT via speech and text interfaces | 2,752
pytorch/audio | A PyTorch module providing tools and functions for audio signal processing | 2,561
suno-ai/bark | A text-to-audio model that generates realistic speech and other audio | 36,433
tensorspeech/tensorflowtts | Real-time speech synthesis using state-of-the-art architectures | 3,855
m-bain/whisperx | An automatic speech recognition system with word-level timestamps and speaker diarization | 12,894
mozilla/tts | An open-source suite of deep learning models and tools for advanced text-to-speech synthesis | 9,466
seannaren/deepspeech.pytorch | A deep learning-based speech recognition system built on top of PyTorch Lightning | 2,109
aigc-audio/audiogpt | An audio processing toolkit that provides pre-trained models and tools for tasks such as speech synthesis, music generation, sound detection, and talking head creation | 10,061
haifengl/smile | A comprehensive machine learning framework that provides a wide range of algorithms and data structures for tasks such as classification, regression, clustering, and visualization | 6,066
openai/whisper | A general-purpose speech recognition system trained on large-scale weak supervision | 72,752
espnet/espnet | A toolkit for end-to-end speech processing with deep learning and Kaldi-style data processing | 8,596
codertimo/bert-pytorch | An implementation of Google's 2018 BERT model in PyTorch, allowing pre-training and fine-tuning for natural language processing tasks | 6,251