waveglow
Speech generator
Generates high-quality speech from mel-spectrograms using a flow-based network architecture
A Flow-based Generative Network for Speech Synthesis
2k stars
77 watching
530 forks
Language: Python
Last commit: about 1 year ago
Linked from 2 awesome lists
Related projects:
Repository | Description | Stars |
---|---|---|
npuichigo/waveglow | A PyTorch implementation of a speech synthesis network based on flow-based generative architecture. | 206 |
nvidia/tacotron2 | A PyTorch implementation of Tacotron 2, a deep neural network architecture for speech synthesis. | 5,099 |
lucidrains/musiclm-pytorch | Implementation of Google's MusicLM model for music generation using attention networks and text-conditioning. | 3,166 |
ibab/tensorflow-wavenet | A TensorFlow implementation of the WaveNet generative neural network architecture for audio generation | 5,414 |
pyannote/pyannote-audio | A PyTorch-based toolkit for speaker diarization, including speech activity detection. | 6,333 |
pytorch/audio | A PyTorch module providing tools and functions for audio signal processing | 2,538 |
aigc-audio/audiogpt | An audio processing toolkit that provides pre-trained models and tools for tasks like speech synthesis, music generation, sound detection, and talking head creation. | 10,025 |
facebookresearch/audio2photoreal | Generating photorealistic avatars from audio | 2,709 |
facebookresearch/audiocraft | A deep learning library for generating high-quality audio | 20,969 |
speechbrain/speechbrain | A PyTorch-based toolkit for building conversational AI systems with advanced speech and text processing capabilities. | 8,922 |
nvidia/nemo | A scalable generative AI framework for creating and deploying large language models and multimodal models | 12,118 |
nvlabs/stylegan | A deep learning framework implementing a generator architecture for generating images with controllable attributes and disentangled latent factors | 14,152 |
pytorch/glow | A compiler and execution engine for neural networks that generates optimized code for hardware accelerators | 3,235 |
const-me/whisper | An implementation of OpenAI's Whisper ASR model using DirectCompute for GPGPU inference | 8,460 |
rhasspy/piper | A fast local neural text-to-speech system optimized for small devices | 6,576 |