waveglow
Speech generator
Generates high-quality speech from mel-spectrograms using a flow-based network architecture
A Flow-based Generative Network for Speech Synthesis
2k stars
77 watching
531 forks
Language: Python
Last commit: about 1 year ago
Linked from 2 awesome lists
Related projects:
Repository | Description | Stars |
---|---|---|
npuichigo/waveglow | A PyTorch implementation of a speech synthesis network based on a flow-based generative architecture. | 206 |
nvidia/tacotron2 | A PyTorch implementation of Tacotron 2, a deep neural network architecture for speech synthesis. | 5,123 |
lucidrains/musiclm-pytorch | Implementation of Google's MusicLM model for music generation using attention networks and text-conditioning. | 3,189 |
ibab/tensorflow-wavenet | An implementation of a WaveNet generative neural network architecture for audio generation | 5,417 |
pyannote/pyannote-audio | A PyTorch-based toolkit for speaker diarization, including speech activity detection. | 6,508 |
pytorch/audio | A PyTorch module providing tools and functions for audio signal processing | 2,561 |
aigc-audio/audiogpt | An audio processing toolkit that provides pre-trained models and tools for tasks like speech synthesis, music generation, sound detection, and talking head creation. | 10,061 |
facebookresearch/audio2photoreal | Generating photorealistic avatars from audio | 2,715 |
facebookresearch/audiocraft | A deep learning library for generating high-quality audio | 21,134 |
speechbrain/speechbrain | A PyTorch-based toolkit for building conversational AI systems with advanced speech and text processing capabilities. | 9,066 |
nvidia/nemo | A scalable generative AI framework for creating and deploying large language models and multimodal models | 12,438 |
nvlabs/stylegan | A deep learning framework implementing a generator architecture for generating images with controllable attributes and disentangled latent factors | 14,178 |
pytorch/glow | A compiler and execution engine for neural networks that generates optimized code for hardware accelerators | 3,247 |
const-me/whisper | An implementation of OpenAI's Whisper ASR model using DirectCompute for GPGPU inference | 8,617 |
rhasspy/piper | A fast local neural text-to-speech system optimized for small devices | 7,002 |