audio2photoreal

Avatar generation

Generating photorealistic avatars from audio

Code and dataset for photorealistic Codec Avatars driven from audio

3k stars

31 watching

261 forks

Language: Python

Last commit: almost 2 years ago

Related projects:

Repository	Description	Stars
facebookresearch/imagebind	An AI framework that maps data from multiple modalities into a single embedding space, enabling applications such as cross-modal retrieval and generation.	8,424
sebastianstarke/ai4animation	A deep learning framework for data-driven character animation in Unity3D	7,927
zejun-yang/aniportrait	An open-source framework for generating photorealistic animations driven by audio and reference images.	4,718
facebookresearch/dinov2	A PyTorch implementation of a self-supervised learning method for learning robust visual features without supervision.	9,425
facebookresearch/pytorch3d	A deep learning library for 3D data processing and computer vision research using PyTorch	8,889
pyannote/pyannote-audio	A PyTorch toolkit for speaker diarization, including speech activity detection.	6,508
facebookresearch/sam2	A framework for AI-based object segmentation in images and videos	13,054
facebookresearch/ca_body	A Python implementation of a neural network architecture for image avatar body generation	47
facebookresearch/audiocraft	A deep learning library for generating high-quality audio	21,134
huggingface/lerobot	A platform providing pre-trained models, datasets, and tools for robotics with focus on imitation learning and reinforcement learning.	7,874
facebookresearch/eft	Provides pseudo-GT 3D human pose data and pre-trained models for training 3D pose estimation algorithms	379
nvidia/vid2vid	A PyTorch implementation of a video-to-video translation method for generating photorealistic videos from semantic label maps or other input data.	8,623
tyiannak/pyaudioanalysis	A comprehensive Python library for feature extraction, classification, segmentation, and applications of audio data.	5,918
pytorch/audio	A PyTorch module providing tools and functions for audio signal processing	2,561
nvidia/waveglow	Generates high-quality speech from mel-spectrograms using a flow-based network architecture	2,294