CogVideo

Video generator

Generates videos from text and images using large pretrained transformer models

Text- and image-to-video generation: CogVideoX (2024) and CogVideo (ICLR 2023)

GitHub

10k stars
129 watching
922 forks
Language: Python
Last commit: about 1 month ago
Linked from 1 awesome list

Topics: cogvideox, image-to-video, llm, sora, text-to-video, video-generation

Related projects:

| Repository | Description | Stars |
|---|---|---|
| thudm/cogvlm | Develops a state-of-the-art visual language model with applications in image understanding and dialogue systems. | 6,182 |
| doubiiu/dynamicrafter | This project generates animated videos from open-domain images by leveraging pre-trained video diffusion priors. | 2,668 |
| thudm/cogview | A framework for generating images from text using transformers. | 1,735 |
| doubiiu/tooncrafter | Generates cartoon-style videos from two images using pre-trained diffusion models. | 5,447 |
| tachibanayoshino/animeganv2 | A tool that generates anime styles from landscape photos and videos using deep learning techniques. | 5,118 |
| coqui-ai/tts | A deep learning toolkit for generating human-like speech from text. | 36,118 |
| thudm/codegeex | A multilingual code generation model with pre-trained weights on 20 programming languages and capabilities for generating executable programs and translating between different languages. | 8,304 |
| facebookresearch/co-tracker | A model for tracking any point on a video using transformer-based architecture and optical flow benefits. | 3,978 |
| threestudio-project/threestudio | A unified framework for 3D content generation. | 6,410 |
| vision-cair/minigpt-4 | Enabling vision-language understanding by fine-tuning large language models on visual data. | 25,490 |
| internlm/internlm-xcomposer | A comprehensive multimodal system for long-term streaming video and audio interactions with capabilities including text-image comprehension and composition. | 2,616 |
| nvidia/vid2vid | A PyTorch implementation of a video-to-video translation method for generating photorealistic videos from semantic label maps or other input data. | 8,623 |
| coqui-ai/stt | A toolkit for building and deploying speech-to-text models using deep learning techniques. | 2,302 |
| cumulo-autumn/streamdiffusion | A pipeline-level solution for real-time interactive image generation using advanced diffusion techniques. | 9,801 |
| clovaai/stargan-v2 | A Python implementation of an image-to-image translation model for generating diverse images across multiple domains. | 3,513 |