CogVideo
Video generator
Generates videos from text and images using large language models
Text-to-video and image-to-video generation: CogVideoX (2024) and CogVideo (ICLR 2023)
9k stars
127 watching
859 forks
Language: Python
Last commit: 7 days ago
Linked from 1 awesome list
Topics: cogvideox, image-to-video, llm, sora, text-to-video, video-generation
Related projects:
Repository | Description | Stars |
---|---|---|
thudm/cogvlm | Develops a state-of-the-art visual language model with applications in image understanding and dialogue systems. | 6,080 |
doubiiu/dynamicrafter | This project generates animated videos from open-domain images by leveraging pre-trained video diffusion priors. | 2,596 |
thudm/cogview | A framework for generating images from text using transformers. | 1,723 |
doubiiu/tooncrafter | Generates cartoon-style videos from two images using pre-trained diffusion models. | 5,372 |
tachibanayoshino/animeganv2 | A tool that converts landscape photos and videos into anime style using deep learning techniques. | 5,102 |
coqui-ai/tts | A deep learning toolkit for generating human-like speech from text. | 35,453 |
thudm/codegeex | A multilingual code generation model pre-trained on 20 programming languages, capable of generating executable programs and translating code between languages. | 8,240 |
facebookresearch/co-tracker | A transformer-based model for tracking any point in a video, drawing on the benefits of optical flow. | 3,850 |
threestudio-project/threestudio | A unified framework for 3D content generation using Python and various machine learning-based algorithms. | 6,319 |
vision-cair/minigpt-4 | Enables vision-language understanding by fine-tuning large language models on visual data. | 25,422 |
internlm/internlm-xcomposer | A large vision-language model that understands visual inputs and generates text from them, supporting long-context input and output, high-resolution understanding, fine-grained video understanding, and multi-turn multi-image dialogue. | 2,521 |
nvidia/vid2vid | A PyTorch implementation of a video-to-video translation method for generating photorealistic videos from semantic label maps or other input data. | 8,607 |
coqui-ai/stt | A toolkit for building and deploying speech-to-text models using deep learning techniques. | 2,283 |
cumulo-autumn/streamdiffusion | A pipeline-level solution for real-time interactive image generation using diffusion-based techniques. | 9,736 |
clovaai/stargan-v2 | A Python implementation of an image-to-image translation model for generating diverse images across multiple domains. | 3,506 |