CogVideo
Video generator
Generates videos from text and images using large language models
Text- and image-to-video generation: CogVideoX (2024) and CogVideo (ICLR 2023)
10k stars
129 watching
922 forks
Language: Python
Last commit: 2 months ago
Linked from 1 awesome list
cogvideox, image-to-video, llm, sora, text-to-video, video-generation
Related projects:
| Repository | Description | Stars |
|---|---|---|
| | Develops a state-of-the-art visual language model with applications in image understanding and dialogue systems. | 6,182 |
| | This project generates animated videos from open-domain images by leveraging pre-trained video diffusion priors. | 2,668 |
| | A framework for generating images from text using transformers. | 1,735 |
| | Generates cartoon-style videos from two images using pre-trained diffusion models. | 5,447 |
| | A tool that generates anime styles from landscape photos and videos using deep learning techniques. | 5,118 |
| | A deep learning toolkit for generating human-like speech from text. | 36,118 |
| | A multilingual code generation model with pre-trained weights on 20 programming languages and capabilities for generating executable programs and translating between different languages. | 8,304 |
| | A model for tracking any point on a video using transformer-based architecture and optical flow benefits. | 3,978 |
| | A unified framework for 3D content generation. | 6,410 |
| | Enabling vision-language understanding by fine-tuning large language models on visual data. | 25,490 |
| | A comprehensive multimodal system for long-term streaming video and audio interactions with capabilities including text-image comprehension and composition. | 2,616 |
| | A PyTorch implementation of a video-to-video translation method for generating photorealistic videos from semantic label maps or other input data. | 8,623 |
| | A toolkit for building and deploying speech-to-text models using deep learning techniques. | 2,302 |
| | A pipeline-level solution for real-time interactive image generation using advanced diffusion techniques. | 9,801 |
| | A Python implementation of an image-to-image translation model for generating diverse images across multiple domains. | 3,513 |