VideoLLaMA2

Video-language model

An audio-visual language model designed to understand video content, including its audio track

VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs

GitHub

Stars: 871
Watching: 10
Forks: 60
Language: Python
Last commit: 8 days ago

Related projects:

Repository | Description | Stars
showlab/vlog | Transforms video content into a long document containing visual and audio information that can be used for chat or other applications | 538
aspiers/ly2video | Converts music represented by a GNU LilyPond file into a video containing a horizontally scrolling music staff synchronized with audio rendering | 158
dvlab-research/llama-vid | An image-based language model that uses large language models to generate visual and text features from videos | 733
dcdmllm/momentor | A video Large Language Model designed for fine-grained comprehension and localization in videos with a custom Temporal Perception Module for improved temporal modeling | 54
showlab/show-1 | This project enables text-to-video generation by combining pixel and latent diffusion models | 1,103
damo-nlp-sg/llm-zoo | A collection of information about various large language models used in natural language processing | 272
mbzuai-oryx/video-chatgpt | A video conversation model that generates meaningful conversations about videos using large vision and language models | 1,213
vpgtrans/vpgtrans | Transfers visual prompt generators across large language models to reduce training costs and enable customization of multimodal LLMs | 269
singularity42/vgan-tensorflow | An implementation of a deep learning model to generate videos with dynamic scenes | 15
nus-hpc-ai-lab/videosys | A toolkit for high-performance video generation and processing using deep learning techniques | 1,773
rupertluo/valley | An offline video assistant system powered by large language models and computer vision techniques | 211
lxtgh/omg-seg | Develops an end-to-end model for multiple visual perception and reasoning tasks using a single encoder, decoder, and large language model | 1,300
boheumd/ma-lmm | This project develops an AI model for long-term video understanding | 244
bryandlee/tune-a-video | Unofficial implementation of a deep learning model to generate or modify video content | 191
radi-cho/datasetgpt | A command-line interface to generate textual datasets with Large Language Models | 293