MovieChat

Video analyzer

A deep learning model designed to efficiently process and analyze long videos using large language models

[CVPR 2024] MovieChat: From Dense Token to Sparse Memory for Long Video Understanding

GitHub

525 stars
10 watching
41 forks
Language: Python
last commit: 23 days ago
computer-vision, dataset, large-language-models, llama, long-video-understanding, multimodal-large-language-models

Related projects:

Repository | Description | Stars
renshuhuai-andy/timechat | A large language model designed to understand and process long videos with temporal information | 286
pku-yuangroup/video-bench | Evaluates and benchmarks large language models' video understanding capabilities | 117
bradyfu/video-mme | An evaluation framework for large language models in video analysis, providing a comprehensive benchmark of their capabilities | 406
gabeur/mmt | Develops a cross-modal architecture for video retrieval by combining multiple types of features from videos and text | 258
vision-cair/longvu | An artificial intelligence system designed to understand and describe long-form video content | 270
dvlab-research/llama-vid | An image-based language model that uses large language models to generate visual and text features from videos | 733
shahen94/react-native-video-processing | A native video editing library for React Native that provides tools for trimming, compressing, and processing videos on mobile devices | 1,253
huaizhengzhang/awsome-deep-learning-for-video-analysis | A collection of resources and tools for video analysis using deep learning and multi-modal learning techniques | 763
boheumd/ma-lmm | This project develops an AI model for long-term video understanding | 244
agermanidis/thingscoop | A utility for analyzing videos based on objects and scenes within them | 358
rlhf-v/rlhf-v | Aligns large language models' behavior through fine-grained correctional human feedback to improve trustworthiness and accuracy | 233
gsig/pyvideoresearch | A collection of video analysis methods and datasets for research and development | 533
makarandtapaswi/movieqa_cvpr2016 | This project explores question-answering in movies using various machine learning approaches | 80
mayankchd/movie | A CLI tool for retrieving and comparing movie information | 161
msracver/flow-guided-feature-aggregation | An implementation of an end-to-end learning framework for video object detection using feature aggregation along motion paths | 723