Video-MME

Video analysis benchmark

Comprehensive benchmark for evaluating multi-modal large language models on video analysis tasks

✨✨Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis
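Video-MME evaluates models with multiple-choice questions over videos. The sketch below shows one way such a benchmark can be scored as multiple-choice accuracy; the sample schema (video_path, question, options, answer) and the model's answer() interface are illustrative assumptions, not the repository's official evaluation pipeline.

```python
# Minimal sketch of scoring a multimodal model on a Video-MME-style
# multiple-choice benchmark. Field names and the model interface are
# illustrative assumptions, not the official Video-MME pipeline.
import re
from typing import List


class DummyVideoLLM:
    """Stand-in for a multimodal LLM; replace with a real model wrapper."""

    def answer(self, video_path: str, question: str, options: List[str]) -> str:
        # A real wrapper would sample frames from the video and prompt the
        # model; this stub always returns "A" so the script runs end to end.
        return "A"


def extract_choice(response: str) -> str:
    """Pull the first A-D option letter out of a free-form model response."""
    match = re.search(r"\b([A-D])\b", response.upper())
    return match.group(1) if match else ""


def evaluate(samples, model) -> float:
    """Compute multiple-choice accuracy over a list of QA samples."""
    correct = 0
    for sample in samples:
        prediction = extract_choice(
            model.answer(sample["video_path"], sample["question"], sample["options"])
        )
        correct += int(prediction == sample["answer"])
    return correct / len(samples) if samples else 0.0


if __name__ == "__main__":
    # Toy sample mirroring the assumed schema: a question, lettered options,
    # and a ground-truth answer letter.
    samples = [
        {
            "video_path": "videos/demo.mp4",
            "question": "What activity is shown in the video?",
            "options": ["A. Cooking", "B. Swimming", "C. Driving", "D. Painting"],
            "answer": "A",
        }
    ]
    print(f"Accuracy: {evaluate(samples, DummyVideoLLM()):.2%}")
```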

GitHub: 422 stars · 5 watching · 17 forks · last commit about 1 month ago
Topics: large-language-models, large-vision-language-models, mme, multimodal-large-language-models, video, video-mme

Related projects:

| Repository | Description | Stars |
| --- | --- | --- |
| aifeg/benchlmm | Open-source framework for benchmarking the cross-style visual capability of large multimodal models | 84 |
| pku-yuangroup/video-bench | Evaluates and benchmarks the video understanding capabilities of large language models | 121 |
| rese1f/moviechat | Method for long-video understanding that optimizes memory usage | 550 |
| tsb0601/mmvp | Evaluation framework for the visual capabilities of multimodal language models, built on image-and-question benchmarks | 296 |
| felixgithub2017/mmcu | Measures large language models' understanding on a massive multitask Chinese dataset | 87 |
| boheumd/ma-lmm | Develops an AI model for long-term video understanding | 254 |
| yfzhang114/mme-realworld | Multimodal large language model benchmark that simulates real-world challenges to measure performance in practical scenarios | 86 |
| pku-yuangroup/chronomagic-bench | Benchmarking framework for evaluating the quality of text-to-video generation models | 191 |
| huaizhengzhang/awsome-deep-learning-for-video-analysis | Collection of resources and tools for video analysis using deep learning and multi-modal learning | 767 |
| cmmmu-benchmark/cmmmu | Benchmark for evaluating multimodal question-answering models across diverse domains and data types | 46 |
| mltframework/mlt | Multimedia framework for video editing, providing tools and libraries for audio and video processing | 1,522 |
| damo-nlp-sg/m3exam | Benchmark for evaluating large language models across multiple languages and formats | 93 |
| gabeur/mmt | Cross-modal architecture for video retrieval that combines multiple types of features from videos and text | 259 |
| chenllliang/mmevalpro | Benchmarking framework for large multimodal models with rigorous metrics and an efficient evaluation pipeline | 22 |
| laomao0/bin | Software that interpolates blurry video frames and enhances image quality | 209 |