Video-MME
Video analysis benchmark
An evaluation framework that provides a comprehensive benchmark of multi-modal large language models' capabilities in video analysis.
✨✨Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis
406 stars
5 watching
12 forks
last commit: 5 months ago
Topics: large-language-models, large-vision-language-models, mme, multimodal-large-language-models, video, video-mme
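Video-MME poses multiple-choice questions over short, medium, and long videos, so evaluating a model largely reduces to parsing its chosen option and computing accuracy per duration bucket. The sketch below is a minimal illustration of that scoring step, assuming a hypothetical JSON results file with `duration`, `answer`, and `prediction` fields; it is not the repository's official evaluation script.

```python
# Minimal sketch of multiple-choice accuracy scoring for a Video-MME-style
# evaluation. The field names ("duration", "answer", "prediction") and the
# results-file name are assumptions, not the repository's official schema.
import json
import re
from collections import defaultdict

OPTION_PATTERN = re.compile(r"\b([A-D])\b")

def extract_choice(text: str) -> str | None:
    """Pull the first option letter (A-D) out of a free-form model response."""
    match = OPTION_PATTERN.search(text.strip().upper())
    return match.group(1) if match else None

def score(results: list[dict]) -> dict[str, float]:
    """Return accuracy per duration bucket plus an overall score."""
    correct: dict[str, int] = defaultdict(int)
    total: dict[str, int] = defaultdict(int)
    for item in results:
        bucket = item.get("duration", "all")  # e.g. "short", "medium", "long"
        total[bucket] += 1
        if extract_choice(item["prediction"]) == item["answer"].strip().upper():
            correct[bucket] += 1
    accuracy = {bucket: correct[bucket] / total[bucket] for bucket in total}
    accuracy["overall"] = sum(correct.values()) / max(sum(total.values()), 1)
    return accuracy

if __name__ == "__main__":
    # Hypothetical results file: a JSON list of per-question records.
    with open("videomme_predictions.json") as f:
        print(score(json.load(f)))
```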
Related projects:
| Repository | Description | Stars |
|---|---|---|
| aifeg/benchlmm | An open-source benchmarking framework for evaluating the cross-style visual capability of large multimodal models | 83 |
| pku-yuangroup/video-bench | Evaluates and benchmarks large language models' video understanding capabilities | 117 |
| rese1f/moviechat | A deep learning model designed to efficiently process and analyze long videos using large language models | 525 |
| tsb0601/mmvp | An evaluation framework for multimodal language models' visual capabilities using image and question benchmarks | 288 |
| felixgithub2017/mmcu | Evaluates the semantic understanding capabilities of large Chinese language models using a massive multi-task dataset | 87 |
| boheumd/ma-lmm | A memory-augmented large multimodal model for long-term video understanding | 244 |
| yfzhang114/mme-realworld | A benchmark dataset for evaluating multimodal large language models in realistic, high-resolution real-world scenarios | 78 |
| pku-yuangroup/chronomagic-bench | A benchmark and dataset for evaluating text-to-video generation models' ability to generate coherent and varied metamorphic time-lapse videos | 186 |
| huaizhengzhang/awsome-deep-learning-for-video-analysis | A collection of resources and tools for video analysis using deep learning and multi-modal learning techniques | 763 |
| cmmmu-benchmark/cmmmu | An evaluation benchmark and dataset for multimodal question answering models | 46 |
| mltframework/mlt | A multimedia framework for video editing, providing tools and libraries for audio and video processing | 1,506 |
| damo-nlp-sg/m3exam | A benchmark for evaluating large language models across multiple languages and formats | 92 |
| gabeur/mmt | A cross-modal architecture for video retrieval that combines multiple types of video and text features | 258 |
| chenllliang/mmevalpro | A benchmarking framework for evaluating large multimodal models with rigorous metrics and an efficient evaluation pipeline | 22 |
| laomao0/bin | Software to interpolate blurry video frames and enhance image quality | 210 |