Video-MME
Video analysis benchmark
Comprehensive benchmark for evaluating multi-modal large language models on video analysis tasks
✨✨Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis
422 stars
5 watching
17 forks
last commit: about 1 month ago
topics: large-language-models, large-vision-language-models, mme, multimodal-large-language-models, video, video-mme
Related projects:
Repository | Description | Stars |
---|---|---|
aifeg/benchlmm | An open-source benchmarking framework for evaluating cross-style visual capability of large multimodal models | 84 |
pku-yuangroup/video-bench | Evaluates and benchmarks large language models' video understanding capabilities | 121 |
rese1f/moviechat | Develops a method for long video understanding by optimizing memory usage | 550 |
tsb0601/mmvp | An evaluation framework for multimodal language models' visual capabilities using image and question benchmarks. | 296 |
felixgithub2017/mmcu | Measures large language models' massive multitask understanding of Chinese | 87 |
boheumd/ma-lmm | Develops a model for long-term video understanding | 254 |
yfzhang114/mme-realworld | A multimodal large language model benchmark designed to simulate real-world challenges and measure the performance of such models in practical scenarios. | 86 |
pku-yuangroup/chronomagic-bench | Provides a benchmarking framework for evaluating the quality of text-to-video generation models | 191 |
huaizhengzhang/awsome-deep-learning-for-video-analysis | A collection of resources and tools for video analysis using deep learning and multi-modal learning techniques. | 767 |
cmmmu-benchmark/cmmmu | A benchmark for evaluating the performance of multimodal question answering models on diverse domains and data types | 46 |
mltframework/mlt | A multimedia framework designed for video editing, providing tools and libraries for audio and video processing. | 1,522 |
damo-nlp-sg/m3exam | A benchmark for evaluating large language models in multiple languages and formats | 93 |
gabeur/mmt | Develops a cross-modal architecture for video retrieval by combining multiple types of features from videos and text | 259 |
chenllliang/mmevalpro | A benchmarking framework for evaluating Large Multimodal Models by providing rigorous metrics and an efficient evaluation pipeline. | 22 |
laomao0/bin | Software to interpolate blurry video frames and enhance image quality | 209 |