MMBench

Multi-modal model evaluation suite

A collection of benchmarks for evaluating the multi-modal understanding capabilities of large vision-language models.

Official repository of the paper "MMBench: Is Your Multi-modal Model an All-around Player?"

GitHub

168 stars
3 watching
10 forks
last commit: 5 months ago
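
For context on how the benchmark scores a model, below is a minimal sketch of the CircularEval strategy described in the MMBench paper: a multiple-choice question counts as passed only if the model picks the correct option under every circular shift of the answer options. The `ask_model` callback is a hypothetical stand-in for a vision-language model's answer function, not an API from this repository.

```python
from typing import Callable, Sequence

def circular_eval(question: str,
                  options: Sequence[str],
                  answer_idx: int,
                  ask_model: Callable[[str, Sequence[str]], int]) -> bool:
    """Return True only if the model answers correctly for every rotation of the options."""
    n = len(options)
    for shift in range(n):
        # Rotate the option list by `shift` positions.
        rotated = [options[(i + shift) % n] for i in range(n)]
        # The ground-truth option moves when the list is rotated; track its new index.
        correct_idx = (answer_idx - shift) % n
        if ask_model(question, rotated) != correct_idx:
            return False
    return True
```

A question with four options is therefore asked four times; guessing strategies that happen to hit the right letter once are filtered out because all four passes must agree with the ground truth.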

Related projects:

| Repository | Description | Stars |
|---|---|---|
| open-compass/vlmevalkit | An evaluation toolkit for large vision-language models | 1,514 |
| multimodal-art-projection/omnibench | Evaluates and benchmarks multimodal language models' ability to process visual, acoustic, and textual inputs simultaneously | 15 |
| openbmb/viscpm | A family of large multimodal models supporting multimodal conversational capabilities and text-to-image generation in multiple languages | 1,098 |
| open-compass/lawbench | Evaluates the legal knowledge of large language models using a custom benchmarking framework | 273 |
| opengvlab/multi-modality-arena | An evaluation platform for comparing multi-modality models on visual question-answering tasks | 478 |
| will-singularity/skywork-mm | An empirical study aiming to develop a large language model capable of effectively integrating multiple input modalities | 23 |
| openm3d/m3dbench | An open-source software project providing a comprehensive 3D instruction-following dataset with multi-modal prompts for training large language models | 58 |
| fuxiaoliu/mmc | Develops a large-scale dataset and benchmark for training multimodal chart understanding models using large language models | 87 |
| tsb0601/mmvp | An evaluation framework for multimodal language models' visual capabilities using image and question benchmarks | 296 |
| sail-sg/mmcbench | A benchmarking framework designed to evaluate the robustness of large multimodal models against common corruption scenarios | 27 |
| mickcrosse/mtrf-toolbox | A MATLAB package for modeling and analyzing multivariate neural responses to dynamic stimuli | 85 |
| open-mmlab/mmhuman3d | Provides a modular framework and tools for working with 3D human parametric models in computer vision and graphics | 1,253 |
| open-mmlab/mmaction | An open-source toolbox for action understanding from video data using PyTorch | 1,863 |
| ailab-cvc/seed-bench | A benchmark for evaluating large language models' ability to process multimodal input | 322 |
| pleisto/yuren-baichuan-7b | A multi-modal large language model that integrates natural language and visual capabilities with fine-tuning for various tasks | 73 |