MMBench

Multi-modal model evaluation suite

A collection of benchmarks for evaluating the multi-modal understanding capabilities of large vision-language models.

Official repository of the paper "MMBench: Is Your Multi-modal Model an All-around Player?"

GitHub

168 stars
3 watching
10 forks
last commit: 5 months ago
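
For context on how the benchmark scores a model, below is a minimal sketch of the CircularEval strategy described in the MMBench paper: a multiple-choice question counts as passed only if the model picks the correct option under every circular shift of the answer options. The `ask_model` callback is a hypothetical stand-in for a vision-language model's answer function, not an API from this repository.

```python
from typing import Callable, Sequence

def circular_eval(question: str,
                  options: Sequence[str],
                  answer_idx: int,
                  ask_model: Callable[[str, Sequence[str]], int]) -> bool:
    """Return True only if the model answers correctly for every rotation of the options."""
    n = len(options)
    for shift in range(n):
        # Rotate the option list by `shift` positions.
        rotated = [options[(i + shift) % n] for i in range(n)]
        # The ground-truth option moves when the list is rotated; track its new index.
        correct_idx = (answer_idx - shift) % n
        if ask_model(question, rotated) != correct_idx:
            return False
    return True
```

A question with four options is therefore asked four times; guessing strategies that happen to hit the right letter once are filtered out because all four passes must agree with the ground truth.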

Related projects:

| Repository | Description | Stars |
|---|---|---|
| open-compass/vlmevalkit | An evaluation toolkit for large vision-language models | 1,514 |
| multimodal-art-projection/omnibench | Evaluates and benchmarks multimodal language models' ability to process visual, acoustic, and textual inputs simultaneously | 15 |
| openbmb/viscpm | A family of large multimodal models supporting multimodal conversational capabilities and text-to-image generation in multiple languages | 1,098 |
| open-compass/lawbench | Evaluates the legal knowledge of large language models using a custom benchmarking framework | 273 |
| opengvlab/multi-modality-arena | An evaluation platform for comparing multi-modality models on visual question-answering tasks | 478 |
| will-singularity/skywork-mm | An empirical study aiming to develop a large language model capable of effectively integrating multiple input modalities | 23 |
| openm3d/m3dbench | An open-source software project providing a comprehensive 3D instruction-following dataset with multi-modal prompts for training large language models | 58 |
| fuxiaoliu/mmc | Develops a large-scale dataset and benchmark for training multimodal chart understanding models using large language models | 87 |
| tsb0601/mmvp | An evaluation framework for multimodal language models' visual capabilities using image and question benchmarks | 296 |
| sail-sg/mmcbench | A benchmarking framework designed to evaluate the robustness of large multimodal models against common corruption scenarios | 27 |
| mickcrosse/mtrf-toolbox | A MATLAB package for modeling and analyzing multivariate neural responses to dynamic stimuli | 85 |
| open-mmlab/mmhuman3d | Provides a modular framework and tools for working with 3D human parametric models in computer vision and graphics | 1,253 |
| open-mmlab/mmaction | An open-source toolbox for action understanding from video data using PyTorch | 1,863 |
| ailab-cvc/seed-bench | A benchmark for evaluating large language models' ability to process multimodal input | 322 |
| pleisto/yuren-baichuan-7b | A multi-modal large language model that integrates natural language and visual capabilities with fine-tuning for various tasks | 73 |