MMBench

Multi-modal model evaluation suite

A collection of benchmarks for evaluating the multi-modal understanding capabilities of large vision-language models.

Official Repo of "MMBench: Is Your Multi-modal Model an All-around Player?"

GitHub: 163 stars · 3 watching · 10 forks · last commit 3 months ago
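
MMBench splits are distributed as TSV files in which each row pairs a base64-encoded image with a multiple-choice question. As a rough illustration (not the project's own loader), the sketch below shows one way such a split could be read; the column names `image`, `question`, `A`-`D`, and `answer` are assumptions here, so check the repository's documentation for the exact schema.

```python
# Minimal sketch of loading an MMBench-style TSV split.
# Assumed layout (not confirmed by this page): an `image` field holding a
# base64-encoded picture, a `question`, options `A`-`D`, and an `answer`
# letter on dev splits. Consult the official repo for the real schema.
import base64
import io

import pandas as pd
from PIL import Image


def iter_mmbench_rows(tsv_path: str):
    """Yield (image, question, options, answer) tuples from an MMBench-style TSV."""
    df = pd.read_csv(tsv_path, sep="\t")
    for _, row in df.iterrows():
        # Decode the base64-encoded image bytes into a PIL image.
        image = Image.open(io.BytesIO(base64.b64decode(row["image"])))
        # Not every question uses all four options, so keep only those present.
        options = {k: row[k] for k in ("A", "B", "C", "D") if pd.notna(row.get(k))}
        # Test splits typically withhold the ground-truth answer.
        answer = row.get("answer") if pd.notna(row.get("answer")) else None
        yield image, row["question"], options, answer
```

Decoded images and questions can then be fed to whichever vision-language model is under test, and the predicted option letter compared against `answer` on the dev split.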

Related projects:

| Repository | Description | Stars |
| --- | --- | --- |
| open-compass/vlmevalkit | A toolkit for evaluating large vision-language models on various benchmarks and datasets. | 1,343 |
| multimodal-art-projection/omnibench | Evaluates and benchmarks multimodal language models' ability to process visual, acoustic, and textual inputs simultaneously. | 14 |
| openbmb/viscpm | A family of large multimodal models supporting multimodal conversation and text-to-image generation in multiple languages. | 1,089 |
| open-compass/lawbench | Evaluates the legal knowledge of large language models using a custom benchmarking framework. | 267 |
| opengvlab/multi-modality-arena | An evaluation platform for comparing multi-modality models on visual question-answering tasks. | 467 |
| will-singularity/skywork-mm | An empirical study aiming to develop a large language model that effectively integrates multiple input modalities. | 23 |
| openm3d/m3dbench | An open-source project providing a comprehensive 3D instruction-following dataset with multi-modal prompts for training large language models. | 57 |
| fuxiaoliu/mmc | Develops a large-scale dataset and benchmark for training multimodal chart-understanding models using large language models. | 84 |
| tsb0601/mmvp | An evaluation framework for multimodal language models' visual capabilities using image and question benchmarks. | 288 |
| sail-sg/mmcbench | A benchmarking framework designed to evaluate the robustness of large multimodal models against common corruption scenarios. | 27 |
| mickcrosse/mtrf-toolbox | A MATLAB package for modeling and analyzing multivariate neural responses to dynamic stimuli. | 81 |
| open-mmlab/mmhuman3d | Provides a modular framework and tools for working with 3D human parametric models in computer vision and graphics. | 1,240 |
| open-mmlab/mmaction | An open-source toolbox for action understanding from video data using PyTorch. | 1,863 |
| ailab-cvc/seed-bench | A benchmark for evaluating large language models' ability to process multimodal input. | 315 |
| pleisto/yuren-baichuan-7b | A multi-modal large language model that integrates natural language and visual capabilities, with fine-tuning for various tasks. | 72 |