CMMMU

Multimodal QA Benchmark

A benchmark for evaluating multimodal question-answering models across diverse domains and data types

46 stars

2 watching

1 fork

Language: Python

Last commit: about 1 year ago

Related projects:

Repository	Description	Stars
felixgithub2017/mmcu	Measures large language models' massive multitask Chinese understanding	87
damo-nlp-sg/m3exam	A benchmark for evaluating large language models in multiple languages and formats	93
multimodal-art-projection/omnibench	Evaluates and benchmarks multimodal language models' ability to process visual, acoustic, and textual inputs simultaneously	15
aifeg/benchlmm	An open-source benchmarking framework for evaluating cross-style visual capability of large multimodal models	84
ailab-cvc/seed-bench	A benchmark for evaluating large language models' ability to process multimodal input	322
cmawer/reproducible-model	A project demonstrating how to create a reproducible machine learning model using Python and version control	86
qcri/llmebench	A benchmarking framework for large language models	81
mna/gocostmodel	A benchmarking package for the Go language	61
junyangwang0410/amber	An LLM-free benchmark suite for evaluating hallucination in MLLMs across various tasks and dimensions	98
mlcommons/inference	Measures the performance of deep learning models in various deployment scenarios	1,256
mikegu721/xiezhibenchmark	An evaluation suite to assess language models' performance in multi-choice questions	93
mariomka/regex-benchmark	A benchmarking project comparing the performance of different programming languages' regex engines	315
cmu-safari/prim-benchmarks	A benchmarking suite for evaluating the performance of memory-centric computing architectures	142
freedomintelligence/mllm-bench	Evaluates and compares the performance of multimodal large language models on various tasks	56
bradyfu/video-mme	Comprehensive benchmark for evaluating multi-modal large language models on video analysis tasks	422