CMMMU

Multimodal QA Benchmark

A benchmark for evaluating multimodal question-answering models across diverse domains and data types

GitHub

46 stars
2 watching
1 fork
Language: Python
Last commit: 4 months ago
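
As a rough illustration of how a multiple-choice multimodal QA benchmark like this is typically consumed, here is a minimal evaluation sketch. It assumes the dataset is hosted on the Hugging Face Hub and uses hypothetical field names (`image`, `question`, `options`, `answer`) and a hypothetical `model.predict` interface; none of these are confirmed by this page or by CMMMU's documentation.

```python
# Minimal sketch: scoring a model on a multiple-choice multimodal QA benchmark.
# The dataset ID and all field names below are illustrative assumptions,
# not CMMMU's documented schema.
from datasets import load_dataset


def evaluate(model, dataset_id="m-a-p/CMMMU", split="validation"):
    """Return simple accuracy of `model` over one split of the benchmark."""
    ds = load_dataset(dataset_id, split=split)  # assumed Hub dataset ID
    correct = 0
    for ex in ds:
        # `model.predict` stands in for whatever inference call the evaluated
        # multimodal model exposes; it should return an option label.
        pred = model.predict(ex["image"], ex["question"], ex["options"])
        correct += int(pred == ex["answer"])
    return correct / len(ds)
```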

Related projects:

| Repository | Description | Stars |
| --- | --- | --- |
| felixgithub2017/mmcu | Measures large language models' understanding of massive multitask Chinese datasets | 87 |
| damo-nlp-sg/m3exam | A benchmark for evaluating large language models in multiple languages and formats | 93 |
| multimodal-art-projection/omnibench | Evaluates and benchmarks multimodal language models' ability to process visual, acoustic, and textual inputs simultaneously | 15 |
| aifeg/benchlmm | An open-source benchmarking framework for evaluating the cross-style visual capability of large multimodal models | 84 |
| ailab-cvc/seed-bench | A benchmark for evaluating large language models' ability to process multimodal input | 322 |
| cmawer/reproducible-model | A project demonstrating how to create a reproducible machine learning model using Python and version control | 86 |
| qcri/llmebench | A benchmarking framework for large language models | 81 |
| mna/gocostmodel | A benchmarking package for the Go language | 61 |
| junyangwang0410/amber | An LLM-free benchmark suite for evaluating hallucination in MLLMs across various tasks and dimensions | 98 |
| mlcommons/inference | Measures the performance of deep learning models in various deployment scenarios | 1,256 |
| mikegu721/xiezhibenchmark | An evaluation suite assessing language models' performance on multiple-choice questions | 93 |
| mariomka/regex-benchmark | A benchmarking project comparing the performance of different programming languages' regex engines | 315 |
| cmu-safari/prim-benchmarks | A benchmarking suite for evaluating the performance of memory-centric computing architectures | 142 |
| freedomintelligence/mllm-bench | Evaluates and compares the performance of multimodal large language models on various tasks | 56 |
| bradyfu/video-mme | A comprehensive benchmark for evaluating multimodal large language models on video analysis tasks | 422 |