CMMMU
Multimodal QA Benchmark
A benchmark for evaluating the performance of multimodal question answering models across diverse domains and data types.
46 stars
2 watching
1 fork
Language: Python
Last commit: about 1 year ago

Related projects:
| Repository | Description | Stars |
|---|---|---|
| | Measures massive multitask language understanding of large language models in Chinese | 87 |
| | A benchmark for evaluating large language models in multiple languages and formats | 93 |
| | Evaluates and benchmarks multimodal language models' ability to process visual, acoustic, and textual inputs simultaneously | 15 |
| | An open-source benchmarking framework for evaluating cross-style visual capability of large multimodal models | 84 |
| | A benchmark for evaluating large language models' ability to process multimodal input | 322 |
| | A project demonstrating how to create a reproducible machine learning model using Python and version control | 86 |
| | A benchmarking framework for large language models | 81 |
| | A benchmarking package for the Go language | 61 |
| | An LLM-free benchmark suite for evaluating hallucination in MLLMs across various tasks and dimensions | 98 |
| | Measures the performance of deep learning models in various deployment scenarios | 1,256 |
| | An evaluation suite to assess language models' performance in multi-choice questions | 93 |
| | A benchmarking project comparing the performance of different programming languages' regex engines | 315 |
| | A benchmarking suite for evaluating the performance of memory-centric computing architectures | 142 |
| | Evaluates and compares the performance of multimodal large language models on various tasks | 56 |
| | A comprehensive benchmark for evaluating multimodal large language models on video analysis tasks | 422 |
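
Benchmarks in this family are typically consumed as a dataset of multiple-choice questions scored against model predictions. The following is a minimal sketch of such an evaluation loop in Python; the Hugging Face Hub repo ID (`m-a-p/CMMMU`), the split and field names (`question`, `options`, `answer`), and the `my_model_answer` helper are all assumptions for illustration, not details confirmed by this page.

```python
# Minimal sketch of a multiple-choice accuracy loop for a CMMMU-style
# benchmark. Assumptions (not confirmed by this page): the dataset is
# hosted on the Hugging Face Hub as "m-a-p/CMMMU" with "question",
# "options", and "answer" fields; some benchmarks also require a
# config/subset name. my_model_answer() is a hypothetical placeholder
# for a real multimodal model's inference call.
from datasets import load_dataset


def my_model_answer(question: str, options: list[str]) -> str:
    """Hypothetical model call; replace with real multimodal inference."""
    return "A"  # placeholder prediction


def evaluate(split: str = "val") -> float:
    ds = load_dataset("m-a-p/CMMMU", split=split)  # assumed repo ID
    correct = 0
    for example in ds:
        pred = my_model_answer(example["question"], example["options"])
        correct += pred == example["answer"]
    return correct / len(ds)


if __name__ == "__main__":
    print(f"accuracy: {evaluate():.3f}")
```

In practice, per-subject or per-domain accuracy is usually reported alongside the overall score, which would mean grouping examples by a subject field before averaging.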