BenchLMM

Visual Model Benchmark

An open-source benchmarking framework for evaluating the cross-style visual capability of large multimodal models.

[ECCV 2024] BenchLMM: Benchmarking Cross-style Visual Capability of Large Multimodal Models

GitHub

84 stars
0 watching
6 forks
Language: Python
last commit: 5 months ago
Topics: benchmark, cv, dataset, large-language-models, large-multimodal-models

Related projects:

| Repository | Description | Stars |
|---|---|---|
| ailab-cvc/seed-bench | A benchmark for evaluating large language models' ability to process multimodal input | 322 |
| damo-nlp-sg/m3exam | A benchmark for evaluating large language models in multiple languages and formats | 93 |
| qcri/llmebench | A benchmarking framework for large language models | 81 |
| ucsc-vlaa/vllm-safety-benchmark | A benchmark for evaluating the safety and robustness of vision language models against adversarial attacks | 72 |
| junyangwang0410/amber | An LLM-free benchmark suite for evaluating MLLMs' hallucination capabilities across various tasks and dimensions | 98 |
| freedomintelligence/mllm-bench | Evaluates and compares the performance of multimodal large language models on various tasks | 56 |
| szilard/benchm-ml | A benchmark for evaluating machine learning algorithms' performance on large datasets | 1,874 |
| multimodal-art-projection/omnibench | Evaluates and benchmarks multimodal language models' ability to process visual, acoustic, and textual inputs simultaneously | 15 |
| ailab-cvc/seed | An implementation of a multimodal language model with capabilities for comprehension and generation | 585 |
| bradyfu/video-mme | A comprehensive benchmark for evaluating multimodal large language models on video analysis tasks | 422 |
| felixgithub2017/mmcu | Measures the understanding of massive multitask Chinese datasets using large language models | 87 |
| i-gallegos/fair-llm-benchmark | Compiles bias evaluation datasets and provides access to original data sources for large language models | 115 |
| mlcommons/inference | Measures the performance of deep learning models in various deployment scenarios | 1,256 |
| tsb0601/mmvp | An evaluation framework for multimodal language models' visual capabilities using image and question benchmarks | 296 |
| lxtgh/omg-seg | Develops an end-to-end model for multiple visual perception and reasoning tasks using a single encoder, decoder, and large language model | 1,336 |