EvALign-ICL

Multimodal model evaluator

Evaluating and improving large multimodal models through in-context learning

[ICLR 2024] (EvALign-ICL Benchmark) Beyond Task Performance: Evaluating and Reducing the Flaws of Large Multimodal Models with In-Context Learning

GitHub

21 stars

2 watching

0 forks

Language: Python

last commit: over 2 years ago

Related projects:

Repository	Description	Stars
pkunlp-icler/pca-eval	An open-source benchmark and evaluation tool for assessing multimodal large language models' performance in embodied decision-making tasks	99
ys-zong/vl-icl	A benchmarking suite for multimodal in-context learning models	31
chenllliang/mmevalpro	A benchmarking framework for evaluating Large Multimodal Models by providing rigorous metrics and an efficient evaluation pipeline.	22
freedomintelligence/mllm-bench	Evaluates and compares the performance of multimodal large language models on various tasks	56
x-plug/mplug-halowl	Evaluates and mitigates hallucinations in multimodal large language models	82
ailab-cvc/seed-bench	A benchmark for evaluating large language models' ability to process multimodal input	322
evolvinglmms-lab/lmms-eval	Tools and evaluation framework for accelerating the development of large multimodal models by providing an efficient way to assess their performance	2,164
yuweihao/mm-vet	Evaluates the capabilities of large multimodal models using a set of diverse tasks and metrics	274
uw-madison-lee-lab/cobsat	Provides a benchmarking framework and dataset for evaluating the performance of large language models in text-to-image tasks	30
yuliang-liu/multimodalocr	An evaluation benchmark for OCR capabilities in large multimodal models.	484
lancopku/iais	Proposes a method for calibrating attention distributions in multimodal models to improve contextualized representations of image-text pairs.	30
multimodal-art-projection/omnibench	Evaluates and benchmarks multimodal language models' ability to process visual, acoustic, and textual inputs simultaneously.	15
open-compass/vlmevalkit	An evaluation toolkit for large vision-language models	1,514
declare-lab/instruct-eval	An evaluation framework for large language models trained with instruction tuning methods	535
esmvalgroup/esmvaltool	A community-developed tool for evaluating climate models and providing diagnostic metrics.	230