TouchStone
Vision Model Evaluator
A tool that evaluates vision-language models by using a strong language model to judge their performance on tasks such as image recognition and text generation.
Touchstone: Evaluating Vision-Language Models by Language Models
78 stars
3 watching
0 forks
Language: Python
Last commit: 10 months ago
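
The paper's core idea (per the title above) is to have a language model act as the judge: it receives a textual description of the image together with the question and the candidate model's answer, and assigns a score. Below is a minimal sketch of that LLM-as-judge pattern using the `openai` Python client; the judge model, prompt wording, and 0-10 rubric are illustrative assumptions, not TouchStone's actual code.

```python
# Minimal LLM-as-judge sketch. Assumes OPENAI_API_KEY is set; the model name,
# prompt, and scoring rubric below are illustrative, not TouchStone's own.
from openai import OpenAI

client = OpenAI()

def judge_answer(question: str, reference: str, candidate: str) -> float:
    """Ask a judge LLM to score a candidate answer from 0 to 10."""
    prompt = (
        "You are grading a vision-language model.\n"
        f"Image description (ground truth): {reference}\n"
        f"Question: {question}\n"
        f"Model answer: {candidate}\n"
        "Reply with a single integer score from 0 (wrong) to 10 (perfect)."
    )
    resp = client.chat.completions.create(
        model="gpt-4o",  # any strong judge model works; choice is illustrative
        messages=[{"role": "user", "content": prompt}],
    )
    # Assumes the judge complies and replies with a bare integer.
    return float(resp.choices[0].message.content.strip())
```
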
Related projects:

Repository | Description | Stars |
---|---|---|
openai/simple-evals | A library for evaluating language models using standardized prompts and benchmarking tests. | 1,939 |
huggingface/evaluate | An evaluation framework for machine learning models and datasets, providing standardized metrics and tools for comparing model performance (see the usage sketch after this table). | 2,034 |
allenai/olmo-eval | An evaluation framework for large language models. | 310 |
pkunlp-icler/pca-eval | An open-source benchmark and evaluation tool for assessing multimodal large language models' performance in embodied decision-making tasks. | 100 |
edublancas/sklearn-evaluation | A tool for evaluating and visualizing machine learning model performance. | 3 |
open-compass/vlmevalkit | A toolkit for evaluating large vision-language models on various benchmarks and datasets. | 1,343 |
modelscope/evalscope | A framework for efficient large model evaluation and performance benchmarking. | 248 |
vchitect/vbench | A tool for evaluating and benchmarking video generative models. | 576 |
truskovskiyk/nima.pytorch | Assesses image quality using deep learning models (a PyTorch implementation of NIMA). | 335 |
tsb0601/mmvp | An evaluation framework for multimodal language models' visual capabilities using image and question benchmarks. | 288 |
zhourax/vega | Develops a multimodal task and dataset to assess vision-language models' ability to handle interleaved image-text inputs. | 33 |
ucsc-vlaa/vllm-safety-benchmark | A benchmark for evaluating the safety and robustness of vision language models against adversarial attacks. | 67 |
chenllliang/mmevalpro | A benchmarking framework for evaluating Large Multimodal Models by providing rigorous metrics and an efficient evaluation pipeline. | 22 |
huggingface/lighteval | A toolkit for evaluating Large Language Models across multiple backends. | 804 |
vishaal27/sus-x | An open-source method for adapting large-scale vision-language models to downstream tasks with minimal resources and no fine-tuning required. | 94 |
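
Several of the toolkits above follow the same load-a-metric-then-score pattern. As a concrete illustration, here is a minimal usage sketch for huggingface/evaluate; the metric name and toy labels are illustrative, not taken from any of the listed projects.

```python
import evaluate  # pip install evaluate

# Load a standardized metric by name and score predictions against references.
accuracy = evaluate.load("accuracy")
result = accuracy.compute(
    references=[0, 1, 1, 0],   # toy ground-truth labels
    predictions=[0, 1, 0, 0],  # toy model outputs
)
print(result)  # {'accuracy': 0.75}
```
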