TouchStone

Vision Model Evaluator

A tool to evaluate vision-language models by comparing their performance on various tasks such as image recognition and text generation.

TouchStone: Evaluating Vision-Language Models by Language Models

GitHub

78 stars
3 watching
0 forks
Language: Python
Last commit: 10 months ago

Related projects:

Repository | Description | Stars
openai/simple-evals | A library for evaluating language models using standardized prompts and benchmarking tests. | 1,939
huggingface/evaluate | An evaluation framework for machine learning models and datasets, providing standardized metrics and tools for comparing model performance. | 2,034
allenai/olmo-eval | An evaluation framework for large language models. | 310
pkunlp-icler/pca-eval | An open-source benchmark and evaluation tool for assessing multimodal large language models' performance in embodied decision-making tasks. | 100
edublancas/sklearn-evaluation | A tool for evaluating and visualizing machine learning model performance. | 3
open-compass/vlmevalkit | A toolkit for evaluating large vision-language models on various benchmarks and datasets. | 1,343
modelscope/evalscope | A framework for efficient large model evaluation and performance benchmarking. | 248
vchitect/vbench | A tool for evaluating and benchmarking video generative models. | 576
truskovskiyk/nima.pytorch | Assesses and evaluates images using deep learning models. | 335
tsb0601/mmvp | An evaluation framework for multimodal language models' visual capabilities using image and question benchmarks. | 288
zhourax/vega | A multimodal task and dataset for assessing vision-language models' ability to handle interleaved image-text inputs. | 33
ucsc-vlaa/vllm-safety-benchmark | A benchmark for evaluating the safety and robustness of vision-language models against adversarial attacks. | 67
chenllliang/mmevalpro | A benchmarking framework for evaluating large multimodal models, providing rigorous metrics and an efficient evaluation pipeline. | 22
huggingface/lighteval | A toolkit for evaluating large language models across multiple backends. | 804
vishaal27/sus-x | An open-source project proposing a method to adapt large-scale vision-language models with minimal resources and no fine-tuning. | 94