GPT-4V-Evaluation

GPT-4V evaluation tool

An evaluation framework for GPT-4V models using data from the paper "An Early Evaluation of GPT-4V(ision)"

Data for evaluating GPT-4V

11 stars

2 watching

0 forks

last commit: over 2 years ago

Related projects:

Repository	Description	Stars
scut-dlvclab/gpt-4v_ocr	Evaluates the Optical Character Recognition capabilities of GPT-4V(ision) using various tasks and scenarios to identify its strengths and weaknesses	121
pjlab-adg/gpt4v-ad-exploration	An autonomous driving project exploring the capabilities of a visual-language model in understanding complex driving scenes and making decisions	288
prometheus-eval/prometheus-eval	An open-source framework that enables language model evaluation using Prometheus and GPT4	820
0xeb/gpt-analyst	A resource repository providing tools and guides for analyzing and reverse engineering GPT models	184
ai-secure/decodingtrust	An assessment tool for evaluating trustworthiness in GPT models across various aspects such as toxicity, bias, robustness, and fairness	267
allenai/olmo-eval	A framework for evaluating language models on NLP tasks	326
open-compass/vlmevalkit	An evaluation toolkit for large vision-language models	1,514
yuweihao/mm-vet	Evaluates the capabilities of large multimodal models using a set of diverse tasks and metrics	274
vchitect/vbench	A benchmark suite for evaluating the performance of video generative models	643
jshilong/gpt4roi	Training and deploying large language models on computer vision tasks using region-of-interest inputs	517
langchain-ai/auto-evaluator	Automated evaluation of language models for question answering tasks	749
ailab-cvc/gpt4tools	An intelligent system that enables automatic control and utilization of visual foundation models to interact with images in conversational settings	762
gzcch/bingo	An analysis project investigating limitations of visual language models in understanding and processing images with potential biases and interference challenges	53
zzhanghub/eval-co-sod	An evaluation tool for co-saliency detection tasks	97
usepa/amet	Tools for evaluating and analyzing model predictions in atmospheric science	21