GPT-4V-Evaluation
GPT-4V evaluation tool
An evaluation framework for GPT-4V models, using data from the paper "An Early Evaluation of GPT-4V(ision)"
Data for evaluating GPT-4V
11 stars
2 watching
0 forks
last commit: about 1 year ago

Related projects:

Repository | Description | Stars |
---|---|---|
scut-dlvclab/gpt-4v_ocr | Evaluates the Optical Character Recognition capabilities of GPT-4V(ision) using various tasks and scenarios to identify its strengths and weaknesses | 120 |
pjlab-adg/gpt4v-ad-exploration | An autonomous driving project exploring the capabilities of a visual-language model in understanding complex driving scenes and making decisions | 287 |
prometheus-eval/prometheus-eval | An open-source framework for evaluating language models using Prometheus and GPT-4 | 796 |
0xeb/gpt-analyst | A resource repository providing tools and guides for analyzing and reverse engineering GPT models. | 181 |
ai-secure/decodingtrust | An assessment tool for evaluating trustworthiness in GPT models across various aspects such as toxicity, bias, robustness, and fairness. | 259 |
allenai/olmo-eval | An evaluation framework for large language models. | 310 |
open-compass/vlmevalkit | A toolkit for evaluating large vision-language models on various benchmarks and datasets. | 1,343 |
yuweihao/mm-vet | Evaluates the capabilities of large multimodal models using a set of diverse tasks and metrics | 267 |
vchitect/vbench | A tool for evaluating and benchmarking video generative models in computer vision and artificial intelligence | 576 |
jshilong/gpt4roi | Training and deploying large language models on computer vision tasks using region-of-interest inputs | 506 |
langchain-ai/auto-evaluator | Automated evaluation of language models for question answering tasks | 744 |
ailab-cvc/gpt4tools | An intelligent system that enables automatic control and utilization of visual foundation models to interact with images in conversational settings. | 760 |
gzcch/bingo | An analysis project investigating the limitations of vision-language models in image understanding, with a focus on bias and interference. | 53 |
zzhanghub/eval-co-sod | An evaluation tool for co-saliency detection tasks | 96 |
usepa/amet | Tools for evaluating and analyzing model predictions in atmospheric science | 21 |