GPT-4V-Evaluation
GPT-4V evaluation tool
An evaluation framework for GPT-4V models, using data from "An Early Evaluation of GPT-4V(ision)"
Data for evaluating GPT-4V
11 stars
2 watching
0 forks
last commit: about 1 year ago

Related projects:
Repository | Description | Stars |
---|---|---|
scut-dlvclab/gpt-4v_ocr | Evaluates the Optical Character Recognition capabilities of GPT-4V(ision) using various tasks and scenarios to identify its strengths and weaknesses | 121 |
pjlab-adg/gpt4v-ad-exploration | An autonomous driving project exploring the capabilities of a visual-language model in understanding complex driving scenes and making decisions | 288 |
prometheus-eval/prometheus-eval | An open-source framework that enables language model evaluation using Prometheus and GPT4 | 820 |
0xeb/gpt-analyst | A resource repository providing tools and guides for analyzing and reverse engineering GPT models | 184 |
ai-secure/decodingtrust | An assessment tool for evaluating trustworthiness in GPT models across aspects such as toxicity, bias, robustness, and fairness | 267 |
allenai/olmo-eval | A framework for evaluating language models on NLP tasks | 326 |
open-compass/vlmevalkit | An evaluation toolkit for large vision-language models | 1,514 |
yuweihao/mm-vet | Evaluates the capabilities of large multimodal models using a set of diverse tasks and metrics | 274 |
vchitect/vbench | A benchmark suite for evaluating the performance of video generative models | 643 |
jshilong/gpt4roi | Training and deploying large language models on computer vision tasks using region-of-interest inputs | 517 |
langchain-ai/auto-evaluator | Automated evaluation of language models for question answering tasks | 749 |
ailab-cvc/gpt4tools | A system that lets a language model automatically control and use visual foundation models to interact with images in conversational settings | 762 |
gzcch/bingo | An analysis project investigating the limitations of visual language models in understanding and processing images, focusing on bias and interference challenges | 53 |
zzhanghub/eval-co-sod | An evaluation tool for co-saliency detection tasks | 97 |
usepa/amet | Tools for evaluating and analyzing model predictions in atmospheric science | 21 |