Bingo
Model evaluation tool
An analysis project investigating the limitations of vision-language models in understanding images, focusing on failures caused by bias and interference.
53 stars
3 watching
1 fork
last commit: 11 months ago

Related projects:
| Repository | Description | Stars |
|---|---|---|
| | An evaluation suite to assess language models' performance on multiple-choice questions | 93 |
| | An evaluation tool for co-saliency detection tasks | 97 |
| | An end-to-end trained model capable of generating natural language responses integrated with object segmentation masks for interactive visual conversations | 797 |
| | A collection of benchmarks to evaluate the multi-modal understanding capability of large vision language models | 168 |
| | A repository of papers and resources for evaluating large language models | 1,450 |
| | A benchmarking platform for evaluating Chinese general-purpose models through anonymous, random battles | 143 |
| | An evaluation toolkit for large vision-language models | 1,514 |
| | Measures large language models' understanding across massive multitask Chinese datasets | 87 |
| | Developing large language models for agricultural applications to improve crop yields and support rural development | 22 |
| | Evaluates the capabilities of large multimodal models using a set of diverse tasks and metrics | 274 |
| | A tool for evaluating and improving the fairness of machine learning models | 57 |
| | Evaluates and compares the performance of multimodal large language models on various tasks | 56 |
| | An investigation into the relationship between misleading images and hallucinations in large language models | 8 |
| | Provides tools to understand and interpret the decisions made by XGBoost models in machine learning | 253 |
| | An evaluation framework for multimodal language models' visual capabilities using image and question benchmarks | 296 |