MultimodalOCR
OCR Benchmark
An evaluation benchmark for OCR capabilities in large multimodal models.
On the Hidden Mystery of OCR in Large Multimodal Models (OCRBench)
471 stars
14 watching
30 forks
Language: Python
Last commit: about 1 month ago

Related projects:
Repository | Description | Stars |
---|---|---|
multimodal-art-projection/omnibench | Evaluates and benchmarks multimodal language models' ability to process visual, acoustic, and textual inputs simultaneously. | 14 |
yuliang-liu/monkey | A toolkit for building conversational AI models that can process images and text inputs. | 1,825 |
ailab-cvc/seed-bench | A benchmark for evaluating large language models' ability to process multimodal input | 315 |
mshukor/evalign-icl | Evaluating and improving large multimodal models through in-context learning | 20 |
ys-zong/vl-icl | A benchmarking suite for multimodal in-context learning models | 28 |
felixgithub2017/mmcu | Evaluates the semantic understanding capabilities of large Chinese language models using a multimodal dataset. | 87 |
oeg-upm/lubm4obda | Evaluates Ontology-Based Data Access systems with inference and meta knowledge benchmarking | 4 |
openml/automlbenchmark | A framework for evaluating and comparing AutoML systems in a standardized way | 405 |
qcri/llmebench | A benchmarking framework for large language models | 80 |
aifeg/benchlmm | An open-source benchmarking framework for evaluating cross-style visual capability of large multimodal models | 83 |
yuweihao/mm-vet | Evaluates the capabilities of large multimodal models using a set of diverse tasks and metrics | 267 |
uw-madison-lee-lab/cobsat | Provides a benchmarking framework and dataset for evaluating the performance of large language models in text-to-image tasks | 28 |
pkunlp-icler/pca-eval | An open-source benchmark and evaluation tool for assessing multimodal large language models' performance in embodied decision-making tasks | 100 |
chenllliang/mmevalpro | A benchmarking framework for evaluating Large Multimodal Models by providing rigorous metrics and an efficient evaluation pipeline. | 22 |
fuxiaoliu/mmc | Develops a large-scale dataset and benchmark for training multimodal chart understanding models using large language models. | 84 |