MultimodalOCR

OCR Benchmark

An evaluation benchmark for OCR capabilities in large multimodal models.

On the Hidden Mystery of OCR in Large Multimodal Models (OCRBench)

GitHub

471 stars
14 watching
30 forks
Language: Python
Last commit: about 1 month ago

Related projects:

| Repository | Description | Stars |
|---|---|---|
| multimodal-art-projection/omnibench | Evaluates and benchmarks multimodal language models' ability to process visual, acoustic, and textual inputs simultaneously | 14 |
| yuliang-liu/monkey | A toolkit for building conversational AI models that can process images and text inputs | 1,825 |
| ailab-cvc/seed-bench | A benchmark for evaluating large language models' ability to process multimodal input | 315 |
| mshukor/evalign-icl | Evaluating and improving large multimodal models through in-context learning | 20 |
| ys-zong/vl-icl | A benchmarking suite for multimodal in-context learning models | 28 |
| felixgithub2017/mmcu | Evaluates the semantic understanding capabilities of large Chinese language models using a multimodal dataset | 87 |
| oeg-upm/lubm4obda | Evaluates Ontology-Based Data Access systems with inference and meta knowledge benchmarking | 4 |
| openml/automlbenchmark | A framework for evaluating and comparing AutoML systems in a standardized way | 405 |
| qcri/llmebench | A benchmarking framework for large language models | 80 |
| aifeg/benchlmm | An open-source benchmarking framework for evaluating cross-style visual capability of large multimodal models | 83 |
| yuweihao/mm-vet | Evaluates the capabilities of large multimodal models using a set of diverse tasks and metrics | 267 |
| uw-madison-lee-lab/cobsat | Provides a benchmarking framework and dataset for evaluating the performance of large language models in text-to-image tasks | 28 |
| pkunlp-icler/pca-eval | An open-source benchmark and evaluation tool for assessing multimodal large language models' performance in embodied decision-making tasks | 100 |
| chenllliang/mmevalpro | A benchmarking framework for evaluating Large Multimodal Models by providing rigorous metrics and an efficient evaluation pipeline | 22 |
| fuxiaoliu/mmc | Develops a large-scale dataset and benchmark for training multimodal chart understanding models using large language models | 84 |