MultimodalOCR

OCR Benchmark

An evaluation benchmark for OCR capabilities in large multimodal models.

On the Hidden Mystery of OCR in Large Multimodal Models (OCRBench)

GitHub

484 stars
15 watching
32 forks
Language: Python
Last commit: 3 months ago

Related projects:

Repository | Description | Stars
multimodal-art-projection/omnibench | Evaluates and benchmarks multimodal language models' ability to process visual, acoustic, and textual inputs simultaneously. | 15
yuliang-liu/monkey | An end-to-end image captioning system that uses large multi-modal models and provides tools for training, inference, and demo usage. | 1,849
ailab-cvc/seed-bench | A benchmark for evaluating large language models' ability to process multimodal input | 322
mshukor/evalign-icl | Evaluating and improving large multimodal models through in-context learning | 21
ys-zong/vl-icl | A benchmarking suite for multimodal in-context learning models | 31
felixgithub2017/mmcu | Measures the understanding of massive multitask Chinese datasets using large language models | 87
oeg-upm/lubm4obda | Evaluates Ontology-Based Data Access systems with inference and meta knowledge benchmarking | 4
openml/automlbenchmark | A framework for evaluating and comparing machine learning pipelines and neural architectures. | 413
qcri/llmebench | A benchmarking framework for large language models | 81
aifeg/benchlmm | An open-source benchmarking framework for evaluating cross-style visual capability of large multimodal models | 84
yuweihao/mm-vet | Evaluates the capabilities of large multimodal models using a set of diverse tasks and metrics | 274
uw-madison-lee-lab/cobsat | Provides a benchmarking framework and dataset for evaluating the performance of large language models in text-to-image tasks | 30
pkunlp-icler/pca-eval | An open-source benchmark and evaluation tool for assessing multimodal large language models' performance in embodied decision-making tasks | 99
chenllliang/mmevalpro | A benchmarking framework for evaluating Large Multimodal Models by providing rigorous metrics and an efficient evaluation pipeline. | 22
fuxiaoliu/mmc | Develops a large-scale dataset and benchmark for training multimodal chart understanding models using large language models. | 87