lmms-eval
Model evaluation toolkit
An evaluation framework and toolset that accelerates the development of large multimodal models (LMMs) by providing an efficient, one-click way to assess their performance.
2k stars
3 watching
168 forks
Language: Python
Last commit: 1 day ago
Linked from 1 awesome list
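A minimal sketch of a typical run, based on the project's lm-evaluation-harness-style CLI; the model name, task identifier, and flags below are illustrative and may differ across versions:

```bash
# Evaluate a LLaVA checkpoint on the MME benchmark (illustrative values;
# check the lmms-eval docs for the models and tasks your version supports).
python -m lmms_eval \
    --model llava \
    --model_args pretrained=liuhaotian/llava-v1.5-7b \
    --tasks mme \
    --batch_size 1 \
    --log_samples \
    --output_path ./logs/
```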
Related projects:
| Repository | Description | Stars |
|---|---|---|
| freedomintelligence/mllm-bench | Evaluates and compares the performance of multimodal large language models on various tasks | 56 |
| chenllliang/mmevalpro | A benchmarking framework for evaluating large multimodal models with rigorous metrics and an efficient evaluation pipeline | 22 |
| mlgroupjlu/llm-eval-survey | A repository of papers and resources for evaluating large language models | 1,450 |
| allenai/olmo-eval | A framework for evaluating language models on NLP tasks | 326 |
| mlabonne/llm-autoeval | A tool to automate the evaluation of large language models in Google Colab using various benchmarks and custom parameters | 566 |
| mshukor/evalign-icl | Evaluates and improves large multimodal models through in-context learning | 21 |
| open-compass/vlmevalkit | An evaluation toolkit for large vision-language models | 1,514 |
| declare-lab/instruct-eval | An evaluation framework for large language models trained with instruction-tuning methods | 535 |
| prometheus-eval/prometheus-eval | An open-source framework for language model evaluation using Prometheus and GPT-4 | 820 |
| esmvalgroup/esmvaltool | A community-developed tool for evaluating climate models and providing diagnostic metrics | 230 |
| h2oai/h2o-llm-eval | An evaluation framework for large language models with an Elo rating system and A/B testing capabilities | 50 |
| maluuba/nlg-eval | A toolset for evaluating and comparing natural language generation models | 1,350 |
| huggingface/lighteval | An all-in-one toolkit for evaluating large language models (LLMs) across multiple backends | 879 |
| evolvinglmms-lab/longva | An open-source project that transfers long-context understanding from language to vision | 347 |
| modelscope/evalscope | A framework for efficiently evaluating and benchmarking large models | 308 |