LLMeBench

LLM benchmarker

A benchmarking framework for large language models

Benchmarking Large Language Models


80 stars
13 watching
16 forks
Language: Python
Last commit: about 2 months ago
Linked from 1 awesome list

benchmarking, large-language-models, llm, multilingual


Related projects:

Repository | Description | Stars
ray-project/llmperf | A tool for evaluating the performance of large language model APIs | 641
damo-nlp-sg/m3exam | A benchmark for evaluating large language models in multiple languages and formats | 92
bilibili/index-1.9b | A lightweight, multilingual language model with a long context length | 904
multimodal-art-projection/omnibench | Evaluates and benchmarks multimodal language models' ability to process visual, acoustic, and textual inputs simultaneously | 14
aifeg/benchlmm | An open-source benchmarking framework for evaluating the cross-style visual capability of large multimodal models | 83
felixgithub2017/mmcu | Evaluates the semantic understanding capabilities of large Chinese language models using a multimodal dataset | 87
nanbeige/nanbeige | Develops large language models for text understanding and generation tasks | 85
ailab-cvc/seed-bench | A benchmark for evaluating large language models' ability to process multimodal input | 315
xverse-ai/xverse-7b | A multilingual large language model developed by XVERSE Technology Inc. | 50
bytedance/lynx-llm | A framework for training GPT4-style language models with multimodal inputs using large datasets and pre-trained models | 229
tianyi-lab/hallusionbench | An image-context reasoning benchmark designed to challenge large vision-language models and help improve their accuracy | 243
pleisto/yuren-baichuan-7b | A multi-modal large language model that integrates natural language and visual capabilities with fine-tuning for various tasks | 72
gmftbygmftby/science-llm | A large-scale language model for the scientific domain, trained on the RedPajama arXiv split | 122
openbmb/bmlist | A curated list of large machine learning models tracked over time | 341
aiplanethub/beyondllm | An open-source toolkit for building and evaluating large language models | 261