LLMeBench
LLM benchmarker
A benchmarking framework for large language models
Benchmarking Large Language Models
80 stars
13 watching
16 forks
Language: Python
last commit: about 2 months ago
Linked from 1 awesome list
benchmarking, large-language-models, llm, multilingual
Related projects:
Repository | Description | Stars |
---|---|---|
ray-project/llmperf | A tool for evaluating the performance of large language model APIs | 641 |
damo-nlp-sg/m3exam | A benchmark for evaluating large language models in multiple languages and formats | 92 |
bilibili/index-1.9b | A lightweight, multilingual language model with a long context length | 904 |
multimodal-art-projection/omnibench | Evaluates and benchmarks multimodal language models' ability to process visual, acoustic, and textual inputs simultaneously | 14 |
aifeg/benchlmm | An open-source benchmarking framework for evaluating cross-style visual capability of large multimodal models | 83 |
felixgithub2017/mmcu | Evaluates the semantic understanding capabilities of large Chinese language models using a multimodal dataset | 87 |
nanbeige/nanbeige | Develops large language models for text understanding and generation tasks | 85 |
ailab-cvc/seed-bench | A benchmark for evaluating large language models' ability to process multimodal input | 315 |
xverse-ai/xverse-7b | A multilingual large language model developed by XVERSE Technology Inc. | 50 |
bytedance/lynx-llm | A framework for training GPT4-style language models with multimodal inputs using large datasets and pre-trained models | 229 |
tianyi-lab/hallusionbench | An image-context reasoning benchmark designed to challenge large vision-language models and help improve their accuracy | 243 |
pleisto/yuren-baichuan-7b | A multi-modal large language model that integrates natural language and visual capabilities with fine-tuning for various tasks | 72 |
gmftbygmftby/science-llm | A large-scale language model for the scientific domain, trained on the RedPajama arXiv split | 122 |
openbmb/bmlist | A curated list of large machine learning models tracked over time | 341 |
aiplanethub/beyondllm | An open-source toolkit for building and evaluating large language models | 261 |