SuperCLUElyb

Model benchmark

A benchmarking platform for evaluating Chinese general-purpose models through anonymous, random battles

SuperCLUE Langya Leaderboard (琅琊榜): an anonymous battle-based evaluation benchmark for Chinese general-purpose large models

GitHub

141 stars
5 watching
6 forks
last commit: 5 months ago

Related projects:

Repository | Description | Stars
cluebenchmark/cluepretrainedmodels | Provides pre-trained models for Chinese language tasks with improved performance and smaller model sizes compared to existing models | 804
cluebenchmark/cluecorpus2020 | A large-scale pre-training corpus for Chinese language models | 925
cluebenchmark/electra | Trains and evaluates a Chinese language model using adversarial training on a large corpus | 140
cluebenchmark/pclue | A large-scale dataset for training models to perform multiple tasks and zero-shot learning in natural language processing | 468
clue-ai/promptclue | A pre-trained language model for multiple natural language processing tasks with support for few-shot learning and transfer learning | 654
felixgithub2017/mmcu | Evaluates the semantic understanding capabilities of large Chinese language models using a multimodal dataset | 87
clue-ai/chatyuan-7b | An updated version of a large language model designed to improve performance on multiple tasks and datasets | 13
qcri/llmebench | A benchmarking framework for large language models | 80
aifeg/benchlmm | An open-source benchmarking framework for evaluating cross-style visual capability of large multimodal models | 83
catboost/benchmarks | Comparative benchmarks of various machine learning algorithms | 169
bitshifter/mathbench-rs | A benchmarking framework comparing performance of different Rust linear algebra libraries | 198
ibob/picobench | A microbenchmarking library for C++ | 211
yuliang-liu/multimodalocr | An evaluation benchmark for OCR capabilities in large multimodal models | 471
robustbench/robustbench | A standardized benchmark for measuring the robustness of machine learning models against adversarial attacks | 667
clue-ai/chatyuan | Large language model for dialogue support in multiple languages | 1,902