MMCU

Chinese understanding benchmark

Measures how well large language models understand Chinese across a massive multitask dataset

MEASURING MASSIVE MULTITASK CHINESE UNDERSTANDING

GitHub

87 stars
2 watching
12 forks
Language: Python
last commit: 10 months ago

Related projects:

| Repository | Description | Stars |
| --- | --- | --- |
| ieit-yuan/yuan2.0-m32 | A high-performance language model designed to excel in tasks like natural language understanding, mathematical computation, and code generation | 182 |
| hit-scir/chinese-mixtral-8x7b | An implementation of a large language model for Chinese text processing, built on a Mixture-of-Experts (MoE) architecture and incorporating a vast vocabulary | 645 |
| fuxiaoliu/mmc | Develops a large-scale dataset and benchmark for training multimodal chart understanding models using large language models | 87 |
| mikegu721/xiezhibenchmark | An evaluation suite to assess language models' performance on multiple-choice questions | 93 |
| pku-yuangroup/video-bench | Evaluates and benchmarks large language models' video understanding capabilities | 121 |
| cluebenchmark/cluepretrainedmodels | Provides pre-trained models for Chinese language tasks with improved performance and smaller model sizes compared to existing models | 806 |
| tencent/tencent-hunyuan-large | Makes a large language model accessible for research and development | 1,245 |
| qcri/llmebench | A benchmarking framework for large language models | 81 |
| freedomintelligence/mllm-bench | Evaluates and compares the performance of multimodal large language models on various tasks | 56 |
| yunwentechnology/unilm | Provides pre-trained models and tools for natural language understanding (NLU) and generation (NLG) tasks in Chinese | 439 |
| damo-nlp-sg/m3exam | A benchmark for evaluating large language models in multiple languages and formats | 93 |
| brightmart/xlnet_zh | Trains a large Chinese language model on massive data and provides a pre-trained model for downstream tasks | 230 |
| cluebenchmark/electra | Trains and evaluates a Chinese language model using adversarial training on a large corpus | 140 |
| shawn-ieitsystems/yuan-1.0 | Large-scale language model with improved performance on NLP tasks through distributed training and efficient data processing | 591 |
| yuliang-liu/multimodalocr | An evaluation benchmark for OCR capabilities in large multimodal models | 484 |