MMCU

Chinese understanding benchmark

Measures how well large language models understand Chinese across a massive set of multitask datasets

MEASURING MASSIVE MULTITASK CHINESE UNDERSTANDING

GitHub

87 stars

2 watching

12 forks

Language: Python

last commit: over 1 year ago

Related projects:

Repository	Description	Stars
ieit-yuan/yuan2.0-m32	A high-performance language model designed to excel in tasks like natural language understanding, mathematical computation, and code generation	182
hit-scir/chinese-mixtral-8x7b	An implementation of a large language model for Chinese text processing, focusing on an MoE (Mixture-of-Experts) architecture and incorporating a vast vocabulary.	645
fuxiaoliu/mmc	Develops a large-scale dataset and benchmark for training multimodal chart understanding models using large language models.	87
mikegu721/xiezhibenchmark	An evaluation suite to assess language models' performance on multiple-choice questions	93
pku-yuangroup/video-bench	Evaluates and benchmarks large language models' video understanding capabilities	121
cluebenchmark/cluepretrainedmodels	Provides pre-trained models for Chinese language tasks with improved performance and smaller model sizes compared to existing models.	806
tencent/tencent-hunyuan-large	This project makes a large language model accessible for research and development	1,245
qcri/llmebench	A benchmarking framework for large language models	81
freedomintelligence/mllm-bench	Evaluates and compares the performance of multimodal large language models on various tasks	56
yunwentechnology/unilm	This project provides pre-trained models and tools for natural language understanding (NLU) and generation (NLG) tasks in Chinese.	439
damo-nlp-sg/m3exam	A benchmark for evaluating large language models in multiple languages and formats	93
brightmart/xlnet_zh	Trains a large Chinese language model on massive data and provides a pre-trained model for downstream tasks	230
cluebenchmark/electra	Trains and evaluates a Chinese language model using ELECTRA-style replaced-token-detection pre-training on a large corpus.	140
shawn-ieitsystems/yuan-1.0	Large-scale language model with improved performance on NLP tasks through distributed training and efficient data processing	591
yuliang-liu/multimodalocr	An evaluation benchmark for OCR capabilities in large multimodal models.	484