CLUEPretrainedModels

Chinese language models

Provides pre-trained models for Chinese language tasks with improved performance and smaller model sizes compared to existing models.

A collection of high-quality Chinese pre-trained models: state-of-the-art large models, the fastest small models, and models specialized for semantic similarity. (Original tagline: 高质量中文预训练模型集合:最先进大模型、最快小模型、相似度专门模型)

GitHub

804 stars
19 watching
96 forks
Language: Python
last commit: over 4 years ago
Topics: albert, bert, chinese, corpus, dataset, distillation, pretrained-models, roberta, semantic-similarity, sentence-analysis, sentence-classification, sentence-pairs, text-classification

Related projects:

Repository | Description | Stars
--- | --- | ---
cluebenchmark/cluecorpus2020 | A large-scale pre-training corpus for Chinese language models | 925
cluebenchmark/electra | Trains and evaluates Chinese ELECTRA language models, pre-trained with replaced-token detection on a large corpus | 140
clue-ai/chatyuan | A large language model for dialogue with support for multiple languages | 1,902
clue-ai/promptclue | A pre-trained language model for multiple natural language processing tasks, with support for few-shot learning and transfer learning | 654
clue-ai/chatyuan-7b | An updated version of a large language model designed to improve performance across multiple tasks and datasets | 13
brightmart/xlnet_zh | Trains a large Chinese language model on massive data and provides a pre-trained model for downstream tasks | 230
cluebenchmark/supercluelyb | A benchmarking platform for evaluating Chinese general-purpose models through anonymous, randomized battles | 141
shannonai/chinesebert | A deep learning model that incorporates visual and phonetic features of Chinese characters to better capture nuances of the Chinese language | 542
yunwentechnology/unilm | Pre-trained models for natural language understanding and generation tasks based on the UniLM architecture | 438
hit-scir/chinese-mixtral-8x7b | An implementation of a large language model for Chinese text processing, built on a Mixture-of-Experts (MoE) architecture with an expanded vocabulary | 641
zhuiyitechnology/pretrained-models | A collection of pre-trained language models for natural language processing tasks | 987
felixgithub2017/mmcu | Evaluates the semantic understanding capabilities of large Chinese language models using a massive multitask dataset | 87
ymcui/macbert | Improves pre-trained Chinese language models by adding a correction task that alleviates the inconsistency between pre-training and downstream tasks | 645
nkcs-iclab/linglong | A pre-trained Chinese language model with a modest parameter count, designed to be accessible to researchers with limited computing resources | 17
tsinghuaai/cpm | Develops large-scale pre-trained models for Chinese natural language understanding and generation, aiming for efficient and effective models across applications | 163