CLUEPretrainedModels

Chinese language models

Provides pre-trained models for Chinese language tasks with improved performance and smaller model sizes compared to existing models.

A collection of high-quality Chinese pre-trained models: state-of-the-art large models, the fastest small models, and models specialized for semantic similarity. (Original tagline: 高质量中文预训练模型集合:最先进大模型、最快小模型、相似度专门模型)

GitHub

804 stars
19 watching
96 forks
Language: Python
last commit: over 4 years ago
Topics: albert, bert, chinese, corpus, dataset, distillation, pretrained-models, roberta, semantic-similarity, sentence-analysis, sentence-classification, sentence-pairs, text-classification

Related projects:

Repository | Description | Stars
--- | --- | ---
cluebenchmark/cluecorpus2020 | A large-scale pre-training corpus for Chinese language models | 925
cluebenchmark/electra | Trains and evaluates Chinese ELECTRA language models, pre-trained with replaced-token detection on a large corpus | 140
clue-ai/chatyuan | A large language model for dialogue with support for multiple languages | 1,902
clue-ai/promptclue | A pre-trained language model for multiple natural language processing tasks, with support for few-shot learning and transfer learning | 654
clue-ai/chatyuan-7b | An updated version of a large language model designed to improve performance across multiple tasks and datasets | 13
brightmart/xlnet_zh | Trains a large Chinese language model on massive data and provides a pre-trained model for downstream tasks | 230
cluebenchmark/supercluelyb | A benchmarking platform for evaluating Chinese general-purpose models through anonymous, randomized battles | 141
shannonai/chinesebert | A deep learning model that incorporates visual and phonetic features of Chinese characters to better capture nuances of the Chinese language | 542
yunwentechnology/unilm | Pre-trained models for natural language understanding and generation tasks based on the UniLM architecture | 438
hit-scir/chinese-mixtral-8x7b | An implementation of a large language model for Chinese text processing, built on a Mixture-of-Experts (MoE) architecture with an expanded vocabulary | 641
zhuiyitechnology/pretrained-models | A collection of pre-trained language models for natural language processing tasks | 987
felixgithub2017/mmcu | Evaluates the semantic understanding capabilities of large Chinese language models using a massive multitask dataset | 87
ymcui/macbert | Improves pre-trained Chinese language models by adding a correction task that alleviates the inconsistency between pre-training and downstream tasks | 645
nkcs-iclab/linglong | A pre-trained Chinese language model with a modest parameter count, designed to be accessible to researchers with limited computing resources | 17
tsinghuaai/cpm | Develops large-scale pre-trained models for Chinese natural language understanding and generation, aiming for efficient and effective models across applications | 163