Yuan2.0-M32

Language Model

A Mixture-of-Experts language model designed for natural language understanding, mathematical computation, and code generation.

Mixture-of-Experts (MoE) Language Model

GitHub

180 stars
3 watching
40 forks
Language: Python
Last commit: 2 months ago

Related projects:

Repository | Description | Stars
shawn-ieitsystems/yuan-1.0 | Large-scale language model with improved performance on NLP tasks through distributed training and efficient data processing | 591
ieit-yuan/yuan-2.0 | An open-source large language model framework for building conversational AI applications | 681
yunwentechnology/unilm | Pre-trained models for natural language understanding and generation tasks using the UniLM architecture | 438
ibm-granite/granite-3.0-language-models | A collection of lightweight state-of-the-art language models designed to support multilinguality, coding, and reasoning tasks on constrained resources | 214
ymcui/lert | A pre-trained language model designed to leverage linguistic features and outperform comparable baselines on Chinese natural language understanding tasks | 202
01-ai/yi | A series of large language models trained from scratch to excel in multiple NLP tasks | 7,699
elanmart/psmm | An implementation of a neural network model for character-level language modeling | 50
felixgithub2017/mmcu | Evaluates the semantic understanding capabilities of large Chinese language models using a multimodal dataset | 87
xverse-ai/xverse-moe-a36b | Develops and publishes large multilingual language models with a mixture-of-experts architecture | 36
xverse-ai/xverse-13b | A large language model developed to support multiple languages and applications | 649
zhuiyitechnology/roformer-sim | An upgraded version of the SimBERT model with integrated retrieval and generation capabilities | 438
clue-ai/chatyuan | Large language model for dialogue support in multiple languages | 1,902
tencent/tencent-hunyuan-large | Makes a large language model accessible for research and development | 1,114
xverse-ai/xverse-65b | A large language model developed by XVERSE Technology Inc. using a transformer architecture and fine-tuned on diverse datasets for various applications | 132
clue-ai/chatyuan-7b | An updated version of a large language model designed to improve performance on multiple tasks and datasets | 13