Chinese-Mixtral

Mixtral model developer

Develops and releases Mixtral-based models for natural language processing tasks with a focus on Chinese text generation and understanding

Chinese Mixtral mixture-of-experts large language models (Chinese Mixtral MoE LLMs)

GitHub

584 stars
15 watching
43 forks
Language: Python
Last commit: 7 months ago
Topics: 32k, 64k, large-language-models, llm, mixtral, mixture-of-experts, moe, nlp
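
For readers who want to try the released checkpoints, the sketch below shows one plausible way to load and run them with Hugging Face transformers. The Hub ID `hfl/chinese-mixtral` is an assumption (this page does not state the checkpoint name); check the repository README for the actual download instructions.

```python
# Minimal sketch of loading a Chinese-Mixtral checkpoint with Hugging Face
# transformers. The model ID below is an assumption, not confirmed by this
# page; consult the repository README for the published checkpoint name.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "hfl/chinese-mixtral"  # assumed Hub ID

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # MoE weights are large; half precision helps
    device_map="auto",          # shard across available GPUs
)

prompt = "请介绍一下混合专家模型。"  # "Please introduce mixture-of-experts models."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```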

Related projects:

| Repository | Description | Stars |
|---|---|---|
| hit-scir/chinese-mixtral-8x7b | An implementation of a large language model for Chinese text processing, built on a mixture-of-experts (MoE) architecture with an expanded Chinese vocabulary | 641 |
| ymcui/chinese-xlnet | Provides pre-trained models for Chinese natural language processing tasks using the XLNet architecture | 1,653 |
| ymcui/macbert | Improves pre-trained Chinese language models by incorporating a correction task to alleviate inconsistency issues with downstream tasks | 645 |
| ymcui/chinese-mobilebert | An implementation of MobileBERT, a pre-trained language model, in Python for NLP tasks | 80 |
| ymcui/chinese-electra | Provides pre-trained Chinese language models based on the ELECTRA framework for natural language processing tasks | 1,403 |
| ymcui/lert | A pre-trained language model designed to leverage linguistic features and outperform comparable baselines on Chinese natural language understanding tasks | 202 |
| brightmart/xlnet_zh | Trains a large Chinese language model on massive data and provides a pre-trained model for downstream tasks | 230 |
| ymcui/pert | Develops a pre-trained language model that learns semantic knowledge from permuted text without mask labels | 354 |
| michael-wzhu/shennong-tcm-llm | Develops and deploys a large language model for traditional Chinese medicine applications | 299 |
| xverse-ai/xverse-moe-a4.2b | A multilingual large language model from XVERSE Technology Inc. with a mixture-of-experts architecture, fine-tuned for tasks such as conversation, question answering, and natural language understanding | 36 |
| xverse-ai/xverse-moe-a36b | Develops and publishes large multilingual language models with a mixture-of-experts architecture | 36 |
| yunwentechnology/unilm | Provides pre-trained models for natural language understanding and generation tasks using the UniLM architecture | 438 |
| michael-wzhu/chinese-llama2 | A Chinese adaptation of Meta's Llama 2 model for improved Chinese language support | 747 |
| pleisto/yuren-baichuan-7b | A multi-modal large language model that integrates natural language and visual capabilities, fine-tuned for various tasks | 72 |
| pku-yuangroup/moe-llava | Develops a neural network architecture for multi-modal learning with large vision-language models | 1,980 |
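
Several of the entries above, and Chinese-Mixtral itself, are built on mixture-of-experts layers. As a point of reference, here is a minimal, illustrative sketch of the top-k routing idea used in Mixtral-style MoE blocks; it is not taken from any of the listed projects.

```python
# Illustrative top-k expert routing, as used in Mixtral-style MoE layers:
# a router scores each token and only the k highest-scoring experts run.
# k=2 follows the Mixtral paper; this is a sketch, not production code.
import torch
import torch.nn.functional as F

def moe_forward(x, router_w, experts, k=2):
    """x: (tokens, d_model); router_w: (d_model, n_experts)."""
    logits = x @ router_w                         # (tokens, n_experts)
    weights, idx = torch.topk(logits, k, dim=-1)  # pick k experts per token
    weights = F.softmax(weights, dim=-1)          # normalize over chosen experts
    out = torch.zeros_like(x)
    for t in range(x.size(0)):                    # loop for clarity, not speed
        for j in range(k):
            out[t] += weights[t, j] * experts[idx[t, j]](x[t])
    return out

# Tiny usage example with 4 linear "experts".
d_model, n_experts = 8, 4
experts = [torch.nn.Linear(d_model, d_model) for _ in range(n_experts)]
x = torch.randn(3, d_model)
router_w = torch.randn(d_model, n_experts)
print(moe_forward(x, router_w, experts).shape)  # torch.Size([3, 8])
```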