Chinese-Mixtral
Develops and releases Mixtral-based large language models for natural language processing, with a focus on Chinese text generation and understanding.
Chinese Mixtral mixture-of-experts large language models (Chinese Mixtral MoE LLMs)
584 stars
15 watching
43 forks
Language: Python
last commit: 7 months ago
Topics: 32k, 64k, large-language-models, llm, mixtral, mixture-of-experts, moe, nlp
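For context, a minimal sketch of loading and prompting one of the released models with Hugging Face transformers. The model ID below is an assumption; check the repository's README for the officially published checkpoints:

```python
# Minimal sketch: loading a Chinese-Mixtral checkpoint via Hugging Face
# transformers. The model ID is an assumption, not confirmed by this page;
# use the one published in the repository's README.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "hfl/chinese-mixtral"  # assumed Hub ID

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # MoE weights are large; half precision saves memory
    device_map="auto",          # shard across available devices
)

# "Briefly explain how mixture-of-experts (MoE) models work."
prompt = "请简要介绍混合专家(MoE)模型的工作原理。"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```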
Related projects:
Repository | Description | Stars |
---|---|---|
hit-scir/chinese-mixtral-8x7b | A large language model for Chinese text processing built on a mixture-of-experts (MoE) architecture with an expanded Chinese vocabulary. | 641 |
ymcui/chinese-xlnet | Provides pre-trained models for Chinese natural language processing tasks using the XLNet architecture | 1,653 |
ymcui/macbert | Improves pre-trained Chinese language models by incorporating a correction task that reduces the discrepancy between pre-training and downstream fine-tuning | 645 |
ymcui/chinese-mobilebert | Chinese pre-trained MobileBERT models for natural language processing tasks. | 80 |
ymcui/chinese-electra | Provides pre-trained Chinese language models based on the ELECTRA framework for natural language processing tasks | 1,403 |
ymcui/lert | A pre-trained language model designed to leverage linguistic features and outperform comparable baselines on Chinese natural language understanding tasks. | 202 |
brightmart/xlnet_zh | Trains a large Chinese language model on massive data and provides a pre-trained model for downstream tasks | 230 |
ymcui/pert | Develops a pre-trained language model to learn semantic knowledge from permuted text without mask labels | 354 |
michael-wzhu/shennong-tcm-llm | Develops and deploys a large language model for Chinese traditional medicine applications | 299 |
xverse-ai/xverse-moe-a4.2b | A multilingual large language model from XVERSE Technology Inc. built on a mixture-of-experts architecture and fine-tuned for tasks such as conversation, question answering, and natural language understanding. | 36 |
xverse-ai/xverse-moe-a36b | Develops and publishes large multilingual language models with a mixture-of-experts architecture. | 36 |
yunwentechnology/unilm | Provides pre-trained models for natural language understanding and generation tasks using the UniLM architecture. | 438 |
michael-wzhu/chinese-llama2 | A Chinese adaptation of Meta's Llama 2 model for improved Chinese language support | 747 |
pleisto/yuren-baichuan-7b | A multi-modal large language model that integrates natural language and visual capabilities, fine-tuned for various tasks | 72 |
pku-yuangroup/moe-llava | Develops a neural network architecture for multi-modal learning with large vision-language models | 1,980 |
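Several of the projects above, like Chinese-Mixtral itself, are built around sparse mixture-of-experts layers. As a point of reference, a minimal sketch of Mixtral-style top-k expert routing; all names, shapes, and hyperparameters here are illustrative and not taken from any of the repositories listed:

```python
# Minimal sketch of top-k expert routing as used in Mixtral-style MoE
# layers. Names, shapes, and hyperparameters are illustrative only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    def __init__(self, hidden_size: int, num_experts: int = 8, top_k: int = 2):
        super().__init__()
        self.top_k = top_k
        # Router: scores each token against every expert
        self.gate = nn.Linear(hidden_size, num_experts, bias=False)
        # Experts: independent feed-forward networks
        self.experts = nn.ModuleList([
            nn.Sequential(
                nn.Linear(hidden_size, 4 * hidden_size),
                nn.SiLU(),
                nn.Linear(4 * hidden_size, hidden_size),
            )
            for _ in range(num_experts)
        ])

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (num_tokens, hidden_size); route each token to its top_k experts
        logits = self.gate(x)                                   # (tokens, experts)
        weights, indices = logits.topk(self.top_k, dim=-1)      # per-token expert choice
        weights = F.softmax(weights, dim=-1)                    # renormalize over chosen experts
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = indices[:, slot] == e                    # tokens routed to expert e
                if mask.any():
                    w = weights[mask, slot].unsqueeze(-1)
                    out[mask] += w * expert(x[mask])
        return out
```

Only `top_k` experts run per token, which is how these models keep inference cost far below that of a dense model with the same total parameter count.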