CuMo
Mixture-of-experts model
A method for scaling multimodal large language models by co-upcycling pre-trained dense blocks into sparse mixture-of-experts blocks and fine-tuning the experts jointly
CuMo: Scaling Multimodal LLM with Co-Upcycled Mixture-of-Experts
134 stars
2 watching
10 forks
Language: Python
last commit: 6 months ago
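The co-upcycling idea in the description above can be illustrated with a short sketch: each expert of a sparse Top-K mixture-of-experts block is initialized as a copy of a pre-trained dense MLP, and a learned router then dispatches each token to its top-scoring experts during joint fine-tuning. Below is a minimal PyTorch sketch of that pattern, not code from this repository; the class `UpcycledMoE` and all hyperparameters are illustrative assumptions.

```python
# Minimal sketch of upcycling a dense MLP into a sparse Top-K MoE block.
# Illustrative only: `UpcycledMoE` and its parameters are assumptions,
# not the CuMo repository's actual API.
import copy

import torch
import torch.nn as nn
import torch.nn.functional as F


class UpcycledMoE(nn.Module):
    """Sparse MoE block whose experts start as copies of one pre-trained dense MLP."""

    def __init__(self, dense_mlp: nn.Module, hidden_dim: int,
                 num_experts: int = 4, top_k: int = 2):
        super().__init__()
        # Upcycling: every expert is initialized from the dense block's weights.
        self.experts = nn.ModuleList(copy.deepcopy(dense_mlp) for _ in range(num_experts))
        self.router = nn.Linear(hidden_dim, num_experts, bias=False)
        self.top_k = top_k

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (num_tokens, hidden_dim)
        logits = self.router(x)                         # (num_tokens, num_experts)
        weights, idx = logits.topk(self.top_k, dim=-1)  # pick each token's top-k experts
        weights = F.softmax(weights, dim=-1)            # normalize the selected weights
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, k] == e                   # tokens whose k-th choice is expert e
                if mask.any():
                    w = weights[mask, k].unsqueeze(-1)
                    out[mask] += w * expert(x[mask])
        return out


# Replace a dense MLP with its upcycled MoE version, then fine-tune jointly.
dense = nn.Sequential(nn.Linear(1024, 4096), nn.GELU(), nn.Linear(4096, 1024))
moe = UpcycledMoE(dense, hidden_dim=1024, num_experts=4, top_k=2)
tokens = torch.randn(16, 1024)
print(moe(tokens).shape)  # torch.Size([16, 1024])
```

Copying the dense weights means the upcycled block initially behaves like the pre-trained model, which gives the joint fine-tuning of the experts a stable starting point.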
Related projects:

| Repository | Description | Stars |
| --- | --- | --- |
| pku-yuangroup/moe-llava | A mixture-of-experts architecture for large vision-language models | 1,980 |
| antoine77340/mixture-of-embedding-experts | An open-source PyTorch implementation of the Mixture-of-Embedding-Experts model for video-text retrieval | 118 |
| xverse-ai/xverse-moe-a4.2b | A multilingual large language model from XVERSE Technology Inc., built on a mixture-of-experts architecture and fine-tuned for tasks such as conversation, question answering, and natural language understanding | 36 |
| haozhezhao/mic | A multimodal vision-language model for understanding the relationships between instructions and images across a range of tasks | 334 |
| pleisto/yuren-baichuan-7b | A multimodal large language model that combines language and vision capabilities, fine-tuned for a variety of tasks | 72 |
| deepseek-ai/deepseek-moe | A mixture-of-experts large language model aimed at improving efficiency over comparable dense models | 1,006 |
| yfzhang114/slime | Large multimodal models for high-resolution understanding of images and text | 137 |
| ieit-yuan/yuan2.0-m32 | A mixture-of-experts language model designed for natural language understanding, mathematical computation, and code generation | 180 |
| yuweihao/mm-vet | Evaluates the capabilities of large multimodal models on a diverse set of tasks and metrics | 267 |
| mbzuai-nlp/bactrian-x | A collection of multilingual language models trained on instruction-response pairs in various languages | 94 |
| shi-labs/vcoder | An adapter that improves large language models on object-level perception tasks using auxiliary perception modalities | 261 |
| jshilong/gpt4roi | Trains and deploys large language models on computer vision tasks using region-of-interest inputs | 506 |
| felixgithub2017/mmcu | Evaluates the semantic understanding capabilities of large Chinese language models with a multitask dataset | 87 |
| shizhediao/davinci | An implementation of generative vision-language models that can be fine-tuned for a variety of multimodal learning tasks | 43 |
| damo-nlp-sg/m3exam | A benchmark for evaluating large language models across multiple languages, modalities, and levels | 92 |