CuMo

Mixture-of-experts model

A method for scaling multimodal large language models by upcycling pre-trained dense blocks into sparse mixture-of-experts blocks and fine-tuning the experts together

CuMo: Scaling Multimodal LLM with Co-Upcycled Mixture-of-Experts
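To make the "upcycled mixture-of-experts" idea in the title concrete, here is a minimal sketch in plain Python. It is not code from the CuMo repository: the names `dense_mlp`, `upcycle`, and `moe_forward` are hypothetical, the "MLP" is a toy single ReLU layer, and the top-k routing with a softmax over the selected router logits is an assumption about how such a block is commonly built. The sketch only illustrates the general pattern of copying one dense block into several experts and mixing the top-k experts per input.

```python
import math
import random


def dense_mlp(x, w):
    """Toy one-layer 'MLP': y = relu(W x). Stands in for a pre-trained dense block."""
    return [max(0.0, sum(wi * xi for wi, xi in zip(row, x))) for row in w]


def upcycle(w, num_experts):
    """Create experts as independent copies of the dense weights (assumed upcycling step)."""
    return [[row[:] for row in w] for _ in range(num_experts)]


def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def moe_forward(x, experts, router_w, top_k=2):
    """Route x to the top_k experts and return the gate-weighted sum of their outputs."""
    # One router logit per expert (a linear layer without bias).
    logits = [sum(wi * xi for wi, xi in zip(row, x)) for row in router_w]
    # Indices of the top_k largest logits.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:top_k]
    # Normalise only over the selected experts so the gates sum to 1.
    gates = softmax([logits[i] for i in top])
    out = [0.0] * len(experts[0])
    for g, i in zip(gates, top):
        y = dense_mlp(x, experts[i])
        out = [o + g * v for o, v in zip(out, y)]
    return out, top


# Usage: a 3-in, 4-out dense block upcycled into 4 experts.
rng = random.Random(0)
w = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(4)]
router = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(4)]
experts = upcycle(w, num_experts=4)
x = [0.5, -1.0, 2.0]
moe_out, chosen = moe_forward(x, experts, router, top_k=2)
dense_out = dense_mlp(x, w)
```

Because every expert starts as a copy of the same dense weights and the gates sum to 1, the mixture initially reproduces the dense block's output; the experts only diverge once they are fine-tuned.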

GitHub

Stars: 134
Watching: 2
Forks: 10
Language: Python
Last commit: 6 months ago

Related projects:

| Repository | Description | Stars |
| --- | --- | --- |
| pku-yuangroup/moe-llava | Develops a neural network architecture for multi-modal learning with large vision-language models | 1,980 |
| antoine77340/mixture-of-embedding-experts | An open-source implementation of the Mixture-of-Embeddings-Experts model in PyTorch for video-text retrieval tasks | 118 |
| xverse-ai/xverse-moe-a4.2b | Developed by XVERSE Technology Inc. as a multilingual large language model with a unique mixture-of-experts architecture and fine-tuned for various tasks such as conversation, question answering, and natural language understanding | 36 |
| haozhezhao/mic | Develops a multimodal vision-language model to enable machines to understand complex relationships between instructions and images in various tasks | 334 |
| pleisto/yuren-baichuan-7b | A multi-modal large language model that integrates natural language and visual capabilities with fine-tuning for various tasks | 72 |
| deepseek-ai/deepseek-moe | A large language model with improved efficiency and performance compared to similar models | 1,006 |
| yfzhang114/slime | Develops large multimodal models for high-resolution understanding and analysis of text, images, and other data types | 137 |
| ieit-yuan/yuan2.0-m32 | A high-performance language model designed to excel in tasks like natural language understanding, mathematical computation, and code generation | 180 |
| yuweihao/mm-vet | Evaluates the capabilities of large multimodal models using a set of diverse tasks and metrics | 267 |
| mbzuai-nlp/bactrian-x | A collection of multilingual language models trained on a dataset of instructions and responses in various languages | 94 |
| shi-labs/vcoder | An adapter for improving large language models at object-level perception tasks with auxiliary perception modalities | 261 |
| jshilong/gpt4roi | Training and deploying large language models on computer vision tasks using region-of-interest inputs | 506 |
| felixgithub2017/mmcu | Evaluates the semantic understanding capabilities of large Chinese language models using a multimodal dataset | 87 |
| shizhediao/davinci | An implementation of vision-language models for multimodal learning tasks, enabling generative vision-language models to be fine-tuned for various applications | 43 |
| damo-nlp-sg/m3exam | A benchmark for evaluating large language models in multiple languages and formats | 92 |