CuMo
Mixture-of-experts model
A method for scaling multimodal large language models by combining multiple experts and fine-tuning them together
CuMo: Scaling Multimodal LLM with Co-Upcycled Mixture-of-Experts
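The "co-upcycled" idea — initializing each expert of a sparse MoE block from a pretrained dense MLP and then fine-tuning — can be illustrated with a minimal NumPy sketch. This is not CuMo's actual implementation (the paper upcycles MLP blocks inside the vision encoder and connector of a full multimodal model); all names and dimensions here are illustrative, assuming a plain 2-layer MLP and Top-K softmax routing:

```python
import numpy as np

rng = np.random.default_rng(0)

def dense_mlp(W1, W2, x):
    # Simple 2-layer MLP with a ReLU (stand-in for the pretrained block)
    return np.maximum(x @ W1, 0.0) @ W2

# Pretrained dense MLP weights (stand-ins for a trained checkpoint)
d_model, d_hidden = 8, 16
W1 = rng.normal(size=(d_model, d_hidden))
W2 = rng.normal(size=(d_hidden, d_model))

# "Upcycling": initialize every expert as a copy of the dense MLP,
# and add a freshly initialized router on top
n_experts, top_k = 4, 2
experts = [(W1.copy(), W2.copy()) for _ in range(n_experts)]
router = rng.normal(scale=0.02, size=(d_model, n_experts))

def moe_forward(x):
    """Route each token to its top-k experts; weight outputs by softmax gates."""
    logits = x @ router                        # (tokens, n_experts)
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        top = np.argsort(logits[t])[-top_k:]   # indices of the top-k experts
        gates = np.exp(logits[t][top])
        gates /= gates.sum()                   # renormalized softmax over top-k
        for g, e in zip(gates, top):
            w1, w2 = experts[e]
            out[t] += g * dense_mlp(w1, w2, x[t])
    return out

x = rng.normal(size=(3, d_model))              # 3 tokens
y = moe_forward(x)
```

Right after upcycling, every expert is identical to the dense MLP and the gates sum to 1, so the MoE output matches the dense output exactly; the experts only diverge once fine-tuning updates them separately. That property is what makes upcycling a stable starting point compared to training experts from scratch.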
136 stars
2 watching
10 forks
Language: Python
last commit: 9 months ago

Related projects:
| Repository | Description | Stars |
|---|---|---|
| | A large vision-language model using a mixture-of-experts architecture to improve performance on multimodal learning tasks | 2,023 |
| | An open-source PyTorch implementation of the Mixture-of-Embeddings-Experts model for video-text retrieval tasks | 118 |
| | A multilingual large language model from XVERSE Technology Inc. with a mixture-of-experts architecture, fine-tuned for tasks such as conversation, question answering, and natural language understanding | 36 |
| | A multimodal vision-language model that enables machines to understand relationships between instructions and images across various tasks | 337 |
| | A multimodal large language model that integrates natural language and visual capabilities, with fine-tuning for various tasks | 73 |
| | A large language model with improved efficiency and performance compared to similar models | 1,024 |
| | Large multimodal models for high-resolution understanding and analysis of text, images, and other data types | 143 |
| | A high-performance language model designed to excel in tasks like natural language understanding, mathematical computation, and code generation | 182 |
| | A benchmark that evaluates the capabilities of large multimodal models across a diverse set of tasks and metrics | 274 |
| | A collection of multilingual language models trained on instructions and responses in various languages | 94 |
| | An adapter for improving large language models at object-level perception tasks with auxiliary perception modalities | 266 |
| | Training and deploying large language models on computer vision tasks using region-of-interest inputs | 517 |
| | Measures large language models' understanding of massive multitask Chinese datasets | 87 |
| | A unified modal learning framework for generative vision-language models | 43 |
| | A benchmark for evaluating large language models in multiple languages and formats | 93 |