MoE-LLaVA

Mixture-of-Experts Model

A large vision-language model using a mixture-of-experts architecture to improve performance on multi-modal learning tasks

Mixture-of-Experts for Large Vision-Language Models
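
The core idea behind a mixture-of-experts layer is that each token is routed by a learned gate to a small subset of expert feed-forward networks, so capacity grows without every token touching every parameter. Below is a minimal PyTorch sketch of such a sparse MoE layer; the class name, dimensions, and top-2 routing are illustrative assumptions and do not reproduce MoE-LLaVA's actual implementation.

```python
# Minimal sketch of a sparse mixture-of-experts feed-forward layer.
# Names and sizes are illustrative, not taken from the MoE-LLaVA codebase.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoE(nn.Module):
    def __init__(self, dim=512, num_experts=4, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(dim, num_experts)  # learned gating network
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
            for _ in range(num_experts)
        ])

    def forward(self, x):                              # x: (tokens, dim)
        logits = self.router(x)                        # (tokens, num_experts)
        weights, idx = logits.topk(self.top_k, dim=-1) # route each token to its top-k experts
        weights = F.softmax(weights, dim=-1)           # normalize gate weights over the chosen experts
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, k] == e                  # tokens whose k-th choice is expert e
                if mask.any():
                    out[mask] += weights[mask][:, k].unsqueeze(-1) * expert(x[mask])
        return out

tokens = torch.randn(8, 512)
print(SparseMoE()(tokens).shape)                       # torch.Size([8, 512])
```

In this sketch only the selected experts run on a given token, which is what keeps the per-token compute roughly constant as the number of experts grows.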

GitHub

2k stars
24 watching
126 forks
Language: Python
Last commit: about 2 months ago
Topics: large-vision-language-model, mixture-of-experts, moe, multi-modal

Related projects:

| Repository | Description | Stars |
| --- | --- | --- |
| xverse-ai/xverse-moe-a4.2b | A multilingual large language model developed by XVERSE Technology Inc. with a mixture-of-experts architecture, fine-tuned for tasks such as conversation, question answering, and natural language understanding | 36 |
| shi-labs/cumo | A method for scaling multimodal large language models by combining multiple experts and fine-tuning them together | 136 |
| byungkwanlee/moai | Improves performance on vision-language tasks by integrating computer vision capabilities into large language models | 314 |
| skyworkai/skywork-moe | A high-performance mixture-of-experts model with innovative training techniques for language processing tasks | 126 |
| yfzhang114/llava-align | Debiasing techniques to minimize hallucinations in large visual language models | 75 |
| xverse-ai/xverse-moe-a36b | Develops and publishes large multilingual language models with an advanced mixture-of-experts architecture | 37 |
| pku-yuangroup/languagebind | Extends pretrained models to handle multiple modalities by aligning language and video representations | 751 |
| jshilong/gpt4roi | Training and deploying large language models on computer vision tasks using region-of-interest inputs | 517 |
| deepseek-ai/deepseek-moe | A large language model with improved efficiency and performance compared to similar models | 1,024 |
| pleisto/yuren-baichuan-7b | A multi-modal large language model that integrates natural language and visual capabilities, fine-tuned for various tasks | 73 |
| ymcui/chinese-mixtral | Develops and releases Mixtral-based models for natural language processing tasks with a focus on Chinese text generation and understanding | 589 |
| alibaba/conv-llava | An optimization technique for large-scale image models that reduces computational requirements while maintaining performance | 106 |
| gordonhu608/mqt-llava | A vision-language model that uses a query transformer to encode images as visual tokens, allowing a flexible choice of the number of visual tokens | 101 |
| ieit-yuan/yuan2.0-m32 | A high-performance language model designed to excel at tasks like natural language understanding, mathematical computation, and code generation | 182 |
| llava-vl/llava-plus-codebase | A platform for training and deploying large language and vision models that can use tools to perform tasks | 717 |