LLaVA-MoD
MLLM Distiller
This project provides a framework for training small-scale Multimodal Large Language Models (MLLMs) by distilling knowledge from a larger teacher model into a compact student built on a sparse Mixture-of-Experts (MoE) architecture.
Making LLaVA Tiny via MoE-Knowledge Distillation
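The description above combines two ideas: a sparse MoE student and knowledge distillation from a larger teacher. The sketch below is only a rough illustration of that combination, not the repository's actual code; names such as `SparseMoE` and `distillation_loss` are hypothetical, and the toy models stand in for the real multimodal teacher and student.

```python
# Illustrative sketch (hypothetical names; not the LLaVA-MoD implementation):
# a top-1-routed sparse MoE block as the student, trained with a
# temperature-scaled KL distillation loss against a frozen teacher.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SparseMoE(nn.Module):
    """Sparse Mixture-of-Experts FFN: each token is routed to its top-1 expert."""

    def __init__(self, dim: int, num_experts: int = 4):
        super().__init__()
        self.router = nn.Linear(dim, num_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
            for _ in range(num_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq, dim). Router scores pick one expert per token.
        gate = F.softmax(self.router(x), dim=-1)      # (B, S, num_experts)
        top_w, top_idx = gate.max(dim=-1)             # (B, S)
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            mask = top_idx == e
            if mask.any():
                out[mask] = expert(x[mask]) * top_w[mask].unsqueeze(-1)
        return out


def distillation_loss(student_logits, teacher_logits, temperature: float = 2.0):
    """KL divergence between softened teacher and student distributions."""
    s = F.log_softmax(student_logits / temperature, dim=-1).flatten(0, -2)
    t = F.softmax(teacher_logits / temperature, dim=-1).flatten(0, -2)
    return F.kl_div(s, t, reduction="batchmean") * temperature ** 2


if __name__ == "__main__":
    # Toy example: distill a plain "teacher" head into a small MoE "student".
    vocab, dim = 100, 32
    student = nn.Sequential(SparseMoE(dim), nn.Linear(dim, vocab))
    teacher = nn.Linear(dim, vocab)
    x = torch.randn(2, 8, dim)
    with torch.no_grad():
        teacher_logits = teacher(x)
    loss = distillation_loss(student(x), teacher_logits)
    loss.backward()
    print(f"distillation loss: {loss.item():.4f}")
```

In this sketch the distillation signal comes only from the teacher's softened output distribution; the actual project additionally deals with multimodal inputs and a staged training recipe, which are omitted here for brevity.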
69 stars
9 watching
4 forks
Language: Python
Last commit: 3 months ago