Parrot

Visual Instruction Toolkit

A method and toolkit for fine-tuning large language models to perform visual instruction tasks in multiple languages.

🎉 The PyTorch code repository for "Parrot: Multilingual Visual Instruction Tuning".

GitHub

34 stars

4 watching

1 fork

Language: Python

Last commit: almost 2 years ago

mixture-of-experts, multilingual, multimodal-large-language-models, vision-language-model

arxiv.org/abs/2406.02539

Related projects:

Repository	Description	Stars
aidc-ai/ovis	An MLLM architecture designed to align visual and textual embeddings through structural alignment	575
kaiyangzhou/dassl.pytorch	A PyTorch toolbox for supporting research and development of domain adaptation, generalization, and semi-supervised learning methods in computer vision	1,236
pvit-official/pvit	A project that extends large language models by integrating an additional region-level vision encoder to improve visual instruction tuning.	37
vhellendoorn/code-lms	A guide to using pre-trained large language models in source code analysis and generation	1,789
salt-nlp/llavar	An open-source project that enhances visual instruction tuning for text-rich image understanding by integrating GPT-4 models with multimodal datasets.	259
deepseek-ai/deepseek-vl	A multimodal AI model that enables real-world vision-language understanding applications	2,145
embodiedgpt/embodiedgpt_pytorch	A PyTorch-based toolkit for creating customized multimedia datasets and handling heterogeneous data for training AI models.	346
nvlabs/prismer	A deep learning framework for training multi-modal models with vision and language capabilities.	1,299
liaoning97/revo-lion	A comprehensive dataset and evaluation framework for Vision-Language Instruction Tuning models	11
ailab-cvc/seed	An implementation of a multimodal language model with capabilities for comprehension and generation	585
opendatalab/vigc	Autonomously generates high-quality image-text instruction fine-tuning datasets	91
rucaibox/comvint	Creating synthetic visual reasoning instructions to improve the performance of large language models on image-related tasks	18
zhanghang1989/pytorch-encoding	A Python framework for building deep learning models with optimized encoding layers and batch normalization.	2,044
misaogura/flashtorch	Toolkit for visualizing neural network behavior in PyTorch	737
baai-dcai/visual-instruction-tuning	A dataset and model designed to scale visual instruction tuning using language-only GPT-4 models.	164