Parrot

Visual Instruction Toolkit

A method and toolkit for fine-tuning large language models to perform visual instruction tasks in multiple languages.

🎉 The PyTorch code repository for "Parrot: Multilingual Visual Instruction Tuning".

GitHub

30 stars
4 watching
1 fork
Language: Python
Last commit: 3 months ago
Topics: mixture-of-experts, multilingual, multimodal-large-language-models, vision-language-model

Related projects:

Repository | Description | Stars
aidc-ai/ovis | An architecture designed to align visual and textual embeddings in multimodal learning | 517
kaiyangzhou/dassl.pytorch | A PyTorch toolbox for supporting research and development of domain adaptation, generalization, and semi-supervised learning methods in computer vision | 1,217
pvit-official/pvit | A project that extends large language models by integrating an additional region-level vision encoder to improve visual instruction tuning | 36
vhellendoorn/code-lms | A guide to using pre-trained large language models in source code analysis and generation | 1,782
salt-nlp/llavar | An open-source project that enhances visual instruction tuning for text-rich image understanding by integrating GPT-4 models with multimodal datasets | 258
deepseek-ai/deepseek-vl | A multimodal AI model that enables real-world vision-language understanding applications | 2,077
embodiedgpt/embodiedgpt_pytorch | A PyTorch-based toolkit for creating customized multimedia datasets and handling heterogeneous data for training AI models | 340
nvlabs/prismer | A deep learning framework for training multi-modal models with vision and language capabilities | 1,298
liaoning97/revo-lion | A comprehensive dataset and evaluation framework for Vision-Language Instruction Tuning models | 11
ailab-cvc/seed | An implementation of a multimodal language model with capabilities for comprehension and generation | 576
opendatalab/vigc | Autonomously generates high-quality image-text instruction fine-tuning datasets | 90
rucaibox/comvint | Creating synthetic visual reasoning instructions to improve the performance of large language models on image-related tasks | 18
zhanghang1989/pytorch-encoding | A Python framework for building deep learning models with optimized encoding layers and batch normalization | 2,041
misaogura/flashtorch | Toolkit for visualizing neural network behavior in PyTorch | 734
baai-dcai/visual-instruction-tuning | A dataset and model designed to scale visual instruction tuning using language-only GPT-4 models | 163