Parrot
Visual Instruction Toolkit
A method and toolkit for fine-tuning large language models to perform visual instruction tasks in multiple languages.
🎉 The PyTorch code repository for "Parrot: Multilingual Visual Instruction Tuning".
30 stars
4 watching
1 fork
Language: Python
Last commit: 3 months ago
Topics: mixture-of-experts, multilingual, multimodal-large-language-models, vision-language-model
Related projects:
Repository | Description | Stars |
---|---|---|
aidc-ai/ovis | An architecture designed to align visual and textual embeddings in multimodal learning | 517 |
kaiyangzhou/dassl.pytorch | A PyTorch toolbox for supporting research and development of domain adaptation, generalization, and semi-supervised learning methods in computer vision. | 1,217 |
pvit-official/pvit | A project that extends large language models by integrating an additional region-level vision encoder to improve visual instruction tuning. | 36 |
vhellendoorn/code-lms | A guide to using pre-trained large language models in source code analysis and generation | 1,782 |
salt-nlp/llavar | An open-source project that enhances visual instruction tuning for text-rich image understanding by integrating GPT-4 models with multimodal datasets. | 258 |
deepseek-ai/deepseek-vl | A multimodal AI model that enables real-world vision-language understanding applications | 2,077 |
embodiedgpt/embodiedgpt_pytorch | A PyTorch-based toolkit for creating customized multimedia datasets and handling heterogeneous data for training AI models. | 340 |
nvlabs/prismer | A deep learning framework for training multi-modal models with vision and language capabilities. | 1,298 |
liaoning97/revo-lion | A comprehensive dataset and evaluation framework for Vision-Language Instruction Tuning models | 11 |
ailab-cvc/seed | An implementation of a multimodal language model with capabilities for comprehension and generation | 576 |
opendatalab/vigc | Autonomously generates high-quality image-text instruction fine-tuning datasets | 90 |
rucaibox/comvint | Creating synthetic visual reasoning instructions to improve the performance of large language models on image-related tasks | 18 |
zhanghang1989/pytorch-encoding | A Python framework for building deep learning models with optimized encoding layers and batch normalization. | 2,041 |
misaogura/flashtorch | Toolkit for visualizing neural network behavior in PyTorch | 734 |
baai-dcai/visual-instruction-tuning | A dataset and model designed to scale visual instruction tuning using language-only GPT-4 models. | 163 |