Parrot
Visual Instruction Toolkit
A method and toolkit for fine-tuning large language models to perform visual instruction tasks in multiple languages.
🎉 The code repository for "Parrot: Multilingual Visual Instruction Tuning" in PyTorch.
34 stars
4 watching
1 fork
Language: Python
Last commit: 6 months ago
Topics: mixture-of-experts, multilingual, multimodal-large-language-models, vision-language-model
Related projects:
| Repository | Description | Stars |
|---|---|---|
| | An MLLM architecture designed to align visual and textual embeddings through structural alignment | 575 |
| | A PyTorch toolbox for supporting research and development of domain adaptation, generalization, and semi-supervised learning methods in computer vision. | 1,236 |
| | A project that extends large language models by integrating an additional region-level vision encoder to improve visual instruction tuning. | 37 |
| | A guide to using pre-trained large language models in source code analysis and generation | 1,789 |
| | An open-source project that enhances visual instruction tuning for text-rich image understanding by integrating GPT-4 models with multimodal datasets. | 259 |
| | A multimodal AI model that enables real-world vision-language understanding applications | 2,145 |
| | A PyTorch-based toolkit for creating customized multimedia datasets and handling heterogeneous data for training AI models. | 346 |
| | A deep learning framework for training multi-modal models with vision and language capabilities. | 1,299 |
| | A comprehensive dataset and evaluation framework for Vision-Language Instruction Tuning models | 11 |
| | An implementation of a multimodal language model with capabilities for comprehension and generation | 585 |
| | Autonomously generates high-quality image-text instruction fine-tuning datasets | 91 |
| | Creating synthetic visual reasoning instructions to improve the performance of large language models on image-related tasks | 18 |
| | A Python framework for building deep learning models with optimized encoding layers and batch normalization. | 2,044 |
| | Toolkit for visualizing neural network behavior in PyTorch | 737 |
| | A dataset and model designed to scale visual instruction tuning using language-only GPT-4 models. | 164 |