Visual-Instruction-Tuning

A dataset and model designed to scale visual instruction tuning using language-only GPT-4 models.

SVIT: Scaling up Visual Instruction Tuning

GitHub

163 stars
5 watching
4 forks
Language: Python
Last commit: 5 months ago

Related projects:

| Repository | Description | Stars |
| --- | --- | --- |
| pvit-official/pvit | A project that extends large language models by integrating an additional region-level vision encoder to improve visual instruction tuning. | 36 |
| icoz69/stablellava | A tool for generating and evaluating multimodal large language models with visual instruction tuning capabilities. | 91 |
| salt-nlp/llavar | An open-source project that enhances visual instruction tuning for text-rich image understanding by integrating GPT-4 models with multimodal datasets. | 258 |
| rucaibox/comvint | Creates synthetic visual reasoning instructions to improve the performance of large language models on image-related tasks. | 18 |
| openai/lm-human-preferences | Training methods and tools for fine-tuning language models using human preferences. | 1,229 |
| aidc-ai/parrot | A method and toolkit for fine-tuning large language models to perform visual instruction tasks in multiple languages. | 30 |
| opendatalab/vigc | Autonomously generates high-quality image-text instruction fine-tuning datasets. | 90 |
| jshilong/gpt4roi | Training and deploying large language models on computer vision tasks using region-of-interest inputs. | 506 |
| vt-nlp/multiinstruct | A multimodal benchmark dataset designed to evaluate the performance of vision-language foundation models through instruction tuning. | 133 |
| ys-zong/vlguard | Improves the safety and helpfulness of large language models by fine-tuning them on safety-critical tasks. | 45 |
| pevma/septun-mark-ii | A tuning guide for optimizing the performance of a network intrusion prevention system. | 113 |
| pevma/septun | A guide to tuning Suricata for maximum performance in network intrusion detection systems. | 204 |
| circleradon/osprey | A new approach to fine-grained visual understanding using pixel-wise mask regions in language instructions. | 770 |
| microsoft/archai | Automates the search for optimal neural network configurations in deep learning applications. | 467 |
| liaoning97/revo-lion | A comprehensive dataset and evaluation framework for vision-language instruction tuning models. | 11 |