StableLLAVA
Visual Instruction Tuning Tool
A tool for generating and evaluating multimodal large language models with visual instruction tuning capabilities
Official repo for StableLLAVA
93 stars
4 watching
10 forks
Language: Python
Last commit: about 1 year ago
Related projects:
Repository | Description | Stars |
---|---|---|
salt-nlp/llavar | An open-source project that enhances visual instruction tuning for text-rich image understanding by integrating GPT-4 models with multimodal datasets. | 259 |
baai-dcai/visual-instruction-tuning | A dataset and model designed to scale visual instruction tuning using language-only GPT-4 models. | 164 |
ys-zong/vlguard | Improves safety and helpfulness of large language models by fine-tuning them using safety-critical tasks | 47 |
aidc-ai/parrot | A method and toolkit for fine-tuning large language models to perform visual instruction tasks in multiple languages. | 34 |
roboflow/maestro | A tool to streamline fine-tuning of multimodal models for vision-language tasks | 1,415 |
aidc-ai/ovis | An MLLM architecture designed to align visual and textual embeddings through structural alignment | 575 |
rucaibox/comvint | Creates synthetic visual reasoning instructions to improve the performance of large language models on image-related tasks. | 18 |
dvlab-research/llama-vid | An image-based language model that uses large language models to generate visual and text features from videos | 748 |
alibaba/conv-llava | An optimization technique for large-scale image models that reduces computational requirements while maintaining performance. | 106 |
openai/lm-human-preferences | Training methods and tools for fine-tuning language models using human preferences | 1,240 |
pevma/septun-mark-ii | A tuning guide for optimizing the performance of a network intrusion prevention system | 114 |
pevma/septun | A guide to tuning Suricata for maximum performance in network intrusion detection systems | 204 |
spandan-madan/pytorch_fine_tuning_tutorial | Provides guidance on fine-tuning pre-trained models for image classification tasks using PyTorch. | 279 |
dvlab-research/prompt-highlighter | An interactive control system for text generation in multi-modal language models | 135 |
rlhf-v/rlhf-v | Aligns large language models' behavior through fine-grained correctional human feedback to improve trustworthiness and accuracy. | 245 |