LVIS-INSTRUCT4V

Visual Instructions

A dataset of fine-grained visual instructions generated by prompting GPT-4V with images from the LVIS dataset.

GitHub

131 stars
3 watching
0 forks
last commit: 11 months ago

Related projects:

Repository | Description | Stars
xuefuzhao/instructionwild | Creating a large-scale user-based instruction dataset for natural language processing research and development | 453
flagopen/flaginstruct | A collection of diverse instruction corpora for improving the development and tuning of Chinese Language Models | 173
vt-nlp/multiinstruct | A multimodal benchmark dataset designed to evaluate the performance of vision-language foundation models through instruction tuning | 133
orhonovich/unnatural-instructions | A collection of automatically generated instructions for training language models | 175
jy0205/lavit | A unified framework for training large language models to understand and generate visual content | 528
rucaibox/comvint | Creating synthetic visual reasoning instructions to improve the performance of large language models on image-related tasks | 18
salt-nlp/llavar | An open-source project that enhances visual instruction tuning for text-rich image understanding by integrating GPT-4 models with multimodal datasets | 258
ffxsam/vue-typescript-cookbook | A cookbook and resource guide for developers learning Vue.js with TypeScript | 273
pvit-official/pvit | A project that extends large language models by integrating an additional region-level vision encoder to improve visual instruction tuning | 36
ncsoft/cap2qa | A dataset and implementation of a method to generate instructions based on visual data | 5
aidc-ai/parrot | A method and toolkit for fine-tuning large language models to perform visual instruction tasks in multiple languages | 30
vefstathiou/so_word2vec | This is a word embedding model trained on Stack Overflow posts for use in natural language processing tasks | 40
alexcode/vue2vis | A Vue.js wrapper for the popular visualization library Visjs, allowing developers to easily integrate 2D and 3D graphing capabilities into their web applications | 217
freedomintelligence/allava | A collection of datasets and models designed to support the training of lite vision-language models | 246
vlf-silkie/vlfeedback | An annotated preference dataset and training framework for improving large vision language models | 85