# VQA-Transfer-ExternalData

Tools and scripts for training and evaluating a visual question answering (VQA) model using transfer learning from an external data source.

Paper: *Transfer Learning via Unsupervised Task Discovery for Visual Question Answering*
- 20 stars
- 3 watching
- 2 forks
- Language: Python
- Last commit: over 5 years ago

## Related projects
| Repository | Description | Stars |
|---|---|---|
| markdtw/vqa-winner-cvprw-2017 | Implementations and tools for training and fine-tuning a visual question answering model based on the 2017 CVPR workshop winner's approach. | 164 |
| cadene/vqa.pytorch | A PyTorch implementation of visual question answering with multimodal representation learning. | 716 |
| hengyuan-hu/bottom-up-attention-vqa | An implementation of a VQA system using bottom-up attention, aiming to improve the efficiency and speed of visual question answering tasks. | 754 |
| jayleicn/tvqa | A PyTorch implementation of a video question answering system based on the TVQA dataset. | 172 |
| akirafukui/vqa-mcb | A software framework for training and deploying multimodal visual question answering models using compact bilinear pooling. | 222 |
| guoyang9/unk-vqa | A VQA dataset with unanswerable questions, designed to test the limits of large models' knowledge and reasoning abilities. | 2 |
| henryjunw/tag | A Python-based system for generating visual question-answer pairs using text-aware approaches to improve Text-VQA performance. | 21 |
| gt-vision-lab/vqa_lstm_cnn | A visual question answering model using a deeper LSTM and a normalized CNN architecture. | 376 |
| jiasenlu/hiecoattenvqa | A framework for training hierarchical co-attention models for visual question answering using preprocessed data and a specific image model. | 349 |
| jnhwkim/nips-mrn-vqa | A neural network model that answers visual questions by combining question and image features in a residual learning framework. | 39 |
| hitvoice/drqa | A PyTorch implementation of reading comprehension for open-domain question answering over Wikipedia, trained on the SQuAD dataset. | 401 |
| milvlg/prophet | An implementation of a two-stage framework that prompts large language models with answer heuristics for knowledge-based visual question answering tasks. | 267 |
| vpgtrans/vpgtrans | Transfers visual prompt generators across large language models to reduce training costs and enable customization of multimodal LLMs. | 269 |
| xiaoman-zhang/pmc-vqa | A medical visual question answering dataset and toolkit for training models to understand medical images and instructions. | 174 |
| vishaal27/sus-x | A method for adapting large-scale vision-language models with minimal resources and no fine-tuning required. | 94 |