imageqa-san

Image QA model

This project provides code for training image question answering models using stacked attention networks and convolutional neural networks.

Code for "Stacked Attention Networks for Image Question Answering".

107 stars
8 watching
52 forks
Language: Python
Last commit: almost 8 years ago

Related projects:

Repository | Description | Stars
jnhwkim/nips-mrn-vqa | This project presents a neural network model designed to answer visual questions by combining question and image features in a residual learning framework. | 39
zhengpeng7/birefnet | An implementation of a deep learning-based image segmentation model for high-resolution images. | 1,319
xiaoman-zhang/pmc-vqa | A medical visual question-answering dataset and toolkit for training models to understand medical images and instructions. | 174
gt-vision-lab/vqa_lstm_cnn | A Visual Question Answering model using a deeper LSTM and normalized CNN architecture. | 376
zhegan27/semantic_compositional_nets | A deep learning framework providing a model architecture and training code for image captioning using semantic compositional networks. | 70
hszhao/pspnet | A PyTorch implementation of a deep learning model for semantic image segmentation. | 1,593
tencentarc-qq/qa-clip | Provides Chinese language models with high performance for image-text retrieval and classification tasks. | 48
jiasenlu/hiecoattenvqa | A framework for training Hierarchical Co-Attention models for Visual Question Answering using preprocessed data and a specific image model. | 349
juntang-zhuang/laddernet | A deep learning implementation of a multi-path network architecture for medical image segmentation. | 139
visionlearninggroup/ask_attend_and_answer | Develops a deep learning model to answer questions about visual scenes based on spatial attention and question guidance. | 25
cadene/vqa.pytorch | A PyTorch implementation of visual question answering with multimodal representation learning. | 716
masaiahhan/correlationqa | An investigation into the relationship between misleading images and hallucinations in large language models. | 8
zsdonghao/text-to-image | A TensorFlow implementation of generating images from text descriptions using a Generative Adversarial Network (GAN) architecture. | 599
ocampor/image-quality | Library providing a set of tools and algorithms for evaluating the quality of digital images. | 401
allenai/document-qa | Tools and codebase for training neural question answering models on multiple paragraphs of text data. | 434