imageqa-san

Image QA model

This project provides code for training image question answering models using stacked attention networks and convolutional neural networks.

Code for "Stacked Attention Networks for Image Question Answering"

108 stars

8 watching

52 forks

Language: Python

Last commit: over 8 years ago

Related projects:

Repository	Description	Stars
jnhwkim/nips-mrn-vqa	This project presents a neural network model designed to answer visual questions by combining question and image features in a residual learning framework.	39
zhengpeng7/birefnet	An open-source implementation of an image segmentation model that combines background removal and object detection capabilities.	1,484
xiaoman-zhang/pmc-vqa	A medical visual question-answering dataset and toolkit for training models to understand medical images and instructions.	180
gt-vision-lab/vqa_lstm_cnn	A Visual Question Answering model using a deeper LSTM and normalized CNN architecture.	377
zhegan27/semantic_compositional_nets	A deep learning framework providing a model architecture and training code for image captioning using semantic compositional networks.	70
hszhao/pspnet	An implementation of the Pyramid Scene Parsing Network (PSPNet), a deep learning model for semantic image segmentation.	1,598
tencentarc-qq/qa-clip	Provides Chinese language models with high performance for image-text retrieval and classification tasks.	51
jiasenlu/hiecoattenvqa	A framework for training Hierarchical Co-Attention models for Visual Question Answering using preprocessed data and a specific image model.	349
juntang-zhuang/laddernet	A deep learning implementation of a multi-path network architecture for medical image segmentation.	140
visionlearninggroup/ask_attend_and_answer	Develops a deep learning model to answer questions about visual scenes based on spatial attention and question guidance.	25
cadene/vqa.pytorch	A PyTorch implementation of visual question answering with multimodal representation learning.	718
masaiahhan/correlationqa	An investigation into the relationship between misleading images and hallucinations in large language models.	8
zsdonghao/text-to-image	A TensorFlow implementation of generating images from text descriptions using a Generative Adversarial Network (GAN) architecture.	602
ocampor/image-quality	Library providing a set of tools and algorithms for evaluating the quality of digital images.	402
allenai/document-qa	Tools and codebase for training neural question answering models on multiple paragraphs of text data.	435