Ask_Attend_and_Answer

Visual QA Model

A deep learning model that answers questions about visual scenes using question-guided spatial attention.

Ask, Attend and Answer: Exploring Question-Guided Spatial Attention for Visual Question Answering
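
As a rough illustration of what "question-guided spatial attention" means, the sketch below scores each image region against a question vector, normalizes the scores with a softmax, and returns the attention-weighted average of the region features. This is a generic dot-product attention example in plain Python, not the repository's code: the function names, the toy feature vectors, and the use of a plain dot product for scoring are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def question_guided_attention(region_features, question_vector):
    """Hypothetical helper: attend over image regions using a question vector.

    region_features: list of per-region feature vectors (same length each).
    question_vector: question embedding with the same length.
    Returns (attention weights per region, attended feature vector).
    """
    # Relevance of each region to the question (assumed: plain dot product).
    scores = [
        sum(r * q for r, q in zip(region, question_vector))
        for region in region_features
    ]
    weights = softmax(scores)

    # Weighted average of region features, one value per feature dimension.
    dim = len(question_vector)
    attended = [
        sum(w * region[d] for w, region in zip(weights, region_features))
        for d in range(dim)
    ]
    return weights, attended


# Toy data: three regions with 2-dimensional features (illustrative only).
regions = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
question = [2.0, 0.0]
weights, attended = question_guided_attention(regions, question)
print(weights, attended)
```

In this toy setup the first region aligns best with the question vector, so it receives the largest weight and dominates the attended feature. A trained model would learn the region and question embeddings rather than use fixed vectors.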

GitHub

25 stars

4 watching

11 forks

Language: C++

Last commit: over 4 years ago

Related projects:

Repository	Description	Stars
jiasenlu/hiecoattenvqa	A framework for training Hierarchical Co-Attention models for Visual Question Answering using preprocessed data and a specific image model.	349
jnhwkim/nips-mrn-vqa	This project presents a neural network model designed to answer visual questions by combining question and image features in a residual learning framework.	39
zcyang/imageqa-san	This project provides code for training image question answering models using stacked attention networks and convolutional neural networks.	108
gt-vision-lab/vqa_lstm_cnn	A Visual Question Answering model using a deeper LSTM and normalized CNN architecture.	377
akosiorek/attend_infer_repeat	An implementation of Attend, Infer, Repeat, a method for fast scene understanding using generative models.	82
huggingface/node-question-answering	Provides a simple way to perform question answering using a pre-trained model in Node.js.	466
yunjey/show-attend-and-tell	Generates captions for images using an attention-based neural network.	907
cadene/vqa.pytorch	A PyTorch implementation of visual question answering with multimodal representation learning.	718
jazzsaxmafia/show_attend_and_tell.tensorflow	A TensorFlow implementation of a neural caption generator using attention mechanisms.	506
rowanz/r2c	An open-source project providing PyTorch code and data for a deep learning model that enables visual commonsense reasoning.	466
davidmascharka/tbd-nets	An open-source implementation of a deep learning model designed to improve the balance between performance and interpretability in visual reasoning tasks.	348
jiasenlu/adaptiveattention	An adaptive attention mechanism for image captioning using visual sentinels.	335
hyeonwoonoh/vqa-transfer-externaldata	Tools and scripts for training and evaluating a visual question answering model using transfer learning from an external data source.	20
allenai/document-qa	Tools and codebase for training neural question answering models on multiple paragraphs of text data.	435
deepseek-ai/deepseek-vl	A multimodal AI model that enables real-world vision-language understanding applications.	2,145