bottom-up-attention

Attention model training

Trains a bottom-up attention model using Faster R-CNN and Visual Genome annotations for image captioning and VQA tasks

Bottom-up attention model for image captioning and VQA, based on Faster R-CNN and Visual Genome
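
The detector here produces a set of region-level CNN features per image (the paper typically uses a fixed 36 or an adaptive 10-100 regions of dimension 2048), which downstream captioning and VQA models then attend over. As a rough illustration only, not the repository's own code, the sketch below shows soft dot-product attention over such precomputed bottom-up region features; the array shapes, the scoring function, and the `attend` helper name are illustrative assumptions.

```python
# Hypothetical sketch: top-down soft attention over precomputed bottom-up
# region features (shapes and dot-product scoring are assumptions, not the
# repository's exact model).
import numpy as np

def attend(region_feats: np.ndarray, query: np.ndarray) -> np.ndarray:
    """Attend over bottom-up region features with a task-specific query.

    region_feats: (K, D) features for K detected regions (e.g. K=36, D=2048).
    query:        (D,)   context vector, e.g. an encoded question or caption state.
    Returns the (D,) attended image representation.
    """
    scores = region_feats @ query            # (K,) relevance score per region
    weights = np.exp(scores - scores.max())  # numerically stable softmax
    weights /= weights.sum()
    return weights @ region_feats            # weighted sum of region features

# Example with random stand-in features.
rng = np.random.default_rng(0)
feats = rng.standard_normal((36, 2048)).astype(np.float32)
q = rng.standard_normal(2048).astype(np.float32)
print(attend(feats, q).shape)  # (2048,)
```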


1k stars
26 watching
378 forks
Language: Jupyter Notebook
Last commit: almost 2 years ago
Linked from 1 awesome list

Topics: caffe, captioning-images, faster-rcnn, image-captioning, mscoco, mscoco-dataset, visual-question-answering, vqa

Related projects:

| Repository | Description | Stars |
| --- | --- | --- |
| hengyuan-hu/bottom-up-attention-vqa | An implementation of a VQA system using bottom-up attention, focused on efficiency and speed | 754 |
| gt-vision-lab/vqa_lstm_cnn | A Visual Question Answering model using a deeper LSTM and a normalized CNN | 376 |
| pistony/residualattentionnetwork | A Gluon implementation of the Residual Attention Network for image classification | 107 |
| jiasenlu/adaptiveattention | Adaptive attention for image captioning using a visual sentinel | 334 |
| zcyang/imageqa-san | Code for training image question answering models with stacked attention networks and CNNs | 107 |
| kaushalshetty/structured-self-attention | Sentence embeddings via structured self-attention, applied to binary and multiclass classification | 494 |
| koichiro11/residual-attention-network | An image classification network combining attention mechanisms and residual learning | 94 |
| jessemelpolio/faster_rcnn_for_dota | Code for training Faster R-CNN object detection on the DOTA dataset | 336 |
| bigballon/cifar-zoo | Implementations of CNN architectures and improvement methods for image classification on the CIFAR benchmark | 700 |
| chapternewscu/image-captioning-with-semantic-attention | A deep learning model for image captioning with semantic attention | 51 |
| cszn/ircnn | Trains deep CNN denoisers for image restoration tasks such as deblurring and demosaicking via model-based optimization | 600 |
| fwang91/residual-attention-network | A deep network combining attention mechanisms and residual connections for image classification | 551 |
| emedvedev/attention-ocr | A TensorFlow model for recognizing text in images using visual attention and a sequence-to-sequence architecture | 1,077 |
| cadene/vqa.pytorch | A PyTorch implementation of visual question answering with multimodal representation learning | 716 |
| szagoruyko/attention-transfer | Improves CNNs by transferring knowledge from teacher to student models via attention | 1,444 |