bottom-up-attention
Attention model training
Trains a bottom-up attention model with Faster R-CNN on Visual Genome annotations, producing region-level image features for image captioning and VQA tasks.
1k stars
26 watching
378 forks
Language: Jupyter Notebook
Last commit: about 2 years ago
Linked from 1 awesome list
Tags: caffe, captioning-images, faster-rcnn, image-captioning, mscoco, mscoco-dataset, visual-question-answering, vqa
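The pretrained region features that accompany bottom-up-attention models are commonly distributed as TSV files, with each row holding an image's metadata plus base64-encoded float32 buffers for its region boxes and features. A minimal reader sketch follows, assuming the conventional field layout (`image_id`, `image_w`, `image_h`, `num_boxes`, `boxes`, `features`) and a 2048-dimensional feature vector per region; verify both against the files you actually download.

```python
import base64
import csv
import sys

import numpy as np

# Assumed field layout of the released feature TSVs; check your copy.
FIELDNAMES = ["image_id", "image_w", "image_h", "num_boxes", "boxes", "features"]


def read_tsv(path, feature_dim=2048):
    """Yield one dict per image with decoded region boxes and features."""
    # Rows can be long (base64 blobs), so raise the csv field-size limit.
    csv.field_size_limit(min(sys.maxsize, 2**31 - 1))
    with open(path) as f:
        for row in csv.DictReader(f, delimiter="\t", fieldnames=FIELDNAMES):
            num_boxes = int(row["num_boxes"])
            row["num_boxes"] = num_boxes
            # Boxes: (num_boxes, 4) float32 array of region coordinates.
            row["boxes"] = np.frombuffer(
                base64.b64decode(row["boxes"]), dtype=np.float32
            ).reshape(num_boxes, 4)
            # Features: (num_boxes, feature_dim) float32 array.
            row["features"] = np.frombuffer(
                base64.b64decode(row["features"]), dtype=np.float32
            ).reshape(num_boxes, feature_dim)
            yield row
```

The decoded per-region feature matrix is what downstream captioning or VQA models attend over in place of a uniform spatial grid.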
Related projects:
| Repository | Description | Stars |
| --- | --- | --- |
| | A VQA system using bottom-up attention, aiming to improve the efficiency and speed of visual question answering | 755 |
| | A Visual Question Answering model using a deeper LSTM and a normalized CNN architecture | 377 |
| | A Gluon implementation of Residual Attention Network for image classification | 108 |
| | Adaptive attention mechanism for image captioning using visual sentinels | 335 |
| | Code for training image question answering models with stacked attention networks and CNNs | 108 |
| | A deep learning model that generates sentence embeddings via structured self-attention, used for binary and multiclass classification | 494 |
| | An image classification neural network using attention mechanisms and residual learning | 94 |
| | Code for training a Faster R-CNN object detection model on the DOTA dataset | 337 |
| | Implementations of CNN architectures and improvement methods for image classification on the CIFAR benchmark | 703 |
| | A deep learning model for generating image captions with semantic attention | 51 |
| | Trains deep CNN denoisers to improve image restoration tasks such as deblurring and demosaicking via model-based optimization | 602 |
| | A deep neural network using attention mechanisms and residual connections for image classification | 554 |
| | A TensorFlow model for recognizing text in images using visual attention and a sequence-to-sequence architecture | 1,079 |
| | A PyTorch implementation of visual question answering with multimodal representation learning | 718 |
| | Improves convolutional neural network performance by transferring knowledge from teacher models to student models via attention mechanisms | 1,449 |