# bottom-up-attention

Attention model training

Bottom-up attention model for image captioning and VQA, based on Faster R-CNN and Visual Genome.
- 1k stars
- 26 watching
- 378 forks
- Language: Jupyter Notebook
- Last commit: almost 2 years ago
- Linked from 1 awesome list
Topics: caffe, captioning-images, faster-rcnn, image-captioning, mscoco, mscoco-dataset, visual-question-answering, vqa
Related projects:
| Repository | Description | Stars |
|---|---|---|
| hengyuan-hu/bottom-up-attention-vqa | A VQA system using bottom-up attention, aiming to improve the efficiency and speed of visual question answering. | 754 |
| gt-vision-lab/vqa_lstm_cnn | A Visual Question Answering model using a deeper LSTM and normalized CNN architecture. | 376 |
| pistony/residualattentionnetwork | A Gluon implementation of Residual Attention Network for image classification. | 107 |
| jiasenlu/adaptiveattention | An adaptive attention mechanism for image captioning using visual sentinels. | 334 |
| zcyang/imageqa-san | Code for training image question answering models using stacked attention networks and convolutional neural networks. | 107 |
| kaushalshetty/structured-self-attention | A deep learning model that generates sentence embeddings via structured self-attention, used for binary and multiclass classification. | 494 |
| koichiro11/residual-attention-network | An image classification network combining attention mechanisms and residual learning. | 94 |
| jessemelpolio/faster_rcnn_for_dota | Code for training a Faster R-CNN object detection model on the DOTA dataset. | 336 |
| bigballon/cifar-zoo | Implementations of CNN architectures and improvement methods for image classification on the CIFAR benchmark. | 700 |
| chapternewscu/image-captioning-with-semantic-attention | A deep learning model for generating image captions with semantic attention. | 51 |
| cszn/ircnn | Trains deep CNN denoisers to improve image restoration tasks such as deblurring and demosaicking through model-based optimization. | 600 |
| fwang91/residual-attention-network | A deep neural network architecture using attention mechanisms and residual connections for image classification. | 551 |
| emedvedev/attention-ocr | A TensorFlow model for recognizing text in images using visual attention and a sequence-to-sequence architecture. | 1,077 |
| cadene/vqa.pytorch | A PyTorch implementation of visual question answering with multimodal representation learning. | 716 |
| szagoruyko/attention-transfer | Improves convolutional neural network performance by transferring knowledge from teacher to student models via attention mechanisms. | 1,444 |