tbd-nets

Visual Reasoning Model

An open-source implementation of a deep learning model designed to improve the balance between performance and interpretability in visual reasoning tasks.

PyTorch implementation of "Transparency by Design: Closing the Gap Between Performance and Interpretability in Visual Reasoning"

GitHub

348 stars

15 watching

74 forks

Language: Jupyter Notebook

last commit: over 3 years ago

deep-learning, machine-learning, neural-networks, pytorch, visual-question-answering, visualization, vqa

Repository: davidmascharka/tbd-nets

arxiv.org/abs/1803.05268

Related projects:

Repository	Description	Stars
rowanz/r2c	An open-source project providing PyTorch code and data for a deep learning model that enables visual commonsense reasoning.	466
nexusapoorvacus/deepvariationstructuredrl	An implementation of reinforcement learning for visual relationship and attribute detection using PyTorch.	63
mrgemy95/visual-interaction-networks-pytorch	An implementation of DeepMind's Visual Interaction Networks using PyTorch to predict future events in physical scenes.	166
codeslake/pvdnet	An open-source implementation of a deep learning model for video deblurring and motion estimation.	114
deepcs233/visual-cot	A framework for training multi-modal language models with a focus on visual inputs and providing interpretable thoughts.	162
isht7/pytorch-deeplab-resnet	A deep learning model implementation of the DeepLab ResNet architecture for image segmentation tasks.	602
kunpengli1994/vsrn	An open-source PyTorch implementation of a visual semantic reasoning model for image-text matching.	294
kaiyangzhou/dassl.pytorch	A PyTorch toolbox for supporting research and development of domain adaptation, generalization, and semi-supervised learning methods in computer vision.	1,236
cadene/vqa.pytorch	A PyTorch implementation of visual question answering with multimodal representation learning.	718
nvlabs/relvit	A deep learning framework designed to improve visual reasoning capabilities by utilizing concepts and semantic relations.	64
nvlabs/bongard-hoi	A benchmarking tool and software framework for evaluating few-shot visual reasoning capabilities in computer vision models.	64
gt-vision-lab/vqa_lstm_cnn	A Visual Question Answering model using a deeper LSTM and normalized CNN architecture.	377
javeywang/pyramid-attention-networks-pytorch	An implementation of a deep learning model using PyTorch for semantic segmentation tasks.	237
jnhwkim/nips-mrn-vqa	A neural network model that answers visual questions by combining question and image features in a residual learning framework.	39
vlgiitr/dmn-plus	A PyTorch implementation of an improved question answering architecture with dynamic memory networks and attention mechanisms.	64