tbd-nets

Visual Reasoning Model

An open-source implementation of a deep learning model designed to improve the balance between performance and interpretability in visual reasoning tasks.

PyTorch implementation of "Transparency by Design: Closing the Gap Between Performance and Interpretability in Visual Reasoning"


348 stars
15 watching
74 forks
Language: Jupyter Notebook
last commit: about 3 years ago
Tags: deep-learning, machine-learning, neural-networks, pytorch, visual-question-answering, visualization, vqa

Related projects:

Repository | Description | Stars
rowanz/r2c | An open-source project providing PyTorch code and data for a deep learning model that enables visual commonsense reasoning. | 466
nexusapoorvacus/deepvariationstructuredrl | An implementation of reinforcement learning for visual relationship and attribute detection using PyTorch. | 63
mrgemy95/visual-interaction-networks-pytorch | An implementation of DeepMind's Visual Interaction Networks using PyTorch to predict future events in physical scenes. | 166
codeslake/pvdnet | An open-source implementation of a deep learning model for video deblurring and motion estimation. | 114
deepcs233/visual-cot | A framework for training multi-modal language models with a focus on visual inputs and providing interpretable thoughts. | 162
isht7/pytorch-deeplab-resnet | An implementation of the DeepLab ResNet architecture for image segmentation tasks. | 602
kunpengli1994/vsrn | An open-source PyTorch implementation of a visual semantic reasoning model for image-text matching. | 294
kaiyangzhou/dassl.pytorch | A PyTorch toolbox for supporting research and development of domain adaptation, generalization, and semi-supervised learning methods in computer vision. | 1,236
cadene/vqa.pytorch | A PyTorch implementation of visual question answering with multimodal representation learning. | 718
nvlabs/relvit | A deep learning framework designed to improve visual reasoning capabilities by utilizing concepts and semantic relations. | 64
nvlabs/bongard-hoi | A benchmarking tool and software framework for evaluating few-shot visual reasoning capabilities in computer vision models. | 64
gt-vision-lab/vqa_lstm_cnn | A Visual Question Answering model using a deeper LSTM and normalized CNN architecture. | 377
javeywang/pyramid-attention-networks-pytorch | An implementation of a deep learning model using PyTorch for semantic segmentation tasks. | 237
jnhwkim/nips-mrn-vqa | A neural network model that answers visual questions by combining question and image features in a residual learning framework. | 39
vlgiitr/dmn-plus | A PyTorch implementation of an improved question answering architecture with dynamic memory networks and attention mechanisms. | 64