tbd-nets
Visual Reasoning Model
An open-source implementation of a deep learning model designed to improve the balance between performance and interpretability in visual reasoning tasks.
PyTorch implementation of "Transparency by Design: Closing the Gap Between Performance and Interpretability in Visual Reasoning"
348 stars
15 watching
74 forks
Language: Jupyter Notebook
last commit: almost 3 years ago
Tags: deep-learning, machine-learning, neural-networks, pytorch, visual-question-answering, visualization, vqa
Related projects:
Repository | Description | Stars |
---|---|---|
rowanz/r2c | An open-source project providing PyTorch code and data for a deep learning model that enables visual commonsense reasoning. | 466 |
nexusapoorvacus/deepvariationstructuredrl | An implementation of reinforcement learning for visual relationship and attribute detection using PyTorch. | 63 |
mrgemy95/visual-interaction-networks-pytorch | An implementation of DeepMind's Visual Interaction Networks using PyTorch to predict future events in physical scenes. | 166 |
codeslake/pvdnet | An open-source implementation of a deep learning model for video deblurring and motion estimation. | 114 |
deepcs233/visual-cot | A multi-modal language model with a comprehensive dataset and benchmark for chain-of-thought reasoning. | 134 |
isht7/pytorch-deeplab-resnet | A deep learning model implementation of the DeepLab ResNet architecture for image segmentation tasks. | 602 |
kunpengli1994/vsrn | An open-source PyTorch implementation of a visual semantic reasoning model for image-text matching. | 294 |
kaiyangzhou/dassl.pytorch | A PyTorch toolbox for supporting research and development of domain adaptation, generalization, and semi-supervised learning methods in computer vision. | 1,217 |
cadene/vqa.pytorch | A PyTorch implementation of visual question answering with multimodal representation learning. | 716 |
nvlabs/relvit | A deep learning framework designed to improve visual reasoning capabilities by utilizing concepts and semantic relations. | 64 |
nvlabs/bongard-hoi | A benchmarking tool and software framework for evaluating few-shot visual reasoning capabilities in computer vision models. | 64 |
gt-vision-lab/vqa_lstm_cnn | A Visual Question Answering model using a deeper LSTM and normalized CNN architecture. | 376 |
javeywang/pyramid-attention-networks-pytorch | An implementation of a deep learning model using PyTorch for semantic segmentation tasks. | 235 |
jnhwkim/nips-mrn-vqa | This project presents a neural network model designed to answer visual questions by combining question and image features in a residual learning framework. | 39 |
vlgiitr/dmn-plus | A PyTorch implementation of an improved question answering architecture with dynamic memory networks and attention mechanisms. | 64 |