zsgnet-pytorch

Object grounding model

An implementation of a computer vision model that grounds objects in images using natural language queries.

Official implementation of the ICCV 2019 oral paper "Zero-Shot Grounding of Objects from Natural Language Queries" (https://arxiv.org/abs/1908.07129)

GitHub

69 stars
4 watching
12 forks
Language: Python
Last commit: over 4 years ago
grounding, nlp, objects, vision

Related projects:

Repository | Description | Stars
jhcho99/coformer | An implementation of a deep learning model for grounding situation recognition in images | 43
kaiyangzhou/dassl.pytorch | A PyTorch toolbox for supporting research and development of domain adaptation, generalization, and semi-supervised learning methods in computer vision. | 1,217
microsoft/som | Enables visual grounding in large language models by overlaying spatial and speakable marks on images | 1,183
zhanghang1989/pytorch-encoding | A Python framework for building deep learning models with optimized encoding layers and batch normalization. | 2,041
mbzuai-oryx/groundinglmm | An end-to-end trained model capable of generating natural language responses integrated with object segmentation masks. | 781
kazuto1011/pspnet-pytorch | Re-implementation of a deep learning model for semantic segmentation using PyTorch. | 52
elliottd/groundedtranslation | Trains multilingual image description models using neural sequence models and extracts hidden features from trained models. | 46
javeywang/pyramid-attention-networks-pytorch | An implementation of a deep learning model using PyTorch for semantic segmentation tasks. | 235
devendrachaplot/deeprl-grounding | Trains an RL agent to execute natural language instructions in a 3D environment using a combination of A3C and gated attention mechanisms. | 237
l0sg/relational-rnn-pytorch | An implementation of DeepMind's Relational Recurrent Neural Networks (Santoro et al. 2018) in PyTorch for word language modeling | 244
hszhao/pspnet | A PyTorch implementation of a deep learning model for semantic image segmentation | 1,593
shizhediao/davinci | An implementation of vision-language models for multimodal learning tasks, enabling generative vision-language models to be fine-tuned for various applications. | 43
isht7/pytorch-deeplab-resnet | A deep learning model implementation of the DeepLab ResNet architecture for image segmentation tasks. | 602
nicolas-chaulet/torch-points3d | A PyTorch framework for building and training deep learning models on point clouds. | 219
fxia22/pointnet.pytorch | This is an implementation of the PointNet algorithm in PyTorch for 3D point cloud classification and segmentation tasks. | 2,149