vsepp

ImageCaptionRetrieval

A PyTorch implementation of visual-semantic embedding methods for image-caption retrieval

PyTorch code for the paper "VSE++: Improving Visual-Semantic Embeddings with Hard Negatives".


489 stars
15 watching
125 forks
Language: Python
Last commit: almost 3 years ago
Tags: bmvc, negatives, paper, pytorch, vse

Related projects:

Repository | Description | Stars
nickjiang2378/vl-interp | This project provides an official PyTorch implementation of a method to interpret and edit vision-language representations to mitigate hallucinations in image captions. | 31
ruotianluo/imagecaptioning.pytorch | A Python-based framework for training and testing image captioning models using PyTorch. | 1,451
kunpengli1994/vsrn | An open-source PyTorch implementation of a visual semantic reasoning model for image-text matching. | 294
penghao-wu/vstar | PyTorch implementation of guided visual search mechanism for multimodal LLMs. | 527
xiadingz/video-caption.pytorch | PyTorch implementation of video captioning, combining deep learning and computer vision techniques. | 401
openseg-group/openseg.pytorch | Provides a PyTorch implementation of several computer vision tasks including object detection, segmentation and parsing. | 1,190
jayleicn/clipbert | An efficient framework for end-to-end learning on image-text and video-text tasks. | 704
ruotianluo/self-critical.pytorch | An implementation of Self-critical Sequence Training for Image Captioning and related techniques. | 997
felixgwu/img_classification_pk_pytorch | A PyTorch project for comparing image classification models and facilitating quick experiment setup. | 365
facebookresearch/pycls | A flexible PyTorch image classification framework for rapid research exploration and model evaluation. | 2,138
kacky24/stylenet | A PyTorch implementation of a framework for generating captions with styles for images and videos. | 63
woozzu/dong_iccv_2017 | An implementation of semantic image synthesis via adversarial learning using PyTorch. | 145
kaiyangzhou/dassl.pytorch | A PyTorch toolbox for supporting research and development of domain adaptation, generalization, and semi-supervised learning methods in computer vision. | 1,217
cadene/vqa.pytorch | A PyTorch implementation of visual question answering with multimodal representation learning. | 716
ben-louis/deep-image-analogy-pytorch | A Python implementation of an image analogy algorithm based on PyTorch. | 181