vsepp

ImageCaptionRetrieval

A PyTorch implementation of visual-semantic embedding methods for image-caption retrieval

PyTorch code for the paper "VSE++: Improving Visual-Semantic Embeddings with Hard Negatives"

492 stars
15 watching
125 forks
Language: Python
Last commit: about 3 years ago
Topics: bmvc, negatives, paper, pytorch, vse

Related projects:

| Repository | Description | Stars |
|---|---|---|
| nickjiang2378/vl-interp | This project provides an official PyTorch implementation of a method to interpret and edit vision-language representations to mitigate hallucinations in image captions. | 46 |
| ruotianluo/imagecaptioning.pytorch | A Python-based framework for training and testing image captioning models using PyTorch. | 1,458 |
| kunpengli1994/vsrn | An open-source PyTorch implementation of a visual semantic reasoning model for image-text matching. | 294 |
| penghao-wu/vstar | PyTorch implementation of guided visual search mechanism for multimodal LLMs. | 541 |
| xiadingz/video-caption.pytorch | PyTorch implementation of video captioning, combining deep learning and computer vision techniques. | 402 |
| openseg-group/openseg.pytorch | Provides a PyTorch implementation of several computer vision tasks including object detection, segmentation and parsing. | 1,191 |
| jayleicn/clipbert | An efficient framework for end-to-end learning on image-text and video-text tasks. | 709 |
| ruotianluo/self-critical.pytorch | An implementation of Self-critical Sequence Training for Image Captioning and related techniques. | 998 |
| felixgwu/img_classification_pk_pytorch | A PyTorch project for comparing image classification models and facilitating quick experiment setup. | 366 |
| facebookresearch/pycls | A flexible PyTorch image classification framework for rapid research exploration and model evaluation. | 2,141 |
| kacky24/stylenet | A PyTorch implementation of a framework for generating captions with styles for images and videos. | 63 |
| woozzu/dong_iccv_2017 | An implementation of semantic image synthesis via adversarial learning using PyTorch. | 145 |
| kaiyangzhou/dassl.pytorch | A PyTorch toolbox for supporting research and development of domain adaptation, generalization, and semi-supervised learning methods in computer vision. | 1,236 |
| cadene/vqa.pytorch | A PyTorch implementation of visual question answering with multimodal representation learning. | 718 |
| ben-louis/deep-image-analogy-pytorch | A Python implementation of an image analogy algorithm based on PyTorch. | 181 |