VLP: Vision-Language Pre-training for Image Captioning and Question Answering

A project for pre-training models to support image captioning and visual question answering tasks.
416 stars · 19 watching · 62 forks · Language: Python · last commit: about 3 years ago

Related projects:
| Repository | Description | Stars |
|---|---|---|
| | An unsupervised image captioning framework that generates captions from images without paired data. | 215 |
| | Enhances language models to generate text based on visual descriptions of images. | 352 |
| | A Python-based framework for training and testing image captioning models using PyTorch. | 1,458 |
| | Trains image paragraph captioning models to generate diverse and accurate captions. | 90 |
| | A Lua binding for a fast image processing library with low memory needs. | 129 |
| | An implementation of Self-critical Sequence Training for image captioning and related techniques. | 998 |
| | An official PyTorch implementation of a method to interpret and edit vision-language representations to mitigate hallucinations in image captions. | 46 |
| | A PyTorch implementation of image captioning models via scene graph decomposition. | 96 |
| | A PyTorch implementation of video captioning combining deep learning and computer vision techniques. | 402 |
| | A project that uses image processing techniques to locate a sliding captcha puzzle within a background image. | 142 |
| | An approach to image captioning that leverages the CLIP model and fine-tunes a language model without additional supervision or object annotations. | 1,326 |
| | A vision transformer architecture designed for high-resolution image encoding with multiple efficient attention mechanisms. | 243 |
| | A deep learning model for generating image captions with semantic attention. | 51 |
| | A CSS3-based solution for creating hover animations on image captions. | 354 |
| | An automated captioning and transcription tool for video and audio files. | 74 |