VLP

Image Captioner

A project for pre-training vision-language models that can be fine-tuned for image captioning and visual question answering.

Vision-Language Pre-training for Image Captioning and Question Answering


412 stars
19 watching
62 forks
Language: Python
Last commit: almost 3 years ago

Related projects:

| Repository | Description | Stars |
| --- | --- | --- |
| fengyang0317/unsupervised_captioning | An unsupervised image captioning framework that allows generating captions from images without paired data. | 215 |
| contextualai/lens | Enhances language models to generate text based on visual descriptions of images. | 351 |
| ruotianluo/imagecaptioning.pytorch | A Python-based framework for training and testing image captioning models using PyTorch. | 1,451 |
| lukemelas/image-paragraph-captioning | Trains image paragraph captioning models to generate diverse and accurate captions. | 90 |
| libvips/lua-vips | A Lua binding for a fast image processing library with low memory needs. | 127 |
| ruotianluo/self-critical.pytorch | An implementation of Self-critical Sequence Training for Image Captioning and related techniques. | 997 |
| nickjiang2378/vl-interp | An official PyTorch implementation of a method to interpret and edit vision-language representations to mitigate hallucinations in image captions. | 31 |
| yiwuzhong/sub-gc | A PyTorch implementation of image captioning models via scene graph decomposition. | 96 |
| xiadingz/video-caption.pytorch | PyTorch implementation of video captioning, combining deep learning and computer vision techniques. | 401 |
| chxj1992/slide_captcha_cracker | A project that uses image processing techniques to locate a sliding captcha puzzle within a background image. | 141 |
| rmokady/clip_prefix_caption | An approach to image captioning that leverages the CLIP model and fine-tunes a language model without requiring additional supervision or object annotation. | 1,315 |
| microsoft/vision-longformer | An implementation of a vision transformer architecture designed for high-resolution image encoding with multiple efficient attention mechanisms. | 241 |
| chapternewscu/image-captioning-with-semantic-attention | A deep learning model for generating image captions with semantic attention. | 51 |
| hasinhayder/imagecaptionhoveranimation | A CSS3-based solution to create hover animations for image captions. | 354 |
| lumingyin/quickcaption | Automated captioning and transcription tool for video and audio files. | 74 |