VLP
Image Captioner
A project for pre-training models to support image captioning and question answering tasks.
Vision-Language Pre-training for Image Captioning and Question Answering
412 stars
19 watching
62 forks
Language: Python
Last commit: almost 3 years ago

Related projects:
Repository | Description | Stars |
---|---|---|
fengyang0317/unsupervised_captioning | An unsupervised image captioning framework that allows generating captions from images without paired data. | 215 |
contextualai/lens | Enhances language models to generate text based on visual descriptions of images. | 351 |
ruotianluo/imagecaptioning.pytorch | A Python-based framework for training and testing image captioning models using PyTorch. | 1,451 |
lukemelas/image-paragraph-captioning | Trains image paragraph captioning models to generate diverse and accurate captions. | 90 |
libvips/lua-vips | A Lua binding for a fast image processing library with low memory needs. | 127 |
ruotianluo/self-critical.pytorch | An implementation of Self-critical Sequence Training for Image Captioning and related techniques. | 997 |
nickjiang2378/vl-interp | An official PyTorch implementation of a method for interpreting and editing vision-language representations to mitigate hallucinations in image captions. | 31 |
yiwuzhong/sub-gc | A PyTorch implementation of image captioning models via scene graph decomposition. | 96 |
xiadingz/video-caption.pytorch | PyTorch implementation of video captioning, combining deep learning and computer vision techniques. | 401 |
chxj1992/slide_captcha_cracker | A project that uses image processing techniques to locate a sliding captcha puzzle within a background image. | 141 |
rmokady/clip_prefix_caption | An approach to image captioning that leverages the CLIP model and fine-tunes a language model without requiring additional supervision or object annotation. | 1,318 |
microsoft/vision-longformer | An implementation of a vision transformer architecture designed for high-resolution image encoding with multiple efficient attention mechanisms. | 241 |
chapternewscu/image-captioning-with-semantic-attention | A deep learning model for generating image captions with semantic attention. | 51 |
hasinhayder/imagecaptionhoveranimation | A CSS3-based solution for creating hover animations for image captions. | 354 |
lumingyin/quickcaption | An automated captioning and transcription tool for video and audio files. | 74 |