unsupervised_captioning

Image captioner

An unsupervised image captioning framework that generates captions for images without requiring paired image-caption training data.

Code for Unsupervised Image Captioning

Stars: 215
Watching: 7
Forks: 51
Language: Python
Last commit: over 1 year ago

Related projects:

| Repository | Description | Stars |
| --- | --- | --- |
| apple2373/chainer-caption | An image caption generation system using a neural network architecture with pre-trained models. | 64 |
| deeprnn/image_captioning | An implementation that generates captions from images using a neural network model with visual attention. | 786 |
| chapternewscu/image-captioning-with-semantic-attention | A deep learning model for generating image captions with semantic attention. | 51 |
| luoweizhou/vlp | A project for pre-training models to support image captioning and question answering tasks. | 412 |
| cshizhe/asg2cap | An image caption generation model that uses abstract scene graphs for fine-grained control over the generated captions. | 200 |
| yiwuzhong/sub-gc | A PyTorch implementation of image captioning models via scene graph decomposition. | 96 |
| rmokady/clip_prefix_caption | An approach to image captioning that leverages the CLIP model and fine-tunes a language model without requiring additional supervision or object annotation. | 1,315 |
| lukemelas/image-paragraph-captioning | Trains image paragraph captioning models to generate diverse and accurate captions. | 90 |
| anonymousanoy/fohe | Automates the process of generating multiple rewritten image captions by fine-tuning large vision-language models. | 7 |
| contextualai/lens | Enhances language models to generate text based on visual descriptions of images. | 351 |
| ibm/max-image-caption-generator | An image caption generation system utilizing machine learning models and deep neural networks. | 84 |
| ttengwang/caption-anything | A tool for generating descriptive captions from images with customizable controls and text styles. | 1,682 |
| eladhoffer/captiongen | A PyTorch-based tool for generating captions from images. | 128 |
| facebookresearch/cutler | An unsupervised object detection and segmentation framework that can learn from image data alone, without requiring human annotations. | 943 |
| jamespark3922/adv-inf | A method for generating and evaluating video captions using adversarial inference, trained on large datasets of text and multimedia features. | 34 |