FOHE
Caption rewriting
Automates the process of generating multiple rewritten image captions by fine-tuning large vision-language models
7 stars
1 watching
1 fork
Language: Python
Last commit: 10 months ago
Related projects:
Repository | Description | Stars |
---|---|---|
fengyang0317/unsupervised_captioning | An unsupervised image captioning framework that allows generating captions from images without paired data. | 215 |
nickjiang2378/vl-interp | This project provides an official PyTorch implementation of a method to interpret and edit vision-language representations to mitigate hallucinations in image captions. | 31 |
chapternewscu/image-captioning-with-semantic-attention | A deep learning model for generating image captions with semantic attention | 51 |
cshizhe/asg2cap | An image caption generation model that uses abstract scene graphs for fine-grained control over the generated captions | 200 |
contextualai/lens | Enhances language models to generate text based on visual descriptions of images | 351 |
a1ext/auto_re | Automates renaming and tagging of IDA Pro functions based on API calls or jumps | 611 |
apple2373/chainer-caption | An image caption generation system using a neural network architecture with pre-trained models. | 64 |
luoweizhou/vlp | A project for pre-training models to support image captioning and question answering tasks. | 412 |
aboev/arae-tf | Automates generation of discrete sequence text using adversarially regularized autoencoders | 20 |
rmokady/clip_prefix_caption | An approach to image captioning that leverages the CLIP model and fine-tunes a language model without requiring additional supervision or object annotation. | 1,315 |
lukemelas/image-paragraph-captioning | Trains image paragraph captioning models to generate diverse and accurate captions | 90 |
kacky24/stylenet | A PyTorch implementation of a framework for generating captions with styles for images and videos. | 63 |
jakezhaojb/arae | An implementation of Adversarially Regularized Autoencoders for language generation and discrete structure modeling. | 400 |
terrynoya/asimagelib | A decoder library for PNG and BMP images in ActionScript 3. | 10 |
deeprnn/image_captioning | This implementation allows users to generate captions from images using a neural network model with visual attention. | 786 |