Caption-Anything

Captioner

A tool that generates descriptive captions for images, with customizable controls and text styles.

Caption-Anything is a versatile tool that combines image segmentation, visual captioning, and ChatGPT to generate tailored captions, with a range of controls for user preferences. Demos: https://huggingface.co/spaces/TencentARC/Caption-Anything https://huggingface.co/spaces/VIPLab/Caption-Anything

GitHub

2k stars

16 watching

103 forks

Language: Python

last commit: over 2 years ago

chatgpt, controllable-generation, controllable-image-captioning, image-captioning, segment-anything

Related projects:

Repository	Description	Stars
fengyang0317/unsupervised_captioning	An unsupervised image captioning framework that allows generating captions from images without paired data.	215
apple2373/chainer-caption	An image caption generation system using a neural network architecture with pre-trained models.	64
tpkahlon/captcha-image	A library to generate images with distorted text and background patterns for security purposes.	8
eladhoffer/captiongen	A PyTorch-based tool for generating captions from images	128
lumingyin/quickcaption	Automated captioning and transcription tool for video and audio files	74
kacky24/stylenet	A PyTorch implementation of a framework for generating captions with styles for images and videos.	63
lukemelas/image-paragraph-captioning	Trains image paragraph captioning models to generate diverse and accurate captions	90
xiadingz/video-caption.pytorch	PyTorch implementation of video captioning, combining deep learning and computer vision techniques.	402
rmokady/clip_prefix_caption	An approach to image captioning that leverages the CLIP model and fine-tunes a language model without requiring additional supervision or object annotation.	1,326
cshizhe/asg2cap	An image caption generation model that uses abstract scene graphs for fine-grained control over the generated captions.	200
vision-cair/chatcaptioner	Enables automatic generation of descriptive text from images and videos based on user input.	457
yiwuzhong/sub-gc	A PyTorch implementation of image captioning models via scene graph decomposition.	96
contextualai/lens	Enhances language models to generate text based on visual descriptions of images	352
jaywongwang/densevideocaptioning	An implementation of a dense video captioning model with attention-based fusion and context gating	149
nickjiang2378/vl-interp	An official PyTorch implementation of a method that interprets and edits vision-language representations to mitigate hallucinations in image captions.	46