virtex

Caption learning

A pretraining approach that uses semantically dense captions to learn visual representations and improve performance on image understanding tasks.

[CVPR 2021] VirTex: Learning Visual Representations from Textual Annotations

GitHub

556 stars

14 watching

61 forks

Language: Python

last commit: over 1 year ago

coco-dataset, cvpr2021, image-captioning, model-zoo, pretrained-models

kdexd.xyz/virtex

Related projects:

Repository	Description	Stars
xiadingz/video-caption.pytorch	PyTorch implementation of video captioning, combining deep learning and computer vision techniques.	402
zhegan27/semantic_compositional_nets	A deep learning framework providing a model architecture and training code for image captioning using semantic compositional networks.	70
deepcs233/visual-cot	A framework for training multi-modal language models with a focus on visual inputs and providing interpretable thoughts.	162
jaywongwang/densevideocaptioning	An implementation of a dense video captioning model with attention-based fusion and context gating.	149
luoweizhou/vlp	A project for pre-training models to support image captioning and question answering tasks.	416
chapternewscu/image-captioning-with-semantic-attention	A deep learning model for generating image captions with semantic attention.	51
rmokady/clip_prefix_caption	An approach to image captioning that leverages the CLIP model and fine-tunes a language model without requiring additional supervision or object annotation.	1,326
vict0rsch/deep_learning	A collection of tutorials and resources on implementing deep learning models using Python libraries such as Keras and Lasagne.	426
lukemelas/image-paragraph-captioning	Trains image paragraph captioning models to generate diverse and accurate captions.	90
rucaibox/comvint	Creates synthetic visual reasoning instructions to improve the performance of large language models on image-related tasks.	18
yxuansu/tacl	Improves pre-trained language models by encouraging an isotropic and discriminative distribution of token representations.	92
yiwuzhong/sub-gc	A PyTorch implementation of image captioning models via scene graph decomposition.	96
ppwwyyxx/moco.tensorflow	Reimplements a popular deep learning model for unsupervised visual representation learning using TensorFlow.	161
deeprnn/image_captioning	This implementation allows users to generate captions from images using a neural network model with visual attention.	790
apple2373/chainer-caption	An image caption generation system using a neural network architecture with pre-trained models.	64