DenseVideoCaptioning
A dense video captioning model with attention-based fusion and context gating
Official TensorFlow implementation of the CVPR 2018 paper "Bidirectional Attentive Fusion with Context Gating for Dense Video Captioning", including code, the trained model, and prediction results.
148 stars
6 watching
50 forks
Language: Python
last commit: over 5 years ago
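The paper's context gating learns a gate that balances an event's own features against features of its surrounding context before decoding a caption. A minimal NumPy sketch of the idea follows; the shapes, weight names, and exact fusion formula here are illustrative assumptions, not the repository's code:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def context_gate(event_feat, context_feat, W, b):
    """Fuse an event feature with its context feature.

    A sigmoid gate is computed from the concatenated features and used
    as an elementwise convex combination of the two inputs, so each
    output dimension lies between the corresponding event and context
    values.
    """
    g = sigmoid(np.concatenate([event_feat, context_feat]) @ W + b)
    return g * event_feat + (1.0 - g) * context_feat

# Toy dimensions; real video features would be much larger.
d = 4
event = rng.normal(size=d)      # features of the current event segment
context = rng.normal(size=d)    # aggregated features of surrounding video
W = rng.normal(size=(2 * d, d)) * 0.1  # illustrative gate weights
b = np.zeros(d)

fused = context_gate(event, context, W, b)
print(fused.shape)  # (4,)
```

Because the gate is a sigmoid, the fused vector is an elementwise interpolation between the event and context features; training the gate weights lets the model decide, per dimension, how much context to admit.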
Related projects:
Repository | Description | Stars |
---|---|---|
jamespark3922/adv-inf | A method for generating and evaluating video captions using adversarial inference, trained on large datasets of text and multimedia features. | 34 |
cshizhe/asg2cap | An image captioning model that uses abstract scene graphs for fine-grained control over generated captions | 200 |
xiadingz/video-caption.pytorch | PyTorch implementation of video captioning, combining deep learning and computer vision techniques. | 401 |
zhegan27/semantic_compositional_nets | A deep learning framework providing a model architecture and training code for image captioning using semantic compositional networks | 70 |
yiwuzhong/sub-gc | A PyTorch implementation of image captioning models via scene graph decomposition. | 96 |
shangwei5/vidue | A deep learning model that jointly performs video frame interpolation and deblurring with unknown exposure time | 66 |
jayleicn/clipbert | An efficient framework for end-to-end learning on image-text and video-text tasks | 704 |
chapternewscu/image-captioning-with-semantic-attention | A deep learning model for generating image captions with semantic attention | 51 |
jcjohnson/densecap | A deep learning framework for generating natural language descriptions of images by detecting objects and their attributes | 1,584 |
pku-yuangroup/video-bench | Evaluates and benchmarks large language models' video understanding capabilities | 117 |
kdexd/virtex | A pretraining approach that uses semantically dense captions to learn visual representations and improve image understanding tasks. | 557 |
deeprnn/image_captioning | Generates captions from images using a neural network model with visual attention. | 786 |
pku-yuangroup/chronomagic-bench | A benchmark and dataset for evaluating text-to-video generation models' ability to generate coherent and varied metamorphic time-lapse videos. | 186 |
codeslake/pvdnet | An open-source implementation of a deep learning model for video deblurring and motion estimation. | 114 |
zhengpeng7/birefnet | An implementation of a deep learning-based image segmentation model for high-resolution images | 1,319 |