CAVP
Policy Network Framework
A framework for sequence-level and fine-grained image captioning that uses policy networks to incorporate contextual information into captions.
Code release for Context-Aware Visual Policy Network for Sequence-Level Image Captioning (MM 2018) and Context-Aware Visual Policy Network for Fine-Grained Image Captioning (TPAMI 2019)
47 stars
4 watching
3 forks
Language: Python
Last commit: over 5 years ago
Tags: image-captioning, policy-network
Related projects:
Repository | Description | Stars |
---|---|---|
jimmy-ren/vcnn_double-bladed | A GPU-enabled vectorized implementation of CNNs for computer vision tasks | 136 |
zhegan27/semantic_compositional_nets | A deep learning framework providing a model architecture and training code for image captioning using semantic compositional networks | 70 |
cbfinn/gps | An implementation of guided policy search and LQG-based trajectory optimization for reinforcement learning | 598 |
fomorians/highway-cnn | A deep learning framework for training highway networks on image data using convolutional neural networks | 57 |
guanghan/darknet | An implementation of a neural network framework for computer vision tasks, supporting both CPU and GPU computation. | 243 |
hciilab/derpn | A novel region proposal network for object detection and scene text detection that focuses on improving the adaptivity of current detectors | 156 |
deeprnn/image_captioning | This implementation allows users to generate captions from images using a neural network model with visual attention. | 786 |
netflix-skunkworks/policyuniverse | A Python package for parsing and processing AWS IAM policies and statements. | 428 |
hxyou/idealgpt | A deep learning framework for iteratively decomposing vision and language reasoning via large language models. | 32 |
taolei87/rcnn | An implementation of neural network components and optimization methods for text analysis, including rationales for neural predictions. | 355 |
nv-tlabs/gscnn | This code implements a neural network architecture designed to perform semantic segmentation in computer vision tasks. | 920 |
tobypde/frrn | A software framework for training and evaluating full-resolution residual networks for semantic image segmentation tasks | 280 |
cypw/dpns | A deep learning framework implementing a specific network architecture for image localization tasks. | 537 |
pathak22/context-encoder | Unsupervised feature learning by image inpainting using Generative Adversarial Networks (GANs) | 885 |
hellonico/origami | A framework for building computer vision and neural networks applications on the JavaVM | 122 |