cvpr2016
Visual descriptor learner
A system for learning deep representations of fine-grained visual descriptions from images
Learning Deep Representations of Fine-grained Visual Descriptions
334 stars
18 watching
97 forks
Language: Lua
Last commit: over 7 years ago
Linked from 1 awesome list
Related projects:
Repository | Description | Stars |
---|---|---|
reedscot/icml2016 | Generates synthetic images from text descriptions using a Generative Adversarial Network (GAN) | 913 |
shi-labs/vcoder | An adapter for improving large language models at object-level perception tasks with auxiliary perception modalities | 261 |
5vision/darqn | An implementation of the Deep Attention Recurrent Q-Network (DARQN), a deep reinforcement learning model | 115 |
gordonhu608/mqt-llava | A vision-language model that uses a query transformer to encode images as visual tokens and allows a flexible choice of the number of visual tokens | 97 |
zhengwang100/rect | A deep learning framework for graph representation learning with partially labeled data | 18 |
soumith/cvpr2015 | An introduction to deep learning for computer vision using the Torch framework | 868 |
clementfarabet/manifold | Manages and transforms high-dimensional data into lower-dimensional representations using various algorithms | 141 |
wanglimin/tdd | A tool for extracting features from videos using deep convolutional descriptors | 104 |
nvlabs/relvit | A deep learning framework designed to improve visual reasoning by using concepts and semantic relations | 64 |
satwikkottur/visualword2vec | Learning word embeddings from abstract images to improve language understanding | 19 |
jcjohnson/densecap | A deep learning framework for generating natural language descriptions of images by detecting objects and their attributes | 1,584 |
bobbens/cvpr2016_stylenet | A computer vision model that extracts features from fashion images using a novel approach to data processing | 69 |
clementfarabet/vowpal_wabbit | A Lua interface to an online learning algorithm for fast prediction and classification | 2 |
vision-cair/longvu | An artificial intelligence system designed to understand and describe long-form video content | 270 |
batra-mlp-lab/visdial | A system for an AI agent to engage in natural dialog about visual content using a combination of encoder and decoder architectures | 228 |