cvpr2016

Visual descriptor learner

A system for learning deep representations of fine-grained visual descriptions of images

Learning Deep Representations of Fine-grained Visual Descriptions

GitHub

334 stars
18 watching
97 forks
Language: Lua
Last commit: over 7 years ago
Linked from 1 awesome list


Related projects:

| Repository | Description | Stars |
| --- | --- | --- |
| reedscot/icml2016 | Generates synthetic images from text descriptions using a Generative Adversarial Network (GAN) | 913 |
| shi-labs/vcoder | An adapter for improving large language models at object-level perception tasks with auxiliary perception modalities | 261 |
| 5vision/darqn | An implementation of a deep reinforcement learning model for continuous control tasks | 115 |
| gordonhu608/mqt-llava | A vision-language model that uses a query transformer to encode images as visual tokens and allows flexible choice of the number of visual tokens | 97 |
| zhengwang100/rect | A deep learning framework for graph representation learning with partially labeled data | 18 |
| soumith/cvpr2015 | An introduction to deep learning for computer vision using the Torch framework | 868 |
| clementfarabet/manifold | Manages and transforms high-dimensional data into lower-dimensional representations using various algorithms | 141 |
| wanglimin/tdd | A tool for extracting features from videos using deep convolutional descriptors | 104 |
| nvlabs/relvit | A deep learning framework designed to improve visual reasoning capabilities by utilizing concepts and semantic relations | 64 |
| satwikkottur/visualword2vec | Learning word embeddings from abstract images to improve language understanding | 19 |
| jcjohnson/densecap | A deep learning framework for generating natural language descriptions of images by detecting objects and their attributes | 1,584 |
| bobbens/cvpr2016_stylenet | A computer vision model for extracting features from fashion images using a novel approach to data processing | 69 |
| clementfarabet/vowpal_wabbit | A Lua interface to an online learning algorithm for fast prediction and classification | 2 |
| vision-cair/longvu | An artificial intelligence system designed to understand and describe long-form video content | 270 |
| batra-mlp-lab/visdial | A system for an AI agent to engage in natural dialog about visual content using a combination of encoder and decoder architectures | 228 |