awesome-vision-and-language
A curated list of awesome vision and language resources (still under construction... stay tuned!)
488 stars
12 watching
39 forks
last commit: 2 months ago
tags: awesome, awesome-list, multimodal-learning, vision-and-language
Awesome Vision-and-Language: / Survey | |||
1506.06833 | |||
1705.09406 | |||
1810.04020 | |||
1907.09358 | |||
Scene-Graph-Survey | |||
1904.09317 | |||
ACCESS 2019 | |||
1911.03977 | |||
1912.11872 | |||
2010.09522 | |||
Awesome Vision-and-Language: / Dataset | |||
1505.00468 | |||
visualqa | |||
1604.03968 | |||
ai-visual-storytelling-seq2seq | |||
VIST | |||
1602.07332 | |||
visual_genome_python_driver | 354 | about 1 year ago | |
visualgenome | |||
1612.06890 | |||
1705.08421 | |||
AVA | |||
1711.11543 | |||
embodiedqa | |||
1711.07280 | |||
bringmeaspoon | |||
1902.09506 | |||
visualreasoning | |||
1811.10830 | |||
r2c | 466 | over 3 years ago | |
VCR | |||
1904.03493 | |||
2010.00763 | |||
Bongard-LOGO | 51 | over 2 years ago | |
2205.13803 | |||
Bongard-HOI | 63 | almost 2 years ago | |
Awesome Vision-and-Language: / Image Captioning | |||
1411.4389 | |||
1412.2306 | |||
1411.4555 | |||
show_and_tell.tensorflow | 290 | almost 8 years ago | |
1502.03044 | |||
show-attend-and-tell | 909 | about 6 years ago | |
1411.4952 | |||
visual-concepts | 150 | over 6 years ago | |
1603.03925 | |||
semantic-attention | 51 | almost 8 years ago | |
1612.01887 | |||
AdaptiveAttention | 334 | almost 7 years ago | |
1612.00563 | |||
1611.06607 | |||
1704.03899 | |||
1611.08002 | |||
Semantic_Compositional_Nets | 69 | over 6 years ago | |
CVPR 2017 | |||
stylenet | 62 | almost 4 years ago | |
EMNLP 2018 | |||
image-paragraph-captioning | 91 | about 5 years ago | |
1803.09845 | |||
NeuralBabyTalk | 523 | over 5 years ago | |
1707.07998 | |||
1807.03871 | |||
1805.08191 | |||
1811.10787 | |||
unsupervised_captioning | 215 | over 1 year ago | |
1906.02365 | |||
CAVP | 47 | about 5 years ago | |
1903.05942 | |||
1903.12020 | |||
1904.01475 | |||
1812.02378 | |||
SGAE | 220 | over 2 years ago | |
CVPR 2019 | |||
1901.02527 | |||
1908.06954 | |||
2004.03708 | |||
2003.00387 | |||
asg2cap | 200 | almost 2 years ago | |
2007.11731 | |||
Sub-GC | 96 | about 2 months ago | |
2009.12313 | |||
2102.04990 | |||
Awesome Vision-and-Language: / Image Retrieval | |||
1511.07067 | |||
VisualWord2Vec | 19 | over 5 years ago | |
1812.07119 | |||
tirg | 296 | over 3 years ago | |
2105.13868 | |||
IAIS | 30 | over 1 year ago | |
2203.15867 | |||
ImageCoDe | 39 | 7 months ago | |
2407.15239 | |||
Awesome Vision-and-Language: / Scene Text Recognition | |||
1908.09231 | |||
1904.01906 | |||
clovaai | 3,720 | 7 months ago | |
Awesome Vision-and-Language: / Scene Graph | |||
7298990 | |||
1602.07332 | |||
visual_genome_python_driver | 354 | about 1 year ago | |
visualgenome | |||
1701.02426 | |||
scene-graph-TF-release | 422 | over 5 years ago | |
1707.09700 | |||
MSDN | 227 | almost 5 years ago | |
1711.06640 | |||
neural-motifs | 522 | about 5 years ago | |
1802.02598 | |||
1811.06410 | |||
1804.01622 | |||
sg2im | 1,294 | 2 months ago | |
1808.00191 | |||
graph-rcnn.pytorch | 732 | over 4 years ago | |
1904.00560 | |||
1909.05379 | |||
scene_generation | 185 | about 1 year ago | |
1811.10696 | |||
sceneGraph_Mem | 4 | over 5 years ago | |
1903.02728 | |||
ContrastiveLosses4VRD | 200 | over 4 years ago | |
1903.03326 | |||
KERN | 120 | about 2 years ago | |
1812.01880 | |||
VCTree | 121 | about 2 months ago | |
1812.02347 | |||
1904.11622 | |||
limited-label | 54 | about 1 year ago | |
2002.11949 | |||
Scene-Graph-Benchmark | 1,056 | 25 days ago | |
2003.12962 | |||
GPS-Net | 63 | over 4 years ago | |
2006.09623 | |||
2007.08760 | |||
het-eccv20 | 16 | about 4 years ago | |
Awesome Vision-and-Language: / text2image | |||
1605.05396 | |||
icml2016 | 911 | about 6 years ago | |
1612.03242 | |||
StackGAN | 1,858 | over 4 years ago | |
1711.10485 | |||
AttnGAN | 1,334 | 2 months ago | |
1802.09178 | |||
HDGan | 150 | almost 6 years ago | |
1812.02784 | |||
StoryGAN | 235 | about 2 years ago | |
1903.05854 | |||
1904.01310 | |||
1904.01480 | |||
1811.09845 | |||
GeNeVA | 37 | over 1 year ago | |
1909.05379 | |||
scene_generation | 185 | about 1 year ago | |
Awesome Vision-and-Language: / Video Captioning | |||
1411.4389 | |||
1510.07712 | |||
1701.03126 | |||
1611.08002 | |||
CVPR 2017 | |||
1804.00100 | |||
1812.05634 | |||
adv-inf | 34 | about 5 years ago | |
1904.03870 | |||
DenseVideoCaptioning | 148 | about 5 years ago | |
1906.04375 | |||
2011.07735 | |||
iPerceive | |||
Awesome Vision-and-Language: / Video Question Answering | |||
1512.02902 | |||
MovieQA | 80 | almost 8 years ago | |
1809.01696 | |||
TVQA | 170 | almost 2 years ago | |
2007.08751 | |||
ROLL-VideoQA | 19 | about 4 years ago | |
2011.07735 | |||
iPerceive | |||
Awesome Vision-and-Language: / Video Understanding | |||
1811.08383 | |||
temporal-shift-module | 2,055 | 3 months ago | |
1910.11009 | |||
Awesome Vision-and-Language: / Vision and Language Navigation | |||
1711.11543 | |||
embodiedqa | |||
1711.07280 | |||
bringmeaspoon | |||
fda_pdf | |||
fda_code | 13 | 9 months ago | |
mam_paper | |||
Awesome Vision-and-Language: / Vision-and-Language Pretraining | |||
1908.07490 | |||
lxmert | 926 | almost 2 years ago | |
1904.01766 | |||
vilbert | 470 | almost 2 years ago | |
1907.07804 | |||
OmniNet | 512 | almost 4 years ago | |
1908.06066 | |||
Unicoder | 88 | 10 months ago | |
1909.11059 | |||
VLP | 411 | over 2 years ago | |
1911.11237 | |||
Oscar | 1,037 | about 1 year ago | |
2006.09882 | |||
swav | 1,993 | over 1 year ago | |
2004.06165 | |||
Oscar | 1,037 | about 1 year ago | |
2006.16934 | |||
ERNIE | 6,294 | about 1 month ago | |
2101.00529 | |||
VinVL | 349 | about 1 year ago | |
2006.06666 | |||
virtex | 556 | 9 months ago | |
2103.00020 | |||
2103.05247 | |||
universal-computation | 244 | over 2 years ago | |
2102.05918 | |||
2103.01988 | |||
2102.10772 | |||
2102.12092 | |||
2103.06561 | |||
2305.08675 | |||
Awesome Vision-and-Language: / Visual Dialog | |||
1611.08669 | |||
visdial | 227 | almost 6 years ago | |
visualdialog | |||
1803.11186 | |||
2303.05983 | |||
ATVC | 7 | over 1 year ago | |
Awesome Vision-and-Language: / Visual Grounding | |||
1611.09978 | |||
cmn | 67 | about 6 years ago | |
1908.07553 | |||
1812.03299 | |||
1908.06354 | |||
1908.07129 | |||
zsgnet | 69 | over 4 years ago | |
2203.16518 | |||
CoFormer | 42 | over 1 year ago | |
Awesome Vision-and-Language: / Visual Question Answering | |||
1505.00468 | |||
visualqa | |||
1606.00061 | |||
HieCoAttenVQA | 347 | about 6 years ago | |
1606.01847 | |||
vqa-mcb | 221 | about 8 years ago | |
1511.02274 | |||
imageqa-san | 107 | over 7 years ago | |
1511.05234 | |||
AAAA | 25 | almost 4 years ago | |
1603.01417 | |||
dmn-plus | 65 | over 6 years ago | |
1606.01455 | |||
nips-mrn-vqa | 38 | almost 8 years ago | |
1609.05600 | |||
1612.00837 | |||
1704.05526 | |||
1803.08896 | |||
PSLQA | 55 | over 5 years ago | |
1707.07998 | |||
1708.02711 | |||
vqa-winner | 165 | over 5 years ago | |
1810.02358 | |||
VQA-Transfer-ExternalData | 20 | over 5 years ago | |
1902.09506 | |||
visualreasoning | |||
1904.08920 | |||
ICCV2019 | |||
1907.12133 | |||
scene-graphs-vqa | |||
2204.11167 | |||
RelViT | 63 | about 2 years ago | |
2208.01813 | |||
TAG | 21 | almost 2 years ago | |
Awesome Vision-and-Language: / Visual Reasoning | |||
1612.06890 | |||
1705.03633 | |||
1902.09506 | |||
visualreasoning | |||
1812.01855 | |||
1811.10830 | |||
r2c | 466 | over 3 years ago | |
VCR | |||
1909.08164 | |||
1909.02701 | |||
VSRN | 289 | over 4 years ago | |
2010.00763 | |||
Bongard-LOGO | 51 | over 2 years ago | |
2205.13803 | |||
Bongard-HOI | 63 | almost 2 years ago | |
2204.11167 | |||
RelViT | 63 | about 2 years ago | |
2307.15199 | |||
PromptStyler | |||
Awesome Vision-and-Language: / Visual Relationship Detection | |||
1608.00187 | |||
Visual-Relationship-Detection | 214 | almost 4 years ago | |
1702.07191 | |||
1702.08319 | |||
drnet | 201 | about 3 years ago | |
1703.03054 | |||
DeepVariationRL | 63 | over 5 years ago | |
1704.03114 | |||
drnet | 201 | about 3 years ago | |
1611.06641 | |||
pl-clc | 39 | about 7 years ago | |
1707.09423 | |||
1803.10362 | |||
ReferringRelationships | 260 | almost 2 years ago | |
1807.04979 | |||
ZoomNet | |||
1808.00171 | |||
vrd | 94 | almost 6 years ago | |
1910.12324 | |||
Awesome Vision-and-Language: / Visual Storytelling | |||
1604.03968 | |||
visual_genome_python_driver | 354 | about 1 year ago | |
VIST | |||
1804.09160 | |||
AREL | 137 | over 3 years ago | |
2002.00774 | |||
AAAI 2020 | |||