awesome-vision-and-language

Vision and Language Resources

A curated list of resources and datasets for research in vision and language tasks.

A curated list of awesome vision and language resources (still under construction... stay tuned!)

GitHub

510 stars
12 watching
40 forks
last commit: 3 months ago
Linked from 1 awesome list

awesome, awesome-list, multimodal-learning, vision-and-language

Awesome Vision-and-Language: / Survey

1506.06833
1705.09406
1810.04020
1907.09358
Scene-Graph-Survey
1904.09317
ACCESS 2019
1911.03977
1912.11872
2010.09522

Awesome Vision-and-Language: / Dataset

1505.00468
visualqa
1604.03968
ai-visual-storytelling-seq2seq
VIST
1602.07332
visual_genome_python_driver 357 over 1 year ago
visualgenome
1612.06890
1705.08421
AVA
1711.11543
embodiedqa
1711.07280
bringmeaspoon
1902.09506
visualreasoning
1811.10830
r2c 466 over 3 years ago
VCR
1904.03493
2010.00763
Bongard-LOGO 51 over 2 years ago
2205.13803
Bongard-HOI 64 about 2 years ago

Awesome Vision-and-Language: / Image Captioning

1411.4389
1412.2306
1411.4555
show_and_tell.tensorflow 291 about 8 years ago
1502.03044
show-attend-and-tell 907 over 6 years ago
1411.4952
visual-concepts 150 almost 7 years ago
1603.03925
semantic-attention 51 about 8 years ago
1612.01887
AdaptiveAttention 335 about 7 years ago
1612.00563
1611.06607
1704.03899
1611.08002
Semantic_Compositional_Nets 70 almost 7 years ago
CVPR 2017
stylenet 63 about 4 years ago
EMNLP 2018
image-paragraph-captioning 90 over 5 years ago
1803.09845
NeuralBabyTalk 525 almost 6 years ago
1707.07998
1807.03871
1805.08191
1811.10787
unsupervised_captioning 215 almost 2 years ago
1906.02365
CAVP 46 over 5 years ago
1903.05942
1903.12020
1904.01475
1812.02378
SGAE 221 almost 3 years ago
CVPR 2019
1901.02527
1908.06954
2004.03708
2003.00387
asg2cap 200 about 2 years ago
2007.11731
Sub-GC 96 5 months ago
2009.12313
2102.04990

Awesome Vision-and-Language: / Image Retrieval

1511.07067
VisualWord2Vec 19 almost 6 years ago
1812.07119
tirg 300 almost 4 years ago
2105.13868
IAIS 30 over 1 year ago
2203.15867
ImageCoDe 39 11 months ago
2407.15239
2311.17136
UniIR 114 4 months ago
2407.12346
Q-Pert 1 4 months ago

Awesome Vision-and-Language: / Scene Text Recognition

1908.09231
1904.01906
clovaai 3,769 11 months ago

Awesome Vision-and-Language: / Scene Graph

7298990
1602.07332
visual_genome_python_driver 357 over 1 year ago
visualgenome
1701.02426
scene-graph-TF-release 425 almost 6 years ago
1707.09700
MSDN 227 about 5 years ago
1711.06640
neural-motifs 526 over 5 years ago
1802.02598
1811.06410
1804.01622
sg2im 1,302 6 months ago
1808.00191
graph-rcnn.pytorch 732 almost 5 years ago
1904.00560
1909.05379
scene_generation 188 over 1 year ago
1811.10696
sceneGraph_Mem 4 over 5 years ago
1903.02728
ContrastiveLosses4VRD 199 almost 5 years ago
1903.03326
KERN 121 over 2 years ago
1812.01880
VCTree 121 5 months ago
1812.02347
1904.11622
limited-label 54 over 1 year ago
2002.11949
Scene-Graph-Benchmark 1,085 3 months ago
2003.12962
GPS-Net 64 over 4 years ago
2006.09623
2007.08760
het-eccv20 16 over 4 years ago

Awesome Vision-and-Language: / text2image

1605.05396
icml2016 912 over 6 years ago
1612.03242
StackGAN 1,863 over 4 years ago
1711.10485
AttnGAN 1,343 6 months ago
1802.09178
HDGan 150 about 6 years ago
1812.02784
StoryGAN 233 over 2 years ago
1903.05854
1904.01310
1904.01480
1811.09845
GeNeVA 37 over 1 year ago
1909.05379
scene_generation 188 over 1 year ago

Awesome Vision-and-Language: / Video Captioning

1411.4389
1510.07712
1701.03126
1611.08002
CVPR 2017
1804.00100
1812.05634
adv-inf 34 over 5 years ago
1904.03870
DenseVideoCaptioning 149 over 5 years ago
1906.04375
2011.07735
iPerceive

Awesome Vision-and-Language: / Video Question Answering

1512.02902
MovieQA 80 about 8 years ago
1809.01696
TVQA 172 about 2 years ago
2007.08751
ROLL-VideoQA 19 over 4 years ago
2011.07735
iPerceive

Awesome Vision-and-Language: / Video Understanding

1811.08383
temporal-shift-module 2,078 6 months ago
1910.11009

Awesome Vision-and-Language: / Vision and Language Navigation

1711.11543
embodiedqa
1711.07280
bringmeaspoon
fda_pdf
fda_code 13 about 1 year ago
mam_paper

Awesome Vision-and-Language: / Vision-and-Language Pretraining

1908.07490
lxmert 938 about 2 years ago
1904.01766
vilbert 473 about 2 years ago
1907.07804
OmniNet 515 about 4 years ago
1908.06066
Unicoder 89 about 1 year ago
1909.11059
VLP 416 almost 3 years ago
1911.11237
Oscar 1,039 over 1 year ago
2006.09882
swav 2,014 almost 2 years ago
2004.06165
Oscar 1,039 over 1 year ago
2006.16934
ERNIE 6,331 5 months ago
2101.00529
VinVL 350 over 1 year ago
2006.06666
virtex 556 about 1 year ago
2103.00020
2103.05247
universal-computation 245 about 3 years ago
2102.05918
2103.01988
2102.10772
2102.12092
2103.06561
2305.08675

Awesome Vision-and-Language: / Visual Dialog

1611.08669
visdial 228 about 6 years ago
visualdialog
1803.11186
2303.05983
ATVC 7 over 1 year ago

Awesome Vision-and-Language: / Visual Grounding

1611.09978
cmn 67 over 6 years ago
1908.07553
1812.03299
1908.06354
1908.07129
zsgnet 69 over 4 years ago
2203.16518
CoFormer 45 almost 2 years ago

Awesome Vision-and-Language: / Visual Question Answering

1505.00468
visualqa
1606.00061
HieCoAttenVQA 349 over 6 years ago
1606.01847
vqa-mcb 222 over 8 years ago
1511.02274
imageqa-san 108 about 8 years ago
1511.05234
AAAA 25 about 4 years ago
1603.01417
dmn-plus 64 over 6 years ago
1606.01455
nips-mrn-vqa 39 about 8 years ago
1609.05600
1612.00837
1704.05526
1803.08896
PSLQA 56 almost 6 years ago
1707.07998
1708.02711
vqa-winner 164 almost 6 years ago
1810.02358
VQA-Transfer-ExternalData 20 almost 6 years ago
1902.09506
visualreasoning
1904.08920
ICCV 2019
1907.12133
scene-graphs-vqa
2204.11167
RelViT 64 over 2 years ago
2208.01813
TAG 21 about 2 years ago

Awesome Vision-and-Language: / Visual Reasoning

1612.06890
1705.03633
1902.09506
visualreasoning
1812.01855
1811.10830
r2c 466 over 3 years ago
VCR
1909.08164
1909.02701
VSRN 294 about 5 years ago
2010.00763
Bongard-LOGO 51 over 2 years ago
2205.13803
Bongard-HOI 64 about 2 years ago
2204.11167
RelViT 64 over 2 years ago
2307.15199
PromptStyler

Awesome Vision-and-Language: / Visual Relationship Detection

1608.00187
Visual-Relationship-Detection 214 about 4 years ago
1702.07191
1702.08319
drnet 202 over 3 years ago
1703.03054
DeepVariationRL 63 almost 6 years ago
1704.03114
drnet 202 over 3 years ago
1611.06641
pl-clc 39 over 7 years ago
1707.09423
1803.10362
ReferringRelationships 260 about 2 years ago
1807.04979
ZoomNet
1808.00171
vrd 94 over 6 years ago
1910.12324

Awesome Vision-and-Language: / Visual Storytelling

1604.03968
visual_genome_python_driver 357 over 1 year ago
VIST
1804.09160
AREL 136 almost 4 years ago
2002.00774
AAAI 2020
