awesome-vision-and-language

Vision and Language Resources

A curated list of resources and datasets for research in vision and language tasks.

A curated list of awesome vision and language resources (still under construction... stay tuned!)

GitHub

500 stars
12 watching
40 forks
last commit: 20 days ago
Linked from 1 awesome list

awesome, awesome-list, multimodal-learning, vision-and-language

Awesome Vision-and-Language: / Survey

1506.06833
1705.09406
1810.04020
1907.09358
Scene-Graph-Survey
1904.09317
ACCESS 2019
1911.03977
1912.11872
2010.09522

Awesome Vision-and-Language: / Dataset

1505.00468
visualqa
1604.03968
ai-visual-storytelling-seq2seq
VIST
1602.07332
visual_genome_python_driver 357 about 1 year ago
visualgenome
1612.06890
1705.08421
AVA
1711.11543
embodiedqa
1711.07280
bringmeaspoon
1902.09506
visualreasoning
1811.10830
r2c 466 over 3 years ago
VCR
1904.03493
2010.00763
Bongard-LOGO 51 over 2 years ago
2205.13803
Bongard-HOI 64 about 2 years ago

Awesome Vision-and-Language: / Image Captioning

1411.4389
1412.2306
1411.4555
show_and_tell.tensorflow 290 about 8 years ago
1502.03044
show-attend-and-tell 907 over 6 years ago
1411.4952
visual-concepts 151 over 6 years ago
1603.03925
semantic-attention 51 about 8 years ago
1612.01887
AdaptiveAttention 334 almost 7 years ago
1612.00563
1611.06607
1704.03899
1611.08002
Semantic_Compositional_Nets 70 over 6 years ago
CVPR 2017
stylenet 63 almost 4 years ago
EMNLP 2018
image-paragraph-captioning 90 about 5 years ago
1803.09845
NeuralBabyTalk 524 over 5 years ago
1707.07998
1807.03871
1805.08191
1811.10787
unsupervised_captioning 215 over 1 year ago
1906.02365
CAVP 47 over 5 years ago
1903.05942
1903.12020
1904.01475
1812.02378
SGAE 220 over 2 years ago
CVPR 2019
1901.02527
1908.06954
2004.03708
2003.00387
asg2cap 200 almost 2 years ago
2007.11731
Sub-GC 96 3 months ago
2009.12313
2102.04990

Awesome Vision-and-Language: / Image Retrieval

1511.07067
VisualWord2Vec 19 over 5 years ago
1812.07119
tirg 298 over 3 years ago
2105.13868
IAIS 30 over 1 year ago
2203.15867
ImageCoDe 39 9 months ago
2407.15239
2311.17136
UniIR 110 about 2 months ago
2407.12346
Q-Pert 1 2 months ago

Awesome Vision-and-Language: / Scene Text Recognition

1908.09231
1904.01906
clovaai 3,755 9 months ago

Awesome Vision-and-Language: / Scene Graph

7298990
1602.07332
visual_genome_python_driver 357 about 1 year ago
visualgenome
1701.02426
scene-graph-TF-release 425 over 5 years ago
1707.09700
MSDN 227 about 5 years ago
1711.06640
neural-motifs 525 over 5 years ago
1802.02598
1811.06410
1804.01622
sg2im 1,300 4 months ago
1808.00191
graph-rcnn.pytorch 733 over 4 years ago
1904.00560
1909.05379
scene_generation 187 about 1 year ago
1811.10696
sceneGraph_Mem 4 over 5 years ago
1903.02728
ContrastiveLosses4VRD 200 over 4 years ago
1903.03326
KERN 120 over 2 years ago
1812.01880
VCTree 121 3 months ago
1812.02347
1904.11622
limited-label 54 about 1 year ago
2002.11949
Scene-Graph-Benchmark 1,075 30 days ago
2003.12962
GPS-Net 63 over 4 years ago
2006.09623
2007.08760
het-eccv20 16 over 4 years ago

Awesome Vision-and-Language: / text2image

1605.05396
icml2016 913 about 6 years ago
1612.03242
StackGAN 1,860 over 4 years ago
1711.10485
AttnGAN 1,339 4 months ago
1802.09178
HDGan 150 about 6 years ago
1812.02784
StoryGAN 233 over 2 years ago
1903.05854
1904.01310
1904.01480
1811.09845
GeNeVA 37 over 1 year ago
1909.05379
scene_generation 187 about 1 year ago

Awesome Vision-and-Language: / Video Captioning

1411.4389
1510.07712
1701.03126
1611.08002
CVPR 2017
1804.00100
1812.05634
adv-inf 34 over 5 years ago
1904.03870
DenseVideoCaptioning 148 over 5 years ago
1906.04375
2011.07735
iPerceive

Awesome Vision-and-Language: / Video Question Answering

1512.02902
MovieQA 80 almost 8 years ago
1809.01696
TVQA 172 about 2 years ago
2007.08751
ROLL-VideoQA 19 about 4 years ago
2011.07735
iPerceive

Awesome Vision-and-Language: / Video Understanding

1811.08383
temporal-shift-module 2,068 5 months ago
1910.11009

Awesome Vision-and-Language: / Vision and Language Navigation

1711.11543
embodiedqa
1711.07280
bringmeaspoon
fda_pdf
fda_code 13 11 months ago
mam_paper

Awesome Vision-and-Language: / Vision-and-Language Pretraining

1908.07490
lxmert 936 about 2 years ago
1904.01766
vilbert 474 about 2 years ago
1907.07804
OmniNet 512 about 4 years ago
1908.06066
Unicoder 88 12 months ago
1909.11059
VLP 412 almost 3 years ago
1911.11237
Oscar 1,038 about 1 year ago
2006.09882
swav 2,005 over 1 year ago
2004.06165
Oscar 1,038 about 1 year ago
2006.16934
ERNIE 6,318 3 months ago
2101.00529
VinVL 350 over 1 year ago
2006.06666
virtex 557 11 months ago
2103.00020
2103.05247
universal-computation 245 almost 3 years ago
2102.05918
2103.01988
2102.10772
2102.12092
2103.06561
2305.08675

Awesome Vision-and-Language: / Visual Dialog

1611.08669
visdial 228 almost 6 years ago
visualdialog
1803.11186
2303.05983
ATVC 7 over 1 year ago

Awesome Vision-and-Language: / Visual Grounding

1611.09978
cmn 67 about 6 years ago
1908.07553
1812.03299
1908.06354
1908.07129
zsgnet 69 over 4 years ago
2203.16518
CoFormer 43 over 1 year ago

Awesome Vision-and-Language: / Visual Question Answering

1505.00468
visualqa
1606.00061
HieCoAttenVQA 349 about 6 years ago
1606.01847
vqa-mcb 222 over 8 years ago
1511.02274
imageqa-san 107 almost 8 years ago
1511.05234
AAAA 25 about 4 years ago
1603.01417
dmn-plus 64 over 6 years ago
1606.01455
nips-mrn-vqa 39 almost 8 years ago
1609.05600
1612.00837
1704.05526
1803.08896
PSLQA 56 almost 6 years ago
1707.07998
1708.02711
vqa-winner 164 almost 6 years ago
1810.02358
VQA-Transfer-ExternalData 20 over 5 years ago
1902.09506
visualreasoning
1904.08920
ICCV 2019
1907.12133
scene-graphs-vqa
2204.11167
RelViT 64 about 2 years ago
2208.01813
TAG 21 almost 2 years ago

Awesome Vision-and-Language: / Visual Reasoning

1612.06890
1705.03633
1902.09506
visualreasoning
1812.01855
1811.10830
r2c 466 over 3 years ago
VCR
1909.08164
1909.02701
VSRN 294 almost 5 years ago
2010.00763
Bongard-LOGO 51 over 2 years ago
2205.13803
Bongard-HOI 64 about 2 years ago
2204.11167
RelViT 64 about 2 years ago
2307.15199
PromptStyler

Awesome Vision-and-Language: / Visual Relationship Detection

1608.00187
Visual-Relationship-Detection 214 about 4 years ago
1702.07191
1702.08319
drnet 202 about 3 years ago
1703.03054
DeepVariationRL 63 almost 6 years ago
1704.03114
drnet 202 about 3 years ago
1611.06641
pl-clc 39 over 7 years ago
1707.09423
1803.10362
ReferringRelationships 260 about 2 years ago
1807.04979
ZoomNet
1808.00171
vrd 94 about 6 years ago
1910.12324

Awesome Vision-and-Language: / Visual Storytelling

1604.03968
visual_genome_python_driver 357 about 1 year ago
VIST
1804.09160
AREL 137 almost 4 years ago
2002.00774
AAAI 2020
