awesome-vision-language-pretraining-papers

Vision-language papers

A curated list of papers on pre-trained vision and language models for multimodal learning tasks

Recent Advances in Vision and Language PreTrained Models (VL-PTMs)

GitHub

1k stars
53 watching
101 forks
last commit: over 2 years ago
Linked from 1 awesome list

bert, multimodal-deep-learning, pretraining, vision-and-language, vl-ptms

Other Resources / Two recent surveys on pretrained language models

Pre-trained Models for Natural Language Processing: A Survey, arXiv 2020/03
A Survey on Contextual Embeddings, arXiv 2020/03

Other Resources / Other surveys about multimodal research

Trends in Integration of Vision and Language Research: A Survey of Tasks, Datasets, and Methods, JAIR 2021
Deep Multimodal Representation Learning: A Survey, arXiv 2019
Multimodal Machine Learning: A Survey and Taxonomy, TPAMI 2018
A Comprehensive Survey of Deep Learning for Image Captioning, ACM Computing Surveys 2018

Other Resources / Other repositories of relevant reading list

Pre-trained Language Model Papers from THU-NLP 3,328 about 2 years ago
BERT-related Papers 2,035 over 1 year ago
Reading List for Topics in Multimodal Machine Learning 6,094 3 months ago
A repository of vision and language papers 500 17 days ago
