DaVinci

Multimodal learner

An implementation of generative vision-language models that can be fine-tuned for a range of multimodal learning tasks.

Source code for the paper "Prefix Language Models are Unified Modal Learners"

GitHub

Stars: 43
Watching: 10
Forks: 3
Language: Jupyter Notebook
Last commit: over 1 year ago

Related projects:

Repository | Description | Stars
nvlabs/prismer | A deep learning framework for training multimodal models with vision and language capabilities. | 1,298
kaiyangzhou/dassl.pytorch | A PyTorch toolbox for research on domain adaptation, domain generalization, and semi-supervised learning in computer vision. | 1,217
hxyou/idealgpt | A deep learning framework for iteratively decomposing vision and language reasoning via large language models. | 32
yiren-jian/blitext | Develops and trains models for vision-language learning with decoupled language pre-training. | 24
vlf-silkie/vlfeedback | An annotated preference dataset and training framework for improving large vision-language models. | 85
yuxie11/r2d2 | A framework for large-scale cross-modal benchmarks and vision-language tasks in Chinese. | 157
byungkwanlee/collavo | A PyTorch implementation of an enhanced vision-language model. | 93
zhourax/vega | A multimodal task and dataset for assessing how well vision-language models handle interleaved image-text inputs. | 33
donnyyou/pytorchcv | A PyTorch-based framework for building and training deep learning models in computer vision. | 47
anuragranj/back2future.pytorch | A PyTorch implementation of unsupervised learning for multi-frame optical flow with occlusions. | 111
woozzu/dong_iccv_2017 | A PyTorch implementation of semantic image synthesis via adversarial learning. | 145
baaivision/eve | A PyTorch implementation of an encoder-free vision-language model that can be fine-tuned for various tasks and modalities. | 230
yassouali/pytorch-segmentation | PyTorch implementations of semantic segmentation models and datasets. | 1,686
zijundeng/pytorch-semantic-segmentation | PyTorch implementations of various models and pipelines for semantic segmentation. | 1,724
vishaal27/sus-x | An open-source project proposing a method to train large-scale vision-language models with minimal resources and no fine-tuning. | 94