BLIText
Vision-Language Learning Model
Develops and trains models for vision-language learning with decoupled language pre-training
[NeurIPS 2023] Bootstrapping Vision-Language Learning with Decoupled Language Pre-training
24 stars
3 watching
1 fork
Language: Python
Last commit: 12 months ago
Topics: multimodal-deep-learning, vision-language-pretraining, vision-language-transformer
Related projects:
Repository | Description | Stars |
---|---|---|
baai-wudao/brivl | Pre-trains a multilingual model to bridge vision and language modalities for various downstream applications | 279 |
vlf-silkie/vlfeedback | An annotated preference dataset and training framework for improving large vision language models. | 85 |
yifanxu74/libra | An implementation of a decoupled vision system using large language models | 143 |
byungkwanlee/collavo | Develops a PyTorch implementation of an enhanced vision language model | 93 |
pku-yuangroup/languagebind | Extends pretrained models to handle multiple modalities by aligning language and video representations | 723 |
shizhediao/davinci | An implementation of generative vision-language models that can be fine-tuned for various multimodal learning applications | 43 |
zhuiyitechnology/pretrained-models | A collection of pre-trained language models for natural language processing tasks | 987 |
csuhan/onellm | A framework for training and fine-tuning multimodal language models on various data types | 588 |
meituan-automl/mobilevlm | An implementation of a vision language model designed for mobile devices, utilizing a lightweight downsample projector and pre-trained language models. | 1,039 |
ymcui/pert | Develops a pre-trained language model to learn semantic knowledge from permuted text without mask labels | 354 |
byungkwanlee/moai | Improves performance of vision language tasks by integrating computer vision capabilities into large language models | 311 |
nvlabs/prismer | A deep learning framework for training multi-modal models with vision and language capabilities. | 1,298 |
baaivision/eve | A PyTorch implementation of an encoder-free vision-language model that can be fine-tuned for various tasks and modalities | 230 |
brightmart/xlnet_zh | Trains a large Chinese language model on massive data and provides a pre-trained model for downstream tasks | 230 |
jshilong/gpt4roi | Trains and deploys large language models on computer vision tasks using region-of-interest inputs | 506 |