BriVL

Vision-Language Bridge

Pre-trains a multilingual model to bridge vision and language modalities for various downstream applications

Bridging Vision and Language Model

279 stars

3 watching

31 forks

Language: Python

Last commit: over 3 years ago

Related projects:

Repository	Description	Stars
yiren-jian/blitext	Develops and trains models for vision-language learning with decoupled language pre-training	24
baaivision/eve	A PyTorch implementation of an encoder-free vision-language model that can be fine-tuned for various tasks and modalities	246
baai-wudao/model	A repository of pre-trained language models for various tasks and domains	121
vishaal27/sus-x	Proposes a method to train large-scale vision-language models with minimal resources and no fine-tuning	94
byungkwanlee/moai	Improves performance of vision language tasks by integrating computer vision capabilities into large language models	314
nvlabs/prismer	A deep learning framework for training multi-modal models with vision and language capabilities.	1,299
zhuiyitechnology/pretrained-models	A collection of pre-trained language models for natural language processing tasks	989
brightmart/xlnet_zh	Trains a large Chinese language model on massive data and provides a pre-trained model for downstream tasks	230
deepseek-ai/deepseek-vl	A multimodal AI model that enables real-world vision-language understanding applications	2,145
pku-yuangroup/languagebind	Extends pretrained models to handle multiple modalities by aligning language and video representations	751
vlf-silkie/vlfeedback	An annotated preference dataset and training framework for improving large vision-language models	88
openai/finetune-transformer-lm	Provides code and a model for improving language understanding through generative pre-training with a transformer-based architecture	2,167
yuxie11/r2d2	A framework for large-scale cross-modal benchmarks and vision-language tasks in Chinese	157
meituan-automl/mobilevlm	An implementation of a vision-language model designed for mobile devices, using a lightweight downsample projector and pre-trained language models	1,076
byungkwanlee/collavo	Develops a PyTorch implementation of an enhanced vision language model	93