SuS-X

Training-Free Vision-Language Model Adaptation

SuS-X is an open-source project that adapts pre-trained vision-language models to downstream tasks using only the target category names, with no training or fine-tuning.

Code for the paper: "SuS-X: Training-Free Name-Only Transfer of Vision-Language Models" [ICCV'23]

GitHub

Stars: 94
Watching: 3
Forks: 5
Language: Python
Last commit: about 1 year ago

Related projects:

Repository | Description | Stars
vlf-silkie/vlfeedback | An annotated preference dataset and training framework for improving large vision-language models | 85
deepseek-ai/deepseek-vl | A multimodal AI model for real-world vision-language understanding applications | 2,077
baai-wudao/brivl | Pre-trains a multilingual model to bridge vision and language modalities for various downstream applications | 279
llava-vl/llava-plus-codebase | A platform for training and deploying large language and vision models that can use tools to perform tasks | 704
yiren-jian/blitext | Develops and trains models for vision-language learning with decoupled language pre-training | 24
nvlabs/prismer | A deep learning framework for training multi-modal models with vision and language capabilities | 1,298
shizhediao/davinci | An implementation of generative vision-language models that can be fine-tuned for various multimodal applications | 43
baaivision/eve | A PyTorch implementation of an encoder-free vision-language model that can be fine-tuned for various tasks and modalities | 230
vpgtrans/vpgtrans | Transfers visual prompt generators across large language models to reduce training costs and enable customization of multimodal LLMs | 269
ucsc-vlaa/sight-beyond-text | Official implementation of a research paper exploring how multi-modal training can improve language models' truthfulness and ethics | 19
openai/finetune-transformer-lm | Code and model for improving language understanding through generative pre-training with a transformer-based architecture | 2,160
jshilong/gpt4roi | Training and deploying large language models on computer vision tasks using region-of-interest inputs | 506
csuhan/onellm | A framework for training and fine-tuning multimodal language models on various data types | 588
maxpumperla/elephas | Enables distributed deep learning with Keras and Spark for scalable model training | 1,574
byungkwanlee/collavo | A PyTorch implementation of an enhanced vision-language model | 93