VLFeedback

Vision-language model trainer

An annotated preference dataset and training framework for improving large vision-language models.

GitHub

85 stars
2 watching
2 forks
Language: Python
Last commit: 11 months ago

Related projects:

| Repository | Description | Stars |
| --- | --- | --- |
| vishaal27/sus-x | Proposes a method to adapt large-scale vision-language models with minimal resources and no fine-tuning | 94 |
| yiren-jian/blitext | Develops and trains models for vision-language learning with decoupled language pre-training | 24 |
| shizhediao/davinci | An implementation of generative vision-language models for multimodal learning tasks that can be fine-tuned for various applications | 43 |
| nvlabs/prismer | A deep learning framework for training multimodal models with vision and language capabilities | 1,298 |
| vlfeat/vlfeat | A computer vision library providing efficient algorithms for image analysis and feature extraction | 1,596 |
| yuxie11/r2d2 | A framework for large-scale cross-modal benchmarks and vision-language tasks in Chinese | 157 |
| deepseek-ai/deepseek-vl | A multimodal AI model for real-world vision-language understanding applications | 2,077 |
| meituan-automl/mobilevlm | An implementation of a vision-language model designed for mobile devices, using a lightweight downsample projector and pre-trained language models | 1,039 |
| baai-wudao/brivl | Pre-trains a multilingual model to bridge vision and language modalities for various downstream applications | 279 |
| volcengine/vescale | A PyTorch-based framework for training large language models in parallel on multiple devices | 663 |
| liaoning97/revo-lion | A dataset and evaluation framework for vision-language instruction tuning models | 11 |
| chendelong1999/polite-flamingo | Develops training methods to improve the politeness and natural flow of responses from multimodal large language models | 63 |
| csuhan/onellm | A framework for training and fine-tuning multimodal language models on various data types | 588 |
| llava-vl/llava-plus-codebase | A platform for training and deploying large language and vision models that can use tools to perform tasks | 704 |
| lyhue1991/torchkeras | A PyTorch-based model training framework that simplifies training workflows by providing a unified interface for loss functions, optimizers, and validation metrics | 1,782 |