VLFeedback
Vision-language model trainer
An annotated preference dataset and training framework for improving large vision-language models.
85 stars
2 watching
2 forks
Language: Python
Last commit: 11 months ago
Related projects:
Repository | Description | Stars |
---|---|---|
vishaal27/sus-x | Proposes a method to train large-scale vision-language models with minimal resources and no fine-tuning required. | 94 |
yiren-jian/blitext | Develops and trains models for vision-language learning with decoupled language pre-training | 24 |
shizhediao/davinci | An implementation of vision-language models for multimodal learning tasks, enabling generative vision-language models to be fine-tuned for various applications. | 43 |
nvlabs/prismer | A deep learning framework for training multi-modal models with vision and language capabilities. | 1,298 |
vlfeat/vlfeat | A comprehensive computer vision library providing efficient algorithms for image analysis and feature extraction | 1,596 |
yuxie11/r2d2 | A framework for large-scale cross-modal benchmarks and vision-language tasks in Chinese | 157 |
deepseek-ai/deepseek-vl | A multimodal AI model that enables real-world vision-language understanding applications | 2,077 |
meituan-automl/mobilevlm | An implementation of a vision language model designed for mobile devices, utilizing a lightweight downsample projector and pre-trained language models. | 1,039 |
baai-wudao/brivl | Pre-trains a multilingual model to bridge vision and language modalities for various downstream applications | 279 |
volcengine/vescale | A PyTorch-based framework for training large language models in parallel on multiple devices | 663 |
liaoning97/revo-lion | A comprehensive dataset and evaluation framework for Vision-Language Instruction Tuning models | 11 |
chendelong1999/polite-flamingo | Develops training methods to improve the politeness and natural flow of multimodal large language models | 63 |
csuhan/onellm | A framework for training and fine-tuning multimodal language models on various data types | 588 |
llava-vl/llava-plus-codebase | A platform for training and deploying large language and vision models that can use tools to perform tasks | 704 |
lyhue1991/torchkeras | A PyTorch-based model training framework designed to simplify and streamline training workflows by providing a unified interface for various loss functions, optimizers, and validation metrics. | 1,782 |