REVO-LION

VLIT model toolkit

A comprehensive dataset and evaluation framework for Vision-Language Instruction Tuning models

REVO-LION: Evaluating and Refining Vision-Language Instruction Tuning Datasets

GitHub: 11 stars, 1 watching, 0 forks (last commit: about 1 year ago)

Related projects:

Repository | Description | Stars
jiutian-vl/jiutian-lion | This project integrates visual knowledge into large language models to improve their capabilities and reduce hallucinations. | 121
ys-zong/vlguard | Improves safety and helpfulness of large language models by fine-tuning them using safety-critical tasks. | 45
jiasenlu/vilbert_beta | A pre-trained model and toolset for performing vision-and-language tasks using a specific neural network architecture. | 474
zhuiyitechnology/pretrained-models | A collection of pre-trained language models for natural language processing tasks. | 987
vhellendoorn/code-lms | A guide to using pre-trained large language models in source code analysis and generation. | 1,782
aidc-ai/parrot | A method and toolkit for fine-tuning large language models to perform visual instruction tasks in multiple languages. | 30
nvlabs/prismer | A deep learning framework for training multi-modal models with vision and language capabilities. | 1,298
vlf-silkie/vlfeedback | An annotated preference dataset and training framework for improving large vision language models. | 85
byungkwanlee/collavo | Develops a PyTorch implementation of an enhanced vision language model. | 93
ymcui/macbert | Improves pre-trained Chinese language models by incorporating a correction task to alleviate inconsistency issues with downstream tasks. | 645
yg-smile/rl_vvc_dataset | A collection of benchmarks and implementations for testing reinforcement learning-based Volt-VAR control algorithms. | 20
flagai-open/aquila2 | Provides pre-trained language models and tools for fine-tuning and evaluation. | 437
yiren-jian/blitext | Develops and trains models for vision-language learning with decoupled language pre-training. | 24
jshilong/gpt4roi | Training and deploying large language models on computer vision tasks using region-of-interest inputs. | 506
deepseek-ai/deepseek-vl | A multimodal AI model that enables real-world vision-language understanding applications. | 2,077