LLaVA-RLHF

Reward alignment system

Aligns large multimodal models using factually augmented reward models to improve performance and mitigate reward hacking in reinforcement learning from human feedback

Aligning LMMs with Factually Augmented RLHF

GitHub

328 stars
9 watching
24 forks
Language: Python
Last commit: about 1 year ago
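The core idea behind the project, as described above, is to condition the reward model on additional ground-truth facts (such as image captions) so that hallucinated answers can be penalized rather than rewarded. The sketch below is only an illustration of that idea under stated assumptions; the `Sample` dataclass, `build_reward_input` helper, `factual_reward` function, and `score_fn` parameter are hypothetical names, not the repository's actual API.

```python
# Minimal sketch of factually augmented reward scoring (hypothetical names,
# not the LLaVA-RLHF repository's actual interface). The reward model sees
# ground-truth facts alongside the conversation, so it can score down
# responses that contradict them.
from dataclasses import dataclass
from typing import Callable


@dataclass
class Sample:
    prompt: str    # user question about the image
    response: str  # candidate answer to be scored
    facts: str     # ground-truth context, e.g. the image caption


def build_reward_input(sample: Sample) -> str:
    """Prepend the factual context to the conversation before scoring."""
    return (
        f"Facts about the image: {sample.facts}\n"
        f"User: {sample.prompt}\n"
        f"Assistant: {sample.response}"
    )


def factual_reward(score_fn: Callable[[str], float], sample: Sample) -> float:
    """score_fn is any scalar reward model; here it also sees the facts."""
    return score_fn(build_reward_input(sample))


if __name__ == "__main__":
    # Trivial stand-in reward model for demonstration; real use would call a
    # trained preference model instead.
    sample = Sample(
        prompt="What animal is in the picture?",
        response="A black cat sitting on a chair.",
        facts="A photo of a brown dog lying on the grass.",
    )
    dummy_score = lambda text: -1.0 if "cat" in text and "dog" in text else 1.0
    print(factual_reward(dummy_score, sample))
```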

Related projects:

| Repository | Description | Stars |
| --- | --- | --- |
| rlhf-v/rlhf-v | Aligns large language models' behavior through fine-grained correctional human feedback to improve trustworthiness and accuracy | 245 |
| ethanyanjiali/minchatgpt | Demonstrates the effectiveness of reinforcement learning from human feedback (RLHF) in improving small language models like GPT-2 | 214 |
| tristandeleu/pytorch-maml-rl | Replication of Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks in PyTorch for reinforcement learning tasks | 830 |
| wisconsinaivision/vip-llava | A system designed to enable large multimodal models to understand arbitrary visual prompts | 302 |
| sjtu-marl/malib | A framework for parallel population-based reinforcement learning | 507 |
| llava-vl/llava-interactive-demo | An all-in-one demo for interactive image processing and generation | 353 |
| llava-vl/llava-plus-codebase | A platform for training and deploying large language and vision models that can use tools to perform tasks | 717 |
| tatsu-lab/alpaca_farm | A framework for simulating and evaluating reinforcement learning from human feedback methods | 786 |
| matthiasplappert/keras-rl | A Python library implementing state-of-the-art deep reinforcement learning algorithms for Keras and OpenAI Gym environments | 8 |
| kaixhin/rainbow | A Python implementation of a deep reinforcement learning algorithm combining multiple techniques for improved performance in Atari games | 1,591 |
| salt-nlp/llavar | An open-source project that enhances visual instruction tuning for text-rich image understanding by integrating GPT-4 models with multimodal datasets | 259 |
| aidc-ai/ovis | An MLLM architecture designed to align visual and textual embeddings through structural alignment | 575 |
| iffix/machin | An open-source reinforcement learning library for PyTorch, providing simple and clear implementations of various algorithms | 402 |
| yfzhang114/llava-align | Debiasing techniques to minimize hallucinations in large visual language models | 75 |
| lhfowl/robbing_the_fed | An implementation that allows an attacker to directly obtain user data from federated learning gradient updates by modifying the shared model architecture | 23 |