LLaVA-RLHF
Reward alignment system
Aligns large multimodal models using reward models augmented with factual information, to improve performance and mitigate reward hacking in reinforcement learning from human feedback (RLHF)
Aligning LMMs with Factually Augmented RLHF
319 stars
9 watching
25 forks
Language: Python
Last commit: about 1 year ago

Related projects:
Repository | Description | Stars |
---|---|---|
rlhf-v/rlhf-v | Aligns the behavior of multimodal large language models through fine-grained correctional human feedback to improve trustworthiness and accuracy. | 233 |
ethanyanjiali/minchatgpt | This project demonstrates the effectiveness of reinforcement learning from human feedback (RLHF) in improving small language models like GPT-2. | 213 |
tristandeleu/pytorch-maml-rl | A PyTorch replication of "Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks" (MAML), applied to reinforcement learning tasks | 827 |
wisconsinaivision/vip-llava | A system designed to enable large multimodal models to understand arbitrary visual prompts | 294 |
sjtu-marl/malib | A framework for parallel population-based reinforcement learning | 497 |
llava-vl/llava-interactive-demo | An all-in-one demo for interactive image processing and generation | 351 |
llava-vl/llava-plus-codebase | A platform for training and deploying large language and vision models that can use tools to perform tasks | 704 |
tatsu-lab/alpaca_farm | A framework for simulating and evaluating reinforcement learning from human feedback methods | 782 |
matthiasplappert/keras-rl | A Python library implementing state-of-the-art deep reinforcement learning algorithms for Keras and OpenAI Gym environments. | 7 |
kaixhin/rainbow | A Python implementation of Rainbow, a deep reinforcement learning algorithm that combines multiple techniques for improved performance in Atari games | 1,585 |
salt-nlp/llavar | An open-source project that enhances visual instruction tuning for text-rich image understanding by integrating GPT-4 models with multimodal datasets. | 258 |
aidc-ai/ovis | An architecture designed to align visual and textual embeddings in multimodal learning | 517 |
iffix/machin | An open-source reinforcement learning library for PyTorch, providing a simple and clear implementation of various algorithms. | 401 |
yfzhang114/llava-align | Debiasing techniques to minimize hallucinations in large visual language models | 71 |
lhfowl/robbing_the_fed | This implementation allows an attacker to directly obtain user data from federated learning gradient updates by modifying the shared model architecture. | 23 |