PaLM-rlhf-pytorch

RLHF framework

An implementation of RLHF on top of the PaLM architecture, for training large language models with reinforcement learning guided by human feedback.

Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the PaLM architecture. Basically ChatGPT but with PaLM.

GitHub

8k stars
143 watching
669 forks
Language: Python
last commit: 10 months ago
Topics: artificial-intelligence, attention-mechanisms, deep-learning, human-feedback, reinforcement-learning, transformers

Related projects:

| Repository | Description | Stars |
| --- | --- | --- |
| carperai/trlx | A framework for distributed reinforcement learning of large language models with human feedback | 4,502 |
| p-christ/deep-reinforcement-learning-algorithms-with-pytorch | PyTorch implementations of popular deep reinforcement learning algorithms and environments. | 5,640 |
| lucidrains/imagen-pytorch | Implements Google's Text-to-Image Neural Network in PyTorch using a cascading DDPM architecture with dynamic clipping and noise level conditioning. | 8,088 |
| lucidrains/musiclm-pytorch | Implementation of Google's MusicLM model for music generation using attention networks and text-conditioning. | 3,166 |
| iffix/machin | An open-source reinforcement learning library for PyTorch, providing a simple and clear implementation of various algorithms. | 401 |
| ethanyanjiali/minchatgpt | This project demonstrates the effectiveness of reinforcement learning from human feedback (RLHF) in improving small language models like GPT-2. | 213 |
| luchris429/purejaxrl | A high-performance implementation of reinforcement learning training pipelines using JAX and PyTorch-like functionality | 722 |
| tristandeleu/pytorch-maml-rl | Replication of Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks in PyTorch for reinforcement learning tasks | 827 |
| thu-ml/tianshou | A high-performance reinforcement learning library with modular interfaces and user-friendly APIs for building deep learning agents. | 7,968 |
| lucidrains/dalle2-pytorch | An implementation of DALL-E 2's text-to-image synthesis neural network in PyTorch | 11,148 |
| freedomintelligence/llmzoo | A platform providing data, models, and evaluation benchmarks for large language models to promote accessibility and democratization of AI technology | 2,934 |
| huggingface/alignment-handbook | Provides training recipes and resources to align language models with human preferences | 4,677 |
| huggingface/trl | A library designed to train transformer language models with reinforcement learning using various optimization techniques and fine-tuning methods. | 10,053 |
| tju-drl-lab/ai-optimizer | A next-generation deep reinforcement learning toolkit with libraries for multiagent, self-supervised, offline, and transfer/reinforcement learning | 4,755 |
| xrsrke/instructgoose | A framework for training language models using human feedback and reinforcement learning | 169 |