PaLM-rlhf-pytorch
RLHF framework
An implementation of RLHF on top of the PaLM architecture, using human feedback to train large language models with reinforcement learning.
Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the PaLM architecture. Basically ChatGPT but with PaLM
8k stars
143 watching
669 forks
Language: Python
Last commit: 10 months ago
Topics: artificial-intelligence, attention-mechanisms, deep-learning, human-feedback, reinforcement-learning, transformers
Related projects:
Repository | Description | Stars |
---|---|---|
carperai/trlx | A framework for distributed reinforcement learning of large language models with human feedback | 4,502 |
p-christ/deep-reinforcement-learning-algorithms-with-pytorch | PyTorch implementations of popular deep reinforcement learning algorithms and environments. | 5,640 |
lucidrains/imagen-pytorch | Implements Google's Text-to-Image Neural Network in PyTorch using a cascading DDPM architecture with dynamic clipping and noise level conditioning. | 8,088 |
lucidrains/musiclm-pytorch | Implementation of Google's MusicLM model for music generation using attention networks and text-conditioning. | 3,166 |
iffix/machin | An open-source reinforcement learning library for PyTorch, providing a simple and clear implementation of various algorithms. | 401 |
ethanyanjiali/minchatgpt | This project demonstrates the effectiveness of reinforcement learning from human feedback (RLHF) in improving small language models like GPT-2. | 213 |
luchris429/purejaxrl | A high-performance implementation of reinforcement learning training pipelines using JAX and PyTorch-like functionality | 722 |
tristandeleu/pytorch-maml-rl | Replication of Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks in PyTorch for reinforcement learning tasks | 827 |
thu-ml/tianshou | A high-performance reinforcement learning library with modular interfaces and user-friendly APIs for building deep learning agents. | 7,968 |
lucidrains/dalle2-pytorch | An implementation of DALL-E 2's text-to-image synthesis neural network in PyTorch | 11,148 |
freedomintelligence/llmzoo | A platform providing data, models, and evaluation benchmarks for large language models to promote accessibility and democratization of AI technology | 2,934 |
huggingface/alignment-handbook | Provides training recipes and resources to align language models with human preferences | 4,677 |
huggingface/trl | A library designed to train transformer language models with reinforcement learning using various optimization techniques and fine-tuning methods. | 10,053 |
tju-drl-lab/ai-optimizer | A next-generation deep reinforcement learning toolkit with libraries for multiagent, self-supervised, offline, and transfer/reinforcement learning | 4,755 |
xrsrke/instructgoose | A framework for training language models using human feedback and reinforcement learning | 169 |