trlx
RL Framework
A framework for distributed fine-tuning of large language models with reinforcement learning from human feedback
A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF)
5k stars
51 watching
472 forks
Language: Python
Last commit: about 1 year ago
Linked from 2 awesome lists
machine-learning, pytorch, reinforcement-learning
Related projects:
Repository | Description | Stars |
---|---|---|
huggingface/trl | A library designed to train transformer language models with reinforcement learning using various optimization techniques and fine-tuning methods. | 10,308 |
lucidrains/palm-rlhf-pytorch | An implementation of RLHF on top of the PaLM architecture to enable human feedback in reinforcement learning for large language models. | 7,729 |
tju-drl-lab/ai-optimizer | A next-generation deep reinforcement learning toolkit with libraries for multiagent, self-supervised, offline, and transfer/reinforcement learning | 4,848 |
google-deepmind/trfl | Provides building blocks for Reinforcement Learning agents using TensorFlow | 3,136 |
paddlepaddle/parl | A high-performance distributed training framework for Reinforcement Learning | 3,296 |
thu-ml/tianshou | A high-performance reinforcement learning library with modular interfaces and user-friendly APIs for building deep learning agents. | 8,069 |
iffix/machin | An open-source reinforcement learning library for PyTorch, providing a simple and clear implementation of various algorithms. | 402 |
p-christ/deep-reinforcement-learning-algorithms-with-pytorch | PyTorch implementations of popular deep reinforcement learning algorithms and environments. | 5,669 |
rle-foundation/rlexplore | Provides a unified toolkit for constructing, computing, and optimizing intrinsic reward modules in reinforcement learning | 373 |
tristandeleu/pytorch-maml-rl | Replication of Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks in PyTorch for reinforcement learning tasks | 830 |
luchris429/purejaxrl | A high-performance implementation of reinforcement learning training pipelines written end-to-end in JAX | 755 |
eleutherai/gpt-neox | Provides a framework for training large-scale language models on GPUs with advanced features and optimizations. | 6,997 |
tatsu-lab/alpaca_farm | A framework for simulating and evaluating reinforcement learning from human feedback methods | 786 |
rlcode/reinforcement-learning | A collection of clean and minimal examples for various reinforcement learning algorithms | 3,433 |
yandexdataschool/practical_rl | An educational resource teaching practical reinforcement learning skills in Python using popular deep learning frameworks. | 5,952 |