awesome-rl
RL toolkit
A curated collection of resources and tools for reinforcement learning
9k stars
440 watching
2k forks
last commit: over 2 years ago
Linked from 3 awesome lists
Awesome Reinforcement Learning / Codes / Codes for examples and exercises in Richard Sutton and Andrew Barto's Reinforcement Learning: An Introduction | |||
| Python Code | 13,685 | over 1 year ago | |
| MATLAB Code (BROKEN LINK) | |||
| C/Lisp Code | |||
| Julia Code | 309 | over 1 year ago | |
| Book | |||
| Exercise Solutions | 2,049 | over 1 year ago | |
Awesome Reinforcement Learning / Codes / Simulation code for Reinforcement Learning Control Problems | |||
| Pole-Cart Problem | |||
| Q-learning Controller | |||
Awesome Reinforcement Learning / Codes | |||
| MATLAB Environment and GUI for Reinforcement Learning | |||
| Reinforcement Learning Repository - University of Massachusetts, Amherst | |||
| Brown-UMBC Reinforcement Learning and Planning Library (Java) | |||
| Reinforcement Learning in R (MDP, Value Iteration) | |||
| Reinforcement Learning Environment in Python and MATLAB | |||
| RL-Glue | (standard interface for RL) and | ||
| PyBrain Library | Python-Based Reinforcement learning, Artificial intelligence, and Neural network | ||
| RLPy Framework | Value-Function-Based Reinforcement Learning Framework for Education and Research | ||
| Maja | Machine learning framework for problems in Reinforcement Learning in Python | ||
| TeachingBox | Java-based Reinforcement Learning framework | ||
| Policy Gradient Reinforcement Learning Toolbox for MATLAB | |||
| PIQLE | Platform Implementing Q-Learning and other RL algorithms | ||
| BeliefBox | Bayesian reinforcement learning library and toolkit | ||
| Deep Q-Learning with TensorFlow | 1,170 | over 8 years ago | A deep Q-learning demonstration using Google TensorFlow |
| Atari | 265 | almost 8 years ago | Deep Q-networks and asynchronous agents in Torch |
| AgentNet | 301 | about 8 years ago | A Python library for deep reinforcement learning and custom recurrent networks using Theano+Lasagne |
| Reinforcement Learning Examples by RLCode | 3,433 | over 2 years ago | A Collection of minimal and clean reinforcement learning examples |
| OpenAI Baselines | 15,885 | over 1 year ago | Well-tested implementations of reinforcement learning algorithms from OpenAI |
| PyTorch Deep RL | 3,209 | over 1 year ago | Popular deep RL algorithm implementations with PyTorch |
| ChainerRL | 1,179 | over 4 years ago | Popular deep RL algorithm implementations with Chainer |
| Black-DROPS | 64 | almost 4 years ago | Modular and generic code for the model-based policy search Black-DROPS algorithm (IROS 2017 paper) and easy integration with the simulator |
| Gold | 345 | about 5 years ago | A reinforcement learning library for Golang |
| Jumanji | 657 | 11 months ago | A Suite of Industry-Driven Hardware-Accelerated RL Environments written in JAX |
Awesome Reinforcement Learning / Theory / Lectures | |||
| Reinforcement Learning Lecture Series 2021 | [DeepMind x UCL] | ||
| COMPM050/COMPGI13 Reinforcement Learning | [UCL] by David Silver | ||
| COMPMI22/COMPGI22 - Advanced Deep Learning and Reinforcement Learning | 820 | over 6 years ago | [UCL] |
Awesome Reinforcement Learning / Theory / Lectures / [UC Berkeley] CS188 Artificial Intelligence by Pieter Abbeel | |||
| Lecture 8: Markov Decision Processes 1 | |||
| Lecture 9: Markov Decision Processes 2 | |||
| Lecture 10: Reinforcement Learning 1 | |||
| Lecture 11: Reinforcement Learning 2 | |||
Awesome Reinforcement Learning / Theory / Lectures | |||
| CS7642 Reinforcement Learning | [Udacity (Georgia Tech.)] | ||
| CS229 Machine Learning - Lecture 16: Reinforcement Learning | [Stanford] by Andrew Ng | ||
| Deep RL Bootcamp | [UC Berkeley] | ||
| CS294 Deep Reinforcement Learning | [UC Berkeley] by John Schulman and Pieter Abbeel | ||
| 10703: Deep Reinforcement Learning and Control, Spring 2017 | [CMU] | ||
| 6.S094: Deep Learning for Self-Driving Cars | [MIT] | ||
Awesome Reinforcement Learning / Theory / Lectures / 6.S094: Deep Learning for Self-Driving Cars | |||
| Lecture 2: Deep Reinforcement Learning for Motion Planning | |||
Awesome Reinforcement Learning / Theory / Lectures / [Siraj Raval]: Introduction to AI for Video Games (Reinforcement Learning Video Series) | |||
| Introduction to AI for video games | |||
| Monte Carlo Prediction | |||
| Q learning explained | |||
| Solving the basic game of Pong | |||
| Actor Critic Algorithms | |||
| War Robots | |||
Awesome Reinforcement Learning / Theory / Lectures | |||
| Reinforcement Learning Fundamentals | [Mutual Information] | ||
Awesome Reinforcement Learning / Theory / Lectures / Reinforcement Learning Fundamentals | |||
| Reinforcement Learning: A Six Part Series | |||
| The Bellman Equations, Dynamic Programming, and Generalized Policy Iteration | |||
| Monte Carlo And Off-Policy Methods | |||
| TD Learning, Sarsa, and Q-Learning | |||
Awesome Reinforcement Learning / Theory / Books | |||
| [Book] | Richard Sutton and Andrew Barto, Reinforcement Learning: An Introduction (1st Edition, 1998) | ||
| [Book] | Richard Sutton and Andrew Barto, Reinforcement Learning: An Introduction (2nd Edition, in progress, 2018) | ||
| [Book] | Csaba Szepesvari, Algorithms for Reinforcement Learning | ||
| [Book Chapter] | David Poole and Alan Mackworth, Artificial Intelligence: Foundations of Computational Agents | ||
| [Book (Amazon)] | Dimitri P. Bertsekas and John N. Tsitsiklis, Neuro-Dynamic Programming | ||
| [Book (Amazon)] | Mykel J. Kochenderfer, Decision Making Under Uncertainty: Theory and Application | ||
| [Book (Manning)] | Deep Reinforcement Learning in Action | ||
| [Book, Video Lectures, and Course Material, 2019] | Dimitri P. Bertsekas, Reinforcement Learning and Optimal Control | ||
Awesome Reinforcement Learning / Theory / Surveys | |||
| [Paper] | Leslie Pack Kaelbling, Michael L. Littman, Andrew W. Moore, Reinforcement Learning: A Survey (JAIR 1996) | ||
| [Paper] | S. S. Keerthi and B. Ravindran, A Tutorial Survey of Reinforcement Learning (Sadhana 1994) | ||
| [Paper] | Matthew E. Taylor, Peter Stone, Transfer Learning for Reinforcement Learning Domains: A Survey (JMLR 2009) | ||
| [Paper] | Jens Kober, J. Andrew Bagnell, Jan Peters, Reinforcement Learning in Robotics, A Survey (IJRR 2013) | ||
| [Paper] | Michael L. Littman, Reinforcement learning improves behaviour from evaluative feedback (Nature 2015) | ||
| [Book] | Marc P. Deisenroth, Gerhard Neumann, Jan Peters, A Survey on Policy Search for Robotics, Foundations and Trends in Robotics (2014) | ||
| [DOI] | Kai Arulkumaran, Marc Peter Deisenroth, Miles Brundage, Anil Anthony Bharath, A Brief Survey of Deep Reinforcement Learning (IEEE Signal Processing Magazine 2017) | ||
| [DOI] | Benjamin Recht, A Tour of Reinforcement Learning: The View from Continuous Control (Annu. Rev. Control Robot. Auton. Syst. 2019) | ||
Awesome Reinforcement Learning / Theory / Papers / Thesis | |||
| [DOI] | Marvin Minsky, Steps toward Artificial Intelligence, Proceedings of the IRE, 1961. (discusses issues in RL such as the "credit assignment problem") | ||
| [DOI] | Ian H. Witten, An Adaptive Optimal Controller for Discrete-Time Markov Environments, Information and Control, 1977. (earliest publication on temporal-difference (TD) learning rule) | ||
Awesome Reinforcement Learning / Theory / Papers / Thesis / Dynamic Programming (DP): | |||
| [Thesis] | Christopher J. C. H. Watkins, Learning from Delayed Rewards, Ph.D. Thesis, Cambridge University, 1989 | ||
Awesome Reinforcement Learning / Theory / Papers / Thesis / Monte Carlo: | |||
| [Paper] | Andrew Barto, Michael Duff, Monte Carlo Inversion and Reinforcement Learning, NIPS, 1994 | ||
| [Paper] | Satinder P. Singh, Richard S. Sutton, Reinforcement Learning with Replacing Eligibility Traces, Machine Learning, 1996 | ||
Awesome Reinforcement Learning / Theory / Papers / Thesis / Temporal-Difference: | |||
| [Paper] | Richard S. Sutton, Learning to predict by the methods of temporal differences. Machine Learning 3: 9-44, 1988 | ||
Awesome Reinforcement Learning / Theory / Papers / Thesis / Q-Learning (Off-policy TD algorithm): | |||
| [Thesis] | Chris Watkins, Learning from Delayed Rewards, Cambridge, 1989 | ||
Awesome Reinforcement Learning / Theory / Papers / Thesis / Sarsa (On-policy TD algorithm): | |||
| [Report] | G.A. Rummery, M. Niranjan, On-line Q-learning using connectionist systems, Technical Report, Cambridge Univ., 1994 | ||
| [Paper] | Richard S. Sutton, Generalization in Reinforcement Learning: Successful examples using sparse coding, NIPS, 1996 | ||
Awesome Reinforcement Learning / Theory / Papers / Thesis / R-Learning (learning of relative values) | |||
| [Paper-Google Scholar] | Andrew Schwartz, A Reinforcement Learning Method for Maximizing Undiscounted Rewards, ICML, 1993 | ||
Awesome Reinforcement Learning / Theory / Papers / Thesis / Function Approximation methods (Least-Square Temporal Difference, Least-Square Policy Iteration) | |||
| [Paper] | Steven J. Bradtke, Andrew G. Barto, Linear Least-Squares Algorithms for Temporal Difference Learning, Machine Learning, 1996 | ||
| [Paper] | Michail G. Lagoudakis, Ronald Parr, Model-Free Least Squares Policy Iteration, NIPS, 2001 | ||
Awesome Reinforcement Learning / Theory / Papers / Thesis / Policy Search / Policy Gradient | |||
| [Paper] | Richard Sutton, David McAllester, Satinder Singh, Yishay Mansour, Policy Gradient Methods for Reinforcement Learning with Function Approximation, NIPS, 1999 | ||
| [Paper] | Jan Peters, Sethu Vijayakumar, Stefan Schaal, Natural Actor-Critic, ECML, 2005 | ||
| [Paper] | Jens Kober, Jan Peters, Policy Search for Motor Primitives in Robotics, NIPS, 2009 | ||
| [Paper] | Jan Peters, Katharina Mulling, Yasemin Altun, Relative Entropy Policy Search, AAAI, 2010 | ||
| [Paper] | Freek Stulp, Olivier Sigaud, Path Integral Policy Improvement with Covariance Matrix Adaptation, ICML, 2012 | ||
| [Paper] | Nate Kohl, Peter Stone, Policy Gradient Reinforcement Learning for Fast Quadrupedal Locomotion, ICRA, 2004 | ||
| [Paper] | Marc Deisenroth, Carl Rasmussen, PILCO: A Model-Based and Data-Efficient Approach to Policy Search, ICML, 2011 | ||
| [Paper] | Scott Kuindersma, Roderic Grupen, Andrew Barto, Learning Dynamic Arm Motions for Postural Recovery, Humanoids, 2011 | ||
| [Paper] | Konstantinos Chatzilygeroudis, Roberto Rama, Rituraj Kaushik, Dorian Goepp, Vassilis Vassiliades, Jean-Baptiste Mouret, Black-Box Data-efficient Policy Search for Robotics, IROS, 2017 | ||
Awesome Reinforcement Learning / Theory / Papers / Thesis / Hierarchical RL | |||
| [Paper] | Richard Sutton, Doina Precup, Satinder Singh, Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning, Artificial Intelligence, 1999 | ||
| [Paper] | George Konidaris, Andrew Barto, Building Portable Options: Skill Transfer in Reinforcement Learning, IJCAI, 2007 | ||
Awesome Reinforcement Learning / Theory / Papers / Thesis / Deep Learning + Reinforcement Learning (A sample of recent works on DL+RL) | |||
| [Paper] | V. Mnih, et al., Human-level Control through Deep Reinforcement Learning, Nature, 2015 | ||
| [Paper] | Xiaoxiao Guo, Satinder Singh, Honglak Lee, Richard Lewis, Xiaoshi Wang, Deep Learning for Real-Time Atari Game Play Using Offline Monte-Carlo Tree Search Planning, NIPS, 2014 | ||
| [ArXiv] | Sergey Levine, Chelsea Finn, Trevor Darrell, Pieter Abbeel, End-to-End Training of Deep Visuomotor Policies, ArXiv, 16 Oct 2015 | ||
| [ArXiv] | Tom Schaul, John Quan, Ioannis Antonoglou, David Silver, Prioritized Experience Replay, ArXiv, 18 Nov 2015 | ||
| [ArXiv] | Hado van Hasselt, Arthur Guez, David Silver, Deep Reinforcement Learning with Double Q-Learning, ArXiv, 22 Sep 2015 | ||
| [ArXiv] | Volodymyr Mnih, Adrià Puigdomènech Badia, Mehdi Mirza, Alex Graves, Timothy P. Lillicrap, Tim Harley, David Silver, Koray Kavukcuoglu, Asynchronous Methods for Deep Reinforcement Learning, ArXiv, 4 Feb 2016 | ||
Awesome Reinforcement Learning / Applications / Game Playing | |||
| [Paper] | Backgammon - Gerald Tesauro, "TD-Gammon" game play using TD(λ) (ACM 1995) | ||
| [arXiv] | Chess - Jonathan Baxter, Andrew Tridgell and Lex Weaver, "KnightCap" program using TD(λ) (1999) | ||
| [arXiv] | Chess - Matthew Lai, Giraffe: Using deep reinforcement learning to play chess (2015) | ||
| [DOI] | Atari 2600 Games - Volodymyr Mnih, Koray Kavukcuoglu, David Silver et al., Human-level Control through Deep Reinforcement Learning (Nature 2015) | ||
| Flappy Bird Reinforcement Learning | 918 | almost 8 years ago | Flappy Bird - Sarvagya Vaish |
| [Paper] | Mario - Kenneth O. Stanley and Risto Miikkulainen, MarI/O - learning to play Mario with evolutionary reinforcement learning using artificial neural networks (Evolutionary Computation 2002) | ||
| [DOI] | StarCraft II - Oriol Vinyals, Igor Babuschkin, Wojciech M. Czarnecki et al., Grandmaster level in StarCraft II using multi-agent reinforcement learning (Nature 2019) | ||
Awesome Reinforcement Learning / Applications / Robotics | |||
| [Paper] | Nate Kohl and Peter Stone, Policy Gradient Reinforcement Learning for Fast Quadrupedal Locomotion (ICRA 2004) | ||
| [Paper] | Petar Kormushev, Sylvain Calinon and Darwin G. Caldwell, Robot Motor Skill Coordination with EM-based Reinforcement Learning (IROS 2010) | ||
| [Paper] | Todd Hester, Michael Quinlan, and Peter Stone, Generalized Model Learning for Reinforcement Learning on a Humanoid Robot (ICRA 2010) | ||
| [Paper] | George Konidaris, Scott Kuindersma, Roderic Grupen and Andrew Barto, Autonomous Skill Acquisition on a Mobile Manipulator (AAAI 2011) | ||
| [Paper] | Marc Peter Deisenroth and Carl Edward Rasmussen, PILCO: A Model-Based and Data-Efficient Approach to Policy Search (ICML 2011) | ||
| [Paper] | Scott Niekum, Sachin Chitta, Bhaskara Marthi, et al., Incremental Semantically Grounded Learning from Demonstration (RSS 2013) | ||
| [Paper] | Mark Cutler and Jonathan P. How, Efficient Reinforcement Learning for Robots using Informative Simulated Priors (ICRA 2015) | ||
| [ArXiv] | Antoine Cully, Jeff Clune, Danesh Tarapore and Jean-Baptiste Mouret, Robots that can adapt like animals (Nature 2015) | ||
| [ArXiv] | Konstantinos Chatzilygeroudis, Roberto Rama, Rituraj Kaushik et al., Black-Box Data-efficient Policy Search for Robotics (IROS 2017) | ||
| [DOI] | P. Travis Jardine, Michael Kogan, Sidney N. Givigi and Shahram Yousefi, Adaptive predictive control of a differential drive robot tuned with reinforcement learning (Int J Adapt Control Signal Process 2019) | ||
Awesome Reinforcement Learning / Applications / Control | |||
| [Paper] | Pieter Abbeel, Adam Coates, et al., An Application of Reinforcement Learning to Aerobatic Helicopter Flight (NIPS 2006) | ||
| [Paper] | J. Andrew Bagnell and Jeff G. Schneider, Autonomous helicopter control using Reinforcement Learning Policy Search Methods (ICRA 2001) | ||
Awesome Reinforcement Learning / Applications / Operations Research | |||
| [Paper] | Scott Proper and Prasad Tadepalli, Scaling Average-reward Reinforcement Learning for Product Delivery (AAAI 2004) | ||
| [Paper] | Naoki Abe, Naval Verma et al., Cross Channel Optimized Marketing by Reinforcement Learning (KDD 2004) | ||
| [DOI] | Bernd Waschneck, Andre Reichstaller, Lenz Belzner et al., Deep reinforcement learning for semiconductor production scheduling (ASMC 2018) | ||
Awesome Reinforcement Learning / Applications / Human Computer Interaction | |||
| [Paper] | Satinder Singh, Diane Litman et al., Optimizing Dialogue Management with Reinforcement Learning: Experiments with the NJFun System (JAIR 2002) | ||
Awesome Reinforcement Learning / Codes / Book | |||
| Python Code | 13,685 | over 1 year ago | (2nd Edition) |
| MATLAB Code | (1st Edition) | ||
Awesome Reinforcement Learning / Tutorials / Websites | |||
| Reinforcement Learning: A Tutorial | Mance Harmon and Stephanie Harmon | ||
| [Paper] | C. Igel, M.A. Riedmiller, et al., Reinforcement Learning in a Nutshell, ESANN, 2007 | ||
| Reinforcement Learning | UNSW | ||
Awesome Reinforcement Learning / Tutorials / Websites / Reinforcement Learning | |||
| Introduction | |||
| TD-Learning | |||
| Q-Learning and SARSA | |||
| Applet for "Cat and Mouse" Game | |||
Awesome Reinforcement Learning / Tutorials / Websites | |||
| ROS Reinforcement Learning Tutorial | |||
| POMDP for Dummies | |||
Awesome Reinforcement Learning / Tutorials / Websites / Scholarpedia articles on: | |||
| Reinforcement Learning | |||
| Temporal Difference Learning | |||
Awesome Reinforcement Learning / Tutorials / Websites | |||
| MATLAB Software, presentations, and demo videos | Repository with useful MATLAB software, presentations, and demo videos | ||
| Bibliography on Reinforcement Learning | |||
| [Class Website] | UC Berkeley - CS 294: Deep Reinforcement Learning, Fall 2015 (John Schulman, Pieter Abbeel) | ||
| Blog posts on Reinforcement Learning, Parts 1-4 | by Travis DeWolf | ||
| The Arcade Learning Environment | Atari 2600 games environment for developing AI agents | ||
| Deep Reinforcement Learning: Pong from Pixels | by Andrej Karpathy | ||
| Demystifying Deep Reinforcement Learning | |||
| Let’s make a DQN | |||
| Simple Reinforcement Learning with Tensorflow, Parts 0-8 | by Arthur Juliani | ||
| Practical_RL | 5,952 | about 1 year ago | github-based course in reinforcement learning in the wild (lectures, coding labs, projects) |
| RLenv.directory: Explore and find new reinforcement learning environments. | |||
| RL: Past, Present and Future Perspectives | Katja Hofmann's talk at NeurIPS '19 | ||
| How to Structure, Organize, Track and Manage Reinforcement Learning (RL) Projects | |||
| Reinforcement Learning Cheat Sheet | A summary of some important concepts and algorithms in RL | ||
Awesome Reinforcement Learning / Online Demos | |||
| Real-world demonstrations of Reinforcement Learning | |||
| Deep Q-Learning Demo | A deep Q learning demonstration using ConvNetJS | ||
| Deep Q-Learning with TensorFlow | 1,170 | over 8 years ago | A deep Q-learning demonstration using Google TensorFlow |
| Reinforcement Learning Demo | A reinforcement learning demo using reinforcejs by Andrej Karpathy | ||
Awesome Reinforcement Learning / Open Source Reinforcement Learning Platforms | |||
| OpenAI gym | 34,966 | about 1 year ago | A toolkit for developing and comparing reinforcement learning algorithms |
| OpenAI universe | 7,478 | over 7 years ago | A software platform for measuring and training an AI's general intelligence across the world's supply of games, websites and other applications |
| DeepMind Lab | 7,146 | almost 3 years ago | A customisable 3D platform for agent-based AI research |
| Project Malmo | 4,111 | almost 2 years ago | A platform for Artificial Intelligence experimentation and research built on top of Minecraft by Microsoft |
| ViZDoom | 1,756 | about 1 year ago | Doom-based AI research platform for reinforcement learning from raw visual information |
| Retro Learning Environment | 185 | over 7 years ago | An AI platform for reinforcement learning based on video game emulators. Currently supports SNES and Sega Genesis. Compatible with OpenAI gym |
| torch-twrl | 251 | over 8 years ago | A package that enables reinforcement learning in Torch by Twitter |
| UETorch | 369 | almost 8 years ago | A Torch plugin for Unreal Engine 4 by Facebook |
| TorchCraft | 1,387 | about 4 years ago | Connecting Torch to StarCraft |
| garage | 1,893 | over 2 years ago | A framework for reproducible reinforcement learning research, fully compatible with OpenAI Gym and DeepMind Control Suite (successor to rllab) |
| TensorForce | 3,299 | over 1 year ago | Practical deep reinforcement learning on TensorFlow with Gitter support and OpenAI Gym/Universe/DeepMind Lab integration |
| tf-TRFL | 3,136 | almost 3 years ago | A library built on top of TensorFlow that exposes several useful building blocks for implementing Reinforcement Learning agents |
| OpenAI lab | 326 | almost 8 years ago | An experimentation system for Reinforcement Learning using OpenAI Gym, Tensorflow, and Keras |
| keras-rl | 8 | over 3 years ago | State-of-the-art deep reinforcement learning algorithms in Keras designed for compatibility with OpenAI |
| BURLAP | Brown-UMBC Reinforcement Learning and Planning, a library written in Java | ||
| MAgent | 1,700 | about 3 years ago | A Platform for Many-agent Reinforcement Learning |
| Ray RLlib | Ray RLlib is a reinforcement learning library that aims to provide both performance and composability | ||
| SLM Lab | 1,256 | about 3 years ago | A research framework for Deep Reinforcement Learning using Unity, OpenAI Gym, PyTorch, Tensorflow |
| Unity ML Agents | 17,334 | 11 months ago | Create reinforcement learning environments using the Unity Editor |
| Intel Coach | 2,334 | almost 3 years ago | Coach is a Python reinforcement learning research framework containing implementations of many state-of-the-art algorithms |
| Microsoft AirSim | Open source simulator based on Unreal Engine for autonomous vehicles from Microsoft AI & Research | ||
| DI-engine | 3,143 | 11 months ago | DI-engine is a generalized Decision Intelligence engine. It supports most basic deep reinforcement learning (DRL) algorithms, such as DQN, PPO, SAC, and domain-specific algorithms like QMIX in multi-agent RL, GAIL in inverse RL, and RND in exploration problems |
| Jumanji | 657 | 11 months ago | A Suite of Industry-Driven Hardware-Accelerated RL Environments written in JAX |