Awesome Deep Reinforcement Learning / General guidance |
Awesome Offline RL | 935 | 6 months ago | |
Reinforcement Learning Today | | | |
Multiagent Reinforcement Learning by Marc Lanctot RLSS @ Lille | | | 11 July 2019 |
RLDM 2019 Notes by David Abel | | | 11 July 2019 |
A Survey of Reinforcement Learning Informed by Natural Language | | | 10 Jun 2019 |
Challenges of Real-World Reinforcement Learning | | | 29 Apr 2019 |
Ray Interference: a Source of Plateaus in Deep Reinforcement Learning | | | 25 Apr 2019 |
Principles of Deep RL by David Silver | | | |
University AI's general introduction to deep RL (in Chinese) | | | |
OpenAI's Spinning Up | | | |
The Promise of Hierarchical Reinforcement Learning | | | 9 Mar 2019 |
Deep Reinforcement Learning that Matters | | | 30 Jan 2019 |
Awesome Deep Reinforcement Learning / 2024 |
Foundation Policies with Hilbert Representations | | | 23 Feb 2024 |
Awesome Deep Reinforcement Learning / 2022 |
Reinforcement Learning with Action-Free Pre-Training from Videos (arXiv) | | | |
Awesome Deep Reinforcement Learning / Generalist policies |
Foundation Policies with Hilbert Representations | | | 23 Feb 2024 |
Awesome Deep Reinforcement Learning / Foundations and theory |
General non-linear Bellman equations | | | 9 July 2019 |
Monte Carlo Gradient Estimation in Machine Learning | | | 25 Jun 2019 |
Awesome Deep Reinforcement Learning / General benchmark frameworks |
Brax | 2,370 | 6 days ago | |
Android-Env | 1,023 | about 21 hours ago | |
MuJoCo | | | |
Unsupervised RL Benchmark | 332 | about 2 years ago | |
Dataset for Offline RL | 1,359 | 15 days ago | |
Spriteworld: a flexible, configurable python-based reinforcement learning environment | 368 | over 4 years ago | |
Chainerrl Visualizer | 54 | almost 2 years ago | |
Behaviour Suite for Reinforcement Learning | | | 13 Aug 2019 |
Quantifying Generalization in Reinforcement Learning | | | 20 Dec 2018 |
S-RL Toolbox: Environments, Datasets and Evaluation Metrics for State Representation Learning | | | 25 Sept 2018 |
dopamine | 10,581 | 29 days ago | |
StarCraft II | 8,038 | 4 months ago | |
trfl | 3,134 | almost 2 years ago | |
chainerrl | 1,176 | over 3 years ago | |
PARL | 3,280 | 4 months ago | |
DI-engine: a generalized decision intelligence engine. It supports various Deep RL algorithms | 3,120 | 6 days ago | |
PPO x Family: Course in Chinese for Deep RL | 1,987 | 7 months ago | |
Awesome Deep Reinforcement Learning / Unsupervised |
URLB: Unsupervised Reinforcement Learning Benchmark | | | 28 Oct 2021 |
APS: Active Pretraining with Successor Features | | | 31 Aug 2021 |
Behavior From the Void: Unsupervised Active Pre-Training | | | 8 Mar 2021 |
Reinforcement Learning with Prototypical Representations | | | 22 Feb 2021 |
Efficient Exploration via State Marginal Matching | | | 12 Jun 2019 |
Self-Supervised Exploration via Disagreement | | | 10 Jun 2019 |
Exploration by Random Network Distillation | | | 30 Oct 2018 |
Diversity is All You Need: Learning Skills without a Reward Function | | | 16 Feb 2018 |
Curiosity-driven Exploration by Self-supervised Prediction | | | 15 May 2017 |
Awesome Deep Reinforcement Learning / Offline |
PerSim: Data-efficient Offline Reinforcement Learning with Heterogeneous Agents via Personalized Simulators | | | 10 Nov 2021 |
A General Offline Reinforcement Learning Framework for Interactive Recommendation | | | AAAI 2021 |
Awesome Deep Reinforcement Learning / Value based |
Harnessing Structures for Value-Based Planning and Reinforcement Learning | | | 5 Feb 2020 |
Recurrent Value Functions | | | 23 May 2019 |
Stochastic Lipschitz Q-Learning | | | 24 Apr 2019 |
TreeQN and ATreeC: Differentiable Tree-Structured Models for Deep Reinforcement Learning | | | 8 Mar 2018 |
Distributed Prioritized Experience Replay | | | 2 Mar 2018 |
Rainbow: Combining Improvements in Deep Reinforcement Learning | | | 6 Oct 2017 |
Learning from Demonstrations for Real World Reinforcement Learning | | | 12 Apr 2017 |
Dueling Network Architecture | | | |
Double DQN | | | |
Prioritized Experience Replay | | | |
Deep Q-Networks | | | |
Awesome Deep Reinforcement Learning / Policy gradient |
Phasic Policy Gradient | | | 9 Sep 2020 |
An operator view of policy gradient methods | | | 22 Jun 2020 |
Direct Policy Gradients: Direct Optimization of Policies in Discrete Action Spaces | | | 14 Jun 2019 |
Policy Gradient Search: Online Planning and Expert Iteration without Search Trees | | | 7 Apr 2019 |
Supervised Policy Update for Deep Reinforcement Learning | | | 24 Dec 2018 |
PPO-CMA: Proximal Policy Optimization with Covariance Matrix Adaptation | | | 5 Oct 2018 |
Clipped Action Policy Gradient | | | 22 June 2018 |
Expected Policy Gradients for Reinforcement Learning | | | 10 Jan 2018 |
Proximal Policy Optimization Algorithms | | | 20 July 2017 |
Emergence of Locomotion Behaviours in Rich Environments | | | 7 July 2017 |
Interpolated Policy Gradient: Merging On-Policy and Off-Policy Gradient Estimation for Deep Reinforcement Learning | | | 1 Jun 2017 |
Equivalence Between Policy Gradients and Soft Q-Learning | | | |
Trust Region Policy Optimization | | | |
Reinforcement Learning with Deep Energy-Based Policies | | | |
Q-Prop: Sample-Efficient Policy Gradient with an Off-Policy Critic | | | |
Awesome Deep Reinforcement Learning / Explorations |
Entropic Desired Dynamics for Intrinsic Control | | | 2021 |
Self-Supervised Exploration via Disagreement | | | 10 Jun 2019 |
Approximate Exploration through State Abstraction | | | 24 Jan 2019 |
The Uncertainty Bellman Equation and Exploration | | | 15 Sep 2017 |
Noisy Networks for Exploration | | | 30 Jun 2017 |
Count-Based Exploration in Feature Space for Reinforcement Learning | | | 25 Jun 2017 |
Count-Based Exploration with Neural Density Models | | | 14 Jun 2017 |
UCB and InfoGain Exploration via Q-Ensembles | | | 11 Jun 2017 |
Minimax Regret Bounds for Reinforcement Learning | | | 16 Mar 2017 |
Incentivizing Exploration In Reinforcement Learning With Deep Predictive Models | | | |
EX2: Exploration with Exemplar Models for Deep Reinforcement Learning | | | |
Awesome Deep Reinforcement Learning / Actor-Critic |
Generalized Off-Policy Actor-Critic | | | 27 Mar 2019 |
Soft Actor-Critic Algorithms and Applications | | | 29 Jan 2019 |
The Reactor: A Sample-Efficient Actor-Critic Architecture | | | 15 Apr 2017 |
Sample Efficient Actor-Critic with Experience Replay | | | |
Reinforcement Learning with Unsupervised Auxiliary Tasks | | | |
Continuous control with deep reinforcement learning | | | |
Awesome Deep Reinforcement Learning / Model-based |
Self-Consistent Models and Values | | | 25 Oct 2021 |
When to use parametric models in reinforcement learning? | | | 12 Jun 2019 |
Model Based Reinforcement Learning for Atari | | | 5 Mar 2019 |
Model-Based Stabilisation of Deep Reinforcement Learning | | | 6 Sep 2018 |
Learning model-based planning from scratch | | | 19 July 2017 |
Awesome Deep Reinforcement Learning / Model-free + Model-based |
Imagination-Augmented Agents for Deep Reinforcement Learning | | | 19 July 2017 |
Awesome Deep Reinforcement Learning / Hierarchical |
Why Does Hierarchy (Sometimes) Work So Well in Reinforcement Learning? | | | 23 Sep 2019 |
Language as an Abstraction for Hierarchical Deep Reinforcement Learning | | | 18 Jun 2019 |
Awesome Deep Reinforcement Learning / Option |
Variational Option Discovery Algorithms | | | 26 July 2018 |
A Laplacian Framework for Option Discovery in Reinforcement Learning | | | 16 Jun 2017 |
Awesome Deep Reinforcement Learning / Connection with other methods |
Robust Imitation of Diverse Behaviors | | | |
Learning human behaviors from motion capture by adversarial imitation | | | |
Connecting Generative Adversarial Networks and Actor-Critic Methods | | | |
Awesome Deep Reinforcement Learning / Connecting value and policy methods |
Bridging the Gap Between Value and Policy Based Reinforcement Learning | | | |
Policy gradient and Q-learning | | | |
Awesome Deep Reinforcement Learning / Reward design |
End-to-End Robotic Reinforcement Learning without Reward Engineering | | | 16 Apr 2019 |
Reinforcement Learning with a Corrupted Reward Channel | | | 23 May 2017 |
Awesome Deep Reinforcement Learning / Unifying |
Multi-step Reinforcement Learning: A Unifying Algorithm | | | |
Awesome Deep Reinforcement Learning / Faster DRL |
Neural Episodic Control | | | |
Awesome Deep Reinforcement Learning / Multi-agent |
No Press Diplomacy: Modeling Multi-Agent Gameplay | | | 4 Sep 2019 |
Options as responses: Grounding behavioural hierarchies in multi-agent RL | | | 6 Jun 2019 |
Evolutionary Reinforcement Learning for Sample-Efficient Multiagent Coordination | | | 18 Jun 2019 |
A Regularized Opponent Model with Maximum Entropy Objective | | | 17 May 2019 |
Deep Q-Learning for Nash Equilibria: Nash-DQN | | | 23 Apr 2019 |
Malthusian Reinforcement Learning | | | 3 Mar 2019 |
Bayesian Action Decoder for Deep Multi-Agent Reinforcement Learning | | | 4 Nov 2018 |
Intrinsic Social Motivation via Causal Influence in Multi-Agent RL | | | 19 Oct 2018 |
QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning | | | 30 Mar 2018 |
Modeling Others using Oneself in Multi-Agent Reinforcement Learning | | | 26 Feb 2018 |
The Mechanics of n-Player Differentiable Games | | | 15 Feb 2018 |
Continuous Adaptation via Meta-Learning in Nonstationary and Competitive Environments | | | 10 Oct 2017 |
Learning with Opponent-Learning Awareness | | | 13 Sep 2017 |
Counterfactual Multi-Agent Policy Gradients | | | |
Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments | | | 7 Jun 2017 |
Multiagent Bidirectionally-Coordinated Nets for Learning to Play StarCraft Combat Games | | | 29 Mar 2017 |
Awesome Deep Reinforcement Learning / New design |
IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures | | | 9 Feb 2018 |
Reverse Curriculum Generation for Reinforcement Learning | | | |
Trial without Error: Towards Safe Reinforcement Learning via Human Intervention | | | |
Learning to Design Games: Strategic Environments in Deep Reinforcement Learning | | | 5 July 2017 |
Awesome Deep Reinforcement Learning / Multitask |
Kickstarting Deep Reinforcement Learning | | | 10 Mar 2018 |
Zero-Shot Task Generalization with Multi-Task Deep Reinforcement Learning | | | 7 Nov 2017 |
Distral: Robust Multitask Reinforcement Learning | | | 13 July 2017 |
Awesome Deep Reinforcement Learning / Observational Learning |
Observational Learning by Reinforcement Learning | | | 20 Jun 2017 |
Awesome Deep Reinforcement Learning / Meta learning |
Discovery of Useful Questions as Auxiliary Tasks | | | 10 Sep 2019 |
Meta-learning of Sequential Strategies | | | 8 May 2019 |
Efficient Off-Policy Meta-Reinforcement Learning via Probabilistic Context Variables | | | 19 Mar 2019 |
Some Considerations on Learning to Explore via Meta-Reinforcement Learning | | | 11 Jan 2019 |
Meta-Gradient Reinforcement Learning | | | 24 May 2018 |
ProMP: Proximal Meta-Policy Search | | | 16 Oct 2018 |
Unsupervised Meta-Learning for Reinforcement Learning | | | 12 Jun 2018 |
Awesome Deep Reinforcement Learning / Distributional |
GAN Q-learning | | | 20 July 2018 |
Implicit Quantile Networks for Distributional Reinforcement Learning | | | 14 Jun 2018 |
Nonlinear Distributional Gradient Temporal-Difference Learning | | | 20 May 2018 |
Distributed Distributional Deterministic Policy Gradients | | | 23 Apr 2018 |
An Analysis of Categorical Distributional Reinforcement Learning | | | 22 Feb 2018 |
Distributional Reinforcement Learning with Quantile Regression | | | 27 Oct 2017 |
A Distributional Perspective on Reinforcement Learning | | | 21 July 2017 |
Awesome Deep Reinforcement Learning / Planning |
Search on the Replay Buffer: Bridging Planning and Reinforcement Learning | | | 12 June 2019 |
Awesome Deep Reinforcement Learning / Safety |
Robust Reinforcement Learning for Continuous Control with Model Misspecification | | | 18 Jun 2019 |
Verifiable Reinforcement Learning via Policy Extraction | | | 22 May 2018 |
Awesome Deep Reinforcement Learning / Inverse RL |
Addressing Sample Inefficiency and Reward Bias in Inverse Reinforcement Learning | | | 9 Sep 2018 |
Awesome Deep Reinforcement Learning / No reward RL |
Fast Task Inference with Variational Intrinsic Successor Features | | | 2 Jun 2019 |
Curiosity-driven Exploration by Self-supervised Prediction | | | 15 May 2017 |
Awesome Deep Reinforcement Learning / Time |
Interval timing in deep reinforcement learning agents | | | 31 May 2019 |
Time Limits in Reinforcement Learning | | | |
Awesome Deep Reinforcement Learning / Adversarial learning |
Sample-efficient Adversarial Imitation Learning from Observation | | | 18 Jun 2019 |
Awesome Deep Reinforcement Learning / Use Natural Language |
Using Natural Language for Reward Shaping in Reinforcement Learning | | | 31 May 2019 |
Awesome Deep Reinforcement Learning / Generative and contrastive representation learning |
Unsupervised State Representation Learning in Atari | | | 19 Jun 2019 |
Awesome Deep Reinforcement Learning / Belief |
Shaping Belief States with Generative Environment Models for RL | | | 24 Jun 2019 |
Awesome Deep Reinforcement Learning / PAC |
Provably Convergent Off-Policy Actor-Critic with Function Approximation | | | 11 Nov 2019 |
Awesome Deep Reinforcement Learning / Applications |
Benchmarks for Deep Off-Policy Evaluation | | | 30 Mar 2021 |
Learning Reciprocity in Complex Sequential Social Dilemmas | | | 19 Mar 2019 |
DeepMimic: Example-Guided Deep Reinforcement Learning of Physics-Based Character Skills | | | 9 Apr 2018 |
Tuning Recurrent Neural Networks with Reinforcement Learning | | | |