Balancing Performance and Safety in RL

Safe Reinforcement Learning (RL) is a subset of RL that focuses on learning policies that not only maximize the long-term reward but also ensure reasonable system performance and/or respect safety constraints during the learning and/or deployment processes [1].

Safety is best understood in contrast to risk, which arises from the stochastic nature of the environment: a policy that is optimal for long-term reward maximization may still perform poorly in some catastrophic situations because of this inherent uncertainty.

Three Fundamental Categories of Safe RL Algorithms

  1. Constrained MDP and Risk-sensitive MDP

These algorithms transform the optimization criterion so that it incorporates a measure of risk. This category includes the Constrained Markov Decision Process (CMDP) and the Risk-Sensitive Markov Decision Process (RS-MDP).
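
For concreteness, the standard CMDP objective can be written as reward maximization subject to a bound on an expected cumulative cost (a textbook formulation, not specific to any single paper below):

$$
\max_{\pi}\; J_R(\pi) = \mathbb{E}_{\tau \sim \pi}\!\left[\sum_{t=0}^{\infty} \gamma^t R(s_t, a_t)\right]
\quad \text{s.t.} \quad
J_C(\pi) = \mathbb{E}_{\tau \sim \pi}\!\left[\sum_{t=0}^{\infty} \gamma^t C(s_t, a_t)\right] \le d
$$

where $C$ is a cost signal encoding safety violations and $d$ is the allowed safety budget. Risk-sensitive formulations instead replace the plain expectation over returns with a risk measure such as a variance penalty or CVaR.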

  2. Safe Exploration

This category modifies the exploration process by incorporating prior/external knowledge and/or a risk metric, while leaving the optimization criterion unchanged. Some examples of algorithms in this category are Upper Confidence Bound for Risk (UCBR) and Variance-Based Risk-Sensitive Exploration (VB-RSE).
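
As a minimal sketch of the idea (not any particular published algorithm), exploration can be restricted to actions whose estimated risk stays below a threshold. Here `q_values` and `risk_estimates` are assumed to be per-action arrays produced by a learned critic and a learned or prior risk model:

```python
import numpy as np

def safe_epsilon_greedy(q_values, risk_estimates, epsilon=0.1, risk_threshold=0.05, rng=None):
    """Epsilon-greedy action selection restricted to a 'safe' action set.

    q_values:       per-action value estimates (hypothetical learned critic)
    risk_estimates: per-action risk estimates, e.g. predicted probability
                    of reaching an unsafe state (hypothetical risk model)
    """
    rng = rng or np.random.default_rng()
    safe = np.flatnonzero(risk_estimates < risk_threshold)
    if safe.size == 0:
        # No action is deemed safe enough: fall back to the least risky one.
        safe = np.array([int(np.argmin(risk_estimates))])
    if rng.random() < epsilon:
        # Explore, but only within the safe set.
        return int(rng.choice(safe))
    # Exploit: best value among the safe actions.
    return int(safe[np.argmax(q_values[safe])])

# Example: action 1 has the highest value but is considered too risky,
# so the agent picks the best action within the safe set instead.
action = safe_epsilon_greedy(np.array([1.0, 2.0, 0.5]), np.array([0.01, 0.2, 0.03]))
```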

  3. Adversary Policy

In an RL setting, perturbed observations can fool an agent with a neural network policy into taking actions that lead to poor performance. Adversarial policies can help in detecting and mitigating such attacks.
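
To make the threat concrete, here is a sketch of a simple FGSM-style observation perturbation (an illustrative observation-space attack, not the adversarial-opponent setting studied by Gleave et al. below). `policy` is assumed to be a `torch.nn.Module` mapping a batch of observations to action logits:

```python
import torch
import torch.nn.functional as F

def fgsm_observation_attack(policy, obs, epsilon=0.01):
    """Perturb an observation so the agent's preferred action becomes less likely.

    policy: hypothetical torch.nn.Module, (batch, obs_dim) -> action logits
    obs:    batched observation tensor of shape (batch, obs_dim)
    """
    obs = obs.clone().detach().requires_grad_(True)
    logits = policy(obs)
    preferred = logits.argmax(dim=-1)            # actions the unperturbed agent would take
    loss = F.cross_entropy(logits, preferred)    # low loss = strong preference for those actions
    loss.backward()
    # Step in the direction that increases the loss, weakening that preference.
    perturbed = obs + epsilon * obs.grad.sign()
    return perturbed.detach()
```

Training the policy on such perturbed observations, or against a learned adversarial opponent, is one way to detect vulnerabilities and harden the agent.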

🌟 Here is a curated list of safe RL papers from 2017 to 2022.

Our Journey of Reimplementing Safe RL Algorithms

Reimplementing state-of-the-art RL algorithms allows us to gain a deeper understanding of their inner workings and, from there, to explore novel approaches. In my course “Advanced Topics in Reinforcement Learning”, we took on the challenge of reimplementing ideas from several recent safe RL papers. Our findings and discussions are available as scientific blog posts, with code re-implementations available on our GitHub repository (https://github.com/Safe-RL-Team). Join us on an exciting journey of advancing the field of Safe RL!

  1. Safe Reinforcement Learning via Curriculum Induction, Matteo Turchetta, Andrey Kolobov, Shital Shah, Andreas Krause, and Alekh Agarwal, NeurIPS 2020

    📚 Blog Marvin Sextro, Jonas Loos

  2. Safe Reinforcement Learning with Natural Language Constraints, Tsung-Yen Yang, Michael Hu, Yinlam Chow, Peter J. Ramadge, and Karthik Narasimhan, NeurIPS 2021

    📚 Blog Hongyou Zhou

  3. Adversarial Policies: Attacking Deep Reinforcement Learning, Adam Gleave, Michael Dennis, Cody Wild, Neel Kant, Sergey Levine, and Stuart Russell, ICLR 2020

    📚 Blog Lorenz Hufe, Jarek Liesen

  4. Reward constrained policy optimization, Chen Tessler, Daniel J. Mankowitz, and Shie Mannor, ICLR 2019

    📚 Blog Boris Meinardus, Tuan Anh Le

  5. Constrained Policy Optimization via Bayesian World Models, Yarden As, Ilnura Usmanova, Sebastian Curi, and Andreas Krause, ICLR 2022

    📚 Blog Vincent Meilinger

  6. Constrained Policy Optimization, Joshua Achiam, David Held, Aviv Tamar, and Pieter Abbeel, ICML 2017

    📚 Blog Thanh Cuong Le, Paul Hasenbusch

  7. Responsive Safety in Reinforcement Learning by PID Lagrangian Methods, Adam Stooke, Joshua Achiam, and Pieter Abbeel, ICML 2020

    📚 Blog Wenxi Huang

  8. There Is No Turning Back: A Self-Supervised Approach for Reversibility-Aware Reinforcement Learning, Nathan Grinsztajn, Johan Ferret, Olivier Pietquin, Philippe Preux, and Matthieu Geist, NeurIPS 2021

    📚 Blog Malik-Manel Hashim

  9. Uncertainty-Based Offline Reinforcement Learning with Diversified Q-Ensemble, Gaon An, Seungyong Moon, Jang-Hyun Kim, and Hyun Oh Song, NeurIPS 2021

    📚 Blog Jonas Loos, Julian Dralle

  10. Learning Barrier Certificates: Towards Safe Reinforcement Learning with Zero Training-time Violations, Yuping Luo, and Tengyu Ma, NeurIPS 2021

    📚 Blog Lars Chen, Jeremiah Flannery

  11. Teachable Reinforcement Learning via Advice Distillation, Olivia Watkins, Trevor Darrell, Pieter Abbeel, Jacob Andreas, and Abhishek Gupta, NeurIPS 2021

    📚 Blog Mihai Dumitrescu, Claire Sturgill

  12. Cautious Adaptation For Reinforcement Learning in Safety-Critical Settings, Jesse Zhang, Brian Cheung, Chelsea Finn, Sergey Levine, and Dinesh Jayaraman, ICML 2020

    📚 Blog Maren Eberle

  13. Verifiable Reinforcement Learning via Policy Extraction, Osbert Bastani, Yewen Pu, and Armando Solar-Lezama, NeurIPS 2018

    📚 Blog Christoph Pröschel

  14. Time Discretization-Invariant Safe Action Repetition for Policy Gradient Methods, Seohong Park, Jaekyeom Kim, and Gunhee Kim, NeurIPS 2021

    📚 Blog Hristo Boyadzhiev

Pushing Boundaries and Prioritizing Safety in RL

Safe Reinforcement Learning is a cutting-edge field that holds immense potential for real-world applications. By implementing and exploring ideas from state-of-the-art papers, we can push the boundaries of what is possible and pave the way for even more effective and robust safe RL algorithms.

So, let’s dive in and make the world a safer place, one policy at a time!

Citation

Cited as:

Guo, Rong. (April 2023). Safe Reinforcement Learning. Rong’Log. https://rongrg.github.io/posts/2023-04-12-saferl/.

Or

@article{guo2023safeRL,
  title   = "Safe Reinforcement Learning",
  author  = "Guo, Rong",
  journal = "rongrg.github.io",
  year    = "2023",
  month   = "April",
  url     = "https://rongrg.github.io/posts/2023-04-12-saferl/"
}

References

  1. Javier García and Fernando Fernández, A Comprehensive Survey on Safe Reinforcement Learning, Journal of Machine Learning Research 2015
  2. Alex Ray, Joshua Achiam, and Dario Amodei, Benchmarking Safe Exploration in Deep Reinforcement Learning, OpenAI 2019
  3. Aviral Kumar and Sergey Levine, Offline Reinforcement Learning: From Algorithms to Practical Challenges, NeurIPS Tutorial 2020
  4. Yongshuai Liu, Avishai Halev, and Xin Liu, Policy Learning with Constraints in Model-free Reinforcement Learning: A Survey, IJCAI 2021
  5. Sergey Levine, Aviral Kumar, George Tucker, and Justin Fu, Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems, arXiv 2020
  6. Thomas et al., Preventing Undesirable Behavior of Intelligent Machines, Science 2019