Safety-critical dynamical systems are fundamental to industries such as aerospace, autonomous systems, and healthcare robotics, where safety violations or functional failures can have severe consequences. A major challenge in these systems is the degradation of components and actuators, which can jeopardize the safety and stability of the entire system. It is therefore imperative to incorporate the system's health status into the control design framework to ensure resilience to functional degradation.
Such systems often operate under uncertainty and incomplete model knowledge, particularly as components deteriorate and the dynamics change nonlinearly. This calls for learning strategies that assimilate available data into the control design while guaranteeing safety during both the exploration and exploitation phases. Traditional model-based control methods require precise system models, making them less effective in scenarios involving uncertainties and degradation.
Reinforcement Learning (RL) has emerged as a powerful alternative, capable of learning optimal control strategies for partially or fully unknown dynamical systems from input-output data alone. However, implementing RL-based approaches poses its own set of challenges: the exploration phase essential for learning might steer the system into unsafe territories and potentially accelerate degradation. Additionally, provable safety guarantees during the operational phase are critical for sustained safe operation.
In this context, the Safe Reinforcement Learning (Safe RL) paradigm aims to develop RL-based approaches that provide safety guarantees alongside stability and optimality. This thesis tackles these challenges by introducing control learning strategies that adapt to uncertainties and functional degradation, advancing the field of safe RL and introducing the concept of Degradation-Tolerant Control (DTC).
Key contributions of this work include ensuring the optimality, safety, and stability of control policies during both exploration and exploitation phases through the integration of Control Barrier Functions (CBFs) and Control Lyapunov Functions (CLFs) within the RL framework. CBFs certify that the system remains inside a safe operating region, while CLFs certify convergence to a desired equilibrium; together they constrain the learning process to respect safety and stability requirements.
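To make the CBF mechanism concrete, the following minimal sketch (not taken from the thesis) applies a CBF-based safety filter to a scalar single integrator x' = u with safe set h(x) = x_max - x >= 0. For this scalar system the usual CBF quadratic program, which minimally modifies a nominal control, reduces to a simple clip; the names `cbf_safety_filter`, `x_max`, and `alpha` are illustrative choices, not notation from the source.

```python
import numpy as np

def cbf_safety_filter(x, u_nom, x_max=1.0, alpha=5.0):
    """Minimally modify u_nom so the CBF condition holds for x' = u.

    Safe set: h(x) = x_max - x >= 0. The CBF condition
    h_dot + alpha * h >= 0 reduces to u <= alpha * (x_max - x),
    so the QP solution for this scalar system is a simple clip.
    """
    return min(u_nom, alpha * (x_max - x))

# Simulate with a nominal policy that would drive the state past x_max.
dt, x = 0.01, 0.0
traj = []
for _ in range(1000):
    u = cbf_safety_filter(x, u_nom=2.0)  # nominal control pushes toward the boundary
    x += dt * u
    traj.append(x)

print(max(traj) <= 1.0)  # the filtered trajectory never leaves the safe set
```

The nominal control is overridden only near the boundary, so the safety filter leaves the learned policy untouched in the interior of the safe set; a CLF constraint can be enforced in the same fashion by bounding the Lyapunov function's decrease.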
Furthermore, the thesis addresses the deceleration of degradation by incorporating degradation rates into the control design, utilizing optimal control methods for linear systems in discrete time to minimize component degradation and extend their lifespan. For nonlinear systems, RL techniques are employed to provide adaptable solutions to complex dynamics in both discrete and continuous time settings.
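One simple way to read "incorporating degradation rates into the control design" for linear discrete-time systems is to penalize actuator effort more heavily when degradation is a concern. The sketch below (an illustrative assumption, not the thesis's actual formulation) computes discrete-time LQR gains via backward Riccati iteration for a double integrator and shows that adding a hypothetical degradation-related weight to the input cost yields gentler gains, which would slow actuator wear.

```python
import numpy as np

def dlqr(A, B, Q, R, iters=500):
    """Discrete-time LQR gain via backward Riccati value iteration."""
    P = Q.copy()
    for _ in range(iters):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
    return K

# Double-integrator example with sampling time 0.1.
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.005], [0.1]])
Q = np.eye(2)
R_nominal = np.array([[0.1]])
R_deg = R_nominal + np.array([[1.0]])  # hypothetical extra weight for degradation rate

K_nom = dlqr(A, B, Q, R_nominal)
K_deg = dlqr(A, B, Q, R_deg)
print(np.linalg.norm(K_deg) < np.linalg.norm(K_nom))  # gentler, wear-aware gains
```

Heavier input weighting trades tracking performance for reduced actuator activity, which is the basic tension any degradation-aware cost must balance.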
Lastly, a novel cyclic RL algorithm is proposed to ensure system stability amid actuator degradation. This algorithm iteratively updates the control policy to adapt to component degradation, allowing continuous optimization and sustained performance despite ongoing degradation. These approaches have been validated through simulations, demonstrating their efficacy on academic benchmark examples.
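The cyclic idea can be caricatured as follows: at each cycle the actuator loses some effectiveness, and the policy is re-solved for the degraded plant so the closed loop stays stable. This sketch stands in for the learning step with a Riccati-based LQR solve on the (here assumed known) degraded model; the effectiveness-loss schedule and all variable names are hypothetical, and the actual thesis algorithm is model-free RL.

```python
import numpy as np

def dlqr(A, B, Q, R, iters=300):
    """Discrete-time LQR gain via backward Riccati value iteration."""
    P = Q.copy()
    for _ in range(iters):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
    return K

A = np.array([[1.0, 0.1], [0.0, 1.02]])  # open loop has an unstable mode
B0 = np.array([[0.0], [0.1]])
Q, R = np.eye(2), np.array([[0.1]])

stable = []
for cycle in range(20):
    eff = 1.0 - 0.03 * cycle        # hypothetical actuator effectiveness loss
    B = eff * B0                    # degraded input matrix at this cycle
    K = dlqr(A, B, Q, R)            # re-solve the policy for the degraded plant
    rho = max(abs(np.linalg.eigvals(A - B @ K)))
    stable.append(rho < 1.0)        # spectral radius < 1: closed loop stable

print(all(stable))  # stability is preserved across every degradation cycle
```

A stale gain tuned for the healthy actuator can lose stability margin as effectiveness drops; re-solving each cycle is what restores it, which is the role the cyclic RL updates play for unknown nonlinear dynamics.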
