Affinity Workshop: WiML Workshop 1

Maintenance planning framework using online and offline deep reinforcement learning

Zaharah Bukhsh · Nils Jansen


Cost-effective asset management is of interest across several industries, including manufacturing, transportation, and infrastructure. This paper develops a deep reinforcement learning (DRL) framework that automatically learns an optimal rehabilitation policy for continuously deteriorating water pipes. We approach rehabilitation planning in both online and offline DRL settings. In online DRL, the agent interacts with a simulated environment of multiple pipes that differ in length, material, and failure rate. We train the agent using deep Q-learning (DQN) to learn an optimal policy that minimizes average cost and maximizes the reliability level of the assets under consideration. In offline learning, the agent uses the entire DQN replay dataset to learn an optimal policy via the conservative Q-learning algorithm, without further interaction with the environment. We demonstrate that DRL-based policies improve over standard preventive and corrective planning approaches. Additionally, learning from the fixed DQN replay dataset surpasses vanilla DQN, which learns from online interactions with the environment. The results suggest that existing deterioration profiles of water pipes, which consist of large and diverse state and action trajectories, can be used to learn rehabilitation policies in offline DRL settings.
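To make the offline step more concrete, the following is a minimal sketch of a conservative Q-learning objective for discrete actions, written with NumPy. It is not the authors' implementation: the function name `cql_loss`, the weight `alpha`, the batch layout, and the three-action example are assumptions chosen for illustration. The sketch adds to the usual temporal-difference error a penalty that pushes down Q-values over all actions while pushing up the Q-value of the action actually stored in the replay dataset.

```python
import numpy as np


def cql_loss(q_values, actions, td_targets, alpha=1.0):
    """Illustrative conservative Q-learning loss for a batch of logged transitions.

    q_values   : (batch, n_actions) Q-estimates for each logged state
    actions    : (batch,) integer actions taken in the replay dataset
    td_targets : (batch,) bootstrapped targets, computed elsewhere
    alpha      : hypothetical weight of the conservative penalty
    """
    rows = np.arange(q_values.shape[0])
    q_taken = q_values[rows, actions]

    # Standard squared temporal-difference error on the logged actions.
    td_error = 0.5 * np.mean((q_taken - td_targets) ** 2)

    # Numerically stable log-sum-exp over actions for each state.
    q_max = q_values.max(axis=1, keepdims=True)
    lse = q_max[:, 0] + np.log(np.exp(q_values - q_max).sum(axis=1))

    # Conservative gap: non-negative, and large when actions that were
    # not logged for a state receive high Q-values.
    gap = np.mean(lse - q_taken)

    return td_error + alpha * gap, td_error, gap


# Toy batch: two states, three hypothetical maintenance actions.
q = np.zeros((2, 3))
loss, td, gap = cql_loss(q, np.array([0, 1]), np.zeros(2))
```

In a full training loop this scalar would be minimized with respect to the parameters of the Q-network using an automatic differentiation library; the NumPy version here only evaluates the objective.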
