Reinforcement Learning Under Algorithmic Triage
Eleni Straitouri · Adish Singla · Vahid Balazadeh Meresht · Manuel Rodriguez
Event URL: https://openreview.net/forum?id=aATg8ctMfd

Methods to learn under algorithmic triage have predominantly focused on supervised learning settings where each decision, or prediction, is independent of the others. Under algorithmic triage, a supervised learning model predicts a fraction of the instances and humans predict the remaining ones. In this work, we take a first step towards developing reinforcement learning models that are optimized to operate under algorithmic triage. To this end, we look at the problem through the framework of options and develop a two-stage actor-critic method to learn reinforcement learning models under triage. The first stage performs offline, off-policy training using human data gathered in an environment where the human has operated on their own. The second stage performs on-policy training to account for the impact that switching control between human and machine may have on the human policy, which may be difficult to anticipate from the human data used in the first stage. Extensive simulation experiments in a synthetic car driving task show that the machine models and the triage policies trained using our two-stage method effectively complement human policies and outperform those provided by several competitive baselines.
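To make the notion of triage in a sequential setting concrete, the following is a minimal sketch of a single episode in which a triage policy decides, at every step, whether the machine or the human acts. It is not the two-stage actor-critic method from the paper. The environment (a one-dimensional lane-keeping toy), the three hand-written policies, the `switch_cost` penalty, and every function name are hypothetical and chosen only for illustration.

```python
from typing import Callable, List, Tuple

State = int   # hypothetical: signed lateral offset from the lane centre
Action = int  # hypothetical: -1 (steer left), 0 (hold), +1 (steer right)


def toy_step(state: State, action: Action) -> Tuple[State, float]:
    """Toy dynamics (assumption): the action shifts the offset; reward is -|offset|."""
    new_state = state + action
    return new_state, -float(abs(new_state))


def toy_human(state: State) -> Action:
    """Assumed human: corrects positive offsets, ignores negative ones."""
    return -1 if state > 0 else 0


def toy_machine(state: State) -> Action:
    """Assumed machine: corrects negative offsets, ignores positive ones."""
    return 1 if state < 0 else 0


def toy_triage(state: State) -> bool:
    """Assumed triage rule: hand control to the machine when the offset is negative."""
    return state < 0


def rollout_under_triage(
    triage_policy: Callable[[State], bool],
    machine_policy: Callable[[State], Action],
    human_policy: Callable[[State], Action],
    step: Callable[[State, Action], Tuple[State, float]],
    start: State,
    horizon: int,
    switch_cost: float = 0.0,
) -> Tuple[float, List[str]]:
    """Run one episode; return the total reward and who controlled each step."""
    state, total, previous = start, 0.0, None
    controllers: List[str] = []
    for _ in range(horizon):
        who = "machine" if triage_policy(state) else "human"
        policy = machine_policy if who == "machine" else human_policy
        state, reward = step(state, policy(state))
        # Hypothetical penalty charged whenever control changes hands.
        if previous is not None and who != previous:
            reward -= switch_cost
        total += reward
        controllers.append(who)
        previous = who
    return total, controllers
```

In this toy setup, calling `rollout_under_triage(toy_triage, toy_machine, toy_human, toy_step, start=-3, horizon=5)` lets the machine handle the part of the state space where the assumed human does nothing, which is the complementarity the abstract refers to. A learned triage policy would replace the hand-written `toy_triage` rule.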

Author Information

Eleni Straitouri (MPI-SWS)
Adish Singla (MPI-SWS)
Vahid Balazadeh Meresht (Sharif University of Technology)
Manuel Rodriguez (Max Planck Institute for Software Systems)
