The promise of Offline Reinforcement Learning (RL) lies in learning policies from fixed datasets, without interacting with the environment. Because interaction is impossible, the dataset is one of the most essential ingredients of the algorithm and has a large influence on the performance of the learned policy. Studies on how dataset composition influences various Offline RL algorithms are currently missing. To that end, we conducted a comprehensive empirical analysis of the effect of dataset composition on the performance of Offline RL algorithms for discrete action environments. The performance is studied through two metrics of the datasets, Trajectory Quality (TQ) and State-Action Coverage (SACo). Our analysis suggests that variants of the off-policy Deep-Q-Network family rely on the dataset exhibiting high SACo. In contrast, algorithms that constrain the learned policy towards the data-generating policy perform well across datasets that exhibit high TQ, high SACo, or both. For datasets with high TQ, Behavior Cloning outperforms or performs similarly to the best Offline RL algorithms.
Author Information
Kajetan Schweighofer (Johannes Kepler University Linz)
Markus Hofmarcher (ELLIS Unit / University Linz)
Marius-Constantin Dinu (LIT AI Lab, Institute for Machine Learning, Johannes Kepler University Linz, Dynatrace Research)
Philipp Renz (LIT AI Lab - JKU Linz)
Angela Bitto (JKU)
Vihang Patil (LIT AI Lab, Institute for Machine Learning, Johannes Kepler University Linz, Austria)
Sepp Hochreiter (LIT AI Lab / University Linz)
Head of the LIT AI Lab and Professor of bioinformatics at the University of Linz. First to identify and analyze the vanishing gradient problem, the fundamental deep learning problem, in 1991. First author of the main paper on the now widely used LSTM RNNs. He implemented 'learning how to learn' (meta-learning) networks via LSTM RNNs and applied Deep Learning and RNNs to self-driving cars, sentiment analysis, reinforcement learning, bioinformatics, and medicine.
More from the Same Authors
-
2021 : Assigning Credit to Human Decisions using Modern Hopfield Networks »
Michael Widrich · Markus Hofmarcher · Vihang Patil · Angela Bitto · Sepp Hochreiter -
2021 : Modern Hopfield Networks for Return Decomposition for Delayed Rewards »
Michael Widrich · Markus Hofmarcher · Vihang Patil · Angela Bitto · Sepp Hochreiter -
2021 : Understanding the Effects of Dataset Characteristics on Offline Reinforcement Learning »
Kajetan Schweighofer · Markus Hofmarcher · Marius-Constantin Dinu · Philipp Renz · Angela Bitto · Vihang Patil · Sepp Hochreiter -
2022 : Boosting Multi-modal Contrastive Learning with Modern Hopfield Networks and InfoLOOB »
Andreas Fürst · Elisabeth Rumetshofer · Johannes Lehner · Viet T. Tran · Fei Tang · Hubert Ramsauer · David Kreil · Michael Kopp · Günter Klambauer · Angela Bitto · Sepp Hochreiter -
2022 : Modern Hopfield Networks for Iterative Learning on Tabular Data »
Bernhard Schäfl · Lukas Gruber · Angela Bitto · Sepp Hochreiter -
2022 : Toward Semantic History Compression for Reinforcement Learning »
Fabian Paischer · Thomas Adler · Andreas Radler · Markus Hofmarcher · Sepp Hochreiter -
2022 : Foundation Models for History Compression in Reinforcement Learning »
Fabian Paischer · Thomas Adler · Andreas Radler · Markus Hofmarcher · Sepp Hochreiter -
2022 : Informative rewards and generalization in curriculum learning »
Rahul Siripurapu · Vihang Patil · Kajetan Schweighofer · Marius-Constantin Dinu · Markus Holzleitner · Hamid Eghbalzadeh · Luis Ferro · Thomas Schmied · Michael Kopp · Sepp Hochreiter -
2022 Poster: CLOOB: Modern Hopfield Networks with InfoLOOB Outperform CLIP »
Andreas Fürst · Elisabeth Rumetshofer · Johannes Lehner · Viet T. Tran · Fei Tang · Hubert Ramsauer · David Kreil · Michael Kopp · Günter Klambauer · Angela Bitto · Sepp Hochreiter -
2021 : Understanding the Effects of Dataset Composition on Offline Reinforcement Learning »
Kajetan Schweighofer · Markus Hofmarcher · Marius-Constantin Dinu · Angela Bitto · Philipp Renz · Vihang Patil · Sepp Hochreiter -
2021 Poster: The balancing principle for parameter choice in distance-regularized domain adaptation »
Werner Zellinger · Natalia Shepeleva · Marius-Constantin Dinu · Hamid Eghbal-zadeh · Hoan Duc Nguyen · Bernhard Nessler · Sergei Pereverzyev · Bernhard A. Moser -
2019 : Poster and Coffee Break 2 »
Karol Hausman · Kefan Dong · Ken Goldberg · Lihong Li · Lin Yang · Lingxiao Wang · Lior Shani · Liwei Wang · Loren Amdahl-Culleton · Lucas Cassano · Marc Dymetman · Marc Bellemare · Marcin Tomczak · Margarita Castro · Marius Kloft · Marius-Constantin Dinu · Markus Holzleitner · Martha White · Mengdi Wang · Michael Jordan · Mihailo Jovanovic · Ming Yu · Minshuo Chen · Moonkyung Ryu · Muhammad Zaheer · Naman Agarwal · Nan Jiang · Niao He · Nikolaus Yasui · Nikos Karampatziakis · Nino Vieillard · Ofir Nachum · Olivier Pietquin · Ozan Sener · Pan Xu · Parameswaran Kamalaruban · Paul Mineiro · Paul Rolland · Philip Amortila · Pierre-Luc Bacon · Prakash Panangaden · Qi Cai · Qiang Liu · Quanquan Gu · Raihan Seraj · Richard Sutton · Rick Valenzano · Robert Dadashi · Rodrigo Toro Icarte · Roshan Shariff · Roy Fox · Ruosong Wang · Saeed Ghadimi · Samuel Sokota · Sean Sinclair · Sepp Hochreiter · Sergey Levine · Sergio Valcarcel Macua · Sham Kakade · Shangtong Zhang · Sheila McIlraith · Shie Mannor · Shimon Whiteson · Shuai Li · Shuang Qiu · Wai Lok Li · Siddhartha Banerjee · Sitao Luan · Tamer Basar · Thinh Doan · Tianhe Yu · Tianyi Liu · Tom Zahavy · Toryn Klassen · Tuo Zhao · Vicenç Gómez · Vincent Liu · Volkan Cevher · Wesley Suttle · Xiao-Wen Chang · Xiaohan Wei · Xiaotong Liu · Xingguo Li · Xinyi Chen · Xingyou Song · Yao Liu · YiDing Jiang · Yihao Feng · Yilun Du · Yinlam Chow · Yinyu Ye · Yishay Mansour · Yonathan Efroni · Yongxin Chen · Yuanhao Wang · Bo Dai · Chen-Yu Wei · Harsh Shrivastava · Hongyang Zhang · Qinqing Zheng · SIDDHARTHA SATPATHI · Xueqing Liu · Andreu Vall -
2019 : Poster Session »
Ahana Ghosh · Javad Shafiee · Akhilan Boopathy · Alex Tamkin · Theodoros Vasiloudis · Vedant Nanda · Ali Baheri · Paul Fieguth · Andrew Bennett · Guanya Shi · Hao Liu · Arushi Jain · Jacob Tyo · Benjie Wang · Boxiao Chen · Carroll Wainwright · Chandramouli Shama Sastry · Chao Tang · Daniel S. Brown · David Inouye · David Venuto · Dhruv Ramani · Dimitrios Diochnos · Divyam Madaan · Dmitrii Krashenikov · Joel Oren · Doyup Lee · Eleanor Quint · elmira amirloo · Matteo Pirotta · Gavin Hartnett · Geoffroy Dubourg-Felonneau · Gokul Swamy · Pin-Yu Chen · Ilija Bogunovic · Jason Carter · Javier Garcia-Barcos · Jeet Mohapatra · Jesse Zhang · Jian Qian · John Martin · Oliver Richter · Federico Zaiter · Tsui-Wei Weng · Karthik Abinav Sankararaman · Kyriakos Polymenakos · Lan Hoang · mahdieh abbasi · Marco Gallieri · Mathieu Seurin · Matteo Papini · Matteo Turchetta · Matthew Sotoudeh · Mehrdad Hosseinzadeh · Nathan Fulton · Masatoshi Uehara · Niranjani Prasad · Oana-Maria Camburu · Patrik Kolaric · Philipp Renz · Prateek Jaiswal · Reazul Hasan Russel · Riashat Islam · Rishabh Agarwal · Alexander Aldrick · Sachin Vernekar · Sahin Lale · Sai Kiran Narayanaswami · Samuel Daulton · Sanjam Garg · Sebastian East · Shun Zhang · Soheil Dsidbari · Justin Goodwin · Victoria Krakovna · Wenhao Luo · Wesley Chung · Yuanyuan Shi · Yuh-Shyang Wang · Hongwei Jin · Ziping Xu -
2017 : Invited Talk 3 »
Sepp Hochreiter -
2017 : Panel: Machine learning and audio signal processing: State of the art and future perspectives »
Sepp Hochreiter · Bo Li · Karen Livescu · Arindam Mandal · Oriol Nieto · Malcolm Slaney · Hendrik Purwins -
2017 Spotlight: Self-Normalizing Neural Networks »
Günter Klambauer · Thomas Unterthiner · Andreas Mayr · Sepp Hochreiter -
2017 Poster: Self-Normalizing Neural Networks »
Günter Klambauer · Thomas Unterthiner · Andreas Mayr · Sepp Hochreiter -
2017 Poster: GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium »
Martin Heusel · Hubert Ramsauer · Thomas Unterthiner · Bernhard Nessler · Sepp Hochreiter -
2016 Symposium: Recurrent Neural Networks and Other Machines that Learn Algorithms »
Jürgen Schmidhuber · Sepp Hochreiter · Alex Graves · Rupesh K Srivastava -
2015 Poster: Rectified Factor Networks »
Djork-Arné Clevert · Andreas Mayr · Thomas Unterthiner · Sepp Hochreiter