Search All 2023 Events
 

100 Results

Workshop
Exploiting Contextual Structure to Generate Useful Auxiliary Tasks
Benedict Quartey · Ankit Shah · George Konidaris
Poster
Thu 8:45 State-Action Similarity-Based Representations for Off-Policy Evaluation
Brahma Pavse · Josiah Hanna
Poster
Tue 15:15 Uncertainty-Aware Instance Reweighting for Off-Policy Learning
Xiaoying Zhang · Junpu Chen · Hongning Wang · Hong Xie · Yang Liu · John C.S. Lui · Hang Li
Poster
Wed 8:45 Off-Policy Evaluation for Human Feedback
Qitong Gao · Ge Gao · Juncheng Dong · Vahid Tarokh · Min Chi · Miroslav Pajic
Poster
Tue 8:45 Global Optimality and Finite Sample Analysis of Softmax Off-Policy Actor Critic under State Distribution Mismatch
Shangtong Zhang · Remi Tachet des Combes · Romain Laroche
Poster
Tue 8:45 Reliable Off-Policy Learning for Dosage Combinations
Jonas Schweisthal · Dennis Frauen · Valentyn Melnychuk · Stefan Feuerriegel
Poster
Wed 8:45 Future-Dependent Value-Based Off-Policy Evaluation in POMDPs
Masatoshi Uehara · Haruka Kiyohara · Andrew Bennett · Victor Chernozhukov · Nan Jiang · Nathan Kallus · Chengchun Shi · Wen Sun
Workshop
Learning Models and Evaluating Policies with Offline Off-Policy Data under Partial Observability
Shreyas Chaudhari · Philip Thomas · Bruno C. da Silva
Poster
Thu 8:45 Marginal Density Ratio for Off-Policy Evaluation in Contextual Bandits
Muhammad Faaiz Taufiq · Arnaud Doucet · Rob Cornish · Jean-Francois Ton
Poster
Thu 8:45 Unified Off-Policy Learning to Rank: a Reinforcement Learning Perspective
Zeyu Zhang · Yi Su · Hui Yuan · Yiran Wu · Rishab Balasubramanian · Qingyun Wu · Huazheng Wang · Mengdi Wang
Workshop
Chain-of-Thought Reasoning is a Policy Improvement Operator
Hugh Zhang · David Parkes
Poster
Wed 8:45 f-Policy Gradients: A General Framework for Goal-Conditioned RL using f-Divergences
Siddhant Agarwal · Ishan Durugkar · Peter Stone · Amy Zhang