Search All 2022 Events
 

50 Results

Poster
Wed 9:00 On the Safety of Interpretable Machine Learning: A Maximum Deviation Approach
Dennis Wei · Rahul Nair · Amit Dhurandhar · Kush Varshney · Elizabeth Daly · Moninder Singh
Poster
Wed 14:00 Shield Decentralization for Safe Multi-Agent Reinforcement Learning
Daniel Melcer · Christopher Amato · Stavros Tripakis
Poster
Tue 9:00 Neural Abstractions
Alessandro Abate · Alec Edwards · Mirco Giacobbe
Poster
Thu 9:00 Parametrically Retargetable Decision-Makers Tend To Seek Power
Alex Turner · Prasad Tadepalli
Poster
Wed 9:00 Capturing Failures of Large Language Models via Human Cognitive Biases
Erik Jones · Jacob Steinhardt
Poster
MExMI: Pool-based Active Model Extraction Crossover Membership Inference
Yaxin Xiao · Qingqing Ye · Haibo Hu · Huadi Zheng · Chengfang Fang · Jie Shi
Poster
Thu 9:00 Defining and Characterizing Reward Gaming
Joar Skalse · Nikolaus Howe · Dmitrii Krasheninnikov · David Krueger
Poster
Tue 14:00 Second Thoughts are Best: Learning to Re-Align With Human Values from Text Edits
Ruibo Liu · Chenyan Jia · Ge Zhang · Ziyu Zhuang · Tony Liu · Soroush Vosoughi
Poster
Tue 9:00 Safety Guarantees for Neural Network Dynamic Systems via Stochastic Barrier Functions
Rayan Mazouz · Karan Muvvala · Akash Ratheesh Babu · Luca Laurenti · Morteza Lahijanian
Poster
Thu 9:00 Towards Safe Reinforcement Learning with a Safety Editor Policy
Haonan Yu · Wei Xu · Haichao Zhang
Poster
Thu 14:00 Active Learning with Safety Constraints
Romain Camilleri · Andrew Wagenmaker · Jamie Morgenstern · Lalit Jain · Kevin Jamieson
Poster
Wed 9:00 Enhancing Safe Exploration Using Safety State Augmentation
Aivar Sootla · Alexander Cowen-Rivers · Jun Wang · Haitham Bou Ammar