Search All 2024 Events
 

68 Results

Page 2 of 6
Workshop
Sat 17:27 Understanding The Effect Of Temperature On Alignment With Human Opinions
Maja Pavlovic · Massimo Poesio
Poster
Wed 16:30 Off-Policy Selection for Initiating Human-Centric Experimental Design
Ge Gao · Xi Yang · Qitong Gao · Song Ju · Miroslav Pajic · Min Chi
Tutorial
Tue 13:30 Cross-disciplinary insights into alignment in humans and machines
Gillian Hadfield · Dylan Hadfield-Menell · Joel Leibo · Rakshit Trivedi
Poster
Wed 11:00 Improved Generation of Adversarial Examples Against Safety-aligned LLMs
Qizhang Li · Yiwen Guo · Wangmeng Zuo · Hao Chen
Workshop
The Art of Knowing When to Stop: Analysis of Optimal Stopping in People and Machines
Fukun Zhang · Bonan Zhao
Poster
Wed 16:30 Expectation Alignment: Handling Reward Misspecification in the Presence of Expectation Mismatch
Malek Mechergui · Sarath Sreedharan
Affinity Event
Text-Image Concept Human Alignment Dataset and Metric
Maria Alejandra Bravo
Affinity Event
Detecting Machine-Generated vs. Human-Generated Content Across Varying Text Lengths: Small, Medium, and Large
Anjali Rawal · Hui Wang · Mengyuan Liu · Yu-Hsuan Lin · Youjia Zheng · Shanu Sushmita
Poster
Fri 11:00 ALI-Agent: Assessing LLMs' Alignment with Human Values via Agent-based Evaluation
Jingnan Zheng · Han Wang · An Zhang · Nguyen Duy Tai · Jun Sun · Tat-Seng Chua
Poster
Thu 16:30 Learning Human-like Representations to Enable Learning Human Values
Andrea Wynn · Ilia Sucholutsky · Tom Griffiths
Poster
Wed 11:00 When Your AIs Deceive You: Challenges of Partial Observability in Reinforcement Learning from Human Feedback
Leon Lang · Davis Foote · Stuart J Russell · Anca Dragan · Erik Jenner · Scott Emmons