496 Results
Poster
|
Optimistic Critic Reconstruction and Constrained Fine-Tuning for General Offline-to-Online RL Qin-Wen Luo · Ming-Kun Xie · Yewen Wang · Sheng-Jun Huang |
||
Workshop
|
iART - Imitation guided Automated Red Teaming Sajad Mousavi · Desik Rengarajan · Ashwin Ramesh Babu · Vineet Gundecha · Avisek Naug · Sahand Ghorbanpour · Ricardo Luna Gutierrez · Antonio Guillen-Perez · Paolo Faraboschi · Soumyendu Sarkar |
||
Workshop
|
Sun 10:45 |
Contributed Talk 1: iART - Imitation guided Automated Red Teaming Sajad Mousavi · Desik Rengarajan · Ashwin Ramesh Babu · Vineet Gundecha · Avisek Naug · Sahand Ghorbanpour · Ricardo Luna Gutierrez · Antonio Guillen-Perez · Paolo Faraboschi · Soumyendu Sarkar |
|
Workshop
|
VinePPO: Accurate Credit Assignment in RL for LLM Mathematical Reasoning Amirhossein Kazemnejad · Milad Aghajohari · Eva Portelance · Alessandro Sordoni · Siva Reddy · Aaron Courville · Nicolas Le Roux |
||
Poster
|
Wed 11:00 |
SustainDC: Benchmarking for Sustainable Data Center Control Avisek Naug · Antonio Guillen-Perez · Ricardo Luna Gutierrez · Vineet Gundecha · Cullen Bash · Sahand Ghorbanpour · Sajad Mousavi · Ashwin Ramesh Babu · Dejan Markovikj · Lekhapriya Dheeraj Kashyap · Desik Rengarajan · Soumyendu Sarkar |
|
Workshop
|
Crystal Design Amidst Noisy DFT Signals: A Reinforcement Learning Approach Prashant Govindarajan · Mathieu Reymond · Santiago Miret · Mariano Phielipp · Sarath Chandar |
||
Poster
|
Thu 11:00 |
Reinforcement Learning with LTL and ω-Regular Objectives via Optimality-Preserving Translation to Average Rewards Xuan Bach Le · Dominik Wagner · Leon Witzman · Alexander Rabinovich · Luke Ong |
|
Poster
|
Wed 11:00 |
Offline Multitask Representation Learning for Reinforcement Learning Haque Ishfaq · Thanh Nguyen-Tang · Songtao Feng · Raman Arora · Mengdi Wang · Ming Yin · Doina Precup |
|
Workshop
|
Enhancing Data Efficiency in Reinforcement Learning: A Novel Imagination Mechanism Based on Mesh Information Propagation Zihang Wang · Maowei Jiang · Pengyu Zeng · ruiqi li · Quangao Liu · Peter Búš |
||
Poster
|
Thu 16:30 |
OmniJARVIS: Unified Vision-Language-Action Tokenization Enables Open-World Instruction Following Agents Zihao Wang · Shaofei Cai · Zhancun Mu · Haowei Lin · Ceyao Zhang · Xuejie Liu · Qing Li · Anji Liu · Xiaojian (Shawn) Ma · Yitao Liang |
|
Workshop
|
Honesty to Subterfuge: In-Context Reinforcement Learning Can Make Honest Models Reward Hack Leo McKee-Reid · Joe Needham · Maria Martinez · Christoph Sträter · Mikita Balesni |
||
Workshop
|
Learning to Bridge the Gap: Efficient Novelty Recovery with Planning and Reinforcement Learning Alicia Li · Nishanth Kumar · Tomás Lozano-Pérez · Leslie Kaelbling |