Timezone: »
Generative Pre-trained Transformer (GPT) models have exhibited exciting progress in capabilities, capturing the interest of practitioners and the public alike. Yet, while the literature on the trustworthiness of GPT models remains limited, practitioners have proposed employing capable GPT models for sensitive applications to healthcare and finance – where mistakes can be costly. To this end, this work proposes a comprehensive trustworthiness evaluation for large language models with a focus on GPT-4 and GPT-3.5, considering diverse perspectives – including toxicity, stereotype bias, adversarial robustness, out-of-distribution robustness, robustness on adversarial demonstrations, privacy, machine ethics, and fairness. Based on our evaluations, we discover previously unpublished vulnerabilities to trustworthiness threats. For instance, we find that GPT models can be easily misled to generate toxic and biased outputs and leak private information in both training data and conversation history. We also find that although GPT-4 is usually more trustworthy than GPT-3.5 on standard benchmarks, GPT-4 is more vulnerable given jailbreaking system or user prompts, potentially due to the reason that GPT-4 follows the (misleading) instructions more precisely. Our work illustrates a comprehensive trustworthiness evaluation of GPT models and sheds light on the trustworthiness gaps. Our benchmark is publicly available at https://decodingtrust.github.io/.
Author Information
Boxin Wang (Department of Computer Science, University of Illinois, Urbana Champaign)
Weixin Chen (Tsinghua University)
Hengzhi Pei (University of Illinois, Urbana Champaign)
Chulin Xie (University of Illinois at Urbana-Champaign)
Mintong Kang (University of Illinois at Urbana-Champaign)
Chenhui Zhang (Massachusetts Institute of Technology)
Chejian Xu (University of Illinois at Urbana-Champaign)
Zidi Xiong (UIUC)
Ritik Dutta (IIT Gandhinagar, Dhirubhai Ambani Institute Of Information and Communication Technology)
Rylan Schaeffer (Stanford University)
Sang Truong (Stanford University)
Simran Arora (Stanford)
Mantas Mazeika (University of Illinois Urbana-Champaign)
Dan Hendrycks (Center for AI Safety)
Zinan Lin (Microsoft Research)
Yu Cheng (Microsoft Research)
Sanmi Koyejo (Stanford University / Google)
Dawn Song (UC Berkeley)
Bo Li (UChicago/UIUC)
Related Events (a corresponding poster, oral, or spotlight)
-
2023 Poster: DecodingTrust: A Comprehensive Assessment of Trustworthiness in GPT Models »
Tue. Dec 12th 04:45 -- 06:45 PM Room Great Hall & Hall B1+B2 #1618
More from the Same Authors
-
2021 : VALUE: A Multi-Task Benchmark for Video-and-Language Understanding Evaluation »
Linjie Li · Jie Lei · Zhe Gan · Licheng Yu · Yen-Chun Chen · Rohit Pillai · Yu Cheng · Luowei Zhou · Xin Wang · William Yang Wang · Tamara L Berg · Mohit Bansal · Jingjing Liu · Lijuan Wang · Zicheng Liu -
2021 : CUAD: An Expert-Annotated NLP Dataset for Legal Contract Review »
Dan Hendrycks · Collin Burns · Anya Chen · Spencer Ball -
2021 : Adversarial GLUE: A Multi-Task Benchmark for Robustness Evaluation of Language Models »
Boxin Wang · Chejian Xu · Shuohang Wang · Zhe Gan · Yu Cheng · Jianfeng Gao · Ahmed Awadallah · Bo Li -
2021 : Measuring Coding Challenge Competence With APPS »
Dan Hendrycks · Steven Basart · Saurav Kadavath · Mantas Mazeika · Akul Arora · Ethan Guo · Collin Burns · Samir Puranik · Horace He · Dawn Song · Jacob Steinhardt -
2021 : An Algorithmic Theory of Metacognition in Minds and Machines »
Rylan Schaeffer -
2021 : Certified Robustness for Free in Differentially Private Federated Learning »
Chulin Xie · Yunhui Long · Pin-Yu Chen · Krishnaram Kenthapadi · Bo Li -
2021 : RVFR: Robust Vertical Federated Learning via Feature Subspace Recovery »
Jing Liu · Chulin Xie · Krishnaram Kenthapadi · Sanmi Koyejo · Bo Li -
2021 : PixMix: Dreamlike Pictures Comprehensively Improve Safety Measures »
Dan Hendrycks · Andy Zou · Mantas Mazeika · Leonard Tang · Dawn Song · Jacob Steinhardt -
2021 : What Would Jiminy Cricket Do? Towards Agents That Behave Morally »
Dan Hendrycks · Mantas Mazeika · Andy Zou · Sahil Patel · Christine Zhu · Jesus Navarro · Dawn Song · Bo Li · Jacob Steinhardt -
2021 : Measuring Mathematical Problem Solving With the MATH Dataset »
Dan Hendrycks · Collin Burns · Saurav Kadavath · Akul Arora · Steven Basart · Eric Tang · Dawn Song · Jacob Steinhardt -
2022 Poster: VF-PS: How to Select Important Participants in Vertical Federated Learning, Efficiently and Securely? »
Jiawei Jiang · Lukas Burkhalter · Fangcheng Fu · Bolin Ding · Bo Du · Anwar Hithnawi · Bo Li · Ce Zhang -
2022 Poster: Effective Backdoor Defense by Exploiting Sensitivity of Poisoned Samples »
Weixin Chen · Baoyuan Wu · Haoqian Wang -
2022 : GAUCHE: A Library for Gaussian Processes in Chemistry »
Ryan-Rhys Griffiths · Leo Klarner · Henry Moss · Aditya Ravuri · Sang Truong · Bojana Rankovic · Yuanqi Du · Arian Jamasb · Julius Schwartz · Austin Tripp · Gregory Kell · Anthony Bourached · Alex Chan · Jacob Moss · Chengzhi Guo · Alpha Lee · Philippe Schwaller · Jian Tang -
2022 : Improving Vertical Federated Learning by Efficient Communication with ADMM »
Chulin Xie · Pin-Yu Chen · Ce Zhang · Bo Li -
2022 : Benchmarking Robustness under Distribution Shift of Multimodal Image-Text Models »
Jielin Qiu · Yi Zhu · Xingjian Shi · Zhiqiang Tang · DING ZHAO · Bo Li · Mu Li -
2022 : DensePure: Understanding Diffusion Models towards Adversarial Robustness »
Zhongzhu Chen · Kun Jin · Jiongxiao Wang · Weili Nie · Mingyan Liu · Anima Anandkumar · Bo Li · Dawn Song -
2022 : Fifteen-minute Competition Overview Video »
Nathan Drenkow · Raman Arora · Gino Perrotta · Todd Neller · Ryan Gardner · Mykel J Kochenderfer · Jared Markowitz · Corey Lowman · Casey Richardson · Bo Li · Bart Paulhamus · Ashley J Llorens · Andrew Newman -
2022 : Assembling Existing Labels from Public Datasets to\\Diagnose Novel Diseases: COVID-19 in Late 2019 »
Zengle Zhu · Mintong Kang · Alan Yuille · Zongwei Zhou -
2022 : On the Robustness of Safe Reinforcement Learning under Observational Perturbations »
ZUXIN LIU · Zijian Guo · Zhepeng Cen · Huan Zhang · Jie Tan · Bo Li · DING ZHAO -
2023 : Segment Any Stream: Scalable Water Extent Detection with the Segment Anything Model »
Haozhen Zheng · Chenhui Zhang · Kaiyu Guan · Yawen Deng · Sherrie Wang · Bruce Rhoads · Andrew J Margenot · Shengnan Zhou · Sheng Wang -
2023 : Testing Assumptions Underlying a Unified Theory for the Origin of Grid Cells »
Rylan Schaeffer · Rylan Schaeffer · Mikail Khona · Mikail Khona · Adrian Bertagnoli · Sanmi Koyejo · Sanmi Koyejo · Ila Fiete -
2023 : Rethinking Bayesian Optimization with Gaussian Processes: Insights from Hyperspectral Trait Search »
Ruhana Azam · Sanmi Koyejo · Samuel Fernandes · Mohammed Kebir · Andrew Leakey · Alexander Lipka -
2023 : Beyond Expectations: Model-Driven Amplification of Dataset Biases in Data Feedback Loops »
Rylan Schaeffer · Oluwasanmi Koyejo -
2023 : Associative Memory Under the Probabilistic Lens: Improved Transformers & Dynamic Memory Creation »
Rylan Schaeffer · Mikail Khona · Nika Zahedi · Ila Fiete · Andrey Gromov · Sanmi Koyejo -
2023 : Differentially Private Synthetic Data via Foundation Model APIs 1: Images »
Zinan Lin · Sivakanth Gopi · Janardhan Kulkarni · Harsha Nori · Sergey Yekhanin -
2023 : Training Private and Efficient Language Models with Synthetic Data from LLMs »
Da Yu · Arturs Backurs · Sivakanth Gopi · Huseyin A. Inan · Janardhan Kulkarni · Zinan Lin · Chulin Xie · Huishuai Zhang · Wanrong Zhang -
2023 : FOCUS: Fairness via Agent-Awareness for Federated Learning on Heterogeneous Data »
Wenda Chu · Chulin Xie · Boxin Wang · Linyi Li · Lang Yin · Arash Nourian · Han Zhao · Bo Li -
2023 : Identifying and Mitigating Vulnerabilities in LLM-Integrated Applications »
Fengqing Jiang · Zhangchen Xu · Luyao Niu · Boxin Wang · Jinyuan Jia · Bo Li · Radha Poovendran -
2023 : Reward Model Aggregation »
Zihao Wang · Chirag Nagpal · Alexander D'Amour · Victor Veitch · Sanmi Koyejo -
2023 : Testing Assumptions Underlying a Unified Theory for the Origin of Grid Cells »
Rylan Schaeffer · Mikail Khona · Adrian Bertagnoli · Sanmi Koyejo · Ila Fiete -
2023 : An Information-Theoretic Understanding of Maximum Manifold Capacity Representations »
Berivan Isik · Victor Lecomte · Rylan Schaeffer · Yann LeCun · Mikail Khona · Ravid Shwartz-Ziv · Sanmi Koyejo · Andrey Gromov -
2023 : Understanding subgroup performance differences of fair predictors using causal models »
Stephen Pfohl · Natalie Harris · Chirag Nagpal · David Madras · Vishwali Mhasawade · Olawale Salaudeen · Katherine Heller · Sanmi Koyejo · Alexander D'Amour -
2023 : An Information-Theoretic Understanding of Maximum Manifold Capacity Representations »
Berivan Isik · Rylan Schaeffer · Victor Lecomte · Mikail Khona · Yann LeCun · Sanmi Koyejo · Andrey Gromov · Ravid Shwartz-Ziv -
2023 : Skeleton-of-Thought: Large Language Models Can Do Parallel Decoding »
Xuefei Ning · Zinan Lin · Zixuan Zhou · Zifu Wang · Huazhong Yang · Yu Wang -
2023 : Measuring and Improving Recall in Convolutional Language Models »
Evan Sabri Eyuboglu · Simran Arora · Aman Timalsina · Isys Johnson · Michael Poli · James Zou · Atri Rudra · Christopher Ré -
2023 : Divergence at the Interpolation Threshold: Identifying, Interpreting \& Ablating the Sources of a Deep Learning Puzzle »
Rylan Schaeffer · Zachary Robertson · Akhilan Boopathy · Mikail Khona · Ila Fiete · Andrey Gromov · Sanmi Koyejo -
2023 : An Information-Theoretic Understanding of Maximum Manifold Capacity Representations »
Berivan Isik · Victor Lecomte · Rylan Schaeffer · Yann LeCun · Mikail Khona · Ravid Shwartz-Ziv · Sanmi Koyejo · Andrey Gromov -
2023 : Testing Assumptions Underlying a Unified Theory for the Origin of Grid Cells »
Rylan Schaeffer · Mikail Khona · Adrian Bertagnoli · Sanmi Koyejo · Ila Fiete -
2023 : Divergence at the Interpolation Threshold: Identifying, Interpreting \& Ablating the Sources of a Deep Learning Puzzle »
Rylan Schaeffer · Zachary Robertson · Akhilan Boopathy · Mikail Khona · Ila Fiete · Andrey Gromov · Sanmi Koyejo -
2023 Competition: TDC 2023 (LLM Edition): The Trojan Detection Challenge »
Mantas Mazeika · Andy Zou · Norman Mu · Long Phan · Zifan Wang · Chunru Yu · Adam Khoja · Fengqing Jiang · Aidan O'Gara · Zhen Xiang · Arezoo Rajabi · Dan Hendrycks · Radha Poovendran · Bo Li · David Forsyth -
2023 : Differentially Private Synthetic Data via Foundation Model APIs 1: Images »
Zinan Lin -
2023 Workshop: Third Workshop on Efficient Natural Language and Speech Processing (ENLSP-III): Towards the Future of Large Language Models and their Emerging Descendants »
Mehdi Rezagholizadeh · Peyman Passban · Yue Dong · Yu Cheng · Soheila Samiee · Lili Mou · Qun Liu · Boxing Chen -
2023 : Decoding Backdoors in LLMs and Their Implications »
Bo Li -
2023 : BadChain: Backdoor Chain-of-Thought Prompting for Large Language Models »
Zhen Xiang · Fengqing Jiang · Zidi Xiong · Bhaskar Ramasubramanian · Radha Poovendran · Bo Li -
2023 : An Information-Theoretic Understanding of Maximum Manifold Capacity Representations »
Rylan Schaeffer · Berivan Isik · Victor Lecomte · Mikail Khona · Yann LeCun · Andrey Gromov · Ravid Shwartz-Ziv · Sanmi Koyejo -
2023 : An Information-Theoretic Understanding of Maximum Manifold Capacity Representations »
Rylan Schaeffer · Berivan Isik · Victor Lecomte · Mikail Khona · Yann LeCun · Andrey Gromov · Ravid Shwartz-Ziv · Sanmi Koyejo -
2023 Poster: GAUCHE: A Library for Gaussian Processes in Chemistry »
Ryan-Rhys Griffiths · Leo Klarner · Henry Moss · Aditya Ravuri · Sang Truong · Yuanqi Du · Samuel Stanton · Gary Tom · Bojana Rankovic · Arian Jamasb · Aryan Deshwal · Julius Schwartz · Austin Tripp · Gregory Kell · Simon Frieder · Anthony Bourached · Alex Chan · Jacob Moss · Chengzhi Guo · Johannes Peter Dürholt · Saudamini Chaurasia · Ji Won Park · Felix Strieth-Kalthoff · Alpha Lee · Bingqing Cheng · Alan Aspuru-Guzik · Philippe Schwaller · Jian Tang -
2023 Poster: Are Emergent Abilities of Large Language Models a Mirage? »
Rylan Schaeffer · Brando Miranda · Sanmi Koyejo -
2023 Oral: Are Emergent Abilities of Large Language Models a Mirage? »
Rylan Schaeffer · Brando Miranda · Sanmi Koyejo -
2023 Poster: FedGame: A Game-Theoretic Defense against Backdoor Attacks in Federated Learning »
Jinyuan Jia · Zhuowen Yuan · Dinuka Sahabandu · Luyao Niu · Arezoo Rajabi · Bhaskar Ramasubramanian · Bo Li · Radha Poovendran -
2023 Poster: BIRD: Generalizable Backdoor Detection and Removal for Deep Reinforcement Learning »
Xuan Chen · Wenbo Guo · Guanhong Tao · Xiangyu Zhang · Dawn Song -
2023 Social: ML Safety Social »
Mantas Mazeika -
2023 Poster: IMPRESS: Evaluating the Resilience of Imperceptible Perturbations Against Unauthorized Data Usage in Diffusion-Based Generative AI »
Bochuan Cao · Changjiang Li · Ting Wang · Jinyuan Jia · Bo Li · Jinghui Chen -
2023 Poster: DiffAttack: Evasion Attacks Against Diffusion-Based Adversarial Purification »
Mintong Kang · Dawn Song · Bo Li -
2023 Poster: Domain Watermark: Effective and Harmless Dataset Copyright Protection is Closed at Hand »
Junfeng Guo · Yiming Li · Lixu Wang · Shu-Tao Xia · Heng Huang · Cong Liu · Bo Li -
2023 Poster: CBD: A Certified Backdoor Detector Based on Local Dominant Probability »
Zhen Xiang · Zidi Xiong · Bo Li -
2023 Poster: Monarch Mixer: A Simple Sub-Quadratic GEMM-Based Architecture »
Dan Fu · Simran Arora · Jessica Grogan · Isys Johnson · Evan Sabri Eyuboglu · Armin Thomas · Benjamin Spector · Michael Poli · Atri Rudra · Christopher Ré -
2023 Poster: Incentives in Federated Learning: Equilibria, Dynamics, and Mechanisms for Welfare Maximization »
Aniket Murhekar · Zhuowen Yuan · Bhaskar Ray Chaudhury · Bo Li · Ruta Mehta -
2023 Poster: Self-Supervised Learning of Representations for Space Generates Multi-Modular Grid Cells »
Rylan Schaeffer · Mikail Khona · Tzuhsuan Ma · Cristobal Eyzaguirre · Sanmi Koyejo · Ila Fiete -
2023 Oral: Monarch Mixer: A Simple Sub-Quadratic GEMM-Based Architecture »
Dan Fu · Simran Arora · Jessica Grogan · Isys Johnson · Evan Sabri Eyuboglu · Armin Thomas · Benjamin Spector · Michael Poli · Atri Rudra · Christopher Ré -
2023 Poster: WordScape: a Pipeline to extract multilingual, visually rich Documents with Layout Annotations from Web Crawl Data »
Maurice Weber · Carlo Siebenschuh · Rory Butler · Anton Alexandrov · Valdemar Thanner · Georgios Tsolakis · Haris Jabbar · Ian Foster · Bo Li · Rick Stevens · Ce Zhang -
2022 : Contributed Talk: DensePure: Understanding Diffusion Models towards Adversarial Robustness »
Zhongzhu Chen · Kun Jin · Jiongxiao Wang · Weili Nie · Mingyan Liu · Anima Anandkumar · Bo Li · Dawn Song -
2022 Workshop: Workshop on Machine Learning Safety »
Dan Hendrycks · Victoria Krakovna · Dawn Song · Jacob Steinhardt · Nicholas Carlini -
2022 Workshop: Trustworthy and Socially Responsible Machine Learning »
Huan Zhang · Linyi Li · Chaowei Xiao · J. Zico Kolter · Anima Anandkumar · Bo Li -
2022 Spotlight: Fairness in Federated Learning via Core-Stability »
Bhaskar Ray Chaudhury · Linyi Li · Mintong Kang · Bo Li · Ruta Mehta -
2022 Competition: The Trojan Detection Challenge »
Mantas Mazeika · Dan Hendrycks · Huichen Li · Xiaojun Xu · Andy Zou · Sidney Hough · Arezoo Rajabi · Dawn Song · Radha Poovendran · Bo Li · David Forsyth -
2022 Spotlight: Lightning Talks 5B-4 »
Yuezhi Yang · Zeyu Yang · Yong Lin · Yishi Xu · Linan Yue · Tao Yang · Weixin Chen · Qi Liu · Jiaqi Chen · Dongsheng Wang · Baoyuan Wu · Yuwang Wang · Hao Pan · Shengyu Zhu · Zhenwei Miao · Yan Lu · Lu Tan · Bo Chen · Yichao Du · Haoqian Wang · Wei Li · Yanqing An · Ruiying Lu · Peng Cui · Nanning Zheng · Li Wang · Zhibin Duan · Xiatian Zhu · Mingyuan Zhou · Enhong Chen · Li Zhang -
2022 Spotlight: LOT: Layer-wise Orthogonal Training on Improving l2 Certified Robustness »
Xiaojun Xu · Linyi Li · Bo Li -
2022 Spotlight: Effective Backdoor Defense by Exploiting Sensitivity of Poisoned Samples »
Weixin Chen · Baoyuan Wu · Haoqian Wang -
2022 Spotlight: Lightning Talks 5B-1 »
Devansh Arpit · Xiaojun Xu · Zifan Shi · Ivan Skorokhodov · Shayan Shekarforoush · Zhan Tong · Yiqun Wang · Shichong Peng · Linyi Li · Ivan Skorokhodov · Huan Wang · Yibing Song · David Lindell · Yinghao Xu · Seyed Alireza Moazenipourasil · Sergey Tulyakov · Peter Wonka · Yiqun Wang · Ke Li · David Fleet · Yujun Shen · Yingbo Zhou · Bo Li · Jue Wang · Peter Wonka · Marcus Brubaker · Caiming Xiong · Limin Wang · Deli Zhao · Qifeng Chen · Dit-Yan Yeung -
2022 Panel: Panel 4C-1: How Would The… & SCAMPS: Synthetics for… »
Mantas Mazeika · Daniel McDuff -
2022 Competition: Reconnaissance Blind Chess: An Unsolved Challenge for Multi-Agent Decision Making Under Uncertainty »
Ryan Gardner · Gino Perrotta · Corey Lowman · Casey Richardson · Andrew Newman · Jared Markowitz · Nathan Drenkow · Bart Paulhamus · Ashley J Llorens · Todd Neller · Raman Arora · Bo Li · Mykel J Kochenderfer -
2022 Spotlight: Certifying Some Distributional Fairness with Subpopulation Decomposition »
Mintong Kang · Linyi Li · Maurice Weber · Yang Liu · Ce Zhang · Bo Li -
2022 Spotlight: Lightning Talks 1A-4 »
Siwei Wang · Jing Liu · Nianqiao Ju · Shiqian Li · Eloïse Berthier · Muhammad Faaiz Taufiq · Arsene Fansi Tchango · Chen Liang · Chulin Xie · Jordan Awan · Jean-Francois Ton · Ziad Kobeissi · Wenguan Wang · Xinwang Liu · Kewen Wu · Rishab Goel · Jiaxu Miao · Suyuan Liu · Julien Martel · Ruobin Gong · Francis Bach · Chi Zhang · Rob Cornish · Sanmi Koyejo · Zhi Wen · Yee Whye Teh · Yi Yang · Jiaqi Jin · Bo Li · Yixin Zhu · Vinayak Rao · Wenxuan Tu · Gaetan Marceau Caron · Arnaud Doucet · Xinzhong Zhu · Joumana Ghosn · En Zhu -
2022 Spotlight: Lightning Talks 1A-3 »
Kimia Noorbakhsh · Ronan Perry · Qi Lyu · Jiawei Jiang · Christian Toth · Olivier Jeunen · Xin Liu · Yuan Cheng · Lei Li · Manuel Rodriguez · Julius von Kügelgen · Lars Lorch · Nicolas Donati · Lukas Burkhalter · Xiao Fu · Zhongdao Wang · Songtao Feng · Ciarán Gilligan-Lee · Rishabh Mehrotra · Fangcheng Fu · Jing Yang · Bernhard Schölkopf · Ya-Li Li · Christian Knoll · Maks Ovsjanikov · Andreas Krause · Shengjin Wang · Hong Zhang · Mounia Lalmas · Bolin Ding · Bo Du · Yingbin Liang · Franz Pernkopf · Robert Peharz · Anwar Hithnawi · Julius von Kügelgen · Bo Li · Ce Zhang -
2022 Spotlight: VF-PS: How to Select Important Participants in Vertical Federated Learning, Efficiently and Securely? »
Jiawei Jiang · Lukas Burkhalter · Fangcheng Fu · Bolin Ding · Bo Du · Anwar Hithnawi · Bo Li · Ce Zhang -
2022 Spotlight: CoPur: Certifiably Robust Collaborative Inference via Feature Purification »
Jing Liu · Chulin Xie · Sanmi Koyejo · Bo Li -
2022 : Panel »
Pin-Yu Chen · Alex Gittens · Bo Li · Celia Cintas · Hilde Kuehne · Payel Das -
2022 : Trustworthy Machine Learning in Autonomous Driving »
Bo Li -
2022 Workshop: Decentralization and Trustworthy Machine Learning in Web3: Methodologies, Platforms, and Applications »
Jian Lou · Zhiguang Wang · Chejian Xu · Bo Li · Dawn Song -
2022 : Invited Talk #5, Privacy-Preserving Data Synthesis for General Purposes, Bo Li »
Bo Li -
2022 : Fairness Panel »
Freedom Gumedze · Rachel Cummings · Bo Li · Robert Tillman · Edward Choi -
2022 : Distributional Privacy for Data Sharing »
Zinan Lin · Shuaiqi Wang · Vyas Sekar · Giulia Fanti -
2022 : Trustworthy Federated Learning »
Bo Li -
2022 Poster: Improving Certified Robustness via Statistical Learning with Logical Reasoning »
Zhuolin Yang · Zhikuan Zhao · Boxin Wang · Jiawei Zhang · Linyi Li · Hengzhi Pei · Bojan Karlaš · Ji Liu · Heng Guo · Ce Zhang · Bo Li -
2022 Poster: Untargeted Backdoor Watermark: Towards Harmless and Stealthy Dataset Copyright Protection »
Yiming Li · Yang Bai · Yong Jiang · Yong Yang · Shu-Tao Xia · Bo Li -
2022 Poster: Fairness in Federated Learning via Core-Stability »
Bhaskar Ray Chaudhury · Linyi Li · Mintong Kang · Bo Li · Ruta Mehta -
2022 Poster: Generalizing Goal-Conditioned Reinforcement Learning with Variational Causal Reasoning »
Wenhao Ding · Haohong Lin · Bo Li · DING ZHAO -
2022 Poster: Certifying Some Distributional Fairness with Subpopulation Decomposition »
Mintong Kang · Linyi Li · Maurice Weber · Yang Liu · Ce Zhang · Bo Li -
2022 Poster: How Would The Viewer Feel? Estimating Wellbeing From Video Scenarios »
Mantas Mazeika · Eric Tang · Andy Zou · Steven Basart · Jun Shern Chan · Dawn Song · David Forsyth · Jacob Steinhardt · Dan Hendrycks -
2022 Poster: LOT: Layer-wise Orthogonal Training on Improving l2 Certified Robustness »
Xiaojun Xu · Linyi Li · Bo Li -
2022 Poster: CoPur: Certifiably Robust Collaborative Inference via Feature Purification »
Jing Liu · Chulin Xie · Sanmi Koyejo · Bo Li -
2022 Poster: Forecasting Future World Events With Neural Networks »
Andy Zou · Tristan Xiao · Ryan Jia · Joe Kwon · Mantas Mazeika · Richard Li · Dawn Song · Jacob Steinhardt · Owain Evans · Dan Hendrycks -
2022 Poster: Exploring the Limits of Domain-Adaptive Training for Detoxifying Large-Scale Language Models »
Boxin Wang · Wei Ping · Chaowei Xiao · Peng Xu · Mostofa Patwary · Mohammad Shoeybi · Bo Li · Anima Anandkumar · Bryan Catanzaro -
2022 Poster: SafeBench: A Benchmarking Platform for Safety Evaluation of Autonomous Vehicles »
Chejian Xu · Wenhao Ding · Weijie Lyu · ZUXIN LIU · Shuai Wang · Yihan He · Hanjiang Hu · DING ZHAO · Bo Li -
2022 Poster: General Cutting Planes for Bound-Propagation-Based Neural Network Verification »
Huan Zhang · Shiqi Wang · Kaidi Xu · Linyi Li · Bo Li · Suman Jana · Cho-Jui Hsieh · J. Zico Kolter -
2022 Poster: No Free Lunch from Deep Learning in Neuroscience: A Case Study through Models of the Entorhinal-Hippocampal Circuit »
Rylan Schaeffer · Mikail Khona · Ila Fiete -
2022 Poster: OpenOOD: Benchmarking Generalized Out-of-Distribution Detection »
Jingkang Yang · Pengyun Wang · Dejian Zou · Zitang Zhou · Kunyuan Ding · WENXUAN PENG · Haoqi Wang · Guangyao Chen · Bo Li · Yiyou Sun · Xuefeng Du · Kaiyang Zhou · Wayne Zhang · Dan Hendrycks · Yixuan Li · Ziwei Liu -
2021 : Live panel: Perspectives on ImageNet. »
Dawn Song · Ross Wightman · Dan Hendrycks -
2021 : Using ImageNet to Measure Robustness and Uncertainty »
Dawn Song · Dan Hendrycks -
2021 : Poster/Paper Spotlights »
Ezgi Korkmaz · Marianna Ganapini · Ruiqi He · Rylan Schaeffer · Kevin O'Neill -
2021 : Career and Life: Panel Discussion - Bo Li, Adriana Romero-Soriano, Devi Parikh, and Emily Denton »
Remi Denton · Devi Parikh · Bo Li · Adriana Romero -
2021 : Live Q&A with Bo Li »
Bo Li -
2021 : Invited talk – Trustworthy Machine Learning via Logic Inference, Bo Li »
Bo Li -
2021 : Adversarial GLUE: A Multi-Task Benchmark for Robustness Evaluation of Language Models »
Boxin Wang · Chejian Xu · Shuohang Wang · Zhe Gan · Yu Cheng · Jianfeng Gao · Ahmed Awadallah · Bo Li -
2021 Poster: G-PATE: Scalable Differentially Private Data Generator via Private Aggregation of Teacher Discriminators »
Yunhui Long · Boxin Wang · Zhuolin Yang · Bhavya Kailkhura · Aston Zhang · Carl Gunter · Bo Li -
2021 Poster: Anti-Backdoor Learning: Training Clean Models on Poisoned Data »
Yige Li · Xixiang Lyu · Nodens Koren · Lingjuan Lyu · Bo Li · Xingjun Ma -
2021 Poster: Adversarial Attack Generation Empowered by Min-Max Optimization »
Jingkang Wang · Tianyun Zhang · Sijia Liu · Pin-Yu Chen · Jiacen Xu · Makan Fardad · Bo Li -
2021 Poster: Chasing Sparsity in Vision Transformers: An End-to-End Exploration »
Tianlong Chen · Yu Cheng · Zhe Gan · Lu Yuan · Lei Zhang · Zhangyang Wang -
2021 Poster: Why Spectral Normalization Stabilizes GANs: Analysis and Improvements »
Zinan Lin · Vyas Sekar · Giulia Fanti -
2021 Poster: Data-Efficient GAN Training Beyond (Just) Augmentations: A Lottery Ticket Perspective »
Tianlong Chen · Yu Cheng · Zhe Gan · Jingjing Liu · Zhangyang Wang -
2021 : Reconnaissance Blind Chess + Q&A »
Ryan Gardner · Gino Perrotta · Corey Lowman · Casey Richardson · Andrew Newman · Jared Markowitz · Nathan Drenkow · Bart Paulhamus · Ashley J Llorens · Todd Neller · Raman Arora · Bo Li · Mykel J Kochenderfer -
2021 Poster: The Elastic Lottery Ticket Hypothesis »
Xiaohan Chen · Yu Cheng · Shuohang Wang · Zhe Gan · Jingjing Liu · Zhangyang Wang -
2021 : VisDA21: Visual Domain Adaptation + Q&A »
Kate Saenko · Kuniaki Saito · Donghyun Kim · Samarth Mishra · Ben Usman · Piotr Teterwak · Dina Bashkirova · Dan Hendrycks -
2021 Poster: TRS: Transferability Reduced Ensemble via Promoting Gradient Diversity and Model Smoothness »
Zhuolin Yang · Linyi Li · Xiaojun Xu · Shiliang Zuo · Qian Chen · Pan Zhou · Benjamin Rubinstein · Ce Zhang · Bo Li -
2020 Workshop: Workshop on Dataset Curation and Security »
Nathalie Baracaldo · Yonatan Bisk · Avrim Blum · Michael Curry · John Dickerson · Micah Goldblum · Tom Goldstein · Bo Li · Avi Schwarzschild -
2020 Poster: Reverse-engineering recurrent neural network solutions to a hierarchical inference task for mice »
Rylan Schaeffer · Mikail Khona · Leenoy Meshulam · Brain Laboratory International · Ila Fiete -
2020 Poster: Robust Deep Reinforcement Learning against Adversarial Perturbations on State Observations »
Huan Zhang · Hongge Chen · Chaowei Xiao · Bo Li · Mingyan Liu · Duane Boning · Cho-Jui Hsieh -
2020 Spotlight: Robust Deep Reinforcement Learning against Adversarial Perturbations on State Observations »
Huan Zhang · Hongge Chen · Chaowei Xiao · Bo Li · Mingyan Liu · Duane Boning · Cho-Jui Hsieh -
2020 Poster: On Convergence of Nearest Neighbor Classifiers over Feature Transformations »
Luka Rimanic · Cedric Renggli · Bo Li · Ce Zhang -
2019 Poster: Using Self-Supervised Learning Can Improve Model Robustness and Uncertainty »
Dan Hendrycks · Mantas Mazeika · Saurav Kadavath · Dawn Song -
2018 Poster: Using Trusted Data to Train Deep Networks on Labels Corrupted by Severe Noise »
Dan Hendrycks · Mantas Mazeika · Duncan Wilson · Kevin Gimpel -
2018 Poster: Dialog-based Interactive Image Retrieval »
Xiaoxiao Guo · Hui Wu · Yu Cheng · Steven Rennie · Gerald Tesauro · Rogerio Feris -
2018 Poster: Robustness of conditional GANs to noisy labels »
Kiran Thekumparampil · Ashish Khetan · Zinan Lin · Sewoong Oh -
2018 Spotlight: Robustness of conditional GANs to noisy labels »
Kiran Thekumparampil · Ashish Khetan · Zinan Lin · Sewoong Oh -
2018 Poster: PacGAN: The power of two samples in generative adversarial networks »
Zinan Lin · Ashish Khetan · Giulia Fanti · Sewoong Oh -
2016 Poster: Latent Attention For If-Then Program Synthesis »
Chang Liu · Xinyun Chen · Richard Shin · Mingcheng Chen · Dawn Song