Timezone: »
Poster
Adversarial Attacks on Stochastic Bandits
Kwang-Sung Jun · Lihong Li · Yuzhe Ma · Jerry Zhu
We study adversarial attacks that manipulate the reward signals to control the actions chosen by a stochastic multi-armed bandit algorithm. We propose the first attack against two popular bandit algorithms: $\epsilon$-greedy and UCB, \emph{without} knowledge of the mean rewards. The attacker is able to spend only logarithmic effort, multiplied by a problem-specific parameter that becomes smaller as the bandit problem gets easier to attack. The result means the attacker can easily hijack the behavior of the bandit algorithm to promote or obstruct certain actions, say, a particular medical treatment. As bandits are seeing increasingly wide use in practice, our study exposes a significant security threat.
Author Information
Kwang-Sung Jun (UW-Madison)
Lihong Li (Google Brain)
Yuzhe Ma (University of Wisconsin-Madison)
Jerry Zhu (University of Wisconsin-Madison)
More from the Same Authors
-
2021 : Game Redesign in No-regret Game Playing »
Yuzhe Ma · Young Wu · Jerry Zhu -
2021 : Reward Poisoning in Reinforcement Learning: Attacks Against Unknown Learners in Unknown Environments »
Amin Rakhsha · Xuezhou Zhang · Jerry Zhu · Adish Singla -
2021 : Game Redesign in No-regret Game Playing »
Yuzhe Ma · Young Wu · Jerry Zhu -
2021 : Reward Poisoning in Reinforcement Learning: Attacks Against Unknown Learners in Unknown Environments »
Amin Rakhsha · Xuezhou Zhang · Jerry Zhu · Adish Singla -
2023 Poster: Mechanism Design for Collaborative Normal Mean Estimation »
Yiding Chen · Jerry Zhu · Kirthevasan Kandasamy -
2023 Poster: Dream the Impossible: Outlier Imagination with Diffusion Models »
Xuefeng Du · Yiyou Sun · Jerry Zhu · Yixuan Li -
2022 Workshop: Reinforcement Learning for Real Life (RL4RealLife) Workshop »
Yuxi Li · Emma Brunskill · MINMIN CHEN · Omer Gottesman · Lihong Li · Yao Liu · Zhiwei Tony Qin · Matthew Taylor -
2022 Poster: Provable Defense against Backdoor Policies in Reinforcement Learning »
Shubham Bharti · Xuezhou Zhang · Adish Singla · Jerry Zhu -
2020 : Panel »
Emma Brunskill · Nan Jiang · Nando de Freitas · Finale Doshi-Velez · Sergey Levine · John Langford · Lihong Li · George Tucker · Rishabh Agarwal · Aviral Kumar -
2020 Poster: Escaping the Gravitational Pull of Softmax »
Jincheng Mei · Chenjun Xiao · Bo Dai · Lihong Li · Csaba Szepesvari · Dale Schuurmans -
2020 Oral: Escaping the Gravitational Pull of Softmax »
Jincheng Mei · Chenjun Xiao · Bo Dai · Lihong Li · Csaba Szepesvari · Dale Schuurmans -
2020 Poster: Task-agnostic Exploration in Reinforcement Learning »
Xuezhou Zhang · Yuzhe Ma · Adish Singla -
2020 Poster: CoinDICE: Off-Policy Confidence Interval Estimation »
Bo Dai · Ofir Nachum · Yinlam Chow · Lihong Li · Csaba Szepesvari · Dale Schuurmans -
2020 Poster: Off-Policy Evaluation via the Regularized Lagrangian »
Mengjiao (Sherry) Yang · Ofir Nachum · Bo Dai · Lihong Li · Dale Schuurmans -
2020 Spotlight: CoinDICE: Off-Policy Confidence Interval Estimation »
Bo Dai · Ofir Nachum · Yinlam Chow · Lihong Li · Csaba Szepesvari · Dale Schuurmans -
2019 : Closing Remarks »
Bo Dai · Niao He · Nicolas Le Roux · Lihong Li · Dale Schuurmans · Martha White -
2019 : Poster and Coffee Break 2 »
Karol Hausman · Kefan Dong · Ken Goldberg · Lihong Li · Lin Yang · Lingxiao Wang · Lior Shani · Liwei Wang · Loren Amdahl-Culleton · Lucas Cassano · Marc Dymetman · Marc Bellemare · Marcin Tomczak · Margarita Castro · Marius Kloft · Marius-Constantin Dinu · Markus Holzleitner · Martha White · Mengdi Wang · Michael Jordan · Mihailo Jovanovic · Ming Yu · Minshuo Chen · Moonkyung Ryu · Muhammad Zaheer · Naman Agarwal · Nan Jiang · Niao He · Nikolaus Yasui · Nikos Karampatziakis · Nino Vieillard · Ofir Nachum · Olivier Pietquin · Ozan Sener · Pan Xu · Parameswaran Kamalaruban · Paul Mineiro · Paul Rolland · Philip Amortila · Pierre-Luc Bacon · Prakash Panangaden · Qi Cai · Qiang Liu · Quanquan Gu · Raihan Seraj · Richard Sutton · Rick Valenzano · Robert Dadashi · Rodrigo Toro Icarte · Roshan Shariff · Roy Fox · Ruosong Wang · Saeed Ghadimi · Samuel Sokota · Sean Sinclair · Sepp Hochreiter · Sergey Levine · Sergio Valcarcel Macua · Sham Kakade · Shangtong Zhang · Sheila McIlraith · Shie Mannor · Shimon Whiteson · Shuai Li · Shuang Qiu · Wai Lok Li · Siddhartha Banerjee · Sitao Luan · Tamer Basar · Thinh Doan · Tianhe Yu · Tianyi Liu · Tom Zahavy · Toryn Klassen · Tuo Zhao · Vicenç Gómez · Vincent Liu · Volkan Cevher · Wesley Suttle · Xiao-Wen Chang · Xiaohan Wei · Xiaotong Liu · Xingguo Li · Xinyi Chen · Xingyou Song · Yao Liu · YiDing Jiang · Yihao Feng · Yilun Du · Yinlam Chow · Yinyu Ye · Yishay Mansour · · Yonathan Efroni · Yongxin Chen · Yuanhao Wang · Bo Dai · Chen-Yu Wei · Harsh Shrivastava · Hongyang Zhang · Qinqing Zheng · SIDDHARTHA SATPATHI · Xueqing Liu · Andreu Vall -
2019 : Poster Spotlight 2 »
Aaron Sidford · Mengdi Wang · Lin Yang · Yinyu Ye · Zuyue Fu · Zhuoran Yang · Yongxin Chen · Zhaoran Wang · Ofir Nachum · Bo Dai · Ilya Kostrikov · Dale Schuurmans · Ziyang Tang · Yihao Feng · Lihong Li · Denny Zhou · Qiang Liu · Rodrigo Toro Icarte · Ethan Waldie · Toryn Klassen · Rick Valenzano · Margarita Castro · Simon Du · Sham Kakade · Ruosong Wang · Minshuo Chen · Tianyi Liu · Xingguo Li · Zhaoran Wang · Tuo Zhao · Philip Amortila · Doina Precup · Prakash Panangaden · Marc Bellemare -
2019 Workshop: The Optimization Foundations of Reinforcement Learning »
Bo Dai · Niao He · Nicolas Le Roux · Lihong Li · Dale Schuurmans · Martha White -
2019 : Opening Remarks »
Bo Dai · Niao He · Nicolas Le Roux · Lihong Li · Dale Schuurmans · Martha White -
2019 Poster: Policy Poisoning in Batch Reinforcement Learning and Control »
Yuzhe Ma · Xuezhou Zhang · Wen Sun · Jerry Zhu -
2019 Poster: A Kernel Loss for Solving the Bellman Equation »
Yihao Feng · Lihong Li · Qiang Liu -
2019 Poster: Preference-Based Batch and Sequential Teaching: Towards a Unified View of Models »
Farnam Mansouri · Yuxin Chen · Ara Vartanian · Jerry Zhu · Adish Singla -
2019 Poster: A Unified Framework for Data Poisoning Attack to Graph-based Semi-supervised Learning »
Xuanqing Liu · Si Si · Jerry Zhu · Yang Li · Cho-Jui Hsieh -
2019 Poster: DualDICE: Behavior-Agnostic Estimation of Discounted Stationary Distribution Corrections »
Ofir Nachum · Yinlam Chow · Bo Dai · Lihong Li -
2019 Spotlight: DualDICE: Behavior-Agnostic Estimation of Discounted Stationary Distribution Corrections »
Ofir Nachum · Yinlam Chow · Bo Dai · Lihong Li -
2018 : Hierarchical reinforcement learning for composite-task dialogues »
Lihong Li -
2018 Poster: Breaking the Curse of Horizon: Infinite-Horizon Off-Policy Estimation »
Qiang Liu · Lihong Li · Ziyang Tang · Denny Zhou -
2018 Spotlight: Breaking the Curse of Horizon: Infinite-Horizon Off-Policy Estimation »
Qiang Liu · Lihong Li · Ziyang Tang · Denny Zhou -
2017 Workshop: Teaching Machines, Robots, and Humans »
Maya Cakmak · Anna Rafferty · Adish Singla · Jerry Zhu · Sandra Zilles -
2017 Workshop: From 'What If?' To 'What Next?' : Causal Inference and Machine Learning for Intelligent Decision Making »
Ricardo Silva · Panagiotis Toulis · John Shawe-Taylor · Alexander Volfovsky · Thorsten Joachims · Lihong Li · Nathan Kallus · Adith Swaminathan -
2017 Poster: Q-LDA: Uncovering Latent Patterns in Text-based Sequential Decision Processes »
Jianshu Chen · Chong Wang · Lin Xiao · Ji He · Lihong Li · Li Deng -
2016 : Optimal Teaching for Online Perceptrons »
Xuezhou Zhang · Jerry Zhu -
2016 Workshop: The Future of Interactive Machine Learning »
Kory Mathewson @korymath · Kaushik Subramanian · Mark Ho · Robert Loftin · Joseph L Austerweil · Anna Harutyunyan · Doina Precup · Layla El Asri · Matthew Gombolay · Jerry Zhu · Sonia Chernova · Charles Isbell · Patrick M Pilarski · Weng-Keen Wong · Manuela Veloso · Julie A Shah · Matthew Taylor · Brenna Argall · Michael Littman -
2016 Poster: Active Learning with Oracle Epiphany »
Tzu-Kuo Huang · Lihong Li · Ara Vartanian · Saleema Amershi · Jerry Zhu -
2015 Poster: Human Memory Search as Initial-Visit Emitting Random Walk »
Kwang-Sung Jun · Jerry Zhu · Timothy T Rogers · Zhuoran Yang · Ming Yuan -
2014 Poster: Optimal Teaching for Limited-Capacity Human Learners »
Kaustubh R Patil · Jerry Zhu · Łukasz Kopeć · Bradley C Love -
2014 Spotlight: Optimal Teaching for Limited-Capacity Human Learners »
Kaustubh R Patil · Jerry Zhu · Łukasz Kopeć · Bradley C Love -
2013 Poster: Machine Teaching for Bayesian Learners in the Exponential Family »
Jerry Zhu -
2011 Poster: How Do Humans Teach: On Curriculum Learning and Teaching Dimension »
Faisal Khan · Jerry Zhu · Bilge Mutlu -
2011 Poster: An Empirical Evaluation of Thompson Sampling »
Olivier Chapelle · Lihong Li -
2011 Poster: Learning Higher-Order Graph Structure with Features by Structure Penalty »
Shilin Ding · Grace Wahba · Jerry Zhu -
2010 Oral: Humans Learn Using Manifolds, Reluctantly »
Bryan R Gibson · Jerry Zhu · Timothy T Rogers · Chuck Kalish · Joseph Harrison -
2010 Spotlight: Learning from Logged Implicit Exploration Data »
Alex Strehl · Lihong Li · John Langford · Sham M Kakade -
2010 Poster: Learning from Logged Implicit Exploration Data »
Alexander L Strehl · John Langford · Lihong Li · Sham M Kakade -
2010 Poster: Humans Learn Using Manifolds, Reluctantly »
Bryan R Gibson · Jerry Zhu · Timothy T Rogers · Chuck Kalish · Joseph Harrison -
2010 Poster: Transduction with Matrix Completion: Three Birds with One Stone »
Andrew B Goldberg · Jerry Zhu · Benjamin Recht · Junming Sui · Rob Nowak -
2010 Session: Spotlights Session 1 »
Jerry Zhu -
2010 Poster: Parallelized Stochastic Gradient Descent »
Martin A Zinkevich · Markus Weimer · Alexander Smola · Lihong Li -
2009 Poster: Human Rademacher Complexity »
Jerry Zhu · Timothy T Rogers · Bryan R Gibson -
2008 Workshop: Machine learning meets human learning »
Nathaniel D Daw · Tom Griffiths · Josh Tenenbaum · Jerry Zhu -
2008 Poster: Human Active Learning »
Jerry Zhu · Rui M Castro · Timothy T Rogers · Rob Nowak · Ruichen Qian · Chuck Kalish -
2008 Poster: Sparse Online Learning via Truncated Gradient »
John Langford · Lihong Li · Tong Zhang -
2008 Poster: Unlabeled data: Now it helps, now it doesn't »
Aarti Singh · Rob Nowak · Jerry Zhu -
2008 Spotlight: Sparse Online Learning via Truncated Gradient »
John Langford · Lihong Li · Tong Zhang -
2008 Oral: Unlabeled data: Now it helps, now it doesn't »
Aarti Singh · Rob Nowak · Jerry Zhu