
Statistical Inference with M-Estimators on Adaptively Collected Data
Kelly Zhang · Lucas Janson · Susan Murphy

Tue Dec 07 08:30 AM -- 10:00 AM (PST)

Bandit algorithms are increasingly used in real-world sequential decision-making problems. With this comes a growing desire to use the resulting datasets to answer scientific questions such as: Did one type of ad lead to more purchases? In which contexts is a mobile health intervention effective? However, classical statistical approaches fail to provide valid confidence intervals when applied to data collected with bandit algorithms. Alternative methods have recently been developed for simple models (e.g., comparison of means). Yet there is a lack of general methods for conducting statistical inference using more complex models on data collected with (contextual) bandit algorithms; for example, current methods cannot be used for valid inference on parameters in a logistic regression model for a binary reward. In this work, we develop theory justifying the use of M-estimators, which include estimators based on empirical risk minimization as well as maximum likelihood, on data collected with adaptive algorithms, including (contextual) bandit algorithms. Specifically, we show that M-estimators, modified with particular adaptive weights, can be used to construct asymptotically valid confidence regions for a variety of inferential targets.
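To make the idea of an adaptively weighted M-estimator concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say which weights are used, so the square-root importance weights relative to a fixed uniform reference policy are an assumption, and the function names (`simulate_epsilon_greedy`, `weighted_arm_means`), the epsilon-greedy data collection, and the Gaussian rewards are all hypothetical choices. The M-estimator here is weighted least squares for per-arm mean rewards, whose solution is a weighted average within each arm. The sketch only computes point estimates and does not construct confidence regions.

```python
import numpy as np


def simulate_epsilon_greedy(T, true_means, epsilon=0.1, seed=0):
    """Collect data with a simple epsilon-greedy bandit (hypothetical setup).

    Returns the chosen actions, the observed rewards, and the probability
    with which each chosen action was selected at the time it was chosen.
    """
    rng = np.random.default_rng(seed)
    K = len(true_means)
    actions = np.zeros(T, dtype=int)
    rewards = np.zeros(T)
    probs = np.zeros(T)
    sums = np.zeros(K)
    counts = np.zeros(K)
    for t in range(T):
        if counts.min() == 0:
            # Until every arm has been tried, choose uniformly at random.
            pi = np.full(K, 1.0 / K)
        else:
            # Otherwise put most of the mass on the empirically best arm.
            pi = np.full(K, epsilon / K)
            pi[np.argmax(sums / counts)] += 1.0 - epsilon
        a = rng.choice(K, p=pi)
        r = rng.normal(true_means[a], 1.0)
        actions[t], rewards[t], probs[t] = a, r, pi[a]
        sums[a] += r
        counts[a] += 1
    return actions, rewards, probs


def weighted_arm_means(actions, rewards, probs, K, stable_prob=None):
    """Weighted least-squares estimate of each arm's mean reward.

    Assumed weights: w_t = sqrt(stable_prob / probs[t]), where stable_prob is
    the probability a fixed reference policy assigns to the chosen action
    (uniform, 1/K, by default). Minimizing sum_t w_t * (y_t - theta[a_t])**2
    gives a weighted average of rewards within each arm. Every arm must
    appear at least once in `actions`.
    """
    if stable_prob is None:
        stable_prob = 1.0 / K
    w = np.sqrt(stable_prob / np.asarray(probs, dtype=float))
    actions = np.asarray(actions)
    rewards = np.asarray(rewards, dtype=float)
    est = np.array([
        np.sum(w[actions == k] * rewards[actions == k]) / np.sum(w[actions == k])
        for k in range(K)
    ])
    return est, w
```

For example, `weighted_arm_means(*simulate_epsilon_greedy(2000, [0.0, 0.5]), K=2)` returns the weighted estimates and the weights for a simulated two-armed run. When every action is chosen with the reference probability, all weights equal 1 and the estimator reduces to the ordinary per-arm sample mean.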

Author Information

Kelly Zhang (Harvard University)
Lucas Janson (Harvard University)
Susan Murphy (Harvard University)
Susan Murphy

Susan A. Murphy is Professor of Statistics and Computer Science at Harvard University. Her research focuses on improving sequential decision making in health, in particular the development of online, real-time reinforcement learning algorithms for use in personalized digital health. She is a member of the US National Academy of Sciences and of the US National Academy of Medicine. In 2013 she was awarded a MacArthur Fellowship for her work on experimental designs to inform sequential decision making. She is a Fellow of the College on Problems in Drug Dependence, Past-President of the Institute of Mathematical Statistics, and a former editor of the Annals of Statistics.
