Poster
First is Better Than Last for Language Data Influence
Chih-Kuan Yeh · Ankur Taly · Mukund Sundararajan · Frederick Liu · Pradeep Ravikumar
The ability to identify influential training examples enables us to debug training data and explain model behavior. Existing techniques to do so are based on the flow of training data influence through the model parameters. For large models in NLP applications, it is often computationally infeasible to study this flow through all model parameters, so techniques usually pick the last layer of weights. However, we observe that since the activation connected to the last layer of weights contains "shared logic", the data influence calculated via the last layer weights is prone to a "cancellation effect", where the data influences of different examples have large magnitudes that contradict each other. The cancellation effect lowers the discriminative power of the influence score, and deleting influential examples according to this measure often does not change the model's behavior by much. To mitigate this, we propose a technique called TracIn-WE that modifies a method called TracIn to operate on the word embedding layer instead of the last layer, where the cancellation effect is less severe. One potential concern is that influence based on the word embedding layer may not encode sufficient high-level information. However, we find that gradients (unlike embeddings) do not suffer from this, possibly because they chain through higher layers. We show that TracIn-WE significantly outperforms other data influence methods applied on the last layer on the case deletion evaluation on three language classification tasks for different models. In addition, TracIn-WE can produce scores not just at the level of the overall training input, but also at the level of words within the training input, a further aid in debugging.
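The idea of restricting a TracIn-style score to the word embedding layer can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a toy mean-pooled bag-of-words logistic classifier, a single hand-written checkpoint, and the checkpoint form of TracIn (learning rate times the dot product of training and test loss gradients, summed over checkpoints). The names `embedding_grads` and `tracin_we`, the sign convention, and all numbers are hypothetical and chosen only for illustration. Because the loss gradient with respect to an embedding row is zero for any word absent from the input, only words shared by the two examples contribute, which is what yields a per-word breakdown.

```python
import math
from collections import Counter

def embedding_grads(tokens, label, emb, w, b):
    """Gradient of the binary cross-entropy loss w.r.t. each embedding row
    used by `tokens`, for a toy mean-pooled logistic classifier (assumed model)."""
    n = len(tokens)
    dim = len(w)
    # Mean-pooled sentence representation.
    h = [sum(emb[t][k] for t in tokens) / n for k in range(dim)]
    logit = sum(w[k] * h[k] for k in range(dim)) + b
    p = 1.0 / (1.0 + math.exp(-logit))
    err = p - label  # d(loss)/d(logit)
    # d(loss)/d(emb[t]) = err * w * count(t) / n; rows of unused words get no entry.
    return {t: [err * w[k] * c / n for k in range(dim)]
            for t, c in Counter(tokens).items()}

def tracin_we(train, test, checkpoints):
    """Return (total score, per-word scores), summed over checkpoints.
    Each checkpoint is a tuple (learning_rate, emb, w, b)."""
    per_word = {}
    for lr, emb, w, b in checkpoints:
        g_train = embedding_grads(train[0], train[1], emb, w, b)
        g_test = embedding_grads(test[0], test[1], emb, w, b)
        # Only words present in both examples have two nonzero gradients.
        for t in g_train.keys() & g_test.keys():
            dot = sum(a * c for a, c in zip(g_train[t], g_test[t]))
            per_word[t] = per_word.get(t, 0.0) + lr * dot
    return sum(per_word.values()), per_word

# Hypothetical toy parameters standing in for one saved checkpoint.
emb = {"good": [1.0, 0.0], "bad": [-1.0, 0.0],
       "movie": [0.0, 1.0], "film": [0.0, 0.5]}
checkpoints = [(0.1, emb, [1.0, 1.0], 0.0)]

# Opposite labels, sharing only the word "movie".
score, words = tracin_we((["good", "movie"], 1), (["bad", "movie"], 0), checkpoints)
# Same label, sharing only the word "good".
score_same, words_same = tracin_we((["good", "movie"], 1), (["good", "film"], 1), checkpoints)
print(score, words)
print(score_same, words_same)
```

In this sketch the total score is the sum of the per-word scores, so the same computation gives both the example-level and the word-level view. A real language model would need gradients from automatic differentiation and some way to handle example pairs with no words in common, neither of which this toy version addresses.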
Author Information
Chih-Kuan Yeh (Google Brain)
Ankur Taly (Google Brain)
Mukund Sundararajan (Google LLC)
Frederick Liu (Google)
Pradeep Ravikumar (Carnegie Mellon University)
More from the Same Authors
-
2022 : Domain-Adjusted Regression or: ERM May Already Learn Features Sufficient for Out-of-Distribution Generalization »
Elan Rosenfeld · Pradeep Ravikumar · Andrej Risteski -
2023 Poster: Learning Linear Causal Representations from Interventions under General Nonlinear Mixing »
Simon Buchholz · Goutham Rajendran · Elan Rosenfeld · Bryon Aragam · Bernhard Schölkopf · Pradeep Ravikumar -
2023 Poster: Learning with Explanation Constraints »
Rattana Pukdee · Dylan Sam · J. Zico Kolter · Maria-Florina Balcan · Pradeep Ravikumar -
2023 Poster: Sample based Explanations via Generalized Representers »
Che-Ping Tsai · Chih-Kuan Yeh · Pradeep Ravikumar -
2023 Poster: Dense-Exponential Random Features: Sharp Positive Estimators of the Gaussian Kernel »
Valerii Likhosherstov · Krzysztof M Choromanski · Kumar Avinava Dubey · Frederick Liu · Tamas Sarlos · Adrian Weller -
2023 Poster: Responsible AI (RAI) Games and Ensembles »
Yash Gupta · Runtian Zhai · Arun Suggala · Pradeep Ravikumar -
2023 Poster: Identifying Causal Mechanism Shifts among Nonlinear Additive Noise Models »
Tianyu Chen · Kevin Bello · Bryon Aragam · Pradeep Ravikumar -
2023 Poster: Order Matters in the Presence of Dataset Imbalance for Multilingual Learning »
Dami Choi · Derrick Xin · Justin Gilmer · Hamid Dadkhahi · Ankush Garg · Orhan Firat · Chih-Kuan Yeh · Andrew Dai · Behrooz Ghorbani -
2023 Poster: Global Optimality in Bivariate Gradient-based DAG Learning »
Chang Deng · Kevin Bello · Pradeep Ravikumar · Bryon Aragam -
2023 Oral: Learning Linear Causal Representations from Interventions under General Nonlinear Mixing »
Simon Buchholz · Goutham Rajendran · Elan Rosenfeld · Bryon Aragam · Bernhard Schölkopf · Pradeep Ravikumar -
2023 Poster: Fundamental Limits and Tradeoffs in Invariant Representation Learning »
Han Zhao · Chen Dan · Bryon Aragam · Tommi Jaakkola · Geoffrey Gordon · Pradeep Ravikumar -
2022 Spotlight: Identifiability of deep generative models without auxiliary information »
Bohdan Kivva · Goutham Rajendran · Pradeep Ravikumar · Bryon Aragam -
2022 : Panel Discussion »
Behnam Neyshabur · David Sontag · Pradeep Ravikumar · Erin Hartman -
2022 Workshop: Human in the Loop Learning (HiLL) Workshop at NeurIPS 2022 »
Shanghang Zhang · Hao Dong · Wei Pan · Pradeep Ravikumar · Vittorio Ferrari · Fisher Yu · Xin Wang · Zihan Ding -
2022 Poster: DAGMA: Learning DAGs via M-matrices and a Log-Determinant Acyclicity Characterization »
Kevin Bello · Bryon Aragam · Pradeep Ravikumar -
2022 Poster: Chefs' Random Tables: Non-Trigonometric Random Features »
Valerii Likhosherstov · Krzysztof M Choromanski · Kumar Avinava Dubey · Frederick Liu · Tamas Sarlos · Adrian Weller -
2022 Poster: Identifiability of deep generative models without auxiliary information »
Bohdan Kivva · Goutham Rajendran · Pradeep Ravikumar · Bryon Aragam -
2022 Poster: Masked Prediction: A Parameter Identifiability View »
Bingbin Liu · Daniel Hsu · Pradeep Ravikumar · Andrej Risteski -
2021 Poster: Learning latent causal graphs via mixture oracles »
Bohdan Kivva · Goutham Rajendran · Pradeep Ravikumar · Bryon Aragam -
2021 Poster: Boosted CVaR Classification »
Runtian Zhai · Chen Dan · Arun Suggala · J. Zico Kolter · Pradeep Ravikumar -
2021 Poster: Detecting Errors and Estimating Accuracy on Unlabeled Data with Self-training Ensembles »
Jiefeng Chen · Frederick Liu · Besim Avci · Xi Wu · Yingyu Liang · Somesh Jha -
2021 Poster: When Is Generalizable Reinforcement Learning Tractable? »
Dhruv Malik · Yuanzhi Li · Pradeep Ravikumar -
2020 Poster: Estimating Training Data Influence by Tracing Gradient Descent »
Garima Pruthi · Frederick Liu · Satyen Kale · Mukund Sundararajan -
2020 Spotlight: Estimating Training Data Influence by Tracing Gradient Descent »
Garima Pruthi · Frederick Liu · Satyen Kale · Mukund Sundararajan -
2020 Poster: On Learning Ising Models under Huber's Contamination Model »
Adarsh Prasad · Vishwak Srinivasan · Sivaraman Balakrishnan · Pradeep Ravikumar -
2020 Poster: On Completeness-aware Concept-Based Explanations in Deep Neural Networks »
Chih-Kuan Yeh · Been Kim · Sercan Arik · Chun-Liang Li · Tomas Pfister · Pradeep Ravikumar -
2020 Poster: Generalized Boosting »
Arun Suggala · Bingbin Liu · Pradeep Ravikumar -
2019 Poster: On the (In)fidelity and Sensitivity of Explanations »
Chih-Kuan Yeh · Cheng-Yu Hsieh · Arun Suggala · David Inouye · Pradeep Ravikumar -
2019 Poster: On Human-Aligned Risk Minimization »
Liu Leqi · Adarsh Prasad · Pradeep Ravikumar -
2019 Poster: Optimal Analysis of Subset-Selection Based L_p Low-Rank Approximation »
Chen Dan · Hong Wang · Hongyang Zhang · Yuchen Zhou · Pradeep Ravikumar -
2019 Poster: Game Design for Eliciting Distinguishable Behavior »
Fan Yang · Liu Leqi · Yifan Wu · Zachary Lipton · Pradeep Ravikumar · Tom M Mitchell · William Cohen -
2018 Poster: The Sample Complexity of Semi-Supervised Learning with Nonparametric Mixture Models »
Chen Dan · Liu Leqi · Bryon Aragam · Pradeep Ravikumar · Eric Xing -
2018 Poster: Connecting Optimization and Regularization Paths »
Arun Suggala · Adarsh Prasad · Pradeep Ravikumar -
2018 Poster: DAGs with NO TEARS: Continuous Optimization for Structure Learning »
Xun Zheng · Bryon Aragam · Pradeep Ravikumar · Eric Xing -
2018 Spotlight: DAGs with NO TEARS: Continuous Optimization for Structure Learning »
Xun Zheng · Bryon Aragam · Pradeep Ravikumar · Eric Xing -
2018 Poster: MixLasso: Generalized Mixed Regression via Convex Atomic-Norm Regularization »
Ian En-Hsu Yen · Wei-Cheng Lee · Kai Zhong · Sung-En Chang · Pradeep Ravikumar · Shou-De Lin -
2018 Poster: Representer Point Selection for Explaining Deep Neural Networks »
Chih-Kuan Yeh · Joon Kim · Ian En-Hsu Yen · Pradeep Ravikumar -
2017 : Pradeep Ravikumar (CMU) on A Parallel Primal-Dual Sparse Method for Extreme Classification »
Pradeep Ravikumar -
2017 Poster: The Expxorcist: Nonparametric Graphical Models Via Conditional Exponential Densities »
Arun Suggala · Mladen Kolar · Pradeep Ravikumar -
2017 Poster: On Separability of Loss Functions, and Revisiting Discriminative Vs Generative Models »
Adarsh Prasad · Alexandru Niculescu-Mizil · Pradeep Ravikumar -
2017 Spotlight: On Separability of Loss Functions, and Revisiting Discriminative Vs Generative Models »
Adarsh Prasad · Alexandru Niculescu-Mizil · Pradeep Ravikumar -
2016 Poster: Dual Decomposed Learning with Factorwise Oracle for Structural SVM of Large Output Domain »
Ian En-Hsu Yen · Xiangru Huang · Kai Zhong · Ruohan Zhang · Pradeep Ravikumar · Inderjit Dhillon -
2015 Poster: Fast Classification Rates for High-dimensional Gaussian Generative Models »
Tianyang Li · Adarsh Prasad · Pradeep Ravikumar -
2015 Poster: Collaborative Filtering with Graph Information: Consistency and Scalable Methods »
Nikhil Rao · Hsiang-Fu Yu · Pradeep Ravikumar · Inderjit Dhillon -
2015 Spotlight: Collaborative Filtering with Graph Information: Consistency and Scalable Methods »
Nikhil Rao · Hsiang-Fu Yu · Pradeep Ravikumar · Inderjit Dhillon -
2015 Poster: Beyond Sub-Gaussian Measurements: High-Dimensional Structured Estimation with Sub-Exponential Designs »
Vidyashankar Sivakumar · Arindam Banerjee · Pradeep Ravikumar -
2015 Poster: Sparse Linear Programming via Primal and Dual Augmented Coordinate Descent »
Ian En-Hsu Yen · Kai Zhong · Cho-Jui Hsieh · Pradeep Ravikumar · Inderjit Dhillon -
2015 Poster: Fixed-Length Poisson MRF: Adding Dependencies to the Multinomial »
David I Inouye · Pradeep Ravikumar · Inderjit Dhillon -
2015 Poster: Consistent Multilabel Classification »
Oluwasanmi Koyejo · Nagarajan Natarajan · Pradeep Ravikumar · Inderjit Dhillon -
2015 Poster: Closed-form Estimators for High-dimensional Generalized Linear Models »
Eunho Yang · Aurelie Lozano · Pradeep Ravikumar -
2015 Spotlight: Closed-form Estimators for High-dimensional Generalized Linear Models »
Eunho Yang · Aurelie Lozano · Pradeep Ravikumar -
2014 Poster: QUIC & DIRTY: A Quadratic Approximation Approach for Dirty Statistical Models »
Cho-Jui Hsieh · Inderjit Dhillon · Pradeep Ravikumar · Stephen Becker · Peder A Olsen -
2014 Poster: Consistent Binary Classification with Generalized Performance Metrics »
Sanmi Koyejo · Nagarajan Natarajan · Pradeep Ravikumar · Inderjit Dhillon -
2014 Poster: On the Information Theoretic Limits of Learning Ising Models »
Rashish Tandon · Karthikeyan Shanmugam · Pradeep Ravikumar · Alex Dimakis -
2014 Poster: Sparse Random Feature Algorithm as Coordinate Descent in Hilbert Space »
Ian En-Hsu Yen · Ting-Wei Lin · Shou-De Lin · Pradeep Ravikumar · Inderjit Dhillon -
2014 Spotlight: Consistent Binary Classification with Generalized Performance Metrics »
Sanmi Koyejo · Nagarajan Natarajan · Pradeep Ravikumar · Inderjit Dhillon -
2014 Poster: Proximal Quasi-Newton for Computationally Intensive L1-regularized M-estimators »
Kai Zhong · Ian En-Hsu Yen · Inderjit Dhillon · Pradeep Ravikumar -
2014 Poster: A Representation Theory for Ranking Functions »
Harsh H Pareek · Pradeep Ravikumar -
2014 Poster: Capturing Semantically Meaningful Word Dependencies with an Admixture of Poisson MRFs »
David I Inouye · Pradeep Ravikumar · Inderjit Dhillon -
2014 Poster: Constant Nullspace Strong Convexity and Fast Convergence of Proximal Methods under High-Dimensional Settings »
Ian En-Hsu Yen · Cho-Jui Hsieh · Pradeep Ravikumar · Inderjit Dhillon -
2014 Poster: Elementary Estimators for Graphical Models »
Eunho Yang · Aurelie Lozano · Pradeep Ravikumar -
2013 Workshop: Discrete Optimization in Machine Learning: Connecting Theory and Practice »
Stefanie Jegelka · Andreas Krause · Pradeep Ravikumar · Kazuo Murota · Jeffrey A Bilmes · Yisong Yue · Michael Jordan -
2013 Poster: Conditional Random Fields via Univariate Exponential Families »
Eunho Yang · Pradeep Ravikumar · Genevera I Allen · Zhandong Liu -
2013 Poster: On Poisson Graphical Models »
Eunho Yang · Pradeep Ravikumar · Genevera I Allen · Zhandong Liu -
2013 Poster: BIG & QUIC: Sparse Inverse Covariance Estimation for a Million Variables »
Cho-Jui Hsieh · Matyas A Sustik · Inderjit Dhillon · Pradeep Ravikumar · Russell Poldrack -
2013 Oral: BIG & QUIC: Sparse Inverse Covariance Estimation for a Million Variables »
Cho-Jui Hsieh · Matyas A Sustik · Inderjit Dhillon · Pradeep Ravikumar · Russell Poldrack -
2013 Poster: Dirty Statistical Models »
Eunho Yang · Pradeep Ravikumar -
2013 Poster: Large Scale Distributed Sparse Precision Estimation »
Huahua Wang · Arindam Banerjee · Cho-Jui Hsieh · Pradeep Ravikumar · Inderjit Dhillon -
2013 Poster: Learning with Noisy Labels »
Nagarajan Natarajan · Inderjit Dhillon · Pradeep Ravikumar · Ambuj Tewari -
2012 Workshop: Discrete Optimization in Machine Learning (DISCML): Structure and Scalability »
Stefanie Jegelka · Andreas Krause · Jeffrey A Bilmes · Pradeep Ravikumar -
2012 Poster: Graphical Models via Generalized Linear Models »
Eunho Yang · Pradeep Ravikumar · Genevera I Allen · Zhandong Liu -
2012 Oral: Graphical Models via Generalized Linear Models »
Eunho Yang · Pradeep Ravikumar · Genevera I Allen · Zhandong Liu -
2012 Poster: A Divide-and-Conquer Method for Sparse Inverse Covariance Estimation »
Cho-Jui Hsieh · Inderjit Dhillon · Pradeep Ravikumar · Arindam Banerjee -
2011 Workshop: Discrete Optimization in Machine Learning (DISCML): Uncertainty, Generalization and Feedback »
Andreas Krause · Pradeep Ravikumar · Stefanie S Jegelka · Jeffrey A Bilmes -
2011 Poster: On Learning Discrete Graphical Models using Greedy Methods »
Ali Jalali · Christopher C Johnson · Pradeep Ravikumar -
2011 Spotlight: On Learning Discrete Graphical Models using Greedy Methods »
Ali Jalali · Christopher C Johnson · Pradeep Ravikumar -
2011 Poster: Greedy Algorithms for Structurally Constrained High Dimensional Problems »
Ambuj Tewari · Pradeep Ravikumar · Inderjit Dhillon -
2011 Poster: Sparse Inverse Covariance Matrix Estimation Using Quadratic Approximation »
Cho-Jui Hsieh · Matyas A Sustik · Inderjit Dhillon · Pradeep Ravikumar -
2011 Session: Oral Session 5 »
Pradeep Ravikumar -
2011 Poster: Nearest Neighbor based Greedy Coordinate Descent »
Inderjit Dhillon · Pradeep Ravikumar · Ambuj Tewari -
2010 Workshop: Discrete Optimization in Machine Learning: Structures, Algorithms and Applications »
Andreas Krause · Pradeep Ravikumar · Jeffrey A Bilmes · Stefanie Jegelka -
2010 Workshop: Robust Statistical Learning »
Pradeep Ravikumar · Constantine Caramanis · Sujay Sanghavi -
2010 Session: Oral Session 14 »
Pradeep Ravikumar -
2010 Oral: A Dirty Model for Multi-task Learning »
Ali Jalali · Pradeep Ravikumar · Sujay Sanghavi · Chao Ruan -
2010 Poster: A Dirty Model for Multi-task Learning »
Ali Jalali · Pradeep Ravikumar · Sujay Sanghavi · Chao Ruan -
2009 Workshop: Discrete Optimization in Machine Learning: Submodularity, Polyhedra and Sparsity »
Andreas Krause · Pradeep Ravikumar · Jeffrey A Bilmes -
2009 Poster: Information-theoretic lower bounds on the oracle complexity of convex optimization »
Alekh Agarwal · Peter Bartlett · Pradeep Ravikumar · Martin J Wainwright -
2009 Spotlight: Information-theoretic lower bounds on the oracle complexity of convex optimization »
Alekh Agarwal · Peter Bartlett · Pradeep Ravikumar · Martin J Wainwright -
2009 Poster: A unified framework for high-dimensional analysis of $M$-estimators with decomposable regularizers »
Sahand N Negahban · Pradeep Ravikumar · Martin J Wainwright · Bin Yu -
2009 Oral: A unified framework for high-dimensional analysis of $M$-estimators with decomposable regularizers »
Sahand N Negahban · Pradeep Ravikumar · Martin J Wainwright · Bin Yu -
2008 Poster: Nonparametric sparse hierarchical models describe V1 fMRI responses to natural images »
Pradeep Ravikumar · Vincent Vu · Bin Yu · Thomas Naselaris · Kendrick Kay · Jack Gallant -
2008 Spotlight: Nonparametric sparse hierarchical models describe V1 fMRI responses to natural images »
Pradeep Ravikumar · Vincent Vu · Bin Yu · Thomas Naselaris · Kendrick Kay · Jack Gallant -
2008 Poster: Model Selection in Gaussian Graphical Models: High-Dimensional Consistency of $\ell_1$-regularized MLE »
Pradeep Ravikumar · Garvesh Raskutti · Martin J Wainwright · Bin Yu -
2007 Poster: SpAM: Sparse Additive Models »
Pradeep Ravikumar · Han Liu · John Lafferty · Larry Wasserman -
2007 Spotlight: SpAM: Sparse Additive Models »
Pradeep Ravikumar · Han Liu · John Lafferty · Larry Wasserman -
2006 Poster: Inferring Graphical Model Structure using $\ell_1$-Regularized Pseudo-Likelihood »
Martin J Wainwright · Pradeep Ravikumar · John Lafferty -
2006 Spotlight: Inferring Graphical Model Structure using $\ell_1$-Regularized Pseudo-Likelihood »
Martin J Wainwright · Pradeep Ravikumar · John Lafferty