NeurIPS 2019 Papers

Chirality Nets for Human Pose Regression

Uncertainty on Asynchronous Time Event Prediction

Learning Nearest Neighbor Graphs from Noisy Distance Samples

Efficient Symmetric Norm Regression via Linear Sketching

Weighted Linear Bandits for Non-Stationary Environments

Individual Regret in Cooperative Nonstochastic Multi-Armed Bandits

A Generalized Algorithm for Multi-Objective Reinforcement Learning and Policy Adaptation

On the (In)fidelity and Sensitivity of Explanations

The Label Complexity of Active Learning from Observational Data

Stochastic Proximal Langevin Algorithm: Potential Splitting and Nonasymptotic Rates

Covariate-Powered Empirical Bayes Estimation

On Distributed Averaging for Stochastic k-PCA

Online Markov Decoding: Lower Bounds and Near-Optimal Approximation Algorithms

Exact inference in structured prediction

Learning to Propagate for Graph Meta-Learning

Learning to Perform Local Rewriting for Combinatorial Optimization

Generalization Bounds for Neural Networks via Approximate Description Length

Interpreting and improving natural-language processing (in machines) with natural language-processing (in the brain)

Towards Automatic Concept-based Explanations

Connections Between Mirror Descent, Thompson Sampling and the Information Ratio

Solving graph compression via optimal transport

Quality Aware Generative Adversarial Networks

Model-Based Reinforcement Learning with Adversarial Training for Online Recommendation

Precision-Recall Balanced Topic Modelling

Brain-Like Object Recognition with High-Performing Shallow Recurrent ANNs

Shallow RNN: Accurate Time-series Classification on Resource Constrained Devices

When to use parametric models in reinforcement learning?

Globally Optimal Learning for Structured Elliptical Losses

Learning Sparse Distributions using Iterative Hard Thresholding

Variational Bayesian Decision-making for Continuous Utilities

Face Reconstruction from Voice using Generative Adversarial Networks

Variational Bayes under Model Misspecification

Inherent Tradeoffs in Learning Fair Representations

Policy Learning for Fairness in Ranking

Blocking Bandits

Non-Asymptotic Gap-Dependent Regret Bounds for Tabular MDPs

Prediction of Spatial Point Processes: Regularized Method with Out-of-Sample Guarantees

Deep imitation learning for molecular inverse problems

Verified Uncertainty Calibration

Fast-rate PAC-Bayes Generalization Bounds via Shifted Rademacher Processes

Model Selection for Contextual Bandits

On the Power and Limitations of Random Features for Understanding Neural Networks

Surfing: Iterative Optimization Over Incrementally Trained Deep Networks

DFNets: Spectral CNNs for Graphs with Feedback-Looped Filters

Elliptical Perturbations for Differential Privacy

Neural Jump Stochastic Differential Equations

Spherical Text Embedding

Graph Normalizing Flows

Flattening a Hierarchical Clustering through Active Learning

Efficiently escaping saddle points on manifolds

Fast Sparse Group Lasso

Counting the Optimal Solutions in Graphical Models

Optimal Sampling and Clustering in the Stochastic Block Model

Probabilistic Logic Neural Networks for Reasoning

Tensor Monte Carlo: Particle Methods for the GPU era

Shadowing Properties of Optimization Algorithms

Sequence Modeling with Unconstrained Generation Order

Learning Local Search Heuristics for Boolean Satisfiability

A Linearly Convergent Method for Non-Smooth Non-Convex Optimization on the Grassmannian with Applications to Robust Subspace and Dictionary Learning

The Randomized Midpoint Method for Log-Concave Sampling

MonoForest framework for tree ensemble analysis

Safe Exploration for Interactive Machine Learning

Exact sampling of determinantal point processes with sublinear time preprocessing

Painless Stochastic Gradient: Interpolation, Line-Search, and Convergence Rates

On Lazy Training in Differentiable Programming

D-VAE: A Variational Autoencoder for Directed Acyclic Graphs

Asymptotics for Sketching in Least Squares Regression

Mapping State Space using Landmarks for Universal Goal Reaching

STREETS: A Novel Camera Network Dataset for Traffic Flow

On the Transfer of Inductive Bias from Simulation to the Real World: a New Disentanglement Dataset

A Meta-MDP Approach to Exploration for Lifelong Reinforcement Learning

Fast and Accurate Stochastic Gradient Estimation

Adaptive Influence Maximization with Myopic Feedback

Using Self-Supervised Learning Can Improve Model Robustness and Uncertainty

Fast Efficient Hyperparameter Tuning for Policy Gradient Methods

Batched Multi-armed Bandits Problem

Pseudo-Extended Markov chain Monte Carlo

Adaptive Gradient-Based Meta-Learning Methods

Quaternion Knowledge Graph Embeddings

AGEM: Solving Linear Inverse Problems via Deep Priors and Sampling

Channel Gating Neural Networks

Aligning Visual Regions and Textual Concepts for Semantic-Grounded Image Representations

Globally Convergent Newton Methods for Ill-conditioned Generalized Self-concordant Losses

Causal Regularization

Nonstochastic Multiarmed Bandits with Unrestricted Delays

Ultrametric Fitting by Gradient Descent

Deliberative Explanations: visualizing network insecurities

Regret Minimization for Reinforcement Learning with Vectorial Feedback and Complex Objectives

Empirically Measuring Concentration: Fundamental Limits on Intrinsic Robustness

Adaptive Auxiliary Task Weighting for Reinforcement Learning

Visualizing and Measuring the Geometry of BERT

Learning from Bad Data via Generation

Biases for Emergent Communication in Multi-agent Reinforcement Learning

Outlier-robust estimation of a sparse linear model using $\ell_1$-penalized Huber's $M$-estimator

Meta-Inverse Reinforcement Learning with Probabilistic Context Variables

Robustness Verification of Tree-based Models

Levenshtein Transformer

SPoC: Search-based Pseudocode to Code

muSSP: Efficient Min-cost Flow Algorithm for Multi-object Tracking

Communication trade-offs for Local-SGD with large step size

Deep RGB-D Canonical Correlation Analysis For Sparse Depth Completion

Distribution-Independent PAC Learning of Halfspaces with Massart Noise

A Meta-Analysis of Overfitting in Machine Learning

Efficient online learning with kernels for adversarial large scale problems

Generalization Error Analysis of Quantized Compressive Learning

PIDForest: Anomaly Detection via Partial Identification

Learning Reward Machines for Partially Observable Reinforcement Learning

Learning Compositional Neural Programs with Recursive Tree Search and Planning

Efficiently Learning Fourier Sparse Set Functions

Machine Learning Estimation of Heterogeneous Treatment Effects with Instruments

Universality and individuality in neural dynamics across large populations of recurrent networks

Small ReLU networks are powerful memorizers: a tight analysis of memorization capacity

Comparing distributions: $\ell_1$ geometry improves kernel two-sample testing

Reflection Separation using a Pair of Unpolarized and Polarized Images

Recovering Bandits

Combining Generative and Discriminative Models for Hybrid Inference

Legendre Memory Units: Continuous-Time Representation in Recurrent Neural Networks

Variational Bayesian Optimal Experimental Design

Cormorant: Covariant Molecular Neural Networks

SIC-MMAB: Synchronisation Involves Communication in Multiplayer Multi-Armed Bandits

Cold Case: The Lost MNIST Digits

Robust exploration in linear quadratic reinforcement learning

Bayesian Optimization under Heavy-tailed Payoffs

Convergence of Adversarial Training in Overparametrized Neural Networks

Implicit Posterior Variational Inference for Deep Gaussian Processes

Multi-Criteria Dimensionality Reduction with Applications to Fairness

Assessing Social and Intersectional Biases in Contextualized Word Representations

Learning Positive Functions with Pseudo Mirror Descent

Fast and Provable ADMM for Learning with Generative Priors

Multiagent Evaluation under Incomplete Information

Modeling Conceptual Understanding in Image Reference Games

Calibration tests in multi-class classification: A unifying framework

Theoretical Analysis of Adversarial Learning: A Minimax Approach

Tight Regret Bounds for Model-Based Reinforcement Learning with Greedy Policies

Adversarial Music: Real world Audio Adversary against Wake-word Detection System

Hindsight Credit Assignment

Efficient and Accurate Estimation of Lipschitz Constants for Deep Neural Networks

Cross-sectional Learning of Extremal Dependence among Financial Assets

Towards Explaining the Regularization Effect of Initial Large Learning Rate in Training Neural Networks

Multilabel reductions: what is my loss optimising?

Smoothing Structured Decomposable Circuits

Private Stochastic Convex Optimization with Optimal Rates

Provably Robust Deep Learning via Adversarially Trained Smoothed Classifiers

Nearly Tight Bounds for Robust Proper Learning of Halfspaces with a Margin

McDiarmid-Type Inequalities for Graph-Dependent Variables and Stability Bounds

The Broad Optimality of Profile Maximum Likelihood

Adaptive Density Estimation for Generative Models

Neural Networks with Cheap Differential Operators

On the Downstream Performance of Compressed Word Embeddings

Policy Continuation with Hindsight Inverse Dynamics

Generalization Bounds of Stochastic Gradient Descent for Wide and Deep Neural Networks

Unsupervised Curricula for Visual Meta-Reinforcement Learning

Regularization Matters: Generalization and Optimization of Neural Nets v.s. their Induced Kernel

Residual Flows for Invertible Generative Modeling

Data-dependent Sample Complexity of Deep Neural Networks via Lipschitz Augmentation

Evaluating Protein Transfer Learning with TAPE

Sequential Neural Processes

Guided Meta-Policy Search

Quantum Entropy Scoring for Fast Robust Mean Estimation and Improved Outlier Detection

Paradoxes in Fair Machine Learning

A unified theory for the origin of grid cells through the lens of pattern formation

Poincaré Recurrence, Cycles and Spurious Equilibria in Gradient-Descent-Ascent for Non-Convex Non-Concave Zero-Sum Games

Limitations of Lazy Training of Two-layers Neural Network

This Looks Like That: Deep Learning for Interpretable Image Recognition

Online Learning via the Differential Privacy Lens

Who is Afraid of Big Bad Minima? Analysis of gradient-flow in spiked matrix-tensor models

Self-Critical Reasoning for Robust Visual Question Answering

Sinkhorn Barycenters with Free Support via Frank-Wolfe Algorithm

Reconciling meta-learning and continual learning with online mixtures of tasks

Private Learning Implies Online Learning: An Efficient Reduction

Fast Convergence of Belief Propagation to Global Optima: Beyond Correlation Decay

Large Memory Layers with Product Keys

On Exact Computation with an Infinitely Wide Neural Net

Better Transfer Learning with Inferred Successor Maps

A Structured Prediction Approach for Generalization in Cooperative Multi-Agent Reinforcement Learning

List-decodable Linear Regression

An adaptive nearest neighbor rule for classification

Optimal Sparse Decision Trees

Implicit Regularization in Deep Matrix Factorization

Fast and Flexible Multi-Task Classification using Conditional Neural Adaptive Processes

Are sample means in multi-armed bandits positively or negatively biased?

Stochastic Runge-Kutta Accelerates Langevin Monte Carlo and Beyond

VIREL: A Variational Inference Framework for Reinforcement Learning

Emergence of Object Segmentation in Perturbed Generative Models

On the Hardness of Robust Classification

Sparse Logistic Regression Learns All Discrete Pairwise Graphical Models

Invertible Convolutional Flow

Wasserstein Weisfeiler-Lehman Graph Kernels

Differentiable Ranking and Sorting using Optimal Transport

Adversarial Training and Robustness for Multiple Perturbations

UniXGrad: A Universal, Adaptive Algorithm with Optimal Guarantees for Constrained Optimization

Scalable Global Optimization via Local Bayesian Optimization

Infra-slow brain dynamics as a marker for cognitive function and decline

Learning by Abstraction: The Neural State Machine

Optimal Stochastic and Online Learning with Individual Iterates

Cross-lingual Language Model Pretraining

Learning Object Bounding Boxes for 3D Instance Segmentation on Point Clouds

DM2C: Deep Mixed-Modal Clustering

Poisson-Minibatching for Gibbs Sampling with Convergence Rate Guarantees

Likelihood-Free Overcomplete ICA and Applications In Causal Discovery

A Step Toward Quantifying Independently Reproducible Machine Learning Research

KerGM: Kernelized Graph Matching

Learning dynamic polynomial proofs

On Testing for Biases in Peer Review

Weight Agnostic Neural Networks

When does label smoothing help?

Statistical bounds for entropic optimal transport: sample complexity and the central limit theorem

Learning Hierarchical Priors in VAEs

A Nonconvex Approach for Exact and Efficient Multichannel Sparse Blind Deconvolution

Differentially Private Markov Chain Monte Carlo

Practical Differentially Private Top-k Selection with Pay-what-you-get Composition

Compression with Flows via Local Bits-Back Coding

Learning in Generalized Linear Contextual Bandits with Stochastic Delays

Efficient Regret Minimization Algorithm for Extensive-Form Correlated Equilibrium

Perceiving the arrow of time in autoregressive motion

Principal Component Projection and Regression in Nearly Linear Time through Asymmetric SVRG

Implicit Generation and Modeling with Energy Based Models

SuperGLUE: A Stickier Benchmark for General-Purpose Language Understanding Systems

Dual Variational Generation for Low Shot Heterogeneous Face Recognition

Probabilistic Watershed: Sampling all spanning forests for seeded segmentation and semi-supervised learning

Heterogeneous Graph Learning for Visual Commonsense Reasoning

SGD on Neural Networks Learns Functions of Increasing Complexity

Beyond Online Balanced Descent: An Optimal Algorithm for Smoothed Online Optimization

DualDICE: Behavior-Agnostic Estimation of Discounted Stationary Distribution Corrections

Learning to Control Self-Assembling Morphologies: A Study of Generalization via Modularity

Complexity of Highly Parallel Non-Smooth Convex Optimization

Asymmetric Valleys: Beyond Sharp and Flat Local Minima

Conditional Independence Testing using Generative Adversarial Networks

Training Image Estimators without Image Ground Truth

Numerically Accurate Hyperbolic Embeddings Using Tiling-Based Models

Positional Normalization

Better Exploration with Optimistic Actor Critic

Quadratic Video Interpolation

Efficient Meta Learning via Minibatch Proximal Update

Cascade RPN: Delving into High-Quality Region Proposal Network with Adaptive Convolution

Twin Auxilary Classifiers GAN

Identification of Conditional Causal Effects under Markov Equivalence

Finding Friend and Foe in Multi-Agent Games

Learning Perceptual Inference by Contrasting

Point-Voxel CNN for Efficient 3D Deep Learning

Splitting Steepest Descent for Growing Neural Architectures

SySCD: A System-Aware Parallel Coordinate Descent Algorithm

Divide and Couple: Using Monte Carlo Variational Objectives for Posterior Approximation

Deep Equilibrium Models

CPM-Nets: Cross Partial Multi-View Networks

Asymptotic Guarantees for Learning Generative Models with the Sliced-Wasserstein Distance

Imitation Learning from Observations by Minimizing Inverse Dynamics Disagreement

Ask not what AI can do, but what AI should do: Towards a framework of task delegability

Multimodal Model-Agnostic Meta-Learning via Task-Aware Modulation

Enabling hyperparameter optimization in sequential autoencoders for spiking neural data

Re-randomized Densification for One Permutation Hashing and Bin-wise Consistent Weighted Sampling

Adversarial Examples Are Not Bugs, They Are Features

Multi-task Learning for Aggregated Data using Gaussian Processes

Hierarchical Decision Making by Generating and Following Natural Language Instructions

Surround Modulation: A Bio-inspired Connectivity Structure for Convolutional Neural Networks

Self-attention with Functional Time Representation Learning

Optimistic Distributionally Robust Optimization for Nonparametric Likelihood Approximation

Generalization in multitask deep neural classifiers: a statistical physics approach

On Relating Explanations and Adversarial Examples

On the equivalence between graph isomorphism testing and function approximation with GNNs

Ease-of-Teaching and Language Structure from Emergent Communication

Approximate Feature Collisions in Neural Nets

Abstraction based Output Range Analysis for Neural Networks

Generative Models for Graph-Based Protein Design

The Geometry of Deep Networks: Power Diagram Subdivision

Space and Time Efficient Kernel Density Estimation in High Dimensions

Learning Data Manipulation for Augmentation and Weighting

Gradient-based Adaptive Markov Chain Monte Carlo

Exploring Algorithmic Fairness in Robust Graph Covering Problems

Variational Mixture-of-Experts Autoencoders for Multi-Modal Deep Generative Models

Imitation-Projected Programmatic Reinforcement Learning

Reverse engineering recurrent networks for sentiment classification reveals line attractor dynamics

TAB-VCR: Tags and Attributes based Visual Commonsense Reasoning Baselines

BehaveNet: nonlinear embedding and Bayesian neural decoding of behavioral videos

Are deep ResNets provably better than linear predictors?

A Family of Robust Stochastic Operators for Reinforcement Learning

End-to-End Learning on 3D Protein Structure for Interface Prediction

DRUM: End-To-End Differentiable Rule Mining On Knowledge Graphs

Structured and Deep Similarity Matching via Structured and Deep Hebbian Networks

Amortized Bethe Free Energy Minimization for Learning MRFs

A Condition Number for Joint Optimization of Cycle-Consistent Networks

Wasserstein Dependency Measure for Representation Learning

Differential Privacy Has Disparate Impact on Model Accuracy

Low-Rank Bandit Methods for High-Dimensional Dynamic Pricing

Learning Representations by Maximizing Mutual Information Across Views

Exact Combinatorial Optimization with Graph Convolutional Neural Networks

Kernel Truncated Randomized Ridge Regression: Optimal Rates and Low Noise Acceleration

A Kernel Loss for Solving the Bellman Equation

Stacked Capsule Autoencoders

Neural Taskonomy: Inferring the Similarity of Task-Derived Representations from Brain Activity

Goal-conditioned Imitation Learning

Multiple Futures Prediction

Riemannian batch normalization for SPD neural networks

Understanding the Representation Power of Graph Neural Networks in Learning Graph Topology

Hamiltonian Neural Networks

Preventing Gradient Attenuation in Lipschitz Constrained Convolutional Networks

Explicitly disentangling image content from translation and rotation with spatial-VAE

Mo' States Mo' Problems: Emergency Stop Mechanisms from Observation

Input-Output Equivalence of Unitary and Contractive RNNs

Group Retention when Using Machine Learning in Sequential Decision Making: the Interplay between User Dynamics and Fairness

Search on the Replay Buffer: Bridging Planning and Reinforcement Learning

Certifying Geometric Robustness of Neural Networks

DeepWave: A Recurrent Neural-Network for Real-Time Acoustic Imaging

Can Unconditional Language Models Recover Arbitrary Sentences?

Momentum-Based Variance Reduction in Non-Convex SGD

Reward Constrained Interactive Recommendation with Natural Language Feedback

Flexible Modeling of Diversity with Strongly Log-Concave Distributions

Efficient Rematerialization for Deep Networks

Invariance and identifiability issues for word embeddings

Projected Stein Variational Newton: A Fast and Scalable Bayesian Inference Method in High Dimensions

Power analysis of knockoff filters for correlated designs

Robust Bi-Tempered Logistic Loss Based on Bregman Divergences

An Algorithm to Learn Polytree Networks with Hidden Nodes

Semi-Parametric Efficient Policy Learning with Continuous Actions

Function-Space Distributions over Kernels

Beyond the Single Neuron Convex Barrier for Neural Network Certification

Minimal Variance Sampling in Stochastic Gradient Boosting

Compositional Plan Vectors

Computational Separations between Sampling and Optimization

On Human-Aligned Risk Minimization

Locally Private Learning without Interaction Requires Separation

Learning to Optimize in Swarms

The Step Decay Schedule: A Near Optimal, Geometrically Decaying Learning Rate Procedure For Least Squares

Solving a Class of Non-Convex Min-Max Games Using Iterative First Order Methods

Missing Not at Random in Matrix Completion: The Effectiveness of Estimating Missingness Probabilities Under a Low Nuclear Norm Assumption

Accurate Layerwise Interpretable Competence Estimation

Semantic-Guided Multi-Attention Localization for Zero-Shot Learning

Near Neighbor: Who is the Fairest of Them All?

Offline Contextual Bandits with High Probability Fairness Guarantees

Online Optimal Control with Linear Dynamics and Predictions: Algorithms and Regret Analysis

Regret Bounds for Learning State Representations in Reinforcement Learning

MelGAN: Generative Adversarial Networks for Conditional Waveform Synthesis

PAC-Bayes Un-Expected Bernstein Inequality

Generalized Matrix Means for Semi-Supervised Learning with Multilayer Graphs

Planning with Goal-Conditioned Policies

Generating Diverse High-Fidelity Images with VQ-VAE-2

Don't take it lightly: Phasing optical random projections with unknown operators

Explicit Explore-Exploit Algorithms in Continuous State Spaces

Algorithmic Guarantees for Inverse Imaging with Untrained Network Priors

Value Function in Frequency Domain and the Characteristic Value Iteration Algorithm

Unsupervised Co-Learning on $G$-Manifolds Across Irreducible Representations

A Self Validation Network for Object-Level Human Attention Estimation

Thompson Sampling and Approximate Inference

Algorithm-Dependent Generalization Bounds for Overparameterized Deep Residual Networks

Invariance-inducing regularization using worst-case transformations suffices to boost accuracy and spatial robustness

Towards Practical Alternating Least-Squares for CCA

Icebreaker: Element-wise Efficient Information Acquisition with a Bayesian Deep Latent Gaussian Model

Single-Model Uncertainties for Deep Learning

Compiler Auto-Vectorization with Imitation Learning

Abstract Reasoning with Distracting Features

Sliced Gromov-Wasserstein

Pure Exploration with Multiple Correct Answers

Revisiting the Bethe-Hessian: Improved Community Detection in Sparse Heterogeneous Graphs

Discrete Flows: Invertible Generative Models of Discrete Data

Likelihood Ratios for Out-of-Distribution Detection

Universal Boosting Variational Inference

Qsparse-local-SGD: Distributed SGD with Quantization, Sparsification and Local Computations

Explaining Landscape Connectivity of Low-cost Solutions for Multilayer Nets

Thompson Sampling for Multinomial Logit Contextual Bandits

Fast Structured Decoding for Sequence Models

Computational Mirrors: Blind Inverse Light Transport by Deep Matrix Factorization

Nonparametric Contextual Bandits in Metric Spaces with Unknown Metric

Backprop with Approximate Activations for Memory-efficient Network Training

Bayesian Layers: A Module for Neural Network Uncertainty

Hamiltonian descent for composite objectives

DAC: The Double Actor-Critic Architecture for Learning Options

Exact Gaussian Processes on a Million Data Points

On the Fairness of Disentangled Representations

Reverse KL-Divergence Training of Prior Networks: Improved Uncertainty and Adversarial Robustness

Unsupervised Discovery of Temporal Structure in Noisy Data with Dynamical Components Analysis

Low-Complexity Nonparametric Bayesian Online Prediction with Universal Guarantees

Region-specific Diffeomorphic Metric Mapping

Policy Poisoning in Batch Reinforcement Learning and Control

Non-Asymptotic Pure Exploration by Solving Games

Flexible information routing in neural populations through stochastic comodulation

Are Disentangled Representations Helpful for Abstract Visual Reasoning?

Censored Semi-Bandits: A Framework for Resource Allocation with Censored Feedback

Categorized Bandits

Generalization Bounds in the Predict-then-Optimize Framework

Implicit Regularization of Accelerated Methods in Hilbert Spaces

Characterization and Learning of Causal Graphs with Latent Variables from Soft Interventions

General E(2)-Equivariant Steerable CNNs

Robust Attribution Regularization

Structure Learning with Side Information: Sample Complexity

Deep Multi-State Dynamic Recurrent Neural Networks Operating on Wavelet Based Neural Features for Robust Brain Machine Interfaces

Efficient characterization of electrically evoked responses for neural interfaces

Differentially Private Distributed Data Summarization under Covariate Shift

Untangling in Invariant Speech Recognition

Outlier Detection and Robust PCA Using a Convex Measure of Innovation

PowerSGD: Practical Low-Rank Gradient Compression for Distributed Optimization

Constraint-based Causal Structure Learning with Consistent Separating Sets

Stochastic Frank-Wolfe for Composite Convex Minimization

Efficient Neural Architecture Transformation Search in Channel-Level for Object Detection

Integrating Markov processes with structural causal modeling enables counterfactual inference in complex systems

A Similarity-preserving Network Trained on Transformed Images Recapitulates Salient Features of the Fly Motion Detection Circuit

Fast, Provably convergent IRLS Algorithm for p-norm Linear Regression

Sample Efficient Active Learning of Causal Trees

Differentially Private Covariance Estimation

Computing Linear Restrictions of Neural Networks

Correlation Priors for Reinforcement Learning

Inducing brain-relevant bias in natural language processing models

User-Specified Local Differential Privacy in Unconstrained Adaptive Online Learning

Stochastic Bandits with Context Distributions

Multi-resolution Multi-task Gaussian Processes

A New Perspective on Pool-Based Active Classification and False-Discovery Control

Provable Certificates for Adversarial Examples: Fitting a Ball in the Union of Polytopes

Are Sixteen Heads Really Better than One?

Universal Approximation of Input-Output Maps by Temporal Convolutional Nets

Reinforcement Learning with Convex Constraints

Graph-based Discriminators: Sample Complexity and Expressiveness

Defending Neural Backdoors via Generative Distribution Modeling

Calculating Optimistic Likelihoods Using (Geodesically) Convex Optimization

The Implicit Metropolis-Hastings Algorithm

Can you trust your model's uncertainty? Evaluating predictive uncertainty under dataset shift

An Inexact Augmented Lagrangian Framework for Nonconvex Optimization with Nonlinear Constraints

Inverting Deep Generative models, One layer at a time

Generalization in Reinforcement Learning with Selective Noise Injection and Information Bottleneck

Primal-Dual Block Generalized Frank-Wolfe

GOT: An Optimal Transport framework for Graph comparison

Learning Fairness in Multi-Agent Systems

On Mixup Training: Improved Calibration and Predictive Uncertainty for Deep Neural Networks

Sampled Softmax with Random Fourier Features

A Solvable High-Dimensional Model of GAN

Semi-flat minima and saddle points by embedding neural networks to overparameterization

Using Embeddings to Correct for Unobserved Confounding in Networks

On Robustness to Adversarial Examples and Polynomial Optimization

Adversarial Robustness through Local Linearization

A Graph Theoretic Additive Approximation of Optimal Transport

DiskANN: Fast Accurate Billion-point Nearest Neighbor Search on a Single Node

Towards Hardware-Aware Tractable Learning of Probabilistic Models

No-Regret Learning in Unknown Games with Correlated Payoffs

Learning about an exponential amount of conditional distributions

An Algorithmic Framework For Differentially Private Data Analysis on Trusted Processors

Approximating Interactive Human Evaluation with Self-Play for Open-Domain Dialog Systems

Towards modular and programmable architecture search

Compacting, Picking and Growing for Unforgetting Continual Learning

AutoPrune: Automatic Network Pruning by Regularizing Auxiliary Parameters

Paraphrase Generation with Latent Bag of Words

A New Distribution on the Simplex with Auto-Encoding Applications

Non-normal Recurrent Neural Network (nnRNN): learning long time dependencies while improving expressivity with transient dynamics

Alleviating Label Switching with Optimal Transport

RUDDER: Return Decomposition for Delayed Rewards

Search-Guided, Lightly-Supervised Training of Structured Prediction Energy Networks

Explanations can be manipulated and geometry is to blame

Hierarchical Optimal Transport for Multimodal Distribution Alignment

Object landmark discovery through unsupervised adaptation

On Differentially Private Graph Sparsification and Applications

Accelerating Rescaled Gradient Descent: Fast Optimization of Smooth Functions

Metalearned Neural Memory

Recurrent Kernel Networks

Gossip-based Actor-Learner Architectures for Deep Reinforcement Learning

Variance Reduction in Bipartite Experiments through Correlation Clustering

Specific and Shared Causal Relation Modeling and Mechanism-Based Clustering

Near-Optimal Reinforcement Learning in Dynamic Treatment Regimes

Shaping Belief States with Generative Environment Models for RL

Exploration via Hindsight Goal Generation

Manifold denoising by Nonlinear Robust Principal Component Analysis

Global Convergence of Gradient Descent for Deep Linear Residual Networks

Diffusion Improves Graph Learning

The continuous Bernoulli: fixing a pervasive error in variational autoencoders

Extreme Classification in Log Memory using Count-Min Sketch: A Case Study of Amazon Search with 50M Products

A Fourier Perspective on Model Robustness in Computer Vision

Communication-efficient Distributed SGD with Sketching

Privacy Amplification by Mixing and Diffusion Mechanisms

Episodic Memory in Lifelong Language Learning

Kalman Filter, Sensor Fusion, and Constrained Regression: Equivalences and Insights

Deep Random Splines for Point Process Intensity Estimation of Neural Population Data

Learning nonlinear level sets for dimensionality reduction in function approximation

Self-supervised GAN: Analysis and Improvement with Multi-class Minimax Game

Stochastic Continuous Greedy ++: When Upper and Lower Bounds Match

A Simple Baseline for Bayesian Uncertainty in Deep Learning

Online-Within-Online Meta-Learning

Online Convex Matrix Factorization with Representative Regions

Provably robust boosted decision stumps and trees against adversarial attacks

Kernel quadrature with DPPs

Generative Well-intentioned Networks

Fast and Furious Learning in Zero-Sum Games: Vanishing Regret with Non-Vanishing Step Sizes

Sim2real transfer learning for 3D human pose estimation: motion to the rescue

REM: From Structural Entropy to Community Structure Deception

Learning to Correlate in Multi-Player General-Sum Sequential Games

Unified Language Model Pre-training for Natural Language Understanding and Generation

Minimum Stein Discrepancy Estimators

On the Inductive Bias of Neural Tangent Kernels

Self-Supervised Deep Learning on Point Clouds by Reconstructing Space

Piecewise Strong Convexity of Neural Networks

Cross-Domain Transferability of Adversarial Perturbations

Sparse High-Dimensional Isotonic Regression

The Option Keyboard: Combining Skills in Reinforcement Learning

Random Projections and Sampling Algorithms for Clustering of High-Dimensional Polygonal Curves

Making AI Forget You: Data Deletion in Machine Learning

Triad Constraints for Learning Causal Structure of Latent Variables

k-Means Clustering of Lines for Big Data

Recurrent Space-time Graph Neural Networks

Accurate, reliable and fast robustness evaluation

Band-Limited Gaussian Processes: The Sinc Kernel

Streaming Bayesian Inference for Crowdsourced Classification

Neuropathic Pain Diagnosis Simulator for Causal Discovery Algorithm Evaluation

Leveraging Labeled and Unlabeled Data for Consistent Fair Binary Classification

Unsupervised Object Segmentation by Redrawing

Efficient Algorithms for Smooth Minimax Optimization

Learning search spaces for Bayesian optimization: Another view of hyperparameter transfer learning

MetaInit: Initializing learning by learning to initialize

A Generic Acceleration Framework for Stochastic Composite Optimization

Scalable Deep Generative Relational Model with High-Order Node Dependence

Continuous-time Models for Stochastic Optimization Algorithms

Implicit Semantic Data Augmentation for Deep Networks

Learning Hawkes Processes from a handful of events

Random Path Selection for Continual Learning

Selecting causal brain features with a single conditional independence test per feature

Control What You Can: Intrinsically Motivated Task-Planning Agent

Continuous Hierarchical Representations with Poincaré Variational Auto-Encoders

Beating SGD Saturation with Tail-Averaging and Minibatching

When to Trust Your Model: Model-Based Policy Optimization

Correlation Clustering with Adaptive Similarity Queries

Random Quadratic Forms with Dependence: Applications to Restricted Isometry and Beyond

Curriculum-guided Hindsight Experience Replay

Kernelized Bayesian Softmax for Text Generation

Efficient Identification in Linear Structural Causal Models with Instrumental Cutsets

A General Framework for Symmetric Property Estimation

Generalization of Reinforcement Learners with Working and Episodic Memory

Classification Accuracy Score for Conditional Generative Models

Screening Sinkhorn Algorithm for Regularized Optimal Transport

Distribution Learning of a Random Spatial Field with a Location-Unaware Mobile Sensor

Structured Prediction with Projection Oracles

Selective Sampling-based Scalable Sparse Subspace Clustering

Tree-Sliced Variants of Wasserstein Distances

Universality in Learning from Linear Measurements

Structured Variational Inference in Continuous Cox Process Models

Towards Interpretable Reinforcement Learning Using Attention Augmented Agents

Theoretical Limits of Pipeline Parallel Optimization and Application to Distributed Deep Learning

Root Mean Square Layer Normalization

Integer Discrete Flows and Lossless Compression

Think out of the "Box": Generically-Constrained Asynchronous Composite Optimization and Hedging

A Primal Dual Formulation For Deep Learning With Constraints

Beyond temperature scaling: Obtaining well-calibrated multi-class probabilities with Dirichlet calibration

Submodular Function Minimization with Noisy Evaluation Oracle

Multi-objective Bayesian optimisation with preferences over objectives

Novel positional encodings to enable tree-based transformers

Planning in entropy-regularized Markov decision processes and games

Neural Attribution for Semantic Bug-Localization in Student Programs

Debiased Bayesian inference for average treatment effects

Are Labels Required for Improving Adversarial Robustness?

The Impact of Regularization on High-dimensional Logistic Regression

Modelling the Dynamics of Multiagent Q-Learning in Repeated Symmetric Games: a Mean Field Theoretic Approach

Bootstrapping Upper Confidence Bound

Attribution-Based Confidence Metric For Deep Neural Networks

Margin-Based Generalization Lower Bounds for Boosted Classifiers

Adaptive Temporal-Difference Learning for Policy Evaluation with Per-State Uncertainty Estimates

Learning to Confuse: Generating Training Time Adversarial Data with Auto-Encoder

Gradient based sample selection for online continual learning

Graph Transformer Networks

Multivariate Distributionally Robust Convex Regression under Absolute Error Loss

Dimension-Free Bounds for Low-Precision Training

Improved Regret Bounds for Bandit Combinatorial Optimization

Theoretical evidence for adversarial robustness through randomization

Online Continual Learning with Maximal Interfered Retrieval

Deep Multimodal Multilinear Fusion with High-order Polynomial Pooling

Bayesian Optimization with Unknown Search Space

Pareto Multi-Task Learning

Optimizing Generalized PageRank Methods for Seed-Expansion Community Detection

The Case for Evaluating Causal Models Using Interventional Measures and Empirical Data

A Domain Agnostic Measure for Monitoring and Evaluating GANs

Learning Auctions with Robust Incentive Guarantees

Concentration of risk measures: A Wasserstein distance approach

A Zero-Positive Learning Approach for Diagnosing Software Performance Regressions

Thresholding Bandit with Optimal Aggregate Regret

DTWNet: a Dynamic Time Warping Network

Stabilizing Off-Policy Q-Learning via Bootstrapping Error Reduction

Policy Optimization Provably Converges to Nash Equilibria in Zero-Sum Linear Quadratic Games

Faster Boosting with Smaller Memory

Subquadratic High-Dimensional Hierarchical Clustering

Landmark Ordinal Embedding

Distributed estimation of the inverse Hessian by determinantal averaging

Personalizing Many Decisions with High-Dimensional Covariates

Communication-Efficient Distributed Blockwise Momentum SGD with Error-Feedback

Global Guarantees for Blind Demodulation with Generative Priors

The Thermodynamic Variational Objective

Structured Graph Learning Via Laplacian Spectral Constraints

A Necessary and Sufficient Stability Notion for Adaptive Generalization

Sparse Variational Inference: Bayesian Coresets from Scratch

Demystifying Black-box Models with Symbolic Metamodels

Provable Non-linear Inductive Matrix Completion

Rethinking Kernel Methods for Node Representation Learning on Graphs

Privacy-Preserving Q-Learning with Functional Noise in Continuous Spaces

Distribution oblivious, risk-aware algorithms for multi-armed bandits with unbounded rewards

Learning Neural Networks with Adaptive Regularization

Layer-Dependent Importance Sampling for Training Deep and Large Graph Convolutional Networks

A Direct $\tilde{O}(1/\epsilon)$ Iteration Parallel Algorithm for Optimal Transport

Two Generator Game: Learning to Sample via Linear Goodness-of-Fit Test

Gaussian-Based Pooling for Convolutional Neural Networks

NAOMI: Non-Autoregressive Multiresolution Sequence Imputation

Online EXP3 Learning in Adversarial Bandits with Delayed Feedback

Neural Temporal-Difference Learning Converges to Global Optima

Unlabeled Data Improves Adversarial Robustness

Meta Architecture Search

Greedy Sampling for Approximate Clustering in the Presence of Outliers

Attentive State-Space Modeling of Disease Progression

Region Mutual Information Loss for Semantic Segmentation

Data Parameters: A New Family of Parameters for Learning a Differentiable Curriculum

Learning Stable Deep Dynamics Models

Unified Sample-Optimal Property Estimation in Near-Linear Time

Machine Teaching of Active Sequential Learners

On Tractable Computation of Expected Predictions

Local SGD with Periodic Averaging: Tighter Analysis and Adaptive Synchronization

Bias Correction of Learned Generative Models using Likelihood-Free Importance Weighting

Break the Ceiling: Stronger Multi-scale Deep Graph Convolutional Networks

Image Captioning: Transforming Objects into Words

Controllable Unsupervised Text Attribute Transfer via Editing Entangled Latent Representation

Random Projections with Asymmetric Quantization

Information-Theoretic Generalization Bounds for SGLD via Data-Dependent Estimates

MintNet: Building Invertible Neural Networks with Masked Convolutions

Zero-shot Knowledge Transfer via Adversarial Belief Matching

Statistical Model Aggregation via Parameter Matching

How to Initialize your Network? Robust Initialization for WeightNorm & ResNets

An Embedding Framework for Consistent Polyhedral Surrogates

Private Testing of Distributions via Sample Permutations

Exponential Family Estimation via Adversarial Dynamics Embedding

Adversarial Fisher Vectors for Unsupervised Representation Learning

Superposition of many models into one

N-Gram Graph: Simple Unsupervised Representation for Graphs, with Applications to Molecules

Improving Black-box Adversarial Attacks with a Transfer-based Prior

MaxGap Bandit: Adaptive Algorithms for Approximate Ranking

Statistical Analysis of Nearest Neighbor Methods for Anomaly Detection

Input-Cell Attention Reduces Vanishing Saliency of Recurrent Neural Networks

Time/Accuracy Tradeoffs for Learning a ReLU with respect to Gaussian Marginals

Online Forecasting of Total-Variation-bounded Sequences

Reducing the variance in online optimization by transporting past gradients

Rates of Convergence for Large-scale Nearest Neighbor Classification

Cross-Modal Learning with Adversarial Samples

High-Dimensional Optimization in Adaptive Random Subspaces

Outlier-Robust High-Dimensional Sparse Estimation via Iterative Filtering

Variational Graph Recurrent Neural Networks

Fast structure learning with modular regularization

Consistency-based Semi-supervised Learning for Object detection

Deep Leakage from Gradients

Worst-Case Regret Bounds for Exploration via Randomized Value Functions

Program Synthesis and Semantic Parsing with Learned Code Idioms

Transfer Learning via Minimizing the Performance Gap Between Domains

Semi-Implicit Graph Variational Auto-Encoders

Unsupervised Learning of Object Keypoints for Perception and Control

Dimensionality reduction: theoretical perspective on practical measures

ODE2VAE: Deep generative second order ODEs with Bayesian neural networks

Tight Sample Complexity of Learning One-hidden-layer Convolutional Neural Networks

Learning Multiple Markov Chains via Adaptive Allocation

Learning step sizes for unfolded sparse coding

A Composable Specification Language for Reinforcement Learning Tasks

Time Matters in Regularizing Deep Networks: Weight Decay and Data Augmentation Affect Early Learning Dynamics, Matter Little Near Convergence

Neural Relational Inference with Fast Modular Meta-learning

Two Time-scale Off-Policy TD Learning: Non-asymptotic Analysis over Markovian Samples

Deep Gamblers: Learning to Abstain with Portfolio Theory

Variational Temporal Abstraction

Neural Proximal/Trust Region Policy Optimization Attains Globally Optimal Policy

Efficient Convex Relaxations for Streaming PCA

Sample Complexity of Learning Mixture of Sparse Linear Regressions

Sequential Experimental Design for Transductive Linear Bandits

Discrete Object Generation with Reversible Inductive Construction

Learning Robust Global Representations by Penalizing Local Predictive Power

G2SAT: Learning to Generate SAT Formulas

Oracle-Efficient Algorithms for Online Linear Optimization with Bandit Feedback

Large Scale Adversarial Representation Learning

Same-Cluster Querying for Overlapping Clusters

A unified variance-reduced accelerated gradient method for convex optimization

Limits of Private Learning with Access to Public Data

ObjectNet: A large-scale bias-controlled dataset for pushing the limits of object recognition models

Statistical-Computational Tradeoff in Single Index Models

Bandits with Feedback Graphs and Switching Costs

On the Expressive Power of Deep Polynomial Neural Networks

Efficient Near-Optimal Testing of Community Changes in Balanced Stochastic Block Models

Superset Technique for Approximate Recovery in One-Bit Compressed Sensing

Certainty Equivalence is Efficient for Linear Quadratic Control

KNG: The K-Norm Gradient Mechanism

Keeping Your Distance: Solving Sparse Reward Tasks Using Self-Balancing Shaped Rewards

MarginGAN: Adversarial Training in Semi-Supervised Learning

Temporal FiLM: Capturing Long-Range Sequence Dependencies with Feature-Wise Modulations

Can SGD Learn Recurrent Neural Networks with Provable Generalization?

CXPlain: Causal Explanations for Model Interpretation under Uncertainty

Learning to Self-Train for Semi-Supervised Few-Shot Classification

Functional Adversarial Attacks

A Game Theoretic Approach to Class-wise Selective Rationalization

Regularized Anderson Acceleration for Off-Policy Deep Reinforcement Learning

Efficiently avoiding saddle points with zero order methods: No gradients required

SHE: A Fast and Accurate Deep Neural Network for Encrypted Data

From Complexity to Simplicity: Adaptive ES-Active Subspaces for Blackbox Optimization

On Fenchel Mini-Max Learning

Devign: Effective Vulnerability Identification by Learning Comprehensive Program Semantics via Graph Neural Networks

Learning metrics for persistence-based summaries and applications for graph classification

Stochastic Variance Reduced Primal Dual Algorithms for Empirical Composition Optimization

Unsupervised Meta-Learning for Few-Shot Image Classification

Learning Mixtures of Plackett-Luce Models from Structured Partial Orders

Metamers of neural networks reveal divergence from human perceptual systems

Efficient Forward Architecture Search

DETOX: A Redundancy-based Framework for Faster and More Robust Gradient Aggregation

Convergence-Rate-Matching Discretization of Accelerated Optimization Flows Through Opportunistic State-Triggered Control

Tensor Programs I: Wide Feedforward or Recurrent Neural Networks of Any Architecture are Gaussian Processes

A neurally plausible model for online recognition and postdiction in a dynamical environment

Dying Experts: Efficient Algorithms with Optimal Regret Bounds

Spatial-Aware Feature Aggregation for Image based Cross-View Geo-Localization

Multi-Agent Common Knowledge Reinforcement Learning

A Benchmark for Interpretability Methods in Deep Neural Networks

A Unified Framework for Data Poisoning Attack to Graph-based Semi-supervised Learning

Average Case Column Subset Selection for Entrywise $\ell_1$-Norm Loss

Learning to Learn By Self-Critique

Model Similarity Mitigates Test Set Overuse

Decentralized sketching of low rank matrices

Locality-Sensitive Hashing for f-Divergences: Mutual Information Loss and Beyond

Transductive Zero-Shot Learning with Visual Structure Constraint

Contextual Bandits with Cross-Learning

On the Value of Target Data in Transfer Learning

Meta Learning with Relational Information for Short Sequences

Bayesian Joint Estimation of Multiple Graphical Models

Compositional generalization through meta sequence-to-sequence learning

Lookahead Optimizer: k steps forward, 1 step back

On Sample Complexity Upper and Lower Bounds for Exact Ranking from Noisy Comparisons

Understanding the Role of Momentum in Stochastic Gradient Methods

Leader Stochastic Gradient Descent for Distributed Training of Deep Learning Models

Memory Efficient Adaptive Optimization

Practical Two-Step Lookahead Bayesian Optimization

Dynamic Incentive-Aware Learning: Robust Pricing in Contextual Auctions

Using Statistics to Automate Stochastic Optimization

Certified Adversarial Robustness with Additive Noise

Towards Optimal Off-Policy Evaluation for Reinforcement Learning with Marginalized Importance Sampling

Differentiable Convex Optimization Layers

A Bayesian Theory of Conformity in Collective Decision Making

Learning to Predict 3D Objects with an Interpolation-based Differentiable Renderer

DINGO: Distributed Newton-Type Method for Gradient-Norm Optimization

Learning from brains how to regularize machines

Tight Dimensionality Reduction for Sketching Low Degree Polynomial Kernels

A Convex Relaxation Barrier to Tight Robustness Verification of Neural Networks

Random Tessellation Forests

Sobolev Independence Criterion

Maximum Entropy Monte-Carlo Planning

Non-Cooperative Inverse Reinforcement Learning

Slice-based Learning: A Programming Model for Residual Learning in Critical Data Slices

Algorithmic Analysis and Statistical Estimation of SLOPE via Approximate Message Passing

Language as an Abstraction for Hierarchical Deep Reinforcement Learning

ADDIS: an adaptive discarding algorithm for online FDR control with conservative nulls

Finding the Needle in the Haystack with Convolutions: on the benefits of architectural bias

Multiclass Performance Metric Elicitation

Don't Blame the ELBO! A Linear VAE Perspective on Posterior Collapse

Deep Generative Video Compression

Discovery of Useful Questions as Auxiliary Tasks

Preference-Based Batch and Sequential Teaching: Towards a Unified View of Models

Correlation clustering with local objectives

Correlation in Extensive-Form Games: Saddle-Point Formulation and Benchmarks

Linear Stochastic Bandits Under Safety Constraints

A coupled autoencoder approach for multi-modal analysis of cell types

A Stochastic Composite Gradient Method with Incremental Variance Reduction

Budgeted Reinforcement Learning in Continuous State Space

Online Continuous Submodular Maximization: From Full-Information to Bandit Feedback

Distributionally Robust Optimization and Generalization in Kernel Methods

Sampling Networks and Aggregate Simulation for Online POMDP Planning

Defending Against Neural Fake News

GNNExplainer: Generating Explanations for Graph Neural Networks

A General Theory of Equivariant CNNs on Homogeneous Spaces

(Nearly) Efficient Algorithms for the Graph Matching Problem on Correlated Random Graphs

Write, Execute, Assess: Program Synthesis with a REPL

Sample Adaptive MCMC

Learning Bayesian Networks with Low Rank Conditional Probability Tables

STAR-Caps: Capsule Networks with Straight-Through Attentive Routing

Procrastinating with Confidence: Near-Optimal, Anytime, Adaptive Algorithm Configuration

What Can ResNet Learn Efficiently, Going Beyond Kernels?

A Communication Efficient Stochastic Multi-Block Alternating Direction Method of Multipliers

Trivializations for Gradient-Based Optimization on Manifolds

Error Correcting Output Codes Improve Probability Estimation and Adversarial Robustness of Deep Neural Networks

PerspectiveNet: 3D Object Detection from a Single RGB Image via Perspective Points

Adaptively Aligned Image Captioning via Adaptive Attention Time

PRNet: Self-Supervised Learning for Partial-to-Partial Registration

Surrogate Objectives for Batch Policy Optimization in One-step Decision Making

Unlocking Fairness: a Trade-off Revisited

Accurate Uncertainty Estimation and Decomposition in Ensemble Learning

Approximating the Permanent by Sampling from Adaptive Partitions

Seeing the Wind: Visual Wind Speed Prediction with a Coupled Convolutional and Recurrent Neural Network

Graph Structured Prediction Energy Networks

Regret Bounds for Thompson Sampling in Episodic Restless Bandit Problems

Fisher Efficient Inference of Intractable Models

Unsupervised State Representation Learning in Atari

Learning Macroscopic Brain Connectomes via Group-Sparse Factorization

Modeling Expectation Violation in Intuitive Physics with Coarse Probabilistic Object Representations

Learning to Screen

Modelling heterogeneous distributions with an Uncountable Mixture of Asymmetric Laplacians

Latent distance estimation for random geometric graphs

Graph Agreement Models for Semi-Supervised Learning

Retrosynthesis Prediction with Conditional Graph Logic Network

Recurrent Registration Neural Networks for Deformable Image Registration

Finite-Sample Analysis for SARSA with Linear Function Approximation

Equal Opportunity in Online Classification with Partial Feedback

A Little Is Enough: Circumventing Defenses For Distributed Learning

Learning Deterministic Weighted Automata with Queries and Counterexamples

Neural Multisensory Scene Inference

A Robust Non-Clairvoyant Dynamic Mechanism for Contextual Auctions

Finite-time Analysis of Approximate Policy Iteration for the Linear Quadratic Regulator

Characterizing the Exact Behaviors of Temporal Difference Learning Algorithms Using Markov Jump Linear System Theory

The Functional Neural Process

Facility Location Problem in Differential Privacy Model Revisited

A Universally Optimal Multistage Accelerated Stochastic Gradient Method

Learning from Trajectories via Subgoal Discovery

Multiclass Learning from Contradictions

Distributed Low-rank Matrix Factorization With Exact Consensus

Energy-Inspired Models: Learning with Sampler-Induced Distributions

An adaptive Mirror-Prox method for variational inequalities with singular operators

The Synthesis of XNOR Recurrent Neural Networks with Stochastic Logic

Wide Neural Networks of Any Depth Evolve as Linear Models Under Gradient Descent

From deep learning to mechanistic understanding in neuroscience: the structure of retinal prediction

Robust and Communication-Efficient Collaborative Learning

Provably Global Convergence of Actor-Critic: A Case for Linear Quadratic Regulator with Ergodic Cost

Online Normalization for Training Neural Networks

Manipulating a Learning Defender and Ways to Counteract

Fixing the train-test resolution discrepancy

Certifiable Robustness to Graph Perturbations

Fast Decomposable Submodular Function Minimization using Constrained Total Variation

Hyperbolic Graph Neural Networks

The spiked matrix model with generative priors

Learning-In-The-Loop Optimization: End-To-End Control And Co-Design Of Soft Robots Through Learned Deep Latent Representations

Gradient Dynamics of Shallow Univariate ReLU Networks

Möbius Transformation for Fast Inner Product Search on Graph

Modeling Dynamic Functional Connectivity with Latent Factor Gaussian Processes

Learning to Infer Implicit Surfaces without 3D Supervision

Learning Distributions Generated by One-Layer ReLU Networks

Fast Convergence of Natural Gradient Descent for Over-Parameterized Neural Networks

Efficient Approximation of Deep ReLU Networks for Functions on Low Dimensional Manifolds

Rapid Convergence of the Unadjusted Langevin Algorithm: Isoperimetry Suffices

Provably Efficient Q-learning with Function Approximation via Distribution Shift Error Checking Oracle

Large-scale optimal transport map estimation using projection pursuit

Which Algorithmic Choices Matter at Which Batch Sizes? Insights From a Noisy Quadratic Model

Dynamic Local Regret for Non-convex Online Forecasting

Loaded DiCE: Trading off Bias and Variance in Any-Order Score Function Gradient Estimators for Reinforcement Learning

Provably Efficient Q-Learning with Low Switching Cost

Crowdsourcing via Pairwise Co-occurrences: Identifiability and Algorithms

Differentially Private Anonymized Histograms

A Debiased MDI Feature Importance Measure for Random Forests

Bipartite expander Hopfield networks as self-decoding high-capacity error correcting codes

A Unifying Framework for Spectrum-Preserving Graph Sparsification and Coarsening

Post training 4-bit quantization of convolutional networks for rapid-deployment

Max-value Entropy Search for Multi-Objective Bayesian Optimization

Spike-Train Level Backpropagation for Training Deep Recurrent Spiking Neural Networks

Detecting Overfitting via Adversarial Examples

A Unified Bellman Optimality Principle Combining Reward Maximization and Empowerment

Towards Understanding the Importance of Shortcut Connections in Residual Networks

Stein Variational Gradient Descent With Matrix-Valued Kernels

A Model to Search for Synthesizable Molecules

SMILe: Scalable Meta Inverse Reinforcement Learning through Context-Conditional Policies

Stability of Graph Scattering Transforms

A Polynomial Time Algorithm for Log-Concave Maximum Likelihood via Locally Exponential Families

Interaction Hard Thresholding: Consistent Sparse Quadratic Regression in Sub-quadratic Time and Space

Re-examination of the Role of Latent Variables in Sequence Modeling

PyTorch: An Imperative Style, High-Performance Deep Learning Library

Modular Universal Reparameterization: Deep Multi-task Learning Across Diverse Domains

How degenerate is the parametrization of neural networks with the ReLU activation function?

The Implicit Bias of AdaGrad on Separable Data

Maximum Expected Hitting Cost of a Markov Decision Process and Informativeness of Rewards

Coresets for Clustering with Fairness Constraints

LiteEval: A Coarse-to-Fine Framework for Resource Efficient Video Recognition

Continual Unsupervised Representation Learning

MAVEN: Multi-Agent Variational Exploration

On two ways to use determinantal point processes for Monte Carlo integration

Solving Interpretable Kernel Dimensionality Reduction

Constrained Reinforcement Learning Has Zero Duality Gap

Foundations of Comparison-Based Hierarchical Clustering

Lower Bounds on Adversarial Robustness from Optimal Transport

Phase Transitions and Cyclic Phenomena in Bandits with Switching Constraints

Latent Weights Do Not Exist: Rethinking Binarized Neural Network Optimization

Competitive Gradient Descent

The Parameterized Complexity of Cascading Portfolio Scheduling

Self-Routing Capsule Networks

What the Vec? Towards Probabilistically Grounded Embeddings

Symmetry-adapted generation of 3d point sets for the targeted discovery of molecules

Nonlinear scaling of resource allocation in sensory bottlenecks

Minimizers of the Empirical Risk and Risk Monotonicity

PerspectiveNet: A Scene-consistent Image Generator for New View Synthesis in Real Indoor Environments

Explicit Planning for Efficient Exploration in Reinforcement Learning

Normalization Helps Training of Quantized LSTM

GRU-ODE-Bayes: Continuous Modeling of Sporadically-Observed Time Series

Neural Spline Flows

Coresets for Archetypal Analysis

Nonzero-sum Adversarial Hypothesis Testing Games

Learning elementary structures for 3D shape generation and matching

Estimating Convergence of Markov chains with L-Lag Couplings

Deep Scale-spaces: Equivariance Over Scale

Escaping from saddle points on Riemannian manifolds

Universal Invariant and Equivariant Graph Neural Networks

Modeling Tabular data using Conditional GAN

Localized Structured Prediction

Learning-Based Low-Rank Approximations

Depth-First Proof-Number Search with Heuristic Edge Cost and Application to Chemical Synthesis Planning

ZO-AdaMM: Zeroth-Order Adaptive Momentum Method for Black-Box Optimization

Toward a Characterization of Loss Functions for Distribution Learning

Manifold-regression to predict from MEG/EEG brain signals without source modeling

Unsupervised Emergence of Egocentric Spatial Structure from Sensorimotor Prediction

Non-Stationary Markov Decision Processes, a Worst-Case Approach using Model-Based Reinforcement Learning

Multi-source Domain Adaptation for Semantic Segmentation

On the Correctness and Sample Complexity of Inverse Reinforcement Learning

PointDAN: A Multi-Scale 3D Domain Adaption Network for Point Cloud Representation

Learning from Label Proportions with Generative Adversarial Networks

Robust Principal Component Analysis with Adaptive Neighbors

On the convergence of single-call stochastic extra-gradient methods

A Mean Field Theory of Quantized Deep Networks: The Quantization-Depth Trade-Off

First Order Motion Model for Image Animation

Beyond Confidence Regions: Tight Bayesian Ambiguity Sets for Robust MDPs

Approximate Bayesian Inference for a Mechanistic Model of Vesicle Release at a Ribbon Synapse

BatchBALD: Efficient and Diverse Batch Acquisition for Deep Bayesian Active Learning

Interior-Point Methods Strike Back: Solving the Wasserstein Barycenter Problem

Subspace Detours: Building Transport Plans that are Optimal on Subspace Projections

High-Quality Self-Supervised Deep Image Denoising

Discriminator optimal transport

Online Prediction of Switching Graph Labelings with Cluster Specialists

Beyond Vector Spaces: Compact Data Representation as Differentiable Weighted Graphs

GIFT: Learning Transformation-Invariant Dense Visual Descriptors via Group CNNs

Blow: a single-scale hyperconditioned flow for non-parallel raw-audio voice conversion

Efficient Smooth Non-Convex Stochastic Compositional Optimization via Stochastic Recursive Gradient Descent

Graph-Based Semi-Supervised Learning with Non-ignorable Non-response

Quantum Wasserstein Generative Adversarial Networks

Are Anchor Points Really Indispensable in Label-Noise Learning?

Learning Nonsymmetric Determinantal Point Processes

Fast AutoAugment

Interval timing in deep reinforcement learning agents

Dichotomize and Generalize: PAC-Bayesian Binary Activated Deep Neural Networks

Weakly Supervised Instance Segmentation using the Bounding Box Tightness Prior

Efficient Pure Exploration in Adaptive Round Model

Neural Shuffle-Exchange Networks - Sequence Processing in O(n log n) Time

High-dimensional multivariate forecasting with low-rank Gaussian Copula Processes

Multi-objects Generation with Amortized Structural Regularization

Discriminative Topic Modeling with Logistic LDA

Semi-supervisedly Co-embedding Attributed Networks

Hyperparameter Learning via Distributional Transfer

DetNAS: Backbone Search for Object Detection

Oblivious Sampling Algorithms for Private Data Analysis

Is Deeper Better only when Shallow is Good?

First-order methods almost always avoid saddle points: The case of vanishing step-sizes

Large Scale Structure of Neural Network Loss Landscapes

From voxels to pixels and back: Self-supervision in natural-image reconstruction from fMRI

Maximum Mean Discrepancy Gradient Flow

Copulas as High-Dimensional Generative Models: Vine Copula Autoencoders

Code Generation as a Dual Task of Code Summarization

BIVA: A Very Deep Hierarchy of Latent Variables for Generative Modeling

Hypothesis Set Stability and Generalization

Bayesian Batch Active Learning as Sparse Subset Approximation

Diffeomorphic Temporal Alignment Nets

Global Sparse Momentum SGD for Pruning Very Deep Neural Networks

Copula Multi-label Learning

Domain Generalization via Model-Agnostic Learning of Semantic Features

Bayesian Learning of Sum-Product Networks

The Normalization Method for Alleviating Pathological Sharpness in Wide Neural Networks

On the Convergence Rate of Training Recurrent Neural Networks

Grid Saliency for Context Explanations of Semantic Segmentation

Anti-efficient encoding in emergent communication

Convergence Guarantees for Adaptive Bayesian Quadrature Methods

Optimal Sparsity-Sensitive Bounds for Distributed Mean Estimation

Singleshot: a scalable Tucker tensor decomposition

Mining GOLD Samples for Conditional GANs

Reliable training and estimation of variance networks

L_DMI: A Novel Information-theoretic Loss Function for Training Deep Nets Robust to Label Noise

Meta-Surrogate Benchmarking for Hyperparameter Optimization

Direct Optimization through $\arg \max$ for Discrete Variational Auto-Encoder

Progressive Augmentation of GANs

Fully Parameterized Quantile Function for Distributional Reinforcement Learning

Neural Machine Translation with Soft Prototype

Learning and Generalization in Overparameterized Neural Networks, Going Beyond Two Layers

Constrained deep neural network architecture search for IoT devices accounting for hardware calibration

Towards a Zero-One Law for Column Subset Selection

Deep Model Transferability from Attribution Maps

Iterative Least Trimmed Squares for Mixed Linear Regression

Intrinsic dimension of data representations in deep neural networks

Distributional Reward Decomposition for Reinforcement Learning

Dual Adversarial Semantics-Consistent Network for Generalized Zero-Shot Learning

Compositional De-Attention Networks

Dynamic Ensemble Modeling Approach to Nonstationary Neural Decoding in Brain-Computer Interfaces

Divergence-Augmented Policy Optimization

Learning Dynamics of Attention: Human Prior for Interpretable Machine Reasoning

AutoAssist: A Framework to Accelerate Training of Deep Neural Networks

A Regularized Approach to Sparse Optimal Policy in Reinforcement Learning

Comparing Unsupervised Word Translation Methods Step by Step

Equipping Experts/Bandits with Long-term Memory

Scalable inference of topic evolution via models for latent geometric structures

Generalized Block-Diagonal Structure Pursuit: Learning Soft Latent Task Assignment against Negative Transfer

Addressing Sample Complexity in Visual Tasks Using HER and Hallucinatory GANs

Doubly-Robust Lasso Bandit

Almost Horizon-Free Structure-Aware Best Policy Identification with a Generative Model

Optimal Best Markovian Arm Identification with Fixed Confidence

Deep Active Learning with a Neural Architecture Search

Co-Generation with GANs using AIS based HMC

Push-pull Feedback Implements Hierarchical Information Retrieval Efficiently

In-Place Zero-Space Memory Protection for CNN

Learning GANs and Ensembles Using Discrepancy

Connective Cognition Network for Directional Visual Commonsense Reasoning

MaCow: Masked Convolutional Generative Flow

Effective End-to-end Unsupervised Outlier Detection via Inlier Priority of Discriminative Network

Mixtape: Breaking the Softmax Bottleneck Efficiently

AttentionXML: Label Tree-based Attention-Aware Deep Model for High-Performance Extreme Multi-Label Text Classification

Comparison Against Task Driven Artificial Neural Networks Reveals Functional Organization of Mouse Visual Cortex

Topology-Preserving Deep Image Segmentation

Variance Reduced Policy Evaluation with Smooth Function Approximation

Learning Disentangled Representations for Recommendation

Graph Neural Tangent Kernel: Fusing Graph Neural Networks with Graph Kernels

A Latent Variational Framework for Stochastic Optimization

Acceleration via Symplectic Discretization of High-Resolution Differential Equations

Limiting Extrapolation in Linear Approximate Value Iteration

Markov Random Fields for Collaborative Filtering

Ouroboros: On Accelerating Training of Transformer-Based Language Models

Quantum Embedding of Knowledge for Reasoning

Focused Quantization for Sparse CNNs

Regularized Gradient Boosting

Robustness to Adversarial Perturbations in Learning from Incomplete Data

An Adaptive Empirical Bayesian Method for Sparse Deep Learning

A Refined Margin Distribution Analysis for Forest Representation Learning

Time-series Generative Adversarial Networks

Einconv: Exploring Unexplored Tensor Network Decompositions for Convolutional Neural Networks

Off-Policy Evaluation via Off-Policy Classification

Latent Ordinary Differential Equations for Irregularly-Sampled Time Series

Input Similarity from the Neural Network Perspective

Learning to Predict Without Looking Ahead: World Models Without Forward Prediction

Characterizing Bias in Classifiers using Generative Models

Balancing Efficiency and Fairness in On-Demand Ridesourcing

Adaptive Sequence Submodularity

Efficient Probabilistic Inference in the Quest for Physics Beyond the Standard Model

Incremental Few-Shot Learning with Attention Attractor Networks

Learning Disentangled Representation for Robust Person Re-identification

Enhancing the Locality and Breaking the Memory Bottleneck of Transformer on Time Series Forecasting

Estimating Entropy of Distributions in Constant Space

On the Accuracy of Influence Functions for Measuring Group Effects

On the Utility of Learning about Humans for Human-AI Coordination

Optimistic Regret Minimization for Extensive-Form Games via Dilated Distance-Generating Functions

Fast Parallel Algorithms for Statistical Subset Selection Problems

Learning Non-Convergent Non-Persistent Short-Run MCMC Toward Energy-Based Model

E2-Train: Training State-of-the-art CNNs with Over 80% Less Energy

On the number of variables to use in principal component regression

Hybrid 8-bit Floating Point (HFP8) Training and Inference for Deep Neural Networks

Tight Certificates of Adversarial Robustness for Randomly Smoothed Classifiers

Scalable Spike Source Localization in Extracellular Recordings using Amortized Variational Inference

Visual Concept-Metaconcept Learning

Data-driven Estimation of Sinusoid Frequencies

Exploration Bonus for Regret Minimization in Discrete and Continuous Average Reward MDPs

PHYRE: A New Benchmark for Physical Reasoning

Neural Similarity Learning

Prior-Free Dynamic Auctions with Low Regret Buyers

One ticket to win them all: generalizing lottery ticket initializations across datasets and optimizers

Multivariate Triangular Quantile Maps for Novelty Detection

ANODEV2: A Coupled Neural ODE Framework

Learning Mean-Field Games

Factor Group-Sparse Regularization for Efficient Low-Rank Matrix Recovery

Global Convergence of Least Squares EM for Demixing Two Log-Concave Densities

Spectral Modification of Graphs for Improved Spectral Clustering

The Convergence Rate of Neural Networks for Learned Functions of Different Frequencies

Fair Algorithms for Clustering

Few-shot Video-to-Video Synthesis

Policy Evaluation with Latent Confounders via Optimal Balance

MixMatch: A Holistic Approach to Semi-Supervised Learning

Mutually Regressive Point Processes

Ordered Memory

Unsupervised Scalable Representation Learning for Multivariate Time Series

Think Globally, Act Locally: A Deep Neural Network Approach to High-Dimensional Time Series Forecasting

Hyperbolic Graph Convolutional Neural Networks

SpArSe: Sparse Architecture Search for CNNs on Resource-Constrained Microcontrollers

A state-space model for inferring effective connectivity of latent neural dynamics from simultaneous EEG/fMRI

Making the Cut: A Bandit-based Approach to Tiered Interviewing

Scalable Bayesian dynamic covariance modeling with variational Wishart and inverse Wishart processes

Efficient Deep Approximation of GMMs

Finite-Time Performance Bounds and Adaptive Learning Rate Selection for Two Time-Scale Reinforcement Learning

Symmetry-Based Disentangled Representation Learning requires Interaction with Environments

Adaptive Cross-Modal Few-shot Learning

On Single Source Robustness in Deep Fusion Models

Exploiting Local and Global Structure for Point Cloud Semantic Segmentation with Contextual Point Representations

Rethinking Deep Neural Network Ownership Verification: Embedding Passports to Defeat Ambiguity Attacks

Game Design for Eliciting Distinguishable Behavior

Cost Effective Active Search

Offline Contextual Bayesian Optimization

Breaking the Glass Ceiling for Embedding-Based Classifiers for Large Output Spaces

End to end learning and optimization on graphs

Partially Encrypted Deep Learning using Functional Encryption

Optimal Sketching for Kronecker Product Regression and Low Rank Approximation

State Aggregation Learning from Markov Transition Data

Multi-relational Poincaré Graph Embeddings

Disentangling Influence: Using disentangled representations to audit model predictions

Double Quantization for Communication-Efficient Distributed Optimization

Decentralized Cooperative Stochastic Bandits

Globally optimal score-based learning of directed acyclic graphs in high-dimensions

U-Time: A Fully Convolutional Network for Time Series Segmentation Applied to Sleep Staging

Learning low-dimensional state embeddings and metastable clusters from time series data

Massively scalable Sinkhorn distances via the Nyström method

No-Press Diplomacy: Modeling Multi-Agent Gameplay

LIIR: Learning Individual Intrinsic Reward in Multi-Agent Reinforcement Learning

Uncertainty-based Continual Learning with Adaptive Regularization

Differentially Private Bagging: Improved utility and cheaper privacy than subsample-and-aggregate

Successor Uncertainties: Exploration and Uncertainty in Temporal Difference Learning

Understanding and Improving Layer Normalization

Learning New Tricks From Old Dogs: Multi-Source Transfer Learning From Pre-Trained Networks

Propagating Uncertainty in Reinforcement Learning via Wasserstein Barycenters

Beyond Alternating Updates for Matrix Factorization with Inertial Bregman Proximal Gradient Algorithms

A Geometric Perspective on Optimal Representations for Reinforcement Learning

Learning Deep Bilinear Transformation for Fine-grained Image Representation

Modeling Uncertainty by Learning a Hierarchy of Deep Neural Connections

On Adversarial Mixup Resynthesis

Flow-based Image-to-Image Translation with Feature Disentanglement

Training Language GANs from Scratch

Limitations of the empirical Fisher approximation for natural gradient descent

Learner-aware Teaching: Inverse Reinforcement Learning with Preferences and Constraints

Data Cleansing for Models Trained with SGD

Practical Deep Learning with Bayesian Principles

Embedding Symbolic Knowledge into Deep Networks

A Prior of a Googol Gaussians: a Tensor Ring Induced Prior for Generative Models

Curvilinear Distance Metric Learning

Approximation Ratios of Graph Neural Networks for Combinatorial Problems

Shape and Time Distortion Loss for Training Deep Time Series Forecasting Models

Thinning for Accelerating the Learning of Point Processes

A First-Order Algorithmic Framework for Wasserstein Distributionally Robust Logistic Regression

Full-Gradient Representation for Neural Network Visualization

Regularized Weighted Low Rank Approximation

Understanding Attention and Generalization in Graph Neural Networks

Efficient Graph Generation with Graph Recurrent Attention Networks

A Normative Theory for Causal Inference and Bayes Factor Computation in Neural Circuits

Bat-G net: Bat-inspired High-Resolution 3D Image Reconstruction using Ultrasonic Echoes

q-means: A quantum algorithm for unsupervised machine learning

Improved Precision and Recall Metric for Assessing Generative Models

Cross Attention Network for Few-shot Classification

iSplit LBI: Individualized Partial Ranking with Ties via Split LBI

MetaQuant: Learning to Quantize by Learning to Penetrate Non-differentiable Quantization

Teaching Multiple Concepts to a Forgetful Learner

SCAN: A Scalable Neural Networks Framework Towards Compact and Efficient Models

Scalable Structure Learning of Continuous-Time Bayesian Networks from Incomplete Data

Efficiently Estimating Erdos-Renyi Graphs with Node Differential Privacy

Stochastic Gradient Hamiltonian Monte Carlo Methods with Recursive Variance Reduction

Learning Generalizable Device Placement Algorithms for Distributed Machine Learning

Unsupervised Keypoint Learning for Guiding Class-Conditional Video Prediction

Handling correlated and repeated measurements with the smoothed multivariate square-root Lasso

PasteGAN: A Semi-Parametric Method to Generate Image from Scene Graph

Uncoupled Regression from Pairwise Comparison Data

Tight Dimension Independent Lower Bound on the Expected Convergence Rate for Diminishing Step Sizes in SGD

Learning Sample-Specific Models with Low-Rank Personalized Regression

Learning Representations for Time Series Clustering

Deconstructing Lottery Tickets: Zeros, Signs, and the Supermask

Privacy-Preserving Classification of Personal Text Messages with Secure Multi-Party Computation

MCP: Learning Composable Hierarchical Control with Multiplicative Compositional Policies

Exact Rate-Distortion in Autoencoders via Echo Noise

Conformalized Quantile Regression

Practical and Consistent Estimation of f-Divergences

Ultra Fast Medoid Identification via Correlated Sequential Halving

Domes to Drones: Self-Supervised Active Triangulation for 3D Human Pose Reconstruction

Subspace Attack: Exploiting Promising Subspaces for Query-Efficient Black-box Attacks

Predicting the Politics of an Image Using Webly Supervised Data

The Landscape of Non-convex Empirical Risk with Degenerate Population Risk

Adaptive GNN for Image Analysis and Editing

Joint Optimization of Tree-based Index and Deep Model for Recommender Systems

Learning Latent Process from High-Dimensional Event Sequences via Efficient Sampling

Dancing to Music

LCA: Loss Change Allocation for Neural Network Training

Thompson Sampling with Information Relaxation Penalties

Coda: An End-to-End Neural Program Decompiler

Assessing Disparate Impact of Personalized Interventions: Identifiability and Bounds

Deep Generalized Method of Moments for Instrumental Variable Analysis

Arbicon-Net: Arbitrary Continuous Geometric Transformation Networks for Image Registration

Capacity Bounded Differential Privacy

The Fairness of Risk Scores Beyond Classification: Bipartite Ranking and the XAUC Metric

PC-Fairness: A Unified Framework for Measuring Causality-based Fairness

First order expansion of convex regularized estimators

Implicitly learning to reason in first-order logic

Kernel-Based Approaches for Sequence Modeling: Connections to Neural Methods

Meta-Curvature

Adversarial training for free!

Intrinsically Efficient, Stable, and Bounded Off-Policy Evaluation for Reinforcement Learning

Communication-Efficient Distributed Learning via Lazily Aggregated Quantized Gradients

Optimal Decision Tree with Noisy Outcomes

Transfusion: Understanding Transfer Learning for Medical Imaging

A Flexible Generative Framework for Graph-based Semi-supervised Learning

Efficient Communication in Multi-Agent Reinforcement Learning via Variance Based Control

Deep Set Prediction Networks

DppNet: Approximating Determinantal Point Processes with Deep Networks

Distinguishing Distributions When Samples Are Strategically Transformed

Implicit Regularization of Discrete Gradient Dynamics in Linear Neural Networks

Fully Dynamic Consistent Facility Location

Neural Lyapunov Control

Augmented Neural ODEs

Deep Signature Transforms

Meta-Reinforced Synthetic Data for One-Shot Fine-Grained Visual Recognition

Convergent Policy Optimization for Safe Reinforcement Learning

Approximate Inference Turns Deep Networks into Gaussian Processes

Scalable Gromov-Wasserstein Learning for Graph Partitioning and Matching

Inherent Weight Normalization in Stochastic Neural Networks

Learning Temporal Pose Estimation from Sparsely-Labeled Videos

Implicit Regularization for Optimal Sparse Recovery

Real-Time Reinforcement Learning

Regret Minimization for Reinforcement Learning by Evaluating the Optimal Bias Function

Multi-mapping Image-to-Image Translation via Learning Disentanglement

Addressing Failure Prediction by Learning Model Confidence

Fooling Neural Network Interpretations via Adversarial Model Manipulation

Copula-like Variational Inference

Backpropagation-Friendly Eigendecomposition

FastSpeech: Fast, Robust and Controllable Text to Speech

Robust Multi-agent Counterfactual Prediction

Locally Private Gaussian Estimation

Spatially Aggregated Gaussian Processes with Multivariate Areal Outputs

Epsilon-Best-Arm Identification in Pay-Per-Reward Multi-Armed Bandits

Turbo Autoencoder: Deep learning based channel codes for point-to-point communication channels

PAC-Bayes under potentially heavy tails

On the Optimality of Perturbations in Stochastic and Adversarial Multi-armed Bandit Problems

Identifying Causal Effects via Context-specific Independence Relations

A Linearly Convergent Proximal Gradient Algorithm for Decentralized Optimization

On the Global Convergence of (Fast) Incremental Expectation Maximization Methods

Regularizing Trajectory Optimization with Denoising Autoencoders

Bridging Machine Learning and Logical Reasoning by Abductive Learning

Classification-by-Components: Probabilistic Modeling of Reasoning over a Set of Components

One-Shot Object Detection with Co-Attention and Co-Excitation

Knowledge Extraction with No Observable Data

Combinatorial Bayesian Optimization using the Graph Cartesian Product

Glyce: Glyph-vectors for Chinese Character Representations

Non-asymptotic Analysis of Stochastic Methods for Non-Smooth Non-Convex Regularized Problems

Discovering Neural Wirings

On the Calibration of Multiclass Classification with Rejection

Conformal Prediction Under Covariate Shift

Drill-down: Interactive Retrieval of Complex Scenes using Natural Language Queries

On Learning Over-parameterized Neural Networks: A Functional Approximation Perspective

Interlaced Greedy Algorithm for Maximization of Submodular Functions in Nearly Linear Time

Information-Theoretic Confidence Bounds for Reinforcement Learning

Optimal Analysis of Subset-Selection Based L_p Low-Rank Approximation

Total Least Squares Regression in Input Sparsity Time

Learning Robust Options by Conditional Value at Risk Optimization

Large Scale Markov Decision Processes with Changing Rewards

Hyper-Graph-Network Decoders for Block Codes

Transfer Anomaly Detection by Inferring Latent Domain Representations

Positive-Unlabeled Compression on the Cloud

SpiderBoost and Momentum: Faster Variance Reduction Algorithms

Third-Person Visual Imitation Learning via Decoupled Hierarchical Controller

Stagewise Training Accelerates Convergence of Testing Error Over SGD

Multiview Aggregation for Learning Category-Specific Shape Reconstruction

Learning Transferable Graph Exploration

Gradient Information for Representation and Modeling

Adapting Neural Networks for the Estimation of Treatment Effects

Direct Estimation of Differential Functional Graphical Models

Semi-Parametric Dynamic Contextual Pricing

Initialization of ReLUs for Dynamical Isometry

Kernel Stein Tests for Multiple Model Comparison

Rethinking the CSC Model for Natural Images

Disentangled behavioural representations

More Is Less: Learning Efficient Video Representations by Big-Little Network and Depthwise Temporal Aggregation

Deep Structured Prediction for Facial Landmark Detection

Integrating Bayesian and Discriminative Sparse Kernel Machines for Multi-class Active Learning

Park: An Open Platform for Learning-Augmented Computer Systems

Partitioning Structure Learning for Segmented Linear Regression Trees

Online Stochastic Shortest Path with Bandit Feedback and Unknown Transition Function

Sample-Efficient Deep Reinforcement Learning via Episodic Backward Update

Information Competing Process for Learning Diversified Representations

Minimax Optimal Estimation of Approximate Differential Privacy on Neighboring Databases

Order Optimal One-Shot Distributed Learning

Discrimination in Online Markets: Effects of Social Bias on Learning from Reviews and Policy Design

Controllable Text-to-Image Generation

Rethinking Generative Mode Coverage: A Pointwise Guaranteed Approach

Improving Textual Network Learning with Variational Homophilic Embeddings

Controlling Neural Level Sets

CNN$^{2}$: Viewpoint Generalization via a Binocular Vision

GENO -- GENeric Optimization for Classical Machine Learning

Fully Neural Network based Model for General Temporal Point Processes

Provably Powerful Graph Networks

A Tensorized Transformer for Language Modeling

Defense Against Adversarial Attacks Using Feature Scattering-based Adversarial Training

Catastrophic Forgetting Meets Negative Transfer: Batch Spectral Shrinkage for Safe Transfer Learning

Blended Matching Pursuit

Neural networks grown and self-organized by noise

Nonconvex Low-Rank Symmetric Tensor Completion from Noisy Data

XNAS: Neural Architecture Search with Expert Advice

Meta-Weight-Net: Learning an Explicit Mapping For Sample Weighting

Variational Structured Semantic Inference for Diverse Image Captioning

Transferable Normalization: Towards Improving Transferability of Deep Neural Networks

Multi-marginal Wasserstein GAN

On the Curved Geometry of Accelerated Optimization

Self-Supervised Generalisation with Meta Auxiliary Learning

ResNets Ensemble via the Feynman-Kac Formalism to Improve Natural and Robust Accuracies

On the Ineffectiveness of Variance Reduced Optimization for Deep Learning

The Cells Out of Sample (COOS) dataset and benchmarks for measuring out-of-sample generalization of image classifiers

An Improved Analysis of Training Over-parameterized Deep Neural Networks

Gate Decorator: Global Filter Pruning Method for Accelerating Deep Convolutional Neural Networks

Cascaded Dilated Dense Network with Two-step Data Consistency for MRI Reconstruction

Importance Resampling for Off-policy Prediction

Meta-Learning Representations for Continual Learning

A New Defense Against Adversarial Images: Turning a Weakness into a Strength

Generalized Off-Policy Actor-Critic

Multivariate Sparse Coding of Nonstationary Covariances with Gaussian Processes

Variational Denoising Network: Toward Blind Noise Modeling and Removal

Learnable Tree Filter for Structure-preserving Feature Transform

Unconstrained Monotonic Neural Networks

Random deep neural networks are biased towards simple functions

Incremental Scene Synthesis

Coordinated hippocampal-entorhinal replay as structural inference

Visualizing the PHATE of Neural Networks

SSRGD: Simple Stochastic Recursive Gradient Descent for Escaping Saddle Points

Learning Imbalanced Datasets with Label-Distribution-Aware Margin Loss

Data-Dependence of Plateau Phenomenon in Learning with Neural Network --- Statistical Mechanical Analysis

Fixing Implicit Derivatives: Trust-Region Based Learning of Continuous Energy Functions

Expressive power of tensor-network factorizations for probabilistic modeling

Hierarchical Optimal Transport for Document Representation

Hyperspherical Prototype Networks

Neural Diffusion Distance for Image Segmentation

Fine-grained Optimization of Deep Neural Networks

Extending Stein's unbiased risk estimator to train deep denoisers with correlated pairs of noisy images

Computing Full Conformal Prediction Set with Approximate Homotopy

Multi-View Reinforcement Learning

Sampling Sketches for Concave Sublinear Functions of Frequencies

Distributional Policy Optimization: An Alternative Approach for Continuous Control

Hierarchical Reinforcement Learning with Advantage-Based Auxiliary Rewards

On The Classification-Distortion-Perception Tradeoff

HyperGCN: A New Method For Training Graph Convolutional Networks on Hypergraphs

Failing Loudly: An Empirical Study of Methods for Detecting Dataset Shift

CondConv: Conditionally Parameterized Convolutions for Efficient Inference

Conditional Structure Generation through Graph Variational Generative Adversarial Nets

Online sampling from log-concave distributions

Model Compression with Adversarial Robustness: A Unified Optimization Framework

Gated CRF Loss for Weakly Supervised Semantic Image Segmentation

Cross-channel Communication Networks

Image Synthesis with a Single (Robust) Classifier

Optimal Statistical Rates for Decentralised Non-Parametric Regression with Linear Speed-Up

Envy-Free Classification

Regression Planning Networks

Control Batch Size and Learning Rate to Generalize Well: Theoretical and Empirical Evidence

Convolution with even-sized kernels and symmetric padding

Combinatorial Bandits with Relative Feedback

Reconciling λ-Returns with Experience Replay

A Graph Theoretic Framework of Recomputation Algorithms for Memory-Efficient Backpropagation

Explicit Disentanglement of Appearance and Perspective in Generative Models

General Proximal Incremental Aggregated Gradient Algorithms: Better and Novel Results under General Scheme

Selecting the independent coordinates of manifolds with large aspect ratios

Polynomial Cost of Adaptation for X-Armed Bandits

Nonparametric Regressive Point Processes Based on Conditional Gaussian Processes

Combinatorial Inference against Label Noise

Powerset Convolutional Neural Networks

Deep Supervised Summarization: Algorithm and Application to Learning Instructions

Learn, Imagine and Create: Text-to-Image Generation from Prior Knowledge

Fast Low-rank Metric Learning for Large-scale and High-dimensional Data

Memory-oriented Decoder for Light Field Salient Object Detection

Correlated Uncertainty for Learning Dense Correspondences from Noisy Labels

Deep Learning without Weight Transport

DATA: Differentiable ArchiTecture Approximation

Network Pruning via Transformable Architecture Search

Selecting Optimal Decisions via Distributionally Robust Nearest-Neighbor Regression

An Accelerated Decentralized Stochastic Proximal Algorithm for Finite Sums

Saccader: Improving Accuracy of Hard Attention Models for Vision

Why Can't I Dance in the Mall? Learning to Mitigate Scene Bias in Action Recognition

Volumetric Correspondence Networks for Optical Flow

Optimal Pricing in Repeated Posted-Price Auctions with Different Patience of the Seller and the Buyer

Value Propagation for Decentralized Networked Deep Multi-agent Reinforcement Learning

Importance Weighted Hierarchical Variational Inference

Staying up to Date with Online Content Changes Using Reinforcement Learning for Scheduling

Differentially Private Bayesian Linear Regression

DISN: Deep Implicit Surface Network for High-quality Single-view 3D Reconstruction

NeurVPS: Neural Vanishing Point Scanning via Conic Convolution

Secretary Ranking with Minimal Inversions

Multi-label Co-regularization for Semi-supervised Facial Action Unit Recognition

Trust Region-Guided Proximal Policy Optimization

Differentiable Cloth Simulation for Inverse Problems

RSN: Randomized Subspace Newton

NAT: Neural Architecture Transformer for Accurate and Compact Architectures

No Pressure! Addressing the Problem of Local Minima in Manifold Learning Algorithms

Multiway clustering via tensor block models

Adversarial Self-Defense for Cycle-Consistent GANs

RUBi: Reducing Unimodal Biases for Visual Question Answering

Zero-Shot Semantic Segmentation

Learning to Predict Layout-to-image Conditional Convolutions for Semantic Image Synthesis

Uniform Error Bounds for Gaussian Process Regression with Application to Safe Control

Category Anchor-Guided Unsupervised Domain Adaptation for Semantic Segmentation

Towards closing the gap between the theory and practice of SVRG

ETNet: Error Transition Network for Arbitrary Style Transfer

Poisson-Randomized Gamma Dynamical Systems

vGraph: A Generative Model for Joint Community Detection and Node Representation Learning

Equitable Stable Matchings in Quadratic Time

Learning Erdos-Renyi Random Graphs via Edge Detecting Queries

Semantic Conditioned Dynamic Modulation for Temporal Sentence Grounding in Videos

Invert to Learn to Invert

Metric Learning for Adversarial Robustness

Chasing Ghosts: Instruction Following as Bayesian State Tracking

Learning Conditional Deformable Templates with Convolutional Networks

Block Coordinate Regularization by Denoising

Reducing Noise in GAN Training with Variance Reduced Extragradient

A Primal-Dual link between GANs and Autoencoders

Provable Gradient Variance Guarantees for Black-Box Variational Inference

Deep ReLU Networks Have Surprisingly Few Activation Patterns

Differentially Private Algorithms for Learning Mixtures of Separated Gaussians

Noise-tolerant fair classification

First Exit Time Analysis of Stochastic Gradient Descent Under Heavy-Tailed Gradient Noise

Experience Replay for Continual Learning

Joint-task Self-supervised Learning for Temporal Correspondence

Generalization in Generative Adversarial Networks: A Novel Perspective from Privacy Protection

Social-BiGAT: Multimodal Trajectory Forecasting using Bicycle-GAN and Graph Attention Networks

You Only Propagate Once: Accelerating Adversarial Training via Maximal Principle

Generalized Sliced Wasserstein Distances

Average-Case Averages: Private Algorithms for Smooth Sensitivity and Mean Estimation

Meta-Learning with Implicit Gradients

Zero-shot Learning via Simultaneous Generating and Learning

DeepUSPS: Deep Robust Unsupervised Saliency Prediction via Self-supervision

The Point Where Reality Meets Fantasy: Mixed Adversarial Generators for Image Splice Detection

Multi-Resolution Weak Supervision for Sequential Data

Stand-Alone Self-Attention in Vision Models

Private Hypothesis Selection

FreeAnchor: Learning to Match Anchors for Visual Object Detection

GPipe: Efficient Training of Giant Neural Networks using Pipeline Parallelism

Unsupervised learning of object structure and dynamics from videos

Stochastic Shared Embeddings: Data-driven Regularization of Embedding Layers

Faster width-dependent algorithm for mixed packing and covering LPs

Exponentially convergent stochastic k-PCA without variance reduction

Causal Confusion in Imitation Learning

Necessary and Sufficient Geometries for Gradient Methods

Geometry-Aware Neural Rendering

ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations for Vision-and-Language Tasks

Generative Modeling by Estimating Gradients of the Data Distribution

R2D2: Reliable and Repeatable Detector and Descriptor

Understanding Sparse JL for Feature Hashing

Trajectory of Alternating Direction Method of Multipliers and Adaptive Acceleration

Uniform convergence may be unable to explain generalization in deep learning

Using a Logarithmic Mapping to Enable Lower Discount Factors in Reinforcement Learning

Average Individual Fairness: Algorithms, Generalization and Experiments

High Fidelity Video Prediction with Large Stochastic Recurrent Neural Networks

A neurally plausible model learns successor representations in partially observable environments

Unsupervised Scale-consistent Depth and Ego-motion Learning from Monocular Video

Optimizing Generalized Rate Metrics with Three Players

Variance Reduction for Matrix Games

On Making Stochastic Classifiers Deterministic

Scalable Bayesian inference of dendritic voltage via spatiotemporal recurrent state space models

On Robustness of Principal Component Regression

Nonparametric Density Estimation & Convergence Rates for GANs under Besov IPM Losses

Updates of Equilibrium Prop Match Gradients of Backprop Through Time in an RNN with Static Input

Logarithmic Regret for Online Control

Fast and Accurate Least-Mean-Squares Solvers

XLNet: Generalized Autoregressive Pretraining for Language Understanding

Efficient and Thrifty Voting by Any Means Necessary

Dynamics of stochastic gradient descent for two-layer neural networks in the teacher-student setup

Kernel Instrumental Variable Regression

Putting An End to End-to-End: Gradient-Isolated Learning of Representations

Blind Super-Resolution Kernel Estimation using an Internal-GAN

Parameter elimination in particle Gibbs sampling

Guided Similarity Separation for Image Retrieval

HYPE: A Benchmark for Human eYe Perceptual Evaluation of Generative Models

Scene Representation Networks: Continuous 3D-Structure-Aware Neural Scene Representations

Strategizing against No-regret Learners