NeurIPS 2021 Papers

Learning Gaussian Mixtures with Generalized Linear Models: Precise Asymptotics in High-dimensions

Mitigating Covariate Shift in Imitation Learning via Offline Data With Partial Coverage

Flexible Option Learning

Landscape analysis of an improved power method for tensor decomposition

Explicit loss asymptotics in the gradient descent training of neural networks

COCO-LM: Correcting and Contrasting Text Sequences for Language Model Pretraining

Compositional Modeling of Nonlinear Dynamical Systems with ODE-based Random Features

$\alpha$-IoU: A Family of Power Intersection over Union Losses for Bounding Box Regression

A Minimalist Approach to Offline Reinforcement Learning

Time Discretization-Invariant Safe Action Repetition for Policy Gradient Methods

Conservative Data Sharing for Multi-Task Offline Reinforcement Learning

Decrypting Cryptic Crosswords: Semantically Complex Wordplay Puzzles as a Target for NLP

Spherical Motion Dynamics: Learning Dynamics of Normalized Neural Network using SGD and Weight Decay

A Gaussian Process-Bayesian Bernoulli Mixture Model for Multi-Label Active Learning

Exploring Social Posterior Collapse in Variational Autoencoder for Interaction Modeling

Sparse Training via Boosting Pruning Plasticity with Neuroregeneration

Large-Scale Unsupervised Object Discovery

Across-animal odor decoding by probabilistic manifold alignment

Score-based Generative Neural Networks for Large-Scale Optimal Transport

On Plasticity, Invariance, and Mutually Frozen Weights in Sequential Task Learning

Statistical Regeneration Guarantees of the Wasserstein Autoencoder with Latent Space Consistency

EIGNN: Efficient Infinite-Depth Graph Neural Networks

Unsupervised Noise Adaptive Speech Enhancement by Discriminator-Constrained Optimal Transport

Credal Self-Supervised Learning

Distributed Deep Learning In Open Collaborations

Skipping the Frame-Level: Event-Based Piano Transcription With Neural Semi-CRFs

Profiling Pareto Front With Multi-Objective Stein Variational Gradient Descent

Sequential Algorithms for Testing Closeness of Distributions

INDIGO: GNN-Based Inductive Knowledge Graph Completion Using Pair-Wise Encoding

Detecting Moments and Highlights in Videos via Natural Language Queries

Joint Inference for Neural Network Depth and Dropout Regularization

Lifelong Domain Adaptation via Consolidated Internal Distribution

Learning latent causal graphs via mixture oracles

Container: Context Aggregation Networks

Semialgebraic Representation of Monotone Deep Equilibrium Models and Applications to Certification

The Limitations of Large Width in Neural Networks: A Deep Gaussian Process Perspective

Temporally Abstract Partial Models

Near-Optimal Offline Reinforcement Learning via Double Variance Reduction

FL-WBC: Enhancing Robustness against Model Poisoning Attacks in Federated Learning from a Client Perspective

One Explanation is Not Enough: Structured Attention Graphs for Image Classification

Overinterpretation reveals image classification model pathologies

Good Classification Measures and How to Find Them

BNS: Building Network Structures Dynamically for Continual Learning

Cockpit: A Practical Debugging Tool for the Training of Deep Neural Networks

TRS: Transferability Reduced Ensemble via Promoting Gradient Diversity and Model Smoothness

Automorphic Equivalence-aware Graph Neural Network

Direct Multi-view Multi-person 3D Pose Estimation

Learnability of Linear Thresholds from Label Proportions

Regret Bounds for Gaussian-Process Optimization in Large Domains

On Episodes, Prototypical Networks, and Few-Shot Learning

Trustworthy Multimodal Regression with Mixture of Normal-inverse Gamma Distributions

CATs: Cost Aggregation Transformers for Visual Correspondence

Efficient Training of Retrieval Models using Negative Cache

Differentiable Multiple Shooting Layers

Deep Explicit Duration Switching Models for Time Series

Offline RL Without Off-Policy Evaluation

A Provably Efficient Model-Free Posterior Sampling Method for Episodic Reinforcement Learning

Towards Open-World Feature Extrapolation: An Inductive Graph Learning Approach

Validation Free and Replication Robust Volume-based Data Valuation

Graph Neural Networks with Adaptive Residual

Efficient Combination of Rematerialization and Offloading for Training DNNs

Conservative Offline Distributional Reinforcement Learning

On Model Calibration for Long-Tailed Object Detection and Instance Segmentation

Generative Occupancy Fields for 3D Surface-Aware Image Synthesis

TNASP: A Transformer-based NAS Predictor with a Self-evolution Framework

Learning Generative Vision Transformer with Energy-Based Latent Space for Saliency Prediction

Learning Compact Representations of Neural Networks using DiscriminAtive Masking (DAM)

Influence Patterns for Explaining Information Flow in BERT

Towards mental time travel: a hierarchical memory for reinforcement learning agents

Explaining heterogeneity in medial entorhinal cortex with task-driven neural networks

Provable Guarantees for Self-Supervised Deep Learning with Spectral Contrastive Loss

Online Knapsack with Frequency Predictions

Neural Regression, Representational Similarity, Model Zoology & Neural Taskonomy at Scale in Rodent Visual Cortex

The future is log-Gaussian: ResNets and their infinite-depth-and-width limit at initialization

A Mathematical Framework for Quantifying Transferability in Multi-source Transfer Learning

Panoptic 3D Scene Reconstruction From a Single RGB Image

PTR: A Benchmark for Part-based Conceptual, Relational, and Physical Reasoning

Improving Coherence and Consistency in Neural Sequence Models with Dual-System, Neuro-Symbolic Reasoning

3DP3: 3D Scene Perception via Probabilistic Programming

Learning with Holographic Reduced Representations

Convex Polytope Trees

You Are the Best Reviewer of Your Own Papers: An Owner-Assisted Scoring Mechanism

Online Control of Unknown Time-Varying Dynamical Systems

Dynamic Visual Reasoning by Learning Differentiable Physics Models from Video and Language

Counterbalancing Learning and Strategic Incentives in Allocation Markets

Low-dimensional Structure in the Space of Language Representations is Reflected in Brain Responses

Agnostic Reinforcement Learning with Low-Rank MDPs and Rich Observations

Improving Computational Efficiency in Visual Reinforcement Learning via Stored Embeddings

Reinforcement Learning with Latent Flow

Behavior From the Void: Unsupervised Active Pre-Training

Uniform Convergence of Interpolators: Gaussian Width, Norm Bounds and Benign Overfitting

Probing Inter-modality: Visual Parsing with Self-Attention for Vision-and-Language Pre-training

Towards Gradient-based Bilevel Optimization with Non-convex Followers and Beyond

Information is Power: Intrinsic Control via Information Capture

CHIP: CHannel Independence-based Pruning for Compact Neural Networks

Improving Generalization in Meta-RL with Imaginary Tasks from Latent Dynamics Mixture

Instance-optimal Mean Estimation Under Differential Privacy

Weak-shot Fine-grained Classification via Similarity Transfer

A Continuous Mapping For Augmentation Design

Towards robust vision by multi-task learning on monkey visual cortex

SE(3)-equivariant prediction of molecular wavefunctions and electronic densities

Analysis of one-hidden-layer neural networks via the resolvent method

A Probabilistic State Space Model for Joint Inference from Differential Equations and Data

Differentiable Simulation of Soft Multi-body Systems

Hierarchical Skills for Efficient Exploration

Sample-Efficient Learning of Stackelberg Equilibria in General-Sum Games

Improved Transformer for High-Resolution GANs

Stabilizing Deep Q-Learning with ConvNets and Vision Transformers under Data Augmentation

Tractable Regularization of Probabilistic Circuits

Batch Multi-Fidelity Bayesian Optimization with Deep Auto-Regressive Networks

Graph Posterior Network: Bayesian Predictive Uncertainty for Node Classification

Heterogeneous Multi-player Multi-armed Bandits: Closing the Gap and Generalization

Sim and Real: Better Together

Matrix factorisation and the interpretation of geodesic distance

Marginalised Gaussian Processes with Nested Sampling

Grounding Spatio-Temporal Language with Transformers

K-Net: Towards Unified Image Segmentation

Neural Algorithmic Reasoners are Implicit Planners

Active 3D Shape Reconstruction from Vision and Touch

Overcoming the curse of dimensionality with Laplacian regularization in semi-supervised learning

Adversarial Attacks on Black Box Video Classifiers: Leveraging the Power of Geometric Transformations

A self consistent theory of Gaussian Processes captures feature learning effects in finite CNNs

Fast Approximate Dynamic Programming for Infinite-Horizon Markov Decision Processes

Speedy Performance Estimation for Neural Architecture Search

Scalable Thompson Sampling using Sparse Gaussian Process Models

Reliable Decisions with Threshold Calibration

MetaAvatar: Learning Animatable Clothed Human Models from Few Depth Images

Learning Large Neighborhood Search Policy for Integer Programming

Corruption Robust Active Learning

A Critical Look at the Consistency of Causal Estimation with Deep Latent Variable Models

Non-local Latent Relation Distillation for Self-Adaptive 3D Human Pose Estimation

Piper: Multidimensional Planner for DNN Parallelization

Post-Contextual-Bandit Inference

CrypTen: Secure Multi-Party Computation Meets Machine Learning

Continuous Mean-Covariance Bandits

Controlling Neural Networks with Rule Representations

Compressing Neural Networks: Towards Determining the Optimal Layer-wise Decomposition

Hierarchical Clustering: $O(1)$-Approximation for Well-Clustered Graphs

Efficient constrained sampling via the mirror-Langevin algorithm

Towards Robust Bisimulation Metric Learning

Amortized Variational Inference for Simple Hierarchical Models

Repulsive Deep Ensembles are Bayesian

Algorithmic stability and generalization of an unsupervised feature selection algorithm

LSH-SMILE: Locality Sensitive Hashing Accelerated Simulation and Learning

Towards Better Understanding of Training Certifiably Robust Models against Adversarial Examples

Learning MDPs from Features: Predict-Then-Optimize for Sequential Decision Making by Reinforcement Learning

RLlib Flow: Distributed Reinforcement Learning is a Dataflow Problem

AC-GC: Lossy Activation Compression with Guaranteed Convergence

Near-Optimal No-Regret Learning in General Games

It Has Potential: Gradient-Driven Denoisers for Convergent Solutions to Inverse Problems

Shift Invariance Can Reduce Adversarial Robustness

DNN-based Topology Optimisation: Spatial Invariance and Neural Tangent Kernel

Neural Production Systems

Neural Active Learning with Performance Guarantees

Equivariant Manifold Flows

Reinforcement Learning in Newcomblike Environments

Disrupting Deep Uncertainty Estimation Without Harming Accuracy

Fairness in Ranking under Uncertainty

Identifiable Generative models for Missing Not at Random Data Imputation

Multi-view Contrastive Graph Clustering

Unifying lower bounds on prediction dimension of convex surrogates

Learning Stochastic Majority Votes by Minimizing a PAC-Bayes Generalization Bound

Misspecified Gaussian Process Bandit Optimization

Reliable Estimation of KL Divergence using a Discriminator in Reproducing Kernel Hilbert Space

DeepGEM: Generalized Expectation-Maximization for Blind Inversion

Parameter-free HE-friendly Logistic Regression

Imitation with Neural Density Models

SubTab: Subsetting Features of Tabular Data for Self-Supervised Representation Learning

Fair Exploration via Axiomatic Bargaining

No Regrets for Learning the Prior in Bandits

Privately Publishable Per-instance Privacy

Learning the optimal Tikhonov regularizer for inverse problems

Slow Learning and Fast Inference: Efficient Graph Similarity Computation via Knowledge Distillation

Fast Algorithms for $L_\infty$-constrained S-rectangular Robust MDPs

A Trainable Spectral-Spatial Sparse Coding Model for Hyperspectral Image Restoration

Weighted model estimation for offline model-based reinforcement learning

Bellman-consistent Pessimism for Offline Reinforcement Learning

Settling the Variance of Multi-Agent Policy Gradients

Cortico-cerebellar networks as decoupling neural interfaces

Recursive Causal Structure Learning in the Presence of Latent Variables and Selection Bias

A flow-based latent state generative model of neural population responses to natural images

Monte Carlo Tree Search With Iteratively Refining State Abstractions

Editing a classifier by rewriting its prediction rules

Training Neural Networks with Fixed Sparse Masks

Generalization Bounds For Meta-Learning: An Information-Theoretic Analysis

ByPE-VAE: Bayesian Pseudocoresets Exemplar VAE

Particle Cloud Generation with Message Passing Generative Adversarial Networks

Learning to Execute: Efficient Learning of Universal Plan-Conditioned Policies in Robotics

PSD Representations for Effective Probability Models

Breaking the Moments Condition Barrier: No-Regret Algorithm for Bandits with Super Heavy-Tailed Payoffs

Memory-efficient Patch-based Inference for Tiny Deep Learning

Boost Neural Networks by Checkpoints

Explainable Semantic Space by Grounding Language to Vision with Cross-Modal Contrastive Learning

SOAT: A Scene- and Object-Aware Transformer for Vision-and-Language Navigation

Learning 3D Dense Correspondence via Canonical Point Autoencoder

On the interplay between data structure and loss function in classification problems

Robust Compressed Sensing MRI with Deep Generative Priors

Scalable Intervention Target Estimation in Linear Models

Hierarchical Reinforcement Learning with Timed Subgoals

Selective Sampling for Online Best-arm Identification

Excess Capacity and Backdoor Poisoning

Multimodal and Multilingual Embeddings for Large-Scale Speech Mining

Learning Frequency Domain Approximation for Binary Neural Networks

Clustering Effect of Adversarial Robust Models

Flow Network based Generative Models for Non-Iterative Diverse Candidate Generation

Beyond Pinball Loss: Quantile Methods for Calibrated Uncertainty Quantification

Functional Neural Networks for Parametric Image Restoration Problems

Delayed Gradient Averaging: Tolerate the Communication Latency for Federated Learning

Understanding the Generalization Benefit of Model Invariance from a Data Perspective

Predicting Deep Neural Network Generalization with Perturbation Response Curves

Escape saddle points by a simple gradient-descent based algorithm

Your head is there to move you around: Goal-driven models of the primate dorsal pathway

SUPER-ADAM: Faster and Universal Framework of Adaptive Gradients

SGD: The Role of Implicit Regularization, Batch-size and Multiple-epochs

Decentralized Q-learning in Zero-sum Markov Games

Hit and Lead Discovery with Explorative RL and Fragment-based Molecule Generation

Channel Permutations for N:M Sparsity

Dynamic influence maximization

Closing the loop in medical decision support by understanding clinical decision-making: A case study on organ transplantation

Human-Adversarial Visual Question Answering

Non-approximate Inference for Collective Graphical Models on Path Graphs via Discrete Difference of Convex Algorithm

What Matters for Adversarial Imitation Learning?

Accumulative Poisoning Attacks on Real-time Data

IQ-Learn: Inverse soft-Q Learning for Imitation

ParK: Sound and Efficient Kernel Ridge Regression by Feature Space Partitions

MarioNette: Self-Supervised Sprite Learning

Regulating algorithmic filtering on social media

Visualizing the Emergence of Intermediate Visual Patterns in DNNs

CBP: backpropagation with constraint on weight precision using a pseudo-Lagrange multiplier method

Bridging the Gap Between Practice and PAC-Bayes Theory in Few-Shot Meta-Learning

On The Structure of Parametric Tournaments with Application to Ranking from Pairwise Comparisons

A Geometric Perspective towards Neural Calibration via Sensitivity Decomposition

Improving Transferability of Representations via Augmentation-Aware Self-Supervision

Policy Learning Using Weak Supervision

Learning on Random Balls is Sufficient for Estimating (Some) Graph Parameters

Passive attention in artificial neural networks predicts human visual selectivity

Do Vision Transformers See Like Convolutional Neural Networks?

Information-constrained optimization: can adaptive processing of gradients help?

Fast Tucker Rank Reduction for Non-Negative Tensors Using Mean-Field Approximation

Learning Hard Optimization Problems: A Data Generation Perspective

Similarity and Matching of Neural Network Representations

NEO: Non Equilibrium Sampling on the Orbits of a Deterministic Transform

Neural-PIL: Neural Pre-Integrated Lighting for Reflectance Decomposition

Scaling up Continuous-Time Markov Chains Helps Resolve Underspecification

Noisy Recurrent Neural Networks

Multi-modal Dependency Tree for Video Captioning

Deep Reinforcement Learning at the Edge of the Statistical Precipice

Shaping embodied agent behavior with activity-context priors from egocentric video

Representation Learning Beyond Linear Prediction Functions

How Modular should Neural Module Networks Be for Systematic Generalization?

PolarStream: Streaming Object Detection and Segmentation with Polar Pillars

SSMF: Shifting Seasonal Matrix Factorization

Average-Reward Learning and Planning with Options

Nonsmooth Implicit Differentiation for Machine-Learning and Optimization

Numerical influence of ReLU’(0) on backpropagation

Optimal Gradient-based Algorithms for Non-concave Bandit Optimization

Towards Sample-Optimal Compressive Phase Retrieval with Sparse and Generative Priors

PettingZoo: Gym for Multi-Agent Reinforcement Learning

HNPE: Leveraging Global Parameters for Neural Posterior Estimation

Partition-Based Formulations for Mixed-Integer Optimization of Trained ReLU Neural Networks

Gradient-based Hyperparameter Optimization Over Long Horizons

On Pathologies in KL-Regularized Reinforcement Learning from Expert Demonstrations

Multimodal Virtual Point 3D Detection

VidLanKD: Improving Language Understanding via Video-Distilled Knowledge Transfer

Convergence and Alignment of Gradient Descent with Random Backpropagation Weights

Implicit Task-Driven Probability Discrepancy Measure for Unsupervised Domain Adaptation

Referring Transformer: A One-step Approach to Multi-task Visual Grounding

Learning Diverse Policies in MOBA Games via Macro-Goals

Deformable Butterfly: A Highly Structured and Sparse Linear Transform

MAP Propagation Algorithm: Faster Learning with a Team of Reinforcement Learning Agents

An Analysis of Constant Step Size SGD in the Non-convex Regime: Asymptotic Normality and Bias

Understanding the Effect of Stochasticity in Policy Optimization

Deep Learning on a Data Diet: Finding Important Examples Early in Training

Reducing Information Bottleneck for Weakly Supervised Semantic Segmentation

Bayesian Optimization of Function Networks

Prior-independent Dynamic Auctions for a Value-maximizing Buyer

Constrained Robust Submodular Partitioning

Iterative Connecting Probability Estimation for Networks

Forster Decomposition and Learning Halfspaces with Noise

On Joint Learning for Solving Placement and Routing in Chip Design

End-to-End Weak Supervision

A Theory-Driven Self-Labeling Refinement Method for Contrastive Representation Learning

Characterizing the risk of fairwashing

Searching the Search Space of Vision Transformer

On Learning Domain-Invariant Representations for Transfer Learning with Multiple Sources

Perceptual Score: What Data Modalities Does Your Model Perceive?

On UMAP's True Loss Function

Stable, Fast and Accurate: Kernelized Attention with Relative Positional Encoding

Adder Attention for Vision Transformer

Provably Efficient Black-Box Action Poisoning Attacks Against Reinforcement Learning

COMBO: Conservative Offline Model-Based Policy Optimization

Learning to Schedule Heuristics in Branch and Bound

TAAC: Temporally Abstract Actor-Critic for Continuous Control

Spectrum-to-Kernel Translation for Accurate Blind Image Super-Resolution

Unintended Selection: Persistent Qualification Rate Disparities and Interventions

Stable Neural ODE with Lyapunov-Stable Equilibrium Points for Defending Against Adversarial Attacks

Unifying Gradient Estimators for Meta-Reinforcement Learning via Off-Policy Evaluation

Open-set Label Noise Can Improve Robustness Against Inherent Label Noise

Self-Supervised Learning of Event-Based Optical Flow with Spiking Neural Networks

Deeply Shared Filter Bases for Parameter-Efficient Convolutional Neural Networks

Doubly Robust Thompson Sampling with Linear Payoffs

Only Train Once: A One-Shot Neural Network Training And Pruning Framework

Fairness via Representation Neutralization

A No-go Theorem for Robust Acceleration in the Hyperbolic Plane

Probabilistic Transformer For Time Series Analysis

Learning Robust Hierarchical Patterns of Human Brain across Many fMRI Studies

Self-Adaptable Point Processes with Nonparametric Time Decays

RED: Looking for Redundancies for Data-Free Structured Compression of Deep Neural Networks

Adversarially Robust Change Point Detection

The Limits of Optimal Pricing in the Dark

Making the most of your day: online learning for optimal allocation of time

Visual Search Asymmetry: Deep Nets and Humans Share Similar Inherent Biases

Invertible DenseNets with Concatenated LipSwish

Pareto-Optimal Learning-Augmented Algorithms for Online Conversion Problems

Non-asymptotic Error Bounds for Bidirectional GANs

Iterative Teacher-Aware Learning

Stochastic $L^\natural$-convex Function Minimization

BatchQuant: Quantized-for-all Architecture Search with Robust Quantizer

Post-Training Sparsity-Aware Quantization

Accurately Solving Rod Dynamics with Graph Learning

Online and Offline Reinforcement Learning by Planning with a Learned Model

Symplectic Adjoint Method for Exact Gradient of Neural ODE with Minimal Memory

Sample-Efficient Reinforcement Learning Is Feasible for Linearly Realizable MDPs with Limited Revisiting

Greedy and Random Quasi-Newton Methods with Faster Explicit Superlinear Convergence

Heavy Ball Momentum for Conditional Gradient

On the Out-of-distribution Generalization of Probabilistic Image Modelling

Neural Architecture Dilation for Adversarial Robustness

Even your Teacher Needs Guidance: Ground-Truth Targets Dampen Regularization Imposed by Self-Distillation

Revisiting Smoothed Online Learning

Learning interaction rules from multi-animal trajectories via augmented behavioral models

A Constant Approximation Algorithm for Sequential Random-Order No-Substitution k-Median Clustering

Proxy Convexity: A Unified Framework for the Analysis of Neural Networks Trained by Gradient Descent

Dynamic Sasvi: Strong Safe Screening for Norm-Regularized Least Squares

Scalable Inference of Sparsely-changing Gaussian Markov Random Fields

Capturing implicit hierarchical structure in 3D biomedical images with self-supervised hyperbolic representations

Adaptive Risk Minimization: Learning to Adapt to Domain Shift

Analyzing the Confidentiality of Undistillable Teachers in Knowledge Distillation

Scalable Rule-Based Representation Learning for Interpretable Classification

Parallel and Efficient Hierarchical k-Median Clustering

Training Feedback Spiking Neural Networks by Implicit Differentiation on the Equilibrium State

Conic Blackwell Algorithm: Parameter-Free Convex-Concave Saddle-Point Solving

Robust Allocations with Diversity Constraints

Optimality and Stability in Federated Learning: A Game-theoretic Approach

Dynamic Causal Bayesian Optimization

An Efficient Pessimistic-Optimistic Algorithm for Stochastic Linear Bandits with General Constraints

Robustifying Algorithms of Learning Latent Trees with Vector Variables

Generalization Guarantee of SGD for Pairwise Learning

Universal Off-Policy Evaluation

Calibration and Consistency of Adversarial Surrogate Losses

On the Convergence of Step Decay Step-Size for Stochastic Optimization

Unsupervised Speech Recognition

TransMIL: Transformer based Correlated Multiple Instance Learning for Whole Slide Image Classification

DOBF: A Deobfuscation Pre-Training Objective for Programming Languages

On the Expected Complexity of Maxout Networks

Tactical Optimism and Pessimism for Deep Reinforcement Learning

An online passive-aggressive algorithm for difference-of-squares classification

Learning State Representations from Random Deep Action-conditional Predictions

Exact Privacy Guarantees for Markov Chain Implementations of the Exponential Mechanism with Artificial Atoms

Relative stability toward diffeomorphisms indicates performance in deep nets

Sparse Uncertainty Representation in Deep Learning with Inducing Weights

Online Active Learning with Surrogate Loss Functions

Reverse engineering learned optimizers reveals known and novel mechanisms

A Near-Optimal Algorithm for Debiasing Trained Machine Learning Models

Efficient and Accurate Gradients for Neural SDEs

Dynamics of Stochastic Momentum Methods on Large-scale, Quadratic Models

Sparse Spiking Gradient Descent

Convergence Rates of Stochastic Gradient Descent under Infinite Noise Variance

Few-Shot Data-Driven Algorithms for Low Rank Approximation

Privately Learning Mixtures of Axis-Aligned Gaussians

Hash Layers For Large Sparse Models

List-Decodable Mean Estimation in Nearly-PCA Time

Automatic Unsupervised Outlier Model Selection

Why Lottery Ticket Wins? A Theoretical Perspective of Sample Complexity on Sparse Neural Networks

Ising Model Selection Using $\ell_{1}$-Regularized Linear Regression: A Statistical Mechanics Analysis

Dynamic Bottleneck for Robust Self-Supervised Exploration

USCO-Solver: Solving Undetermined Stochastic Combinatorial Optimization Problems

Dynaboard: An Evaluation-As-A-Service Platform for Holistic Next-Generation Benchmarking

Understanding Interlocking Dynamics of Cooperative Rationalization

Differentiable Equilibrium Computation with Decision Diagrams for Stackelberg Models of Combinatorial Congestion Games

Understanding Adaptive, Multiscale Temporal Integration In Deep Speech Recognition Systems

Beyond Value-Function Gaps: Improved Instance-Dependent Regret Bounds for Episodic Reinforcement Learning

Online Market Equilibrium with Application to Fair Division

Process for Adapting Language Models to Society (PALMS) with Values-Targeted Datasets

Bias and variance of the Bayesian-mean decoder

MLP-Mixer: An all-MLP Architecture for Vision

Learning Knowledge Graph-based World Models of Textual Environments

Bridging Non Co-occurrence with Unlabeled In-the-wild Data for Incremental Object Detection

Refined Learning Bounds for Kernel and Approximate $k$-Means

Unleashing the Power of Contrastive Self-Supervised Visual Models via Contrast-Regularized Fine-Tuning

Faster Directional Convergence of Linear Neural Networks under Spherically Symmetric Data

An Exponential Lower Bound for Linearly Realizable MDP with Constant Suboptimality Gap

Coresets for Classification – Simplified and Strengthened

Collaborative Learning in the Jungle (Decentralized, Byzantine, Heterogeneous, Asynchronous and Nonconvex Learning)

Submodular + Concave

Understanding Partial Multi-Label Learning via Mutual Information

Self-Supervised Learning with Data Augmentations Provably Isolates Content from Style

Improved Coresets and Sublinear Algorithms for Power Means in Euclidean Spaces

Multi-Armed Bandits with Bounded Arm-Memory: Near-Optimal Guarantees for Best-Arm Identification and Regret Minimization

Revisiting Contrastive Methods for Unsupervised Learning of Visual Representations

Vector-valued Distance and Gyrocalculus on the Space of Symmetric Positive Definite Matrices

Self-Instantiated Recurrent Units with Dynamic Soft Recursion

Near-Optimal Lower Bounds For Convex Optimization For All Orders of Smoothness

Improved Learning Rates of a Functional Lasso-type SVM with Sparse Multi-Kernel Representation

Exploring Architectural Ingredients of Adversarially Robust Deep Neural Networks

A Hierarchical Reinforcement Learning Based Optimization Framework for Large-scale Dynamic Pickup and Delivery Problems

Moiré Attack (MA): A New Potential Risk of Screen Photos

Towards Understanding Why Lookahead Generalizes Better Than SGD and Beyond

FjORD: Fair and Accurate Federated Learning under heterogeneous targets with Ordered Dropout

Replacing Rewards with Examples: Example-Based Policy Search via Recursive Classification

An Empirical Study of Adder Neural Networks for Object Detection

Medical Dead-ends and Learning to Identify High-Risk States and Treatments

Low-Rank Extragradient Method for Nonsmooth and Low-Rank Matrix Optimization Problems

Do Neural Optimal Transport Solvers Work? A Continuous Wasserstein-2 Benchmark

Local plasticity rules can learn deep representations using self-supervised contrastive predictions

Approximating the Permanent with Deep Rejection Sampling

Exploiting Data Sparsity in Secure Cross-Platform Social Recommendation

How Data Augmentation affects Optimization for Linear Regression

Spot the Difference: Detection of Topological Changes via Geometric Alignment

Deep Extended Hazard Models for Survival Analysis

Scaling Gaussian Processes with Derivative Information Using Variational Inference

On the Expressivity of Markov Reward

Credit Assignment in Neural Networks through Deep Feedback Control

KS-GNN: Keywords Search over Incomplete Graphs via Graphs Neural Network

A novel notion of barycenter for probability distributions based on optimal weak mass transport

Confident Anchor-Induced Multi-Source Free Domain Adaptation

Learning from Inside: Self-driven Siamese Sampling and Reasoning for Video Question Answering

Iterative Amortized Policy Optimization

The Sensory Neuron as a Transformer: Permutation-Invariant Neural Networks for Reinforcement Learning

Encoding Robustness to Image Style via Adversarial Feature Perturbations

Structure-Aware Random Fourier Kernel for Graphs

Risk Monotonicity in Statistical Learning

Recognizing Vector Graphics without Rasterization

Large-Scale Learning with Fourier Features and Tensor Decompositions

Implicit Semantic Response Alignment for Partial Domain Adaptation

Exponential Separation between Two Learning Models and Adversarial Robustness

SBO-RNN: Reformulating Recurrent Neural Networks via Stochastic Bilevel Optimization

Variational Continual Bayesian Meta-Learning

Learning One Representation to Optimize All Rewards

Data-Efficient GAN Training Beyond (Just) Augmentations: A Lottery Ticket Perspective

Linear Convergence of Gradient Methods for Estimating Structured Transition Matrices in High-dimensional Vector Autoregressive Models

Raw Nav-merge Seismic Data to Subsurface Properties with MLP based Multi-Modal Information Unscrambler

On the Convergence and Sample Efficiency of Variance-Reduced Policy Gradient Method

Robust Optimization for Multilingual Translation with Imbalanced Data

SPANN: Highly-efficient Billion-scale Approximate Nearest Neighborhood Search

Class-Incremental Learning via Dual Augmentation

Mitigating Forgetting in Online Continual Learning with Neuron Calibration

Robust Auction Design in the Auto-bidding World

CROCS: Clustering and Retrieval of Cardiac Signals Based on Patient Disease Class, Sex, and Age

Multi-Objective SPIBB: Seldonian Offline Policy Improvement with Safety Constraints in Finite MDPs

The Pareto Frontier of model selection for general Contextual Bandits

Asymptotically Exact Error Characterization of Offline Policy Evaluation with Misspecified Linear Models

Causal Bandits with Unknown Graph Structure

Emergent Discrete Communication in Semantic Spaces

On the Stochastic Stability of Deep Markov Models

Statistical Inference with M-Estimators on Adaptively Collected Data

Reward-Free Model-Based Reinforcement Learning with Linear Function Approximation

Locality Sensitive Teaching

RL for Latent MDPs: Regret Guarantees and a Lower Bound

Distribution-free inference for regression: discrete, continuous, and in between

Lattice partition recovery with dyadic CART

The Flip Side of the Reweighted Coin: Duality of Adaptive Dropout and Regularization

Beyond the Signs: Nonparametric Tensor Completion via Sign Series

Provable Model-based Nonlinear Bandit and Reinforcement Learning: Shelve Optimism, Embrace Virtual Curvature

Tensor decompositions of higher-order correlations by nonlinear Hebbian plasticity

Curriculum Learning for Vision-and-Language Navigation

Information Directed Sampling for Sparse Linear Bandits

A generative nonparametric Bayesian model for whole genomes

Derivative-Free Policy Optimization for Linear Risk-Sensitive and Robust Control Design: Implicit Regularization and Sample Complexity

A Unified View of cGANs with and without Classifiers

Sub-Linear Memory: How to Make Performers SLiM

Challenges and Opportunities in High Dimensional Variational Inference

On the Existence of The Adversarial Bayes Classifier

Cross-modal Domain Adaptation for Cost-Efficient Visual Reinforcement Learning

To Beam Or Not To Beam: That is a Question of Cooperation for Language GANs

True Few-Shot Learning with Language Models

Label Disentanglement in Partition-based Extreme Multilabel Classification

Towards understanding retrosynthesis by energy-based models

Rectangular Flows for Manifold Learning

Two Sides of Meta-Learning Evaluation: In vs. Out of Distribution

Stability & Generalisation of Gradient Descent for Shallow Neural Networks without the Neural Tangent Kernel

DiBS: Differentiable Bayesian Structure Learning

BARTScore: Evaluating Generated Text as Text Generation

Fast and accurate randomized algorithms for low-rank tensor decompositions

Nearly Horizon-Free Offline Reinforcement Learning

CogView: Mastering Text-to-Image Generation via Transformers

Private and Non-private Uniformity Testing for Ranking Data

Universal Graph Convolutional Networks

Causal Inference for Event Pairs in Multivariate Point Processes

Labeling Trick: A Theory of Using Graph Neural Networks for Multi-Node Representation Learning

On the Power of Differentiable Learning versus PAC and SQ Learning

SOLQ: Segmenting Objects by Learning Queries

Bandits with Knapsacks beyond the Worst Case

Counterfactual Invariance to Spurious Correlations in Text Classification

Sample-Efficient Reinforcement Learning for Linearly-Parameterized MDPs with a Generative Model

Identity testing for Mallows model

Generalized and Discriminative Few-Shot Object Detection via SVD-Dictionary Enhancement

Localization with Sampling-Argmax

Probabilistic Entity Representation Model for Reasoning over Knowledge Graphs

Automated Discovery of Adaptive Attacks on Adversarial Defenses

Learning with Labeling Induced Abstentions

Fair Sparse Regression with Clustering: An Invex Relaxation for a Combinatorial Problem

Revisiting Hilbert-Schmidt Information Bottleneck for Adversarial Robustness

Fast Policy Extragradient Methods for Competitive Games with Entropy Regularization

A Regression Approach to Learning-Augmented Online Algorithms

Revenue maximization via machine learning with noisy data

Planning from Pixels in Environments with Combinatorially Hard Search Spaces

Privately Learning Subspaces

Which Mutual-Information Representation Learning Objectives are Sufficient for Control?

Symbolic Regression via Deep Reinforcement Learning Enhanced Genetic Programming Seeding

Risk Bounds for Over-parameterized Maximum Margin Classification on Sub-Gaussian Mixtures

Spatial Ensemble: a Novel Model Smoothing Mechanism for Student-Teacher Framework

Online Multi-Armed Bandits with Adaptive Inference

Equilibrium and non-Equilibrium regimes in the learning of Restricted Boltzmann Machines

Contextual Recommendations and Low-Regret Cutting-Plane Algorithms

UniDoc: Unified Pretraining Framework for Document Understanding

The effectiveness of feature attribution methods and its correlation with automatic evaluation scores

Subquadratic Overparameterization for Shallow Neural Networks

Learning Semantic Representations to Verify Hardware Designs

Constrained Optimization to Train Neural Networks on Critical and Under-Represented Classes

Surrogate Regret Bounds for Polyhedral Losses

A Variational Perspective on Diffusion-Based Generative Models and Score Matching

Unifying Width-Reduced Methods for Quasi-Self-Concordant Optimization

Shapeshifter: a Parameter-efficient Transformer using Factorized Reshaped Matrices

A Bayesian-Symbolic Approach to Reasoning and Learning in Intuitive Physics

Cardinality constrained submodular maximization for random streams

On Calibration and Out-of-Domain Generalization

Nonuniform Negative Sampling and Log Odds Correction with Rare Events Data

For high-dimensional hierarchical models, consider exchangeability of effects across covariates instead of across datasets

Intriguing Properties of Contrastive Losses

Answering Complex Causal Queries With the Maximum Causal Set Effect

Generalizable Multi-linear Attention Network

Co-Adaptation of Algorithmic and Implementational Innovations in Inference-based Deep Reinforcement Learning

Test-time Collective Prediction

Statistical Undecidability in Linear, Non-Gaussian Causal Models in the Presence of Latent Confounders

Are Transformers more robust than CNNs?

Approximate optimization of convex functions with outlier noise

Reliable and Trustworthy Machine Learning for Health Using Dataset Shift Detection

SimiGrad: Fine-Grained Adaptive Batching for Large Scale Training using Gradient Similarity Measurement

Bootstrap Your Object Detector via Mixed Training

Can fMRI reveal the representation of syntactic structure in the brain?

On the Algorithmic Stability of Adversarial Training

Damped Anderson Mixing for Deep Reinforcement Learning: Acceleration, Convergence, and Stabilization

Can multi-label classification networks know what they don’t know?

AFEC: Active Forgetting of Negative Transfer in Continual Learning

Near-Optimal Multi-Perturbation Experimental Design for Causal Structure Learning

Exploring Forensic Dental Identification with Deep Learning

Dissecting the Diffusion Process in Linear Graph Convolutional Networks

Solving Graph-based Public Goods Games with Tree Search and Imitation Learning

NeurWIN: Neural Whittle Index Network For Restless Bandits Via Deep RL

On Riemannian Optimization over Positive Definite Matrices with the Bures-Wasserstein Geometry

Safe Pontryagin Differentiable Programming

Generic Neural Architecture Search via Regression

Graph Differentiable Architecture Search with Structure Learning

Reinforcement Learning in Reward-Mixing MDPs

A Highly-Efficient Group Elastic Net Algorithm with an Application to Function-On-Scalar Regression

Not All Low-Pass Filters are Robust in Graph Convolutional Networks

Implicit Regularization in Matrix Sensing via Mirror Descent

Generalized DataWeighting via Class-Level Gradient Manipulation

Online Robust Reinforcement Learning with Model Uncertainty

Single Layer Predictive Normalized Maximum Likelihood for Out-of-Distribution Detection

Non-Asymptotic Analysis for Two Time-scale TDC with General Smooth Function Approximation

Continuized Accelerations of Deterministic and Stochastic Gradient Descents, and of Gossip Algorithms

Oracle Complexity in Nonsmooth Nonconvex Optimization

Unlabeled Principal Component Analysis

Residual2Vec: Debiasing graph embedding with random graphs

Towards Context-Agnostic Learning Using Synthetic Data

Modality-Agnostic Topology Aware Localization

A Closer Look at the Worst-case Behavior of Multi-armed Bandit Algorithms

Asynchronous Stochastic Optimization Robust to Arbitrary Delays

Graph Neural Networks with Local Graph Parameters

Towards Sharper Generalization Bounds for Structured Prediction

L2ight: Enabling On-Chip Learning for Optical Neural Networks via Efficient in-situ Subspace Optimization

Coresets for Time Series Clustering

MCMC Variational Inference via Uncorrected Hamiltonian Annealing

Adversarial Examples for k-Nearest Neighbor Classifiers Based on Higher-Order Voronoi Diagrams

Implicit Sparse Regularization: The Impact of Depth and Early Stopping

Fixes That Fail: Self-Defeating Improvements in Machine-Learning Systems

Risk-Aware Transfer in Reinforcement Learning using Successor Features

A Biased Graph Neural Network Sampler with Near-Optimal Regret

Coresets for Decision Trees of Signals

Quantifying and Improving Transferability in Domain Generalization

Online Selective Classification with Limited Feedback

Concentration inequalities under sub-Gaussian and sub-exponential conditions

Subgaussian and Differentiable Importance Sampling for Off-Policy Evaluation and Learning

Combining Latent Space and Structured Kernels for Bayesian Optimization over Combinatorial Spaces

MAU: A Motion-Aware Unit for Video Prediction and Beyond

Width-based Lookaheads with Learnt Base Policies and Heuristics Over the Atari-2600 Benchmark

FastCorrect: Fast Error Correction with Edit Alignment for Automatic Speech Recognition

STEP: Out-of-Distribution Detection in the Presence of Limited In-Distribution Labeled Data

The Complexity of Bayesian Network Learning: Revisiting the Superstructure

Tighter Expected Generalization Error Bounds via Wasserstein Distance

Differentiable Learning Under Triage

Analytical Study of Momentum-Based Acceleration Methods in Paradigmatic High-Dimensional Non-Convex Problems

GraphFormers: GNN-nested Transformers for Representation Learning on Textual Graph

Hyperbolic Busemann Learning with Ideal Prototypes

Meta-Learning for Relative Density-Ratio Estimation

TacticZero: Learning to Prove Theorems from Scratch with Deep Reinforcement Learning

ROI Maximization in Stochastic Online Decision-Making

Asymptotics of representation learning in finite Bayesian neural networks

Reverse engineering recurrent neural networks with Jacobian switching linear dynamical systems

Breaking the Sample Complexity Barrier to Regret-Optimal Model-Free Reinforcement Learning

Revisiting ResNets: Improved Training and Scaling Strategies

Communication-efficient SGD: From Local SGD to One-Shot Averaging

DSelect-k: Differentiable Selection in the Mixture of Experts with Applications to Multi-Task Learning

Well-tuned Simple Nets Excel on Tabular Datasets

A Central Limit Theorem for Differentially Private Query Answering

A Little Robustness Goes a Long Way: Leveraging Robust Features for Targeted Transfer Attacks

Learning Dynamic Graph Representation of Brain Connectome with Spatio-Temporal Attention

Diffusion Models Beat GANs on Image Synthesis

Metadata-based Multi-Task Bandits with Bayesian Hierarchical Models

A Contrastive Learning Approach for Training Variational Autoencoder Priors

Are My Deep Learning Systems Fair? An Empirical Study of Fixed-Seed Training

Coresets for Clustering with Missing Values

Representation Learning on Spatial Networks

Chasing Sparsity in Vision Transformers: An End-to-End Exploration

Cycle Self-Training for Domain Adaptation

Self-Supervised Multi-Object Tracking with Cross-input Consistency

Generalizable Imitation Learning from Observation via Inferring Goal Proximity

Information-theoretic generalization bounds for black-box learning algorithms

Disentangled Contrastive Learning on Graphs

Deep Proxy Causal Learning and its Application to Confounded Bandit Policy Evaluation

CANITA: Faster Rates for Distributed Convex Optimization with Communication Compression

Adversarial Regression with Doubly Non-negative Weighting Matrices

ConE: Cone Embeddings for Multi-Hop Reasoning over Knowledge Graphs

Combinatorial Pure Exploration with Bottleneck Reward Function

Few-Shot Object Detection via Association and DIscrimination

Coarse-to-fine Animal Pose and Shape Estimation

Memory-Efficient Approximation Algorithms for Max-k-Cut and Correlation Clustering

An Online Method for A Class of Distributionally Robust Optimization with Non-convex Objectives

Regime Switching Bandits

Conformal Bayesian Computation

Two steps to risk sensitivity

Rebooting ACGAN: Auxiliary Classifier GANs with Stable Training

Causal Influence Detection for Improving Efficiency in Reinforcement Learning

NeuroLKH: Combining Deep Learning Model with Lin-Kernighan-Helsgaun Heuristic for Solving the Traveling Salesman Problem

Learning to Time-Decode in Spiking Neural Networks Through the Information Bottleneck

Uncertainty-Driven Loss for Single Image Super-Resolution

Continual World: A Robotic Benchmark For Continual Reinforcement Learning

Spectral embedding for dynamic networks with stability guarantees

Decentralized Learning in Online Queuing Systems

Framing RNN as a kernel method: A neural ODE approach

Algorithmic Instabilities of Accelerated Gradient Descent

Dual Progressive Prototype Network for Generalized Zero-Shot Learning

Distributed Principal Component Analysis with Limited Communication

Efficient Active Learning for Gaussian Process Classification by Error Reduction

E(n) Equivariant Normalizing Flows

Scalable Bayesian GPFA with automatic relevance determination and discrete noise models

Sharp Impossibility Results for Hyper-graph Testing

Private learning implies quantum stability

IRM—when it works and when it doesn't: A test case of natural language inference

GemNet: Universal Directional Graph Neural Networks for Molecules

Tight High Probability Bounds for Linear Stochastic Approximation with Fixed Stepsize

Learning to Combine Per-Example Solutions for Neural Program Synthesis

Causal Navigation by Continuous-time Neural Networks

Provable Representation Learning for Imitation with Contrastive Fourier Features

Detecting Errors and Estimating Accuracy on Unlabeled Data with Self-training Ensembles

Parametrized Quantum Policies for Reinforcement Learning

Parameter Prediction for Unseen Deep Architectures

A Consciousness-Inspired Planning Agent for Model-Based Reinforcement Learning

LLC: Accurate, Multi-purpose Learnt Low-dimensional Binary Codes

How can classical multidimensional scaling go wrong?

Deep Extrapolation for Attribute-Enhanced Generation

Separation Results between Fixed-Kernel and Feature-Learning Probability Metrics

Removing Inter-Experimental Variability from Functional Data in Systems Neuroscience

Neural optimal feedback control with local learning rules

HyperSPNs: Compact and Expressive Probabilistic Circuits

Robust Generalization despite Distribution Shift via Minimum Discriminating Information

Tracking Without Re-recognition in Humans and Machines

Luna: Linear Unified Nested Attention

Modified Frank Wolfe in Probability Space

EDGE: Explaining Deep Reinforcement Learning Policies

An Empirical Investigation of Domain Generalization with Empirical Risk Minimizers

Differentially Private n-gram Extraction

Heuristic-Guided Reinforcement Learning

A Note on Sparse Generalized Eigenvalue Problem

On Empirical Risk Minimization with Dependent and Heavy-Tailed Data

Celebrating Diversity in Shared Multi-Agent Reinforcement Learning

Mirror Langevin Monte Carlo: the Case Under Isoperimetry

HSVA: Hierarchical Semantic-Visual Adaptation for Zero-Shot Learning

Exploration-Exploitation in Multi-Agent Competition: Convergence with Bounded Rationality

Finding Regions of Heterogeneity in Decision-Making via Expected Conditional Covariance

Neural Additive Models: Interpretable Machine Learning with Neural Nets

You Never Cluster Alone

Rethinking the Pruning Criteria for Convolutional Neural Network

Bridging Explicit and Implicit Deep Generative Models via Neural Stein Estimators

Distilling Object Detectors with Feature Richness

Support Recovery of Sparse Signals from a Mixture of Linear Measurements

Residual Relaxation for Multi-view Representation Learning

Latent Execution for Neural Program Synthesis Beyond Domain-Specific Languages

Landmark-Guided Subgoal Generation in Hierarchical Reinforcement Learning

Optimal Rates for Random Order Online Optimization

Autonomous Reinforcement Learning via Subgoal Curricula

Tuning Large Neural Networks via Zero-Shot Hyperparameter Transfer

$\texttt{LeadCache}$: Regret-Optimal Caching in Networks

Noise2Score: Tweedie’s Approach to Self-Supervised Image Denoising without Clean Images

Adversarial Robustness with Non-uniform Perturbations

Object-Aware Regularization for Addressing Causal Confusion in Imitation Learning

Optimal Uniform OPE and Model-based Offline Reinforcement Learning in Time-Homogeneous, Reward-Free and Task-Agnostic Settings

Online Matching in Sparse Random Graphs: Non-Asymptotic Performances of Greedy Algorithm

All Tokens Matter: Token Labeling for Training Better Vision Transformers

The decomposition of the higher-order homology embedding constructed from the $k$-Laplacian

Catch-A-Waveform: Learning to Generate Audio from a Single Short Example

Curriculum Disentangled Recommendation with Noisy Multi-feedback

Unsupervised Motion Representation Learning with Capsule Autoencoders

On Margin-Based Cluster Recovery with Oracle Queries

Locally Most Powerful Bayesian Test for Out-of-Distribution Detection using Deep Generative Models

Mixture weights optimisation for Alpha-Divergence Variational Inference

Fast and Memory Efficient Differentially Private-SGD via JL Projections

Conformal Time-series Forecasting

A Max-Min Entropy Framework for Reinforcement Learning

Instance-Dependent Partial Label Learning

Leveraging Distribution Alignment via Stein Path for Cross-Domain Cold-Start Recommendation

Modeling Heterogeneous Hierarchies with Relation-specific Hyperbolic Cones

Adaptive Diffusion in Graph Neural Networks

Explaining Latent Representations with a Corpus of Examples

Sparse Quadratic Optimisation over the Stiefel Manifold with Application to Permutation Synchronisation

Knowledge-inspired 3D Scene Graph Prediction in Point Cloud

Regularization in ResNet with Stochastic Depth

Photonic Differential Privacy with Direct Feedback Alignment

Few-Round Learning for Federated Learning

Multiclass Boosting and the Cost of Weak Learning

On Optimal Robustness to Adversarial Corruption in Online Decision Problems

Self-Diagnosing GAN: Diagnosing Underrepresented Samples in Generative Adversarial Networks

Addressing Algorithmic Disparity and Performance Inconsistency in Federated Learning

ABC: Auxiliary Balanced Classifier for Class-imbalanced Semi-supervised Learning

Rethinking Calibration of Deep Neural Networks: Do Not Be Afraid of Overconfidence

There Is No Turning Back: A Self-Supervised Approach for Reversibility-Aware Reinforcement Learning

Learning Interpretable Decision Rule Sets: A Submodular Optimization Approach

Fault-Tolerant Federated Reinforcement Learning with Theoretical Guarantee

Boosted CVaR Classification

MICo: Improved representations via sampling-based state similarity for Markov decision processes

Federated Hyperparameter Tuning: Challenges, Baselines, and Connections to Weight-Sharing

Fast Minimum-norm Adversarial Attacks through Adaptive Norm Constraints

Dealing With Misspecification In Fixed-Confidence Linear Top-m Identification

BAST: Bayesian Additive Regression Spanning Trees for Complex Constrained Domain

On Memorization in Probabilistic Deep Generative Models

Assessing Fairness in the Presence of Missing Data

Entropy-based adaptive Hamiltonian Monte Carlo

DeepReduce: A Sparse-tensor Communication Framework for Federated Deep Learning

An Improved Analysis and Rates for Variance Reduction under Without-replacement Sampling Orders

Taxonomizing local versus global structure in neural network loss landscapes

Making a (Counterfactual) Difference One Rationale at a Time

RIM: Reliable Influence-based Active Learning on Graphs

SOFT: Softmax-free Transformer with Linear Complexity

Node Dependent Local Smoothing for Scalable Graph Learning

Simple Stochastic and Online Gradient Descent Algorithms for Pairwise Learning

A Geometric Analysis of Neural Collapse with Unconstrained Features

Noisy Adaptation Generates Lévy Flights in Attractor Neural Networks

Reverse-Complement Equivariant Networks for DNA Sequences

Test-Time Classifier Adjustment Module for Model-Agnostic Domain Generalization

Statistical Query Lower Bounds for List-Decodable Linear Regression

The Unbalanced Gromov Wasserstein Distance: Conic Formulation and Relaxation

Can You Learn an Algorithm? Generalizing from Easy to Hard Problems with Recurrent Networks

Adversarial Teacher-Student Representation Learning for Domain Generalization

Neural Bootstrapper

Learning to Draw: Emergent Communication through Sketching

Counterfactual Maximum Likelihood Estimation for Training Deep Networks

Fitting summary statistics of neural data with a differentiable spiking network simulator

Littlestone Classes are Privately Online Learnable

Can contrastive learning avoid shortcut solutions?

Improved Variance-Aware Confidence Sets for Linear Bandits and Linear Mixture MDP

Contrastive Learning for Neural Topic Model

Scallop: From Probabilistic Deductive Databases to Scalable Differentiable Reasoning

Uniform-PAC Bounds for Reinforcement Learning with Linear Function Approximation

Last iterate convergence of SGD for Least-Squares in the Interpolation regime

Provable Benefits of Actor-Critic Methods for Offline Reinforcement Learning

Necessary and sufficient graphical conditions for optimal adjustment sets in causal graphical models with hidden variables

Differentiable Unsupervised Feature Selection based on a Gated Laplacian

Uniform Concentration Bounds toward a Unified Framework for Robust Clustering

Risk-Averse Bayes-Adaptive Reinforcement Learning

Approximate Decomposable Submodular Function Minimization for Cardinality-Based Components

Lower and Upper Bounds on the Pseudo-Dimension of Tensor Network Models

Permutation-Invariant Variational Autoencoder for Graph-Level Representation Learning

Federated Reconstruction: Partially Local Federated Learning

Provably Efficient Reinforcement Learning with Linear Function Approximation under Adaptivity Constraints

K-level Reasoning for Zero-Shot Coordination in Hanabi

A Theory of the Distortion-Perception Tradeoff in Wasserstein Space

Learning a Single Neuron with Bias Using Gradient Descent

Offline Meta Reinforcement Learning -- Identifiability Challenges and Effective Data Collection Strategies

The Many Faces of Adversarial Risk

Re-ranking for image retrieval and transductive few-shot classification

Impression learning: Online representation learning with synaptic plasticity

Adaptive Conformal Inference Under Distribution Shift

Practical, Provably-Correct Interactive Learning in the Realizable Setting: The Power of True Believers

Neural Distance Embeddings for Biological Sequences

REMIPS: Physically Consistent 3D Reconstruction of Multiple Interacting People under Weak Supervision

Adaptive wavelet distillation from neural networks through interpretations

Credit Assignment Through Broadcasting a Global Error Vector

Robust Online Correlation Clustering

DOCTOR: A Simple Method for Detecting Misclassification Errors

Out-of-Distribution Generalization in Kernel Regression

Functionally Regionalized Knowledge Transfer for Low-resource Drug Discovery

Efficiently Learning One Hidden Layer ReLU Networks From Queries

Truncated Marginal Neural Ratio Estimation

Learning Barrier Certificates: Towards Safe Reinforcement Learning with Zero Training-time Violations

Hyperparameter Optimization Is Deceiving Us, and How to Stop It

Scalable and Stable Surrogates for Flexible Classifiers with Fairness Constraints

Pointwise Bounds for Distribution Estimation under Communication Constraints

Backward-Compatible Prediction Updates: A Probabilistic Approach

Universal Rate-Distortion-Perception Representations for Lossy Compression

Autobahn: Automorphism-based Graph Neural Nets

Fractal Structure and Generalization Properties of Stochastic Optimization Algorithms

Differentiable Annealed Importance Sampling and the Perils of Gradient Noise

Learning to Ground Multi-Agent Communication with Autoencoders

BCD Nets: Scalable Variational Approaches for Bayesian Causal Discovery

Turing Completeness of Bounded-Precision Recurrent Neural Networks

Interpretable agent communication from scratch (with a generic visual processor emerging on the side)

Mean-based Best Arm Identification in Stochastic Bandits under Reward Contamination

A Provably Efficient Sample Collection Strategy for Reinforcement Learning

Searching for Efficient Transformers for Language Modeling

Combining Human Predictions with Model Probabilities via Confusion Matrices and Calibration

On Large-Cohort Training for Federated Learning

A/B Testing for Recommender Systems in a Two-sided Marketplace

Detecting and Adapting to Irregular Distribution Shifts in Bayesian Online Learning

Can we globally optimize cross-validation loss? Quasiconvexity in ridge regression

A Prototype-Oriented Framework for Unsupervised Domain Adaptation

Probabilistic Attention for Interactive Segmentation

Safe Policy Optimization with Local Generalized Linear Function Approximations

Locally Valid and Discriminative Prediction Intervals for Deep Learning Models

Extracting Deformation-Aware Local Features by Learning to Deform

NxMTransformer: Semi-Structured Sparsification for Natural Language Understanding via ADMM

Lip to Speech Synthesis with Visual Context Attentional GAN

Sparse Deep Learning: A New Framework Immune to Local Traps and Miscalibration

RMIX: Learning Risk-Sensitive Policies for Cooperative Reinforcement Learning Agents

Finite-Sample Analysis of Off-Policy TD-Learning via Generalized Bellman Operators

Fine-Grained Zero-Shot Learning with DNA as Side Information

Debiased Visual Question Answering from Feature and Sample Perspectives

Towards a Theoretical Framework of Out-of-Distribution Generalization

Handling Long-tailed Feature Distribution in AdderNets

Gradient-Free Adversarial Training Against Image Corruption for Learning-based Steering

Increasing Liquid State Machine Performance with Edge-of-Chaos Dynamics Organized by Astrocyte-modulated Plasticity

Capacity and Bias of Learned Geometric Embeddings for Directed Graphs

Word2Fun: Modelling Words as Functions for Diachronic Word Representation

Provably Faster Algorithms for Bilevel Optimization

MixSeq: Connecting Macroscopic Time Series Forecasting with Microscopic Time Series Data

Practical Near Neighbor Search via Group Testing

Understanding End-to-End Model-Based Reinforcement Learning Methods as Implicit Parameterization

Fast Abductive Learning by Similarity-based Consistency Optimization

Posterior Collapse and Latent Variable Non-identifiability

See More for Scene: Pairwise Consistency Learning for Scene Classification

Adversarial Attack Generation Empowered by Min-Max Optimization

PARP: Prune, Adjust and Re-Prune for Self-Supervised Speech Recognition

Iterative Methods for Private Synthetic Data: Unifying Framework and New Methods

When Is Unsupervised Disentanglement Possible?

Robust Pose Estimation in Crowded Scenes with Direct Pose-Level Inference

Beyond BatchNorm: Towards a Unified Understanding of Normalization in Deep Learning

You are caught stealing my winning lottery ticket! Making a lottery ticket claim its ownership

Can Less be More? When Increasing-to-Balancing Label Noise Rates Considered Beneficial

Discerning Decision-Making Process of Deep Neural Networks with Hierarchical Voting Transformation

Rethinking and Reweighting the Univariate Losses for Multi-Label Ranking: Consistency and Generalization

Learning to Adapt via Latent Domains for Adaptive Semantic Segmentation

Near Optimal Policy Optimization via REPS

Per-Pixel Classification is Not All You Need for Semantic Segmentation

Variational Automatic Curriculum Learning for Sparse-Reward Cooperative Multi-Agent Problems

Optimal Algorithms for Stochastic Contextual Preference Bandits

Batch Normalization Orthogonalizes Representations in Deep Random Networks

Exploiting Chain Rule and Bayes' Theorem to Compare Probability Distributions

Multi-Objective Meta Learning

Efficiently Identifying Task Groupings for Multi-Task Learning

Continuous Doubly Constrained Batch Reinforcement Learning

ELLA: Exploration through Learned Language Abstraction

PortaSpeech: Portable and High-Quality Generative Text-to-Speech

A mechanistic multi-area recurrent network model of decision-making

Localization, Convexity, and Star Aggregation

Learning to delegate for large-scale vehicle routing

Maximum Likelihood Training of Score-Based Diffusion Models

Graphical Models in Heavy-Tailed Markets

Reliable Post hoc Explanations: Modeling Uncertainty in Explainability

Relaxing Local Robustness

Improving Calibration through the Relationship with Adversarial Robustness

Consistent Non-Parametric Methods for Maximizing Robustness

Representation Learning for Event-based Visuomotor Policies

Towards Understanding Cooperative Multi-Agent Q-Learning with Value Factorization

Off-Policy Risk Assessment in Contextual Bandits

A Bi-Level Framework for Learning to Solve Combinatorial Optimization on Graphs

Joint Semantic Mining for Weakly Supervised RGB-D Salient Object Detection

The Inductive Bias of Quantum Kernels

Hindsight Task Relabelling: Experience Replay for Sparse Reward Meta-RL

Coordinated Proximal Policy Optimization

Estimating High Order Gradients of the Data Distribution by Denoising

Stabilizing Dynamical Systems via Policy Gradient Methods

What Makes Multi-Modal Learning Better than Single (Provably)

Cardinality-Regularized Hawkes-Granger Model

Deep Contextual Video Compression

Designing Counterfactual Generators using Deep Model Inversion

Model Adaptation: Historical Contrastive Learning for Unsupervised Domain Adaptation without Source Data

Offline Reinforcement Learning as One Big Sequence Modeling Problem

G-PATE: Scalable Differentially Private Data Generator via Private Aggregation of Teacher Discriminators

Emergent Communication of Generalizations

Glance-and-Gaze Vision Transformer

On the Sample Complexity of Privately Learning Axis-Aligned Rectangles

Teachable Reinforcement Learning via Advice Distillation

Sampling with Trustworthy Constraints: A Variational Gradient Framework

Anti-Backdoor Learning: Training Clean Models on Poisoned Data

Control Variates for Slate Off-Policy Evaluation

TriBERT: Human-centric Audio-visual Representation Learning

How Powerful are Performance Predictors in Neural Architecture Search?

RoMA: Robust Model Adaptation for Offline Model-based Optimization

Sample Complexity Bounds for Active Ranking from Multi-wise Comparisons

Understanding and Improving Early Stopping for Learning with Noisy Labels

NeRS: Neural Reflectance Surfaces for Sparse-view 3D Reconstruction in the Wild

Sageflow: Robust Federated Learning against Both Stragglers and Adversaries

A Universal Law of Robustness via Isoperimetry

Understanding the Under-Coverage Bias in Uncertainty Estimation

Improving Anytime Prediction with Parallel Cascaded Networks and a Temporal-Difference Loss

Differentially Private Model Personalization

Multi-Agent Reinforcement Learning in Stochastic Networked Systems

BulletTrain: Accelerating Robust Neural Network Training via Boundary Example Mining

Robust Predictable Control

Revisiting Model Stitching to Compare Neural Representations

Widening the Pipeline in Human-Guided Reinforcement Learning with Explanation and Context-Aware Data Augmentation

Iterative Causal Discovery in the Possible Presence of Latent Confounders and Selection Bias

Fast Extra Gradient Methods for Smooth Structured Nonconvex-Nonconcave Minimax Problems

VATT: Transformers for Multimodal Self-Supervised Learning from Raw Video, Audio and Text

Detecting Anomalous Event Sequences with Temporal Point Processes

ViTAE: Vision Transformer Advanced by Exploring Intrinsic Inductive Bias

End-to-end Multi-modal Video Temporal Grounding

Subgroup Generalization and Fairness of Graph Neural Networks

Improving Visual Quality of Image Synthesis by A Token-based Generator with Transformers

Online Convex Optimization with Continuous Switching Constraint

Never Go Full Batch (in Stochastic Convex Optimization)

PCA Initialization for Approximate Message Passing in Rotationally Invariant Models

Evaluating State-of-the-Art Classification Models Against Bayes Optimality

Policy Optimization in Adversarial MDPs: Improved Exploration via Dilated Bonuses

LADA: Look-Ahead Data Acquisition via Augmentation for Deep Active Learning

Searching Parameterized AP Loss for Object Detection

Matrix encoding networks for neural combinatorial optimization

Probabilistic Margins for Instance Reweighting in Adversarial Training

TOHAN: A One-step Approach towards Few-shot Hypothesis Adaptation

Learning Riemannian metric for disease progression modeling

Random Noise Defense Against Query-Based Black-Box Attacks

Exploiting Domain-Specific Features to Enhance Domain Generalization

Unsupervised Domain Adaptation with Dynamics-Aware Rewards in Reinforcement Learning

What’s a good imputation to predict with missing values?

Local policy search with Bayesian optimization

Twice regularized MDPs and the equivalence between robustness and regularization

Supervising the Transfer of Reasoning Patterns in VQA

On Robust Optimal Transport: Computational Complexity and Barycenter Computation

Deconvolutional Networks on Graph Data

Set Prediction in the Latent Space

Learning High-Precision Bounding Box for Rotated Object Detection via Kullback-Leibler Divergence

Ensembling Graph Predictions for AMR Parsing

PartialFed: Cross-Domain Personalized Federated Learning via Partial Initialization

Predicting Event Memorability from Contextual Visual Semantics

Bounds all around: training energy-based models with bidirectional bounds

Online Sign Identification: Minimization of the Number of Errors in Thresholding Bandits

Artistic Style Transfer with Internal-external Learning and Contrastive Learning

Self-Attention Between Datapoints: Going Beyond Individual Input-Output Pairs in Deep Learning

Consistency Regularization for Variational Auto-Encoders

The Implicit Bias of Minima Stability: A View from Function Space

What can linearized neural networks actually say about generalization?

Neighborhood Reconstructing Autoencoders

SLOE: A Faster Method for Statistical Inference in High-Dimensional Logistic Regression

On Interaction Between Augmentations and Corruptions in Natural Corruption Robustness

Variational Multi-Task Learning with Gumbel-Softmax Priors

Noether’s Learning Dynamics: Role of Symmetry Breaking in Neural Networks

Posterior Meta-Replay for Continual Learning

SSUL: Semantic Segmentation with Unknown Label for Exemplar-based Class-Incremental Learning

Variational Diffusion Models

Collaborative Uncertainty in Multi-Agent Trajectory Forecasting

ResT: An Efficient Transformer for Visual Recognition

Unsupervised Object-Level Representation Learning from Scene Images

Locality defeats the curse of dimensionality in convolutional teacher-student scenarios

Learning Theory Can (Sometimes) Explain Generalisation in Graph Neural Networks

Fast rates for prediction with limited expert advice

Unsupervised Representation Transfer for Small Networks: I Believe I Can Distill On-the-Fly

Episodic Multi-agent Reinforcement Learning with Curiosity-driven Exploration

Speech Separation Using an Asynchronous Fully Recurrent Convolutional Neural Network

Rectifying the Shortcut Learning of Background for Few-Shot Learning

To The Point: Correspondence-driven monocular 3D category reconstruction

Robustness via Uncertainty-aware Cycle Consistency

Counterexample Guided RL Policy Refinement Using Bayesian Optimization

Foundations of Symbolic Languages for Model Interpretability

Wisdom of the Crowd Voting: Truthful Aggregation of Voter Information and Preferences

Rate-Optimal Subspace Estimation on Random Graphs

Inverse Optimal Control Adapted to the Noise Characteristics of the Human Sensorimotor System

Convergence of adaptive algorithms for constrained weakly convex optimization

Gradient Driven Rewards to Guarantee Fairness in Collaborative Machine Learning

Model-Based Reinforcement Learning via Imagination with Derived Memory

Pareto Domain Adaptation

Optimal Rates for Nonparametric Density Estimation under Communication Constraints

Distilling Robust and Non-Robust Features in Adversarial Examples by Information Bottleneck

Neo-GNNs: Neighborhood Overlap-aware Graph Neural Networks for Link Prediction

Federated Split Task-Agnostic Vision Transformer for COVID-19 CXR Diagnosis

DRIVE: One-bit Distributed Mean Estimation

MIRACLE: Causally-Aware Imputation via Learning Missing Data Mechanisms

Do Transformers Really Perform Badly for Graph Representation?

Diversity Enhanced Active Learning with Strictly Proper Scoring Rules

Fast Federated Learning in the Presence of Arbitrary Device Unavailability

Clockwork Variational Autoencoders

Learning Domain Invariant Representations in Goal-conditioned Block MDPs

Rethinking conditional GAN training: An approach using geometrically structured latent manifolds

Learning Space Partitions for Path Planning

Independent Prototype Propagation for Zero-Shot Compositionality

A Normative and Biologically Plausible Algorithm for Independent Component Analysis

Representing Hyperbolic Space Accurately using Multi-Component Floats

Compacter: Efficient Low-Rank Hypercomplex Adapter Layers

Proportional Participatory Budgeting with Additive Utilities

Landmark-RxR: Solving Vision-and-Language Navigation with Fine-Grained Alignment Supervision

Streaming Linear System Identification with Reverse Experience Replay

Estimating Multi-cause Treatment Effects via Single-cause Perturbation

DualNet: Continual Learning, Fast and Slow

End-to-end reconstruction meets data-driven regularization for inverse problems

Garment4D: Garment Reconstruction from Point Cloud Sequences

Identification of the Generalized Condorcet Winner in Multi-dueling Bandits

Learning Collaborative Policies to Solve NP-hard Routing Problems

Towards Scalable Unpaired Virtual Try-On via Patch-Routed Spatially-Adaptive GAN

On the Second-order Convergence Properties of Random Search Methods

Combating Noise: Semi-supervised Learning by Region Uncertainty Quantification

Neural Bellman-Ford Networks: A General Graph Neural Network Framework for Link Prediction

Natural continual learning: success is a journey, not (just) a destination

Parameterized Knowledge Transfer for Personalized Federated Learning

ToAlign: Task-Oriented Alignment for Unsupervised Domain Adaptation

Integrated Latent Heterogeneity and Invariance Learning in Kernel Space

Variational Inference for Continuous-Time Switching Dynamical Systems

Improve Agents without Retraining: Parallel Tree Search with Off-Policy Correction

3D Pose Transfer with Correspondence Learning and Mesh Refinement

Fast Approximation of the Sliced-Wasserstein Distance Using Concentration of Random Projections

Efficient Learning of Discrete-Continuous Computation Graphs

From global to local MDI variable importances for random forests and when they are Shapley values

Limiting fluctuation and trajectorial stability of multilayer neural networks with mean field training

Nearly-Tight and Oblivious Algorithms for Explainable Clustering

Lower Bounds on Metropolized Sampling Methods for Well-Conditioned Distributions

Learning to Generate Realistic Noisy Images via Pixel-level Noise-aware Adversarial Training

Safe Reinforcement Learning by Imagining the Near Future

BernNet: Learning Arbitrary Graph Spectral Filters via Bernstein Approximation

Structured Denoising Diffusion Models in Discrete State-Spaces

An Information-theoretic Approach to Distribution Shifts

Offline Reinforcement Learning with Reverse Model-based Imagination

On learning sparse vectors from mixture of responses

SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers

The Role of Global Labels in Few-Shot Classification and How to Infer Them

Predify: Augmenting deep neural networks with brain-inspired predictive coding dynamics

Meta-Learning Sparse Implicit Neural Representations

Pruning Randomly Initialized Neural Networks with Iterative Randomization

Periodic Activation Functions Induce Stationarity

Stateful Strategic Regression

On the Estimation Bias in Double Q-Learning

Disentangling Identifiable Features from Noisy Data with Structured Nonlinear ICA

A Faster Maximum Cardinality Matching Algorithm with Applications in Machine Learning

Few-Shot Segmentation via Cycle-Consistent Transformer

Augmented Shortcuts for Vision Transformers

Adversarial Reweighting for Partial Domain Adaptation

Instance-dependent Label-noise Learning under a Structural Causal Model

Auto-Encoding Knowledge Graph for Unsupervised Medical Report Generation

Optimizing Reusable Knowledge for Continual Learning via Metalearning

Breaking the centralized barrier for cross-device federated learning

Towards Enabling Meta-Learning from Target Models

Universal Semi-Supervised Learning

The Emergence of Objectness: Learning Zero-shot Segmentation from Videos

CoFrNets: Interpretable Neural Architecture Inspired by Continued Fractions

Confidence-Aware Imitation Learning from Demonstrations with Varying Optimality

Unsupervised Foreground Extraction via Deep Region Competition

DeepSITH: Efficient Learning via Decomposition of What and When Across Time Scales

Factored Policy Gradients: Leveraging Structure for Efficient Learning in MOMDPs

From Canonical Correlation Analysis to Self-supervised Graph Neural Networks

Powerpropagation: A sparsity inducing weight reparameterisation

Towards a Unified Game-Theoretic View of Adversarial Perturbations and Robustness

Deep Residual Learning in Spiking Neural Networks

Semi-Supervised Semantic Segmentation via Adaptive Equalization Learning

Learning curves of generic features maps for realistic datasets with a teacher-student model

Stochastic Shortest Path: Minimax, Parameter-Free and Towards Horizon-Free Regret

SyncTwin: Treatment Effect Estimation with Longitudinal Outcomes

Rethinking Space-Time Networks with Improved Memory Coverage for Efficient Video Object Segmentation

Recurrent Bayesian Classifier Chains for Exact Multi-Label Classification

Meta-Learning the Search Distribution of Black-Box Random Search Based Adversarial Attacks

When Are Solutions Connected in Deep Networks?

SWAD: Domain Generalization by Seeking Flat Minima

Efficient Neural Network Training via Forward and Backward Propagation Sparsification

Least Square Calibration for Peer Reviews

Differentiable Spike: Rethinking Gradient-Descent for Training Spiking Neural Networks

Choose a Transformer: Fourier or Galerkin

MixACM: Mixup-Based Robustness Transfer via Distillation of Activated Channel Maps

Gauge Equivariant Transformer

An Axiomatic Theory of Provably-Fair Welfare-Centric Machine Learning

Mastering Atari Games with Limited Data

Contextual Similarity Aggregation with Self-attention for Visual Re-ranking

Distributed Saddle-Point Problems Under Data Similarity

Online Variational Filtering and Parameter Learning

Support vector machines and linear regression coincide with very high-dimensional features

Information Directed Reward Learning for Reinforcement Learning

Rank Overspecified Robust Matrix Recovery: Subgradient Method and Exact Recovery

Associating Objects with Transformers for Video Object Segmentation

Learning in Non-Cooperative Configurable Markov Decision Processes

Partial success in closing the gap between human and machine vision

No-regret Online Learning over Riemannian Manifolds

On Effective Scheduling of Model-based Reinforcement Learning

Dual Parameterization of Sparse Variational Gaussian Processes

Online Facility Location with Multiple Advice

Agent Modelling under Partial Observability for Deep Reinforcement Learning

Self-Supervised Learning Disentangled Group Representation as Feature

MOMA: Multi-Object Multi-Actor Activity Parsing

Optimizing Information-theoretical Generalization Bound via Anisotropic Noise of SGLD

Batched Thompson Sampling

Shape your Space: A Gaussian Mixture Regularization Approach to Deterministic Autoencoders

On the Bias-Variance-Cost Tradeoff of Stochastic Optimization

R-Drop: Regularized Dropout for Neural Networks

Hard-Attention for Scalable Image Classification

A Faster Decentralized Algorithm for Nonconvex Minimax Problems

Co-evolution Transformer for Protein Contact Prediction

Dynamic COVID risk assessment accounting for community virus exposure from a spatial-temporal transmission model

The balancing principle for parameter choice in distance-regularized domain adaptation

Large-Scale Wasserstein Gradient Flows

Non-Gaussian Gaussian Processes for Few-Shot Regression

Robustness between the worst and average case

Alignment Attention by Matching Key and Query Distributions

Learning Conjoint Attentions for Graph Neural Nets

FedDR – Randomized Douglas-Rachford Splitting Algorithms for Nonconvex Federated Composite Optimization

Neural Symplectic Form: Learning Hamiltonian Equations on General Coordinate Systems

Open Rule Induction

Biological learning in key-value memory networks

Mixed Supervised Object Detection by Transferring Mask Prior and Semantic Similarity

An Improved Analysis of Gradient Tracking for Decentralized Machine Learning

Task-Adaptive Neural Network Search with Meta-Contrastive Learning

Minimax Optimal Quantile and Semi-Adversarial Regret via Root-Logarithmic Regularizers

On Inductive Biases for Heterogeneous Treatment Effect Estimation

Deconditional Downscaling with Gaussian Processes

Understanding Instance-based Interpretability of Variational Auto-Encoders

Self-Supervised GANs with Label Augmentation

Dynamic Distillation Network for Cross-Domain Few-Shot Recognition with Unlabeled Data

Goal-Aware Cross-Entropy for Multi-Target Reinforcement Learning

Fair Scheduling for Time-dependent Resources

Distilling Image Classifiers in Object Detectors

Discovery of Options via Meta-Learned Subgoals

Balanced Chamfer Distance as a Comprehensive Metric for Point Cloud Completion

CO-PILOT: COllaborative Planning and reInforcement Learning On sub-Task curriculum

Topographic VAEs learn Equivariant Capsules

Self-Consistent Models and Values

Implicit MLE: Backpropagating Through Discrete Exponential Family Distributions

On Linear Stability of SGD and Input-Smoothness of Neural Networks

Adversarial Training Helps Transfer Learning via Better Representations

Going Beyond Linear Transformers with Recurrent Fast Weight Programmers

Regret Minimization Experience Replay in Off-Policy Reinforcement Learning

PreferenceNet: Encoding Human Preferences in Auction Design with Deep Learning

Self-Supervised Representation Learning on Neural Network Weights for Model Characteristic Prediction

Learning to Learn Graph Topologies

AutoGEL: An Automated Graph Neural Network with Explicit Link Information

Interpreting Representation Quality of DNNs for 3D Point Cloud Processing

Low-Rank Constraints for Fast Inference in Structured Models

On the Equivalence between Neural Network and Support Vector Machine

Active Assessment of Prediction Services as Accuracy Surface Over Attribute Combinations

Learning Equivariant Energy Based Models with Equivariant Stein Variational Gradient Descent

FlexMatch: Boosting Semi-Supervised Learning with Curriculum Pseudo Labeling

Adversarially robust learning for security-constrained optimal power flow

Stochastic Optimization of Areas Under Precision-Recall Curves with Provable Convergence

MST: Masked Self-Supervised Transformer for Visual Representation

Generalized Shape Metrics on Neural Representations

Faster Neural Network Training with Approximate Tensor Operations

Personalized Federated Learning With Gaussian Processes

ReSSL: Relational Self-Supervised Learning with Weak Augmentation

Aligned Structured Sparsity Learning for Efficient Image Super-Resolution

Differentiable Quality Diversity

Recurrence along Depth: Deep Convolutional Neural Networks with Recurrent Layer Aggregation

Efficient Equivariant Network

The functional specialization of visual cortex emerges from training parallel pathways with self-supervised predictive learning

Alias-Free Generative Adversarial Networks

Statistically and Computationally Efficient Linear Meta-representation Learning

Towards Unifying Behavioral and Response Diversity for Open-ended Learning in Zero-sum Games

Hardware-adaptive Efficient Latency Prediction for NAS via Meta-Learning

Deceive D: Adaptive Pseudo Augmentation for GAN Training with Limited Data

Safe Reinforcement Learning with Natural Language Constraints

Improving Self-supervised Learning with Automated Unsupervised Outlier Arbitration

Dynamical Wasserstein Barycenters for Time-series Modeling

Global-aware Beam Search for Neural Abstractive Summarization

Optimal Order Simple Regret for Gaussian Process Bandits

Invariant Causal Imitation Learning for Generalizable Policies

Directed Probabilistic Watershed

Shape from Blur: Recovering Textured 3D Shape and Motion of Fast Moving Objects

STORM+: Fully Adaptive SGD with Recursive Momentum for Nonconvex Optimization

Counterfactual Explanations in Sequential Decision Making Under Uncertainty

Diversity Matters When Learning From Ensembles

Qimera: Data-free Quantization with Synthetic Boundary Supporting Samples

Make Sure You're Unsure: A Framework for Verifying Probabilistic Specifications

Beyond Bandit Feedback in Online Multiclass Classification

Learning Fast-Inference Bayesian Networks

Physics-Aware Downsampling with Deep Learning for Scalable Flood Modeling

Directed Graph Contrastive Learning

Neural Auto-Curricula in Two-Player Zero-Sum Games

Determinantal point processes based on orthogonal polynomials for sampling minibatches in SGD

Asynchronous Decentralized Online Learning

Diffusion Normalizing Flow

A sampling-based circuit for optimal decision making

Demystifying and Generalizing BinaryConnect

Learning Transferable Features for Point Cloud Detection via 3D Contrastive Co-training

Bayesian Optimization with High-Dimensional Outputs

Using Random Effects to Account for High-Cardinality Categorical Features and Repeated Measures in Deep Neural Networks

HRFormer: High-Resolution Vision Transformer for Dense Prediction

Graph Adversarial Self-Supervised Learning

The Image Local Autoregressive Transformer

Fine-grained Generalization Analysis of Inductive Matrix Completion

Canonical Capsules: Self-Supervised Capsules in Canonical Pose

On the Power of Edge Independent Graph Models

On the Theory of Reinforcement Learning with Once-per-Episode Feedback

Conflict-Averse Gradient Descent for Multi-task learning

Near-optimal Offline and Streaming Algorithms for Learning Non-Linear Dynamical Systems

Predicting What You Already Know Helps: Provable Self-Supervised Learning

Fair Sortition Made Transparent

Denoising Normalizing Flow

TopicNet: Semantic Graph-Guided Topic Discovery

Effective Meta-Regularization by Kernelized Proximal Regularization

No RL, No Simulation: Learning to Navigate without Navigating

Knowledge-Adaptation Priors

Universal Approximation Using Well-Conditioned Normalizing Flows

Domain Invariant Representation Learning with Domain Density Transformations

OSOA: One-Shot Online Adaptation of Deep Generative Models for Lossless Compression

Is Automated Topic Model Evaluation Broken? The Incoherence of Coherence

VAST: Value Function Factorization with Variable Agent Sub-Teams

Relaxed Marginal Consistency for Differentially Private Query Answering

Neural Flows: Efficient Alternative to Neural ODEs

Square Root Principal Component Pursuit: Tuning-Free Noisy Robust Matrix Recovery

Fast Training Method for Stochastic Compositional Optimization Problems

Fast Projection onto the Capped Simplex with Applications to Sparse Regression in Bioinformatics

MobTCast: Leveraging Auxiliary Trajectory Forecasting for Human Mobility Prediction

Reliable Causal Discovery with Improved Exact Search and Weaker Assumptions

Adaptive Sampling for Minimax Fair Classification

Relative Flatness and Generalization

Mini-Batch Consistent Slot Set Encoder for Scalable Set Encoding

Stochastic Solutions for Linear Inverse Problems using the Prior Implicit in a Denoiser

RelaySum for Decentralized Deep Learning on Heterogeneous Data

FMMformer: Efficient and Flexible Transformer via Decomposed Near-field and Far-field Attention

Gaussian Kernel Mixture Network for Single Image Defocus Deblurring

Global Filter Networks for Image Classification

No Fear of Heterogeneity: Classifier Calibration for Federated Learning with Non-IID Data

Linear-Time Probabilistic Solution of Boundary Value Problems

Adaptive Online Packing-guided Search for POMDPs

Topological Relational Learning on Graphs

MobILE: Model-Based Imitation Learning From Observation Alone

Multi-Label Learning with Pairwise Relevance Ordering

Volume Rendering of Neural Implicit Surfaces

Loss function based second-order Jensen inequality and its application to particle variational inference

Towards Robust and Reliable Algorithmic Recourse

DROID-SLAM: Deep Visual SLAM for Monocular, Stereo, and RGB-D Cameras

Does Preprocessing Help Training Over-parameterized Neural Networks?

Adversarial Robustness with Semi-Infinite Constrained Learning

Don’t Generate Me: Training Differentially Private Generative Models with Sinkhorn Divergence

Optimizing Conditional Value-At-Risk of Black-Box Functions

Learning to dehaze with polarization

TestRank: Bringing Order into Unlabeled Test Instances for Deep Learning Tasks

Federated Linear Contextual Bandits

Fast Doubly-Adaptive MCMC to Estimate the Gibbs Partition Function with Weak Mixing Time Bounds

An Efficient Transfer Learning Framework for Multiagent Reinforcement Learning

Calibrating Predictions to Decisions: A Novel Approach to Multi-Class Calibration

Implicit Transformer Network for Screen Content Image Continuous Super-Resolution

Do Input Gradients Highlight Discriminative Features?

Three Operator Splitting with Subgradients, Stochastic Gradients, and Adaptive Learning Rates

Linear Convergence in Federated Learning: Tackling Client Heterogeneity and Sparse Gradients

3D Siamese Voxel-to-BEV Tracker for Sparse Point Clouds

Training for the Future: A Simple Gradient Interpolation Loss to Generalize Along Time

Dense Unsupervised Learning for Video Segmentation

Scalable Quasi-Bayesian Inference for Instrumental Variable Regression

Ultrahyperbolic Neural Networks

Drop, Swap, and Generate: A Self-Supervised Approach for Generating Neural Activity

Evaluating Gradient Inversion Attacks and Defenses in Federated Learning

Neural Hybrid Automata: Learning Dynamics With Multiple Modes and Stochastic Transitions

Rot-Pro: Modeling Transitivity by Projection in Knowledge Graph Embedding

Gradient Inversion with Generative Image Prior

Action-guided 3D Human Motion Prediction

SmoothMix: Training Confidence-calibrated Smoothed Classifiers for Certified Robustness

Higher Order Kernel Mean Embeddings to Capture Filtrations of Stochastic Processes

Learning to Learn Dense Gaussian Processes for Few-Shot Learning

Achieving Rotational Invariance with Bessel-Convolutional Neural Networks

Online Learning and Control of Complex Dynamical Systems from Sensory Input

Intermediate Layers Matter in Momentum Contrastive Self Supervised Learning

End-to-End Training of Multi-Document Reader and Retriever for Open-Domain Question Answering

Reinforcement learning for optimization of variational quantum circuit architectures

Efficient First-Order Contextual Bandits: Prediction, Allocation, and Triangular Discrimination

Active clustering for labeling training data

Dynamic Normalization and Relay for Video Action Recognition

Local Differential Privacy for Regret Minimization in Reinforcement Learning

Predicting Molecular Conformation via Dynamic Graph Score Matching

Identification and Estimation of Joint Probabilities of Potential Outcomes in Observational Studies with Covariate Information

Residual Pathway Priors for Soft Equivariance Constraints

Robust Deep Reinforcement Learning through Adversarial Loss

Accelerating Robotic Reinforcement Learning via Parameterized Action Primitives

Interesting Object, Curious Agent: Learning Task-Agnostic Exploration

ASSANet: An Anisotropic Separable Set Abstraction for Efficient Point Cloud Representation Learning

Smooth Normalizing Flows

Directional Message Passing on Molecular Graphs via Synthetic Coordinates

Lower Bounds and Optimal Algorithms for Smooth and Strongly Convex Decentralized Optimization Over Time-Varying Networks

On Contrastive Representations of Stochastic Processes

Joint inference and input optimization in equilibrium networks

Black Box Probabilistic Numerics

STEM: A Stochastic Two-Sided Momentum Algorithm Achieving Near-Optimal Sample and Communication Complexities for Federated Learning

NTopo: Mesh-free Topology Optimization using Implicit Neural Representations

A 3D Generative Model for Structure-Based Drug Design

Circa: Stochastic ReLUs for Private Deep Learning

Explaining Hyperparameter Optimization via Partial Dependence Plots

Learning Causal Semantic Representation for Out-of-Distribution Prediction

Charting and Navigating the Space of Solutions for Recurrent Neural Networks

Disentangling the Roles of Curation, Data-Augmentation and the Prior in the Cold Posterior Effect

Manipulating SGD with Data Ordering Attacks

Practical Large-Scale Linear Programming using Primal-Dual Hybrid Gradient

Recovery Analysis for Plug-and-Play Priors using the Restricted Eigenvalue Condition

Do Different Tracking Tasks Require Different Appearance Models?

Online Learning in Periodic Zero-Sum Games

CentripetalText: An Efficient Text Instance Representation for Scene Text Detection

Mosaicking to Distill: Knowledge Distillation from Out-of-Domain Data

Image Generation using Continuous Filter Atoms

Beltrami Flow and Neural Diffusion on Graphs

Multimodal Few-Shot Learning with Frozen Language Models

Averaging on the Bures-Wasserstein manifold: dimension-free convergence of gradient descent

Fast Bayesian Inference for Gaussian Cox Processes via Path Integral Formulation

NORESQA: A Framework for Speech Quality Assessment using Non-Matching References

Duplex Sequence-to-Sequence Learning for Reversible Machine Translation

Coupled Gradient Estimators for Discrete Latent Variables

Perturbation-based Regret Analysis of Predictive Control in Linear Time Varying Systems

Metropolis-Hastings Data Augmentation for Graph Neural Networks

Private Non-smooth ERM and SCO in Subquadratic Steps

Automatic Symmetry Discovery with Lie Algebra Convolutional Network

Finding Bipartite Components in Hypergraphs

Gone Fishing: Neural Active Learning with Fisher Embeddings

SketchGen: Generating Constrained CAD Sketches

Dueling Bandits with Team Comparisons

The Effect of the Intrinsic Dimension on the Generalization of Quadratic Classifiers

Exponential Graph is Provably Efficient for Decentralized Deep Training

A Convergence Analysis of Gradient Descent on Graph Neural Networks

Design of Experiments for Stochastic Contextual Linear Bandits

Think Big, Teach Small: Do Language Models Distil Occam’s Razor?

Towards Calibrated Model for Long-Tailed Visual Recognition from Prior Perspective

On the Convergence Theory of Debiased Model-Agnostic Meta-Reinforcement Learning

Solving Min-Max Optimization with Hidden Structure via Gradient Descent Ascent

Cooperative Stochastic Bandits with Asynchronous Agents and Constrained Feedback

Latent Equilibrium: A unified learning theory for arbitrarily fast computation with arbitrarily slow neurons

On Locality of Local Explanation Models

Learning Signal-Agnostic Manifolds of Neural Fields

Convolutional Normalization: Improving Deep Convolutional Network Robustness and Training

Unsupervised Learning of Compositional Energy Concepts

Neural Circuit Synthesis from Specification Patterns

Spatiotemporal Joint Filter Decomposition in 3D Convolutional Neural Networks

Grounding Representation Similarity Through Statistical Testing

Uncertainty Calibration for Ensemble-Based Debiasing Methods

Activation Sharing with Asymmetric Paths Solves Weight Transport Problem without Bidirectional Connection

Neural Population Geometry Reveals the Role of Stochasticity in Robust Perception

Structure learning in polynomial time: Greedy algorithms, Bregman information, and exponential families

Who Leads and Who Follows in Strategic Classification?

Catalytic Role Of Noise And Necessity Of Inductive Biases In The Emergence Of Compositional Communication

Bandits with many optimal arms

Exploiting a Zoo of Checkpoints for Unseen Tasks

Offline Model-based Adaptable Policy Learning

Formalizing the Generalization-Forgetting Trade-off in Continual Learning

(Almost) Free Incentivized Exploration from Decentralized Learning Agents

Emergent Communication under Varying Sizes and Connectivities

Meta Learning Backpropagation And Improving It

Adaptable Agent Populations via a Generative Model of Policies

Faster Algorithms and Constant Lower Bounds for the Worst-Case Expected Error

Preserved central model for faster bidirectional compression in distributed settings

InfoGCL: Information-Aware Graph Contrastive Learning

Reinforced Few-Shot Acquisition Function Learning for Bayesian Optimization

Boosting with Multiple Sources

Dynamic Neural Representational Decoders for High-Resolution Semantic Segmentation

Towards Biologically Plausible Convolutional Networks

Minibatch and Momentum Model-based Methods for Stochastic Weakly Convex Optimization

Actively Identifying Causal Effects with Latent Variables Given Only Response Variable Observable

Gradual Domain Adaptation without Indexed Intermediate Domains

Understanding How Encoder-Decoder Architectures Attend

Moshpit SGD: Communication-Efficient Decentralized Training on Heterogeneous Unreliable Devices

PLUGIn: A simple algorithm for inverting generative models with recovery guarantees

Relative Uncertainty Learning for Facial Expression Recognition

Vector-valued Gaussian Processes on Riemannian Manifolds via Gauge Independent Projected Kernels

Manifold Topology Divergence: a Framework for Comparing Data Manifolds

Bayesian Bellman Operators

Inverse Reinforcement Learning in a Continuous State Space with Formal Guarantees

Compositional Reinforcement Learning from Logical Specifications

One Question Answering Model for Many Languages with Cross-lingual Dense Passage Retrieval

Imitating Deep Learning Dynamics via Locally Elastic Stochastic Differential Equations

Dynamic population-based meta-learning for multi-agent communication with natural language

SurvITE: Learning Heterogeneous Treatment Effects from Time-to-Event Data

Model, sample, and epoch-wise descents: exact solution of gradient flow in the random feature model

Revitalizing CNN Attention via Transformers in Self-Supervised Visual Representation Learning

Instance-Conditional Knowledge Distillation for Object Detection

Entropic Desired Dynamics for Intrinsic Control

A unified framework for bandit multiple testing

Memory Efficient Meta-Learning with Large Images

A single gradient step finds adversarial examples on random two-layers neural networks

The Out-of-Distribution Problem in Explainability and Search Methods for Feature Importance Explanations

A Unified Approach to Fair Online Learning via Blackwell Approachability

On Component Interactions in Two-Stage Recommender Systems

Wasserstein Flow Meets Replicator Dynamics: A Mean-Field Analysis of Representation Learning in Actor-Critic

CARMS: Categorical-Antithetic-REINFORCE Multi-Sample Gradient Estimator

ReAct: Out-of-distribution Detection With Rectified Activations

Representer Point Selection via Local Jacobian Expansion for Post-hoc Classifier Explanation of Deep Neural Networks and Ensemble Models

Stability and Deviation Optimal Risk Bounds with Convergence Rate $O(1/n)$

Efficient Training of Visual Transformers with Small Datasets

Combiner: Full Attention Transformer with Sparse Computation Cost

On the Frequency Bias of Generative Models

SSAL: Synergizing between Self-Training and Adversarial Learning for Domain Adaptive Object Detection

High-probability Bounds for Non-Convex Stochastic Optimization with Heavy Tails

Discrete-Valued Neural Communication

Robust Contrastive Learning Using Negative Samples with Diminished Semantics

Chebyshev-Cantelli PAC-Bayes-Bennett Inequality for the Weighted Majority Vote

Keeping Your Eye on the Ball: Trajectory Attention in Video Transformers

XDO: A Double Oracle Algorithm for Extensive-Form Games

From Optimality to Robustness: Adaptive Re-Sampling Strategies in Stochastic Bandits

History Aware Multimodal Transformer for Vision-and-Language Navigation

Reformulating Zero-shot Action Recognition for Multi-label Actions

The Utility of Explainable AI in Ad Hoc Human-Machine Teaming

Understanding Deflation Process in Over-parametrized Tensor Decomposition

Diffusion Schrödinger Bridge with Applications to Score-Based Generative Modeling

Covariance-Aware Private Mean Estimation Without Private Covariance Estimation

MEST: Accurate and Fast Memory-Economic Sparse Training Framework on the Edge

Generalized Linear Bandits with Local Differential Privacy

Scalable Diverse Model Selection for Accessible Transfer Learning

Unbiased Classification through Bias-Contrastive and Bias-Balanced Learning

Snowflake: Scaling GNNs to high-dimensional continuous control via parameter freezing

Distributional Reinforcement Learning for Multi-Dimensional Reward Functions

Learning Nonparametric Volterra Kernels with Gaussian Processes

Estimating the Unique Information of Continuous Variables

Towards Optimal Strategies for Training Self-Driving Perception Models in Simulation

Rethinking Graph Transformers with Spectral Attention

Continual Learning via Local Module Composition

Local Explanation of Dialogue Response Generation

Robust Visual Reasoning via Language Guided Neural Module Networks

Robust and differentially private mean estimation

Differentially Private Stochastic Optimization: New Results in Convex and Non-Convex Settings

Accurate Point Cloud Registration with Robust Optimal Transport

Efficient and Local Parallel Random Walks

RMM: Reinforced Memory Management for Class-Incremental Learning

Why Generalization in RL is Difficult: Epistemic POMDPs and Implicit Partial Observability

Comprehensive Knowledge Distillation with Causal Intervention

How Does it Sound?

Contrastive Laplacian Eigenmaps

Deep learning is adaptive to intrinsic dimensionality of model smoothness in anisotropic Besov space

Online Meta-Learning via Learning with Layer-Distributed Memory

Neural Program Generation Modulo Static Analysis

$(\textrm{Implicit})^2$: Implicit Layers for Implicit Representations

Sparsely Changing Latent States for Prediction and Planning in Partially Observable Domains

Leveraging the Inductive Bias of Large Language Models for Abstract Textual Reasoning

Small random initialization is akin to spectral learning: Optimization and generalization guarantees for overparameterized low-rank matrix reconstruction

Exact marginal prior distributions of finite Bayesian neural networks

Functional Regularization for Reinforcement Learning via Learned Fourier Features

Discovering Dynamic Salient Regions for Spatio-Temporal Graph Neural Networks

Training Over-parameterized Models with Non-decomposable Objectives

Stochastic Multi-Armed Bandits with Control Variates

Implicit Deep Adaptive Design: Policy-Based Experimental Design without Likelihoods

Reinforcement Learning Enhanced Explainer for Graph Neural Networks

RETRIEVE: Coreset Selection for Efficient and Robust Semi-Supervised Learning

Directed Spectrum Measures Improve Latent Network Models Of Neural Populations

Antipodes of Label Differential Privacy: PATE and ALIBI

POODLE: Improving Few-shot Learning via Penalizing Out-of-Distribution Samples

Proper Value Equivalence

Data driven semi-supervised learning

Trash or Treasure? An Interactive Dual-Stream Strategy for Single Image Reflection Separation

CAM-GAN: Continual Adaptation Modules for Generative Adversarial Networks

An Online Riemannian PCA for Stochastic Canonical Correlation Analysis

Deep Learning with Label Differential Privacy

Label Noise SGD Provably Prefers Flat Global Minimizers

Sparse Flows: Pruning Continuous-depth Models

Adversarial Examples in Multi-Layer Random ReLU Networks

Fast Certified Robust Training with Short Warmup

Realistic evaluation of transductive few-shot learning

Active Offline Policy Selection

Finding Discriminative Filters for Specific Degradations in Blind Super-Resolution

Systematic Generalization with Edge Transformers

Contrastive Active Inference

Relational Self-Attention: What's Missing in Attention for Video Understanding

Blending Anti-Aliasing into Vision Transformer

Data-Efficient Instance Generation from Instance Discrimination

Scalable Neural Data Server: A Data Recommender for Transfer Learning

High Probability Complexity Bounds for Line Search Based on Stochastic Oracles

Terra: Imperative-Symbolic Co-Execution of Imperative Deep Learning Programs

Non-convex Distributionally Robust Optimization: Non-asymptotic Analysis

Multi-View Representation Learning via Total Correlation Objective

Bandit Phase Retrieval

Object DGCNN: 3D Object Detection using Dynamic Graphs

IA-RED$^2$: Interpretability-Aware Redundancy Reduction for Vision Transformers

XCiT: Cross-Covariance Image Transformers

Optimal Policies Tend To Seek Power

On the Periodic Behavior of Neural Network Training with Batch Normalization and Weight Decay

SalKG: Learning From Knowledge Graph Explanations for Commonsense Reasoning

Multiclass versus Binary Differentially Private PAC Learning

Partition and Code: learning how to compress graphs

An Image is Worth More Than a Thousand Words: Towards Disentanglement in The Wild

Error Compensated Distributed SGD Can Be Accelerated

Revisiting Deep Learning Models for Tabular Data

Conditioning Sparse Variational Gaussian Processes for Online Decision-making

Spatial-Temporal Super-Resolution of Satellite Imagery via Conditional Pixel Synthesis

Implicit SVD for Graph Representation Learning

Explanation-based Data Augmentation for Image Classification

On the Universality of Graph Neural Networks on Large Random Graphs

Navigating to the Best Policy in Markov Decision Processes

Identifying and Benchmarking Natural Out-of-Context Prediction Problems

Asynchronous Decentralized SGD with Quantized and Local Updates

Analogous to Evolutionary Algorithm: Designing a Unified Sequence Model

Grammar-Based Grounded Lexicon Learning

SIMONe: View-Invariant, Temporally-Abstracted Object Representations via Unsupervised Video Decomposition

Retiring Adult: New Datasets for Fair Machine Learning

Space-time Mixing Attention for Video Transformer

CSDI: Conditional Score-based Diffusion Models for Probabilistic Time Series Imputation

Weisfeiler and Lehman Go Cellular: CW Networks

Iterative Teaching by Label Synthesis

Lossy Compression for Lossless Prediction

T-LoHo: A Bayesian Regularization Model for Structured Sparsity and Smoothness on Graphs

Contrastive Reinforcement Learning of Symbolic Reasoning Domains

How Tight Can PAC-Bayes be in the Small Data Regime?

VQ-GNN: A Universal Framework to Scale up Graph Neural Networks using Vector Quantization

Deep Markov Factor Analysis: Towards Concurrent Temporal and Spatial Analysis of fMRI Data

Is Bang-Bang Control All You Need? Solving Continuous Control with Bernoulli Policies

AC/DC: Alternating Compressed/DeCompressed Training of Deep Neural Networks

Synthetic Design: An Optimization Approach to Experimental Design with Synthetic Controls

Going Beyond Linear RL: Sample Efficient Neural Function Approximation

One Loss for All: Deep Hashing with a Single Cosine Similarity based Learning Objective

De-randomizing MCMC dynamics with the diffusion Stein operator

On the Provable Generalization of Recurrent Neural Networks

A first-order primal-dual method with adaptivity to local smoothness

Encoding Spatial Distribution of Convolutional Features for Texture Representation

Curriculum Offline Imitating Learning

Sparse is Enough in Scaling Transformers

Federated Graph Classification over Non-IID Graphs

Adaptive Denoising via GainTuning

Rates of Estimation of Optimal Transport Maps using Plug-in Estimators via Barycentric Projections

Attention Approximates Sparse Distributed Memory

ViSER: Video-Specific Surface Embeddings for Articulated 3D Shape Reconstruction

Better Safe Than Sorry: Preventing Delusive Adversaries with Adversarial Training

Amortized Synthesis of Constrained Configurations Using a Differentiable Surrogate

NeuroMLR: Robust & Reliable Route Recommendation on Road Networks

ATISS: Autoregressive Transformers for Indoor Scene Synthesis

Distributional Gradient Matching for Learning Uncertain Neural Dynamics Models

Pure Exploration in Kernel and Neural Bandits

On the Cryptographic Hardness of Learning Single Periodic Neurons

Policy Finetuning: Bridging Sample-Efficient Offline and Online Reinforcement Learning

Meta-learning with an Adaptive Task Scheduler

Multi-Facet Clustering Variational Autoencoders

Soft Calibration Objectives for Neural Networks

Complexity Lower Bounds for Nonconvex-Strongly-Concave Min-Max Optimization

Generalization of Model-Agnostic Meta-Learning Algorithms: Recurring and Unseen Tasks

Learning rule influences recurrent network representations but not attractor structure in decision-making tasks

Techniques for Symbol Grounding with SATNet

Improved Guarantees for Offline Stochastic Matching via new Ordered Contention Resolution Schemes

A Domain-Shrinking based Bayesian Optimization Algorithm with Order-Optimal Regret Performance

Continuous Latent Process Flows

How does a Neural Network's Architecture Impact its Robustness to Noisy Labels?

When does Contrastive Learning Preserve Adversarial Robustness from Pretraining to Finetuning?

On the Importance of Gradients for Detecting Distributional Shifts in the Wild

Learning to Simulate Self-driven Particles System with Coordinated Policy Optimization

Evaluating Efficient Performance Estimators of Neural Architectures

Multiwavelet-based Operator Learning for Differential Equations

Bubblewrap: Online tiling and real-time flow prediction on neural manifolds

Dirichlet Energy Constrained Learning for Deep Graph Neural Networks

S$^3$: Sign-Sparse-Shift Reparametrization for Effective Training of Low-bit Shift Networks

The staircase property: How hierarchical structure can guide deep learning

Topological Attention for Time Series Forecasting

Compressive Visual Representations

When False Positive is Intolerant: End-to-End Optimization with Low FPR for Multipartite Ranking

Best-case lower bounds in online learning

Nearly Minimax Optimal Reinforcement Learning for Discounted MDPs

Computer-Aided Design as Language

No-Press Diplomacy from Scratch

Efficient Mirror Descent Ascent Methods for Nonsmooth Minimax Problems

Data Sharing and Compression for Cooperative Networked Control

DIB-R++: Learning to Predict Lighting and Material with a Hybrid Differentiable Renderer

Adaptive First-Order Methods Revisited: Convex Minimization without Lipschitz Requirements

Backdoor Attack with Imperceptible Input and Latent Modification

Teaching an Active Learner with Contrastive Examples

On sensitivity of meta-learning to support data

Contrast and Mix: Temporal Contrastive Video Domain Adaptation with Background Mixing

Can we have it all? On the Trade-off between Spatial and Adversarial Robustness of Neural Networks

Inverse Problems Leveraging Pre-trained Contrastive Representations

H-NeRF: Neural Radiance Fields for Rendering and Temporal Reconstruction of Humans in Motion

Slice Sampling Reparameterization Gradients

Why Do Better Loss Functions Lead to Less Transferable Features?

Meta Two-Sample Testing: Learning Kernels for Testing with Limited Data

Revisit Multimodal Meta-Learning through the Lens of Multi-Task Learning

Transfer Learning of Graph Neural Networks with Ego-graph Information Maximization

Fast Axiomatic Attribution for Neural Networks

Targeted Neural Dynamical Modeling

On the Role of Optimization in Double Descent: A Least Squares Study

Attention Bottlenecks for Multimodal Fusion

Stochastic Bias-Reduced Gradient Methods

Shift-Robust GNNs: Overcoming the Limitations of Localized Graph Training data

Interactive Label Cleaning with Example-based Explanations

Parameter Inference with Bifurcation Diagrams

Logarithmic Regret from Sublinear Hints

Deep Marching Tetrahedra: a Hybrid Representation for High-Resolution 3D Shape Synthesis

Learning to Predict Trustworthiness with Steep Slope Loss

Breaking the Linear Iteration Cost Barrier for Some Well-known Conditional Gradient Methods Using MaxIP Data-structures

Kernel Identification Through Transformers

Convex-Concave Min-Max Stackelberg Games

Three-dimensional spike localization and improved motion correction for Neuropixels recordings

Outcome-Driven Reinforcement Learning via Variational Inference

Transformers Generalize DeepSets and Can be Extended to Graphs & Hypergraphs

Efficient Generalization with Distributionally Robust Learning

How to transfer algorithmic reasoning knowledge to learn new algorithms?

Fast Routing under Uncertainty: Adaptive Learning in Congestion Games via Exponential Weights

Absolute Neighbour Difference based Correlation Test for Detecting Heteroscedastic Relationships

Self-Paced Contrastive Learning for Semi-supervised Medical Image Segmentation with Meta-labels

On Optimal Interpolation in Linear Regression

Towards Sample-efficient Overparameterized Meta-learning

Self-Supervised Learning with Kernel Dependence Maximization

Instance-Conditioned GAN

Optimal prediction of Markov chains with and without spectral gap

Overlapping Spaces for Compact Graph Representations

Long Short-Term Transformer for Online Action Detection

Supercharging Imbalanced Data Learning With Energy-based Contrastive Representation Transfer

Neural Pseudo-Label Optimism for the Bank Loan Problem

Differentially Private Learning with Adaptive Clipping

Nested Counterfactual Identification from Arbitrary Surrogate Experiments

On Provable Benefits of Depth in Training Graph Convolutional Networks

Robust Counterfactual Explanations on Graph Neural Networks

Perturb-and-max-product: Sampling and learning in discrete energy-based models

Class-Disentanglement and Applications in Adversarial Detection and Defense

Hypergraph Propagation and Community Selection for Objects Retrieval

Aligning Silhouette Topology for Self-Adaptive 3D Human Pose Recovery

Robust Implicit Networks via Non-Euclidean Contractions

Successor Feature Landmarks for Long-Horizon Goal-Conditioned Reinforcement Learning

Tailoring: encoding inductive biases by optimizing unsupervised objectives at prediction time

Generative vs. Discriminative: Rethinking The Meta-Continual Learning

Controllable and Compositional Generation with Latent-Space Energy-Based Models

CoFiNet: Reliable Coarse-to-fine Correspondences for Robust PointCloud Registration

Automatic and Harmless Regularization with Constrained and Lexicographic Optimization: A Dynamic Barrier Approach

Joint Modeling of Visual Objects and Relations for Scene Graph Generation

Pessimism Meets Invariance: Provably Efficient Offline Mean-Field Multi-Agent RL

Renyi Differential Privacy of The Subsampled Shuffle Model In Distributed Learning

Visual Adversarial Imitation Learning using Variational Models

Online false discovery rate control for anomaly detection in time series

Double Machine Learning Density Estimation for Local Treatment Effects with Instruments

An analysis of Ermakov-Zolotukhin quadrature using kernels

An Exponential Improvement on the Memorization Capacity of Deep Threshold Networks

NAS-Bench-x11 and the Power of Learning Curves

Reinforcement Learning based Disease Progression Model for Alzheimer’s Disease

Scalable Online Planning via Reinforcement Learning Fine-Tuning

Differentiable Optimization of Generalized Nondecomposable Functions using Linear Programs

Parallel Bayesian Optimization of Multiple Noisy Objectives with Expected Hypervolume Improvement

Explicable Reward Design for Reinforcement Learning Agents

Robust and Fully-Dynamic Coreset for Continuous-and-Bounded Learning (With Outliers) Problems

Remember What You Want to Forget: Algorithms for Machine Unlearning

Faster Matchings via Learned Duals

A Separation Result Between Data-oblivious and Data-aware Poisoning Attacks

Learning to Select Exogenous Events for Marked Temporal Point Process

Score-based Generative Modeling in Latent Space

Reducing Collision Checking for Sampling-Based Motion Planning Using Graph Neural Networks

Center Smoothing: Certified Robustness for Networks with Structured Outputs

Numerical Composition of Differential Privacy

The Semi-Random Satisfaction of Voting Axioms

Better Algorithms for Individually Fair $k$-Clustering

A Near-Optimal Algorithm for Stochastic Bilevel Optimization via Double-Momentum

One More Step Towards Reality: Cooperative Bandits with Imperfect Communication

Discovering and Achieving Goals via World Models

Learning-to-learn non-convex piecewise-Lipschitz functions

Tracking People with 3D Representations

Efficient Truncated Linear Regression with Unknown Noise Variance

Moser Flow: Divergence-based Generative Modeling on Manifolds

Stateful ODE-Nets using Basis Function Expansions

Adversarial Graph Augmentation to Improve Graph Contrastive Learning

Latent Matters: Learning Deep State-Space Models

Permuton-induced Chinese Restaurant Process

Beware of the Simulated DAG! Causal Discovery Benchmarks May Be Easy to Game

A Gang of Adversarial Bandits

Bayesian Adaptation for Covariate Shift

Differentiable Synthesis of Program Architectures

Fair Classification with Adversarial Perturbations

Heavy Tails in SGD and Compressibility of Overparametrized Neural Networks

Stochastic Online Linear Regression: the Forward Algorithm to Replace Ridge

Strategic Behavior is Bliss: Iterative Voting Improves Social Welfare

Sifting through the noise: Universal first-order methods for stochastic variational inequalities

The Complexity of Sparse Tensor PCA

Extending Lagrangian and Hamiltonian Neural Networks with Differentiable Contact Models

CAPE: Encoding Relative Positions with Continuous Augmented Positional Embeddings

Double/Debiased Machine Learning for Dynamic Treatment Effects

Fine-Grained Neural Network Explanation by Identifying Input Features with Predictive Information

Do Wider Neural Networks Really Help Adversarial Robustness?

Hyperparameter Tuning is All You Need for LISTA

Learning Stable Deep Dynamics Models for Partially Observed or Delayed Dynamical Systems

Delayed Propagation Transformer: A Universal Computation Engine towards Practical Control in Cyber-Physical Systems

FLEX: Unifying Evaluation for Few-Shot NLP

TokenLearner: Adaptive Space-Time Tokenization for Videos

Adjusting for Autocorrelated Errors in Neural Networks for Time Series

The Benefits of Implicit Regularization from SGD in Least Squares Problems

Teaching via Best-Case Counterexamples in the Learning-with-Equivalence-Queries Paradigm

SIMILAR: Submodular Information Measures Based Active Learning In Realistic Scenarios

Shapley Residuals: Quantifying the limits of the Shapley value for explanations

Instance-Dependent Bounds for Zeroth-order Lipschitz Optimization with Error Certificates

Robust Regression Revisited: Acceleration and Improved Estimation Rates

Reinforcement Learning with State Observation Costs in Action-Contingent Noiselessly Observable Markov Decision Processes

On the Sample Complexity of Learning under Geometric Stability

Scaling Ensemble Distribution Distillation to Many Classes with Proxy Targets

Subgoal Search For Complex Reasoning Tasks

Consistent Estimation for PCA and Sparse Regression with Oblivious Outliers

Curriculum Design for Teaching via Demonstrations: Theory and Applications

Uncertain Decisions Facilitate Better Preference Learning

Asymptotically Best Causal Effect Identification with Multi-Armed Bandits

Qu-ANTI-zation: Exploiting Quantization Artifacts for Achieving Adversarial Outcomes

DropGNN: Random Dropouts Increase the Expressiveness of Graph Neural Networks

Improving Compositionality of Neural Networks by Decoding Representations to Inputs

Invariance Principle Meets Information Bottleneck for Out-of-Distribution Generalization

Learning to See by Looking at Noise

Parametric Complexity Bounds for Approximating PDEs with Neural Networks

General Nonlinearities in SO(2)-Equivariant CNNs

Representing Long-Range Context for Graph Neural Networks with Global Attention

Subgame solving without common knowledge

Towards a Unified Information-Theoretic Framework for Generalization

Learning to Synthesize Programs as Interpretable and Generalizable Policies

Distributed Estimation with Multiple Samples per User: Sharp Rates and Phase Transition

Structured Dropout Variational Inference for Bayesian Neural Networks

Local Signal Adaptivity: Provable Feature Learning in Neural Networks Beyond Kernels

Newton-LESS: Sparsification without Trade-offs for the Sketched Newton Update

Taming Communication and Sample Complexities in Decentralized Policy Evaluation for Cooperative Multi-Agent Reinforcement Learning

Scatterbrain: Unifying Sparse and Low-rank Attention

Look at the Variance! Efficient Black-box Explanations with Sobol-based Sensitivity Analysis

Minimizing Polarization and Disagreement in Social Networks via Link Recommendation

Adversarial Examples Make Strong Poisons

Laplace Redux - Effortless Bayesian Deep Learning

Enabling Fast Differentially Private SGD via Just-in-Time Compilation and Vectorization

A Multi-Implicit Neural Representation for Fonts

Towards Hyperparameter-free Policy Selection for Offline Reinforcement Learning

Sanity Checks for Lottery Tickets: Does Your Winning Ticket Really Win the Jackpot?

Learning Distilled Collaboration Graph for Multi-Agent Perception

Generalization Bounds for (Wasserstein) Robust Optimization

Achieving Forgetting Prevention and Knowledge Transfer in Continual Learning

Compositional Transformers for Scene Generation

Structural Credit Assignment in Neural Networks using Reinforcement Learning

Reducing the Covariate Shift by Mirror Samples in Cross Domain Alignment

Characterizing possible failure modes in physics-informed neural networks

Fast Training of Neural Lumigraph Representations using Meta Learning

Correlated Stochastic Block Models: Exact Graph Matching with Applications to Recovering Communities

Can Information Flows Suggest Targets for Interventions in Neural Circuits?

Kernel Functional Optimisation

Exploiting the Intrinsic Neighborhood Structure for Source-free Domain Adaptation

Training Certifiably Robust Neural Networks with Efficient Local Lipschitz Bounds

ReLU Regression with Massart Noise

Exponential Bellman Equation and Improved Regret Bounds for Risk-Sensitive Reinforcement Learning

Fair Clustering Under a Bounded Cost

Pragmatic Image Compression for Human-in-the-Loop Decision-Making

Second-Order Neural ODE Optimizer

Early Convolutions Help Transformers See Better

PatchGame: Learning to Signal Mid-level Patches in Referential Games

Structured Reordering for Modeling Latent Alignments in Sequence Transduction

ResNEsts and DenseNEsts: Block-based DNN Models with Improved Representation Guarantees

MERLOT: Multimodal Neural Script Knowledge Models

Novel Upper Bounds for the Constrained Most Probable Explanation Task

Low-Fidelity Video Encoder Optimization for Temporal Action Localization

Replay-Guided Adversarial Environment Design

Voxel-based 3D Detection and Reconstruction of Multiple Objects from a Single Image

Stochastic Gradient Descent-Ascent and Consensus Optimization for Smooth Games: Convergence Analysis under Expected Co-coercivity

Beyond Smoothness: Incorporating Low-Rank Analysis into Nonparametric Density Estimation

A Geometric Structure of Acceleration and Its Role in Making Gradients Small Fast

Differentiable Spline Approximations

Measuring Generalization with Optimal Transport

Baleen: Robust Multi-Hop Reasoning at Scale via Condensed Retrieval

Optimal Sketching for Trace Estimation

Robustness of Graph Neural Networks at Scale

Dynamic Inference with Neural Interpreters

Stochastic bandits with groups of similar arms

Identification of Partially Observed Linear Causal Models: Graphical Conditions for the Non-Gaussian and Heterogeneous Cases

Continual Auxiliary Task Learning

Generalization Bounds for Graph Embedding Using Negative Sampling: Linear vs Hyperbolic

On the Rate of Convergence of Regularized Learning in Games: From Bandits and Uncertainty to Optimism and Beyond

CLDA: Contrastive Learning for Semi-Supervised Domain Adaptation

Causal Abstractions of Neural Networks

Optimal Underdamped Langevin MCMC Method

Greedy Approximation Algorithms for Active Sequential Hypothesis Testing

Towards Deeper Deep Reinforcement Learning with Spectral Normalization

NeuS: Learning Neural Implicit Surfaces by Volume Rendering for Multi-view Reconstruction

Divergence Frontiers for Generative Models: Sample Complexity, Quantization Effects, and Frontier Integrals

When in Doubt: Neural Non-Parametric Uncertainty Quantification for Epidemic Forecasting

KALE Flow: A Relaxed KL Gradient Flow for Probabilities with Disjoint Support

Hyperbolic Procrustes Analysis Using Riemannian Geometry

MADE: Exploration via Maximizing Deviation from Explored Regions

Federated Multi-Task Learning under a Mixture of Distributions

Overcoming Catastrophic Forgetting in Incremental Few-Shot Learning by Finding Flat Minima

Regularized Frank-Wolfe for Dense CRFs: Generalizing Mean Field and Beyond

Collapsed Variational Bounds for Bayesian Neural Networks

Characterizing Generalization under Out-Of-Distribution Shifts in Deep Metric Learning

Improving Deep Learning Interpretability by Saliency Guided Training

Label consistency in overfitted generalized $k$-means

Meta Internal Learning

Analytic Insights into Structure and Rank of Neural Network Hessian Maps

LEADS: Learning Dynamical Systems that Generalize Across Environments

Intrinsic Dimension, Persistent Homology and Generalization in Neural Networks

Multi-Agent Reinforcement Learning for Active Voltage Control on Power Distribution Networks

Uniform Sampling over Episode Difficulty

GeoMol: Torsional Geometric Generation of Molecular 3D Conformer Ensembles

Efficient Algorithms for Learning Depth-2 Neural Networks with General ReLU Activations

A-NeRF: Articulated Neural Radiance Fields for Learning Human Shape, Appearance, and Pose

AugMax: Adversarial Composition of Random Augmentations for Robust Training

Drawing Robust Scratch Tickets: Subnetworks with Inborn Robustness Are Found within Randomly Initialized Networks

Overparameterization Improves Robustness to Covariate Shift in High Dimensions

MagNet: A Neural Network for Directed Graphs

Dr Jekyll & Mr Hyde: the strange case of off-policy policy updates

Stronger NAS with Weaker Predictors

Multi-Step Budgeted Bayesian Optimization with Unknown Evaluation Costs

Adaptive Machine Unlearning

Time-independent Generalization Bounds for SGLD in Non-convex Settings

NeRV: Neural Representations for Videos

Causal Effect Inference for Structured Treatments

Learning-Augmented Dynamic Power Management with Multiple States via New Ski Rental Bounds

Implicit Finite-Horizon Approximation and Efficient Optimal Algorithms for Stochastic Shortest Path

Finite Sample Analysis of Average-Reward TD Learning and $Q$-Learning

Stochastic optimization under time drift: iterate averaging, step-decay schedules, and high probability guarantees

Focal Attention for Long-Range Interactions in Vision Transformers

Iteratively Reweighted Least Squares for Basis Pursuit with Global Linear Convergence Rate

COHESIV: Contrastive Object and Hand Embedding Segmentation In Video

Scalars are universal: Equivariant machine learning, structured like classical physics

Rethinking gradient sparsification as total error minimization

Grad2Task: Improved Few-shot Text Classification Using Gradients for Task Representation

Habitat 2.0: Training Home Assistants to Rearrange their Habitat

Language models enable zero-shot prediction of the effects of mutations on protein function

Label-Imbalanced and Group-Sensitive Classification under Overparameterization

Deep inference of latent dynamics with spatio-temporal super-resolution using selective backpropagation through time

TTT++: When Does Self-Supervised Test-Time Training Fail or Thrive?

Two-sided fairness in rankings via Lorenz dominance

Decoupling the Depth and Scope of Graph Neural Networks

Learning in two-player zero-sum partially observable Markov games with perfect recall

Mixture Proportion Estimation and PU Learning: A Modern Approach

A Law of Iterated Logarithm for Multi-Agent Reinforcement Learning

Refining Language Models with Compositional Explanations

Noether Networks: meta-learning useful conserved quantities

Efficient hierarchical Bayesian inference for spatio-temporal regression models in neuroimaging

Leveraging Spatial and Temporal Correlations in Sparsified Mean Estimation

Stochastic Anderson Mixing for Nonconvex Stochastic Optimization

NovelD: A Simple yet Effective Exploration Criterion

Fast Multi-Resolution Transformer Fine-tuning for Extreme Multi-label Text Classification

Row-clustering of a Point Process-valued Matrix

Optimal Best-Arm Identification Methods for Tail-Risk Measures

Deep Networks Provably Classify Data on Curves

Global Convergence to Local Minmax Equilibrium in Classes of Nonconvex Zero-Sum Games

A Winning Hand: Compressing Deep Networks Can Improve Out-of-Distribution Robustness

EF21: A New, Simpler, Theoretically Better, and Practically Faster Error Feedback

On the Generative Utility of Cyclic Conditionals

CAFE: Catastrophic Data Leakage in Vertical Federated Learning

Topological Detection of Trojaned Neural Networks

Proxy-Normalizing Activations to Match Batch Normalization while Removing Batch Dependence

Be Confident! Towards Trustworthy Graph Neural Networks via Confidence Calibration

The Causal-Neural Connection: Expressiveness, Learnability, and Inference

SEAL: Self-supervised Embodied Active Learning using Exploration and 3D Consistency

Redesigning the Transformer Architecture with Insights from Multi-particle Dynamical Systems

Interpolation can hurt robust generalization even when there is no noise

OctField: Hierarchical Implicit Functions for 3D Modeling

Test-Time Personalization with a Transformer for Human Pose Estimation

Dense Keypoints via Multiview Supervision

Functional Variational Inference based on Stochastic Process Generators

Overcoming the Convex Barrier for Simplex Inputs

Look at What I’m Doing: Self-Supervised Spatial Grounding of Narrations in Instructional Videos

CLIP-It! Language-Guided Video Summarization

The Lazy Online Subgradient Algorithm is Universal on Strongly Convex Domains

Adversarial Robustness without Adversarial Training: A Teacher-Guided Curriculum Learning Approach

An Exact Characterization of the Generalization Error for the Gibbs Algorithm

Evaluating model performance under worst-case subpopulations

DP-SSL: Towards Robust Semi-supervised Learning with A Few Labeled Samples

Risk-averse Heteroscedastic Bayesian Optimization

Mining the Benefits of Two-stage and One-stage HOI Detection

Bias Out-of-the-Box: An Empirical Analysis of Intersectional Occupational Biases in Popular Generative Language Models

Learning Equilibria in Matching Markets from Bandit Feedback

Improving black-box optimization in VAE latent space using decoder uncertainty

On the Convergence of Prior-Guided Zeroth-Order Optimization Algorithms

Validating the Lottery Ticket Hypothesis with Inertial Manifold Theory

TöRF: Time-of-Flight Radiance Fields for Dynamic Scene View Synthesis

Sample Complexity of Tree Search Configuration: Cutting Planes and Beyond

Evidential Softmax for Sparse Multimodal Distributions in Deep Generative Models

Adapting to function difficulty and growth conditions in private optimization

Combining Recurrent, Convolutional, and Continuous-time Models with Linear State Space Layers

Leveraging SE(3) Equivariance for Self-supervised Category-Level Object Pose Estimation from Point Clouds

AutoBalance: Optimized Loss Functions for Imbalanced Data

Gradient Descent on Two-layer Nets: Margin Maximization and Simplicity Bias

TransMatcher: Deep Image Matching Through Transformers for Generalizable Person Re-identification

You Only Look at One Sequence: Rethinking Transformer in Vision through Object Detection

Towards Efficient and Effective Adversarial Training

Neural Dubber: Dubbing for Videos According to Scripts

Revealing and Protecting Labels in Distributed Training

Drop-DTW: Aligning Common Signal Between Sequences While Dropping Outliers

Preconditioned Gradient Descent for Over-Parameterized Nonconvex Matrix Factorization

Scaling Up Exact Neural Network Compression by ReLU Stability

Revisiting 3D Object Detection From an Egocentric Perspective

Learning Debiased Representation via Disentangled Feature Augmentation

ImageBART: Bidirectional Context with Multinomial Diffusion for Autoregressive Image Synthesis

SQALER: Scaling Question Answering by Decoupling Multi-Hop and Logical Reasoning

Bridging Offline Reinforcement Learning and Imitation Learning: A Tale of Pessimism

Object-Centric Representation Learning with Generative Spatial-Temporal Factorization

Post-processing for Individual Fairness

Linear and Kernel Classification in the Streaming Model: Improved Bounds for Heavy Hitters

Bridging the Imitation Gap by Adaptive Insubordination

Particle Dual Averaging: Optimization of Mean Field Neural Network with Global Convergence Rate Analysis

Gradient-based Editing of Memory Examples for Online Task-free Continual Learning

Last-iterate Convergence in Extensive-Form Games

GENESIS-V2: Inferring Unordered Object Representations without Iterative Refinement

On Blame Attribution for Accountable Multi-Agent Sequential Decision Making

Locally private online change point detection

A Causal Lens for Controllable Text Generation

Unsupervised Part Discovery from Contrastive Reconstruction

PlayVirtual: Augmenting Cycle-Consistent Virtual Trajectories for Reinforcement Learning

Beyond Tikhonov: faster learning with self-concordant losses, via iterative regularization

Combinatorial Optimization for Panoptic Segmentation: A Fully Differentiable Approach

Topology-Imbalance Learning for Semi-Supervised Node Classification

FACMAC: Factored Multi-Agent Centralised Policy Gradients

Exploring Cross-Video and Cross-Modality Signals for Weakly-Supervised Audio-Visual Video Parsing

Benign Overfitting in Multiclass Classification: All Roads Lead to Interpolation

Continuous vs. Discrete Optimization of Deep Neural Networks

Post-Training Quantization for Vision Transformer

Edge Representation Learning with Hypergraphs

SILG: The Multi-domain Symbolic Interactive Language Grounding Benchmark

Conditional Generation Using Polynomial Expansions

Model-Based Episodic Memory Induces Dynamic Hybrid Controls

Property-Aware Relation Networks for Few-Shot Molecular Property Prediction

Deep Learning Through the Lens of Example Difficulty

Understanding Bandits with Graph Feedback

Deep Jump Learning for Off-Policy Evaluation in Continuous Treatment Settings

An Infinite-Feature Extension for Bayesian ReLU Nets That Fixes Their Asymptotic Overconfidence

Sparse Steerable Convolutions: An Efficient Learning of SE(3)-Equivariant Features for Estimation and Tracking of Object Poses in 3D Space

Believe What You See: Implicit Constraint Approach for Offline Multi-Agent Reinforcement Learning

Online Adaptation to Label Distribution Shift

Sample Selection for Fair and Robust Training

Integrating Tree Path in Transformer for Code Representation

Beta-CROWN: Efficient Bound Propagation with Per-neuron Split Constraints for Neural Network Robustness Verification

VigDet: Knowledge Informed Neural Temporal Point Process for Coordination Detection on Social Media

Prototypical Cross-Attention Networks for Multiple Object Tracking and Segmentation

Integrating Expert ODEs into Neural ODEs: Pharmacology and Disease Progression

A Non-commutative Extension of Lee-Seung's Algorithm for Positive Semidefinite Factorizations

The Hardness Analysis of Thompson Sampling for Combinatorial Semi-bandits with Greedy Oracle

Differentially Private Federated Bayesian Optimization with Distributed Exploration

SAPE: Spatially-Adaptive Progressive Encoding for Neural Optimization

Deep Conditional Gaussian Mixture Model for Constrained Clustering

EditGAN: High-Precision Semantic Image Editing

Neural Analysis and Synthesis: Reconstructing Speech from Self-Supervised Representations

VoiceMixer: Adversarial Voice Style Mixup

BCORLE($\lambda$): An Offline Reinforcement Learning and Evaluation Framework for Coupons Allocation in E-commerce Market

Unsupervised Object-Based Transition Models For 3D Partially Observable Environments

Learning Graph Models for Retrosynthesis Prediction

On Success and Simplicity: A Second Look at Transferable Targeted Attacks

Variational Model Inversion Attacks

A Computationally Efficient Method for Learning Exponential Family Distributions

Streaming Belief Propagation for Community Detection

Learned Robust PCA: A Scalable Deep Unfolding Approach for High-Dimensional Outlier Detection

Learning Generalized Gumbel-max Causal Mechanisms

A PAC-Bayes Analysis of Adversarial Robustness

QuPeD: Quantized Personalization via Distillation with Applications to Federated Learning

Recursive Bayesian Networks: Generalising and Unifying Probabilistic Context-Free Grammars and Dynamic Bayesian Networks

Improved Regularization and Robustness for Fine-tuning in Neural Networks

Pay Better Attention to Attention: Head Selection in Multilingual and Multi-Domain Sequence Modeling

Dataset Distillation with Infinitely Wide Convolutional Networks

Multi-Person 3D Motion Prediction with Multi-Range Transformers

Efficient Bayesian network structure learning via local Markov boundary search

SADGA: Structure-Aware Dual Graph Aggregation Network for Text-to-SQL

On the Value of Infinite Gradients in Variational Autoencoder Models

Sequential Causal Imitation Learning with Unobserved Confounders

Faster Non-asymptotic Convergence for Double Q-learning

CorticalFlow: A Diffeomorphic Mesh Transformer Network for Cortical Surface Reconstruction

Provably Strict Generalisation Benefit for Invariance in Kernel Methods

Accelerating Quadratic Optimization with Reinforcement Learning

Multi-armed Bandit Requiring Monotone Arm Sequences

Non-asymptotic convergence bounds for Wasserstein approximation using point clouds

Early-stopped neural networks are consistent

Class-agnostic Reconstruction of Dynamic Objects from Videos

Meta-Adaptive Nonlinear Control: Theory and Algorithms

A nonparametric method for gradual change problems with statistical guarantees

Analyzing the Generalization Capability of SGLD Using Properties of Gaussian Channels

An Uncertainty Principle is a Price of Privacy-Preserving Microdata

Robust Inverse Reinforcement Learning under Transition Dynamics Mismatch

Neural Routing by Memory

Improved Regret Bounds for Tracking Experts with Memory

Adversarial Attacks on Graph Classifiers via Bayesian Optimisation

Learning where to learn: Gradient sparsity in meta and continual learning

Fuzzy Clustering with Similarity Queries

User-Level Differentially Private Learning via Correlated Sampling

Learning to Generate Visual Questions with Noisy Supervision

Scaling Vision with Sparse Mixture of Experts

What training reveals about neural network complexity

Dimensionality Reduction for Wasserstein Barycenter

Gradient Starvation: A Learning Proclivity in Neural Networks

Reusing Combinatorial Structure: Faster Iterative Projections over Submodular Base Polytopes

Reinforcement Learning in Linear MDPs: Constant Regret and Representation Selection

Dynamic Resolution Network

Probabilistic Forecasting: A Level-Set Approach

Learnable Fourier Features for Multi-dimensional Spatial Positional Encoding

Pipeline Combinators for Gradual AutoML

Play to Grade: Testing Coding Games as Classifying Markov Decision Process

Leveraging Recursive Gumbel-Max Trick for Approximate Inference in Combinatorial Spaces

Collaborating with Humans without Human Data

Constrained Two-step Look-Ahead Bayesian Optimization

A Stochastic Newton Algorithm for Distributed Convex Optimization

Evaluation of Human-AI Teams for Learned and Rule-Based Agents in Hanabi

Adversarial Feature Desensitization

Shared Independent Component Analysis for Multi-Subject Neuroimaging

Nested Variational Inference

Generalized Depthwise-Separable Convolutions for Adversarially Robust and Efficient Neural Networks

Risk Minimization from Adaptively Collected Data: Guarantees for Supervised and Policy Learning

Pooling by Sliced-Wasserstein Embedding

Exploiting Local Convergence of Quasi-Newton Methods Globally: Adaptive Sample Size Approach

Meta-learning to Improve Pre-training

Stylized Dialogue Generation with Multi-Pass Dual Learning

Uncertainty-Based Offline Reinforcement Learning with Diversified Q-Ensemble

Optimization-Based Algebraic Multigrid Coarsening Using Reinforcement Learning

Object-aware Contrastive Learning for Debiased Scene Representation

Dynamic Grained Encoder for Vision Transformers

Contrastively Disentangled Sequential Variational Autoencoder

A Surrogate Objective Framework for Prediction+Programming with Soft Constraints

The Adaptive Doubly Robust Estimator and a Paradox Concerning Logging Policy

Robust and Decomposable Average Precision for Image Retrieval

Learning Transferable Adversarial Perturbations

Efficient Online Estimation of Causal Effects by Deciding What to Observe

Pay Attention to MLPs

Robust Learning of Optimal Auctions

Asymptotics of the Bootstrap via Stability with Applications to Inference with Model Selection

Locally differentially private estimation of functionals of discrete distributions

Neural View Synthesis and Matching for Semi-Supervised Few-Shot Learning of 3D Pose

Breaking the Dilemma of Medical Image-to-image Translation

PDE-GCN: Novel Architectures for Graph Neural Networks Motivated by Partial Differential Equations

Machine Learning for Variance Reduction in Online Experiments

Learning with Noisy Correspondence for Cross-modal Matching

Indexed Minimum Empirical Divergence for Unimodal Bandits

Learning Graph Cellular Automata

The Skellam Mechanism for Differentially Private Federated Learning

Logarithmic Regret in Feature-based Dynamic Pricing

SNIPS: Solving Noisy Inverse Problems Stochastically

A Shading-Guided Generative Implicit Model for Shape-Accurate 3D-Aware Image Synthesis

Learning to Compose Visual Relations

Skyformer: Remodel Self-Attention with Gaussian Kernel and Nyström Method

Machine versus Human Attention in Deep Reinforcement Learning Tasks

Analytic Study of Families of Spurious Minima in Two-Layer ReLU Neural Networks: A Tale of Symmetry II

Global Convergence of Gradient Descent for Asymmetric Low-Rank Matrix Factorization

Coupled Segmentation and Edge Learning via Dynamic Graph Propagation

Online learning in MDPs with linear function approximation and bandit feedback

Dual Adaptivity: A Universal Algorithm for Minimizing the Adaptive Regret of Convex Functions

Temporal-attentive Covariance Pooling Networks for Video Recognition

Improving Conditional Coverage via Orthogonal Quantile Regression

Speech-T: Transducer for Text to Speech and Beyond

Machine learning structure preserving brackets for forecasting irreversible processes

TransformerFusion: Monocular RGB Scene Reconstruction using Transformers

Group Equivariant Subsampling

GRIN: Generative Relation and Intention Network for Multi-agent Trajectory Prediction

Tree in Tree: from Decision Trees to Decision Graphs

Generalized Proximal Policy Optimization with Sample Reuse

DECAF: Generating Fair Synthetic Data Using Causally-Aware Generative Networks

Diverse Message Passing for Attribute with Heterophily

Matching a Desired Causal State via Shift Interventions

Learning to Assimilate in Chaotic Dynamical Systems

Independent mechanism analysis, a new concept?

Representation Costs of Linear Neural Networks: Analysis and Design

Active Learning of Convex Halfspaces on Graphs

Environment Generation for Zero-Shot Compositional Reinforcement Learning

ScaleCert: Scalable Certified Defense against Adversarial Patches with Sparse Superficial Layers

Grounding inductive biases in natural images: invariance stems from variations in data

Efficient Statistical Assessment of Neural Network Corruption Robustness

Understanding Negative Samples in Instance Discriminative Self-supervised Representation Learning

argmax centroid

Rethinking the Variational Interpretation of Accelerated Optimization Methods

Adaptive Proximal Gradient Methods for Structured Neural Networks

Decision Transformer: Reinforcement Learning via Sequence Modeling

Scaling Neural Tangent Kernels via Sketching and Random Features

Fast Pure Exploration via Frank-Wolfe

Learning with Algorithmic Supervision via Continuous Relaxations

SLAPS: Self-Supervision Improves Structure Learning for Graph Neural Networks

On the Variance of the Fisher Information for Deep Learning

On the Validity of Modeling SGD with Stochastic Differential Equations (SDEs)

How Should Pre-Trained Language Models Be Fine-Tuned Towards Adversarial Robustness?

Twins: Revisiting the Design of Spatial Attention in Vision Transformers

Auditing Black-Box Prediction Models for Data Minimization Compliance

Regularized Softmax Deep Multi-Agent Q-Learning

BlendGAN: Implicitly GAN Blending for Arbitrary Stylized Face Generation

Task-Agnostic Undesirable Feature Deactivation Using Out-of-Distribution Data

When Expressivity Meets Trainability: Fewer than $n$ Neurons Can Work

Detecting Individual Decision-Making Style: Exploring Behavioral Stylometry in Chess

Light Field Networks: Neural Scene Representations with Single-Evaluation Rendering

Contrastive Graph Poisson Networks: Semi-Supervised Learning with Extremely Limited Labels

Evolution Gym: A Large-Scale Benchmark for Evolving Soft Robots

Invertible Tabular GANs: Killing Two Birds with One Stone for Tabular Data Synthesis

Not All Images are Worth 16x16 Words: Dynamic Transformers for Efficient Image Recognition

Learning Debiased and Disentangled Representations for Semantic Segmentation

Stability and Generalization of Bilevel Programming in Hyperparameter Optimization

A Compositional Atlas of Tractable Circuit Operations for Probabilistic Inference

Tractable Density Estimation on Learned Manifolds with Conformal Embedding Flows

Adversarial Robustness of Streaming Algorithms through Importance Sampling

Video Instance Segmentation using Inter-Frame Communication Transformers

Towards Tight Communication Lower Bounds for Distributed Optimisation

Deep Bandits Show-Off: Simple and Efficient Exploration with Deep Networks

Uncertainty Quantification and Deep Ensembles

BooVI: Provably Efficient Bootstrapped Value Iteration

A Framework to Learn with Interpretation

Learning Tree Interpretation from Object Representation for Deep Reinforcement Learning

Align before Fuse: Vision and Language Representation Learning with Momentum Distillation

Improving Robustness using Generated Data

Model Selection for Bayesian Autoencoders

Generalization Error Rates in Kernel Regression: The Crossover from the Noiseless to Noisy Regime

On Path Integration of Grid Cells: Group Representation and Isotropic Scaling

Unfolding Taylor's Approximations for Image Restoration

Towards Lower Bounds on the Depth of ReLU Neural Networks

Deep Self-Dissimilarities as Powerful Visual Fingerprints

UFC-BERT: Unifying Multi-Modal Controls for Conditional Image Synthesis

Conformal Prediction using Conditional Histograms

D2C: Diffusion-Decoding Models for Few-Shot Conditional Generation

Distributed Machine Learning with Sparse Heterogeneous Data

Associative Memories via Predictive Coding

Adaptive Data Augmentation on Temporal Graphs

Novel Visual Category Discovery with Dual Ranking Statistics and Mutual Knowledge Distillation

Finding Optimal Tangent Points for Reducing Distortions of Hard-label Attacks

SyMetric: Measuring the Quality of Learnt Hamiltonian Dynamics Inferred from Vision

Learning to Iteratively Solve Routing Problems with Dual-Aspect Collaborative Transformer

Meta-Learning Reliable Priors in the Function Space

Neural Ensemble Search for Uncertainty Estimation and Dataset Shift

Federated-EM with heterogeneity mitigation and variance reduction

Recovering Latent Causal Factor for Generalization to Distributional Shifts

Dual-stream Network for Visual Recognition

Shape As Points: A Differentiable Poisson Solver

Spatio-Temporal Variational Gaussian Processes

Smoothness Matrices Beat Smoothness Constants: Better Communication Compression Techniques for Distributed Optimization

Long-Short Transformer: Efficient Transformers for Language and Vision

Momentum Centering and Asynchronous Update for Adaptive Gradient Methods

Unadversarial Examples: Designing Objects for Robust Vision

Reward is enough for convex MDPs

Dangers of Bayesian Model Averaging under Covariate Shift

Differentially Private Sampling from Distributions

Provably efficient, succinct, and precise explanations

Storchastic: A Framework for General Stochastic Automatic Differentiation

Differentially Private Multi-Armed Bandits in the Shuffle Model

Program Synthesis Guided Reinforcement Learning for Partially Observed Environments

Offline Constrained Multi-Objective Reinforcement Learning via Pessimistic Dual Value Iteration

Hessian Eigenspectra of More Realistic Nonlinear Models

Motif-based Graph Self-Supervised Learning for Molecular Property Prediction

Causal-BALD: Deep Bayesian Active Learning of Outcomes to Infer Treatment-Effects from Observational Data

Subgraph Federated Learning with Missing Neighbor Generation

Learning Policies with Zero or Bounded Constraint Violation for Constrained MDPs

Provably Efficient Causal Reinforcement Learning with Confounded Observational Data

Simple steps are all you need: Frank-Wolfe and generalized self-concordant functions

A Kernel-based Test of Independence for Cluster-correlated Data

Unique sparse decomposition of low rank matrices

Data Augmentation Can Improve Robustness

Fair Sequential Selection Using Supervised Learning Models

Tuning Mixed Input Hyperparameters on the Fly for Efficient Population Based AutoRL

Escaping Saddle Points with Compressed SGD

The Difficulty of Passive Learning in Deep Reinforcement Learning

Multilingual Pre-training with Universal Dependency Learning

Self-Supervised Bug Detection and Repair

Neural Trees for Learning on Graphs

When Is Generalizable Reinforcement Learning Tractable?

On the Representation of Solutions to Elliptic PDEs in Barron Spaces

Towards optimally abstaining from prediction with OOD test examples

Precise characterization of the prior predictive distribution of deep ReLU networks

Random Shuffling Beats SGD Only After Many Epochs on Ill-Conditioned Problems

Looking Beyond Single Images for Contrastive Semantic Segmentation Learning

Rethinking Neural Operations for Diverse Tasks

Training Neural Networks is ER-complete

ErrorCompensatedX: error compensation for variance reduced algorithms

Densely connected normalizing flows

Collaborative Causal Discovery with Atomic Interventions

An Even More Optimal Stochastic Optimization Algorithm: Minibatching and Interpolation Learning

SOPE: Spectrum of Off-Policy Estimators

Learning with User-Level Privacy

Neural Tangent Kernel Maximum Mean Discrepancy

Estimating the Long-Term Effects of Novel Treatments

Autoformer: Decomposition Transformers with Auto-Correlation for Long-Term Series Forecasting

Bandit Quickest Changepoint Detection

OpenMatch: Open-Set Semi-supervised Learning with Open-set Consistency Regularization

How Well do Feature Visualizations Support Causal Understanding of CNN Activations?

Margin-Independent Online Multiclass Learning via Convex Geometry

Does enforcing fairness mitigate biases caused by subpopulation shift?

Batch Active Learning at Scale

Variational Bayesian Optimistic Sampling

Mind the Gap: Assessing Temporal Generalization in Neural Language Models

Automated Dynamic Mechanism Design

On the Suboptimality of Thompson Sampling in High Dimensions

Interventional Sum-Product Networks: Causal Inference with Tractable Probabilistic Models

Deep Neural Networks as Point Estimates for Deep Gaussian Processes

Learning Treatment Effects in Panels with General Intervention Patterns

PiRank: Scalable Learning To Rank via Differentiable Sorting

Ranking Policy Decisions

Local Disentanglement in Variational Auto-Encoders Using Jacobian $L_1$ Regularization

CoAtNet: Marrying Convolution and Attention for All Data Sizes

Multiple Descent: Design Your Own Generalization Curve

Generating High-Quality Explanations for Navigation in Partially-Revealed Environments

Solving Soft Clustering Ensemble via $k$-Sparse Discrete Wasserstein Barycenter

Learning Models for Actionable Recourse

A variational approximate posterior for the deep Wishart process

Bayesian decision-making under misspecified priors with applications to meta-learning

Infinite Time Horizon Safety of Bayesian Neural Networks

Network-to-Network Regularization: Enforcing Occam's Razor to Improve Generalization

Pretraining Representations for Data-Efficient Reinforcement Learning

Domain Adaptation with Invariant Representation Learning: What Transformations to Learn?

BayesIMP: Uncertainty Quantification for Causal Data Fusion

Self-Interpretable Model with Transformation Equivariant Interpretation

Generalization Bounds for Meta-Learning via PAC-Bayes and Uniform Stability

Roto-translated Local Coordinate Frames For Interacting Dynamical Systems

Distributed Zero-Order Optimization under Adversarial Noise

Scalable Inference in SDEs by Direct Matching of the Fokker–Planck–Kolmogorov Equation

Parallelizing Thompson Sampling

Differential Privacy Over Riemannian Manifolds

GradInit: Learning to Initialize Neural Networks for Stable and Efficient Training

Sliced Mutual Information: A Scalable Measure of Statistical Dependence

Smooth Bilevel Programming for Sparse Regularization

Hamiltonian Dynamics with Non-Newtonian Momentum for Rapid Sampling

Variance-Aware Off-Policy Evaluation with Linear Function Approximation

On the Representation Power of Set Pooling Networks

Dimension-free empirical entropy estimation

Geometry Processing with Neural Fields

Provably efficient multi-task reinforcement learning with model transfer

DominoSearch: Find layer-wise fine-grained N:M sparse schemes from dense neural networks

Deep Synoptic Monte-Carlo Planning in Reconnaissance Blind Chess

Attention over Learned Object Embeddings Enables Complex Visual Reasoning

Unbalanced Optimal Transport through Non-negative Penalized Linear Regression

Closing the Gap: Tighter Analysis of Alternating Stochastic Gradient Methods for Bilevel Problems

A Topological Perspective on Causal Inference

Shifted Chunk Transformer for Spatio-Temporal Representational Learning

Distilling Meta Knowledge on Heterogeneous Graph for Illicit Drug Trafficker Detection on Social Media

Continuous-time edge modelling using non-parametric point processes

Argmax Flows and Multinomial Diffusion: Learning Categorical Distributions

Intriguing Properties of Vision Transformers

Arbitrary Conditional Distributions with Energy

Why Do Pretrained Language Models Help in Downstream Tasks? An Analysis of Head and Prompt Tuning

UCB-based Algorithms for Multinomial Logistic Regression Bandits

BooVAE: Boosting Approach for Continual Learning of VAE

Conditionally Parameterized, Discretization-Aware Neural Networks for Mesh-Based Modeling of Physical Systems

Why Spectral Normalization Stabilizes GANs: Analysis and Improvements

Rebounding Bandits for Modeling Satiation Effects

Efficient methods for Gaussian Markov random fields under sparse linear constraints

Physics-Integrated Variational Autoencoders for Robust and Interpretable Generative Modeling

Adversarially Robust 3D Point Cloud Recognition Using Self-Supervisions

Revisiting the Calibration of Modern Neural Networks

Understanding the Limits of Unsupervised Domain Adaptation via Data Poisoning

Formalizing Generalization and Adversarial Robustness of Neural Networks to Weight Perturbations

Generalized Jensen-Shannon Divergence Loss for Learning with Noisy Labels

Accommodating Picky Customers: Regret Bound and Exploration Complexity for Multi-Objective Reinforcement Learning

NN-Baker: A Neural-network Infused Algorithmic Framework for Optimization Problems on Geometric Intersection Graphs

Bootstrapping the Error of Oja's Algorithm

Towards Stable and Robust AdderNets

Probability Paths and the Structure of Predictions over Time

Brick-by-Brick: Combinatorial Construction with Deep Reinforcement Learning

Global Convergence of Online Optimization for Nonlinear Model Predictive Control

ProTo: Program-Guided Transformer for Program-Guided Tasks

Oracle-Efficient Regret Minimization in Factored MDPs with Unknown Structure

Observation-Free Attacks on Stochastic Bandits

Contrastive Learning of Global and Local Video Representations

A Theoretical Analysis of Fine-tuning with Linear Teachers

On Training Implicit Models

Implicit Bias of SGD for Diagonal Linear Networks: a Provable Benefit of Stochasticity

Reconstruction for Powerful Graph Representations

Deep Molecular Representation Learning via Fusing Physical and Chemical Information

Implicit Generative Copulas

Automatic Data Augmentation for Generalization in Reinforcement Learning

Local Hyper-Flow Diffusion

Analysis of Sensing Spectral for Signal Recovery under a Generalized Linear Model

Differentially Private Empirical Risk Minimization under the Fairness Lens

Adversarial Neuron Pruning Purifies Backdoored Deep Models

Large Scale Learning on Non-Homophilous Graphs: New Benchmarks and Strong Simple Methods

PLUR: A Unifying, Graph-Based View of Program Learning, Understanding, and Repair

Adversarial Intrinsic Motivation for Reinforcement Learning

Embedding Principle of Loss Landscape of Deep Neural Networks

Progressive Feature Interaction Search for Deep Sparse Network

Towards Multi-Grained Explainability for Graph Neural Networks

Multi-task Learning of Order-Consistent Causal Graphs

Sequence-to-Sequence Learning with Latent Neural Grammars

Causal Identification with Matrix Equations

Compressed Video Contrastive Learning

Low-Rank Subspaces in GANs

Accelerated Sparse Neural Training: A Provable and Efficient Method to Find N:M Transposable Masks

Differentiable rendering with perturbed optimizers

iFlow: Numerically Invertible Flows for Efficient Lossless Compression via a Uniform Coder

Controlled Text Generation as Continuous Optimization with Multiple Constraints

Dynamic Analysis of Higher-Order Coordination in Neuronal Assemblies via De-Sparsified Orthogonal Matching Pursuit

Best of Both Worlds: Practical and Theoretically Optimal Submodular Maximization in Parallel

Individual Privacy Accounting via a Rényi Filter

Improving Contrastive Learning on Imbalanced Data via Open-World Sampling

A Comprehensively Tight Analysis of Gradient Descent for PCA

CCVS: Context-aware Controllable Video Synthesis

Adaptive Ensemble Q-learning: Minimizing Estimation Bias via Error Feedback

Multi-Scale Representation Learning on Proteins

Exploring the Limits of Out-of-Distribution Detection

The best of both worlds: stochastic and adversarial episodic MDPs with unknown transition

The Value of Information When Deciding What to Learn

Minimax Regret for Stochastic Shortest Path

Tensor Normal Training for Deep Learning Models

Fair Algorithms for Multi-Agent Multi-Armed Bandits

Nested Graph Neural Networks

General Low-rank Matrix Optimization: Geometric Analysis and Sharper Bounds

Variational Bayesian Reinforcement Learning with Regret Bounds

A Gradient Method for Multilevel Optimization

A universal probabilistic spike count model reveals ongoing modulation of neural variability

Shape Registration in the Time of Transformers

Towards Instance-Optimal Offline Reinforcement Learning with Pessimism

Optimality of variational inference for stochastic block model with missing links

Dynamic Trace Estimation

Zero Time Waste: Recycling Predictions in Early Exit Neural Networks

Revisiting Discriminator in GAN Compression: A Generator-discriminator Cooperative Compression Scheme

Learning Student-Friendly Teacher Networks for Knowledge Distillation

Towards Best-of-All-Worlds Online Learning with Feedback Graphs

A$^2$-Net: Learning Attribute-Aware Hash Codes for Large-Scale Fine-Grained Image Retrieval

Progressive Coordinate Transforms for Monocular 3D Object Detection

Neural Human Performer: Learning Generalizable Radiance Fields for Human Performance Rendering

Learning and Generalization in RNNs

Counterfactual Explanations Can Be Manipulated

Scheduling jobs with stochastic holding costs

On the Value of Interaction and Function Approximation in Imitation Learning

Nonparametric estimation of continuous DPPs with kernel methods

Learning Disentangled Behavior Embeddings

Topic Modeling Revisited: A Document Graph-based Neural Network Perspective

Dueling Bandits with Adversarial Sleeping

Inverse-Weighted Survival Games

Identifiability in inverse reinforcement learning

Modular Gaussian Processes for Transfer Learning

Faster proximal algorithms for matrix optimization using Jacobi-based eigenvalue methods

Neural Relightable Participating Media Rendering

Time-series Generation by Contrastive Imitation

Exploiting Opponents Under Utility Constraints in Sequential Games

Model-Based Domain Generalization

The Elastic Lottery Ticket Hypothesis

Hybrid Regret Bounds for Combinatorial Semi-Bandits and Adversarial Linear Bandits

Learning Optimal Predictive Checklists

Learning Markov State Abstractions for Deep Reinforcement Learning

Learning to Elect

Projected GANs Converge Faster

Certifying Robustness to Programmable Data Bias in Decision Trees

M-FAC: Efficient Matrix-Free Approximations of Second-Order Information

Bellman Eluder Dimension: New Rich Classes of RL Problems, and Sample-Efficient Algorithms

Transformer in Transformer

Neural Scene Flow Prior

MAUVE: Measuring the Gap Between Neural Text and Human Text using Divergence Frontiers

Neural Rule-Execution Tracking Machine For Transformer-Based Text Generation

Dynamics-regulated kinematic policy for egocentric pose estimation

TransGAN: Two Pure Transformers Can Make One Strong GAN, and That Can Scale Up

A/B/n Testing with Control in the Presence of Subpopulations

EvoGrad: Efficient Gradient-Based Meta-Learning and Hyperparameter Optimization

Baby Intuitions Benchmark (BIB): Discerning the goals, preferences, and actions of others

Introspective Distillation for Robust Question Answering

Bandit Learning with Delayed Impact of Actions

DynamicViT: Efficient Vision Transformers with Dynamic Token Sparsification

Heavy Ball Neural Ordinary Differential Equations

Recurrent Submodular Welfare and Matroid Blocking Semi-Bandits

Online Learning Of Neural Computations From Sparse Temporal Feedback

PerSim: Data-Efficient Offline Reinforcement Learning with Heterogeneous Agents via Personalized Simulators

Testing Probabilistic Circuits

Aligning Pretraining for Detection via Object-Level Contrastive Learning

Perturbation Theory for the Information Bottleneck

Equilibrium Refinement for the Age of Machines: The One-Sided Quasi-Perfect Equilibrium

DRONE: Data-aware Low-rank Compression for Large NLP Models

Pseudo-Spherical Contrastive Divergence

How Fine-Tuning Allows for Effective Meta-Learning

Learning in Multi-Stage Decentralized Matching Markets

Structured in Space, Randomized in Time: Leveraging Dropout in RNNs for Efficient Training

Cross-view Geo-localization with Layer-to-Layer Transformer

Differential Privacy Dynamics of Langevin Diffusion and Noisy Gradient Descent

Flattening Sharpness for Dynamic Gradient Projection Memory Benefits Continual Learning

FINE Samples for Learning with Noisy Labels

Distributionally Robust Imitation Learning

Probabilistic Tensor Decomposition of Neural Population Spiking Activity

Change Point Detection via Multivariate Singular Spectrum Analysis

Mixability made efficient: Fast online multiclass logistic regression

Does Knowledge Distillation Really Work?

Risk Bounds and Calibration for a Smart Predict-then-Optimize Method