Timezone: »

DAGMA: Learning DAGs via M-matrices and a Log-Determinant Acyclicity Characterization
Kevin Bello · Bryon Aragam · Pradeep Ravikumar

Thu Dec 01 02:00 PM -- 04:00 PM (PST) @ Hall J #309
The combinatorial problem of learning directed acyclic graphs (DAGs) from data was recently framed as a purely continuous optimization problem by leveraging a differentiable acyclicity characterization of DAGs based on the trace of a matrix exponential function. Existing acyclicity characterizations are based on the idea that powers of an adjacency matrix contain information about walks and cycles. In this work, we propose a new acyclicity characterization based on the log-determinant (log-det) function, which leverages the nilpotency property of DAGs. To deal with the inherent asymmetries of a DAG, we relate the domain of our log-det characterization to the set of $\textit{M-matrices}$, which is a key difference to the classical log-det function defined over the cone of positive definite matrices.Similar to acyclicity functions previously proposed, our characterization is also exact and differentiable. However, when compared to existing characterizations, our log-det function: (1) Is better at detecting large cycles; (2) Has better-behaved gradients; and (3) Its runtime is in practice about an order of magnitude faster. From the optimization side, we drop the typically used augmented Lagrangian scheme and propose DAGMA ($\textit{Directed Acyclic Graphs via M-matrices for Acyclicity}$), a method that resembles the central path for barrier methods. Each point in the central path of DAGMA is a solution to an unconstrained problem regularized by our log-det function, then we show that at the limit of the central path the solution is guaranteed to be a DAG. Finally, we provide extensive experiments for $\textit{linear}$ and $\textit{nonlinear}$ SEMs and show that our approach can reach large speed-ups and smaller structural Hamming distances against state-of-the-art methods. Code implementing the proposed method is open-source and publicly available at https://github.com/kevinsbello/dagma.

Author Information

Kevin Bello (The University of Chicago, Carnegie Mellon University)
Kevin Bello

Kevin Bello is a CI Fellow and postdoctoral researcher in the Booth School of Business at The University of Chicago and in the Machine Learning Department at Carnegie Mellon University. His research focuses on developing principled algorithms that are computationally and statistically efficient for various machine learning problems. His current interests include structure learning (a.k.a., causal discovery), and learning invariant/causal representations. Previously, Kevin worked in structured prediction, studying efficient learning with latent variables, minimax bounds, and exact inference. Kevin received his PhD in Computer Science from Purdue University, under the supervision of Jean Honorio. Before his PhD, Kevin received a BSc in Mechatronics Engineering from the National University of Engineering in Lima, Peru.

Bryon Aragam (University of Chicago)
Pradeep Ravikumar (Carnegie Mellon University)

More from the Same Authors