Poster
Interpolation Technique to Speed Up Gradients Propagation in Neural ODEs
Talgat Daulbaev · Alexandr Katrutsa · Larisa Markeeva · Julia Gusak · Andrzej Cichocki · Ivan Oseledets

Tue Dec 08 09:00 AM -- 11:00 AM (PST) @ Poster Session 1 #379

We propose a simple interpolation-based method for the efficient approximation of gradients in neural ODE models. We compare it with the reverse dynamic method (known in the literature as the “adjoint method”) for training neural ODEs on classification, density estimation, and inference approximation tasks. We also provide a theoretical justification of our approach using the logarithmic norm formalism. As a result, our method allows faster model training than the reverse dynamic method, which we confirm in extensive numerical experiments on several standard benchmarks.
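The core idea (interpolating the stored forward trajectory instead of re-solving the state ODE during the backward adjoint pass) can be sketched on a toy one-dimensional problem. The sketch below uses barycentric Lagrange interpolation on a Chebyshev grid in the spirit of the abstract; the linear dynamics, node count, and SciPy-based implementation are illustrative assumptions, not the authors' code.

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.interpolate import BarycentricInterpolator

# Toy problem (hypothetical, for illustration only):
#   dz/dt = f(z, theta) = -theta * z,  z(0) = z0,  loss L = z(T).
# Analytic solution: z(T) = z0 * exp(-theta * T), so dL/dtheta = -T * z0 * exp(-theta * T).
theta, z0, T = 0.7, 1.5, 2.0
f = lambda t, z: -theta * z

# Forward pass: solve the ODE once and store the state at Chebyshev-Lobatto
# nodes, then build a barycentric Lagrange interpolant z(t) from those samples.
n = 16
cheb = 0.5 * T * (1.0 - np.cos(np.pi * np.arange(n) / (n - 1)))
fwd = solve_ivp(f, (0.0, T), [z0], t_eval=cheb, rtol=1e-10, atol=1e-12)
z_interp = BarycentricInterpolator(cheb, fwd.y[0])

# Backward pass: integrate the adjoint a(t) = dL/dz(t) from T to 0.
# For f = -theta*z:  da/dt = -a * df/dz = theta * a,  a(T) = dL/dz(T) = 1.
# The gradient is dL/dtheta = integral_0^T a(t) * df/dtheta dt with df/dtheta = -z(t);
# we accumulate it in g with dg/dt = a * z, so integrating T -> 0 yields
# g(0) = -integral_0^T a*z dt = dL/dtheta. Crucially, z(t) comes from the
# interpolant, so the state ODE is NOT re-solved backward.
def backward(t, y):
    a, _ = y
    return [theta * a, a * z_interp(t)]

bwd = solve_ivp(backward, (T, 0.0), [1.0, 0.0], rtol=1e-10, atol=1e-12)
grad_interp = bwd.y[1, -1]
grad_exact = -T * z0 * np.exp(-theta * T)
print(grad_interp, grad_exact)  # the two values agree closely
```

Chebyshev nodes are a natural choice here because barycentric interpolation on them converges spectrally for smooth trajectories, so a handful of forward-pass snapshots suffices to reproduce the state accurately during the backward integration.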

Author Information

Talgat Daulbaev (Skolkovo Institute of Science and Technology)
Alexandr Katrutsa (Skolkovo Institute of Science and Technology)
Larisa Markeeva (Skolkovo Institute of Science and Technology)
Julia Gusak (Skolkovo Institute of Science and Technology)

Currently, I am a Research Scientist (AI) at the Skolkovo Institute of Science and Technology, in the Tensor Networks and Deep Learning for Applications in Data Mining laboratory, working with Prof. Ivan Oseledets and Prof. Andrzej Cichocki. My recent research deals with the compression and acceleration of computer vision models using tensor methods; improving the training time and performance of neural ordinary differential equations; and analyzing neural networks with low-rank methods. I have also participated in audio-related projects on speech synthesis and voice conversion, and some of my earlier projects involved medical data processing (EEG, ECG). My research interests include, but are not limited to: deep learning (DL), computer vision, speech technologies, multi-modal/multi-task learning, semi-supervised/unsupervised learning, one-/few-/low-shot learning, incremental learning, continual learning, domain adaptation, hypernetworks, tensor decompositions for DL, neural ordinary differential equations, and interpretability of DL.

Andrzej Cichocki (Skolkovo Institute of Science and Technology)
Ivan Oseledets (Skolkovo Institute of Science and Technology)
