In this talk, I will demonstrate how to combine the power of deep neural networks with classical symbolic AI to address challenges in video understanding. I will showcase applications of these methods to two kinds of problems: temporal and causal reasoning in videos, and music generation from videos.