Poster
Learning to See Physics via Visual De-animation
Jiajun Wu · Erika Lu · Pushmeet Kohli · Bill Freeman · Josh Tenenbaum

Wed Dec 6th 06:30 -- 10:30 PM @ Pacific Ballroom #145 #None

We introduce a paradigm for understanding physical scenes without human annotations. At the core of our system is a physical world representation that is first recovered by a perception module and then utilized by physics and graphics engines. During training, the perception module and the generative models learn by visual de-animation --- interpreting and reconstructing the visual information stream. During testing, the system first recovers the physical world state, and then uses the generative models for reasoning and future prediction. Even more so than forward simulation, inverting a physics or graphics engine is a computationally hard problem; we overcome this challenge by using a convolutional inversion network. Our system quickly recognizes the physical world state from appearance and motion cues, and has the flexibility to incorporate both differentiable and non-differentiable physics and graphics engines. We evaluate our system on both synthetic and real datasets involving multiple physical scenes, and demonstrate that our system performs well on both physical state estimation and reasoning problems. We further show that the knowledge learned on the synthetic dataset generalizes to constrained real images.

Author Information

Jiajun Wu (MIT)

Jiajun Wu is a fifth-year Ph.D. student at Massachusetts Institute of Technology, advised by Professor Bill Freeman and Professor Josh Tenenbaum. His research interests lie on the intersection of computer vision, machine learning, and computational cognitive science. Before coming to MIT, he received his B.Eng. from Tsinghua University, China, advised by Professor Zhuowen Tu. He has also spent time working at research labs of Microsoft, Facebook, and Baidu.

Erika Lu (University of Oxford)
Pushmeet Kohli (DeepMind)
Bill Freeman (MIT/Google)
Josh Tenenbaum (MIT)

Josh Tenenbaum is an Associate Professor of Computational Cognitive Science at MIT in the Department of Brain and Cognitive Sciences and the Computer Science and Artificial Intelligence Laboratory (CSAIL). He received his PhD from MIT in 1999, and was an Assistant Professor at Stanford University from 1999 to 2002. He studies learning and inference in humans and machines, with the twin goals of understanding human intelligence in computational terms and bringing computers closer to human capacities. He focuses on problems of inductive generalization from limited data -- learning concepts and word meanings, inferring causal relations or goals -- and learning abstract knowledge that supports these inductive leaps in the form of probabilistic generative models or 'intuitive theories'. He has also developed several novel machine learning methods inspired by human learning and perception, most notably Isomap, an approach to unsupervised learning of nonlinear manifolds in high-dimensional data. He has been Associate Editor for the journal Cognitive Science, has been active on program committees for the CogSci and NIPS conferences, and has co-organized a number of workshops, tutorials and summer schools in human and machine learning. Several of his papers have received outstanding paper awards or best student paper awards at the IEEE Computer Vision and Pattern Recognition (CVPR), NIPS, and Cognitive Science conferences. He is the recipient of the New Investigator Award from the Society for Mathematical Psychology (2005), the Early Investigator Award from the Society of Experimental Psychologists (2007), and the Distinguished Scientific Award for Early Career Contribution to Psychology (in the area of cognition and human learning) from the American Psychological Association (2008).

Related Events (a corresponding poster, oral, or spotlight)

More from the Same Authors