One remarkable aspect of human vision is the ability to understand the meaning of a novel image or event quickly and effortlessly. Within a single glance, we can comprehend the semantic category of a place, its spatial layout as well as identify many of the objects that compose the scene. Approaching human abilities at scene understanding is a current challenge for computer vision systems.
The field of scene understanding is multidisciplinary, involving a growing collection of works in computer vision, machine learning, cognitive psychology, and neuroscience. In this tutorial, we will focus on recent work on computer vision attempting to solve the tasks of scene recognition and classification, visual context representation, object recognition in context, drawing parallelism with work in psychology and neuroscience. Devising systems that solve scene and object recognition in an integrated fashion will lead towards more efficient and robust artificial visual understanding systems.