Learned region sparsity has achieved state-of-the-art performance in classification tasks by exploiting and integrating a sparse set of local information into global decisions. The underlying mechanism resembles how people sample information from an image with their eye movements when making similar decisions. In this paper we incorporate the biologically plausible mechanism of Inhibition of Return into the learned region sparsity model, thereby imposing diversity on the selected regions. We investigate how these mechanisms of sparsity and diversity relate to visual attention by testing our model on three different types of visual search tasks. We report state-of-the-art results in predicting the locations of human gaze fixations, even though our model is trained only on image-level labels without object location annotations. Notably, the classification performance of the extended model remains the same as the original. This work suggests a new computational perspective on visual attention mechanisms and shows how the inclusion of attention-based mechanisms can improve computer vision techniques.
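As a rough illustration of how Inhibition of Return can impose diversity on a sparse set of selected regions, the sketch below greedily picks the highest-scoring location on a score map and then blocks its neighbourhood from being picked again. This is not the paper's formulation: the function name `select_regions`, the square suppression window, and the parameters `k` and `radius` are all assumptions made for the example.

```python
def select_regions(score_map, k=3, radius=1):
    """Greedily pick up to k locations from a 2D score map.

    Hypothetical sketch: after each pick, every cell within `radius`
    (square window) is inhibited so later picks must fall elsewhere.
    """
    height, width = len(score_map), len(score_map[0])
    inhibited = [[False] * width for _ in range(height)]
    picks = []
    for _ in range(k):
        # Collect all locations that have not been inhibited yet.
        candidates = [
            (score_map[r][c], r, c)
            for r in range(height)
            for c in range(width)
            if not inhibited[r][c]
        ]
        if not candidates:
            break  # the whole map has been inhibited
        _, r, c = max(candidates, key=lambda item: item[0])
        picks.append((r, c))
        # Inhibition of return: block the window around the pick.
        for rr in range(max(r - radius, 0), min(r + radius + 1, height)):
            for cc in range(max(c - radius, 0), min(c + radius + 1, width)):
                inhibited[rr][cc] = True
    return picks


# Toy score map with two strong neighbouring peaks and two weaker ones.
scores = [[0.0] * 5 for _ in range(5)]
scores[1][1], scores[1][2], scores[4][4], scores[3][0] = 0.9, 0.8, 0.7, 0.6

print(select_regions(scores, k=3, radius=1))  # with inhibition
print(select_regions(scores, k=3, radius=0))  # only the picked cell is blocked
```

With `radius=1` the second-strongest peak is skipped because it sits next to the first pick, so the selections spread across the map; with `radius=0` the two adjacent peaks are both chosen.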
Author Information
Zijun Wei (Stony Brook)
I am currently a graduate student in the Department of Computer Science at Stony Brook University. Since fall 2014 I have been working in the Computer Vision Lab under the supervision of Prof. Dimitris Samaras, Prof. Minh Hoai and Prof. Gregory Zelinsky. Prior to this, I received my master's degree from the Robotics Institute, Carnegie Mellon University, in 2013, advised by Prof. Mel Siegel. I work on research problems in Computer Vision and Machine Learning. I am especially interested in incorporating human visual perception into computer vision, either to boost performance or to enable human-like results. I am also interested in the reverse direction: using computer vision algorithms to model human visual perception systems. I am a recipient of the Renaissance Technologies Fellowship from 2014 to 2017. I worked as a research intern at Adobe Research twice: spring 2017 and summer 2018.
Hossein Adeli (Stony Brook University)
Minh Hoai Nguyen (Stony Brook University)
Greg Zelinsky (Stony Brook University)
Dimitris Samaras (Stony Brook University)
More from the Same Authors
- 2022: Reconstruction-guided attention improves the robustness and shape processing of neural networks
  Seoyoung Ahn · Hossein Adeli · Greg Zelinsky
- 2020 Poster: Distribution Matching for Crowd Counting
  Boyu Wang · Huidong Liu · Dimitris Samaras · Minh Hoai Nguyen
- 2020 Spotlight: Distribution Matching for Crowd Counting
  Boyu Wang · Huidong Liu · Dimitris Samaras · Minh Hoai Nguyen
- 2020 Poster: Detecting Hands and Recognizing Physical Contact in the Wild
  Supreeth Narasimhaswamy · Trung Nguyen · Minh Hoai Nguyen
- 2018 Poster: Sequence-to-Segment Networks for Segment Detection
  Zijun Wei · Boyu Wang · Minh Hoai Nguyen · Jianming Zhang · Zhe Lin · Xiaohui Shen · Radomir Mech · Dimitris Samaras
- 2013 Poster: Modeling Clutter Perception using Parametric Proto-object Partitioning
  Chen-Ping Yu · Wen-yu Hua · Dimitris Samaras · Greg Zelinsky
- 2009 Poster: Sparse and Locally Constant Gaussian Graphical Models
  Jean Honorio · Luis E Ortiz · Dimitris Samaras · Nikos Paragios · Rita Goldstein