Workshop: Workshop on robustness of zero/few-shot learning in foundation models (R0-FoMo)

SAD: Segment Any RGBD

Jun CEN · Yizheng Wu · Kewei Wang · Xingyi Li · Jingkang Yang · Yixuan Pei · Lingdong Kong · Ziwei Liu · Qifeng Chen


The Segment Anything Model (SAM) has demonstrated its effectiveness in segmenting any part of 2D RGB images, and many SAM-based applications have shown strong performance. However, when segmenting RGB images, SAM emphasizes texture information and pays less attention to geometry information. To address this limitation, we propose the Segment Any RGBD (SAD) model, which is specifically designed to extract geometry information directly from images. Inspired by the natural human ability to identify objects from visualized depth maps, SAD uses SAM to segment the rendered depth map, which provides cues with enhanced geometry information and mitigates over-segmentation. Compared to other SAM-based projects, we are the first to use SAM to segment non-RGB images. We further include open-vocabulary semantic segmentation in our framework to provide a semantic label for each segment.
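
To make the idea of a "rendered depth map" concrete, the following is a minimal sketch of how raw depth values might be turned into an RGB image that an image segmenter could consume. The abstract does not specify how the rendering is done, so the function name `render_depth_map`, the min-max normalization, the two-color ramp, and the treatment of non-positive depths as missing are all assumptions made for illustration. It is written in plain Python to stay dependency-free.

```python
import math


def render_depth_map(depth):
    """Render a 2D depth grid (rows of floats) as 8-bit RGB pixels.

    Assumption: non-positive or non-finite depths mean "no measurement"
    and are drawn black. This is illustrative, not the authors' method.
    """
    valid = [d for row in depth for d in row if math.isfinite(d) and d > 0]
    if not valid:
        return [[(0, 0, 0) for _ in row] for row in depth]

    lo, hi = min(valid), max(valid)
    span = (hi - lo) or 1.0  # avoid division by zero on a flat depth map

    def colorize(d):
        if not (math.isfinite(d) and d > 0):
            return (0, 0, 0)
        t = (d - lo) / span  # 0.0 = nearest valid pixel, 1.0 = farthest
        # Hypothetical two-color ramp: near pixels red, far pixels blue.
        return (round(255 * (1 - t)), 0, round(255 * t))

    return [[colorize(d) for d in row] for row in depth]


# Tiny 2x2 example: depths in arbitrary units, 0.0 marks a missing pixel.
depth = [[1.0, 2.0], [3.0, 0.0]]
rgb = render_depth_map(depth)
```

In a pipeline like the one described, the resulting image would be passed to SAM in place of the RGB photograph. Because color in the rendering varies only with distance, not with surface texture, segment boundaries would tend to follow geometry.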
