Skip to yearly menu bar Skip to main content


Poster

OPS: Occupancy Prediction as A Sparse Set

JiaBao Wang · Zhaojiang Liu · Qiang Meng · Liujiang Yan · Ke Wang · JIE YANG · Wei Liu · Qibin Hou · Ming-Ming Cheng

East Exhibit Hall A-C #1409
[ ] [ Project Page ]
Fri 13 Dec 11 a.m. PST — 2 p.m. PST

Abstract:

Occupancy prediction, aiming at predicting the occupancy status within voxelized 3D environment, is quickly gaining momentum within the autonomous driving community. Mainstream occupancy prediction work first discretize the 3D environment into voxels, then perform classification on such dense grids. However, inspection on sample data reveals that the vast majority of voxels is unoccupied. Performing classification on these empty voxels demands suboptimal computation resource allocation, and reducing such empty voxels necessitates complex algorithm designs. To this end, we present a novel perspective on the occupancy prediction task: formulating it as a streamlined set prediction paradigm without the need for explicit space modeling or complex sparsification procedures. Our proposed framework, called OPS, utilizes a transformer encoder-decoder architecture to simultaneously predict occupied locations and classes using a set of learnable queries. Firstly, we employ the Chamfer distance loss to scale the set-to-set comparison problem to unprecedented magnitudes, making training such model end-to-end a reality. Subsequently, semantic classes are adaptively assigned using nearest neighbor search based on the learned locations. In addition, OPS incorporates a suite of non-trivial strategies to enhance model performances, including coarse-to-fine learning, consistent point sampling, and adaptive re-weighting, etc. Finally, compared with current state-of-the-art methods, our lightest model achieves superior RayIoU on the Occ3D-nuScenes dataset at over 2x FPS, while our heaviest model surpasses previous best results by 4.9 RayIoU. Code will be made public.

Live content is unavailable. Log in and register to view live content