Huaiyuan Xu . Junliang Chen . Shiyu Meng . Yi Wang . Lap-Pui Chau*
This work focuses on 3D dense perception in autonomous driving, encompassing LiDAR-Centric Occupancy Perception, Vision-Centric Occupancy Perception, and Multi-Modal Occupancy Perception. Information fusion techniques for this field are also discussed. We believe this is the most comprehensive survey to date on 3D Occupancy Perception. Please stay tuned!😉😉😉
This is an active repository; you can watch it to follow the latest advances. If you find it useful, please kindly star this repo.
✨You are welcome to contribute your work on any topic related to 3D occupancy for autonomous driving (involving not only perception, but also applications)!
If you discover any missing work or have any suggestions, please feel free to submit a pull request or contact us. We will promptly add the missing papers to this repository.
[1] A systematic survey of the latest research on 3D occupancy perception in the field of autonomous driving.
[2] The survey provides a taxonomy of 3D occupancy perception, and elaborates on core methodological issues, including network pipelines, multi-source information fusion, and effective network training.
[3] The survey presents evaluations for 3D occupancy perception, and offers detailed performance comparisons. Furthermore, current limitations and future research directions are discussed.
- [2024-09-03] This survey was accepted by Information Fusion (impact factor: 14.7).
- [2024-07-21] More representative works and benchmarking comparisons have been incorporated, bringing the total to 192 literature references.
- [2024-05-18] More figures have been added to the survey. We have reorganized the occupancy-based applications.
- [2024-05-08] The first version of the survey is available on arXiv. We created this repository.
3D occupancy perception technology aims to observe and understand dense 3D environments for autonomous vehicles. Owing to its comprehensive perception capability, this technology is emerging as a trend in autonomous driving perception systems, and is attracting significant attention from both industry and academia. Similar to traditional bird's-eye view (BEV) perception, 3D occupancy perception shares the nature of multi-source input and the need for information fusion. The difference, however, is that it captures the vertical structures that are ignored by 2D BEV. In this survey, we review the most recent works on 3D occupancy perception, and provide in-depth analyses of methodologies with various input modalities. Specifically, we summarize general network pipelines, highlight information fusion techniques, and discuss effective network training. We evaluate and analyze the occupancy perception performance of state-of-the-art methods on the most popular datasets. Furthermore, challenges and future research directions are discussed. We hope this paper will inspire the community and encourage more research work on 3D occupancy perception.
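To make the underlying representation concrete, here is a minimal, hypothetical sketch (not from the survey) of a semantic occupancy grid: the scene is a dense 3D voxel array holding one class label per voxel, and labeled 3D points (e.g., from LiDAR) are voxelized into it. All parameters (voxel size, grid extent, class IDs) are illustrative assumptions.

```python
import numpy as np

# Hypothetical parameters: 0.5 m voxels covering x, y in [-8, 8) m and z in [-2, 2) m.
VOXEL_SIZE = 0.5
ORIGIN = np.array([-8.0, -8.0, -2.0])  # minimum corner of the grid (meters)
GRID_SHAPE = (32, 32, 8)               # number of voxels along (X, Y, Z)
FREE = 0                               # label 0 = free (unoccupied) space

def voxelize(points, labels):
    """Assign each labeled 3D point to a voxel; later points overwrite earlier ones."""
    grid = np.full(GRID_SHAPE, FREE, dtype=np.int64)
    idx = np.floor((points - ORIGIN) / VOXEL_SIZE).astype(int)
    # Keep only points whose voxel indices fall inside the grid bounds.
    inside = np.all((idx >= 0) & (idx < np.array(GRID_SHAPE)), axis=1)
    idx, labels = idx[inside], labels[inside]
    grid[idx[:, 0], idx[:, 1], idx[:, 2]] = labels
    return grid

# Three toy points with made-up semantic labels (1 = "car", 2 = "vegetation").
points = np.array([[1.2, 0.3, -0.5],   # "car" point
                   [1.3, 0.4, -0.4],   # lands in the same voxel as the first
                   [-3.0, 5.0, 0.1]])  # "vegetation" point
labels = np.array([1, 1, 2])

occ = voxelize(points, labels)
print((occ != FREE).sum())  # → 2 occupied voxels
```

A full occupancy perception network predicts such a grid (occupied/free, plus a semantic class per occupied voxel) directly from camera, LiDAR, and/or radar inputs rather than from already-labeled points.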
- Introduction
- Summary of Contents
- Methods: A Survey
- 3D Occupancy Datasets
- Occupancy-based Applications
- Cite The Survey
- Contact
| Year | Venue | Paper Title | Link |
|---|---|---|---|
| 2025 | arXiv | 4D-ROLLS: 4D Radar Occupancy Learning via LiDAR Supervision | Code |
| 2024 | NeurIPS | RadarOcc: Robust 3D Occupancy Prediction with 4D Imaging Radar | - |
| Dataset | Year | Venue | Modality | # of Classes | Flow | Link |
|---|---|---|---|---|---|---|
| UniOcc | 2025 | ICCV | Camera | 10, 15, 17 | ✔️ | Intro. |
| OpenScene | 2024 | CVPR 2024 Challenge | Camera | - | ✔️ | Intro. |
| Cam4DOcc | 2024 | CVPR | Camera+LiDAR | 2 | ✔️ | Intro. |
| Occ3D | 2024 | NeurIPS | Camera | 14 (Occ3D-Waymo), 16 (Occ3D-nuScenes) | ❌ | Intro. |
| OpenOcc | 2023 | ICCV | Camera | 16 | ❌ | Intro. |
| OpenOccupancy | 2023 | ICCV | Camera+LiDAR | 16 | ❌ | Intro. |
| SurroundOcc | 2023 | ICCV | Camera | 16 | ❌ | Intro. |
| OCFBench | 2023 | arXiv | LiDAR | - (OCFBench-Lyft), 17 (OCFBench-Argoverse), 25 (OCFBench-ApolloScape), 16 (OCFBench-nuScenes) | ❌ | Intro. |
| SSCBench | 2023 | arXiv | Camera | 19 (SSCBench-KITTI-360), 16 (SSCBench-nuScenes), 14 (SSCBench-Waymo) | ❌ | Intro. |
| SemanticKITTI | 2019 | ICCV | Camera+LiDAR | 19 (Semantic Scene Completion task) | ❌ | Intro. |
| Specific Task | Year | Venue | Paper Title | Link |
|---|---|---|---|---|
| Indoor Occupancy Prediction | 2026 | CVPR | Monocular Open Vocabulary Occupancy Prediction for Indoor Scenes | Code |
| Indoor Occupancy Prediction | 2026 | CVPR | Generalizing Visual Geometry Priors to Sparse Gaussian Occupancy Prediction | Code |
| Indoor Occupancy Prediction | 2026 | arXiv | Parameter-Free Adaptive Multi-Scale Channel-Spatial Attention Aggregation framework for 3D Indoor Semantic Scene Completion Toward Assisting Visually Impaired | - |
| Indoor Occupancy Prediction | 2025 | RA-L | Enhancing Indoor Occupancy Prediction via Sparse Query-Based Multi-Level Consistent Knowledge Distillation | Code |
| Indoor Semantic Scene Completion | 2025 | arXiv | TGSFormer: Scalable Temporal Gaussian Splatting for Embodied Semantic Scene Completion | - |
| Indoor Occupancy Prediction | 2025 | arXiv | SplatSSC: Decoupled Depth-Guided Gaussian Splatting for Semantic Scene Completion | - |
| Indoor Occupancy Prediction | 2025 | arXiv | YouTube-Occ: Learning Indoor 3D Semantic Occupancy Prediction from YouTube Videos | - |
| Specific Task | Year | Venue | Paper Title | Link |
|---|---|---|---|---|
| Occupancy for Mobile Robots | 2025 | arXiv | MobileOcc: A Human-Aware Semantic Occupancy Dataset for Mobile Robots | - |
| Humanoid Occupancy | 2025 | arXiv | Humanoid Occupancy: Enabling A Generalized Multimodal Occupancy Perception System on Humanoid Robots | Project Page |
| Video Generation | 2025 | arXiv | ORV: 4D Occupancy-centric Robot Video Generation | Project Page |
| World Model | 2025 | arXiv | Occupancy World Model for Robots | - |
| Perception | 2025 | arXiv | RoboOcc: Enhancing the Geometric and Semantic Scene Understanding for Robots | - |
| Specific Task | Year | Venue | Paper Title | Link |
|---|---|---|---|---|
| 3D Panoptic Segmentation | 2024 | CVPR | PanoOcc: Unified Occupancy Representation for Camera-based 3D Panoptic Segmentation | Code |
| BEV Segmentation | 2024 | CVPRW | OccFeat: Self-supervised Occupancy Feature Prediction for Pretraining BEV Segmentation Networks | Code |
| Specific Task | Year | Venue | Paper Title | Link |
|---|---|---|---|---|
| 3D Object Detection | 2025 | ICONIP | Collaborative Perceiver: Elevating Vision-based 3D Object Detection via Local Density-Aware Spatial Occupancy | Code |
| 3D Object Detection | 2024 | NeurIPS | Towards Flexible 3D Perception: Object-Centric Occupancy Completion Augments 3D Object Detection | Code |
| 3D Object Detection | 2024 | CVPR | Learning Occupancy for Monocular 3D Object Detection | Code |
| 3D Object Detection | 2024 | AAAI | SOGDet: Semantic-Occupancy Guided Multi-view 3D Object Detection | Code |
| 3D Object Detection | 2024 | arXiv | UltimateDO: An Efficient Framework to Marry Occupancy Prediction with 3D Object Detection via Channel2height | - |
| Specific Task | Year | Venue | Paper Title | Link |
|---|---|---|---|---|
| Object Tracking | 2025 | ICRA | TrackOcc: Camera-based 4D Panoptic Occupancy Tracking | Code |
| Specific Task | Year | Venue | Paper Title | Link |
|---|---|---|---|---|
| 3D Flow Prediction | 2026 | RA-L | SelfOccFlow: Towards end-to-end self-supervised 3D Occupancy Flow prediction | - |
| 3D Flow Prediction | 2024 | CVPR | Cam4DOcc: Benchmark for Camera-Only 4D Occupancy Forecasting in Autonomous Driving Applications | Code |
| 3D Flow Prediction | 2024 | arXiv | Let Occ Flow: Self-Supervised 3D Occupancy Flow Prediction | Project Page |
| Specific Task | Year | Venue | Paper Title | Link |
|---|---|---|---|---|
| Scene Generation | 2025 | T-PAMI | OccScene: Semantic Occupancy-based Cross-task Mutual Learning for 3D Scene Generation | - |
| Multimodal Scene Generation | 2025 | CVPR | UniScene: Unified Occupancy-centric Driving Scene Generation | Project Page |
| Scene Generation | 2025 | arXiv | GenieDrive: Towards Physics-Aware Driving World Model with 4D Occupancy Guided Video Generation | Project Page |
| Multimodal Scene Generation | 2025 | arXiv | Scaling Up Occupancy-centric Driving Scene Generation: Dataset and Method | Code |
| Scene Generation | 2024 | ECCV | Pyramid Diffusion for Fine 3D Large Scene Generation (Oral paper) | Code |
| Scene Generation | 2024 | CVPR | SemCity: Semantic Scene Generation with Triplane Diffusion | Code |
| Scene Generation | 2024 | arXiv | InfiniCube: Unbounded and Controllable Dynamic 3D Driving Scene Generation with World-Guided Video Models | Project Page |
| Scene Generation | 2024 | arXiv | SyntheOcc: Synthesize Geometric-Controlled Street View Images through 3D Semantic MPIs | Project Page |
| Specific Task | Year | Venue | Paper Title | Link |
|---|---|---|---|---|
| Navigation | 2026 | arXiv | SPAN-Nav: Generalized Spatial Awareness for Versatile Vision-Language Navigation | Project Page |
| Navigation | 2025 | arXiv | OmniNWM: Omniscient Driving Navigation World Models | Project Page |
| Navigation for Air-Ground Robots | 2024 | RA-L | HE-Nav: A High-Performance and Efficient Navigation System for Aerial-Ground Robots in Cluttered Environments | Project Page |
| Navigation for Air-Ground Robots | 2024 | ICRA | AGRNav: Efficient and Energy-Saving Autonomous Navigation for Air-Ground Robots in Occlusion-Prone Environments | Code |
| Navigation for Air-Ground Robots | 2024 | arXiv | OMEGA: Efficient Occlusion-Aware Navigation for Air-Ground Robot in Dynamic Environments via State Space Model | Project Page |
If you find our survey and repository useful for your research project, please consider citing our paper:
@misc{xu2024survey,
title={A Survey on Occupancy Perception for Autonomous Driving: The Information Fusion Perspective},
author={Huaiyuan Xu and Junliang Chen and Shiyu Meng and Yi Wang and Lap-Pui Chau},
year={2024},
eprint={2405.05173},
archivePrefix={arXiv}
}

If you have any questions, please feel free to get in touch:
If you are interested in joining us as a Ph.D. student to conduct research on computer vision and machine learning, please feel free to contact Professor Chau:

