Monocular 3D Object Detection for Autonomous Driving



The goal of this paper is to perform 3D object detection from a single monocular image in the domain of autonomous driving. Our method first generates a set of candidate class-specific object proposals, which are then run through a standard CNN pipeline to obtain high-quality object detections. The focus of this paper is on proposal generation. In particular, we propose an energy minimization approach that places object candidates in 3D, exploiting the fact that objects should lie on the ground plane. We then score each candidate box, projected to the image plane, via several intuitive potentials encoding semantic segmentation, contextual information, size and location priors, and typical object shape. Our experimental evaluation demonstrates that our object proposal generation approach significantly outperforms all monocular approaches and achieves the best detection performance among published monocular competitors on the challenging KITTI benchmark.
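
To make the geometric step in the abstract concrete, the following is a minimal sketch of placing a candidate 3D box on the ground plane and projecting it to the image plane. It is an illustration only, not the paper's implementation. The function names, the KITTI-like axis convention (x right, y down, z forward), the camera height, and the intrinsics used below are all assumptions chosen for the example, and the scoring potentials themselves are not shown.

```python
import math

def box_corners_on_ground(x, z, length, width, height, yaw, cam_height):
    """Return the 8 corners (camera frame) of a 3D box resting on the ground.

    Assumed convention: x right, y down, z forward, so the ground plane is
    y = cam_height and the box is rotated by `yaw` about the vertical axis.
    """
    c, s = math.cos(yaw), math.sin(yaw)
    corners = []
    for dx in (-length / 2, length / 2):
        for dz in (-width / 2, width / 2):
            # Rotate the footprint offset about the vertical (y) axis.
            rx = c * dx + s * dz
            rz = -s * dx + c * dz
            # Bottom face lies on the ground; top face is `height` above it.
            for y in (cam_height, cam_height - height):
                corners.append((x + rx, y, z + rz))
    return corners

def project_to_image(points, fx, fy, cx, cy):
    """Pinhole projection of camera-frame points (all with z > 0) to pixels."""
    return [(fx * X / Z + cx, fy * Y / Z + cy) for X, Y, Z in points]

def bounding_box_2d(pixels):
    """Axis-aligned 2D box (u_min, v_min, u_max, v_max) around the pixels."""
    us = [u for u, _ in pixels]
    vs = [v for _, v in pixels]
    return min(us), min(vs), max(us), max(vs)

# Hypothetical candidate: a car-sized box 20 m ahead, centred on the optical axis.
corners = box_corners_on_ground(x=0.0, z=20.0, length=4.0, width=1.6,
                                height=1.5, yaw=0.0, cam_height=1.65)
box = bounding_box_2d(project_to_image(corners, fx=700.0, fy=700.0,
                                       cx=600.0, cy=180.0))
```

Because the box is constrained to rest on the ground, a candidate is fully described by its ground position, orientation, and size, which keeps the search space small. The resulting 2D box is the region over which image-based potentials such as those listed above would be evaluated.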


Xiaozhi Chen, Kaustav Kundu, Ziyu Zhang, Huimin Ma, Sanja Fidler, Raquel Urtasun.
Monocular 3D Object Detection for Autonomous Driving
Computer Vision and Pattern Recognition (CVPR), Las Vegas, US, 2016


2D/3D Detections on KITTI:


Code & Data


The work was partially supported by NSFC 61171113, NSERC and Toyota Motor Corporation.


For questions regarding the data or code, please contact Xiaozhi Chen.

Related Work