Towards real-time 3D vehicle detection from monocular images using deep learning

One key task of the environment perception pipeline for autonomous driving is object detection using monocular RGB images. This task is usually limited to 2D object detection. The question arises whether 3D object detection is also possible using only monocular RGB images. In this dissertation, we investigate this question specifically for 3D vehicle detection in monocular RGB images in the scope of driver assistance systems and autonomous driving. We use modern deep learning techniques together with a so-called 2D-3D lifting, without utilizing temporal information. In particular, the task comprises estimating the 3D location, orientation, and size of each object. In addition to reliable, high-quality detection performance, autonomous driving systems require a short runtime. Therefore, we aim for the best possible trade-off between detection performance and runtime. Since the basis of any deep learning approach is high-quality data, we introduce a new dataset, Cityscapes 3D. This dataset is characterized in particular by its annotations with 9 degrees of freedom, as well as novel and improved evaluation metrics. We also provide a publicly available benchmark that allows research groups to assess their methods for 3D object detection and compare them to those of other researchers. We develop improvements for 2D object detection and demonstrate their effectiveness. Firstly, we increase the 2D detection performance by more than 5% using an adapted loss function during training. Secondly, we develop vg-NMS, which particularly supports 2D amodal object detection. With MB-Net, BS3D, and 3D-GCK, we develop three different approaches based on the 2D-3D lifting scheme. All developed approaches stand out for their comparatively good detection performance and their short runtime. In contrast to MB-Net and BS3D, 3D-GCK does not require any post-processing. It estimates all 9 degrees of freedom of a vehicle in 3D space and also requires no prior knowledge about possible vehicle extents.
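To make the notion of a 9-degrees-of-freedom annotation concrete, the following minimal sketch shows one possible way to represent such a box (3 values for location, 3 for size, 3 for orientation) and to compute its eight corners. The class name `Box9DoF`, the field names, and the yaw-pitch-roll Euler convention are assumptions chosen for illustration only; they are not taken from Cityscapes 3D or from any of the approaches described above.

```python
import math
from dataclasses import dataclass
from itertools import product


@dataclass
class Box9DoF:
    """Hypothetical 9-DoF box; field names are illustrative assumptions."""
    x: float       # center location
    y: float
    z: float
    length: float  # object size
    width: float
    height: float
    yaw: float     # orientation as Euler angles in radians (assumed convention)
    pitch: float
    roll: float


def rotation_matrix(yaw, pitch, roll):
    """Return R = Rz(yaw) * Ry(pitch) * Rx(roll) as a nested list."""
    cy, sy = math.cos(yaw), math.sin(yaw)
    cp, sp = math.cos(pitch), math.sin(pitch)
    cr, sr = math.cos(roll), math.sin(roll)
    return [
        [cy * cp, cy * sp * sr - sy * cr, cy * sp * cr + sy * sr],
        [sy * cp, sy * sp * sr + cy * cr, sy * sp * cr - cy * sr],
        [-sp, cp * sr, cp * cr],
    ]


def box_corners(box):
    """Compute the 8 corners of the box in the reference coordinate frame."""
    R = rotation_matrix(box.yaw, box.pitch, box.roll)
    half = (box.length / 2, box.width / 2, box.height / 2)
    center = (box.x, box.y, box.z)
    corners = []
    # Enumerate all sign combinations of the half extents (local box frame).
    for signs in product((-1, 1), repeat=3):
        local = [s * h for s, h in zip(signs, half)]
        # Rotate the local corner, then translate it to the box center.
        corners.append(tuple(
            center[i] + sum(R[i][j] * local[j] for j in range(3))
            for i in range(3)
        ))
    return corners
```

Calling `box_corners(Box9DoF(10.0, 0.0, 0.0, 4.0, 2.0, 1.5, math.pi / 2, 0.0, 0.0))` returns eight corner points whose mean is the box center. A method restricted to fewer degrees of freedom, for example yaw only, would correspond to fixing `pitch` and `roll` at zero in this sketch.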



