Curie Kim

Research Scientist | Gwangju Institute of Science and Technology

Cross-View Road Layout Estimation (2022) | Curie Kim

A Dual-Cycled Cross-View Transformer Network for Unified Road Layout Estimation and 3D Object Detection in the Bird’s-Eye-View, Curie Kim and Ue-Hwan Kim

The 20th International Conference on Ubiquitous Robots (UR 2023).


Network Architecture

The proposed dual-cycled cross-view transformer (DCT) architecture. During training, the DCT network requires both the top-view layout and the front-view image; each of these two inputs is transformed into another feature representation for the dual cycle loss. When deployed, the DCT network receives only a front-view image, from which it predicts the road layout and detects objects.
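The cycle idea in the caption can be illustrated with a generic CycleGAN-style consistency check: map features to the other view, map them back, and penalize the gap to the starting features. The sketch below is only an illustration under assumptions; it is not the paper's exact loss. The function names (`l1_distance`, `cycle_loss`, `dual_cycle_loss`), the choice of an L1 distance, and the toy `double`/`halve` transforms are all hypothetical stand-ins for learned view transforms.

```python
def l1_distance(a, b):
    """Mean absolute difference between two equal-length feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def cycle_loss(features, forward, backward):
    """Map features to the other view and back, then measure the reconstruction gap."""
    return l1_distance(features, backward(forward(features)))


def dual_cycle_loss(front_feat, top_feat, front_to_top, top_to_front):
    """Sum of two cycles: one starting from front-view features, one from top-view features."""
    front_cycle = cycle_loss(front_feat, front_to_top, top_to_front)
    top_cycle = cycle_loss(top_feat, top_to_front, front_to_top)
    return front_cycle + top_cycle


# Toy transforms (assumptions for illustration only; real ones would be learned networks).
def double(v):
    return [2.0 * x for x in v]


def halve(v):
    return [0.5 * x for x in v]


# Perfectly inverse transforms reconstruct the input, so both cycles contribute nothing.
consistent = dual_cycle_loss([1.0, 2.0], [3.0, 4.0], double, halve)
# Transforms that are not inverses of each other leave a reconstruction gap.
inconsistent = dual_cycle_loss([1.0, 2.0], [3.0, 4.0], double, double)
```

In this toy setting the loss is zero only when the two transforms undo each other, which is the property a cycle-consistency term encourages during training.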


The bird’s-eye-view (BEV) representation allows robust learning of multiple tasks for autonomous driving, including road layout estimation and 3D object detection. However, contemporary methods for unified road layout estimation and 3D object detection rarely address the class imbalance of the training dataset or use multi-class learning to reduce the total number of networks required. To overcome these limitations, we propose a unified model for road layout estimation and 3D object detection inspired by the transformer architecture and the CycleGAN learning framework. The proposed model mitigates the performance degradation caused by the class imbalance of the dataset by using the focal loss and the proposed dual cycle loss. Moreover, we set up extensive learning scenarios to study the effect of multi-class learning for road layout estimation in various situations. To verify the effectiveness of the proposed model and the learning scheme, we conduct a thorough ablation study and a comparative study. The experimental results attest to the effectiveness of our model; we achieve state-of-the-art performance in both the road layout estimation and 3D object detection tasks.
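To make the role of the focal loss concrete, here is a minimal sketch of the standard binary focal loss, FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t), written in plain Python. It shows how confidently correct predictions are down-weighted so that rare classes are not swamped by easy examples. The function name `binary_focal_loss` and the defaults `gamma=2.0` and `alpha=0.25` are assumptions chosen for illustration; the hyperparameters and the exact form used in the paper are not stated here.

```python
import math


def binary_focal_loss(probs, targets, gamma=2.0, alpha=0.25, eps=1e-7):
    """Mean binary focal loss over flat lists of predicted probabilities and 0/1 labels."""
    total = 0.0
    for p, y in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)             # clamp so log() stays finite
        p_t = p if y == 1 else 1.0 - p              # probability given to the true class
        alpha_t = alpha if y == 1 else 1.0 - alpha  # class-balancing weight
        # (1 - p_t) ** gamma shrinks the loss of well-classified examples.
        total += -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
    return total / len(probs)


# A confident correct prediction contributes far less than a confident wrong one.
easy = binary_focal_loss([0.9], [1])
hard = binary_focal_loss([0.1], [1])
```

With `gamma=0` the modulating factor disappears and the expression reduces to an alpha-weighted cross-entropy, which is a convenient sanity check when experimenting with the exponent.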


Dataset             Segmentation Object   mIoU (%)   mAP (%)
KITTI 3D Object     Vehicle               39.44      58.89
KITTI Odometry      Road                  77.15      88.28
KITTI Raw           Road                  65.86      86.56
Argoverse Tracking  Vehicle               48.04      68.96
Argoverse Tracking  Road                  76.71      88.87


Dataset             Segmentation Object   mIoU (%)   mAP (%)
Argoverse Tracking  Vehicle               31.75      46.20
Argoverse Tracking  Road                  74.73      86.76
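For readers unfamiliar with the mIoU column, the sketch below computes intersection-over-union for binary BEV occupancy masks and averages it over samples. It assumes that masks are equally sized 2D lists of 0/1 values and that mIoU is the mean of per-sample IoU; the evaluation protocol actually used for the tables above may differ, and the function names `iou` and `mean_iou` are hypothetical.

```python
def iou(pred, target):
    """IoU of two binary masks given as equally sized 2D lists of 0/1 values."""
    inter = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            inter += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    # Two empty masks agree perfectly; treat that case as IoU = 1 (a convention, not a rule).
    return inter / union if union else 1.0


def mean_iou(preds, targets):
    """Average the per-sample IoU over a list of mask pairs."""
    scores = [iou(p, t) for p, t in zip(preds, targets)]
    return sum(scores) / len(scores)


# One cell overlaps out of two occupied cells in total.
pred_mask = [[1, 1], [0, 0]]
true_mask = [[1, 0], [0, 0]]
single = iou(pred_mask, true_mask)
average = mean_iou([pred_mask, true_mask], [true_mask, true_mask])
```

Multiplying the result by 100 gives a percentage comparable in form to the mIoU (%) column.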


Result Images

example image

example image