IEEE Transactions on Intelligent Vehicles (early access) / 26 June 2024
Semi-Automatic BEV Ground Truth Generation for Automotive Perception Systems
The development and testing of automotive perception systems require a large amount of labeled data. The ground truth of a measurement scene is usually generated by manual video annotation, which demands considerable effort and time. Moreover, since the annotation boxes must be transformed into the vehicle's bird's-eye-view (BEV) space, the result is often not precise enough because depth estimated from a camera is poor. Existing ground truth generation systems rely on extremely expensive sensor clusters, including LiDARs and GNSS RTK. In this paper, we present a novel semi-automatic ground truth generation tool that depends exclusively on a mono-vision camera and an automotive radar, eliminating the need for these costly sensors. The proposed system comprises an automatic pre-generation algorithm and an interactive GUI annotation tool. The pre-generation algorithm fuses radar and camera data and relies on offline tracking algorithms, exploiting the offline nature of the problem to improve the precision of depth estimation. The proposed system is evaluated on a highway section by comparing the automatically generated objects to the manually created ground truth. According to the results, the automatic pre-generation can significantly decrease the effort of manual annotation, and the remaining manual annotation is supported by a user-friendly GUI. Along with the theoretical background, the tool and the dataset used for this paper are provided on GitHub.
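
The imprecision of camera-only depth in BEV space can be illustrated with the standard flat-road pinhole relation, in which the longitudinal distance to a ground point is Z = f·h / (v − v0), where f is the focal length in pixels, h the camera height, v the image row of the object's ground contact point, and v0 the horizon row. The sketch below is not the pre-generation algorithm of the paper. It only shows, under these assumptions, how a one-pixel annotation error grows with range. The function names and all numeric values are hypothetical.

```python
def ground_distance(v_px, focal_px, cam_height_m, horizon_px):
    """Distance along the road to a ground point imaged at row v_px.

    Assumes a pinhole camera over a flat road; horizon_px is the image
    row of the horizon (it absorbs the camera pitch).
    """
    dv = v_px - horizon_px
    if dv <= 0:
        raise ValueError("pixel row must lie below the horizon")
    return focal_px * cam_height_m / dv


def range_error_per_pixel(v_px, focal_px, cam_height_m, horizon_px):
    """Change in estimated distance when the row is off by one pixel."""
    z = ground_distance(v_px, focal_px, cam_height_m, horizon_px)
    z_shifted = ground_distance(v_px + 1, focal_px, cam_height_m, horizon_px)
    return abs(z - z_shifted)


if __name__ == "__main__":
    # Hypothetical camera: 1000 px focal length, mounted 1.5 m high,
    # horizon at image row 500.
    f, h, v0 = 1000.0, 1.5, 500.0
    for row in (600, 510):
        z = ground_distance(row, f, h, v0)
        err = range_error_per_pixel(row, f, h, v0)
        print(f"row {row}: distance {z:.1f} m, 1 px error {err:.2f} m")
```

With these illustrative values, a box edge at 15 m shifts by roughly 0.15 m per pixel, while one at 150 m shifts by more than 13 m per pixel. A radar measures range directly, which is one reason fusing it with the camera can tighten the longitudinal position of distant objects.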