Building Maps Using Monocular Image-feeds from Windshield-mounted Cameras in a Simulator Environment
Abstract
Accurate, up-to-date three-dimensional maps are essential for vehicles with autonomous capabilities, whose functionality relies on machine learning-based algorithms. Because these algorithms require a tremendous amount of data for parameter optimization, simulation-to-reality (Sim2Real) methods have proven immensely useful for generating training data. Crowdsourcing techniques offer a resource-efficient way to create the realistic models needed for synthetic data generation. In this paper, we show that the Carla simulation environment can be used to create a crowdsourcing model that mimics a multi-agent data gathering and processing pipeline. We developed a solution that yields dense point clouds from monocular images and location information gathered by individual data acquisition vehicles. Our method reconstructs scenes using the robust Structure-from-Motion (SfM) solution of Colmap. Moreover, we introduce a solution for synthesizing dense ground truth point clouds from the Carla simulator using a simulated data acquisition pipeline. We compare the Colmap reconstruction with the reference point cloud after aligning the two using the iterative closest point algorithm. Our results show that precise point cloud reconstruction is feasible with this crowdsourcing-based approach: 54\% of the reconstructed points have an error under 0.05 m, and the weighted root mean square error over the entire point cloud is 0.0449 m.
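To make the two reported metrics concrete, the following is a minimal sketch of how a reconstructed cloud could be scored against a reference cloud once the two are aligned. It is an illustration under stated assumptions, not the pipeline used in this work: the function names, the brute-force nearest-neighbour search, and the optional `weights` argument are hypothetical, and the abstract does not specify the weighting scheme, so uniform weights are used by default.

```python
import math


def nearest_distance(point, reference):
    """Distance from `point` to its closest point in `reference` (brute force)."""
    return min(math.dist(point, ref) for ref in reference)


def evaluate_reconstruction(reconstructed, reference, threshold=0.05, weights=None):
    """Return (fraction of points with error below `threshold`, weighted RMSE).

    Assumes both clouds are already aligned (e.g. by ICP) and expressed in metres.
    `weights` is a hypothetical per-point weighting; uniform weights if omitted.
    """
    errors = [nearest_distance(p, reference) for p in reconstructed]
    if weights is None:
        weights = [1.0] * len(errors)
    fraction_below = sum(e < threshold for e in errors) / len(errors)
    weighted_sq_sum = sum(w * e * e for w, e in zip(weights, errors))
    rmse = math.sqrt(weighted_sq_sum / sum(weights))
    return fraction_below, rmse


# Toy data: three reference points and four reconstructed points (metres).
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
reconstructed = [(0.0, 0.0, 0.03), (1.0, 0.0, 0.04), (0.0, 1.0, 0.1), (0.5, 0.0, 0.0)]

fraction_below, rmse = evaluate_reconstruction(reconstructed, reference)
print(f"fraction under 0.05 m: {fraction_below:.2f}, RMSE: {rmse:.4f} m")
```

For clouds of realistic size, the brute-force search would normally be replaced by a spatial index such as a k-d tree; the metric definitions stay the same.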