Abstract: This paper proposes a simple and effective real-time 3D semantic mapping method. The method takes per-frame bounding box detections and sensor (camera) extrinsic transformation estimates as inputs and produces a set of static 3D bounding boxes in the world coordinate system as the 3D semantic map. Each object is assigned a Kalman filter as its state estimator, and the intersection over union (IoU) of 3D bounding boxes is used for data association. To evaluate the method, a new benchmark is derived from the KITTI object tracking evaluation: ground-truth semantic maps are constructed from the oxts data and labeled 3D bounding boxes of KITTI. Three novel semantic map-centered metrics, DAOD, AAOD, and PRVO, are proposed. Experiments evaluate the proposed method and show that the metrics and the benchmark dataset can serve as a new benchmark platform for easier comparison of new methods.
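
As a rough illustration of IoU-based data association, the sketch below computes the IoU of two 3D boxes and matches detections to existing map objects. It rests on several assumptions that the abstract does not state: boxes are treated as axis-aligned and encoded as `(xmin, ymin, zmin, xmax, ymax, zmax)`, matching is greedy, and the names `iou_3d`, `associate`, and the `threshold` value are hypothetical. The method itself may use oriented boxes and a different matching scheme.

```python
from typing import Dict, Sequence

# Assumed box encoding: (xmin, ymin, zmin, xmax, ymax, zmax), axis-aligned.
Box = Sequence[float]


def iou_3d(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned 3D boxes."""
    inter = 1.0
    for i in range(3):
        lo = max(a[i], b[i])
        hi = min(a[i + 3], b[i + 3])
        if hi <= lo:
            return 0.0  # no overlap along this axis
        inter *= hi - lo
    vol_a = (a[3] - a[0]) * (a[4] - a[1]) * (a[5] - a[2])
    vol_b = (b[3] - b[0]) * (b[4] - b[1]) * (b[5] - b[2])
    return inter / (vol_a + vol_b - inter)


def associate(detections: Sequence[Box], objects: Sequence[Box],
              threshold: float = 0.25) -> Dict[int, int]:
    """Greedily match detections to map objects by descending IoU.

    Returns a dict mapping detection index to object index. Detections
    that are absent from the result would start new map objects.
    """
    candidates = []
    for d, det in enumerate(detections):
        for o, obj in enumerate(objects):
            score = iou_3d(det, obj)
            if score >= threshold:
                candidates.append((score, d, o))
    candidates.sort(reverse=True)  # best overlaps first

    matches: Dict[int, int] = {}
    used_objects = set()
    for score, d, o in candidates:
        if d in matches or o in used_objects:
            continue  # each detection and object is matched at most once
        matches[d] = o
        used_objects.add(o)
    return matches
```

In a pipeline of this kind, each matched detection would update the Kalman filter of its map object, while unmatched detections would initialize new filters.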