
Article Information

  • Title: Multiple Traffic Target Tracking with Spatial-Temporal Affinity Network
  • Authors: Yamin Sun; Yue Zhao; Sirui Wang
  • Journal: Computational Intelligence and Neuroscience
  • Print ISSN: 1687-5265
  • Electronic ISSN: 1687-5273
  • Year: 2022
  • Volume: 2022
  • DOI: 10.1155/2022/9693767
  • Language: English
  • Publisher: Hindawi Publishing Corporation
  • Abstract: Traffic target tracking is a core task in intelligent transportation systems because it underpins scene understanding and autonomous driving. Most state-of-the-art (SOTA) multiple object tracking (MOT) methods adopt a two-step procedure: object detection followed by data association. Object detection has made great progress with the development of deep learning, but data association still depends heavily on hand-crafted constraints, such as appearance, shape, and motion, which must be elaborately tuned for each specific object class. In this study, a spatial-temporal encoder-decoder affinity network is proposed for multiple traffic target tracking, aiming to harness deep learning to learn a robust spatial-temporal affinity feature of the detections and tracklets for data association. The proposed network contains a two-stage transformer encoder module that encodes the features of the detections and the tracked targets at the image level and the tracklet level, capturing spatial correlation and temporal history information. A spatial transformer decoder module then computes the association affinity, taking the outputs of the two-stage encoder as input so that the spatial and temporal information from the detections and the tracklets of the tracked targets is fully captured and encoded. Efficient affinity computation can thus be applied to perform data association in online tracking. To validate the effectiveness of the proposed method, three popular multiple traffic target tracking datasets, KITTI, UA-DETRAC, and VisDrone, are used for evaluation. On the KITTI dataset, the proposed method is compared with 15 SOTA methods and achieves 86.9% multiple object tracking accuracy (MOTA) and 85.71% multiple object tracking precision (MOTP). On the UA-DETRAC dataset, it is compared with 12 SOTA methods and achieves 20.82% MOTA and 35.65% MOTP. On the VisDrone dataset, it is compared with 10 SOTA trackers and achieves 40.5% MOTA and 74.1% MOTP. These results show that the proposed method achieves tracking performance competitive with state-of-the-art methods.
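To make the abstract's pipeline concrete, the sketch below shows the online association step that an affinity network feeds into, and the CLEAR-MOT accuracy metric the paper reports. This is a minimal illustration, not the paper's method: the learned spatial-temporal affinity network is replaced by an arbitrary precomputed affinity matrix, and a simple greedy matcher (a hypothetical stand-in) plays the role of the assignment step.

```python
from typing import List, Tuple


def associate_by_affinity(affinity: List[List[float]],
                          threshold: float = 0.5) -> List[Tuple[int, int]]:
    """Greedy online data association.

    `affinity[t][d]` is the (assumed precomputed) affinity between
    tracklet t and detection d -- in the paper this comes from the
    spatial transformer decoder; here it is just an input matrix.
    Pairs are matched best-score-first; scores below `threshold`
    are discarded (unmatched detections would start new tracks).
    """
    candidates = []
    for t, row in enumerate(affinity):
        for d, score in enumerate(row):
            if score >= threshold:
                candidates.append((score, t, d))
    candidates.sort(reverse=True)  # highest affinity first
    used_t, used_d, pairs = set(), set(), []
    for score, t, d in candidates:
        if t not in used_t and d not in used_d:
            used_t.add(t)
            used_d.add(d)
            pairs.append((t, d))
    return pairs


def mota(fn: int, fp: int, idsw: int, gt: int) -> float:
    """Standard CLEAR-MOT accuracy: MOTA = 1 - (FN + FP + IDSW) / GT."""
    return 1.0 - (fn + fp + idsw) / gt


# Two tracklets, two detections: each tracklet pairs with its
# highest-affinity detection.
pairs = associate_by_affinity([[0.9, 0.1],
                               [0.2, 0.8]])
print(pairs)  # [(0, 0), (1, 1)]
```

Production trackers typically replace the greedy loop with an optimal assignment solver (e.g. the Hungarian algorithm), but the greedy version is enough to show how a single affinity matrix drives online matching.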