Abstract: Convolutional neural network (CNN)-based object detection has achieved remarkable success. However, existing CNN-based algorithms have difficulty detecting small-scale objects, because the response of a small object may vanish once the feature map reaches a certain depth. This is a particular problem for traffic images and videos, in which the scale of objects (such as cars, buses, and pedestrians) commonly varies greatly. In this paper, we present a 32-layer multibranch convolutional neural network, named MBNet, for fast object detection in traffic scenes. Our model uses three detection branches, which operate on feature maps of size 16 × 16, 32 × 32, and 64 × 64, respectively, to optimize detection for large-, medium-, and small-scale objects. With a multitask loss function, the model can be trained end-to-end. The experimental results show that our model achieves state-of-the-art performance in terms of precision and recall, and that its detection speed (up to 33 fps) meets the real-time requirements of industry.
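To illustrate why a finer feature map suits smaller objects, the following minimal sketch computes how many input pixels each grid cell covers in the three branches. The abstract does not state the input resolution, so the 512 × 512 input used here is purely an assumption, and the names `ASSUMED_INPUT_SIZE`, `branch_strides`, and `cell_index` are hypothetical helpers, not part of MBNet.

```python
# Illustrative sketch only. The input resolution is an ASSUMPTION;
# the abstract gives the feature-map sizes but not the input size.
ASSUMED_INPUT_SIZE = 512  # hypothetical square input, in pixels

# Feature-map sizes of the three detection branches, as stated in the abstract.
BRANCH_GRID_SIZES = {"large": 16, "medium": 32, "small": 64}


def branch_strides(input_size, grid_sizes):
    """Return the number of input pixels covered by one grid cell per branch."""
    strides = {}
    for name, grid in grid_sizes.items():
        if input_size % grid != 0:
            raise ValueError(f"input size {input_size} is not divisible by grid {grid}")
        strides[name] = input_size // grid
    return strides


def cell_index(x, y, stride):
    """Return the (column, row) of the grid cell that contains pixel (x, y)."""
    return int(x // stride), int(y // stride)


strides = branch_strides(ASSUMED_INPUT_SIZE, BRANCH_GRID_SIZES)
for name, stride in strides.items():
    print(f"{name}-object branch: one cell covers {stride} x {stride} pixels")
```

Under this assumed input size, a cell in the 64 × 64 branch covers far fewer pixels than a cell in the 16 × 16 branch, so a small object is less likely to be averaged away inside a single cell.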