摘要:Convolutional Neural Networks (CNNs) have become the most advanced algorithms for deep learning. They are widely used in image processing, object detection and automatic translation. As the demand for CNNs continues to increase, the platforms on which they are deployed continue to expand. As an excellent low-power, high-performance, embedded solution, Digital Signal Processor (DSP) is used frequently in many key areas. This paper attempts to deploy the CNN to Texas Instruments (TI)’s TMS320C6678 multi-core DSP and optimize the main operations (convolution) to accommodate the DSP structure. The efficiency of the improved convolution operation has increased by tens of times.