Convolutional neural networks (CNNs) are widely used for object detection in remote sensing images because of their high accuracy. However, their large parameter counts and high computational complexity make real-time deployment difficult on embedded devices with limited computational and storage resources, which significantly restricts their practical application. To address this challenge, the lightweight YOLOv5n model is chosen for object detection. The model is further optimized through activation function modification and parameter quantization to make it more hardware-friendly. In addition, a Deep Learning Processing Unit (DPU) is deployed on the Zynq heterogeneous SoC through hardware-software co-design to significantly accelerate YOLOv5n-based remote sensing object detection. Experimental results show that the optimized YOLOv5n model, implemented on a Zynq-based embedded platform, reaches an accuracy of 61.4% on the DIOR dataset without accuracy loss. The experimental platform achieves an image throughput of 232.1 FPS at a power consumption of 19.4 W. Its performance per watt (FPS/W) is 9.0× and 1.8× higher than that of an i7-12700H CPU and an RTX 3070Ti GPU, respectively.
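The performance-per-watt figure can be checked from the reported throughput and power. The following is a minimal sketch, not the authors' evaluation code: the helper `fps_per_watt` and the variable names are hypothetical, and the CPU and GPU values are only what the rounded 9.0× and 1.8× ratios imply, not measurements reported in the abstract.

```python
def fps_per_watt(fps: float, power_w: float) -> float:
    """Energy efficiency as frames per second per watt (hypothetical helper)."""
    if power_w <= 0:
        raise ValueError("power must be positive")
    return fps / power_w


# Figures reported for the Zynq-based platform.
zynq_eff = fps_per_watt(232.1, 19.4)  # about 11.96 FPS/W

# Efficiencies implied by the reported ratios (assumption: the ratios are
# rounded, so these are approximate back-calculations, not measured values).
implied_cpu_eff = zynq_eff / 9.0  # i7-12700H
implied_gpu_eff = zynq_eff / 1.8  # RTX 3070Ti

print(f"Zynq + DPU:    {zynq_eff:.2f} FPS/W")
print(f"CPU (implied): {implied_cpu_eff:.2f} FPS/W")
print(f"GPU (implied): {implied_gpu_eff:.2f} FPS/W")
```

Under these assumptions, the platform delivers roughly 12 FPS per watt, which is the quantity the 9.0× and 1.8× comparisons refer to.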