Lightweight SAR ship detection model with deformable convolution and attention

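余光浩1, 陈润霖1, 徐金燕2, 徐前祥3, 王大寒1, 陈峰1 (1. Xiamen University of Technology; 2. Island Research Center, Ministry of Natural Resources; 3. Shenzhen Smart City Technology Development Group (深智城集团))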

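Abstract
Objective To address the missed and false detections that arise in ship detection from synthetic aperture radar (SAR) images because of complex backgrounds and widely varying target sizes, an improved algorithm based on YOLOv8 (you only look once v8) is proposed. Method First, the original YOLOv8 network structure is made lightweight, which greatly reduces network redundancy and makes the network better suited to SAR ship detection. Second, deformable convolution is integrated into the backbone network to strengthen the model's perception of targets and to adapt better to target deformation and complex backgrounds; in addition, a convolutional block attention module is integrated into the neck network to weaken interference from background information, so that the network focuses more on the features of ship targets. Finally, the EIoU (efficient intersection over union) loss function is adopted to minimize the differences between the predicted box and the ground-truth box, including width and height, which yields faster convergence. Result Tests were carried out on SSDD (SAR ship detection dataset) and HRSID (high-resolution SAR images dataset). The results show that the improved algorithm outperforms several currently popular object detection algorithms. Compared with YOLOv8, on the two public datasets the accuracy metric mAP (mean average precision) @0.5 increases by 0.68% and 1.29%, mAP@0.75 increases by 3.32% and 3.10%, and the processing speed in FPS (frames per second) increases by 22 f/s and 18 f/s, respectively. Conclusion The improved algorithm, built by fusing deformable convolution and an attention mechanism on top of a lightweight YOLOv8, improves both the accuracy and the speed of SAR ship detection.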
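The EIoU loss mentioned in the Method can be illustrated with a minimal sketch. It follows the commonly cited EIoU formulation (an IoU term plus penalties on center distance, width difference, and height difference, each normalized by the smallest enclosing box). The abstract does not give the authors' exact implementation, so the function name `eiou_loss`, the `(x1, y1, x2, y2)` box format, and the `eps` stabilizer are assumptions made for illustration.

```python
def eiou_loss(pred, target, eps=1e-9):
    """EIoU loss for two axis-aligned boxes given as (x1, y1, x2, y2).

    Hypothetical helper for illustration; not the authors' code.
    """
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target
    pw, ph = px2 - px1, py2 - py1  # predicted width and height
    tw, th = tx2 - tx1, ty2 - ty1  # ground-truth width and height

    # IoU term: overlap area divided by union area.
    iw = max(0.0, min(px2, tx2) - max(px1, tx1))
    ih = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = iw * ih
    union = pw * ph + tw * th - inter
    iou = inter / (union + eps)

    # Width and height of the smallest box enclosing both boxes.
    cw = max(px2, tx2) - min(px1, tx1)
    ch = max(py2, ty2) - min(py1, ty1)

    # Squared center distance, normalized by the enclosing box diagonal.
    dx = (px1 + px2) / 2 - (tx1 + tx2) / 2
    dy = (py1 + py2) / 2 - (ty1 + ty2) / 2
    center_term = (dx * dx + dy * dy) / (cw * cw + ch * ch + eps)

    # Width and height differences, normalized by the enclosing box sides.
    width_term = (pw - tw) ** 2 / (cw * cw + eps)
    height_term = (ph - th) ** 2 / (ch * ch + eps)

    return 1.0 - iou + center_term + width_term + height_term


# Two boxes with the same center and height but different widths:
# only the IoU term and the width term contribute.
print(eiou_loss((0, 0, 4, 2), (1, 0, 3, 2)))
```

Penalizing width and height separately, rather than through an aspect-ratio term, is what lets the loss act directly on the width and height gaps described above.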
Keywords
A lightweight model for SAR ship detection incorporating deformable convolution and attention mechanism

(Xiamen University of Technology)

Abstract
Objective Synthetic aperture radar (SAR) has recently been widely used in fields such as maritime monitoring, military intelligence acquisition, and maritime management, mainly because it can acquire data at any time and in all weather conditions. Better-performing ship detection algorithms not only help to improve ocean monitoring and navigation safety but also play a key role in areas such as maritime rescue, border security, and ocean resource management. Ship target detection methods fall into two categories: traditional methods and deep learning based methods. Deep learning based methods have higher accuracy and stronger generalization capability, and are themselves divided into one-stage and two-stage detection. In general, one-stage methods are faster than two-stage methods but less accurate. One-stage methods, such as YOLO and SSD, extract features through a backbone network and then perform classification and spatial position regression directly. Two-stage methods, such as R-CNN and Fast R-CNN, generally comprise initial region generation followed by final region classification and regression. For ship detection in SAR images, a growing number of researchers are focusing on deep learning based algorithms. However, most of these methods fail to strike a good balance between detection accuracy and processing efficiency. In this study, a lightweight model based on YOLOv8 was proposed to improve SAR ship detection while balancing detection accuracy and efficiency. Method This study proposed an improved version of YOLOv8, called LDCE (Lightweight-Deformable Convolution-CBAM-EIoU)-YOLOv8. First, the network structure of YOLOv8 was reconstructed to significantly reduce network redundancy while maintaining sensitivity to ship features in SAR images.
Furthermore, deformable convolution (DConv) was introduced so that the network could better perceive the environmental information around ship targets, which in turn improved its ability to understand and capture them. To reduce the interference of background information, the convolutional block attention module (CBAM) was introduced so that the network paid more attention to the key features of ship targets. Moreover, the efficient intersection over union (EIoU) loss function was adopted to speed up model convergence. The experiments were first conducted on the publicly available SAR ship detection dataset (SSDD), which comprises 1,160 images with an average size of 500×500 pixels and 2,587 ship instances in total. SSDD was randomly divided into training and testing sets in a ratio of 8:2. During training, the input images were resized to 640×640. The batch size and initial learning rate were set to 32 and 0.001, respectively, and the momentum and weight decay coefficient were 0.937 and 0.0005, respectively. Multiple ablation experiments were conducted to validate the effectiveness of the proposed model, using the original YOLOv8 as the baseline. Further comparisons were made with recently proposed methods (i.e., CCSSNet and MSSDNet) and with widely used detection algorithms, including Faster R-CNN, SSD, RetinaNet, YOLOv5, and YOLOv6. To further validate the effectiveness and generalization of LDCE-YOLOv8, additional experiments were conducted on the high-resolution SAR images dataset for ship detection (HRSID), which contains 5,604 images with an average size of 800×800 pixels and 16,951 ship instances in total. Result On SSDD, the accuracy indices mAP@0.5 and mAP@0.75 of YOLOv8 (the baseline) were 98.16% and 82.46%, respectively, and its speed, measured in frames per second (FPS), was 263 f/s.
For LDCE-YOLOv8, mAP@0.5 and mAP@0.75 were 98.84% and 85.78%, respectively, and the FPS was 285 f/s. The parameter count of LDCE-YOLOv8 was 24.58% lower than that of YOLOv8. The mAP@0.5, mAP@0.75, and FPS of LDCE-YOLOv8 were 0.62%, 2.23%, and 18.30% higher than those of MSSDNet, and 0.90%, 2.96%, and 4.40% higher than those of CCSSNet, respectively. The bounding box loss curves from the training runs in the ablation experiment showed that LDCE-YOLOv8 reached the lowest loss value and converged fastest. Overall, the results showed that the proposed model (i.e., LDCE-YOLOv8) achieved the best detection performance in terms of parameter count, precision, recall, average precision, and FPS, suggesting a stronger feature extraction ability for ship targets in SAR images. To compare the methods intuitively, detection results for five representative scenarios were presented. LDCE-YOLOv8 performed best, detecting all ship targets accurately in these scenarios. The proposed method thus demonstrated resistance to interference from the large amount of irregular noise distributed in SAR images. Moreover, it suppressed false alarms from strong bright spots that closely resemble ship features, and it performed well in small object detection. In both complex nearshore scenes and simple open-sea scenes, LDCE-YOLOv8 effectively reduced missed and false detections while maintaining high detection confidence. In addition, the proposed method achieved better results on HRSID, with 88.91%, 73.74%, and 312 f/s in mAP@0.5, mAP@0.75, and FPS, respectively, representing improvements of 1.29%, 3.13%, and 6.1% over YOLOv8.
Accordingly, the HRSID results demonstrated the superior detection performance of LDCE-YOLOv8 in SAR ship detection. Conclusion In this study, to improve SAR ship detection in terms of both accuracy and efficiency, a lightweight model based on YOLOv8 that incorporates deformable convolution and an attention mechanism was proposed and named LDCE-YOLOv8. In particular, LDCE-YOLOv8 has significantly reduced network redundancy while maintaining the ability to capture ship features in SAR images. The integration of deformable convolution enhanced the network's perception of environmental information around ship targets. In addition, convolutional block attention modules and the efficient intersection over union loss function were incorporated to enhance target localization. Experiments on two publicly available datasets (i.e., SSDD and HRSID) validated the effectiveness of LDCE-YOLOv8 in SAR ship detection. However, the proposed algorithm still has limitations. Specifically, LDCE-YOLOv8 had difficulty detecting all ship targets accurately in SAR images with densely packed ships. Further investigations focusing on this specific challenge are under way.
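The reported figures can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers quoted in the abstract. It assumes that the mAP gains are differences in percentage points and that the 6.1% FPS gain on HRSID is relative to the YOLOv8 baseline; the abstract does not state either convention explicitly. The helper name `absolute_gains` is hypothetical.

```python
# Figures quoted in the abstract for SSDD: (YOLOv8 baseline, LDCE-YOLOv8).
ssdd = {
    "mAP@0.5": (98.16, 98.84),
    "mAP@0.75": (82.46, 85.78),
    "FPS": (263, 285),
}


def absolute_gains(reported):
    """Return new-minus-baseline for each metric, rounded to 2 decimals."""
    return {name: round(new - old, 2) for name, (old, new) in reported.items()}


ssdd_gains = absolute_gains(ssdd)

# Assumption: the 6.1% FPS gain on HRSID is relative to the baseline speed,
# so the baseline can be backed out from the reported 312 f/s.
hrsid_fps = 312
hrsid_baseline_fps = round(hrsid_fps / 1.061)
hrsid_fps_gain = hrsid_fps - hrsid_baseline_fps

print(ssdd_gains)
print(hrsid_baseline_fps, hrsid_fps_gain)
```

Under these assumptions the SSDD gains come out to 0.68 and 3.32 percentage points and 22 f/s, and the implied HRSID baseline is about 294 f/s, a gain of 18 f/s. Both speed gains agree with the figures given in the shorter abstract above.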
Keywords
