智能化农业装备学报(中英文) ›› 2023, Vol. 4 ›› Issue (2): 35-43. DOI: 10.12398/j.issn.2096-7217.2023.02.004

Detection of grapes in orchard environment based on improved YOLO-v4

XIAO Zhangna, LUO Lufeng*, CHEN Mingyou, WANG Jinhai, LU Qinghua, LUO Shaoming

  1. School of Mechatronics Engineering and Automation, Foshan University, Foshan 528000, China
  • Online: 2023-05-15; Published: 2023-05-15
  • Corresponding author: LUO Lufeng, male, born in 1982 in Loudi, Hunan, China; Ph.D., associate professor; research interest: harvesting robots. E-mail: luolufeng@163.com
  • About the author: XIAO Zhangna, female, born in 1999 in Nanyang, Henan, China; master's student; research interest: harvesting robots. E-mail: 2604209368@qq.com
  • Supported by: National Natural Science Foundation of China (32171909, 51705365); Guangdong Basic and Applied Basic Research Foundation (2020B1515120050); Natural Science Foundation of Guangdong Province (2023A1515011255); Open Project of Ji Hua Laboratory (X220931UZ230)

Abstract: Grape growth scenes in orchard environments are complex and changeable, which makes it difficult for a grape-harvesting robot to formulate a collision-free picking strategy from visual detection results. To address this problem, a method based on an improved YOLO-v4 was proposed for detecting grapes in different occlusion states. First, according to their growth scenes in the orchard environment, grapes were labeled as four types: no occlusion, leaf occlusion, branch occlusion, and overlapping occlusion. Then, with the YOLO-v4 framework as the detection model, the convolutional block attention module (CBAM) was embedded separately into the backbone network CSPDarknet53 (denoted YOLO-C-C) and the path aggregation network PANet (denoted YOLO-C-P). Applying attention during feature extraction in CSPDarknet53 and PANet strengthened the network's ability to extract grape features and reduced interference from the complex scene, with the aim of detecting differently occluded grapes in orchard environments with high precision. Finally, by comparing the detection precision and F1 scores of the YOLO-C-C and YOLO-C-P networks, YOLO-C-P was identified as the model best suited to occluded orchard scenes. Performance evaluation and comparison with other algorithms showed that the detection precision of the YOLO-C-P model for grapes with no occlusion, leaf occlusion, branch occlusion, and overlapping occlusion was 91.26%, 92.47%, 92.41%, and 90.65%, respectively, with an average F1 score of 91.71%. Compared with the same-series models YOLO-v4, YOLO-X-X, and YOLO-v5-X, the F1 score increased by 12.62, 8.65, and 5.31 percentage points, respectively. The average time to recognize one image was 0.13 s. The method can quickly and effectively identify grapes under no occlusion, leaf occlusion, branch occlusion, and overlapping occlusion, and can help robots formulate picking strategies (picking order and path planning) in orchard environments so as to avoid picking failures caused by occlusion-induced collisions, providing grape-harvesting robots with an auxiliary decision-making method for orchard picking.
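For readers who want a concrete picture of the attention mechanism named in the abstract, the listing below is a minimal PyTorch sketch of a CBAM block (channel attention followed by spatial attention, after Woo et al., 2018). The hyperparameters (reduction ratio 16, 7×7 spatial kernel) and the 256-channel example feature map are illustrative assumptions; the paper's exact configuration and insertion points inside CSPDarknet53 and PANet are not reproduced here.

    import torch
    import torch.nn as nn

    class ChannelAttention(nn.Module):
        # "What to attend to": weight channels using avg- and max-pooled descriptors.
        def __init__(self, channels, reduction=16):   # reduction=16 is an assumption
            super().__init__()
            self.mlp = nn.Sequential(
                nn.Conv2d(channels, channels // reduction, 1, bias=False),
                nn.ReLU(inplace=True),
                nn.Conv2d(channels // reduction, channels, 1, bias=False),
            )

        def forward(self, x):
            avg = self.mlp(torch.mean(x, dim=(2, 3), keepdim=True))
            mx = self.mlp(torch.amax(x, dim=(2, 3), keepdim=True))
            return torch.sigmoid(avg + mx)             # shape (N, C, 1, 1)

    class SpatialAttention(nn.Module):
        # "Where to attend": weight locations using channel-wise avg and max maps.
        def __init__(self, kernel_size=7):             # 7x7 kernel is an assumption
            super().__init__()
            self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2, bias=False)

        def forward(self, x):
            avg = torch.mean(x, dim=1, keepdim=True)
            mx, _ = torch.max(x, dim=1, keepdim=True)
            return torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))  # (N, 1, H, W)

    class CBAM(nn.Module):
        # Channel attention first, then spatial attention, as in Woo et al. (2018).
        def __init__(self, channels, reduction=16):
            super().__init__()
            self.ca = ChannelAttention(channels, reduction)
            self.sa = SpatialAttention()

        def forward(self, x):
            x = x * self.ca(x)
            x = x * self.sa(x)
            return x

    # Hypothetical usage: refine a 256-channel feature map from the backbone or neck.
    feat = torch.randn(1, 256, 52, 52)
    out = CBAM(256)(feat)                              # same shape, attention-refined

In the paper's terms, inserting such blocks into CSPDarknet53 yields YOLO-C-C, while inserting them into PANet yields YOLO-C-P; the comparison between these two placements is what selects YOLO-C-P as the final model.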

Key words: grape, robot, YOLO-v4, attention mechanism, object detection
