
智能化农业装备学报(中英文) ›› 2025, Vol. 6, Issue (2): 35-43. DOI: 10.12398/j.issn.2096-7217.2025.02.003



Multi-task joint perception framework for autonomous navigation in orchard robotics

ZHANG Jinguo1,2, CAI Jianfeng1,2, JIANG Rongrong3, YU Shanshan4, WANG Pengbo1,2

  1. College of Mechanical and Electrical Engineering, Soochow University, Suzhou 215137, China
    2. Jiangsu Key Laboratory of Embodied Intelligent Robot Technology, Suzhou 215137, China
    3. Suzhou Caoyang Ecological Agriculture Development Co., Ltd., Suzhou 215143, China
    4. Nanjing Institute of Agricultural Mechanization, Ministry of Agriculture and Rural Affairs, Nanjing 210014, China
  • Received: 2025-04-10; Revised: 2025-05-10; Online: 2025-05-15; Published: 2025-05-20
  • Contact: WANG Pengbo
  • About author: ZHANG Jinguo, male, born in 2000 in Lianyungang, Jiangsu; master's student; research interest: agricultural robot navigation technology. E-mail: 1432263736@qq.com
  • Supported by:
    National Key Research and Development Program of China (2022YFB4702202); Integrated Pilot Project for Agricultural Machinery Research, Development, Manufacturing, Promotion and Application of Jiangsu Provincial Department of Agriculture and Rural Affairs (JSYTH07)


Abstract:

Orchard operational scenarios present significant challenges for visual perception, including high vegetation heterogeneity, dynamic lighting variations, and diverse target morphologies. Traditional single-task visual perception models suffer from low feature reusability and high computational redundancy, and therefore struggle to meet the real-time environmental perception demands of agricultural robots. This study proposes AgriYOLOP, a lightweight multi-task collaborative perception framework designed for orchard environments. Through a systematic reconstruction of the YOLOP architecture, AgriYOLOP incorporates an efficient backbone network, enhanced anchor-free detection techniques, a feature pyramid network (FPN), a path aggregation network (PAN), and a task-adaptive loss function weighting strategy. The framework facilitates parallel collaborative processing of three critical perception tasks: trunk detection, obstacle recognition, and traversable region segmentation. The proposed framework was validated on a self-constructed orchard dataset comprising 4 765 images (1 280 pixels × 720 pixels) captured across diverse seasons, lighting conditions, and vegetation growth stages. Experimental results demonstrate that AgriYOLOP achieves 92.7% precision, 94.6% recall, and 96.7% mAP50 in object detection, along with 98.3% recall, a 99.1% F1 score, and 98.1% mIoU in traversable region segmentation. Deployed on an NVIDIA RTX 4060 platform, the model attains a real-time inference speed of 69 f/s with only 14 M parameters. Comparative experiments reveal that the multi-task collaborative architecture significantly enhances feature-sharing efficiency, reducing inference latency by 32.6% relative to single-task models while improving robustness to illumination and seasonal variations.
This approach effectively mitigates the conventional trade-off between target detection accuracy and semantic segmentation efficiency encountered in real-time agricultural robotic applications. The study provides a high-precision, low-latency real-time perception solution for autonomous orchard robot navigation.
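The abstract names a task-adaptive loss function weighting strategy but does not specify its form. One common choice for balancing heterogeneous heads (detection plus segmentation) in such multi-task networks is learnable uncertainty-based weighting in the style of Kendall et al.; the minimal sketch below, in plain Python with hypothetical loss values, is an illustration of that general technique, not the authors' exact formulation:

```python
import math

def multitask_loss(task_losses, log_vars):
    """Combine per-task losses with learnable uncertainty weights.

    Each task i carries a learnable log-variance s_i: its loss is
    scaled by exp(-s_i), and the additive s_i term penalizes the
    optimizer for shrinking every weight toward zero.
    """
    total = 0.0
    for loss, s in zip(task_losses, log_vars):
        total += math.exp(-s) * loss + s
    return total

# Hypothetical per-task losses for trunk detection, obstacle
# recognition, and traversable-region segmentation.
losses = [0.8, 0.5, 0.2]
log_vars = [0.0, 0.0, 0.0]  # common initialization: equal weights
print(multitask_loss(losses, log_vars))  # 1.5 at initialization
```

In a full training loop the `log_vars` would be optimized jointly with the network parameters, so the balance between the three heads adapts as their loss scales diverge.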

Key words: multi-task learning, orchard environment perception, target detection, semantic segmentation, agricultural robot
