
DDM-YOLO: A lightweight oriented detection model for mature daylily fruits in complex environments

Minqiu Kuang, Xuejie Zou, Fangping Xie, Xiaojian Li, Shang Chen, Dawei Liu, Yuxuan Zhang, Sebastian Bader, Xiangjun Zou, Xu Li

Publication year
2026
Citations
4

Abstract

Accurate and robust recognition of daylily flower buds at the pre-bloom stage is essential for timely harvesting and quality preservation, yet remains highly challenging under natural field conditions due to the buds’ slender morphology, diverse orientations, dense distribution, and frequent occlusion by foliage. Existing horizontal-box object detectors struggle to capture the orientation and geometric structure of daylily buds, leading to inaccurate localization and unreliable guidance for automated harvesting. To address these challenges, we propose DDM-YOLO, a lightweight orientation-aware detection model tailored for daylily production environments. The model integrates three key components: (i) a Multi-scale Adaptive Feature Pyramid Network (MAFPN) that enhances the extraction and fusion of multi-dimensional features for densely distributed and occluded slender buds; (ii) a Lightweight Adaptive Direction-aware Head (LADH) that dynamically optimizes angle regression for rotated bounding boxes, improving orientation stability and reducing localization bias; and (iii) an Adaptive Down-sampling module (Adown) that preserves structurally critical spatial cues while reducing model complexity. Experiments conducted on a custom daylily field dataset demonstrate that DDM-YOLO achieves 96.8% precision and 98.1% mAP50, outperforming the baseline YOLOv11n-OBB by 1.3 percentage points in mAP while reducing model parameters by 17.0% to 2.2M. Deployment verification using a PySide5-based visualization prototype demonstrated a total system-level latency of less than 0.2 s, a duration encompassing the cumulative overhead of image input and output, pre-processing, post-processing including non-maximum suppression, and interface rendering. Furthermore, physical deployment on an NVIDIA Jetson AGX Orin embedded platform utilizing TensorRT optimization achieved an impressive inference speed of 114.5 FPS, corresponding to approximately 8.7 ms per frame. This performance confirms that the model meets the stringent real-time requirements for edge computing in mobile agricultural robotics. The model efficiently and accurately performs oriented detection and harvesting pose estimation for daylily buds, providing critical technical support for the visual perception system of harvesting robots.
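
The headline figures in the abstract can be cross-checked with simple arithmetic. The sketch below is illustrative only: the helper names are hypothetical, and the baseline values it derives are implied by the reported deltas rather than stated in the abstract. Because 2.2M is a rounded figure, the implied baseline parameter count is approximate.

```python
def fps_to_latency_ms(fps: float) -> float:
    """Convert a throughput in frames per second to per-frame latency in ms."""
    return 1000.0 / fps


def baseline_from_reduction(reduced: float, reduction_pct: float) -> float:
    """Recover the original value given the reduced value and the percent cut."""
    return reduced / (1.0 - reduction_pct / 100.0)


# Reported: 114.5 FPS on Jetson AGX Orin with TensorRT.
latency_ms = fps_to_latency_ms(114.5)

# Reported: parameters cut by 17.0% to 2.2M (rounded), so the baseline is implied.
baseline_params_m = baseline_from_reduction(2.2, 17.0)

# Reported: 98.1% mAP50, 1.3 percentage points above the baseline (implied value).
baseline_map50 = 98.1 - 1.3

print(f"per-frame latency: {latency_ms:.1f} ms")
print(f"implied baseline parameters: {baseline_params_m:.2f} M")
print(f"implied baseline mAP50: {baseline_map50:.1f} %")
```

The per-frame figure covers model inference only; the sub-0.2 s system-level latency reported for the visualization prototype additionally includes image I/O, pre- and post-processing, and interface rendering, so the two numbers are not directly comparable.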

Keywords

Minimum bounding box; Orientation (vector space); Latency (audio); Visualization; Object detection; Software deployment; Feature extraction; Interface (matter); Pyramid (geometry)
