LEARNING

Enhancing object pose estimation for RGB images in cluttered scenes

Metwalli Al-Selwi, Ning Huang, Gao Yin, Yan Chao, Qiming Li, Jun Li

Publication year
2025
Citations
17
Access
Open access

Abstract

Estimating the 6D pose of objects is crucial for robots that interact with their environment. Estimating 6D object pose from RGB images remains difficult in cluttered scenes with heavy occlusion. Most existing methods work in two stages: they first extract object features and then recover the pose with a PnP/RANSAC step. However, most of these techniques merely localize a group of keypoints by regressing their coordinates, which makes them vulnerable to occlusion and leads to poor performance in multi-object pose estimation. These methods also cannot regress the 6D pose directly from a loss during training. In this paper, we propose an end-to-end framework based on a convolutional neural network (CNN) and a self-attention mechanism for single- and multi-object 6D pose estimation from RGB images at low computational cost. Our method uses feature fusion to extract local features and combines multi-head self-attention (MHSA) with iterative refinement to improve pose estimation performance. Furthermore, our method can be scaled according to the available computational resources. Experiments on the Linemod and Occlusion Linemod benchmark datasets show that our method achieves 97.45% and 84.84%, respectively, in terms of the ADD(-S) metric.
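
The abstract reports results in terms of ADD(-S) without defining it. As a rough illustration only, the sketch below shows how ADD (average distance between corresponding model points) and ADD-S (average closest-point distance, commonly used for symmetric objects) are usually computed. The function names, the toy point set, and the 10%-of-diameter acceptance threshold are assumptions for illustration and are not taken from the paper.

```python
import numpy as np


def transform(points, R, t):
    """Apply a rigid transform (R, t) to an (N, 3) array of model points."""
    return points @ R.T + t


def add_metric(points, R_gt, t_gt, R_pred, t_pred):
    """ADD: mean distance between corresponding transformed model points."""
    gt = transform(points, R_gt, t_gt)
    pred = transform(points, R_pred, t_pred)
    return float(np.linalg.norm(gt - pred, axis=1).mean())


def add_s_metric(points, R_gt, t_gt, R_pred, t_pred):
    """ADD-S: mean distance from each ground-truth point to its nearest predicted point."""
    gt = transform(points, R_gt, t_gt)
    pred = transform(points, R_pred, t_pred)
    # Pairwise distance matrix of shape (N, N); fine for small N.
    pairwise = np.linalg.norm(gt[:, None, :] - pred[None, :, :], axis=2)
    return float(pairwise.min(axis=1).mean())


def model_diameter(points):
    """Largest distance between any two model points."""
    pairwise = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    return float(pairwise.max())


def is_correct(error, diameter, ratio=0.1):
    """Assumed convention: a pose counts as correct if error < ratio * diameter."""
    return error < ratio * diameter


# Hypothetical toy model: four corners of a square, symmetric under 90-degree
# rotation about the z axis.
square = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0],
                   [-1.0, 0.0, 0.0], [0.0, -1.0, 0.0]])
identity = np.eye(3)
zero = np.zeros(3)
rot_z_90 = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])

add_err = add_metric(square, identity, zero, rot_z_90, zero)
add_s_err = add_s_metric(square, identity, zero, rot_z_90, zero)
diam = model_diameter(square)
print(add_err, add_s_err, is_correct(add_err, diam), is_correct(add_s_err, diam))
```

In this toy case the predicted pose differs from the ground truth by a rotation that maps the object onto itself, so ADD penalizes it heavily while ADD-S is essentially zero. This is the usual motivation for reporting ADD for asymmetric objects and ADD-S for symmetric ones under the combined label ADD(-S).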

Keywords

Artificial intelligence, Computer vision, Pose, Computer science, RGB color model, Object (grammar), Estimation, Pattern recognition (psychology)
