PERCEPTION

Enhancing RGB-D Mirror Segmentation With a Neighborhood-Matching and Demand-Modal Adaptive Network Using Knowledge Distillation

Wujie Zhou, Han Zhang, Yuanyuan Liu, Ting Luo

Publication year
2025
Citations
6

Abstract

Recent breakthroughs in computer vision have led to remarkable progress in autonomous vehicles and robotics. However, ordinary objects such as mirrors pose unique challenges to computer vision systems owing to occlusion, reflection, and distortion. Moreover, existing deep learning models suffer from excessive parameters and high computational complexity, which makes many of them difficult to deploy offline. To address these issues, we propose a neighborhood-matching and demand-modal adaptive network using knowledge distillation (KD), called NDANet-S$^{\ast}$, designed specifically for RGB-D (red-green-blue plus depth) mirror segmentation. During encoding, NDANet-S$^{\ast}$ iteratively matches the detail and semantic differences between neighborhood features. It then complements information across modalities through demand-modal adaptation, which enhances heteromodal cross-complementation during the KD stage. During decoding, semantic enhancement features and iterative encoding features are deeply integrated, forming a strong foundation for multistage progressive knowledge transfer in the KD process. Furthermore, we introduce a multistage teacher-assisted KD scheme, guided by sample complexity, that works synergistically with the mirror segmentation model. The scheme comprises a sample complexity rater, heterogeneous cross-complementarity, and hierarchical progressive knowledge transfer. Experimental evaluations on publicly available datasets indicate that NDANet-S$^{\ast}$ significantly improves segmentation accuracy while keeping the number of parameters unchanged, and that it achieves state-of-the-art performance in mirror segmentation. The source code for our model is publicly available at https://github.com/2021nihao/NMDANet.

Note to Practitioners: This study presents a neighborhood-matching and demand-modal adaptive network for RGB-D mirror segmentation that incorporates knowledge distillation (KD). Combining the KD framework with the segmentation network enables sample complexity discrimination, cross-modal distillation, and multilevel distillation (guided by the former) to achieve targeted distillation. The proposed NDANet-S$^{\ast}$ surpasses current state-of-the-art methods while using 93.2% fewer parameters and 87.65% fewer floating-point operations (FLOPs) than NDANet-T.
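The abstract does not spell out how sample complexity steers the distillation, so the following is only a minimal sketch of the general idea, not the authors' NDANet implementation. It assumes a per-pixel binary mirror mask, a temperature-softened Bernoulli KL term between teacher and student, and a hypothetical complexity rater defined here as the teacher's normalised prediction entropy. The names `complexity_score`, `distillation_loss`, `alpha`, and `temperature` are illustrative assumptions and do not come from the paper or its repository.

```python
import math

EPS = 1e-7  # keeps log() away from 0 and 1


def _clip(p):
    return min(max(p, EPS), 1.0 - EPS)


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def bce(probs, targets):
    """Mean binary cross-entropy between mirror probabilities and a 0/1 mask."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = _clip(p)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(probs)


def binary_kl(teacher_probs, student_probs):
    """Mean per-pixel KL(teacher || student) for Bernoulli distributions."""
    total = 0.0
    for q, p in zip(teacher_probs, student_probs):
        q, p = _clip(q), _clip(p)
        total += q * math.log(q / p) + (1 - q) * math.log((1 - q) / (1 - p))
    return total / len(teacher_probs)


def complexity_score(teacher_probs):
    """Hypothetical rater (an assumption): mean teacher entropy scaled to [0, 1]."""
    total = 0.0
    for q in teacher_probs:
        q = _clip(q)
        total += -(q * math.log(q) + (1 - q) * math.log(1 - q))
    return total / (len(teacher_probs) * math.log(2))


def distillation_loss(student_logits, teacher_logits, mask,
                      alpha=0.5, temperature=2.0):
    """Supervised BCE plus a distillation term weighted by sample complexity."""
    hard = bce([sigmoid(z) for z in student_logits], mask)
    soft_student = [sigmoid(z / temperature) for z in student_logits]
    soft_teacher = [sigmoid(z / temperature) for z in teacher_logits]
    # The T^2 factor is the usual rescaling for temperature-softened targets.
    soft = binary_kl(soft_teacher, soft_student) * temperature ** 2
    weight = complexity_score([sigmoid(z) for z in teacher_logits])
    return hard + alpha * weight * soft


# Toy example: four pixels, flattened; all values are made up.
student = [2.0, -1.0, 0.5, -3.0]
teacher = [3.0, -3.0, 3.0, -3.0]
mask = [1, 0, 1, 0]
print(distillation_loss(student, teacher, mask))
```

In this sketch, samples on which the teacher is uncertain receive a larger distillation weight. Whether hard samples should be weighted up or down is a design choice, and the paper's sample complexity rater may well work differently.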

Keywords

Segmentation; Artificial intelligence; Matching (statistics); RGB color model; Modal; Distillation; Computer vision; Computer science; Image segmentation; Pattern recognition (psychology)
