Information Filtering via Variational Regularization for Robot Manipulation
Jinhao Zhang, Wenlong Xia, Yaojia Wang, Zhexuan Zhou, Huizhe Li, Yichen Lai, Haoming Song, Youmin Gong, Jie Mei
- Year
- 2026
- Access
- Open access
Abstract
Diffusion-based visuomotor policies built on 3D visual representations have achieved strong performance in learning complex robotic skills. However, most existing methods employ an oversized denoising decoder. While increasing model capacity can improve denoising, empirical evidence suggests that it also introduces redundancy and noise in intermediate feature blocks. Crucially, we find that randomly masking backbone features in U-Net or skipping intermediate layers in DiT at inference time (without changing training) can improve performance, confirming the presence of task-irrelevant noise in intermediate features. To this end, we propose Variational Regularization (VR), a plug-and-play module that imposes a context-conditioned Gaussian over the noisy features and applies a KL-divergence regularizer, forming an adaptive information bottleneck. Extensive experiments on three simulation benchmarks, RoboTwin2.0, Adroit, and MetaWorld, show that our approach consistently improves task success rates over the baseline for both DP3-UNet and DP3-DiT, achieving new state-of-the-art results. Real-world experiments further demonstrate that our method performs well in practical deployments.
Keywords
Related papers
Real-Time Obstacle Avoidance for Manipulators and Mobile Robots
Oussama Khatib
1986
A Mathematical Introduction to Robotic Manipulation
Richard M. Murray, Zexiang Li, Shankar Sastry
2017
Robot dynamics and control
Mark W. Spong
1989
A tutorial on visual servo control
Seth Hutchinson, Gregory D. Hager, Peter Corke
1996