LEARNING
SpatialVLA: Exploring Spatial Representations for Visual-Language-Action Models
Delin Qu, Haoming Song, Qizhi Chen, Yuanqi Yao, Xinyi Ye, Jiayuan Gu, Zhigang Wang, Yan Ding, Bin Zhao, Dong Wang
- Year: 2025
- Citations: 15
- Access: Open access
Abstract
and real-world setups, where the pre-learned action grids are re-discretized to capture robot-specific spatial action movements of new setups. The superior results from extensive evaluations demonstrate the exceptional in-distribution generalization and out-of-distribution adaptation capability, highlighting the crucial benefit of the proposed spatial-aware representations for generalist robot policy learning. All details and code are open-sourced.
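The abstract excerpt above does not say how action grids are re-discretized, so the following is only a minimal sketch of one plausible reading. It assumes each action dimension is modeled by a Gaussian fitted to the data and split into equal-probability bins, and that adapting to a new robot means refitting those bins on the new setup's actions. The function names `fit_grid_edges` and `discretize`, the bin count, and the synthetic data are all hypothetical and are not taken from the released code.

```python
from statistics import NormalDist

import numpy as np


def fit_grid_edges(actions: np.ndarray, num_bins: int) -> np.ndarray:
    """Fit a Gaussian per action dimension and return interior bin edges.

    Assumption (not from the source): bins carry equal probability mass
    under the fitted Gaussian. `actions` has shape (N, D); the result has
    shape (D, num_bins - 1).
    """
    mean = actions.mean(axis=0)
    std = actions.std(axis=0) + 1e-8  # guard against zero variance
    # Standard-normal quantiles at 1/num_bins, 2/num_bins, ...
    z = np.array([NormalDist().inv_cdf(k / num_bins) for k in range(1, num_bins)])
    return mean[:, None] + std[:, None] * z[None, :]


def discretize(actions: np.ndarray, edges: np.ndarray) -> np.ndarray:
    """Map continuous actions (N, D) to integer bin indices in [0, num_bins - 1]."""
    columns = [np.searchsorted(edges[d], actions[:, d]) for d in range(actions.shape[1])]
    return np.stack(columns, axis=1)


# Synthetic stand-ins for pre-training data and a new robot setup.
rng = np.random.default_rng(0)
pretrain_actions = rng.normal(0.0, 1.0, size=(1000, 3))
new_robot_actions = rng.normal(0.2, 0.1, size=(1000, 3))

pretrain_edges = fit_grid_edges(pretrain_actions, num_bins=8)
# "Re-discretization" in this sketch: refit the grid on the new setup's actions.
new_edges = fit_grid_edges(new_robot_actions, num_bins=8)
tokens = discretize(new_robot_actions, new_edges)
```

In this sketch the refitted grid is much narrower than the pre-training grid because the synthetic new-robot actions occupy a smaller range, so the same number of bins resolves finer movements.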
Keywords
Representation (politics), Field (mathematics), Perspective (graphical), Feature (linguistics), Set (abstract data type)