
SpatialVLA: Exploring Spatial Representations for Visual-Language-Action Models

Delin Qu, Haoming Song, Qizhi Chen, Yuanqi Yao, Xinyi Ye, Jiayuan Gu, Zhigang Wang, Yan Ding, Bin Zhao, Dong Wang

Year: 2025
Citations: 15
Access: Open access

Abstract

and real-world setups, where the pre-learned action grids are re-discretized to capture robot-specific spatial action movements of new setups. The superior results from extensive evaluations demonstrate the exceptional in-distribution generalization and out-of-distribution adaptation capability, highlighting the crucial benefit of the proposed spatial-aware representations for generalist robot policy learning. All the details and code are open-sourced.

Keywords

Representation (politics), Field (mathematics), Perspective (graphical), Feature (linguistics), Set (abstract data type)
