LEARNING

SpatialVLA: Exploring Spatial Representations for Visual-Language-Action Models

Delin Qu, Haoming Song, Qizhi Chen, Yuanqi Yao, Xinyi Ye, Jiayuan Gu, Zhigang Wang, Yan Ding, Bin Zhao, Dong Wang

Year: 2025
Citations: 15
Access: Open access

Abstract

and real-world setups, where the pre-learned action grids are re-discretized to capture robot-specific spatial action movements of new setups. The superior results from extensive evaluations demonstrate the exceptional in-distribution generalization and out-of-distribution adaptation capability, highlighting the crucial benefit of the proposed spatial-aware representations for generalist robot policy learning. All the details and code are open-sourced.

Keywords

Representation (politics); Field (mathematics); Perspective (graphical); Feature (linguistics); Set (abstract data type)
