RoVE: Rotary Value Embeddings Attention for Relative Position-dependent Value Pathways

Alejandro García-Castellanos, Maurice Weiler, Erik J Bekkers

Year of publication: 2026
Access: Open access

Abstract

Rotary Position Embeddings (RoPE) make attention scores position-relative but leave the value pathway position-blind: the message sent by a value token is the same regardless of its distance from the query. We propose RoVE, a parameter-free modification that makes values position-sensitive by rotating them simultaneously with keys, and show that it turns RoPE attention into attentive convolution. This new perspective unifies several independent formulations of the same operation across computer vision, robotics, and modern LLM architectures. Trained 124M and 354M GPT-2 models show consistent empirical gains over RoPE on few-shot in-context learning, out-of-distribution perplexity, and long-context retrieval, with the clearest improvements on tasks that require long-range aggregation.
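The difference between the two value pathways can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The abstract only states that values are rotated together with keys; the sketch additionally assumes that the aggregated output is rotated back by the query position, so that each value message depends on the relative offset between key and query. The function names, the single-head layout, the frequency base of 10000, and the causal mask are all illustrative assumptions.

```python
import numpy as np

def rotate(x, positions, base=10000.0):
    """RoPE-style rotation of x with shape (seq, dim), dim even.

    Channel pair (2k, 2k+1) is rotated by angle position * base**(-2k/dim).
    """
    dim = x.shape[-1]
    freqs = base ** (-np.arange(0, dim, 2) / dim)   # (dim/2,)
    angles = positions[:, None] * freqs[None, :]    # (seq, dim/2)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, 0::2], x[:, 1::2]
    out = np.empty_like(x, dtype=float)
    out[:, 0::2] = x1 * cos - x2 * sin
    out[:, 1::2] = x1 * sin + x2 * cos
    return out

def softmax(scores):
    scores = scores - scores.max(axis=-1, keepdims=True)
    e = np.exp(scores)
    return e / e.sum(axis=-1, keepdims=True)

def rope_attention(q, k, v, positions, rotate_values=False):
    """Single-head causal attention with rotary queries and keys.

    rotate_values=False: plain RoPE, values are position-blind.
    rotate_values=True:  values are rotated by their own position and the
    output is rotated back by the query position (an assumption of this
    sketch), so the message from token j to token i depends on p_j - p_i.
    """
    seq, dim = q.shape
    scores = rotate(q, positions) @ rotate(k, positions).T / np.sqrt(dim)
    causal = np.tril(np.ones((seq, seq), dtype=bool))
    weights = softmax(np.where(causal, scores, -np.inf))
    if not rotate_values:
        return weights @ v
    out = weights @ rotate(v, positions)
    return rotate(out, -positions)

rng = np.random.default_rng(0)
q, k, v = rng.normal(size=(3, 6, 8))
pos = np.arange(6, dtype=float)
plain = rope_attention(q, k, v, pos)
rotated = rope_attention(q, k, v, pos, rotate_values=True)
```

Under these assumptions, shifting every position by the same constant leaves both outputs unchanged. In the `rotate_values=True` variant, each contribution is the value rotated by the key-to-query offset before being weighted, which is one way to read the abstract's "attentive convolution" description.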

Keywords

cs.LG, cs.AI
