首页 /研究 /C$^3$ache: Accelerating World Action Models with Cross Inference Chunk Cache
OTHER

C$^3$ache: Accelerating World Action Models with Cross Inference Chunk Cache

Weisen Zhao, Lam Nguyen, Zhicong Lu, Yuzhang Shang

发表年份
2026
访问权限
开放获取

摘要

World Action Models (WAMs) generalize better than standard Vision-Language-Action (VLA) policies to novel motions and environments, because a video-modeling objective lets them learn from abundant unlabeled video rather than scarce labeled robot demonstrations. This generalization is computationally expensive. To complete a task, a WAM runs over multiple inference chunks, and each chunk requires a costly denoising process. Existing acceleration methods reduce this cost by caching and reusing computation within a single chunk's denoising trajectory. Our empirical analysis reveals a substantial source of redundancy they overlook: redundancy across chunks. When a robot executes a smooth behavior, the residuals computed at a given denoising step are strongly correlated from one chunk to the next. We introduce C$^3$ache, a training-free method that caches and reuses these residuals across inference chunks at the same denoising step. Experiments on benchmarks with a Fast-WAM backbone show that C$^3$ache achieves up to a $2.5\times$ speedup in total wall-clock inference time, with negligible degradation in task success rate.

关键词

cs.LGcs.CVcs.RO

相关论文

查看 OTHER 分类全部论文