VLM-Guided Experience Replay
Elad Sharony, Tom Jurgenson, Orr Krupnik, Dotan Di Castro, Shie Mannor
- 发表年份
- 2026
- 访问权限
- 开放获取
摘要
Recent advances in Large Language Models (LLMs) and Vision-Language Models (VLMs) have enabled powerful semantic and multimodal reasoning capabilities, creating new opportunities to enhance sample efficiency, high-level planning, and interpretability in reinforcement learning (RL). While prior work has integrated LLMs and VLMs into various components of RL, the replay buffer, a core component for storing and reusing experiences, remains unexplored. We propose addressing this gap by leveraging VLMs to guide the prioritization of experiences in the replay buffer. Our key idea is to use a frozen, pre-trained VLM (requiring no fine-tuning) as an automated evaluator to identify and prioritize promising sub-trajectories from the agent's experiences. Across scenarios, including game-playing and robotics, spanning both discrete and continuous domains, agents trained with our proposed prioritization method achieve 11-52% higher average success rates and improve sample efficiency by 19-45% compared to previous approaches. https://esharony.me/projects/vlm-rb/
关键词
相关论文
The Organization of Behavior
D. O. Hebb
2005
Fractional Brownian Motions, Fractional Noises and Applications
Benoît B. Mandelbrot, John W. Van Ness
1968
Review of deep learning: concepts, CNN architectures, challenges, applications, future directions
Laith Alzubaidi, Jinglan Zhang, Amjad J. Humaidi 等 10 位作者
2021
A guide to deep learning in healthcare
Andre Esteva, Alexandre Robicquet, Bharath Ramsundar 等 10 位作者
2018