Cosmos-Surg-dVRK: World Foundation Model-based Automated Online Evaluation of Surgical Robot Policy Learning
Lukas Zbinden, Nigel Nelson, Juo-Tung Chen, Xinhao Chen, Ji Woong Kim, Mahdi Azizian, Axel Krieger, Sean Huver
- Publication year
- 2025
- Access
- Open access
Abstract
The rise of surgical robots and vision-language-action models has accelerated the development of autonomous surgical policies and efficient assessment strategies. However, evaluating these policies directly on physical robotic platforms such as the da Vinci Research Kit (dVRK) remains hindered by high costs, time demands, reproducibility challenges, and variability in execution. World foundation models (WFM) for physical AI offer a transformative approach to simulate complex real-world surgical tasks, such as soft tissue deformation, with high fidelity. This work introduces Cosmos-Surg-dVRK, a surgical finetune of the Cosmos WFM, which, together with a trained video classifier, enables fully automated online evaluation and benchmarking of surgical policies. We evaluate Cosmos-Surg-dVRK using two distinct surgical datasets. On tabletop suture pad tasks, the automated pipeline achieves strong correlation between online rollouts in Cosmos-Surg-dVRK and policy outcomes on the real dVRK Si platform, as well as good agreement between human labelers and the V-JEPA 2-derived video classifier. Additionally, preliminary experiments with ex-vivo porcine cholecystectomy tasks in Cosmos-Surg-dVRK demonstrate promising alignment with real-world evaluations, highlighting the platform's potential for more complex surgical procedures.
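The abstract reports two kinds of agreement: correlation between policy success in world-model rollouts and on the real robot, and agreement between human labelers and the video classifier. The abstract does not name the statistics used, so the sketch below assumes Pearson correlation and Cohen's kappa as plausible choices. The function names and all numbers are hypothetical placeholders for illustration, not results from the paper.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences of numbers."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa: chance-corrected agreement between two raters."""
    n = len(labels_a)
    if n != len(labels_b) or n == 0:
        raise ValueError("need two non-empty label lists of equal length")
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    categories = set(labels_a) | set(labels_b)
    expected = sum(
        (labels_a.count(c) / n) * (labels_b.count(c) / n) for c in categories
    )
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


# Placeholder success rates per policy checkpoint (NOT data from the paper).
sim_success = [0.2, 0.5, 0.7, 0.9]   # rollouts inside the world model
real_success = [0.1, 0.5, 0.6, 0.9]  # rollouts on the physical robot

# Placeholder per-video success labels (1 = success, 0 = failure).
human_labels = [1, 0, 1, 1, 0, 1]
classifier_labels = [1, 0, 1, 0, 0, 1]

print("sim vs. real correlation:", pearson_r(sim_success, real_success))
print("human vs. classifier kappa:", cohen_kappa(human_labels, classifier_labels))
```

A rank correlation such as Spearman's would be a reasonable alternative when only the ordering of policies matters rather than their absolute success rates.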