VP-VLA-Robocasa-Tabletop

This repository contains the VP-VLA policy checkpoint trained for RoboCasa tabletop manipulation.

VP-VLA uses visual prompts as an interface for vision-language-action models: a high-level planner converts language instructions into visual prompts, and the policy follows those prompts to produce robot actions.
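The two-stage flow described above can be illustrated with a minimal sketch. Everything here is hypothetical and is not the VP-VLA API: the `VisualPrompt` fields, the `toy_planner` and `toy_policy` stand-ins, and the `step` helper are assumptions chosen only to show how a planner output can serve as the interface to a policy. The real prompt format and model calls are defined in the VP-VLA codebase.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

# Hypothetical image type: a 2D grid of pixel intensities.
Image = List[List[int]]


@dataclass
class VisualPrompt:
    """Hypothetical visual prompt: a marked pixel plus a text label."""
    x: int
    y: int
    label: str


# Hypothetical interfaces for the two stages.
Planner = Callable[[str, Image], VisualPrompt]
Policy = Callable[[Image, VisualPrompt], Sequence[float]]


def toy_planner(instruction: str, image: Image) -> VisualPrompt:
    """Stand-in planner: marks the image centre, labelled with the last word."""
    height, width = len(image), len(image[0])
    return VisualPrompt(x=width // 2, y=height // 2, label=instruction.split()[-1])


def toy_policy(image: Image, prompt: VisualPrompt) -> Sequence[float]:
    """Stand-in policy: returns the prompt location normalised to [0, 1]."""
    height, width = len(image), len(image[0])
    return [prompt.x / width, prompt.y / height]


def step(instruction: str, image: Image, planner: Planner, policy: Policy) -> Sequence[float]:
    """One control step: language -> visual prompt -> action."""
    prompt = planner(instruction, image)
    return policy(image, prompt)
```

The point of the split is that the policy never sees the raw instruction; it only consumes the image and the prompt produced by the planner.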

Usage

Use this checkpoint with the released VP-VLA codebase. Follow the installation and evaluation instructions in the VP-VLA repository, then pass this checkpoint path to the RoboCasa tabletop evaluation script.
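One possible way to obtain a local checkpoint path is to fetch the repository with the `huggingface_hub` library. This is a sketch, not part of the VP-VLA instructions: it assumes the repository id is `Vincent2311/VP-VLA-Robocasa-Tabletop`, as listed on the model page, and the `download_checkpoint` helper is a hypothetical name.

```python
from typing import Optional

# Assumed repository id, taken from the model page.
REPO_ID = "Vincent2311/VP-VLA-Robocasa-Tabletop"


def download_checkpoint(local_dir: Optional[str] = None) -> str:
    """Download all files of the model repository and return the local folder path.

    Hypothetical helper; requires `pip install huggingface_hub` and network access.
    """
    # Imported lazily so the module can be loaded without the dependency installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=REPO_ID, local_dir=local_dir)
```

The returned folder is the path to hand to the evaluation script. The script name and its arguments are defined in the VP-VLA repository and are not reproduced here.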

Citation

If you use this model, please cite the VP-VLA paper:

https://huggingface.co/papers/2603.22003
