PaliGemma-Pi0.5 · LIBERO (all 4 suites, joint training, openpi-aligned)
Vision-Language-Action (VLA) checkpoint released with the AlphaBrain framework. Trained jointly on all four LIBERO suites — Goal, Spatial, Object, and Long — for direct evaluation across the full LIBERO benchmark without retraining.
This is the openpi-aligned Pi0.5 training recipe: PaliGemma 3B VLM + a 300 M flow-matching Gemma action expert, with the exact architecture layout and MEAN_STD action normalization used by Physical Intelligence's openpi. Released as the step-15 000 checkpoint of a run with a 60 000-step budget; it is the best-performing openpi-aligned checkpoint in AlphaBrain's PaliGemma-Pi0.5 family on LIBERO.
Overview
| Property | Value |
|---|---|
| Architecture | PaliGemmaPi0 (PaliGemma 3B + Pi0.5 flow-matching action expert) |
| Base VLM | google/paligemma-3b-pt-224 (bundled /datasets/peligemma equivalent) |
| Action expert | Gemma 300 M · depth=18, width=1024, mlp=4096, bfloat16 |
| Action head | Flow matching, action_dim=7, horizon=10, 10 inference steps |
| Training data | LIBERO · all 4 suites (Goal + Spatial + Object + Long) · dataset_mix=libero_all |
| Training type | Supervised fine-tuning (single run; not continual learning) |
| Normalization | MEAN_STD (action & state mean/std hardcoded in framework_config.yaml) |
| Attention | SDPA (no flash-attn pinning) |
| Optimiser | AdamW · lr = 5e-5 · cosine-with-min-lr · 10 000 warmup steps |
| Step budget | 15 000 (this release) / 60 000 planned |
| Hardware / batch | 4 × A800 80 GB · per_device_batch = 32 · grad_accum = 2 · effective 256 |
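The action-head row above (flow matching, action_dim=7, horizon=10, 10 inference steps) can be pictured as fixed-step Euler integration from Gaussian noise to an action chunk. The sketch below is illustrative only and is not AlphaBrain's implementation: `toy_velocity` is a hypothetical stand-in for the action expert (the real model conditions on images, language, and robot state), and the time convention (t = 1 is noise, t = 0 is the action chunk) is an assumption.

```python
import random

HORIZON, ACTION_DIM, NUM_STEPS = 10, 7, 10  # values from the Overview table


def toy_velocity(x, t, target):
    """Hypothetical stand-in for the action expert.

    Returns the exact velocity of the straight-line path
    x_t = t * noise + (1 - t) * target, which equals (x_t - target) / t.
    """
    return [[(xi - ai) / t for xi, ai in zip(xr, ar)] for xr, ar in zip(x, target)]


def sample_actions(velocity_fn, target, num_steps=NUM_STEPS, seed=0):
    """Integrate from noise (t = 1) towards actions (t = 0) with Euler steps."""
    rng = random.Random(seed)
    x = [[rng.gauss(0.0, 1.0) for _ in range(ACTION_DIM)] for _ in range(HORIZON)]
    dt = -1.0 / num_steps
    for k in range(num_steps):
        t = 1.0 - k / num_steps  # current time, never reaches 0 inside the loop
        v = velocity_fn(x, t, target)
        x = [[xi + dt * vi for xi, vi in zip(xr, vr)] for xr, vr in zip(x, v)]
    return x


# Made-up target chunk, only there to give the toy field something to converge to.
target = [[0.1 * d for d in range(ACTION_DIM)] for _ in range(HORIZON)]
chunk = sample_actions(toy_velocity, target)
```

With a learned velocity field the result is only approximate, which is why the number of inference steps is a tunable trade-off between latency and accuracy.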
Results
Evaluated on all 4 LIBERO suites, 50 rollouts per task × 10 tasks per suite = 500 episodes per suite.
| Suite | Success Rate |
|---|---|
| LIBERO-Goal | 96.7 % |
| LIBERO-Spatial | 100.0 % |
| LIBERO-Object | 100.0 % |
| LIBERO-10 (Long) | 96.0 % |
| Avg (4-suite) | 98.2 % |
These numbers are comparable to openpi's official Pi0.5 results (98.0 / 98.8 / 98.2 / 92.4, avg 96.85): slightly lower on Goal and higher on the other three suites. They establish a strong AlphaBrain Pi0.5 baseline at half the training steps (15 k vs 30 k).
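The averages and episode counts quoted in this section can be rechecked with a few lines of arithmetic. The sketch below uses only the figures stated above; it assumes the four openpi numbers are listed in the same suite order as the table (Goal, Spatial, Object, Long).

```python
# Success rates in percent, copied from the Results section.
ours = {"Goal": 96.7, "Spatial": 100.0, "Object": 100.0, "Long": 96.0}
# Assumption: the openpi figures follow the same suite order as the table.
openpi = {"Goal": 98.0, "Spatial": 98.8, "Object": 98.2, "Long": 92.4}


def mean(values):
    values = list(values)
    return sum(values) / len(values)


ours_avg = mean(ours.values())
openpi_avg = mean(openpi.values())
per_suite_delta = {suite: ours[suite] - openpi[suite] for suite in ours}

# Evaluation protocol: 50 rollouts per task, 10 tasks per suite, 4 suites.
episodes_per_suite = 50 * 10
total_episodes = episodes_per_suite * len(ours)

print(f"ours: {ours_avg:.1f}  openpi: {openpi_avg:.2f}  episodes: {total_episodes}")
```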
Files
├── README.md model card
├── framework_config.yaml AlphaBrain framework configuration (contains MEAN_STD norm stats inline)
├── model.safetensors full VLA weights (~8.8 GB, includes VLM + action expert + flow-matching head)
├── resume_meta.json training metadata (completed_steps=15000, effective_bs=256)
└── paligemma_pretrained/ PaliGemma tokenizer + preprocessor configs
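After downloading, a quick sanity check of resume_meta.json against the figures in this card can catch a mismatched or partial download. This is a minimal sketch: it assumes `completed_steps` and `effective_bs` are top-level JSON keys (inferred from the listing above, not confirmed), and `check_resume_meta` is a hypothetical helper, not part of AlphaBrain.

```python
import json
import tempfile
from pathlib import Path


def check_resume_meta(ckpt_dir, num_gpus=4, per_device_batch=32,
                      grad_accum=2, expected_steps=15000):
    """Compare resume_meta.json with the values stated in the model card."""
    meta = json.loads((Path(ckpt_dir) / "resume_meta.json").read_text())
    # Effective batch = GPUs x per-device batch x gradient-accumulation steps.
    expected_bs = num_gpus * per_device_batch * grad_accum
    return {
        "expected_bs": expected_bs,
        "steps_ok": meta.get("completed_steps") == expected_steps,
        "batch_ok": meta.get("effective_bs") == expected_bs,
    }


# Demo on a stand-in file; point ckpt_dir at the real download instead.
with tempfile.TemporaryDirectory() as tmp:
    sample = {"completed_steps": 15000, "effective_bs": 256}  # assumed schema
    (Path(tmp) / "resume_meta.json").write_text(json.dumps(sample))
    report = check_resume_meta(tmp)
```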
Usage
git clone https://github.com/AlphaBrainGroup/AlphaBrain.git
cd AlphaBrain
pip install -e .
export PALIGEMMA_TOKENIZER_PATH=/path/to/paligemma_pretrained # or bundled dir
export PI05_PRETRAINED_PATH=/path/to/this/download # if you want to fine-tune from here
huggingface-cli download AlphaBrainGroup/paligemma-pi0-libero-all4suite \
--local-dir ./paligemma_pi0_libero_all
python deployment/model_server/server_policy.py \
--ckpt_path ./paligemma_pi0_libero_all --port 10093 --use_bf16
For evaluation on any of the 4 LIBERO suites, see the LIBERO eval pipeline.
Reproduction
bash scripts/run_base_vla/train.sh paligemma_pi0_openpi_aligned_v3
Expect multi-day training on 4 × A800 80 GB for the full 60 000-step
schedule. The shipped framework_config.yaml captures the exact
framework configuration used for this 15 000-step checkpoint.
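To see where the released checkpoint sits on the learning-rate curve, the schedule can be sketched as linear warmup followed by cosine decay to a floor. This is an illustration under stated assumptions, not the framework's scheduler: the floor (`min_lr_ratio`) is not given in this card and is a placeholder, and the decay is assumed to span the full 60 000-step budget.

```python
import math


def lr_at(step, peak_lr=5e-5, warmup_steps=10_000,
          total_steps=60_000, min_lr_ratio=0.1):
    """Linear warmup to peak_lr, then cosine decay down to peak_lr * min_lr_ratio.

    min_lr_ratio is a hypothetical value; check framework_config.yaml for the real one.
    """
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    progress = min(1.0, (step - warmup_steps) / (total_steps - warmup_steps))
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    min_lr = peak_lr * min_lr_ratio
    return min_lr + (peak_lr - min_lr) * cosine


for step in (0, 5_000, 10_000, 15_000, 60_000):
    print(f"step {step:>6}: lr = {lr_at(step):.3e}")
```

Under these assumptions the step-15 000 checkpoint is taken shortly after warmup ends, while the learning rate is still close to its peak.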
Notes
- Joint-training baseline, not continual learning. For CL releases of AlphaBrain models see the qwengr00t-cl-* / neurovla-cl-* / paligemma-pi0-cl-* repos.
- Attention: SDPA, chosen so the checkpoint loads without a pinned flash-attn wheel. Users can override to flash_attention_2 via --framework.paligemma.attn_implementation=flash_attention_2 if available.
- MEAN_STD norm is baked into framework_config.yaml. A separate dataset_statistics.json is not required for inference.
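MEAN_STD normalization, as referenced in the notes above, amounts to z-scoring each action dimension before training and inverting that transform on the model's output. The sketch below shows the idea only: the statistics are made-up placeholders (the real ones live in framework_config.yaml), and the epsilon guard is an assumption about how division by a near-zero std is avoided.

```python
EPS = 1e-6  # assumed guard against near-zero std

# Placeholder statistics for a 7-dim action; NOT the values shipped with the checkpoint.
ACTION_MEAN = [0.0, 0.1, -0.1, 0.0, 0.0, 0.0, 0.5]
ACTION_STD = [0.3, 0.3, 0.3, 0.1, 0.1, 0.1, 0.5]


def normalize(action, mean=ACTION_MEAN, std=ACTION_STD):
    """Map raw actions to zero-mean, unit-variance space (training targets)."""
    return [(a - m) / (s + EPS) for a, m, s in zip(action, mean, std)]


def unnormalize(z, mean=ACTION_MEAN, std=ACTION_STD):
    """Map model outputs back to raw action units (sent to the robot)."""
    return [zi * (s + EPS) + m for zi, m, s in zip(z, mean, std)]


raw = [0.2, -0.1, 0.05, 0.0, 0.01, -0.02, 1.0]
restored = unnormalize(normalize(raw))
```

Because the inverse uses the same statistics as the forward transform, inference needs only the values stored with the checkpoint.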
License
MIT — see the parent repository.
Citation
@misc{alphabrain2026,
title = {AlphaBrain: A Modular Open-Source Framework for Embodied Intelligence Research},
author = {AlphaBrain Team},
year = {2026},
url = {https://github.com/AlphaBrainGroup/AlphaBrain}
}