---
license: apache-2.0
pipeline_tag: image-segmentation
tags:
- pytorch
- self-supervised
- transformers
- multimodal
- remote sensing
library_name: pytorch
datasets:
- allenai/s2-naip
---

We introduce **MAESTRO**, a tailored adaptation of the Masked Autoencoder (MAE) framework that effectively orchestrates the use of multimodal, multitemporal, and multispectral Earth Observation (EO) data. Evaluated on four EO datasets, MAESTRO sets a new state of the art on tasks that strongly rely on multitemporal dynamics, while remaining highly competitive on tasks dominated by a single monotemporal modality.

Our contributions are as follows:

- **Extensive benchmarking of multimodal and multitemporal SSL:** Impact evaluation of various fusion strategies for multimodal and multitemporal SSL.
- **Patch-group-wise normalization:** Novel normalization scheme that normalizes reconstruction targets patch-wise within groups of highly correlated spectral bands.
- **MAESTRO:** Novel adaptation of the MAE that combines optimized fusion strategies with our tailored patch-group-wise normalization.
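To make the patch-group-wise normalization idea concrete, here is a minimal NumPy sketch. It is not the MAESTRO implementation: it assumes that each patch is standardized with a single mean and variance shared by all pixels and all bands of a group, and the function name, the `eps` value, and the band grouping used in the example are hypothetical. The exact formulation is given in the paper and the code repository.

```python
import numpy as np


def patch_group_normalize(image, patch_size, band_groups, eps=1e-6):
    """Standardize each patch separately within each group of bands.

    image: array of shape (C, H, W); H and W must be multiples of patch_size.
    band_groups: list of lists of channel indices (hypothetical grouping);
        together the groups must cover every channel exactly once.
    """
    c, h, w = image.shape
    if h % patch_size or w % patch_size:
        raise ValueError("H and W must be multiples of patch_size")
    if sorted(i for group in band_groups for i in group) != list(range(c)):
        raise ValueError("band_groups must cover every channel exactly once")

    gh, gw = h // patch_size, w // patch_size
    out = np.empty(image.shape, dtype=np.float64)
    for group in band_groups:
        # Split the spatial dimensions into a (gh, gw) grid of patches:
        # shape (G, gh, patch_size, gw, patch_size).
        x = image[group].astype(np.float64)
        x = x.reshape(len(group), gh, patch_size, gw, patch_size)
        # Assumption: one mean/variance per patch, shared across the group's bands.
        mean = x.mean(axis=(0, 2, 4), keepdims=True)
        var = x.var(axis=(0, 2, 4), keepdims=True)
        x = (x - mean) / np.sqrt(var + eps)
        out[group] = x.reshape(len(group), h, w)
    return out


# Toy example: 4 bands, with bands 0-2 treated as one correlated group.
rng = np.random.default_rng(0)
image = rng.normal(loc=5.0, scale=2.0, size=(4, 8, 8))
targets = patch_group_normalize(image, patch_size=4, band_groups=[[0, 1, 2], [3]])
```

In this sketch, bands in the same group keep their relative offsets inside a patch, while unrelated bands are normalized independently.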
💻 **Code repository:** https://github.com/IGNF/MAESTRO
📃 **Paper:** https://arxiv.org/abs/2508.10894
## Pre-training Dataset
## 🔎 Cross-dataset Evaluation

Benchmark results on 4 datasets:

| Model          | Pre-training dataset | TreeSatAI-TS | PASTIS-HD | FLAIR#2  | FLAIR-HUB |
|----------------|----------------------|--------------|-----------|----------|-----------|
| MAESTRO (ours) | FLAIR-HUB            | **79.6**     | **68.0**  | -        | -         |
| MAESTRO (ours) | S2-NAIP urban        | 78.8         | 67.4      | 62.6     | 64.6      |
| DINO-v2        | LVD-142M             | 76.7         | 64.4      | **64.2** | 66.0      |
| DINO-v2 sat.   | Maxar Vivid2         | 76.3         | 64.0      | 63.5     | **66.0**  |
| DOFA           | DOFA MM              | 76.0         | 62.9      | 62.3     | 65.1      |
| CROMA          | SSL4EO               | 70.5         | 65.0      | 39.0     | 44.3      |
| Prithvi-EO-2.0 | HLS                  | 75.6         | 66.2      | 41.8     | 44.9      |
| SatMAE         | fMoW RGB+S           | 76.9         | 66.6      | 42.5     | 45.0      |


## 🚀 Getting Started

First, set up the module with [Poetry](https://python-poetry.org/).

```bash
# 1. Change directory
cd MAESTRO

# 2. Install dependencies with Poetry
poetry install
```

Then, you can start from the following minimal examples.

Intra-dataset MAESTRO on TreeSatAI-TS:

```bash
# pre-train, probe and finetune on TreeSatAI-TS
poetry run python main.py \
  model.model=mae model.model_size=medium \
  opt_pretrain.epochs=100 opt_probe.epochs=10 opt_finetune.epochs=50 \
  datasets.name_dataset=treesatai_ts \
  datasets.root_dir=/path/to/dataset/dir datasets.treesatai_ts.rel_dir=TreeSatAI-TS \
  run.exp_dir=/path/to/experiments/dir run.exp_name=mae-m_treesat
```

Intra-dataset MAESTRO on PASTIS-HD:

```bash
# pre-train, probe and finetune on PASTIS-HD
poetry run python main.py \
  model.model=mae model.model_size=medium \
  opt_pretrain.epochs=100 opt_probe.epochs=10 opt_finetune.epochs=50 \
  datasets.name_dataset=pastis_hd \
  datasets.root_dir=/path/to/dataset/dir datasets.pastis_hd.rel_dir=PASTIS-HD \
  run.exp_dir=/path/to/experiments/dir run.exp_name=mae-m_pastis
```

Intra-dataset MAESTRO on FLAIR-HUB:

```bash
# pre-train, probe and finetune on FLAIR-HUB
poetry run python main.py \
  model.model=mae model.model_size=medium \
  opt_pretrain.epochs=100 opt_probe.epochs=15 opt_finetune.epochs=100 \
  datasets.name_dataset=flair \
  datasets.root_dir=/path/to/dataset/dir datasets.flair.rel_dir=FLAIR-HUB \
  run.exp_dir=/path/to/experiments/dir run.exp_name=mae-m_flair
```

Cross-dataset MAESTRO from S2-NAIP urban to TreeSatAI-TS:

```bash
# pre-train on S2-NAIP urban
poetry run python main.py \
  model.model=mae model.model_size=medium \
  opt_pretrain.epochs=15 opt_probe.epochs=0 opt_finetune.epochs=0 \
  datasets.name_dataset=s2_naip \
  datasets.root_dir=/path/to/dataset/dir datasets.s2_naip.rel_dir=s2-naip-urban \
  run.exp_dir=/path/to/experiments/dir run.exp_name=mae-m_s2-naip && \
# probe and finetune on TreeSatAI-TS
poetry run python main.py \
  model.model=mae model.model_size=medium \
  opt_pretrain.epochs=0 opt_probe.epochs=10 opt_finetune.epochs=50 \
  datasets.name_dataset=treesatai_ts \
  datasets.treesatai_ts.aerial.image_size=240 datasets.treesatai_ts.aerial.patch_size.mae=16 \
  datasets.treesatai_ts.s1_asc.name_embed=s1 datasets.treesatai_ts.s1_des.name_embed=s1 \
  datasets.root_dir=/path/to/dataset/dir datasets.treesatai_ts.rel_dir=TreeSatAI-TS \
  run.exp_dir=/path/to/experiments/dir run.load_name=mae-m_s2-naip run.exp_name=mae-m_s2-naip-x-treesat
```
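The intra-dataset commands above differ only in a handful of `key=value` overrides, so they can be assembled programmatically when launching several runs. The sketch below is a convenience wrapper, not part of the MAESTRO repository: the helper name `build_maestro_command` and its parameters are hypothetical, and it only emits override keys that appear in the commands above.

```python
import shlex


def build_maestro_command(name_dataset, rel_dir, exp_name, root_dir, exp_dir,
                          pretrain_epochs=100, probe_epochs=10,
                          finetune_epochs=50, model_size="medium",
                          extra_overrides=()):
    """Return the argv list for one `poetry run python main.py ...` call.

    Hypothetical helper: it mirrors the override keys used in the README
    commands and does not validate them against the actual configuration.
    """
    overrides = [
        "model.model=mae",
        f"model.model_size={model_size}",
        f"opt_pretrain.epochs={pretrain_epochs}",
        f"opt_probe.epochs={probe_epochs}",
        f"opt_finetune.epochs={finetune_epochs}",
        f"datasets.name_dataset={name_dataset}",
        f"datasets.root_dir={root_dir}",
        f"datasets.{name_dataset}.rel_dir={rel_dir}",
        f"run.exp_dir={exp_dir}",
        f"run.exp_name={exp_name}",
    ]
    return ["poetry", "run", "python", "main.py", *overrides, *extra_overrides]


# Rebuild the PASTIS-HD command shown above and print it as a shell line.
cmd = build_maestro_command(
    name_dataset="pastis_hd",
    rel_dir="PASTIS-HD",
    exp_name="mae-m_pastis",
    root_dir="/path/to/dataset/dir",
    exp_dir="/path/to/experiments/dir",
)
print(shlex.join(cmd))
```

To actually launch a run, the returned list can be passed to `subprocess.run(cmd, check=True)` from inside the `MAESTRO` directory; dataset-specific settings such as `run.load_name` can be supplied through `extra_overrides`.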
## Reference

If you use this code, please cite:

```bibtex
@article{labatie2025maestro,
  title={MAESTRO: Masked AutoEncoders for Multimodal, Multitemporal, and Multispectral Earth Observation Data},
  author={Labatie, Antoine and Vaccaro, Michael and Lardiere, Nina and Garioud, Anatol and Gonthier, Nicolas},
  journal={arXiv preprint arXiv:2508.10894},
  year={2025}
}
```

## Acknowledgement

The experiments in the paper were conducted using HPC/AI resources from GENCI-IDRIS (allocations A0181013803, A0161013803, and AD010114597R1).