Spectrally Distilled Representations Aligned with Instruction-Augmented LLMs for Satellite Imagery

CVPR 2026
Minh Kha Do1 Wei Xiang1 Kang Han1 Di Wu1 Khoa Phan1 Yi-Ping Phoebe Chen1 Gaowen Liu2 Ramana Rao Kompella2
1. La Trobe University
2. Cisco Research

Overview

We propose SATtxt, a spectrum-aware vision-language foundation model (VLFM) for satellite imagery that leverages spectral priors while operating exclusively on RGB inputs at inference:

  • Spectral Representation Distillation (SRD), which transfers multi-spectral priors into an RGB-based representation space, enabling spectrum-aware reasoning without multi-spectral inputs at inference
  • Spectrally Grounded Alignment with Instruction-Augmented LLMs (SGI-LLM), an alignment stage that bridges spectrally distilled visual representations into the space of instruction-augmented LLM embeddings via lightweight projectors, thereby producing spectrally grounded and semantically expressive cross-modal representations
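To make the first stage concrete, here is a minimal PyTorch sketch of spectral representation distillation. It is an illustration under assumptions, not the paper's implementation: the tiny linear encoders, feature dimensions, and the cosine matching objective are all hypothetical stand-ins for the actual (ViT-scale) encoders and distillation loss.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Hypothetical stand-in encoders; the real model uses far larger networks.
class TinyEncoder(nn.Module):
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.net = nn.Linear(in_dim, out_dim)
    def forward(self, x):
        return self.net(x)

rgb_dim, ms_dim, feat_dim = 3 * 16, 13 * 16, 32  # toy flattened patch sizes

rgb_encoder = TinyEncoder(rgb_dim, feat_dim)       # trainable RGB student
vision_projector = nn.Linear(feat_dim, feat_dim)   # trainable projector
ms_teacher = TinyEncoder(ms_dim, feat_dim).eval()  # frozen multi-spectral teacher
for p in ms_teacher.parameters():
    p.requires_grad_(False)

def srd_loss(rgb_patches, ms_patches):
    """Distillation: projected RGB features should match frozen MS teacher features."""
    student = vision_projector(rgb_encoder(rgb_patches))
    with torch.no_grad():
        teacher = ms_teacher(ms_patches)
    # Cosine matching here; an MSE feature-reconstruction loss is another common choice.
    return 1 - F.cosine_similarity(student, teacher, dim=-1).mean()

rgb = torch.randn(8, rgb_dim)  # RGB bands only
ms = torch.randn(8, ms_dim)    # all Sentinel-2 bands
loss = srd_loss(rgb, ms)
loss.backward()  # gradients flow only into the RGB encoder and projector
```

After this stage, the multi-spectral teacher can be discarded: the RGB encoder plus projector carries the distilled spectral prior, which is what allows RGB-only inference.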

How is SATtxt pre-trained?

Pre-training dataset

SATtxt is pre-trained on SSL4EO-S12 v1.1 with captions obtained from LLaMA3-SSL4EO-S12-v1.1-captions, a large-scale global dataset comprising approximately 1 million images from the Sentinel-2 satellite. Figure 1 illustrates its worldwide geographic coverage.

Figure 1: Geographic coverage of SATtxt's pre-training dataset, SSL4EO-S12 v1.1 (training samples in green, validation samples in magenta). Image adapted from the SSL4EO-S12 v1.1 publication.

Pre-training workflow

Illustration of SATtxt's two-stage pre-training pipeline. Stage 1 (SRD, dashed lines): a vision projector is trained to reconstruct multi-spectral representations from an RGB encoder by distilling a frozen MS teacher, transferring spectral knowledge so that MS inputs are unnecessary in Stage 2 and at inference. Stage 2 (SGI-LLM, solid lines): with the vision and text encoders frozen, distilled vision features are aligned with LLM-based text embeddings using instruction-augmented prompts, enhancing cross-modal representations while preserving pretrained capabilities.
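The Stage 2 alignment can be sketched as a CLIP-style symmetric contrastive objective between projected vision features and frozen LLM caption embeddings. This is a hedged illustration: the dimensions, the two linear projectors, and the InfoNCE loss below are assumptions standing in for the paper's lightweight projectors and actual alignment objective.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Hypothetical dimensions; in Stage 2 the encoders and LLM are frozen,
# so only the two lightweight projectors receive gradients.
vis_dim, llm_dim, shared_dim, batch = 32, 64, 16, 8

vision_proj = nn.Linear(vis_dim, shared_dim)  # trainable
text_proj = nn.Linear(llm_dim, shared_dim)    # trainable

def alignment_loss(vis_feats, llm_embeds, temperature=0.07):
    """Symmetric InfoNCE aligning image features with LLM text embeddings."""
    v = F.normalize(vision_proj(vis_feats), dim=-1)
    t = F.normalize(text_proj(llm_embeds), dim=-1)
    logits = v @ t.T / temperature          # pairwise similarity matrix
    labels = torch.arange(v.size(0))        # matched pairs lie on the diagonal
    return (F.cross_entropy(logits, labels)
            + F.cross_entropy(logits.T, labels)) / 2

# Frozen upstream outputs are simulated with random tensors here.
vis_feats = torch.randn(batch, vis_dim)   # distilled RGB features (frozen encoder)
llm_embeds = torch.randn(batch, llm_dim)  # instruction-augmented caption embeddings
loss = alignment_loss(vis_feats, llm_embeds)
```

Because the upstream encoders stay frozen, only the small projectors are updated, which is what lets this stage add cross-modal grounding without eroding the pretrained representations.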

Related Work

Clive Tinashe Marimo et al. Beyond the Visible: Multispectral Vision-Language Learning for Earth Observation. ECML PKDD 2025

Johannes Jakubik et al. TerraMind: Large-Scale Generative Multimodality for Earth Observation. ICCV 2025

Danfeng Hong et al. SpectralGPT: Spectral Remote Sensing Foundation Model. IEEE TPAMI 2024

BibTeX


@inproceedings{sattxt2026,
  title={Spectrally Distilled Representations Aligned with Instruction-Augmented LLMs for Satellite Imagery},
  author={Minh Kha Do and Wei Xiang and Kang Han and Di Wu and Khoa Phan and Yi-Ping Phoebe Chen and Gaowen Liu and Ramana Rao Kompella},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2026},
}