We propose RobSense, a robust multi-modal foundation model designed for multi-spectral and Synthetic Aperture Radar (SAR) data.
RobSense is pre-trained on Satlas, a large-scale global dataset comprising approximately 12 million images from the Sentinel-1 and Sentinel-2 satellites. Figure 1 illustrates its worldwide geographic coverage.
Favyen Bastani et al. SatlasPretrain: A large-scale dataset for remote sensing image understanding. ICCV 2023.
Anthony Fuller et al. CROMA: Remote Sensing Representations with Contrastive Radar-Optical Masked Autoencoders. NeurIPS 2023.
Mubashir Noman et al. Rethinking transformers pre-training for multi-spectral satellite imagery. CVPR 2024.
Danfeng Hong et al. SpectralGPT: Spectral remote sensing foundation model. IEEE TPAMI 2024.
@article{robsense2025,
  title={RobSense: A Robust Multi-modal Foundation Model for Remote Sensing with Static, Temporal, and Incomplete Data Adaptability},
  author={Minh Kha Do and Kang Han and Phu Lai and Khoa T. Phan and Wei Xiang},
  journal={2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2025},
}