Inference serving geospatial foundation models


Red Hat AI Inference Server 3.3

Inference serving geospatial foundation models with Red Hat AI Inference Server

Red Hat AI Documentation Team

Abstract

Learn how to inference serve IBM and NASA Prithvi geospatial foundation models using Red Hat AI Inference Server with TerraTorch for segmentation tasks such as flood detection and burn scar mapping.

Preface

Serve IBM and NASA Prithvi geospatial foundation models using Red Hat AI Inference Server and TerraTorch for satellite imagery analysis.

Chapter 1. About geospatial inference

Geospatial foundation models use the Vision Transformer (ViT) architecture to analyze satellite imagery and remote sensing data for applications such as environmental monitoring, land use classification, and climate analysis. Prithvi models are developed in collaboration between IBM and NASA.

IBM and NASA Prithvi geospatial foundation models are pre-trained on large datasets of satellite and aerial imagery. These models learn general representations of Earth observation data that can be fine-tuned for specific tasks.

Prithvi geospatial foundation models use a Vision Transformer (ViT) architecture that adapts the transformer model, originally designed for natural language processing, to process image data. ViT divides images into fixed-size patches, which are then processed as sequences similar to tokens in text.

For geospatial applications, ViT models can process multi-spectral satellite imagery with multiple input bands, enabling analysis beyond standard RGB imagery.
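As an illustration of the patch arithmetic, the following sketch computes the token sequence length and flattened patch vector size for a square multi-spectral image. The 224x224 image size, 16x16 patch size, and 6 spectral bands are example values for illustration, not the exact Prithvi configuration:

```python
def vit_patch_stats(image_size: int, patch_size: int, num_bands: int):
    """Return (sequence_length, patch_vector_length) for a square image.

    A ViT splits the image into non-overlapping patch_size x patch_size
    patches; each patch is flattened across all spectral bands into one
    vector, and the patches form the token sequence fed to the transformer.
    """
    patches_per_side = image_size // patch_size
    sequence_length = patches_per_side ** 2                    # tokens per image
    patch_vector_length = patch_size * patch_size * num_bands  # flattened patch
    return sequence_length, patch_vector_length

# Six spectral bands instead of three RGB channels:
print(vit_patch_stats(224, 16, 6))  # (196, 1536)
```

With six input bands each flattened patch vector is twice as long as for RGB imagery, while the token sequence length stays the same.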

You can fine-tune geospatial foundation models using TerraTorch, an open-source library for fine-tuning and inference of geospatial foundation models.

You can find out more about the Prithvi models at huggingface.co/ibm-nasa-geospatial.

Complete the following procedure to serve IBM and NASA Prithvi geospatial foundation models using Red Hat AI Inference Server and TerraTorch for satellite imagery analysis.

Prerequisites

  • You have installed Podman or Docker.
  • You are logged in as a user with sudo access.
  • You have access to registry.redhat.io and have logged in.
  • You have a Hugging Face account and have generated a Hugging Face access token.
  • You have access to a Linux server with data center grade NVIDIA AI accelerators installed.
  • You have satellite imagery data in a supported format such as GeoTIFF.

Procedure

  1. Open a terminal on your server host, and log in to registry.redhat.io:

    $ podman login registry.redhat.io
  2. Pull the AI Inference Server NVIDIA CUDA container image:

    $ podman pull registry.redhat.io/rhaiis/vllm-cuda-rhel9:3.3.0
  3. If your system has SELinux enabled, configure SELinux to allow device access:

    $ sudo setsebool -P container_use_devices 1
  4. Create a cache directory on the host and adjust its permissions so that the container can use it. You mount this directory into the container when you start it.

    $ mkdir -p rhaiis-cache
    $ chmod g+rwX rhaiis-cache
  5. Add your Hugging Face token to the private.env file as the HF_TOKEN variable, then source the private.env file.

    $ echo "export HF_TOKEN=<your_HF_token>" > private.env
    $ source private.env
  6. Start the AI Inference Server container image.

    1. For NVIDIA CUDA accelerators, if the host system has multiple GPUs connected with NVSwitch, start NVIDIA Fabric Manager. To check whether your system uses NVSwitch, verify that device files are present in /proc/driver/nvidia-nvswitch/devices/, and then start NVIDIA Fabric Manager. Starting NVIDIA Fabric Manager requires root privileges.

      $ ls /proc/driver/nvidia-nvswitch/devices/

      If your system uses NVSwitch, the command lists the NVSwitch device entries.

      $ systemctl start nvidia-fabricmanager
      Important

      NVIDIA Fabric Manager is only required on systems with multiple GPUs that use NVSwitch. For more information, see NVIDIA Server Architectures.

    2. Check that the AI Inference Server container can access NVIDIA GPUs on the host by running the following command:

      $ podman run --rm -it \
      --security-opt=label=disable \
      --device nvidia.com/gpu=all \
      nvcr.io/nvidia/cuda:12.4.1-base-ubi9 \
      nvidia-smi

      The command output lists all AI accelerators that are available to the container.

    3. Start the container with the TerraTorch backend and the Prithvi geospatial model.

      $ podman run --rm -it \
        --device nvidia.com/gpu=all \
        --security-opt=label=disable \
        --shm-size=4g \
        -p 8000:8000 \
        --userns=keep-id:uid=1001 \
        --env "HUGGING_FACE_HUB_TOKEN=$HF_TOKEN" \
        --env "HF_HUB_OFFLINE=0" \
        -v ./rhaiis-cache:/opt/app-root/src/.cache:Z \
        registry.redhat.io/rhaiis/vllm-cuda-rhel9:3.3.0 \
        --model ibm-nasa-geospatial/Prithvi-EO-2.0-300M-TL-Sen1Floods11 \
        --skip-tokenizer-init \
        --enforce-eager \
        --io-processor-plugin terratorch_segmentation \
        --enable-mm-embeds

      For detailed information about TerraTorch server arguments and configuration options, see TerraTorch configuration options.

  7. In a separate terminal, send an inference request with your geospatial data.

    $ curl -X POST http://localhost:8000/pooling \
      -H "Content-Type: application/json" \
      -d '{
        "model": "ibm-nasa-geospatial/Prithvi-EO-2.0-300M-TL-Sen1Floods11",
        "data": {
          "data": "https://<your_sample_geospatial_image>.tiff",
          "data_format": "url",
          "image_format": "tiff",
          "out_data_format": "b64_json"
        },
        "priority": 0
      }'

    Example output

    {
      "request_id": "pool-98f71fcf667df37b",
      "created_at": 1770725528,
      "data": {
        "data_format": "b64_json",
        "data": "<BASE64_ENCODED_TIFF_DATA>",
        "request_id": "pool-98f71fcf667df37b"
      }
    }

    The model returns a JSON response containing base64-encoded prediction data. Decode the data.data field to retrieve the output GeoTIFF file containing segmentation results.
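For example, the response can be decoded with the Python standard library. This is a minimal sketch: the response_body string is a truncated stand-in for a real /pooling response ("SGVsbG8=" is placeholder data, not real GeoTIFF bytes), and segmentation.tiff is an arbitrary output filename:

```python
import base64
import json

# Truncated stand-in for a /pooling response body; the "data" value
# here is placeholder base64, not real GeoTIFF content.
response_body = '''{
  "request_id": "pool-98f71fcf667df37b",
  "data": {
    "data_format": "b64_json",
    "data": "SGVsbG8=",
    "request_id": "pool-98f71fcf667df37b"
  }
}'''

payload = json.loads(response_body)

# The nested data.data field holds the base64-encoded output GeoTIFF.
tiff_bytes = base64.b64decode(payload["data"]["data"])

with open("segmentation.tiff", "wb") as f:
    f.write(tiff_bytes)
```

With a real response, segmentation.tiff contains the segmentation mask and can be opened with any GeoTIFF-capable tool.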

Use the following Red Hat AI Inference Server arguments when you start AI Inference Server with the TerraTorch backend for geospatial model serving.

Table 3.1. Required Red Hat AI Inference Server server arguments for TerraTorch

--skip-tokenizer-init

Skips tokenizer initialization. Vision models do not require a tokenizer.

--enforce-eager

Disables CUDA graph optimization for compatibility with geospatial model architectures.

--io-processor-plugin terratorch_segmentation

Specifies the I/O processor plugin for segmentation tasks.

--enable-mm-embeds

Enables multimodal embeddings for processing geospatial imagery.

Geospatial model serving with TerraTorch exposes the /pooling POST API endpoint for geospatial imagery inference requests.

Example request payload

{
  "model": "ibm-nasa-geospatial/Prithvi-EO-2.0-300M-TL-Sen1Floods11",
  "data": {
    "data": "https://huggingface.co/ibm-nasa-geospatial/Prithvi-EO-2.0-300M-TL-Sen1Floods11/resolve/main/examples/India_900498_S2Hand.tif",
    "data_format": "url",
    "image_format": "tiff",
    "out_data_format": "b64_json"
  },
  "priority": 0
}
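The same request can also be sent from Python using only the standard library. This is a minimal sketch, assuming the server started in the procedure above is listening on localhost:8000; the build_pooling_request helper is illustrative, not part of any shipped API:

```python
import json
import urllib.request

def build_pooling_request(model: str, image_url: str) -> dict:
    """Build a /pooling request payload for the terratorch_segmentation plugin."""
    return {
        "model": model,
        "data": {
            "data": image_url,          # URL of the input GeoTIFF
            "data_format": "url",
            "image_format": "tiff",
            "out_data_format": "b64_json",
        },
        "priority": 0,
    }

payload = build_pooling_request(
    "ibm-nasa-geospatial/Prithvi-EO-2.0-300M-TL-Sen1Floods11",
    "https://huggingface.co/ibm-nasa-geospatial/Prithvi-EO-2.0-300M-TL-Sen1Floods11"
    "/resolve/main/examples/India_900498_S2Hand.tif",
)

req = urllib.request.Request(
    "http://localhost:8000/pooling",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)

# Uncomment when the server from the procedure above is running:
# with urllib.request.urlopen(req) as resp:
#     result = json.load(resp)
```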

Legal Notice

Copyright © Red Hat.
Except as otherwise noted below, the text of and illustrations in this documentation are licensed by Red Hat under the Creative Commons Attribution-Share Alike 3.0 Unported license. If you distribute this document or an adaptation of it, you must provide the URL for the original version.
Red Hat, as the licensor of this document, waives the right to enforce, and agrees not to assert, Section 4d of CC-BY-SA to the fullest extent permitted by applicable law.
Red Hat, the Red Hat logo, JBoss, Hibernate, and RHCE are trademarks or registered trademarks of Red Hat, Inc. or its subsidiaries in the United States and other countries.
Linux® is the registered trademark of Linus Torvalds in the United States and other countries.
XFS is a trademark or registered trademark of Hewlett Packard Enterprise Development LP or its subsidiaries in the United States and other countries.
The OpenStack® Word Mark and OpenStack logo are trademarks or registered trademarks of the Linux Foundation, used under license.
All other trademarks are the property of their respective owners.
