Chapter 24. Using Podman in an HPC environment
You can use Podman with Open MPI (Message Passing Interface) to run containers in a High Performance Computing (HPC) environment.
24.1. Using Podman with MPI
The example is based on the ring.c program taken from Open MPI. In this example, a value is passed around all processes in a ring-like fashion. Each time the message passes rank 0, the value is decremented. When a process receives the 0 message, it passes it on to the next process and then quits. Because the 0 itself makes a full pass around the ring, every process receives the 0 message and can quit normally.
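The following is a minimal sketch of that ring logic, printing the same status messages that appear in the output later in this procedure. It is a simplified illustration, not the exact ring.c source from Open MPI:

#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int rank, size, next, prev, message, tag = 201;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    printf("Rank %d has cleared MPI_Init\n", rank);

    next = (rank + 1) % size;          /* neighbor this rank sends to */
    prev = (rank + size - 1) % size;   /* neighbor this rank receives from */

    if (rank == 0) {                   /* rank 0 injects the initial value */
        message = 10;
        MPI_Send(&message, 1, MPI_INT, next, tag, MPI_COMM_WORLD);
    }

    while (1) {
        MPI_Recv(&message, 1, MPI_INT, prev, tag, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
        if (rank == 0)
            message--;                 /* decrement on each pass of rank 0 */
        MPI_Send(&message, 1, MPI_INT, next, tag, MPI_COMM_WORLD);
        if (message == 0)              /* pass the 0 on, then quit */
            break;
    }

    if (rank == 0)                     /* drain the final 0 coming back */
        MPI_Recv(&message, 1, MPI_INT, prev, tag, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
    printf("Rank %d has completed ring\n", rank);

    MPI_Barrier(MPI_COMM_WORLD);
    printf("Rank %d has completed MPI_Barrier\n", rank);
    MPI_Finalize();
    return 0;
}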
Prerequisites
- The container-tools module is installed.
Procedure
Install Open MPI:
# yum install openmpi
To activate the environment modules, type:
$ . /etc/profile.d/modules.sh
Load the mpi/openmpi-x86_64 module:
$ module load mpi/openmpi-x86_64
Optionally, to load the mpi/openmpi-x86_64 module automatically, add this line to the .bashrc file:
$ echo "module load mpi/openmpi-x86_64" >> .bashrc
To combine mpirun and podman, create a container image with the following definition:
$ cat Containerfile
FROM registry.access.redhat.com/ubi8/ubi

RUN yum -y install openmpi-devel wget && \
    yum clean all

RUN wget https://raw.githubusercontent.com/open-mpi/ompi/master/test/simple/ring.c && \
    /usr/lib64/openmpi/bin/mpicc ring.c -o /home/ring && \
    rm -f ring.c
Build the container image:
$ podman build --tag=mpi-ring .
Start the container. On a system with 4 CPUs, this command starts 4 containers:
$ mpirun \
    --mca orte_tmpdir_base /tmp/podman-mpirun \
    podman run --env-host \
    -v /tmp/podman-mpirun:/tmp/podman-mpirun \
    --userns=keep-id \
    --net=host --pid=host --ipc=host \
    mpi-ring /home/ring
Rank 2 has cleared MPI_Init
Rank 2 has completed ring
Rank 2 has completed MPI_Barrier
Rank 3 has cleared MPI_Init
Rank 3 has completed ring
Rank 3 has completed MPI_Barrier
Rank 1 has cleared MPI_Init
Rank 1 has completed ring
Rank 1 has completed MPI_Barrier
Rank 0 has cleared MPI_Init
Rank 0 has completed ring
Rank 0 has completed MPI_Barrier
As a result, mpirun starts 4 Podman containers, and each container runs one instance of the ring binary. All 4 processes communicate with each other over MPI.
24.2. The mpirun options
The following mpirun options are used to start the container:
- The --mca orte_tmpdir_base /tmp/podman-mpirun option tells Open MPI to create all of its temporary files in /tmp/podman-mpirun and not in /tmp. If more than one node is used, this directory is named differently on each node, which would require mounting the complete /tmp directory into the container; that is more complicated.
The mpirun command specifies the command to start, in this case the podman command. The following podman options are used to start the container:
- The run command runs a container.
- The --env-host option copies all environment variables from the host into the container (see the sketch after this list).
- The -v /tmp/podman-mpirun:/tmp/podman-mpirun option tells Podman to mount the directory where Open MPI creates its temporary directories and files, so that it is available in the container.
- The --userns=keep-id option ensures the same user ID mapping inside and outside the container.
- The --net=host --pid=host --ipc=host options set the same network, PID, and IPC namespaces as on the host.
- mpi-ring is the name of the container image.
- /home/ring is the MPI program in the container.
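The --env-host option matters because mpirun passes wiring information to each launched process through environment variables, and the MPI library inside the container needs to see them. As an illustration only (not part of the example above), a process can inspect Open MPI's usual OMPI_* variables; the exact set of variables depends on the Open MPI version:

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    /* mpirun exports control variables such as OMPI_COMM_WORLD_RANK;
       --env-host forwards them into the container so that the MPI
       library there can bootstrap. */
    const char *rank = getenv("OMPI_COMM_WORLD_RANK");
    const char *size = getenv("OMPI_COMM_WORLD_SIZE");

    printf("rank %s of %s\n", rank ? rank : "(unset)",
           size ? size : "(unset)");
    return 0;
}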