Installing Infrastructure Software
List of Infrastructure Software
The installation node fields are expressed as follows:
- M: Management node
- L: Login node
- C: Compute node
Software Name | Component Name | Version | Service Name | Installation Node | Notes |
---|---|---|---|---|---|
nfs | nfs-utils | 1.3.0 | nfs-server | M | |
 | nfs-kernel-server | 1.3.0 | nfs-server | M | |
 | nfs-client | 1.3.0 | nfs | C,L | |
ntp | ntp | 4.2.6 | ntpd | M | |
slurm | ohpc-slurm-server | 1.3.3 | munge,slurmctld | M | |
 | ohpc-slurm-client | 1.3.3 | munge,slurmd | C,L | |
ganglia | ganglia-gmond-ohpc | 3.7.2 | gmond | M,C,L | |
singularity | singularity-ohpc | 2.4 | | M | |
cuda | cudnn | 7 | | C | Only needs to be installed on the GPU node |
 | cuda | 9.1 | | C | |
mpi | openmpi3-gnu7-ohpc | 3.0.0 | | M | Install at least one of three types of MPI |
 | mpich-gnu7-ohpc | 3.2 | | M | |
 | mvapich2-gnu7-ohpc | 2.2 | | M | |
Set the Local Repository for Management Node
Download the local repository
Configuring the local repository
Upload the package to the management node. Run the commands below to configure the local Lenovo OpenHPC repository:
For CentOS:
$ sudo mkdir -p $ohpc_repo_dir
$ sudo tar xvf Lenovo-OpenHPC-1.3.3.CentOS_7.x86_64.tar -C $ohpc_repo_dir
$ sudo $ohpc_repo_dir/make_repo.sh
For SLES:
$ sudo mkdir -p $ohpc_repo_dir
$ sudo tar xvf Lenovo-OpenHPC-1.3.3.SLES.x86_64.tar -C $ohpc_repo_dir
$ sudo $ohpc_repo_dir/make_repo.sh
$ sudo rpm --import $ohpc_repo_dir/SLE_12/repodata/repomd.xml.key
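Optionally, you can confirm that the local repository was registered on the management node. This is only an illustrative check; the exact repository id depends on the repo file generated by make_repo.sh:
# Look for the Lenovo OpenHPC entry in the repository list (id shown may differ)
$ sudo yum repolist | grep -i openhpc       # CentOS
$ sudo zypper repos | grep -i openhpc       # SLES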
Configuring the Local Repository for Compute and Login Nodes
For CentOS:
$ sudo psh all yum --setopt=\*.skip_if_unavailable=1 -y install yum-utils
$ sudo cp /etc/yum.repos.d/Lenovo.OpenHPC.local.repo /var/tmp
$ sudo sed -i '/^baseurl=/d' /var/tmp/Lenovo.OpenHPC.local.repo
$ sudo sed -i '/^gpgkey=/d' /var/tmp/Lenovo.OpenHPC.local.repo
$ sudo echo "baseurl=http://${sms_name}/${ohpc_repo_dir}/CentOS_7" >> /var/tmp/Lenovo.OpenHPC.local.repo
$ sudo echo "gpgkey=http://${sms_name}/${ohpc_repo_dir}/CentOS_7/repodata/repomd.xml.key" >> /var/tmp/Lenovo.OpenHPC.local.repo
# Distribute repo files
$ sudo xdcp all /var/tmp/Lenovo.OpenHPC.local.repo /etc/yum.repos.d/
$ sudo psh all echo -e %_excludedocs 1 \>\> ~/.rpmmacros
Run the following command to disable yum access to external repositories.
Note
Perform this step according to your actual situation. If the operating system does not have enough packages installed, subsequent installation steps may fail.
$ sudo psh all yum-config-manager --disable CentOS\*
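Optionally, you can confirm which repositories remain enabled on the nodes after this change. This is only a sketch using the psh and xcoll tools already used in this document; repository names depend on your system:
# List the repositories that are still enabled on every node
$ sudo psh all "yum repolist enabled" | xcoll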
For SLES:
$ sudo cp /etc/zypp/repos.d/Lenovo.OpenHPC.local.repo /var/tmp
$ sudo sed -i '/^baseurl=/d' /var/tmp/Lenovo.OpenHPC.local.repo
$ sudo sed -i '/^gpgkey=/d' /var/tmp/Lenovo.OpenHPC.local.repo
$ sudo echo "baseurl=http://${sms_name}/${ohpc_repo_dir}/SLE_12" >> /var/tmp/Lenovo.OpenHPC.local.repo
$ sudo echo "gpgkey=http://${sms_name}/${ohpc_repo_dir}/SLE_12/repodata/repomd.xml.key" >> /var/tmp/Lenovo.OpenHPC.local.repo
# Distribute repo files
$ sudo xdcp all /var/tmp/Lenovo.OpenHPC.local.repo /etc/zypp/repos.d/
$ sudo psh all rpm --import http://${sms_name}/${ohpc_repo_dir}/SLE_12/repodata/repomd.xml.key
$ sudo psh all echo -e %_excludedocs 1 \>\> ~/.rpmmacros
Configuring LiCO Dependencies Repository
For CentOS:
Download the package: https://hpc.lenovo.com/lico/downloads/5.1/lico-dep-5.1.el7.x86_64.tgz
$ sudo mkdir -p $lico_dep_repo_dir
$ sudo tar xvf lico-dep-5.1.el7.x86_64.tgz -C $lico_dep_repo_dir
$ sudo $lico_dep_repo_dir/mklocalrepo.sh
$ sudo cp /etc/yum.repos.d/lico-dep.repo /var/tmp
$ sudo sed -i '/^baseurl=/d' /var/tmp/lico-dep.repo
$ sudo sed -i '/^gpgkey=/d' /var/tmp/lico-dep.repo
$ sudo echo "baseurl=http://${sms_name}/${lico_dep_repo_dir}" >> /var/tmp/lico-dep.repo
$ sudo echo "gpgkey=http://${sms_name}/${lico_dep_repo_dir}/RPM-GPG-KEY-LICO-DEP-EL7" >> /var/tmp/lico-dep.repo
# Distribute configuration
$ sudo xdcp all /var/tmp/lico-dep.repo /etc/yum.repos.d
For SLES:
Download the package: https://hpc.lenovo.com/lico/downloads/lico-dep-5.1.sle12.x86_64.tgz
$ sudo mkdir -p $lico_dep_repo_dir
$ sudo tar xvf lico-dep-5.1.sle12.x86_64.tgz -C $lico_dep_repo_dir
$ sudo $lico_dep_repo_dir/mklocalrepo.sh
$ sudo rpm --import $lico_dep_repo_dir/RPM-GPG-KEY-LICO-DEP-SLE12
$ sudo cp /etc/zypp/repos.d/lico-dep.repo /var/tmp
$ sudo sed -i '/^baseurl=/d' /var/tmp/lico-dep.repo
$ sudo sed -i '/^gpgkey=/d' /var/tmp/lico-dep.repo
$ sudo echo "baseurl=http://${sms_name}/${lico_dep_repo_dir}" >> /var/tmp/lico-dep.repo
$ sudo echo "gpgkey=http://${sms_name}/${lico_dep_repo_dir}/RPM-GPG-KEY-LICO-DEP-SLE12" >> /var/tmp/lico-dep.repo
# Distribute configuration
$ sudo xdcp all /var/tmp/lico-dep.repo /etc/zypp/repos.d
$ sudo psh all rpm --import http://${sms_name}/${lico_dep_repo_dir}/RPM-GPG-KEY-LICO-DEP-SLE12
Installing slurm
For CentOS:
$ sudo yum -y install lenovo-ohpc-base
$ sudo yum -y install ohpc-slurm-server
$ sudo psh all yum -y install ohpc-base-compute ohpc-slurm-client lmod-ohpc
Configuring pam_slurm
Note
The following optional command prevents non-root logins to the compute nodes unless the user logging in already has a slurm job running on that node:
$ sudo psh all echo "\""account required pam_slurm.so"\"" \>\> /etc/pam.d/sshd
For SLES:
$ sudo zypper install lenovo-ohpc-base
$ sudo zypper install ohpc-slurm-server
$ sudo psh all zypper install -y --force-resolution ohpc-base-compute ohpc-slurm-client lmod-ohpc
Configuring pam_slurm
Note
The following optional command prevents non-root logins to the compute nodes unless the user logging in already has a slurm job running on that node:
$ sudo psh all echo "\""account required pam_slurm.so"\"" \>\> /etc/pam.d/sshd
Configuring nfs
For CentOS:
Note
Run the following commands to create the shared directory /opt/ohpc/pub. This directory is necessary. If you have already shared this directory, you can skip this step.
# Management node shares the Lenovo OpenHPC directory
$ sudo yum -y install nfs-utils
$ sudo echo "/opt/ohpc/pub *(ro,no_subtree_check,fsid=11)" >> /etc/exports
$ sudo exportfs -a
# Installing NFS for cluster nodes
$ sudo psh all yum -y install nfs-utils
# Configure shared directory for cluster nodes
$ sudo psh all mkdir -p /opt/ohpc/pub
$ sudo psh all echo "\""${sms_ip}:/opt/ohpc/pub /opt/ohpc/pub nfs nfsvers=3,nodev,noatime 0 0"\"" \>\> /etc/fstab
# Mount shared directory
$ sudo psh all mount /opt/ohpc/pub
Note
Run the following commands to create the user shared directory. This document takes /home as an example; you can choose another directory.
# Management node shares /home and the Lenovo OpenHPC package directory
$ sudo echo "/home *(rw,no_subtree_check,fsid=10,no_root_squash)" >> /etc/exports
$ sudo exportfs -a
# If /home is already mounted, unmount it first
$ sudo psh all "sed -i '/ \/home /d' /etc/fstab"
$ sudo psh all umount /home
# Configure a shared directory for cluster nodes
$ sudo psh all echo "\""${sms_ip}:/home /home nfs nfsvers=3,nodev,nosuid,noatime 0 0"\"" \>\> /etc/fstab
# Mount a shared directory
$ sudo psh all mount /home
For SLES:
Note
Run the following commands to create the shared directory /opt/ohpc/pub. This directory is necessary. If you have already shared this directory, you can skip this step.
# Management node shares the Lenovo OpenHPC directory
$ sudo zypper install nfs-kernel-server
$ sudo echo "/opt/ohpc/pub *(ro,no_subtree_check,fsid=11)" >> /etc/exports
$ sudo exportfs -a
# Configure shared directory for cluster nodes
$ sudo psh all zypper install -y --force-resolution nfs-client
$ sudo psh all mkdir -p /opt/ohpc/pub
$ sudo psh all echo "\""${sms_ip}:/opt/ohpc/pub /opt/ohpc/pub nfs nfsvers=3,nodev,noatime 0 0"\"" \>\> /etc/fstab
# Mount shared directory
$ sudo psh all mount /opt/ohpc/pub
Note
Run the following commands to create the user shared directory. This document takes /home as an example; you can choose another directory.
# Management node shares /home and the Lenovo OpenHPC package directory
$ sudo echo "/home *(rw,no_subtree_check,fsid=10,no_root_squash)" >> /etc/exports
$ sudo exportfs -a
# If /home is already mounted, unmount it first
$ sudo psh all "sed -i '/ \/home /d' /etc/fstab"
$ sudo psh all umount /home
# Configure a shared directory for cluster nodes
$ sudo psh all echo "\""${sms_ip}:/home /home nfs nfsvers=3,nodev,nosuid,noatime 0 0"\"" \>\> /etc/fstab
# Mount a shared directory
$ sudo psh all mount /home
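For either operating system, an optional sanity check after mounting is to confirm that the shared directories are visible on every node. This is only a sketch using the psh and xcoll tools already used in this document:
# Confirm that /home and /opt/ohpc/pub are mounted on all nodes
$ sudo psh all "df -h /home /opt/ohpc/pub" | xcoll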
Configuring ntp
Note
If the ntp service has already been configured for the nodes in the cluster, skip this step.
For CentOS:
$ sudo echo "server 127.127.1.0" >> /etc/ntp.conf
$ sudo echo "fudge 127.127.1.0 stratum 10" >> /etc/ntp.conf
$ sudo systemctl enable ntpd
$ sudo systemctl start ntpd
$ sudo psh all yum -y install ntp
$ sudo psh all echo "\""server ${sms_ip}"\"" \>\> /etc/ntp.conf
# Startup
$ sudo psh all systemctl enable ntpd
$ sudo psh all systemctl start ntpd
# check service
psh all "ntpq -p | tail -n 1"
For SLES:
$ sudo echo "server 127.127.1.0" >> /etc/ntp.conf
$ sudo echo "fudge 127.127.1.0 stratum 10" >> /etc/ntp.conf
$ sudo systemctl enable ntpd
$ sudo systemctl start ntpd
$ sudo psh all zypper install -y --force-resolution ntp
$ sudo psh all echo "\""server ${sms_ip}"\"" \>\> /etc/ntp.conf
# Startup
$ sudo psh all systemctl enable ntpd
$ sudo psh all systemctl start ntpd
# check service
psh all "ntpq -p | tail -n 1"
Installing cuda and cudnn
Note
Run the commands below to install CUDA and CUDNN on all the GPU compute nodes. If only a subset of nodes have GPUs, replace the "compute" argument in the psh commands with the node range corresponding to the GPU nodes:
Installing cuda
Download cuda_9.1.85_387.26_linux.run to the shared directory. If you are installing according to this document, the shared directory is /home.
Download address: https://developer.nvidia.com/cuda-downloads
$ sudo psh compute systemctl set-default multi-user.target
$ sudo psh compute reboot
Installing nvidia drivers
Download address:
Note
We suggest that you install the kernel patch for the Spectre/Meltdown security issue; you can get the patch from here:
Then make sure to install the kernel-devel package that matches the running kernel; an optional check is sketched below. If this has already been done, the kernel-devel package can be omitted from the driver installation commands that follow. Otherwise, run them as shown:
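This optional check compares the running kernel with the installed kernel-devel package on the GPU nodes. It is only a sketch using the psh and xcoll tools already used in this document:
# Compare the running kernel version with the installed kernel-devel package
$ sudo psh compute "uname -r" | xcoll
$ sudo psh compute "rpm -q kernel-devel" | xcoll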
$ sudo psh compute rpm -ivh /home/nvidia-diag-driver-local-repo-rhel7-390.46-1.0-1.x86_64.rpm
$ sudo psh compute yum install -y cuda-drivers
$ sudo psh compute rpm -ivh /home/nvidia-diag-driver-local-repo-sles123-390.46-1.0-1.x86_64.rpm
$ sudo psh compute zypper --gpg-auto-import-keys install -y --force-resolution cuda-drivers
$ psh compute perl -pi -e "s/NVreg_DeviceFileMode=0660/NVreg_DeviceFileMode=0666/" /etc/modprobe.d/50-nvidia-default.conf
$ psh compute reboot -h now
Installing cuda
$ sudo psh compute yum install -y kernel-devel gcc gcc-c++
$ sudo psh compute /home/cuda_9.1.85_387.26_linux.run --silent --toolkit --samples --no-opengl-libs --verbose --override
$ sudo psh compute zypper install -y --force-resolution kernel-devel gcc gcc-c++
$ sudo psh compute /home/cuda_9.1.85_387.26_linux.run --silent --toolkit --samples --no-opengl-libs --verbose --override
Download cudnn
Download cudnn-9.1-linux-x64-v7.1.tgz into the /root directory. The official website:
Installing cudnn
$ cd ~
$ tar -xvf cudnn-9.1-linux-x64-v7.1.tgz
$ sudo xdcp compute cuda/include/cudnn.h /usr/local/cuda/include
$ sudo xdcp compute cuda/lib64/libcudnn_static.a /usr/local/cuda/lib64
$ sudo xdcp compute cuda/lib64/libcudnn.so.7.0.5 /usr/local/cuda/lib64
$ sudo psh compute "ln -s /usr/local/cuda/lib64/libcudnn.so.7.0.5 /usr/local/cuda/lib64/libcudnn.so.7"
$ sudo psh compute "ln -s /usr/local/cuda/lib64/libcudnn.so.7 /usr/local/cuda/lib64/libcudnn.so"
$ sudo psh compute chmod a+r /usr/local/cuda/include/cudnn.h
$ sudo psh compute chmod a+r /usr/local/cuda/lib64/libcudnn*
Configuring Environment Variables
$ sudo echo "/usr/local/cuda/lib64" >> /etc/ld.so.conf.d/cuda.conf
$ sudo echo "export CUDA_HOME=/usr/local/cuda" >> /etc/profile.d/cuda.sh
$ sudo echo "export PATH=/usr/local/cuda/bin:\$PATH" >> /etc/profile.d/cuda.sh
Distribute Configuration
$ sudo xdcp compute /etc/ld.so.conf.d/cuda.conf /etc/ld.so.conf.d/cuda.conf
$ sudo xdcp compute /etc/profile.d/cuda.sh /etc/profile.d/cuda.sh
Run the commands below on the GPU nodes to determine if the GPU can be identified:
$ sudo psh compute ldconfig
$ sudo psh compute nvidia-smi
$ sudo psh compute "cd /root/NVIDIA_CUDA-9.1_Samples/1_Utilities/deviceQuery; make; ./deviceQuery" | xcoll
Set CUDA to start automatically
# Configuration
$ sudo psh compute sed -i '/Wants=syslog.target/a\Before=slurmd.service' /usr/lib/systemd/system/nvidia-persistenced.service
$ sudo psh compute systemctl daemon-reload
$ sudo psh compute systemctl enable nvidia-persistenced
$ sudo psh compute systemctl start nvidia-persistenced
# Add configure file
$ cat << eof > /usr/lib/systemd/system/nvidia-persistenced.service
[Unit]
Description=NVIDIA Persistence Daemon
Before=slurmd.service
Wants=syslog.target
[Service]
Type=forking
ExecStart=/usr/bin/nvidia-persistenced --user root
ExecStopPost=/bin/rm -rf /var/run/nvidia-persistenced
[Install]
WantedBy=multi-user.target
eof
# Distribute configure file
xdcp compute /usr/lib/systemd/system/nvidia-persistenced.service /usr/lib/systemd/system/nvidia-persistenced.service
# Restart service
psh compute systemctl daemon-reload
psh compute systemctl enable nvidia-persistenced
psh compute systemctl start nvidia-persistenced
Installing slurm
Configuring slurm
Download https://hpc.lenovo.com/lico/downloads/5.1/examples/conf/slurm.conf to the /etc/slurm/ directory on the management node and change it as needed.
Download https://hpc.lenovo.com/lico/downloads/5.1/examples/conf/gres.conf to the /etc/slurm/ directory on the management node and change it as needed.
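For orientation only, the fields you will typically adjust in slurm.conf and gres.conf look like the sketch below. The cluster name, management node hostname (head1), hardware counts, and GPU device paths are illustrative assumptions, not values from the downloaded templates:
# slurm.conf excerpt (illustrative values)
ClusterName=mycluster
ControlMachine=head1
NodeName=c[1-2] Sockets=2 CoresPerSocket=10 ThreadsPerCore=2 State=UNKNOWN
PartitionName=normal Nodes=c[1-2] Default=YES MaxTime=24:00:00 State=UP
# gres.conf excerpt for a GPU node (illustrative values)
NodeName=c1 Name=gpu File=/dev/nvidia[0-1]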
Distribute Configuration
$ sudo xdcp all /etc/slurm/slurm.conf /etc/slurm/slurm.conf
$ sudo xdcp all /etc/munge/munge.key /etc/munge/munge.key
Startup service
# Startup Management Node
$ sudo systemctl enable munge
$ sudo systemctl enable slurmctld
$ sudo systemctl restart munge
$ sudo systemctl restart slurmctld
# Startup Other Node
$ sudo psh all systemctl enable munge
$ sudo psh all systemctl restart munge
$ sudo psh all systemctl enable slurmd
$ sudo psh all systemctl restart slurmd
Note
If a problem occurs with slurm, please refer to How To Solve slurm Common Problem.
Installing ganglia
Installing gmond
# Management node
$ sudo yum -y install ganglia-gmond-ohpc
# Other node
$ sudo psh all yum install -y ganglia-gmond-ohpc
# Management node
$ sudo zypper install ganglia-gmond-ohpc
# Other node
$ sudo psh all zypper install -y --force-resolution ganglia-gmond-ohpc
Configuring gmond
Download https://hpc.lenovo.com/lico/downloads/5.1/examples/conf/ganglia/management/gmond.conf to /etc/ganglia/gmond.conf on the management node.
Download https://hpc.lenovo.com/lico/downloads/5.1/examples/conf/ganglia/gmond.conf to /var/tmp/gmond.conf on the management node.
Note
Modify the hostname in the udp_send_channel section to the management node hostname, according to your actual situation.
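As an illustration of what to edit, a udp_send_channel section in gmond.conf usually looks like the sketch below. The hostname head1 is an assumed example and the port value comes from the downloaded file, not from this document:
udp_send_channel {
  # host should be the management node hostname (head1 is an assumed example)
  host = head1
  port = 8649
  ttl = 1
}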
Modifying Kernel Parameters
$ sudo echo net.core.rmem_max=10485760 > /usr/lib/sysctl.d/gmond.conf
$ sudo /usr/lib/systemd/systemd-sysctl gmond.conf
$ sudo sysctl -w net.core.rmem_max=10485760
Distribute Configuration
$ sudo xdcp all /var/tmp/gmond.conf /etc/ganglia/gmond.conf
Startup service
# Management node
$ sudo systemctl enable gmond
$ sudo systemctl start gmond
# Other node
$ sudo psh all systemctl enable gmond
$ sudo psh all systemctl start gmond
# Make sure all nodes are listed
$ sudo gstat -a
Installing mpi
Installing mpi Module
$ sudo yum -y install openmpi3-gnu7-ohpc mpich-gnu7-ohpc mvapich2-gnu7-ohpc
$ sudo zypper install openmpi3-gnu7-ohpc mpich-gnu7-ohpc mvapich2-gnu7-ohpc
Note
The above commands install three MPI modules (openmpi, mpich, and mvapich) on the system; the user can then use Lmod to choose the specific MPI module to use.
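For reference, selecting an MPI module with Lmod typically looks like the following sketch; the exact module names and versions shown depend on what was installed:
# List available modules and switch the active MPI (module names are illustrative)
$ module avail
$ module swap openmpi3 mvapich2
$ module list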
Set The Default mpi
$ sudo yum -y install lmod-defaults-gnu7-openmpi3-ohpc
$ sudo yum -y install lmod-defaults-gnu7-mpich-ohpc
$ sudo yum -y install lmod-defaults-gnu7-mvapich2-ohpc
$ sudo zypper install lmod-defaults-gnu7-openmpi3-ohpc
$ sudo zypper install lmod-defaults-gnu7-mpich-ohpc
$ sudo zypper install lmod-defaults-gnu7-mvapich2-ohpc
Here is the table of interconnect support for each MPI type from OpenHPC:

MPI Type | Ethernet (TCP) | InfiniBand | Omni-Path |
---|---|---|---|
MPICH | X | | |
MVAPICH2 | | X | |
MVAPICH2 (psm2) | | | X |
OpenMPI | X | X | X |
OpenMPI (PMIx) | X | X | X |
Note
If you want to use MVAPICH2 (psm2), install mvapich2-psm2-gnu7-ohpc. If you want to use OpenMPI (PMIx), install openmpi3-pmix-slurm-gnu7-ohpc. However, openmpi3-gnu7-ohpc and openmpi3-pmix-slurm-gnu7-ohpc are incompatible with each other, and mvapich2-psm2-gnu7-ohpc and mvapich2-gnu7-ohpc are incompatible with each other.
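One possible way to switch to the psm2/PMIx variants on CentOS is sketched below. Removing the conflicting packages first is an assumption based on the incompatibility noted above; adapt the commands to your environment (use zypper on SLES):
# Remove the conflicting default packages, then install the psm2/PMIx variants (illustrative)
$ sudo yum -y remove openmpi3-gnu7-ohpc mvapich2-gnu7-ohpc
$ sudo yum -y install openmpi3-pmix-slurm-gnu7-ohpc mvapich2-psm2-gnu7-ohpc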
Installing singularity
Singularity is a lightweight container framework designed for HPC.
Installing singularity
$ sudo yum -y install singularity-ohpc
$ sudo zypper install singularity-ohpc
Installing openhpc default environment
Add singularity to the module lists in /opt/ohpc/pub/modulefiles/ohpc:
# Add in the module try-add section
module try-add singularity
# Add in the module del section
module del singularity
Make the configuration file take effect
$ source /etc/profile.d/lmod.sh
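As a quick check that the module environment picked up singularity, the following sketch can be used; it assumes the singularity module name matches the package installed above:
# Load the module and confirm the singularity command is available
$ module load singularity
$ singularity --version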
Note
Changes to /opt/ohpc/pub/modulefiles/ohpc may be lost when the default modules are changed by installing an lmod-defaults* package. In that case, modify the /opt/ohpc/pub/modulefiles/ohpc file again, or, alternatively, add module try-add singularity to the bottom of /etc/profile.d/lmod.sh.
Checkpoint B
Checking slurm
$ sudo sinfo
...
PARTITION AVAIL  TIMELIMIT   NODES  STATE NODELIST
normal*      up  1-00:00:00      2   idle c[1-2]
...
Attention
The status of all nodes should be idle; idle* is not acceptable.
Add a test account
$ sudo useradd -m test
$ sudo echo "MERGE:" > syncusers
$ sudo echo "/etc/passwd -> /etc/passwd" >> syncusers
$ sudo echo "/etc/group -> /etc/group" >> syncusers
$ sudo echo "/etc/shadow -> /etc/shadow" >> syncusers
$ sudo xdcp all -F syncusers
Run and Test mpi
$ su - test
$ mpicc -O3 /opt/ohpc/pub/examples/mpi/hello.c
$ srun -n 8 -N 1 -w compute --pty /bin/bash
$ prun ./a.out
...
Master compute host = c1
Resource manager = slurm
Launch cmd = mpiexec.hydra -bootstrap slurm ./a.out
Hello, world (8 procs total)
--> Process # 0 of 8 is alive. -> c1
--> Process # 4 of 8 is alive. -> c2
--> Process # 1 of 8 is alive. -> c1
--> Process # 5 of 8 is alive. -> c2
--> Process # 2 of 8 is alive. -> c1
--> Process # 6 of 8 is alive. -> c2
--> Process # 3 of 8 is alive. -> c1
--> Process # 7 of 8 is alive. -> c2
Note
After the command finishes, make sure you exit back to the root user on the management node.