LiCO GUI Installer
LiCO Installer is a tool that simplifies HPC cluster deployment and LiCO setup. It runs on the management node and it can use Confluent to deploy the OS on the compute nodes.
The user can define the following node types:
- head node (currently only a single head node is supported; this is the same machine on which the installer runs)
- login nodes - zero or more (if no login node is defined, the head node is used as the login node)
- compute nodes - one or more
The compute nodes that have at least 1 GPU defined in the configuration are treated as GPU nodes and NVIDIA drivers will be installed on these.
Intel GPU nodes are not currently supported by the installer. Nodes with Intel GPUs should be defined with 0 GPUs in the node definition; the drivers must then be installed and Slurm configured for these nodes manually. See the LiCO Installation Guide for instructions.
Setting up the installer
Disable Firewalld and SELinux
systemctl disable firewalld --now
sed -i 's/enforcing/disabled/' /etc/selinux/config
reboot
Download lico-installer.tar.gz
Create the directory and extract the archive
mkdir lico-installer
tar xvf lico-installer.tar.gz -C lico-installer
Create the /lico_files directory
cp -r ./lico-installer/etc/* /
Start the GUI
cd lico-installer
./start.sh
Browse to:
http://<host-ip>:8000
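After the reboot in the first step, it can be worth confirming that SELinux and firewalld are actually off before relying on the GUI. The sketch below is not part of the installer: `selinux_config_mode` is a hypothetical helper that only reads the `SELINUX=` setting from the config file, and the commented commands are standard system tools.

```shell
# Print the mode set by the last uncommented SELINUX= line in an SELinux config file.
# The path defaults to /etc/selinux/config; the function name is illustrative only.
selinux_config_mode() {
    config="${1:-/etc/selinux/config}"
    # SELINUXTYPE= lines do not match because the pattern requires "SELINUX=".
    sed -n 's/^SELINUX=\([a-z]*\).*/\1/p' "$config" | tail -n 1
}

# Usage on the head node after the reboot:
#   selinux_config_mode             # should print "disabled"
#   getenforce                      # runtime SELinux state
#   systemctl is-enabled firewalld  # should report that the service is disabled
```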
Prepare
If you prefer not to upload the files from the browser, you can copy them manually to the location where the installer expects to find them: /lico_files/<LiCO version>/<OS extension>/
Refer to the Installation Guide for the appropriate LiCO version to check the file names and versions.
LiCO v7.1.0 for EL_8 will check the following paths:
- ISO file:
/lico_files/7.1.0/EL_8/iso/Rocky-8.6-x86_64-dvd1.iso
- GPU Driver:
/lico_files/7.1.0/EL_8/archives/NVIDIA-Linux-x86_64-520.61.07.run
- Confluent:
/lico_files/7.1.0/EL_8/archives/confluent-3.7.1-el8.tar.xz
- OHPC:
/lico_files/7.1.0/EL_8/archives/Lenovo-OpenHPC-2.6.1.EL8.x86_64.tar
- LiCO dependencies:
/lico_files/7.1.0/EL_8/archives/lico-dep-7.1.0.el8.x86_64.tgz
- LiCO release:
/lico_files/7.1.0/EL_8/archives/lico-release-7.1.0.el8.x86_64.tar.gz
- AuthSelect:
/lico_files/7.1.0/EL_8/archives/authselect.tar.gz
Starting with v7.1.0 the installation will also use openlico-monitor-x.y.z.x86_64.tgz:
- OpenLiCO Monitor:
/lico_files/7.1.0/EL_8/archives/openlico-monitor-1.0.0.x86_64.tgz
For Red Hat Enterprise Linux, replace EL_8 with RHEL_8.
The installer will check if these files are available before proceeding. Use the "SCAN" button to scan for file paths.
NOTE: If you add these files manually instead of using the upload button, make sure the file names and paths match the ones listed above exactly.
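When copying the files by hand, a quick check from the shell can catch a misnamed file before clicking "SCAN". The following is a minimal sketch, not the installer's own check: `missing_lico_files` is a hypothetical helper, and it only covers the LiCO 7.1.0 / EL_8 file names listed above.

```shell
# Print every expected LiCO 7.1.0 / EL_8 file that is missing under a root directory.
# Returns non-zero if at least one file is missing. The root defaults to /lico_files.
missing_lico_files() {
    base="${1:-/lico_files}/7.1.0/EL_8"
    missing=0
    for rel in \
        iso/Rocky-8.6-x86_64-dvd1.iso \
        archives/NVIDIA-Linux-x86_64-520.61.07.run \
        archives/confluent-3.7.1-el8.tar.xz \
        archives/Lenovo-OpenHPC-2.6.1.EL8.x86_64.tar \
        archives/lico-dep-7.1.0.el8.x86_64.tgz \
        archives/lico-release-7.1.0.el8.x86_64.tar.gz \
        archives/authselect.tar.gz \
        archives/openlico-monitor-1.0.0.x86_64.tgz
    do
        if [ ! -f "$base/$rel" ]; then
            echo "$base/$rel"
            missing=1
        fi
    done
    return "$missing"
}

# Usage:
#   missing_lico_files && echo "all files in place"
```

The helper only tests that each path exists as a regular file; it does not verify the file contents.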
Click "NEXT" to continue.
Config
Cluster
There are 3 ways the user can add node information:
- Using the "Add" button to type the node information manually.
- Using the "Load CSV" button to upload the node information in CSV format.
- Using the "Load config" button to upload node information and cluster settings in JSON format.
Using JSON file to upload node and cluster settings information
- Click "Load config" at the bottom left of the page
- You can download a file example by clicking on "See example"
- Use 'config_example.json' as a starting point or download the current JSON config file by clicking "SAVE CONFIG" at the bottom of the Config section:
  - Location: Describe your cluster location. Rooms are related to this location.
  - Rooms: Currently a single room is supported.
  - Rows: Relative to the rooms.
  - Racks: Relative to a row.
  - Chassiss: If you have chassis in your cluster they will relate to the rack. Chassis 'unit_location' represents the position of the chassis inside the rack.
  - Groups: Nodes can be placed in different logical groups.
  - Nodes: A node can be placed in a chassis (which belongs to a rack) OR directly in a rack. This means that it will only have "chassis" OR "rack".
- Paste the content of the JSON file.
- Click "Load".
Using CSV file to upload node information
- Click "Load CSV" button.
- You can also download an example file from this window.
- Select the CSV file from your computer that contains the node information.
- Click "Load".
Network
Set your DNS, DNS Domain and the IPv4 gateway. Currently InfiniBand is not supported by the installer. The user can configure IB manually after the cluster is deployed.
Auth
Define the LDAP Domain Name, Domain Component and LDAP Password. The installer will install openldap-servers and configure nslcd and libuser.
Storage
- Define the NFS shared directory - this location is used to deploy necessary files to the nodes
- Set the user home directory - /home - the installer exports this path over NFS, and LiCO stores user-related files there.
Monitor
In this section you can set the port used by Icinga2 for monitoring.
Scheduler
Currently only Slurm is supported by the installer.
MPI
You can choose between OpenMPI and MPICH, which both work with Ethernet.
If MVAPICH2 is desired, the user will have to manually install the necessary InfiniBand/OPA drivers and the required modules after the deployment.
LiCO
This section allows the user to enter the credentials that the deployment process needs. The users and passwords will be created according to this input.
You can also enable VNC Control and choose the notification agent.
Proceed to the next section by clicking "DEPLOY".
Deploy
The deployment process consists of 3 main steps: OS, HPC and LiCO.
For each of these steps the installer runs a series of commands that can be viewed before proceeding.
Click "DEPLOY" to start deploying your cluster.
At this point the installer will verify the checksums before the deployment starts.
It will look in the following paths:
- /lico_files/<lico_version>/<os>/archives/
  - If there is any checksum mismatch in this location the deployment will not start.
- /lico_files/<lico_version>/<os>/config/
  - If there is any checksum mismatch in this location the user can:
    - Proceed with the mismatch.
    - Replace the existing config files with the default ones and proceed.
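To check files yourself before clicking "DEPLOY", a manifest-based comparison is one option. The sketch below is an illustration only: it assumes a manifest named SHA256SUMS in `sha256sum` format inside the directory, and `verify_checksums` is a hypothetical helper. The installer's own checksum mechanism is not described here and may use a different algorithm or manifest.

```shell
# Verify the files listed in a SHA-256 manifest located in the given directory.
# Assumption: the manifest is called SHA256SUMS and was produced by `sha256sum`.
# Returns non-zero if any listed file is missing or does not match.
verify_checksums() {
    dir="$1"
    # Run in a subshell so the caller's working directory is unchanged.
    (cd "$dir" && sha256sum -c SHA256SUMS)
}

# Usage sketch mirroring the two behaviours described above:
#   verify_checksums /lico_files/7.1.0/EL_8/archives || echo "archives mismatch: deployment would not start"
#   verify_checksums /lico_files/7.1.0/EL_8/config   || echo "config mismatch: proceed or restore the defaults"
```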
If the deployment fails at any point, after fixing the issue, you have two options:
- RETRY: restart the installation starting from the failed step.
- RESTART: restart the installation from the beginning.
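The difference between the two options can be sketched at the level of the three main steps. This is a simplified model under stated assumptions: `steps_to_run` is a hypothetical function, and the installer may resume at a finer granularity (for example at an individual command inside a step).

```shell
# Print the main deployment steps that run again for a given action.
# Step names (OS, HPC, LiCO) come from the text above; everything else is illustrative.
steps_to_run() {
    action="$1"   # RETRY or RESTART
    failed="$2"   # name of the step that failed
    started=0
    for step in OS HPC LiCO; do
        # RESTART begins at the first step; RETRY begins at the failed one.
        if [ "$action" = "RESTART" ] || [ "$step" = "$failed" ]; then
            started=1
        fi
        if [ "$started" -eq 1 ]; then
            echo "$step"
        fi
    done
}

# Usage:
#   steps_to_run RETRY HPC     # lists HPC and LiCO
#   steps_to_run RESTART HPC   # lists OS, HPC and LiCO
```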
Validation
After the deployment is complete, the user can run a series of checks to view different status info.
Users
The installation will not add any default users.
- You can add a new user to LDAP with administrator privileges:
luseradd <HPC_ADMIN_USERNAME> -P <HPC_ADMIN_PASSWORD>
nodeshell all "su - <HPC_ADMIN_USERNAME> -c whoami"
- Import the user into LiCO:
lico import_user -u <HPC_ADMIN_USERNAME> -r admin
Troubleshooting
If the deployment fails, fix the issue and then click RETRY or RESTART. If the failure occurs after the deployment process has already set up the MySQL database, delete the database and its user before restarting the deployment.
mysql
DROP DATABASE lico;
DROP USER '<USERNAME>'@'localhost';