2.3.0 Confluent release
There is an update to the HPC software stack, bringing xCAT to 2.14.6.lenovo1, based on 2.14.6, and confluent is brought to 2.3.0.
Here are some of the highlighted changes:
redfish plugin
2.3.0 contains the initial release of the redfish plugin. This provides more capability on an industry standard system. Note that ipmi remains the default and generally recommended plugin to use with Lenovo systems (on Lenovo systems, the ipmi plugin will use redfish or other protocols as appropriate already). The following scenarios would be reasons to use redfish plugin on Lenovo systems
- Wanting to use an authentication method such as LDAP, which is incompatible with ipmi
- Requirement not to use UDP port 623
- Faster nodeconfig access to select UEFI settings (not all settings are available).
At this point, redfish implementations are generally slower than their IPMI counterparts. Additionally system firmware redfish support has been maturing only recently, and as such firmware from 2019 is strongly recommended regardless of vendor.
To use, set the hardwaremanagement.method
attribute to redfish
.
nodeshell and noderun commands have -n argument
This will suppress the nodename portion of the output. For example, one could do an ad-hoc creation of /etc/hosts entries through a command such as:
$ noderun -n rackmount echo 172.20.{n1}.{n2} {node}.domain
172.20.8.32 r8u32.domain
172.20.8.33 r8u33.domain
New stats
command
The stats
command has been added to help analyze numerical output.
By default it gives a few frequently useful statistics:
$ grep 2R2 node*.log | stats
Samples: 54 Min: 1680.54 Median: 1682.485 Mean: 1682.51018519 Max: 1688.49 StandardDeviation: 1.2434755515 Sum: 90855.55
It can optionally produce a textual histogram.
$ grep 2R2 node*.log | stats -t
1680.5|=======================
1681.3|==========================================================
1682.1|=========================================================================
1682.9|===================================
1683.7|============
1684.5|====
1685.3|
1686.1|
1686.9|
1687.7|====
Samples: 54 Min: 1680.54 Median: 1682.485 Mean: 1682.51018519 Max: 1688.49 StandardDeviation: 1.2434755515 Sum: 90855.55
If you have a sixel enabled terminal, a graphical chart may be requested:
Confluent now supports user groups
The /usergroups/ api is now available to create groups. This is mapped by default to system group names and memberships, which may be /etc/group or LDAP groups.
To confer access to any member of the wheel group to confluent:
confetty create /usergroups/wheel
Confluent now has user roles
Confluent users and usergroups may be limited in privilege to one of the roles:
Administrator
, Operator
, or Monitor
.
Administrator
is the usual full access, Operator
is blocked from modifying users/passwords. Monitor
is severely limited to select reead-only access appropriate foruse in monitoring scripts. This may be done to users or groups.
For example, to limit wheel group to not change any passwords or users:
confetty set /usergroups/wheel role=Operator
Improved discovery for XCC and SMM
Where possible, discovery for XCC and SMM will induce fewer failed login attempts and also will execute more quickly. Additionally, systems that are configured to not have IPMI available are now able to be discovered.
Web forwarder now works if confluent gui served from custom port
The functionality to open endpoint management consoles in web interface now supports custom https ports for the main confluent gui.
Storage configuration now supports custom strip sizes and modifying disk state
nodestorage
has the -s
argument added to specify strip size on array creation. Additionally, it has added the setdisk
subcommand to manage jbod/hotspare state.
Fix confluent environment file for zsh users
The bash completion enhancements are now only attempted if the shell is bash
nodelicense
can now save and delete licenses from Lenovo XCCs
For example, to backup a noderange of license keys to one directory per node:
nodelicense rack1 save ./licenses/{node}/
Fix “login process died” errors
In certain specific scenarios, “login process died” errors would come from ipmi plugin. A class of those scenarios has been fixed and an autorecovery mechanism has been put into place to catch any remaining causes.
A node may now be more easily forced through repeating discovery
nodediscover
now has a reassign
subcommand. To have discovered node d1
repeat discovery procedure:
nodediscover reassign -n d1
New credentials are encrypted and validated using AES-GCM instead of AES-CBC and SHA-256
As a result, 2.3.0 attribute databases and backups cannot be used with older versions of confluent.
Improve detail in the event of “Unexpected Error” situations
Error messages now include some potentially useful information for resolving the issue.
IPMI now supports HMAC-SHA256/HMAC-SHA256-128
xCAT and confluent both try to use HMAC-SHA256 when supported now for IPMI communication.
GUI consoles now show health summary in title bar
When using many consoles, it will obscure the normal view. This brings the health status front and center to be usable in that scenario