3.1.0 Confluent release
There is an update to the HPC software stack, bringing xCAT to 2.16.1.lenovo1, and confluent is brought to 3.1.0.
Here are some of the highlighted changes:
Collective autofailover and restricted deployment support
A new attribute collective.managercandidates
can be used to specify a noderange of valid managers for a node.
If defined, deployment will not be served from other collective members even if it would otherwise be possible,
and a failure of a member of that range will have its managed nodes migrated to another member of that noderange.
Implement ssh.trustnodes
to limit node to node ssh trust
A node may define a noderange of trusted nodes to have more limited trust. By default all members of the cluster deployed by confluent trust each other. This attribute allows this to be restricted so that, for example, storage nodes could opt not to allow compute nodes to log in.
New OS deployment profiles
Support has been added for CentOS 8.3 and RedHat 8.3, Oracle Linux 7 and 8, and CentOS Stream 8
Most stock OS images now have ‘post.d’ and ‘firstboot.d’
Scripts may be placed in post.d or firstboot.d to be automatically executed at end of install or on first boot respectively.
OS images now have a ‘fetch_remote’ function in /etc/confluent/functions
OS images can now retrieve arbitrary payloads from it’s host directory. For example:
. /etc/confluent/functions
fetch_remote infiniband/mofed.tgz
In a script will download mofed.tgz from /var/lib/confluent/public/os/(profilename)/scripts/infiniband/mofed.tgz
OS image profile.yaml now have ‘installedargs’
The current kernelargs
controls how the installer is booted. installedargs
can be used
to control how the installed system boots separate of the installer.
configbmc
will now trigger remote authentication configuration
If using in-band BMC configuration rather than remote configuration, it will now request the manager to remotely configure the username and password, to be consistent with out of band and keep the credentials withheld from the OS that is installing.
Support use of TPM2 to persist node keys across reboot
Genesis can now be rebooted without rearming the node token grant. This will facilitate a more secure stateless strategy as well, where node trust is persisted through the TPM2.
Web interface will function better when used in domain shared with other web services
Other servers in a companies domain would set domain wide cookies that could interfere with confluent web gui operation. Those invalid cookies are now discarded to allow the web interface to work.
Improved error messages on some commands
A few commands provide more specific and useful feedback on failure.
Various fixes for ESXi deployment
A number of limitations around ESXi deployment have been addressed
New memory
console.logging option
If console.logging is set to memory
, then the replay buffer will be maintained, but not committed to disk. This
can improve performance on slow /var/log/confluent filesystems.
Better support for Cisco ethernet switches in /networking/ api
More complex use of vlans on Cisco equipment will no longer make addresses invisible to confluent
Add more attributes to discovery api for Lenovo equipment
Some enduser curated information is made available if detected
Genesis now starts ssh if booted without detected confluent server, and listens on 3389
This change allows a genesis booted through the BMC to be logged into remotely through BMC port forwarding where available. This can be used to diagnose/reconfigure networking when that would normally block remote deployment.
Performance improvements
Several areas of memory and processor usage have been optimized, particularly in handling PXE requests, discovery scanning, and nodeconfig
Various bugfixes
- Remove cursor key mode preservation, which can conflict with firmware setup menu operation
- Shell sessions no longer leave phantom ssh sessions
- Fixes to automatic console behavior
- Improved TSM discoevry as used in SR635 and SR655 servers
- Relax expectations in nodeconfig batch files, quotes are no longer required for spaces in values
- UUID handling is now case insensitive, working better with some script injection of id.uuid
- Fix file descriptor leak in the web forwarder