Discovery with chained ThinkSystem D2 enclosures
The ThinkSystem D2 enclosure (which houses SD530 servers) has a variant of the System Management Module (SMM) that supports chaining enclosures together. With the usual discovery flow in confluent, it is not obvious how this should work.
There are two strategies. The first is more resilient and easier, but requires confluent 1.8.0 together with SMM firmware 1.04.
The other works with older chained SMM firmware, but requires nodes to attempt PXE boot.
Fully out of band discovery
- The Ethernet switch must have LLDP enabled.
- Confirm that you have confluent version 1.8.0 and that every SMM has firmware 1.04 or later.
- Set the `net.switch` and `net.switchport` attributes only on the SMM directly connected to a switch.
- For other SMMs, set the `enclosure.extends` attribute to a directly connected adjacent SMM. For example, with three SMMs, smm1 would have `net` attributes describing its connection to the switch, smm2 would have `enclosure.extends` set to smm1, and smm3 would have `enclosure.extends` set to smm2.
- It is not required to have `net` attributes defined for any of the nodes.
- Discovery proceeds normally, as described in the general discovery documentation.
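The attribute layout for a chain can be sketched as a small dry-run helper. This is only an illustration under stated assumptions: the function name `chain_commands` and the names `switch1`, port `17`, and `smm1` through `smm3` are hypothetical, and the helper only prints `nodeattrib` commands for review rather than running them.

```shell
# Print (do not run) the nodeattrib commands for a chain of SMMs.
# Usage: chain_commands SWITCH PORT FIRST_SMM [NEXT_SMM ...]
# FIRST_SMM is the SMM cabled to the switch; each later SMM extends
# the one listed before it.
chain_commands() {
    switch=$1
    port=$2
    shift 2
    prev=""
    for smm in "$@"; do
        if [ -z "$prev" ]; then
            # Only the switch-attached SMM gets net attributes.
            echo "nodeattrib $smm net.switch=$switch"
            echo "nodeattrib $smm net.switchport=$port"
        else
            # Every other SMM points at its adjacent SMM.
            echo "nodeattrib $smm enclosure.extends=$prev"
        fi
        prev=$smm
    done
}

# Hypothetical switch, port, and SMM names.
chain_commands switch1 17 smm1 smm2 smm3
```

Once the printed commands match the physical cabling, they could be run on the confluent server, for example by piping the output to `sh`.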
PXE Driven Discovery
If using older SMM firmware in a chain, or else wanting to drive discovery from the node network side rather than the SMM side, the method with PXE may be used.
- Do not set any switch attributes for any SMM (verify by running `nodeattrib <noderange> net` and confirming that they are all empty).
- Ensure that all the nodes have the correct enclosure.manager and enclosure.bay attributes (`nodeattrib <noderange> enclosure`).
- All the nodes must have either `permissive,pxe` or `open` as the `discovery.policy` attribute (`nodeattrib <noderange> discovery.policy=permissive,pxe`) to enable PXE discovery. The recommended discovery policy is `permissive,pxe` for a balance of automatic behavior versus security.
- Provide net.*.switch and net.*.switchport values for the Ethernet ports of the SD530 servers that will PXE boot.
- Induce the servers to PXE boot (generally by turning them on). The PXE attempt need not be able to succeed, but confluent must be on the same VLAN (regardless of IP configuration).
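Before inducing the PXE boot, it can help to confirm the discovery policy across the whole noderange. The filter below is a minimal sketch, assuming that `nodeattrib <noderange> discovery.policy` prints one `node: attribute: value` line per node. That output format, the function name `bad_policy_nodes`, and the sample nodes `n1` through `n3` are assumptions for illustration.

```shell
# Read "node: attribute: value" lines on stdin and print every node whose
# discovery.policy is neither "permissive,pxe" nor "open".
bad_policy_nodes() {
    awk -F': *' '$2 == "discovery.policy" && $3 != "permissive,pxe" && $3 != "open" { print $1 }'
}

# Sample input standing in for real nodeattrib output.
printf '%s\n' \
    'n1: discovery.policy: permissive,pxe' \
    'n2: discovery.policy: manual' \
    'n3: discovery.policy: open' | bad_policy_nodes
```

With the sample input, only `n2` is printed. Against a live system the equivalent would be `nodeattrib <noderange> discovery.policy | bad_policy_nodes`, where empty output means every node is eligible for PXE discovery.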
The procedure will then automatically proceed as follows:
- As a matter of course, `nodediscover list` should show `lenovo-xcc` devices, without identifying them.
- When the server transmits the DHCPDISCOVER packet to initiate PXE, confluent will detect the packet and begin a search of the ethernet switches for a match.
- When the node identity is determined, confluent commits the UUID of the node to that node as the `id.uuid` attribute.
- If there is a detected XCC that matches the UUID, that XCC gets discovered and configured appropriately, and the SMM is enabled, if not previously enabled.
- When an SMM is detected that has a UUID that matches the enclosure UUID indicated by a relevant XCC, that SMM is discovered and autoconfigured.
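One way to see how far this sequence has progressed is to look for nodes that still lack an `id.uuid` value. The filter below is a rough sketch under the same assumption about `nodeattrib` output (one `node: attribute: value` line per node, with an empty value when the attribute is unset). The function name `unidentified_nodes` and the sample UUID are made up for illustration.

```shell
# Read "node: attribute: value" lines on stdin and print every node whose
# id.uuid value is still empty, meaning its identity has not been committed.
unidentified_nodes() {
    awk -F': *' '$2 == "id.uuid" && $3 == "" { print $1 }'
}

# Sample input standing in for real nodeattrib output.
printf '%s\n' \
    'n1: id.uuid: 4c4c4544-0000-1000-8000-000000000001' \
    'n2: id.uuid: ' | unidentified_nodes
```

On a live system this would be fed from `nodeattrib <noderange> id.uuid`. Nodes that remain in the output after a PXE attempt are candidates for the troubleshooting steps.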
Troubleshooting:
- If no lenovo-smm devices are appearing, it may be worth running `nodediscover rescan`.
- It may be the case that the XCCs were otherwise configured without enabling SMM. To rediscover the XCC and try to enable the SMM, run `nodeattrib <noderange> pubkeys.tls_hardwaremanager=`. This will induce the SMM enablement process that is normally part of XCC discovery.
- If using older SD530 firmware, confluent may not be able to link XCC and SMM. If this is the case, XCC discovery should still work and enable XCC firmware updates to proceed.