Good afternoon to everybody!
Can someone help me? Installing drivers in a diskful environment is no problem (especially with only a couple of nodes), but in my case xCAT is configured to provide a diskless cluster.
My first question: how can I properly install the Mellanox ConnectX driver in my netboot image? (As far as I understand, I have to install OFED.)
My second question: can I mass-install (or mass-update) Mellanox OFED across a group of nodes, or is one by one the only way?
P.S. OFED also updates the ConnectX firmware, so can that be done automatically across many diskless and diskful nodes?
Although it has been a while since I’ve done it, I’ll try to help.
First, are you using stateless or statelite? I assume you are using stateless because the answer is shorter; if you are using statelite let me know, I’ve done that as well.
xCAT does have automation for this (see links below), which is preferred because OFED will then be there every time you generate a new image. If you want to install it yourself, though, it should be pretty simple:
1. Copy the OFED installation files to the image.
2. chroot into the generated image.
3. Install OFED with the following parameters: ./mlnxofedinstall --without-32bit --without-fw-update
4. Pack the image.
Of course, you must make sure the prerequisites are in place (the OFED installation will indicate what is missing).
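For what it's worth, the manual steps above could be scripted roughly as follows. This is only a sketch under assumptions: the osimage name, the rootimg path and the OFED directory are hypothetical placeholders, not values from this thread, and the script defaults to a dry run that just prints the commands. The installer may also need /proc, /sys and /dev available inside the chroot, and it may pick up the management node's running kernel rather than the image's kernel, so check its output before packing.

```shell
#!/bin/sh
# Sketch of the manual steps: copy OFED into the image, chroot, install, repack.
# Every default below is a hypothetical placeholder; adjust to your site.
IMAGE="${IMAGE:-my-netboot-compute}"      # hypothetical xCAT osimage name
ROOTIMG="${ROOTIMG:-/install/netboot/myos/x86_64/compute/rootimg}"  # hypothetical rootimg path
OFED_SRC="${OFED_SRC:-/install/ofed/MLNX_OFED}"   # hypothetical unpacked OFED directory
DRY_RUN="${DRY_RUN:-1}"                   # 1 = only print the commands

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

install_ofed_into_image() {
    # Each step runs only if the previous one succeeded, so a failed
    # install never gets packed into the image.
    run cp -a "$OFED_SRC" "$ROOTIMG/tmp/ofed" &&
    run chroot "$ROOTIMG" /bin/sh -c \
        "cd /tmp/ofed && ./mlnxofedinstall --without-32bit --without-fw-update" &&
    run rm -rf "$ROOTIMG/tmp/ofed" &&
    run packimage "$IMAGE"
}

install_ofed_into_image
```

Run it once as is to review the printed commands, then again with DRY_RUN=0 on the management node once the paths are right.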
Configuration for Diskless Installation - xCAT 2.11 documentation
xCAT / Wiki / Managing_the_Mellanox_Infiniband_Network (before moving to xcat.org)
As for mass installation or upgrade (the two are pretty much identical): OFED itself does not have a mechanism for it, but you can use xCAT or any other tool that can do parallel ssh (pdsh, etc.).
For mass FW update it is the same: you can use xCAT to run the same commands on each node. If you want to update the FW without installing OFED, this is a good place to start: NVIDIA Networking Firmware Downloads
Just run the process on all nodes.
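Either mass step comes down to running one command on many nodes in parallel. A minimal sketch, assuming the OFED files sit at a path every node can see: the node list, the path and the tool choice are hypothetical defaults, and the script only prints the command unless DRY_RUN=0. Note that xdsh takes an xCAT noderange (node names or groups), while pdsh -w expects a plain host list.

```shell
#!/bin/sh
# Sketch: run the same command on many nodes through xdsh (xCAT) or pdsh.
# Every default below is a hypothetical placeholder; adjust to your site.
NODES="${NODES:-compute}"                   # xCAT noderange, or a host list for pdsh
OFED_DIR="${OFED_DIR:-/install/ofed/MLNX_OFED}"  # hypothetical path visible on every node
PARALLEL_TOOL="${PARALLEL_TOOL:-xdsh}"      # xdsh or pdsh
DRY_RUN="${DRY_RUN:-1}"                     # 1 = only print the command

run_on_nodes() {
    remote_cmd=$1
    # Build the argument list for the chosen parallel shell.
    case "$PARALLEL_TOOL" in
        xdsh) set -- xdsh "$NODES" "$remote_cmd" ;;
        pdsh) set -- pdsh -w "$NODES" "$remote_cmd" ;;
        *) echo "unsupported tool: $PARALLEL_TOOL" >&2; return 1 ;;
    esac
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Mass install with the same parameters as the single-node case.
run_on_nodes "cd $OFED_DIR && ./mlnxofedinstall --without-32bit --without-fw-update"
```

Judging by the flag name, leaving out --without-fw-update should let the installer handle the firmware as well, but I have not verified that here. Also keep in mind that on stateless nodes anything installed this way is gone after a reboot, so for those the image is the place to put OFED.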
Again, it has been a while, so sorry if I missed some details, but let me know if you see any issues and I will try to help.