Hello folks,
Hope you are all doing well!
I'm an HPC admin trainee, and I have an issue on my cluster. One of the nodes is not able to run because ib_sdp shows [FAILED] at boot time. I tried the following commands:
#/etc/init.d/openibd restart
Unloading ib_addr [FAILED]
ERROR: Module ib_addr is in use by ib_core
#service openibd stop
Unloading ib_addr [FAILED]
ERROR: Module ib_addr is in use by ib_core
#service openibd start
ls: cannot access /sys/class/infiniband/qib*: No such file or directory
Loading HCA driver and Access Layer: [ OK ]
Setting up InfiniBand network interfaces:
Determining if ip address 192.168.x.x is already in use for device ib0…
Bringing up interface ib0: [ OK ]
Setting up service network . . . [ done ]
Loading ib_sdp [FAILED]
Kindly help me resolve this issue.
Thanks in advance!!