IB network card IPs not working: addresses are intermittently missing when checked repeatedly with the ip addr command

My IB network environment is as follows:
Each GPU server has 8 IB cards, and multiple GPU servers are connected to a 9790 unmanaged switch, which in turn is connected to a managed switch;
Each server’s IB ports are statically configured with IP addresses, such as 10.0.0.1 through 10.0.0.8, and all of the ports are in the same subnet;
The problem is that when I check the ports repeatedly on the server with the ip addr command, the results vary. Sometimes all of the ports' IP addresses are displayed and the network is reachable, sometimes only some of them are displayed, and sometimes none are displayed at all.



I don’t know where the problem is. Can multiple network cards on the same server be configured with addresses in the same network segment? Is that the cause, or is there a problem with my IB-related configuration?

Hi,

If I understand correctly from your description, the issue you are describing is that IPoIB addresses are disappearing after configuration.
This usually happens when the same IP address is configured on more than one interface in the fabric, or when something else is misconfigured.
I suggest flushing the neighbor cache on all ports using ip -s -s neigh flush all.
Afterwards, try configuring the addresses again and see if they stick.
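
As a rough sketch of the steps above (not an official procedure), the shell functions below look for an address that appears on more than one interface, flush the neighbor cache and re-apply one address, and list which expected addresses are currently missing. The function names, the interface name ib0, and the /24 prefix in the usage comments are assumptions for illustration only; substitute your own values. ip addr replace is used here instead of ip addr add only so that the command does not fail when the address is already present.

```shell
#!/bin/sh
# Sketch only. Function names, interface names and addresses are placeholders.

# Read `ip -o -4 addr show` output on stdin and print every IPv4 address
# that appears on more than one non-loopback interface.
# Fields in that output: index, ifname, "inet", addr/prefix, ...
duplicate_addrs() {
    awk '$2 == "lo" { next }
         { split($4, a, "/"); n[a[1]]++ }
         END { for (ip in n) if (n[ip] > 1) print ip }' | sort
}

# Read `ip -o -4 addr show` output on stdin and print each expected
# address (given as arguments, without prefix) that is not present.
missing_addrs() {
    present=$(awk '{ split($4, a, "/"); print a[1] }')
    for want in "$@"; do
        printf '%s\n' "$present" | grep -qxF -- "$want" || printf '%s\n' "$want"
    done
}

# Flush the neighbor cache, then re-apply one address. Needs root.
#   $1 = interface name, $2 = address with prefix length
flush_and_readd() {
    ip -s -s neigh flush all
    ip addr replace "$2" dev "$1"
}

# Example usage (values are assumed, adjust to your setup):
#   ip -o -4 addr show | duplicate_addrs
#   flush_and_readd ib0 10.0.0.1/24
#   ip -o -4 addr show | missing_addrs 10.0.0.1 10.0.0.2
```

To check across the fabric rather than a single server, the output of ip -o -4 addr show collected from several servers can be concatenated and piped into duplicate_addrs. Addresses set this way do not survive a reboot, so a persistent network configuration, if one is in use, still needs to match.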

If the issue persists after this, please open a case by emailing enterprisesupport@nvidia.com; it will be handled based on entitlement.

Thanks,
Jonathan.