Hi all, we inherited an HPC system where each node has a single physical port, with two x16 cards for Socket Direct (this is RHEL 8). lspci shows both the main board and the auxiliary card, as it should, but we’re confused because we can’t find any documentation on best practices for the two interfaces, ib0 and ib1, that come up. We’ve just been using ib0 and leaving ib1 with no IP address assigned.
lspci shows what the driver installation instructions say we should see, but nowhere can I find which interface, if any, is preferred, or whether there’s a difference between them. Should they be bonded? Is the auxiliary card only there for supplemental power? My understanding was that it is there so that the two sockets of a dual-socket machine could each connect through a different card, but would the two interfaces then need their own IP addresses?
7f:00.0 Infiniband controller: Mellanox Technologies MT28908 Family [ConnectX-6]
ac:00.0 Infiniband controller: Mellanox Technologies MT28908 Family [ConnectX-6]
CA 'mlx5_0'
CA type: MT4123
Number of ports: 1
Firmware version: 20.35.3006
Hardware version: 0
Node GUID: 0x0c42a1030099939c
System image GUID: 0x0c42a1030099939c
Port 1:
State: Active
Physical state: LinkUp
Rate: 200
Base lid: 25
LMC: 0
SM lid: 5
Capability mask: 0xa651e848
Port GUID: 0x0c42a1030099939c
Link layer: InfiniBand
CA 'mlx5_1'
CA type: MT4123
Number of ports: 1
Firmware version: 20.35.3006
Hardware version: 0
Node GUID: 0x0c42a1030099939e
System image GUID: 0x0c42a1030099939c
Port 1:
State: Active
Physical state: LinkUp
Rate: 200
Base lid: 36
LMC: 0
SM lid: 5
Capability mask: 0xa641e848
Port GUID: 0x0c42a1030099939e
Link layer: InfiniBand
ib0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 2044
inet 192.168.30.11 netmask 255.255.255.0 broadcast 192.168.30.255
inet6 fe80::e42:a103:99:939c prefixlen 64 scopeid 0x20<link>
Infiniband hardware address can be incorrect! Please read BUGS section in ifconfig(8).
infiniband 00:00:00:4C:FE:80:00:00:00:00:00:00:00:00:00:00:00:00:00:00 txqueuelen 256 (InfiniBand)
RX packets 1293992 bytes 1493855506 (1.3 GiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 2930310 bytes 3604531200 (3.3 GiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
ib1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 2044
Infiniband hardware address can be incorrect! Please read BUGS section in ifconfig(8).
infiniband 00:00:01:F0:FE:80:00:00:00:00:00:00:00:00:00:00:00:00:00:00 txqueuelen 256 (InfiniBand)
RX packets 38052 bytes 3805232 (3.6 MiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 0 bytes 0 (0.0 B)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
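One thing that might help narrow down the dual-socket question is checking which NUMA node each device reports. Below is a minimal sketch, assuming the standard Linux sysfs layout in which /sys/class/infiniband/<device>/device/numa_node holds the node number and /sys/class/infiniband/<device>/device/net/ lists the matching IPoIB interface. The function name ib_numa_map is hypothetical, and this has not been run on the nodes above.

```python
from pathlib import Path


def ib_numa_map(sysfs_root="/sys"):
    """Map each InfiniBand device to its NUMA node and network interfaces.

    Assumes the usual sysfs layout; returns an empty dict when no
    class/infiniband directory exists under sysfs_root.
    """
    result = {}
    ib_class = Path(sysfs_root) / "class" / "infiniband"
    if not ib_class.is_dir():
        return result
    for dev in sorted(ib_class.iterdir()):
        pci = dev / "device"  # link to the underlying PCI function
        numa_file = pci / "numa_node"
        # The kernel reports -1 here when it has no NUMA information.
        numa = int(numa_file.read_text().strip()) if numa_file.is_file() else None
        net_dir = pci / "net"
        netdevs = sorted(p.name for p in net_dir.iterdir()) if net_dir.is_dir() else []
        result[dev.name] = {"numa_node": numa, "netdevs": netdevs}
    return result


if __name__ == "__main__":
    # Print one line per device: name, NUMA node, interface names.
    for name, info in ib_numa_map().items():
        print(name, info["numa_node"], ",".join(info["netdevs"]))
```

If mlx5_0 and mlx5_1 report different NUMA nodes, that would be consistent with each card hanging off a different socket. It would not by itself say whether ib1 needs its own IP address.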