Please help…
Hey there, can we get more details:
- Operating system
- OFED version
- Firmware Version
- Symptoms you are seeing
- Any other details?
Any update, please?
- Operating system: CentOS 6.3 (kernel 2.6.32-279.5.1.) x86_64
- OFED version: tried with both OFED 1.5.4.1 and MLNX_OFED_LINUX-1.5.3-3.1.0-rhel6.3-x86_64
- Firmware version: 2.10.700
- IB is working with both OFED versions.
- When I probe the Lustre module with LNet configured for o2ib, the following messages appear:
Mar 18 07:44:12 mds1 kernel: ko2iblnd: disagrees about version of symbol ib_dealloc_pd
Mar 18 07:44:12 mds1 kernel: ko2iblnd: Unknown symbol ib_dealloc_pd
Mar 18 07:44:12 mds1 kernel: ko2iblnd: disagrees about version of symbol ib_fmr_pool_map_phys
Mar 18 07:44:12 mds1 kernel: ko2iblnd: Unknown symbol ib_fmr_pool_map_phys
Mar 18 07:44:12 mds1 modprobe: FATAL: Error inserting ko2iblnd (/lib/modules/2.6.32-279.14.1.el6_lustre.x86_64/updates/kernel/net/lustre/ko2iblnd.ko): Unknown symbol in module, or unknown parameter (see dmesg)
Mar 18 07:44:12 mds1 kernel: LustreError: 4431:0:(api-ni.c:1055:lnet_startup_lndnis()) Can't load LND o2ib, module ko2iblnd, rc=256
Mar 18 07:44:12 mds1 kernel: LustreError: 4431:0:(events.c:742:ptlrpc_init_portals()) network initialisation failed
Mar 18 07:44:15 mds1 kernel: ib0: multicast join failed for ff12:401b:ffff:0000:0000:0000:ffff:ffff, status -22
Mar 18 07:44:31 mds1 kernel: ib0: multicast join failed for ff12:401b:ffff:0000:0000:0000:ffff:ffff, status -22
Mar 18 07:44:47 mds1 kernel: ib0: multicast join failed for ff12:401b:ffff:0000:0000:0000:ffff:ffff, status -22
Mar 18 07:45:03 mds1 kernel: ib0: multicast join failed for ff12:401b:ffff:0000:0000:0000:ffff:ffff, status -22
The messages above appear in /var/log/messages during the probe, and the module does not load.
Please send a link to the drivers for the following IB adapter:
ibstat
CA 'mlx4_0'
    CA type: MT4099
    Number of ports: 1
    Firmware version: 2.10.700
    Hardware version: 0
    Node GUID: 0x0002c903001c50b0
    System image GUID: 0x0002c903001c50b3
    Port 1:
        State: Initializing
        Physical state: LinkUp
        Rate: 56
        Base lid: 0
        LMC: 0
        SM lid: 0
        Capability mask: 0x02514868
        Port GUID: 0x0002c903001c50b1
        Link layer: InfiniBand
Thanks for the response.
Venkat
I’m pretty sure this is a Lustre error, not an OFED issue. I am not a Lustre expert, though, so I’ve asked someone to jump into this Q&A, and hopefully they will be able to assist soon.
Thank you very much.
Here is what one of my contacts stated:
"They have not compiled the Lustre modules against the correct InfiniBand symbols:
Mar 18 07:44:12 mds1 kernel: ko2iblnd: disagrees about version of symbol ib_fmr_pool_map_phys
They must compile Lustre against the OFED distribution that is actually installed to be able to load the ko2iblnd module. Also, if they are using ko2iblnd, they are not trying to use Lustre over IPoIB; they are trying to use native verbs, which is recommended."
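
For anyone who wants to confirm the mismatch described above before rebuilding, here is a minimal Python sketch. It compares the symbol CRCs a module expects with the CRCs the installed InfiniBand stack exports. It assumes both inputs are text whose lines begin with a hex CRC followed by a symbol name, which is the layout of `modprobe --dump-modversions <module.ko>` output and of a `Module.symvers` file. The function names and the sample CRC values are made up for illustration and are not taken from the thread.

```python
def parse_crc_table(text):
    """Map symbol name -> CRC from lines shaped '<hex crc> <symbol> [...]'.

    Both `modprobe --dump-modversions <module.ko>` output and a
    Module.symvers file begin each line with these two fields.
    """
    table = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or truncated lines
        try:
            table[fields[1]] = int(fields[0], 16)
        except ValueError:
            continue  # skip lines that do not start with a hex CRC
    return table


def find_mismatches(module_text, symvers_text):
    """Return {symbol: (crc_module_expects, crc_stack_exports)} where they differ.

    Symbols missing from symvers_text (for example, ones exported by the
    base kernel rather than by the IB stack) are ignored.
    """
    expected = parse_crc_table(module_text)
    exported = parse_crc_table(symvers_text)
    return {
        sym: (crc, exported[sym])
        for sym, crc in expected.items()
        if sym in exported and exported[sym] != crc
    }


# Hypothetical CRC values, for illustration only.
SAMPLE_MODVERSIONS = """\
0x11111111\tib_dealloc_pd
0x22222222\tib_fmr_pool_map_phys
0x33333333\tib_destroy_cq
"""
SAMPLE_SYMVERS = """\
0xaaaaaaaa\tib_dealloc_pd\tdrivers/infiniband/core/ib_core\tEXPORT_SYMBOL
0xbbbbbbbb\tib_fmr_pool_map_phys\tdrivers/infiniband/core/ib_core\tEXPORT_SYMBOL
0x33333333\tib_destroy_cq\tdrivers/infiniband/core/ib_core\tEXPORT_SYMBOL
"""

if __name__ == "__main__":
    mismatches = find_mismatches(SAMPLE_MODVERSIONS, SAMPLE_SYMVERS)
    for sym, (want, have) in sorted(mismatches.items()):
        print(f"{sym}: module expects {want:#010x}, stack exports {have:#010x}")
```

On a real system, the first input would come from running `modprobe --dump-modversions` on the ko2iblnd.ko file named in the log, and the second from the `Module.symvers` shipped with the installed OFED kernel sources (its location depends on the OFED package). Any symbol this reports is one the kernel will reject with "disagrees about version of symbol", which means Lustre was built against a different InfiniBand stack than the one that is loaded.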