Facing a problem configuring Lustre (2.3) LNet using IPoIB. Please suggest. (MT4099)

Please help…

Hey there, can we get more details:

  1. Operating system
  2. OFED version
  3. Firmware Version
  4. Symptoms you are seeing
  5. Any other details?

Any update, please?

  1. Operating system: CentOS 6.3 (kernel 2.6.32-279.5.1), x86_64
  2. Tried with OFED 1.5.4.1 and MLNX_OFED_LINUX-1.5.3-3.1.0-rhel6.3-x86_64
  3. Firmware version: 2.10.700
  4. IB is working with both OFED versions.
  5. When I probe the lustre module with lnet set to o2ib, the following messages appear:

Mar 18 07:44:12 mds1 kernel: ko2iblnd: disagrees about version of symbol ib_dealloc_pd
Mar 18 07:44:12 mds1 kernel: ko2iblnd: Unknown symbol ib_dealloc_pd
Mar 18 07:44:12 mds1 kernel: ko2iblnd: disagrees about version of symbol ib_fmr_pool_map_phys
Mar 18 07:44:12 mds1 kernel: ko2iblnd: Unknown symbol ib_fmr_pool_map_phys
Mar 18 07:44:12 mds1 modprobe: FATAL: Error inserting ko2iblnd (/lib/modules/2.6.32-279.14.1.el6_lustre.x86_64/updates/kernel/net/lustre/ko2iblnd.ko): Unknown symbol in module, or unknown parameter (see dmesg)
Mar 18 07:44:12 mds1 kernel: LustreError: 4431:0:(api-ni.c:1055:lnet_startup_lndnis()) Can't load LND o2ib, module ko2iblnd, rc=256
Mar 18 07:44:12 mds1 kernel: LustreError: 4431:0:(events.c:742:ptlrpc_init_portals()) network initialisation failed
Mar 18 07:44:15 mds1 kernel: ib0: multicast join failed for ff12:401b:ffff:0000:0000:0000:ffff:ffff, status -22
Mar 18 07:44:31 mds1 kernel: ib0: multicast join failed for ff12:401b:ffff:0000:0000:0000:ffff:ffff, status -22
Mar 18 07:44:47 mds1 kernel: ib0: multicast join failed for ff12:401b:ffff:0000:0000:0000:ffff:ffff, status -22
Mar 18 07:45:03 mds1 kernel: ib0: multicast join failed for ff12:401b:ffff:0000:0000:0000:ffff:ffff, status -22

Please see the messages above, which were logged in /var/log/messages during the probe. The module is not loading.

Please send the driver link for the following IB adapter:

ibstat
CA 'mlx4_0'
    CA type: MT4099
    Number of ports: 1
    Firmware version: 2.10.700
    Hardware version: 0
    Node GUID: 0x0002c903001c50b0
    System image GUID: 0x0002c903001c50b3
    Port 1:
        State: Initializing
        Physical state: LinkUp
        Rate: 56
        Base lid: 0
        LMC: 0
        SM lid: 0
        Capability mask: 0x02514868
        Port GUID: 0x0002c903001c50b1
        Link layer: InfiniBand

Thanks for the response.

Venkat

I’m pretty sure this is a Lustre error, not an OFED issue. I am not a Lustre expert, though, so I’ve asked someone to jump into this Q&A, and hopefully they will be able to assist soon.

Thank you very much.

Here is what one of my contacts stated:

"They have not compiled the Lustre modules against the correct InfiniBand symbols:

Mar 18 07:44:12 mds1 kernel: ko2iblnd: disagrees about version of symbol ib_fmr_pool_map_phys

They must compile against the correct OFED distribution to be able to load the ko2iblnd module. Also, if they are using ko2iblnd, they are not trying to use Lustre over IPoIB; they are using native verbs, which is the recommended approach."
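
To make the diagnosis above easier to check, here is a minimal Python sketch (not an official Lustre or OFED tool). It pulls the rejected symbols out of log text like the excerpt in this thread and compares symbol CRCs between two listings. The function names are hypothetical, and the CRC parsing assumes each line begins with a `0x<crc>` field followed by the symbol name, which is how `Module.symvers` and the output of `modprobe --dump-modversions` are normally laid out. Verify that against your own files before relying on it.

```python
import re

# Kernel message of the form:
#   "... kernel: <module>: disagrees about version of symbol <symbol>"
MISMATCH_RE = re.compile(
    r"kernel: (\S+): disagrees about version of symbol (\S+)"
)


def mismatched_symbols(log_text):
    """Return {module: sorted symbols} for every version mismatch in the log."""
    found = {}
    for module, symbol in MISMATCH_RE.findall(log_text):
        found.setdefault(module, set()).add(symbol)
    return {module: sorted(symbols) for module, symbols in found.items()}


def parse_crcs(text):
    """Parse lines shaped like '0x<crc> <symbol> ...' into {symbol: crc}.

    Assumption: the first two whitespace-separated fields are the CRC and
    the symbol name; any further fields are ignored.
    """
    crcs = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0].startswith("0x"):
            crcs[fields[1]] = fields[0]
    return crcs


def crc_conflicts(module_crcs, provider_crcs):
    """Symbols present in both listings whose CRCs differ."""
    return sorted(
        symbol
        for symbol, crc in module_crcs.items()
        if symbol in provider_crcs and provider_crcs[symbol] != crc
    )


if __name__ == "__main__":
    sample_log = (
        "Mar 18 07:44:12 mds1 kernel: ko2iblnd: disagrees about version "
        "of symbol ib_dealloc_pd\n"
        "Mar 18 07:44:12 mds1 kernel: ko2iblnd: disagrees about version "
        "of symbol ib_fmr_pool_map_phys\n"
    )
    for module, symbols in mismatched_symbols(sample_log).items():
        print(module, "was built against different versions of:", symbols)
```

In practice one would feed `parse_crcs` the modversions dump of `ko2iblnd.ko` and the `Module.symvers` of the installed OFED. Any symbol reported by `crc_conflicts` suggests the Lustre modules need to be rebuilt against that OFED tree.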