ESX 5.1 IPoIB driver crash

Hello,

after two weeks of testing and firmware patching I think we have found a major bug in the ESX 5.1 OFED 1.8.1.0 IPoIB driver. We are currently running a Fujitsu RX300 S6 (dual Xeon X5670) with a Mellanox ConnectX-2 MHRH2A (firmware 2.9.1200). The storage server runs Ubuntu 12.04 LTS with an older ConnectX (PCIe Gen2) card and Linux kernel 3.5. In between sits a 24-port DDR Flextronics IB CX4 switch, so our maximum MTU is limited to 2K, but that is no problem for us.

On the ESX host the InfiniBand card serves as a VMkernel interface and as a VM port group at the same time. A running VM has its “local” disks mounted over the VMkernel interface via IPoIB. Inside the VM we have mounted an NFS filesystem from the NFS server. So it looks like:

vm:~ # df
Filesystem                     1K-blocks       Used   Available Use% Mounted on
/dev/sda1                       61927388    3577888    55203784   7% /        (mounted by ESX)
10.10.30.253:/var/nas/backup 11007961088 6360753152  4647207936  58% /backup  (mounted inside VM)

To reproduce the error we copy data into the VM using SCP, with /backup as the target. After copying a few gigabytes of data the InfiniBand card stops working and the ESX kernel logs the following error message. This situation cannot be resolved without rebooting the ESX host.

WARNING: LinDMA: Linux_DMACheckContraints:149:Cannot map machine address = 0x15ffff37b0, length = 65160 for device 0000:02:00.0; reason = buffer straddles device dma boundary (0xffffffff)
<3>vmnic_ib1:ipoib_send:504: found skb where it does not belong
tx_head = 323830, tx_tail = 323830
<3>vmnic_ib1:ipoib_send:505: netif_queue_stopped = 0
Backtrace for current CPU #20, worldID=8212, ebp=0x41220051b028
ipoib_send@#+0x5d4 stack: 0x41800c4524aa, 0x4f0f5000000d
ipoib_send@#+0x5d4 stack: 0x41800c44bca8, 0x41000fe5d6c0
ipoib_start_xmit@#+0x53 stack: 0x41220051b238, 0x41800c4
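As a side note (my own interpretation, not taken from the driver source): the numbers in the warning are consistent with a DMA segment crossing a 4 GiB window. The buffer starts at 0x15ffff37b0 and its last byte sits at 0x15ffff37b0 + 65160 - 1 = 0x1600003637, so start and end fall on different sides of the 0xffffffff boundary. A minimal standalone sketch of that boundary check, using the values from the log:

#include <stdint.h>
#include <stdio.h>

/* Illustrative boundary check with the values from the warning above.
 * The mask logic mirrors the usual "segment must not cross the dma
 * boundary" rule; the real VMkernel/LinDMA code is not shown here. */
int main(void)
{
    uint64_t addr     = 0x15ffff37b0ULL;  /* machine address from the log   */
    uint64_t len      = 65160;            /* length from the log            */
    uint64_t boundary = 0xffffffffULL;    /* 32-bit boundary (4 GiB window) */

    uint64_t start_window = addr & ~boundary;
    uint64_t end_window   = (addr + len - 1) & ~boundary;

    printf("start window 0x%llx, end window 0x%llx -> %s\n",
           (unsigned long long)start_window,
           (unsigned long long)end_window,
           start_window == end_window ? "fits" : "straddles dma boundary");
    return 0;
}

Compiled with any C compiler, this reports that the buffer straddles the boundary, which matches the LinDMA complaint.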

In the process of eliminating the error we tried the following (without success):

  1. Updated the server firmware to the latest version

  2. Switched from the ConnectX to the ConnectX-2 card

  3. Switched from firmware 2.9.1000 to 2.9.1200

Everything works fine if we use the InfiniBand card only as a VMkernel interface. More details are in my first post: Infrastructure & Networking - NVIDIA Developer Forums

Any help is appreciated.

Wonderful!

I will check with the folks if they have an ETA for a permanent fix.

Hi Markus,

Thank you for taking the time to post. I poked around with some smart engineers and was able to get some insight, in addition to the data you provided.

The issue here is the SCSI mid-layer modifying the DMA device's dma_boundary attribute underneath IPoIB (from 64-bit to 32-bit).

This happens because SRP adds a new SCSI host while leaving the dma_boundary attribute of its scsi_host template at the default.

In that case the SCSI mid-layer overrides the DMA device's dma_boundary with the default (a 32-bit boundary), causing IPoIB allocations that cross the 32-bit boundary to fail and possibly crash.
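For readers wondering where the 32-bit default comes from: in the Linux SCSI mid-layer (which the ESX OFED stack is derived from), scsi_host_alloc() falls back to a 0xffffffff dma_boundary whenever the host template leaves the field unset. The standalone sketch below only models that selection logic, using the mainline field name dma_boundary; the actual ESX 5.1 compat code and the SRP template are not reproduced here:

#include <stdint.h>
#include <stdio.h>

/* Standalone model of the dma_boundary selection done when a SCSI host is
 * added (patterned after mainline Linux scsi_host_alloc()). A template that
 * leaves dma_boundary at 0 gets the 32-bit default, which then constrains
 * DMA mappings for the whole device, including IPoIB traffic on the same
 * HCA. Simplified structures; not the real kernel definitions. */

struct host_template_model { uint64_t dma_boundary; };
struct scsi_host_model     { uint64_t dma_boundary; };

static void model_scsi_host_alloc(struct scsi_host_model *shost,
                                  const struct host_template_model *sht)
{
    if (sht->dma_boundary)
        shost->dma_boundary = sht->dma_boundary;   /* driver-chosen boundary */
    else
        shost->dma_boundary = 0xffffffffULL;       /* default: 4 GiB windows */
}

int main(void)
{
    /* An SRP-like template that never sets dma_boundary (the case described
     * above) versus a hypothetical template that asks for a 64-bit boundary. */
    struct host_template_model srp_default = { .dma_boundary = 0 };
    struct host_template_model srp_64bit   = { .dma_boundary = ~0ULL };
    struct scsi_host_model host;

    model_scsi_host_alloc(&host, &srp_default);
    printf("template with default  -> dma_boundary = 0x%llx\n",
           (unsigned long long)host.dma_boundary);

    model_scsi_host_alloc(&host, &srp_64bit);
    printf("template set explicitly -> dma_boundary = 0x%llx\n",
           (unsigned long long)host.dma_boundary);
    return 0;
}

Removing the SRP driver simply means no extra SCSI host is registered, so the device keeps its 64-bit boundary and IPoIB mappings are no longer clamped.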

To avoid this problem, it is recommended to uninstall SRP (if you do not need it) using:

$ esxcli software vib remove -n scsi-ib-srp

$ reboot

I hope that it will help…

Cheers!

Fine,

after 200 GB of transferred data without problems I can confirm that your workaround fixes our problem. We do not need SRP, so no more headaches. Maybe you could let other interested users know whether this will be fixed in a future driver version.

Thanks.

The guys are saying a fix is coming soon (a matter of 2-3 weeks). Hold on…

Do you mean that a new driver will be released for vSphere 5.x?