Hi!
I think all of these problems originate from VMware's native driver model.
The new native AHCI driver is buggy, so almost all users disabled it, rebooted the ESXi host, and went back to the legacy (vmklinux) driver.
So I decided to do the same for the Mellanox HCAs: disable all native drivers, uninstall the inbox drivers, and then install Mellanox OFED 1.8.2.5 on my ESXi 6.0 hosts.
I'm testing the Mellanox SRP driver 1.8.2.5 on ESXi 6.0 as described below.
A. Configurations
All HCAs are MHQH29B-XTR with firmware 2.9.1200
- If the test succeeds, I'll switch all HCAs to ConnectX-3 MCX354A-FCBT
2 x SX6036G FDR14 gateway switches
- Configured a different SM prefix on each switch for SRP path optimization
OmniOS with the latest updates, running as a ZFS COMSTAR SRP target
B. Installation
- Disable the native vRDMA drivers (they are very buggy)
esxcli system module set --enabled=false -m=nrdma
esxcli system module set --enabled=false -m=nrdma_vmkapi_shim
esxcli system module set --enabled=false -m=nmlx4_rdma
esxcli system module set --enabled=false -m=vmkapi_v2_3_0_0_rdma_shim
esxcli system module set --enabled=false -m=vrdma
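
For anyone who wants to script this step across several hosts, here is a minimal sketch, not something I have run in exactly this form. It assumes a POSIX shell such as the ESXi shell, and the `DRY_RUN` variable and the `run` helper are my own names for illustration. By default it only prints the commands; set `DRY_RUN=0` on the host to apply them.

```shell
#!/bin/sh
# Dry run by default: commands are printed, not executed.
# Set DRY_RUN=0 on the ESXi host to actually apply them.
DRY_RUN="${DRY_RUN:-1}"

# Module names copied from the commands listed above.
RDMA_MODULES="nrdma nrdma_vmkapi_shim nmlx4_rdma vmkapi_v2_3_0_0_rdma_shim vrdma"

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

disable_rdma_modules() {
    for m in $RDMA_MODULES; do
        # A module may be absent on some builds, so warn and keep going.
        run esxcli system module set --enabled=false --module="$m" ||
            echo "warning: could not disable $m" >&2
    done
}

disable_rdma_modules
```

After a real run, `esxcli system module list` should show which modules are still enabled.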
- Uninstall the inbox drivers (also useless, since they can't support Ethernet iSER properly)
esxcli software vib remove -n net-mlx4-en
esxcli software vib remove -n net-mlx4-core
esxcli software vib remove -n nmlx4-rdma
esxcli software vib remove -n nmlx4-en
esxcli software vib remove -n nmlx4-core
esxcli software vib remove -n nmlx5-core
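
The six removals above can also be collapsed into one call, since `esxcli software vib remove` accepts the `-n` option more than once. A rough sketch under the same assumptions as before (POSIX shell, my own `DRY_RUN` switch, prints the command unless `DRY_RUN=0`):

```shell
#!/bin/sh
# Dry run by default: the command is printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

# VIB names copied from the commands listed above.
INBOX_VIBS="net-mlx4-en net-mlx4-core nmlx4-rdma nmlx4-en nmlx4-core nmlx5-core"

remove_inbox_vibs() {
    # Build the argument list in the positional parameters: one -n per VIB.
    set -- esxcli software vib remove
    for v in $INBOX_VIBS; do
        set -- "$@" -n "$v"
    done

    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

remove_inbox_vibs
```

If one of the VIBs is not installed on a given host, drop it from `INBOX_VIBS` first; I have not checked how a combined call behaves when a name is missing.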
- Install Mellanox OFED 1.8.2.5 for ESXi 6.x
esxcli software vib install -d /var/log/vmware/MLNX-OFED-ESX-1.8.2.5-10EM-600.0.0.2494585.zip

- Register a VMware NMP path optimization rule
- This is optional. I now use an ESOS Linux-based SRP target that supports all of the VAAI features…:)
esxcli storage nmp satp rule add -s "VMW_SATP_ALUA" -V "SUN" -M "COMSTAR" -P "VMW_PSP_RR" -O "iops=1" -e "Sun ZFS COMSTAR" -c "tpgs_on"
esxcli storage core claimrule load
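
To confirm the rule was registered, you can filter the SATP rule list. This is only a sketch: `has_comstar_rule` is a helper name I made up, and it assumes the list output puts the vendor (SUN) and model (COMSTAR) on the same line, which I have not verified against every build.

```shell
#!/bin/sh
# Reads `esxcli storage nmp satp rule list` output on stdin.
# Succeeds if some line mentions both vendor SUN and model COMSTAR.
has_comstar_rule() {
    grep 'SUN' | grep -q 'COMSTAR'
}

# Intended use on the ESXi host (not executed by this sketch):
#   esxcli storage nmp satp rule list -s VMW_SATP_ALUA | has_comstar_rule \
#       && echo "COMSTAR rule present"
```

After the reboot, `esxcli storage nmp device list` should show the COMSTAR LUNs claimed with VMW_SATP_ALUA and VMW_PSP_RR if the rule took effect.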
Then reboot the ESXi 6.0 host.
All ESXi 6.0 hosts have been working properly for 3 days.
I'll report back in this thread in a few weeks…:)
P.S.
If all ESXi 6.0 hosts work properly for 1 week, I'll switch all hosts to ESXi 6.5 for another test…:)
PS2 (updated 20 June 2017)
All of the ESXi 6.0 and 6.5 tests work perfectly…:)
But ESXi 6.5 P01 has a problem with in-guest SCSI UNMAP that causes VMDK performance problems; the SRP target is not the cause.
I'm now running the latest ESXi 6.0 build with an ESOS SCST SRP target, which gives me perfect performance and is rock solid!
The whole problem was VMware's inbox native vRDMA driver, which conflicts with SRP driver 1.8.2.5 and causes ESXi 6.x PSODs.
Stand by!
I think VMware will release ESXi 6.5 Update 1 in July 2017.
I'm waiting for a native IPoIB (InfiniBand) iSER driver…:)
But I'm afraid that VMware and Mellanox may discontinue support for ConnectX-3 HCAs…:(
The ESXi 6.5 inbox driver currently supports Ethernet iSER on ConnectX-4 only.
If iSER support for ConnectX-3, whether over IPoIB or Ethernet, is discontinued, I'll change the CX-3 operation mode from FDR InfiniBand to 40Gb Ethernet and then move the lab storage from ESOS to ScaleIO.next…:)