Can you do RoCEv2 without a switch?

Just looking for a sanity check. I have used a Linux target (SCST) to present disks to ESXi hosts with iSER. The Linux server has a dual-port ConnectX-4 card with a single 100Gb connection to each host. This has worked well.

I am attempting to test NVMe-oF with a similar setup. I believe I have the target configured correctly. VMware (ESXi 7) sees the NVMe-oF adapter, and when I run a discover on the controller I can see the namespace and it reports the correct size of the disk. I never receive an active path, though, and the target field is blank.
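
For reference, this is roughly how I am running the discovery and connect from the ESXi side with esxcli (vmhba67 matches the path in the log below, 10.10.10.1 is just a stand-in for the target's IP, and I may be misremembering an option name or two):

    # list the software NVMe over RDMA adapters ESXi created
    esxcli nvme adapter list
    # query the discovery controller on the Linux target
    esxcli nvme fabrics discover -a vmhba67 -i 10.10.10.1 -p 4420
    # connect to the subsystem NQN that discovery returned
    esxcli nvme fabrics connect -a vmhba67 -i 10.10.10.1 -p 4420 -s nqn.2010-06.com.purestorage.flasharray.1f3d6733c48eadcb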

When I check the vmkernel log, I see the HPP driver attempting to claim the path, but it fails with "not supported". I am theorizing that this may be because some RoCEv2 check is failing, but I can't tell for sure as I am unable to get any more detail about the error.
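
For what it's worth, this is how I have been poking at the path and claim rule state on the host; nothing in the output points at RoCE specifically, which is why I am only guessing:

    # controller and path state for the NVMe-oF adapter
    esxcli nvme controller list
    esxcli storage core path list
    # claim rules currently loaded (65534 is the default catch-all seen in the log)
    esxcli storage core claimrule list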

Should this test work? On the local Linux box, if I run both nvme and nvmet I am able to connect to the controller and mount the disk (roughly as sketched below). Farther down are the logs from both hosts related to the connection from the ESXi host, and the VMware error.
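
The local test is just nvme-cli pointed at the same IP/port the nvmet RDMA port listens on, something like this (10.10.10.1 is again a stand-in for my target IP):

    # host-side RDMA transport
    modprobe nvme-rdma
    # the discovery controller and the subsystem both respond locally
    nvme discover -t rdma -a 10.10.10.1 -s 4420
    nvme connect -t rdma -a 10.10.10.1 -s 4420 \
        -n nqn.2010-06.com.purestorage.flasharray.1f3d6733c48eadcb
    # the namespace shows up as a local block device and can be mounted
    nvme list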

I called it "pure" simply because I wasn't sure whether my NQN was adhering to a proper naming convention, so I duplicated one that I saw used in another test:
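
For context, that NQN is just the directory name I created for the subsystem under configfs; the rest of the target setup is the stock nvmet-over-RDMA procedure, roughly like this (the backing block device and IP are stand-ins for mine):

    # create the subsystem (the directory name becomes the NQN)
    cd /sys/kernel/config/nvmet/subsystems
    mkdir nqn.2010-06.com.purestorage.flasharray.1f3d6733c48eadcb
    cd nqn.2010-06.com.purestorage.flasharray.1f3d6733c48eadcb
    echo 1 > attr_allow_any_host
    # add a namespace backed by a block device (/dev/nvme0n1 is an example)
    mkdir namespaces/1
    echo /dev/nvme0n1 > namespaces/1/device_path
    echo 1 > namespaces/1/enable
    # create an RDMA port on the ConnectX-4 IP and link the subsystem to it
    cd /sys/kernel/config/nvmet/ports
    mkdir 1
    echo 10.10.10.1 > 1/addr_traddr
    echo rdma > 1/addr_trtype
    echo 4420 > 1/addr_trsvcid
    echo ipv4 > 1/addr_adrfam
    ln -s /sys/kernel/config/nvmet/subsystems/nqn.2010-06.com.purestorage.flasharray.1f3d6733c48eadcb 1/subsystems/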

On the Linux server, these entries are generated when I scan the controller and attempt to connect:

[ 475.028906] nvmet: creating controller 2 for subsystem nqn.2014-08.org.nvmexpress.discovery for NQN nqn.2014-08.Net.PSC:nvme:ESXi-2.

[ 475.093908] nvmet: creating controller 2 for subsystem nqn.2014-08.org.nvmexpress.discovery for NQN nqn.2014-08.Net.PSC:nvme:ESXi-2.

[ 478.676817] nvmet: creating controller 2 for subsystem nqn.2014-08.org.nvmexpress.discovery for NQN nqn.2014-08.Net.PSC:nvme:ESXi-2.

[ 478.746582] nvmet: creating controller 2 for subsystem nqn.2014-08.org.nvmexpress.discovery for NQN nqn.2014-08.Net.PSC:nvme:ESXi-2.

[ 480.835089] nvmet: creating controller 2 for subsystem nqn.2010-06.com.purestorage.flasharray.1f3d6733c48eadcb for NQN nqn.2014-08.Net.PSC:nvme:ESXi-2

On the ESXi host:

2020-04-24T12:54:03.383Z cpu1:2097454)WARNING: HPP: HppClaimPath:3719: Failed to claim path 'vmhba67:C0:T0:L0': Not supported

2020-04-24T12:54:03.383Z cpu1:2097454)HPP: HppUnclaimPath:3765: Unclaiming path vmhba67:C0:T0:L0

2020-04-24T12:54:03.383Z cpu1:2097454)ScsiPath: 8397: Plugin 'HPP' rejected path 'vmhba67:C0:T0:L0'

2020-04-24T12:54:03.383Z cpu1:2097454)ScsiClaimrule: 1568: Plugin HPP specified by claimrule 65534 was not able to claim path vmhba67:C0:T0:L0: Not supported

2020-04-24T12:55:03.383Z cpu1:2097454)HPP: HppCreateDevice:2957: Created logical device 'uuid.077ebd74d5e9406f94e26a4fc5b87517'.

2020-04-24T12:55:03.383Z cpu1:2097454)WARNING: HPP: HppClaimPath:3719: Failed to claim path 'vmhba67:C0:T0:L0': Not supported

2020-04-24T12:55:03.383Z cpu1:2097454)HPP: HppUnclaimPath:3765: Unclaiming path vmhba67:C0:T0:L0

2020-04-24T12:55:03.383Z cpu1:2097454)ScsiPath: 8397: Plugin 'HPP' rejected path 'vmhba67:C0:T0:L0'

2020-04-24T12:55:03.383Z cpu1:2097454)ScsiClaimrule: 1568: Plugin HPP specified by claimrule 2047 was not able to claim path vmhba67:C0:T0:L0: Not supported

2020-04-24T12:55:03.383Z cpu1:2097454)HPP: HppCreateDevice:2957: Created logical device 'uuid.077ebd74d5e9406f94e26a4fc5b87517'.

2020-04-24T12:55:03.383Z cpu1:2097454)WARNING: HPP: HppClaimPath:3719: Failed to claim path 'vmhba67:C0:T0:L0': Not supported

2020-04-24T12:55:03.383Z cpu1:2097454)HPP: HppUnclaimPath:3765: Unclaiming path vmhba67:C0:T0:L0

2020-04-24T12:55:03.383Z cpu1:2097454)ScsiPath: 8397: Plugin 'HPP' rejected path 'vmhba67:C0:T0:L0'

2020-04-24T12:55:03.383Z cpu1:2097454)ScsiClaimrule: 1568: Plugin HPP specified by claimrule 65534 was not able to claim path vmhba67:C0:T0:L0: Not supported

I have tried making a custom claim rule (2047); at this point I am confident the claim rule matches, but the HPP plugin doesn't like something it's seeing.
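
The custom rule was added along these lines (from memory, so the exact match options may be slightly off; the point is that rule 2047 does fire, as the log shows, and HPP still rejects the path):

    # send NVMe paths to HPP ahead of the default rule 65534
    esxcli storage core claimrule add -r 2047 -t vendor -V NVMe -M "*" -P HPP
    esxcli storage core claimrule load
    # retry claiming the path
    esxcli storage core claiming unclaim -t path -p vmhba67:C0:T0:L0
    esxcli storage core claimrule run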

Hello Grant,

Thank you for posting your inquiry on the Mellanox Community.

Based on your inquiry, we can confirm that you do not need a switch for RoCEv2 connectivity. You can also do this between two Mellanox ConnectX-4 adapters connected back to back.

For your reference, please review the following link → https://community.mellanox.com/s/article/howto-configure-roce-on-connectx-4

For the connectivity error you are getting when trying to connect to the target, unfortunately you will need to reach out to VMware, as this is part of the ESXi hypervisor platform, which Mellanox does not support.

Thank you,

~Mellanox Technical Support

Thank you for the response. One question I have regarding the article: the example shows how to identify the GID associated with RoCEv2 vs. v1. If I am running an nvmet target on the server, how can I ensure it is bound to the GID that would send and receive RoCEv2? From the nvmet setup I just specify the IP address, and that IP address is already configured on the system. I see examples from Mellanox on how to configure nvmet, but I do not see them get into the GID assignment discussion.
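
On the Linux side, the only handles I have found so far are the sysfs GID table and the RDMA-CM default RoCE mode in configfs (which I believe is what the MLNX_OFED cma_roce_mode script writes to). mlx5_0 and port 1 are from my box, and I am not certain this is the knob nvmet actually honors:

    # each GID index has a type ("IB/RoCE v1" or "RoCE v2") and an address
    grep -H . /sys/class/infiniband/mlx5_0/ports/1/gid_attrs/types/* 2>/dev/null
    grep -H . /sys/class/infiniband/mlx5_0/ports/1/gids/* 2>/dev/null
    # default RoCE version used by RDMA-CM consumers (nvmet connects via RDMA-CM)
    mkdir -p /sys/kernel/config/rdma_cm/mlx5_0
    cat /sys/kernel/config/rdma_cm/mlx5_0/ports/1/default_roce_mode
    echo "RoCE v2" > /sys/kernel/config/rdma_cm/mlx5_0/ports/1/default_roce_mode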

On the ESXi client, how can I be sure the client initiates from the GID associated with RoCEv2? Is this perhaps the default? I add an NVMe-oF adapter and associate it with the physical ConnectX-4 adapter. It shows the adapter is both RoCE v1 and v2 capable; how can I ensure it is using v2? I suspect one or both sides may be using v1.
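
On the ESXi side, the closest I have found is just listing the RDMA devices and the NVMe-oF adapters bound to them; I do not see an obvious knob there to force v2:

    # RDMA-capable uplinks / vmrdma devices
    esxcli rdma device list
    # NVMe-oF adapters and the vmrdma/vmnic they sit on
    esxcli nvme adapter list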

I know we are bleeding over into VMware questions, but I believe the driver is still provided by you, so you may have this answer. The features and configuration of NVMe-oF for ESXi and Mellanox cards do not seem to be well documented yet.

In my use case it would be fine to disable v1 if possible, assuming v2 would then be used by default in all cases…