Question about Dual SX6036G configurations on vSphere 6.0 Update 2

I built a small personal lab with two SX6036G gateway switches.

After a one-month POC period, I have the questions below.

Q1. Why is the IB SM disabled when ARP-Proxy mode is enabled?

At first I thought that a single SX6036G in VPI profile mode could run the IB SM for the IB network and also act as a VPI gateway for the external Ethernet connection.

But that configuration turned out to be impossible, so I added another SX6036G to run the SM.

Q2. Default partition (0x7fff) configuration

I want to enable IPoIB on a non-default partition (not 0x7fff).

When I disabled IPoIB on the default partition (0x7fff), IPoIB was also disabled on all other partitions.

After that, I could no longer ping (or vmkping) any ESXi host.

How can I resolve this problem?

Q3. 4K MTU configuration in a vSphere environment

I want to run throughput-sensitive services such as iSER, so I configured the MTU to 4K on all partitions, including the default partition (0x7fff).

But with the 4K MTU set, the ESXi hosts can’t connect to each other, and no IPoIB service works either!

Whenever the ESXi hosts boot, they can’t connect to the IPoIB network with the 4K MTU, and a "can’t join multicast" message appears in the ESXi host’s boot log.

How can I resolve this problem?
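For context on the multicast join message above: in IPoIB, every partition has its own broadcast multicast group, and a host must join that group before IPoIB works on the partition. The sketch below builds the IPv4 broadcast MGID for a given P_Key, assuming the layout described in RFC 4391 and a link-local scope (the `ff12` prefix). The function name is hypothetical and is not part of MLNX-OS or the vSphere OFED driver.

```python
# Sketch: IPoIB IPv4 broadcast MGID for a partition, per the RFC 4391 layout.
# Assumptions: link-local scope (0xff12 prefix) and IPv4 signature 0x401b.
# The function name is illustrative only.

def ipoib_ipv4_broadcast_mgid(pkey: int) -> str:
    # The P_Key embedded in the MGID always has the full-membership bit set.
    full_pkey = (pkey | 0x8000) & 0xFFFF
    groups = [0xFF12, 0x401B, full_pkey, 0, 0, 0, 0xFFFF, 0xFFFF]
    return ":".join(f"{g:04x}" for g in groups)


if __name__ == "__main__":
    # Default partition, then a hypothetical non-default partition 0x0001.
    print(ipoib_ipv4_broadcast_mgid(0x7FFF))
    print(ipoib_ipv4_broadcast_mgid(0x0001))
```

Comparing the MGID and MTU of the group the SM created with what the host tries to join may help narrow down a failed join.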

Q4. Cable information display problems.

I haven’t added a Fabric Inspector license to either SX6036G yet.

Whenever I connect a new FDR copper or active optical cable to an empty port, it shows a message like below.

But after a long time, it shows the cable information like below.

The main problem is the display of FDR active optical cable information!

If I reboot the SX6036G switch, it shows the cable information immediately.

How can I resolve this problem?

Is Fabric Inspector mandatory?


Here is my lab configuration

Hypervisor - vSphere 6.0 Update 2

(I want RDMA storage such as SRP or iSER, but iSER on vSphere OFED 1.8.3 has a critical error… :( )

Storage OS - latest OmniOS with SRP target

vSphere OFED version - 1.8.2.4

ESXi hosts - 10 Dell PowerEdge R610 servers with the latest firmware

Storage - 5 SuperMicro CSE826E16

Switches - 2 SX6036G

Switch configuration

Fabric-A : SX6036G with VPI profile mode

8 ports configured as ETH ports and connected with QSA adapters to an external 10Gb Ethernet switch; all other ports configured as IB ports

5 ARP-Proxy configurations; IB SM and IP routing disabled

4 IB ports connected to Fabric-B with Mellanox FDR copper cables

Fabric-B : SX6036G with VPI profile mode

All ports configured as IB ports

Embedded IB SM enabled with some partition configuration

4 IB ports connected to Fabric-A with Mellanox FDR copper cables

Unfortunately, the warranty on my two SX6036G switches expired last year.

I always read your PB before purchasing your products.

But the features in the PB were always in beta, or driver support was historically the slowest to arrive.

I will keep using your HCAs but switch to an Intel Omni-Path fabric switch.

That will give me peace of mind.

I have been burned out by these problems for almost 6 years.

That’s all.

Hi,

I would suggest opening a support ticket with Mellanox at support@mellanox.com

Hi Jea-Hoon,

Q1. Why is the IB SM disabled when ARP-Proxy mode is enabled?

This is by design: the SM can’t run on the same switch where proxy-arp is running. This is documented in the MLNX-OS user manual.

Q2. Default partition (0x7fff) configuration

I want to enable IPoIB on a non-default partition (not 0x7fff).

When I disabled IPoIB on the default partition (0x7fff), IPoIB was also disabled on all other partitions.

After that, I could no longer ping (or vmkping) any ESXi host.

How can I resolve this problem?

It’s not recommended to disable the default partition, since the subnet manager manages the fabric via the default pkey. This explains why nothing worked.
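To illustrate the default pkey mentioned above: a P_Key is a 16-bit value in which the top bit marks full or limited membership and the low 15 bits identify the partition, so 0x7fff and 0xffff refer to the same default partition. The following is a minimal sketch of that rule. The helper names are hypothetical and are not MLNX-OS or OFED functions.

```python
# Sketch of how a 16-bit InfiniBand P_Key splits into a membership bit
# and a 15-bit partition base. Helper names are illustrative only.

FULL_MEMBER_BIT = 0x8000  # top bit: 1 = full member, 0 = limited member
PKEY_BASE_MASK = 0x7FFF   # low 15 bits identify the partition


def pkey_base(pkey: int) -> int:
    """Return the partition base, ignoring the membership bit."""
    return pkey & PKEY_BASE_MASK


def is_full_member(pkey: int) -> bool:
    """Return True when the membership bit is set."""
    return bool(pkey & FULL_MEMBER_BIT)


def can_communicate(a: int, b: int) -> bool:
    """Two ports can talk when the bases match and at least one is a full member."""
    same_partition = pkey_base(a) == pkey_base(b)
    return same_partition and (is_full_member(a) or is_full_member(b))


if __name__ == "__main__":
    # Default partition: limited (0x7fff) and full (0xffff) membership.
    print(hex(pkey_base(0xFFFF)), is_full_member(0xFFFF))
    print(can_communicate(0x7FFF, 0xFFFF))
    print(can_communicate(0x7FFF, 0x7FFF))
```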

Q4. Cable information display problems.

This cable transceiver display issue is going to be fixed in the next GA MLNX-OS release, 3.6.1000.

The cable transceiver display in the ports tab of the switch web UI is not related to the Fabric Inspector feature.

Thank you for your quick and precise reply… :)

I also know that the default partition must be enabled on the fabric. But for security reasons I want to disable the IPoIB function on the default partition, not the default partition itself. Is there any solution in MLNX-OS?

And is there any solution to the 4K MTU configuration problem?

Hi,

Try the command below. I have never tried it, but it’s available:

switch(config)# no ib partition Default ipoib force

Okay!

I tried your command from my iPhone over VPN.

But the force option wasn’t accepted by MLNX-OS.

I think that if the default partition is modified, MLNX-OS applies the modification to all other partitions.

If I change the MTU to 4K on all non-default partitions but not on the default partition, none of my ESXi hosts accept changing the vSwitch MTU to 4092. If I change the default partition’s MTU to 4K, all of my ESXi hosts accept changing the vSwitch MTU to 4092.

All ESXi hosts can then ping each other on the default partition, but not on the non-default partitions.

That’s a very curious state.
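The 4092 value used for the vSwitch follows from the IB MTU. A minimal sketch, assuming IPoIB datagram mode as described in RFC 4391, where a 4-byte encapsulation header is taken out of the IB payload. The function and constant names are hypothetical.

```python
# Sketch: IPoIB (datagram mode) IP MTU derived from the InfiniBand MTU.
# Assumption: the 4-byte IPoIB encapsulation header from RFC 4391.

IPOIB_ENCAP_HEADER_BYTES = 4


def ipoib_mtu(ib_mtu: int) -> int:
    """Return the largest IP MTU that fits in one IB packet payload."""
    return ib_mtu - IPOIB_ENCAP_HEADER_BYTES


if __name__ == "__main__":
    for ib_mtu in (2048, 4096):
        print(f"IB MTU {ib_mtu} -> IPoIB MTU {ipoib_mtu(ib_mtu)}")
```

So a 4K partition corresponds to a vSwitch MTU of 4092, and a 2K partition to 2044.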

I also have experience with your 4036 and 5035 switches.

But those two previous-generation switches worked well with 4K MTU partitions and ESXi hosts.

Sure enough, when IPoIB is disabled on the default partition, IPoIB on the other partitions is also disabled.

But I read an IBTA eBook that describes a partition concept covering effective routing and security.

If the current MLNX-OS can’t support this, I won’t worry about it.

A 2K MTU also gives reasonable performance and is the best practice from a latency standpoint.

Could you check it again?