Trouble with ConnectX-5 Ex cards in host chaining mode: some connections work but not others.

I have 3 servers, each with a dual-port ConnectX-5 Ex card, connected in a chain like this: S1 → S2 → S3. Server 2 (S2) uses both ports and has host chaining mode set (by default) to BASIC(1). I assigned IP addresses to all the ports and tried to ping the other two servers from each server, and I found some combinations that don’t work.

S1 can ping S2 and S3 correctly (which suggests host chaining is working).

S2 can ping S1, but pings to S3 fail

S3 can ping S1, but pings to S2 fail

I then tried a different topology: S1 → S3 → S2.

S1 can ping S2 but pings to S3 fail

S2 can ping S1 and S3

S3 can ping S2 but pings to S1 fail
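To make the pattern easier to see, here is a minimal Python sketch that tabulates the ping results exactly as reported in the two experiments above. The names (`EXPERIMENTS`, `failing_pairs`, `summarize`) are only illustrative, and the script analyzes the reported outcomes; it does not talk to the cards.

```python
# Ping results as reported above. True = ping succeeded, False = ping failed.
# "chain" lists the servers in cabling order; the second entry is the middle server.
EXPERIMENTS = [
    {
        "chain": ("S1", "S2", "S3"),
        "results": {
            ("S1", "S2"): True, ("S1", "S3"): True,
            ("S2", "S1"): True, ("S2", "S3"): False,
            ("S3", "S1"): True, ("S3", "S2"): False,
        },
    },
    {
        "chain": ("S1", "S3", "S2"),
        "results": {
            ("S1", "S2"): True, ("S1", "S3"): False,
            ("S2", "S1"): True, ("S2", "S3"): True,
            ("S3", "S2"): True, ("S3", "S1"): False,
        },
    },
]


def failing_pairs(results):
    """Return the unordered server pairs that failed in at least one direction."""
    return sorted({tuple(sorted(pair)) for pair, ok in results.items() if not ok})


def is_symmetric(results):
    """True if every pair has the same outcome in both directions."""
    return all(results[(a, b)] == results[(b, a)] for (a, b) in results)


def summarize(experiments):
    """Summarize each experiment: middle server, failing pairs, symmetry."""
    summary = []
    for exp in experiments:
        middle = exp["chain"][1]
        failing = failing_pairs(exp["results"])
        summary.append({
            "chain": exp["chain"],
            "middle": middle,
            "failing": failing,
            "symmetric": is_symmetric(exp["results"]),
            # Does every failing pair include the middle server?
            "failures_involve_middle": all(middle in pair for pair in failing),
        })
    return summary


if __name__ == "__main__":
    for row in summarize(EXPERIMENTS):
        print(row)
```

Read this way, each reported failure is bidirectional and involves the middle server and one of its directly attached neighbors, while traffic that passes through the middle server works.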

Any thoughts as to what may be happening? Or how to get more information to fix this issue? It seems weird that in the first experiment S3 responds to pings from S1 but not from S2 on the same port.
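One possible way to gather more information is to repeat each ping bound to a specific source port, so it is clear which port of the dual-port card the request leaves from. The sketch below assumes Linux with the iputils `ping` (`-c` count, `-W` per-reply timeout, `-I` source interface). The interface names and IP addresses are hypothetical placeholders, not values from my setup.

```python
import subprocess

# Hypothetical placeholders: replace with the real port names on the middle
# server and the real addresses assigned to the other two servers.
LOCAL_PORTS = ["enp1s0f0", "enp1s0f1"]
TARGETS = {"neighbor_a": "192.0.2.1", "neighbor_b": "192.0.2.3"}


def build_ping_command(target_ip, source_iface, count=3, timeout_s=1):
    """Build a Linux iputils ping command bound to one source interface."""
    return ["ping", "-c", str(count), "-W", str(timeout_s),
            "-I", source_iface, target_ip]


def ping_from_port(target_ip, source_iface):
    """Run the ping; iputils exits with 0 when at least one reply arrives."""
    cmd = build_ping_command(target_ip, source_iface)
    completed = subprocess.run(
        cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    return completed.returncode == 0


if __name__ == "__main__":
    # Try every local port against every target and print a small matrix.
    for iface in LOCAL_PORTS:
        for name, ip in TARGETS.items():
            status = "ok" if ping_from_port(ip, iface) else "FAIL"
            print(f"{iface} -> {name} ({ip}): {status}")
```

If a target answers from one port but not the other, that would point toward how the host picks the outgoing port rather than toward the chaining itself, though that is only a guess on my part.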

I appreciate the help.

Hello Rafael,

Thank you for posting your inquiry on the NVIDIA/Mellanox Community.

Based on your information, we noticed you have a valid support contract; therefore, it is more appropriate to assist you further through a support ticket.

You will receive a notification from your new support ticket shortly.

Thank you,

~NVIDIA/Mellanox Technical Support.