Why do I get a "LID change" event during normal I/O?

Hi all,

I’m running our ibverbs-based program to communicate over InfiniBand. Every once in a while I get this error on one of the nodes:

ibv_get_async_event: dev:mlx4_0 evt: LID change

When this happens, the peer nodes will report:

ibv_get_async_event: dev:mlx4_0 evt: client reregistration

Once this happens, the IB connection starts to show problems and eventually shuts down.

We are using an MT25408 ConnectX-3 QDR NIC and an Infiniscale-IV QDR switch.