My ConnectX-6 Dx NICs stop working after a minute or two

I have two servers with ConnectX-6 Dx adapter cards connected back-to-back, which I use for congestion control programming on Ubuntu 18.04. Unfortunately, the network cards stop working after a short time, and this happens on both servers. I only run the example code provided by NVIDIA, Raw Ethernet Programming: Basic Introduction - Code Example (nvidia.com). The card works for a minute or two and then stops responding. The adapter cards only run again after the server is restarted.

Each time the server is restarted, the card runs for one to two minutes and is then automatically disabled. At that point the cable is very hot. Is the network card being disabled automatically because of the high temperature? If so, what should I do?

Hi @binaryartist,

Welcome to the NVIDIA Forums! Your topic belongs in the Networking category, so I have moved it over for you.

Hi binaryartist

If you think the problem is caused by what you described (“At this time, the cable temperature is very high. Is the network card automatically disabled due to high temperature”), could you please open a support ticket?

There are several commands for checking the cable and NIC temperature. Technical support will need to look at their output to investigate.

/HyungKwang
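
As a starting point before opening the ticket, here is a minimal sketch of one way to read the cable/module temperature from Linux. It is not necessarily one of the commands the reply above had in mind. It assumes the standard `ethtool -m <interface>` module dump is available and that the cable or transceiver reports digital diagnostics; many passive copper cables do not, in which case no temperature line appears. The interface name `enp1s0f0` and the exact layout of the temperature line are assumptions, so adjust them for your system. The Mellanox Firmware Tools package also ships utilities such as `mget_temp` and `mlxlink` that support may ask for instead.

```python
import re
import subprocess
import sys
from typing import Optional

# Assumed format of the line printed by `ethtool -m` for modules with
# digital diagnostics, for example:
#   Module temperature    : 36.00 degrees C / 96.80 degrees F
_TEMP_RE = re.compile(r"Module temperature\s*:\s*(-?\d+(?:\.\d+)?)\s*degrees C")


def parse_module_temp_c(ethtool_output: str) -> Optional[float]:
    """Return the module temperature in Celsius, or None if it is not reported."""
    match = _TEMP_RE.search(ethtool_output)
    return float(match.group(1)) if match else None


def read_module_temp_c(interface: str) -> Optional[float]:
    """Run `ethtool -m <interface>` and parse the temperature from its output."""
    result = subprocess.run(
        ["ethtool", "-m", interface],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # works on the Python 3.6 shipped with Ubuntu 18.04
    )
    if result.returncode != 0:
        return None  # no module EEPROM access, wrong interface, or missing privileges
    return parse_module_temp_c(result.stdout)


if __name__ == "__main__":
    # "enp1s0f0" is a hypothetical interface name; pass yours as the first argument.
    iface = sys.argv[1] if len(sys.argv) > 1 else "enp1s0f0"
    temp = read_module_temp_c(iface)
    if temp is None:
        print(f"{iface}: no module temperature reported")
    else:
        print(f"{iface}: module temperature {temp:.1f} C")
```

Running this in a loop (for example once per second, usually as root) while the example program is running would show whether the temperature climbs just before the card goes down, which is useful evidence to attach to the ticket.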