DGX Station (Tesla V100) performance very slow

I am running the same code with the same CUDA, TensorFlow (2.4), and cuDNN versions on Ubuntu 18.04 on two machines: a DGX Station with 4 Tesla V100 GPUs and a machine with a Titan Xp.

The DGX Station is much slower than the Titan Xp. What steps can I take to update or debug the DGX Station and bring its performance up to the expected level?

I am sharing screenshots from both systems (same code and same environment).

Installation follows the "GPU support | TensorFlow" guide (on both systems).

Hi @RatheeshR,

There are lots of things that could be going on here. Can you open a ticket with NVIDIA Enterprise Support (email EnterpriseSupport@nvidia.com, or click "Create ticket" in the Enterprise Support Portal) so they can help narrow down the source of this behavior?

If you want to do some sanity-checking yourself before doing that, I'd recommend checking whether the performance difference is still present when you run the code in one of the NGC containers, such as "TensorFlow | NVIDIA NGC". That will help rule out any software differences between the two systems.
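For that kind of comparison, a small timing script run unchanged on both machines can make the difference easier to quantify. The following is only a rough sketch, not an official NVIDIA or TensorFlow benchmark: the names time_step, tf_matmul_step, and run_tf_check, and the 4096 matrix size, are illustrative assumptions. It times a single-GPU matrix multiply, so it says nothing about multi-GPU scaling or input pipeline speed.

```python
import statistics
import time


def time_step(step_fn, warmup=3, iters=10):
    """Call step_fn repeatedly and return wall-clock statistics in seconds."""
    # Warm-up calls absorb one-time costs such as kernel compilation and memory allocation.
    for _ in range(warmup):
        step_fn()
    times = []
    for _ in range(iters):
        start = time.perf_counter()
        step_fn()
        times.append(time.perf_counter() - start)
    return {
        "median_s": statistics.median(times),
        "min_s": min(times),
        "max_s": max(times),
    }


def tf_matmul_step(size=4096):
    """Build a step function that multiplies two size x size matrices (hypothetical workload)."""
    import tensorflow as tf  # imported lazily so time_step can be used without TensorFlow

    a = tf.random.normal((size, size))
    b = tf.random.normal((size, size))

    def step():
        # Reducing to a scalar and calling .numpy() forces the device work to finish
        # before the timer stops, while copying only one value back to the host.
        tf.reduce_sum(tf.linalg.matmul(a, b)).numpy()

    return step


def run_tf_check():
    """Print the TensorFlow version, the visible GPUs, and the matmul timings."""
    import tensorflow as tf

    print("TensorFlow:", tf.__version__)
    print("GPUs:", tf.config.list_physical_devices("GPU"))
    print(time_step(tf_matmul_step()))
```

Calling run_tf_check() on the DGX Station and on the Titan Xp machine, first in the existing environment and then inside the same NGC container, gives two pairs of numbers to compare. If the gap disappears in the container, a software difference between the two installations is the more likely cause.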

ScottE