Tesla M40 runs extremely slowly

jamesben6688 · November 26, 2017, 5:36am

My Keras code with the TensorFlow backend runs extremely slowly on my Tesla M40 GPUs. At first I suspected a bug in my code, but the same code runs very fast on a 1080 Ti. I tested it on a group of 3 Tesla M40 GPUs and on a single 1080 Ti, and the single 1080 Ti is much faster than the 3 Tesla M40s. Sometimes Volatile GPU-Util shows 100% with no process running. Furthermore, the power usage never exceeds 100 W. Is this a hardware problem?
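The symptoms described above (high utilization together with low power draw) can be logged per GPU for comparison between machines. The following is a minimal sketch, assuming `nvidia-smi` is on the PATH and supports the listed `--query-gpu` fields on the installed driver. The sample output, the `looks_throttled` helper, and its 80% clock threshold are hypothetical and only illustrate one way to flag a GPU that is busy but running well below its maximum SM clock.

```python
import subprocess

# Fields requested from nvidia-smi; availability can vary by driver and GPU.
QUERY_FIELDS = [
    "index", "name", "utilization.gpu",
    "power.draw", "power.limit", "clocks.sm", "clocks.max.sm",
]

# Hypothetical sample output, made up for illustration (not from the thread).
SAMPLE_OUTPUT = """\
0, Tesla M40, 100, 95.20, 250.00, 532, 1114
1, GeForce GTX 1080 Ti, 98, 240.10, 250.00, 1900, 1911
"""


def parse_gpu_csv(text):
    """Parse 'csv,noheader,nounits' output into one dict per GPU."""
    rows = []
    for line in text.strip().splitlines():
        values = [v.strip() for v in line.split(",")]
        if len(values) != len(QUERY_FIELDS):
            continue  # skip lines that do not match the requested fields
        rows.append(dict(zip(QUERY_FIELDS, values)))
    return rows


def query_gpus():
    """Run nvidia-smi and return the parsed rows (requires an NVIDIA driver)."""
    cmd = [
        "nvidia-smi",
        "--query-gpu=" + ",".join(QUERY_FIELDS),
        "--format=csv,noheader,nounits",
    ]
    return parse_gpu_csv(subprocess.check_output(cmd, universal_newlines=True))


def looks_throttled(row, clock_fraction=0.8):
    """Hypothetical heuristic: GPU is busy but its SM clock is far below max."""
    try:
        util = float(row["utilization.gpu"])
        sm_clock = float(row["clocks.sm"])
        sm_max = float(row["clocks.max.sm"])
    except ValueError:
        return False  # field reported as unsupported or not available
    return util >= 90 and sm_clock < clock_fraction * sm_max


if __name__ == "__main__":
    for gpu in parse_gpu_csv(SAMPLE_OUTPUT):  # swap in query_gpus() on a GPU host
        print(gpu["index"], gpu["name"], "throttled?", looks_throttled(gpu))
```

A flagged GPU is only a hint to look further (power limits, persistence mode, cooling); it does not by itself indicate a hardware fault.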

fgardiner · November 26, 2017, 3:32pm

The NVIDIA GPU Cloud (NGC) images are intended only for use with Pascal or Volta GPUs. The M40 is an earlier generation and is not supported by NGC, while the 1080 Ti, being a Pascal-based GPU, is supported. NGC supports only Pascal and Volta GPUs because they are far better suited to deep learning workloads and would be expected to be much faster.

Unless your M40 is showing some other symptoms, it is unlikely to be a hardware issue.

Robert_Crovella · November 26, 2017, 8:42pm

My guess is that this is a cross-posting of your similar question here:

https://stackoverflow.com/questions/47483099/tesla-m40-run-extremely-slowly

Unless you are using an NGC container, posting that question here may give rise to confusion.

That aside, I would generally expect a DL framework or code to run faster on a 1080 Ti than on a single Tesla M40. The 1080 Ti has more compute throughput, as well as more memory bandwidth, than that older GPU.

Whether a code runs faster on 3 Tesla M40s than on a single 1080 Ti will have a lot to do with the code itself. Unless your Keras/TensorFlow code is written to use multiple GPUs, running it on a machine with multiple GPUs may not give any benefit over a single GPU.
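To illustrate why multi-GPU code does not automatically scale, here is a simplified sketch of how a data-parallel wrapper might divide one training batch among several GPUs. It is not the Keras implementation; `split_batch` is a hypothetical helper, and giving the remainder to the last GPU is an assumption made for this example.

```python
def split_batch(batch_size, num_gpus):
    """Return the number of samples each GPU would receive from one batch.

    Simplified sketch: every GPU gets batch_size // num_gpus samples and the
    last GPU also takes the remainder (an assumed convention).
    """
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    step = batch_size // num_gpus
    sizes = [step] * num_gpus
    sizes[-1] = batch_size - step * (num_gpus - 1)
    return sizes


if __name__ == "__main__":
    # A batch size tuned for one GPU becomes a much smaller slice per GPU.
    for gpus in (1, 3):
        print(gpus, "GPU(s):", split_batch(32, gpus))
```

With a fixed global batch size, each GPU processes a smaller slice per step, and the results still have to be gathered and merged, so the per-step overhead may outweigh the gain unless the batch size is increased along with the GPU count.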

jamesben6688 · December 1, 2017, 1:25pm

Thanks for your reply. I am using multiple GPUs through the official Keras API in https://github.com/fchollet/keras/blob/master/keras/utils/training_utils.py, so I do not think the problem is in my code. In addition, one epoch of training that takes half an hour on one 1080 Ti takes about 4 hours on the 3 M40 GPUs, and the power usage never exceeds 100 W. As you can see, the GPU state shown at https://stackoverflow.com/questions/47483099/tesla-m40-run-extremely-slowly is really strange. So is the large performance difference (0.5 hours per epoch on the 1080 Ti vs. 4 hours per epoch on 3 Tesla M40s) just because of the GPU architecture?
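As a rough sanity check on the question above, the observed slowdown can be set against the ratio of peak specifications. This is a back-of-the-envelope sketch only: the epoch times come from the post, while the throughput and bandwidth figures are approximate published peak values quoted from memory and should be treated as assumptions, not measurements.

```python
# Approximate peak figures (assumptions; check the vendor datasheets).
SPECS = {
    "Tesla M40": {"fp32_tflops": 7.0, "mem_bw_gbs": 288.0},
    "GTX 1080 Ti": {"fp32_tflops": 11.3, "mem_bw_gbs": 484.0},
}

# Epoch times reported in the post.
EPOCH_HOURS_1080TI = 0.5
EPOCH_HOURS_3X_M40 = 4.0

# How much slower the 3x M40 machine is than the single 1080 Ti.
observed_slowdown = EPOCH_HOURS_3X_M40 / EPOCH_HOURS_1080TI

# Peak-spec advantage of one 1080 Ti over one M40.
compute_ratio = SPECS["GTX 1080 Ti"]["fp32_tflops"] / SPECS["Tesla M40"]["fp32_tflops"]
bandwidth_ratio = SPECS["GTX 1080 Ti"]["mem_bw_gbs"] / SPECS["Tesla M40"]["mem_bw_gbs"]

if __name__ == "__main__":
    print("observed slowdown (3x M40 vs 1x 1080 Ti): %.1fx" % observed_slowdown)
    print("peak FP32 ratio (1080 Ti / M40): %.2fx" % compute_ratio)
    print("peak memory bandwidth ratio (1080 Ti / M40): %.2fx" % bandwidth_ratio)
```

On these assumed figures, the per-card gap implied by peak specs is under 2x, so an 8x gap against three cards would be hard to attribute to architecture alone. Peak specs are only a rough guide, and real framework performance can differ considerably.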
