Multi-GPU computing not working properly

I have had problems utilizing all four GPUs in my computer for multi-GPU training with Keras and TensorFlow.

They train my neural network, but it’s slower than when using only one GPU.
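In case it matters, here is a stripped-down sketch of the kind of setup I'm trying (this is not my exact script; it uses Keras 2.1's built-in multi_gpu_model, and the model and data are just placeholders):

```python
# Stripped-down sketch, not my real script: replicate a toy model across
# 4 GPUs with Keras' multi_gpu_model and train it on random data.
import numpy as np
from keras.models import Sequential
from keras.layers import Dense
from keras.utils import multi_gpu_model

model = Sequential([
    Dense(512, activation='relu', input_shape=(1024,)),
    Dense(10, activation='softmax'),
])

# One replica per GPU; each batch is split 4 ways and the sub-batch
# outputs are merged back on the CPU.
parallel_model = multi_gpu_model(model, gpus=4)
parallel_model.compile(optimizer='sgd', loss='categorical_crossentropy')

x = np.random.random((4096, 1024))
y = np.random.random((4096, 10))
parallel_model.fit(x, y, epochs=2, batch_size=256)
```

Even with something this simple, one GPU ends up faster for me than four.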

I tried to get help here, but we didn’t solve the problem:

https://github.com/avolkov1/keras_experiments/issues/13

The conclusion was that this is likely a setup/driver issue.

Does anyone have any idea what’s going on, or where to start to fix this?

  • Ubuntu 16.04
  • TensorFlow 1.4.0
  • Keras 2.1.5
  • CUDA Version 8.0.61
  • cuDNN 6.0.21
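
For what it's worth, this is how I check the framework versions and that all four GPUs are visible from inside Python (device_lib is an internal TensorFlow helper, but it works in 1.4):

```python
# Print framework versions and the devices TensorFlow can see.
import tensorflow as tf
import keras
from tensorflow.python.client import device_lib

print("TensorFlow: %s" % tf.__version__)   # 1.4.0
print("Keras: %s" % keras.__version__)     # 2.1.5
for d in device_lib.list_local_devices():
    print("%s (%s)" % (d.name, d.device_type))   # expect GPU:0 through GPU:3
```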

I’ve read the linked issue thread.

Is this a Skylake CPU system, perhaps? The combination of outputs from the various P2P tests you ran in the other thread makes it look that way. In particular, the topo test reporting a SOC connection between two sets of 2 GPUs each (that is, 2 GPUs connected to one CPU socket and 2 GPUs connected to the other), combined with P2P being reported as available among all 4 devices, would be unusual for any Intel chipset system prior to Skylake.
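
If you want to double-check what the CUDA runtime itself reports, a quick sketch along these lines prints the peer-access matrix directly (assuming CUDA 8's libcudart.so.8.0 is findable by your dynamic loader; adjust the name/path otherwise):

```python
# Print the peer-to-peer access matrix straight from the CUDA runtime.
# Assumes libcudart.so.8.0 can be found by the dynamic loader.
import ctypes

cudart = ctypes.CDLL("libcudart.so.8.0")

count = ctypes.c_int()
assert cudart.cudaGetDeviceCount(ctypes.byref(count)) == 0

for i in range(count.value):
    row = []
    for j in range(count.value):
        if i == j:
            row.append("-")
            continue
        can = ctypes.c_int()
        cudart.cudaDeviceCanAccessPeer(ctypes.byref(can), i, j)
        row.append("Y" if can.value else "N")
    print("GPU %d: %s" % (i, " ".join(row)))
```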

What GPU driver are you currently running? Is it still 384.90?

Thanks for the reply, @txbob! It's truly appreciated.

The CPU is an AMD Ryzen Threadripper 1900X.

Yes, still 384.90. I never updated it because I was in the middle of a project. Would you recommend updating it?

I've got to be honest: I'm not really good at this technical GPU stuff, and I had never owned a desktop computer before now. When I got it, I plugged two monitors into one of the 4 GPUs, and when I train my DNNs I try to use all 4 in parallel. Is that even right?