Docker image: cuda:10.0-cudnn7-devel-ubuntu16.04
GPUs: 4× Tesla V100 (16.2 GB GPU memory each)
tensorflow-gpu: 1.15
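For context, the container is launched roughly as follows (a minimal sketch; the `--runtime=nvidia` flag and the `nvidia/cuda` image prefix are assumptions about my setup, not something shown in the screenshots):

```bash
# Launch the CUDA 10.0 development container with all GPUs visible.
# --runtime=nvidia is the nvidia-docker 2 syntax; newer Docker versions
# use "docker run --gpus all ..." instead.
docker run --runtime=nvidia -it \
    nvidia/cuda:10.0-cudnn7-devel-ubuntu16.04 \
    /bin/bash
```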
nvidia-smi output is shown below: all GPUs' utilization stays above 90% for only a few seconds at a time.
I have noticed that training performance is low and takes much longer in the Docker container described here. The same code works fine with 2× Tesla K80 GPUs and CUDA 10.0 on a dedicated server, as shown below.
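For reference, the fluctuating utilization can be sampled over time from inside the container as follows (a minimal sketch; the one-second sampling interval is an arbitrary choice):

```bash
# Sample per-GPU utilization and memory usage once per second.
# Utilization that spikes above 90% and then drops back matches the
# behaviour described above (GPUs busy only a few seconds at a time).
nvidia-smi --query-gpu=timestamp,index,utilization.gpu,utilization.memory \
    --format=csv -l 1
```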
But in the Docker container:

1. Why does the GPU utilization % keep changing so erratically, and why is the process list section empty?
2. Why is the CUDA version shown as 10.2 in the cuda:10.0-cudnn7-devel-ubuntu16.04 image (see the first three nvidia-smi screenshots)? I have not installed the CUDA 10.2 toolkit or cuDNN libraries in the image.

How can I solve this issue? A diagnostic sketch follows below.
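These checks separate the toolkit version baked into the image from the version reported by nvidia-smi (a minimal sketch; the comment about the driver is my working guess for the 10.2 reading, not a confirmed explanation):

```bash
# Version of the CUDA toolkit actually installed in the image.
nvcc --version
cat /usr/local/cuda/version.txt

# Version printed in the nvidia-smi header. The "CUDA Version" field
# here reflects the host's NVIDIA driver rather than the toolkit inside
# the container, which may be why it reads 10.2 in a CUDA 10.0 image.
nvidia-smi
```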
The ps aux command shows the training process IDs, but nvidia-smi's process list does not show them.
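For reference, this is the comparison in question (a minimal sketch; the grep pattern `python` is an assumption about the training process name):

```bash
# Inside the container: the training processes are visible to ps...
ps aux | grep python

# ...but the "Processes" table at the bottom of nvidia-smi stays empty.
nvidia-smi

# One hedged guess: containers run in their own PID namespace, so
# starting the container with --pid=host (or running nvidia-smi on the
# host) may show whether the processes are only hidden by namespacing.
```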