Only the K40c is being utilized for computation out of two GPUs. The other one is a K5200.

mhasa004 · October 19, 2015, 8:46pm

Hi,

In my Dell Precision Tower 5810 machine I have installed two graphics cards: a Tesla K40c and a Quadro K5200. When I try to perform computation, the K5200 is never used; computation always goes to the K40c. Does anyone have any idea what is going on?
Here is some nvidia-smi output:

$ nvidia-smi
Mon Oct 19 13:44:26 2015
+------------------------------------------------------+
| NVIDIA-SMI 352.41     Driver Version: 352.41         |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Quadro K5200        Off  | 0000:03:00.0     Off |                    0 |
| 28%   46C    P8    21W / 150W |     16MiB /  7678MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Tesla K40c          Off  | 0000:04:00.0     Off |                    0 |
| 25%   51C    P0   138W / 235W |   1482MiB / 11519MiB |     96%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    1      2088    C   .../mhasan/_caffe/.build_release/tools/caffe  1456MiB |
+-----------------------------------------------------------------------------+

$ nvidia-smi -q -d CLOCK

==============NVSMI LOG==============

Timestamp                           : Mon Oct 19 13:41:17 2015
Driver Version                      : 352.41

Attached GPUs                       : 2
GPU 0000:03:00.0
    Clocks
        Graphics                    : 875 MHz
        SM                          : 875 MHz
        Memory                      : 3004 MHz
    Applications Clocks
        Graphics                    : N/A
        Memory                      : N/A
    Default Applications Clocks
        Graphics                    : N/A
        Memory                      : N/A
    Max Clocks
        Graphics                    : 875 MHz
        SM                          : 875 MHz
        Memory                      : 3004 MHz
    SM Clock Samples
        Duration                    : 3.18 sec
        Number of Samples           : 4
        Max                         : 875 MHz
        Min                         : 324 MHz
        Avg                         : 832 MHz
    Memory Clock Samples
        Duration                    : 3.18 sec
        Number of Samples           : 4
        Max                         : 3004 MHz
        Min                         : 324 MHz
        Avg                         : 3004 MHz
    Clock Policy
        Auto Boost                  : N/A
        Auto Boost Default          : N/A

GPU 0000:04:00.0
    Clocks
        Graphics                    : 745 MHz
        SM                          : 745 MHz
        Memory                      : 3004 MHz
    Applications Clocks
        Graphics                    : 745 MHz
        Memory                      : 3004 MHz
    Default Applications Clocks
        Graphics                    : 745 MHz
        Memory                      : 3004 MHz
    Max Clocks
        Graphics                    : 875 MHz
        SM                          : 875 MHz
        Memory                      : 3004 MHz
    SM Clock Samples
        Duration                    : 0.00 sec
        Number of Samples           : 2
        Max                         : 745 MHz
        Min                         : 324 MHz
        Avg                         : 745 MHz
    Memory Clock Samples
        Duration                    : 0.00 sec
        Number of Samples           : 2
        Max                         : 3004 MHz
        Min                         : 324 MHz
        Avg                         : 3004 MHz
    Clock Policy
        Auto Boost                  : N/A
        Auto Boost Default          : N/A

$ nvidia-smi -q -d SUPPORTED_CLOCKS

==============NVSMI LOG==============

Timestamp                           : Mon Oct 19 13:43:10 2015
Driver Version                      : 352.41

Attached GPUs                       : 2
GPU 0000:03:00.0
    Supported Clocks
        Memory                      : 3004 MHz
            Graphics                : 875 MHz
            Graphics                : 771 MHz
            Graphics                : 666 MHz
            Graphics                : 549 MHz
        Memory                      : 810 MHz
            Graphics                : 549 MHz
        Memory                      : 324 MHz
            Graphics                : 324 MHz

GPU 0000:04:00.0
    Supported Clocks
        Memory                      : 3004 MHz
            Graphics                : 875 MHz
            Graphics                : 810 MHz
            Graphics                : 745 MHz
            Graphics                : 666 MHz
        Memory                      : 324 MHz
            Graphics                : 324 MHz

$ sudo nvidia-smi -ac 3004,875 -i 0
Setting applications clocks is not supported for GPU 0000:03:00.0.
Treating as warning and moving on.
All done.

Thanks.
Hasan

Robert_Crovella · October 19, 2015, 9:06pm

This is all expected behavior.

Setting application clocks is not supported on your K5200 GPU.

Apart from that, CUDA applications that only use a single GPU will generally default to using a particular GPU in your system. If you want to “steer” an application to use another GPU, you could try using the CUDA_VISIBLE_DEVICES environment variable.

Programming Guide :: CUDA Toolkit Documentation
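
As a minimal sketch of the steering approach described above, assuming a POSIX shell: the helper name `run_on_gpu` is hypothetical (it is not part of CUDA), and the index it takes is in CUDA enumeration order, which may differ from the order nvidia-smi reports. Setting the variable for a single command avoids changing the environment of the whole shell session.

```shell
# Hypothetical helper (not part of CUDA): run one command with only
# the given CUDA device index (or comma-separated list) visible to it.
run_on_gpu() {
  gpu_index="$1"   # index in CUDA enumeration order, not nvidia-smi order
  shift
  env CUDA_VISIBLE_DEVICES="$gpu_index" "$@"
}

# Show the value the child process receives.
run_on_gpu 1 sh -c 'echo "CUDA_VISIBLE_DEVICES=$CUDA_VISIBLE_DEVICES"'
```

Any command can take the place of the `sh -c` call, for instance the Caffe or Torch invocation you normally run.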

mhasa004 · October 19, 2015, 9:42pm

Hi txbob,

Thanks for your reply. I made GPU 0 the only visible device by setting the corresponding environment variable as follows:

export CUDA_VISIBLE_DEVICES=0

There is still no utilization of GPU 0; computation goes directly to GPU 1.

$ nvidia-smi
Mon Oct 19 14:41:01 2015
+------------------------------------------------------+
| NVIDIA-SMI 352.41     Driver Version: 352.41         |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Quadro K5200        Off  | 0000:03:00.0     Off |                    0 |
| 31%   50C    P8    21W / 150W |    107MiB /  7678MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Tesla K40c          Off  | 0000:04:00.0     Off |                    0 |
| 33%   73C    P0   130W / 235W |   1606MiB / 11519MiB |     97%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0      2799    C   /home/mhasan/torch/install/bin/luajit           89MiB |
|    1      2743    C   .../mhasan/_caffe/.build_release/tools/caffe  1456MiB |
|    1      2799    C   /home/mhasan/torch/install/bin/luajit          121MiB |
+-----------------------------------------------------------------------------+

Robert_Crovella · October 19, 2015, 9:47pm

Try doing

export CUDA_VISIBLE_DEVICES="1"

The enumeration order in nvidia-smi is not always the same as the CUDA enumeration order.

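CUDA tries to order the most powerful GPU first. That would be the K40c before the K5200, so the K40c is CUDA device 0 and the K5200 is CUDA device 1, the reverse of what nvidia-smi shows.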
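
If you would rather have the indices match, one option is to ask CUDA to enumerate by PCI bus ID. The sketch below assumes a CUDA version that honors the `CUDA_DEVICE_ORDER` environment variable (CUDA 7.0 or later, as far as I know). With `PCI_BUS_ID`, CUDA device 0 should be the GPU with the lowest bus ID, which in the output above is the K5200 at 0000:03:00.0. The nvidia-smi query is guarded so the snippet still runs on a machine without the driver installed.

```shell
# Order CUDA devices by PCI bus ID instead of the default "fastest first".
export CUDA_DEVICE_ORDER=PCI_BUS_ID
# With that ordering, index 0 refers to the GPU with the lowest bus ID.
export CUDA_VISIBLE_DEVICES=0

# Print nvidia-smi indices next to bus IDs for comparison
# (skipped when nvidia-smi is not installed).
if command -v nvidia-smi >/dev/null 2>&1; then
  nvidia-smi --query-gpu=index,name,pci.bus_id --format=csv \
    || echo "nvidia-smi query failed"
fi
```

Both variables must be set before the CUDA application starts; changing them in another shell has no effect on a process that is already running.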

mhasa004 · October 19, 2015, 10:01pm

Thanks a lot!
Both of the GPUs are now doing computations.

$ nvidia-smi
Mon Oct 19 15:00:02 2015
+------------------------------------------------------+
| NVIDIA-SMI 352.41     Driver Version: 352.41         |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Quadro K5200        Off  | 0000:03:00.0     Off |                    0 |
| 34%   59C    P0    73W / 150W |    125MiB /  7678MiB |     88%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Tesla K40c          Off  | 0000:04:00.0     Off |                    0 |
| 33%   72C    P0   129W / 235W |   1482MiB / 11519MiB |     97%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0      2987    C   /home/mhasan/torch/install/bin/luajit          108MiB |
|    1      2743    C   .../mhasan/_caffe/.build_release/tools/caffe  1456MiB |
+-----------------------------------------------------------------------------+