How to get the number of the used GPU device?

sWienke · April 20, 2010, 11:05am

Hello,
I was wondering how to get the number of the GPU device my program runs on. acc_get_device only returns the type of my device, not its number (i.e. 0, 1, …).
It must be possible somehow, since pgaccelinfo also reports the device number. It’s a pity I can’t look at the pgaccelinfo source files to see how it is done.
So, does anyone know?

MatColgrove · April 20, 2010, 3:35pm

Hi Xray,

I just added a feature request for an “acc_get_device_num” function that queries the device number the application is currently using (TPR#16863).

As for the pgaccelinfo source, for the most part it’s the same as the example CUDA Fortran code “cufinfo.cuf” found in your installation’s “etc/samples” directory.

Thanks,
Mat

MatColgrove · May 4, 2010, 7:10pm

Hi Xray,

This feature will be available in 10.5. However, it missed the deadline for documentation changes, so the documentation will follow in 10.6. Here’s an example of its usage.

% cat test.f90
  use accel_lib
  integer n

  call acc_init(acc_device_nvidia)
  n = acc_get_device_num(acc_device_nvidia)
  print *, n
  end
% pgf90 -ta=nvidia test.f90 -V10.5
% a.out
            0
% setenv ACC_DEVICE_NUM 2
% a.out
            2

Mat

sWienke · May 10, 2010, 6:31am

Thanks a lot! I will try it out as soon as we have installed 10.5.

sWienke · July 13, 2010, 12:04pm

Hi,
With Fortran everything works fine. Today I tried it with C, and the output is just wrong:

$ cat getDevTest.c
#include <stdio.h>
#include <accel.h>

int main() {
  int n;

  acc_init(acc_device_nvidia);
  n = acc_get_device_num(acc_device_nvidia);
  printf("\t%d\n");
}

$ pgcc -ta=nvidia,3.0 getDevTest.c 
$ a.out
        6354592

$ ACC_DEVICE_NUM=1 a.out
        6354592

Could this be a bug? Or am I just too stupid?
BTW: I tried it with 10.5 and 10.6, and the problem is the same in both.

njackson · July 13, 2010, 3:41pm

I noticed that the printf() call in your code snippet has a typo: it never passes the value to be printed…

You have

printf("\t%d\n");

when you needed

printf("\t%d\n", n);

.

sWienke · July 15, 2010, 12:48pm

Oh my gosh! That was really stupid. Thanks a lot!

However, I still have a problem, or rather a question:
If I don’t use acc_init, the output of acc_get_device_num is “-1”. I know there is a similar problem in CUDA, because you can’t get the device when no context has been created. But I thought that setting a particular device would do the trick (it doesn’t):

  //acc_init(acc_device_nvidia); 
  acc_set_device_num(1, acc_device_nvidia);
  n = acc_get_device_num(acc_device_nvidia); 
  printf("\t%d\n",n);

$ a.out
      -1

Additionally, I don’t want to use acc_init as I don’t want to isolate any initialization cost from the computational cost!
Is there another possibility?
Cheers, Sandra

MatColgrove · July 15, 2010, 4:51pm

Hi Sandra,

acc_get_device_num returns the device your program is actively using. However, your program is not attached to a device until the device is initialized, either explicitly via acc_init or implicitly by entering an accelerator region. acc_set_device_num only selects which device you wish to use; it does not initialize it.

Additionally, I don’t want to use acc_init as I don’t want to isolate any initialization cost from the computational cost!

Personally, I prefer to separate initialization from computation, for several reasons. The initialization cost should not be ignored, but it should be reported separately. First, my program has no control over the initialization cost, since it’s a hardware issue. Second, it varies with the number of attached devices (~1 second per attached device) and with whether the ‘pgcudainit’ utility is running on the host (pgcudainit holds the device open, eliminating the initialization cost). This can lead to puzzling performance variations.

Is there another possibility?

Enter an accelerator region (even an empty one), since this implicitly initializes the device. This is really no different from calling acc_init, though.

Hope this helps,
Mat