Hello,
I have a couple of related questions.
Is there any way to know the device ID from a thread in a kernel (not the host)?
I see there are two functions for querying the device ID from the host. The first is in the runtime API:
__host__ __device__ cudaError_t cudaGetDevice ( int* device )
The second is in the driver API:
CUresult cuCtxGetDevice ( CUdevice* device )
Are they the runtime and driver API counterparts of each other? Is there any difference between them, and could they return different values in any edge case?
Thanks,
Anil Mahmud.
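One way to look at the runtime/driver question empirically is to ask both APIs for the current device in the same process and print the answers side by side. The sketch below is only that: it assumes device 0 exists, uses cudaFree(0) purely to force the runtime to create its context, and the file name cmp.cu is made up. It does not settle what the two calls return in every edge case.

```cuda
// cmp.cu (hypothetical file name); build with: nvcc -o cmp cmp.cu -lcuda
#include <cstdio>
#include <cuda.h>          // driver API
#include <cuda_runtime.h>  // runtime API

int main() {
    // Select a device through the runtime API; device 0 is an assumption.
    cudaError_t rerr = cudaSetDevice(0);
    if (rerr != cudaSuccess) {
        printf("cudaSetDevice failed: %s\n", cudaGetErrorString(rerr));
        return 1;
    }
    // Force context creation so the driver API has a current context to query.
    cudaFree(0);

    // Runtime API: returns a device ordinal.
    int rt_dev = -1;
    rerr = cudaGetDevice(&rt_dev);
    if (rerr != cudaSuccess) {
        printf("cudaGetDevice failed: %s\n", cudaGetErrorString(rerr));
        return 1;
    }

    // Driver API: returns the device handle of the current context.
    CUdevice drv_dev;
    CUresult derr = cuCtxGetDevice(&drv_dev);
    if (derr != CUDA_SUCCESS) {
        const char *name = nullptr;
        cuGetErrorName(derr, &name);
        printf("cuCtxGetDevice failed: %s\n", name ? name : "unknown");
        return 1;
    }

    // CUdevice is a handle, so also fetch the handle that belongs to the
    // runtime ordinal and compare handles with handles.
    CUdevice handle_for_ordinal;
    derr = cuDeviceGet(&handle_for_ordinal, rt_dev);
    if (derr != CUDA_SUCCESS) {
        printf("cuDeviceGet failed: %d\n", (int)derr);
        return 1;
    }

    printf("runtime ordinal      = %d\n", rt_dev);
    printf("driver handle (ctx)  = %d\n", (int)drv_dev);
    printf("driver handle (ord.) = %d\n", (int)handle_for_ordinal);
    return 0;
}
```

Running this after changing the cudaSetDevice argument, or with CUDA_VISIBLE_DEVICES set, would show whether the two values track each other on a given system.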
Is there any way to know the device ID from a thread in a kernel (not the host)?
This seems to work for me:
$ cat t1399.cu
#include <stdio.h>
#include <stdlib.h>
__global__ void k(){
  int d;
  cudaError_t err = cudaGetDevice(&d);
  if (err != cudaSuccess) printf("kernel cuda error: %d, %s\n", (int)err, cudaGetErrorString(err));
  printf("device = %d\n", d);
}
int main(int argc, char *argv[]){
  int d = 0;
  if (argc > 1) d=atoi(argv[1]);
  cudaSetDevice(d);
  k<<<1,1>>>();
  cudaDeviceSynchronize();
}
$ nvcc -rdc=true -arch=sm_35 -o t1399 t1399.cu -lcudadevrt
$ cuda-memcheck ./t1399 2
========= CUDA-MEMCHECK
device = 2
========= ERROR SUMMARY: 0 errors
$
Hi.
I am running the code on a K80, but something odd is happening: the call to cudaGetDevice(&d) does not change the value of d at all. Could there be a particular reason for this? My code is more or less identical to yours. The one difference is that, in my case, the call sits in a device function that is called from inside a kernel.
Thanks,
Anil
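A cheap first step for narrowing this down is to give d a sentinel value and check the return code inside the device function, since a failed call would be one possible explanation for d staying untouched. The sketch below assumes the same build flags as the earlier post (adjust -arch for your GPU); the helper name current_device_or_minus_one and the file name devid.cu are made up for illustration.

```cuda
// devid.cu (hypothetical file name); build as in the earlier post:
//   nvcc -rdc=true -arch=sm_35 -o devid devid.cu -lcudadevrt
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: returns the device ordinal, or -1 if the query fails.
__device__ int current_device_or_minus_one() {
    int d = -1;  // sentinel, so an untouched value is easy to spot
    cudaError_t err = cudaGetDevice(&d);
    if (err != cudaSuccess) {
        printf("device-side cudaGetDevice error: %d, %s\n",
               (int)err, cudaGetErrorString(err));
        return -1;
    }
    return d;
}

__global__ void k() {
    // The query happens in a __device__ function called from the kernel.
    printf("device = %d\n", current_device_or_minus_one());
}

int main() {
    k<<<1, 1>>>();
    // Surface launch or execution errors on the host side as well.
    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        printf("host-side error: %s\n", cudaGetErrorString(err));
        return 1;
    }
    return 0;
}
```

If this prints -1 or an error line, the return code should say more than the unchanged value of d does.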
Is there any intrinsic function or inline assembly that one can use to get the same result as cudaGetDevice(int *device)?
Thanks,
Anil
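Whether such an intrinsic exists is left open here. A workaround that sidesteps the question is to query the ordinal on the host and hand it to the kernel as an ordinary argument, which also avoids the -rdc=true and -lcudadevrt requirements. This is a sketch under that assumption; the names report and device_id and the file name arg.cu are made up.

```cuda
// arg.cu (hypothetical file name); build with: nvcc -o arg arg.cu
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical device function that needs to know which GPU it runs on.
__device__ void report(int device_id) {
    printf("device = %d\n", device_id);
}

// The ordinal arrives as a plain kernel argument instead of a device-side query.
__global__ void k(int device_id) {
    report(device_id);
}

int main(int argc, char *argv[]) {
    int requested = (argc > 1) ? atoi(argv[1]) : 0;
    if (cudaSetDevice(requested) != cudaSuccess) {
        printf("cudaSetDevice(%d) failed\n", requested);
        return 1;
    }

    // Ask the host-side runtime which device is current, then pass it down.
    int d = -1;
    if (cudaGetDevice(&d) != cudaSuccess) {
        printf("cudaGetDevice failed\n");
        return 1;
    }

    k<<<1, 1>>>(d);
    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        printf("kernel error: %s\n", cudaGetErrorString(err));
        return 1;
    }
    return 0;
}
```

The cost is one extra int per launch. The value is the ordinal as seen by the launching process, so it follows whatever device numbering that process uses.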