Hello,
I recently moved from CUDA toolkit version 3.2 to version 4.0.
I was wondering how device variables are handled in the newer toolkit, especially since each host thread can now access multiple GPUs. I want to create a device variable before calling the kernel and get its address from the host thread. Do I need to do anything special for this in the newer toolkit?
Previously I was doing something like this:
File1.cu
#include <cuda_runtime.h>
__device__ int array_1[4] = {1, 2, 3, 4};
// Takes int** so the looked-up address is visible to the caller.
extern "C" void get_array_address(int** array_h_1) {
    cudaGetSymbolAddress((void**)array_h_1, "array_1");
}
The function above is called from main() in another file, say "File2.cu", which also contains the kernel call. Do I need to select the device with cudaSetDevice(0) before trying to get the symbol address? And how is the device variable tied to a particular device's context?
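To make the question concrete, here is roughly how the two pieces fit together, collapsed into one file so it compiles on its own. This is only a sketch under some assumptions: the kernel name use_array and its launch configuration are placeholders, get_array_address is assumed to take an int** so the address reaches the caller, and the symbol is passed directly instead of by name. Calling cudaSetDevice(0) before the lookup is my guess at the right order, since as far as I understand each device gets its own copy of a __device__ variable.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// ---- Would live in File1.cu ----
__device__ int array_1[4] = {1, 2, 3, 4};

// Assumed signature: int** so the caller actually receives the address.
extern "C" void get_array_address(int** array_h_1) {
    // The symbol is passed directly here rather than as a string.
    cudaError_t err = cudaGetSymbolAddress((void**)array_h_1, array_1);
    if (err != cudaSuccess) {
        *array_h_1 = NULL;  // signal failure to the caller
    }
}

// ---- Would live in File2.cu ----
// Placeholder kernel: adds 1 to each element it is handed.
__global__ void use_array(int* data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        data[i] += 1;
    }
}

int main() {
    // Select the device first, so the symbol lookup and the kernel launch
    // both refer to the same device's copy of array_1.
    cudaError_t err = cudaSetDevice(0);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaSetDevice: %s\n", cudaGetErrorString(err));
        return 1;
    }

    int* array_d = NULL;
    get_array_address(&array_d);
    if (array_d == NULL) {
        fprintf(stderr, "could not get the address of array_1\n");
        return 1;
    }

    use_array<<<1, 4>>>(array_d, 4);
    err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "kernel: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Copy the array back through the same device pointer to inspect it.
    int host[4];
    err = cudaMemcpy(host, array_d, sizeof(host), cudaMemcpyDeviceToHost);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaMemcpy: %s\n", cudaGetErrorString(err));
        return 1;
    }
    for (int i = 0; i < 4; ++i) {
        printf("array_1[%d] = %d\n", i, host[i]);
    }
    return 0;
}
```

If the address has to be obtained per device, I assume the lookup would need to be repeated after each cudaSetDevice() call, with one pointer kept per device.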
Thanks for the help.