Getting the address of a __device__ variable in toolkit 4.0 RC

Hello,
I recently moved from toolkit version 3.2 to version 4.0.

I was wondering how device variables are handled in the newer toolkit, especially since a single host thread can now access multiple GPUs. I want to create a device variable before launching the kernel and get the address of that variable in the host thread. Do I need to do anything special for this in the newer toolkit?

Previously I was doing something like this:

File1.cu

#include <cuda_runtime.h>

__device__ int array_1[] = {1, 2, 3, 4};

// Takes a pointer to the caller's pointer so the looked-up address reaches the caller.
extern "C" void get_array_address(int** array_h_1) {
    cudaGetSymbolAddress((void**)array_h_1, "array_1");
}

The function above would be called from main() in another file, such as "File2.cu", which would also contain the kernel call. Do I need to set the device with cudaSetDevice(0) before trying to get the symbol address? How would the context of the device variable be set?
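
To make the question concrete, here is a minimal sketch of the pattern I have in mind, put in a single file for brevity. It assumes that a __device__ variable is looked up on whichever device is current in the calling host thread, so the device is selected explicitly before the lookup. The names double_elements and run_demo, and the extra device parameter on get_array_address, are hypothetical and only for illustration. The lookup passes the symbol itself rather than its name as a string.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Device-resident array, mirroring array_1 from File1.cu.
__device__ int array_1[4] = {1, 2, 3, 4};

// Doubles each element through the raw device pointer obtained on the host.
__global__ void double_elements(int* data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= 2;
}

// Hypothetical helper: makes `device` current for this host thread, then
// looks up the address of array_1 on that device and returns it via `out`.
extern "C" cudaError_t get_array_address(int device, int** out) {
    cudaError_t err = cudaSetDevice(device);
    if (err != cudaSuccess) return err;
    return cudaGetSymbolAddress((void**)out, array_1);
}

// Hypothetical demo: fetches the address, launches a kernel on it, and
// copies the four elements back into host_copy.
extern "C" cudaError_t run_demo(int device, int host_copy[4]) {
    int* d_ptr = NULL;
    cudaError_t err = get_array_address(device, &d_ptr);
    if (err != cudaSuccess) return err;

    double_elements<<<1, 4>>>(d_ptr, 4);
    err = cudaGetLastError();  // catches launch-time errors
    if (err != cudaSuccess) return err;

    // cudaMemcpy waits for the preceding kernel on the default stream.
    return cudaMemcpy(host_copy, d_ptr, 4 * sizeof(int), cudaMemcpyDeviceToHost);
}
```

Here main() in File2.cu would call run_demo(0, buffer) once and check the returned cudaError_t. If the assumption about the current device holds, calling get_array_address with a different device index would return the address of that device's own copy of array_1.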

Thanks for the help.

PLEASE IGNORE THIS POST. POSTING THE SAME QUESTION IN THE OTHER FORUM TITLED "PROGRAMMING AND DEVELOPMENT"