Scope of arrays defined in a __device__ function

Hi,

I am trying to repeatedly call a device function from my kernel. In the device function, I define an array and alter the element values in the array. Something like the following.

__device__ void myFunction(){
    ...
    double array[20];
    ... // alter element values in array
    ...
}

__global__ void myKernel(){
    ...
    for (...) {   // loop
        ...
        myFunction();
        ...
    }             // end loop
    ...
}

I thought that since the array is defined within the function, its element values in each call to the device function would be independent of those in every other call. However, the values left by one call seem to carry over into the call in the next iteration. I don't have a clue why this is.

I tried to change the way the array is defined, say,

__device__ void myFunction(){
    ...
    double *array = (double *)malloc(20 * sizeof(double));
    ... // alter element values in array
    ...
    free(array);
}

In this case, the kernel simply crashes with “unspecified launch failure”. Could anyone please tell me how I can keep the array in each call independent of the others without crashing the kernel?

Thanks!

I would expect the array element values from one thread to be independent from any activity in another thread.

However, with respect to calling the function repeatedly in the same thread, your comment:

"it seems that the resulting element values in the current call will be passed on to the call in the next iteration"

only makes sense to me if you are using a particular element in an uninitialized fashion, i.e. reading from that element within the device function before writing to it. A local array is not zeroed on entry, and each call will typically reuse the same registers or local memory, so such a read can return whatever the previous call left there. If that is the case, then your observation is no surprise to me, and your code is broken.
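To make the point about uninitialized reads concrete, here is a minimal sketch of the write-before-read pattern. It reuses the names myFunction and myKernel from the question, but the return value, the parameters, the N_ELEMS constant and the run_demo helper are my own assumptions for illustration, not the original code.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

#define N_ELEMS 20   // array length taken from the question

// Hypothetical stand-in for the poster's device function.
__device__ double myFunction(int iter)
{
    double array[N_ELEMS];
    for (int i = 0; i < N_ELEMS; ++i)
        array[i] = 0.0;            // write every element before any read

    array[iter % N_ELEMS] = 1.0;   // this call's own update

    double sum = 0.0;
    for (int i = 0; i < N_ELEMS; ++i)
        sum += array[i];           // all reads are of initialized elements
    return sum;                    // 1.0 on every call, nothing carries over
}

__global__ void myKernel(double *out, int iterations)
{
    for (int it = 0; it < iterations; ++it)
        out[it] = myFunction(it);
}

// Host-side check: 0 = results as expected, 1 = mismatch, 2 = CUDA error.
int run_demo()
{
    const int iterations = 8;
    double host[iterations];
    double *dev = NULL;

    if (cudaMalloc(&dev, sizeof(host)) != cudaSuccess) return 2;
    myKernel<<<1, 1>>>(dev, iterations);
    cudaError_t err = cudaGetLastError();
    if (err == cudaSuccess)
        err = cudaMemcpy(host, dev, sizeof(host), cudaMemcpyDeviceToHost);
    cudaFree(dev);
    if (err != cudaSuccess) return 2;

    for (int it = 0; it < iterations; ++it)
        if (host[it] != 1.0) return 1;
    return 0;
}

int main()
{
    int rc = run_demo();
    printf("run_demo returned %d\n", rc);
    return rc;
}
```

If the initialization loop is removed, the sum reads elements that this call never wrote, and the result is then undefined; seeing values from the previous iteration is one possible outcome.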

If you are in fact suggesting that you write a value to a particular element, and after that, you read the same element and you do not get what you wrote, but instead you get what you wrote in some previous invocation of the function, that would indeed be surprising, and you should probably provide a short, complete example that demonstrates it.

Regarding your second attempt (with malloc), my guess is that you may be exceeding the device heap allocation limit. In that case malloc returns NULL, and writing through that pointer would produce exactly this kind of launch failure. Study the programming guide for proper usage of in-kernel malloc, and note that proper error checking for any malloc operation involves testing whether the returned pointer is NULL.
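As a rough sketch of what that could look like: the code below checks the pointer returned by in-kernel malloc and raises the device heap limit with cudaDeviceSetLimit before the first launch. The 64 MB figure, the launch configuration, the failure counter and the run_demo helper are assumptions chosen for illustration; size the heap for your own thread count and allocation size.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

#define N_ELEMS 20

// Hypothetical variant of myFunction that uses the device heap.
// Returns false when the allocation fails instead of dereferencing NULL.
__device__ bool myFunction(int iter, double *result)
{
    double *array = (double *)malloc(N_ELEMS * sizeof(double));
    if (array == NULL)
        return false;                 // heap exhausted: do not touch the pointer

    for (int i = 0; i < N_ELEMS; ++i)
        array[i] = 0.0;               // malloc does not zero the memory
    array[0] = (double)iter;

    double sum = 0.0;
    for (int i = 0; i < N_ELEMS; ++i)
        sum += array[i];
    *result = sum;

    free(array);
    return true;
}

__global__ void myKernel(double *out, int *failed, int iterations)
{
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    double last = 0.0;
    for (int it = 0; it < iterations; ++it) {
        if (!myFunction(it, &last)) {
            atomicAdd(failed, 1);     // report the failure to the host
            return;
        }
    }
    out[tid] = last;
}

// Host-side check: 0 = ok, 1 = wrong result or failed allocation, 2 = CUDA error.
int run_demo()
{
    const int blocks = 64, threads = 256, n = blocks * threads, iterations = 4;

    // Assumed size; must be set before the first kernel that calls malloc.
    if (cudaDeviceSetLimit(cudaLimitMallocHeapSize, 64u * 1024 * 1024) != cudaSuccess)
        return 2;

    double *out = NULL;
    int *failed = NULL;
    if (cudaMalloc(&out, n * sizeof(double)) != cudaSuccess) return 2;
    if (cudaMalloc(&failed, sizeof(int)) != cudaSuccess) { cudaFree(out); return 2; }
    cudaMemset(failed, 0, sizeof(int));

    myKernel<<<blocks, threads>>>(out, failed, iterations);
    cudaError_t err = cudaGetLastError();

    int hostFailed = 0;
    double *hostOut = (double *)malloc(n * sizeof(double));
    if (err == cudaSuccess)
        err = cudaMemcpy(&hostFailed, failed, sizeof(int), cudaMemcpyDeviceToHost);
    if (err == cudaSuccess)
        err = cudaMemcpy(hostOut, out, n * sizeof(double), cudaMemcpyDeviceToHost);

    int rc = (err != cudaSuccess) ? 2 : (hostFailed != 0 ? 1 : 0);
    for (int i = 0; rc == 0 && i < n; ++i)
        if (hostOut[i] != (double)(iterations - 1)) rc = 1;

    free(hostOut);
    cudaFree(out);
    cudaFree(failed);
    return rc;
}

int main()
{
    int rc = run_demo();
    printf("run_demo returned %d\n", rc);
    return rc;
}
```

For a fixed size of 20 doubles, the plain local array from the first attempt is usually the simpler choice; the heap version mainly makes sense when the size is only known at run time.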

Thanks a lot! Initializing the array solved the problem perfectly.