Global variables Across Threads


I am relatively new to CUDA and am trying to translate some C++ code to GPU and am having some difficulty.

The original code calls an update function that takes an integer reference variable, e.g. ‘num’, and increments it. The newly incremented ‘num’ variable is then passed to the next function within the class. I know that CUDA does not allow passing reference variables to kernels; is there a workaround for this kind of situation?

The original C++ code follows:

void update(int &num){

	for(unsigned int i = 0; i < N; i++){

		kp[i] = kp[i] + val[num++];

	}

}

Thank you for your assistance.

No workaround is needed: that code will work unchanged as a device function. CUDA supports the pass-by-reference C++ idiom for device functions.

You’re also right that you can’t pass arguments by reference to KERNELS: kernel arguments cross the host-to-device boundary, so a reference to a host variable would be meaningless on the GPU. But inside a kernel, and between a kernel and its device functions, references are no problem.


Can you show a quick example of how I would accomplish this using the small function that I posted?

__device__ void doubleV(int &v)
{
	v = 2 * v;
}

__device__ void addOneToV(int &v)
{
	v = v + 1;
}

__global__ void myKernel(int *result)
{
	int v = threadIdx.x;

	doubleV(v);    /* v is now 2*threadIdx.x */
	addOneToV(v);  /* v is now 2*threadIdx.x + 1 */

	result[threadIdx.x] = v;  /* stores 2*threadIdx.x + 1 */
}
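Since kernel parameters themselves are always passed by value, the usual way to get a value back out of a kernel is a device pointer, which is what the `result` parameter is for. A minimal host-side sketch of launching the kernel and copying the results back (the names `myKernel` and the block size `N` are assumptions for illustration):

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main(void)
{
	const int N = 32;  /* one thread per output element (assumed size) */
	int *d_result;
	int h_result[N];

	cudaMalloc(&d_result, N * sizeof(int));

	/* launch one block of N threads; each thread writes 2*threadIdx.x + 1 */
	myKernel<<<1, N>>>(d_result);

	/* copy the results back across the device-to-host boundary */
	cudaMemcpy(h_result, d_result, N * sizeof(int), cudaMemcpyDeviceToHost);

	for (int i = 0; i < N; i++)
		printf("result[%d] = %d\n", i, h_result[i]);

	cudaFree(d_result);
	return 0;
}
```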


Thank you :)