Problem with simple calculation in kernel

Hi folks,

I am trying to normalize the result of a cuFFT transform with the code below. data[i].x is the real part and data[i].y the imaginary part of a complex number. The problem is that the real part (.x) is calculated and stored correctly, but the same value is also written into the .y field.


Let data[i].x = 10, data[i].y = 30 and size = 5 before execution. After execution, both data[i].x == 2 and data[i].y == 2, although data[i].y should be 6.

If I comment out one of the lines, the other result is calculated correctly and stored at the correct place. What’s going wrong?

Btw, data is of course a pointer to linear device memory, and size is the number of elements in the vector.

As far as I know, cufftDoubleComplex is just a typedef for double2. On the nvcc command line I have specified "-arch sm_13", and my card supports compute capability 1.3, so using doubles should not be the problem.

__global__ void dev_normalize(cufftDoubleComplex *data, double size)
{
	int i = blockIdx.x * blockDim.x + threadIdx.x;

	if (i < size)
	{
		data[i].x /= size;
		data[i].y /= size;
	}
}

__host__ void normalize(cufftDoubleComplex *data, double size)
{
	dev_normalize<<<((int)size+255) / 256, 256>>>(data, size);
}


It is a known compiler bug.

Simple workaround:

double scale=1./size;
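
For anyone reading along, here is a minimal sketch of how the workaround line might be applied to the kernel from the question. It assumes that the two divisions are replaced by multiplications with the precomputed reciprocal; the reply only gives the `scale` line, so the rest is a guess. The names dev_normalize_scaled and normalize_scaled are made up for this example. With the values from the question (10, 30, size = 5) this should give roughly 2 and 6, up to rounding, since multiplying by 1/size is not always bit-identical to dividing by size.

```cuda
#include <cufft.h>  // only needed for the cufftDoubleComplex typedef

// Hypothetical variant of dev_normalize: compute the reciprocal once
// and multiply, instead of performing two double-precision divisions.
__global__ void dev_normalize_scaled(cufftDoubleComplex *data, double size)
{
	int i = blockIdx.x * blockDim.x + threadIdx.x;

	double scale = 1. / size;  // the workaround line from the reply

	if (i < size)
	{
		// Assumption: these multiplications replace the original "/= size".
		data[i].x *= scale;
		data[i].y *= scale;
	}
}

// Hypothetical host wrapper, same launch configuration as in the question.
__host__ void normalize_scaled(cufftDoubleComplex *data, double size)
{
	dev_normalize_scaled<<<((int)size + 255) / 256, 256>>>(data, size);
}
```

As in the original code, data must point to device memory, and the build still needs "-arch sm_13" (or newer) so that doubles are not demoted to floats.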

That works.
Thank you for your very quick reply!

Another question on this topic: of course I do not want to run into this again, so could you please point me to a detailed description of the bug, or post the details here?


I seem to have run into a similar problem again… Could somebody please point me to the bug description, so that I can work around this bug in the future?