Why do I get the wrong answer?

Today I tried to use CUDA to add 0.2 to every element of an array in parallel, and I got the wrong answer. When I add 1 instead, I get the right answer. Can anyone tell me why?

My code:

[codebox]int i;
__shared__ float ddaa[NUM];
i = blockDim.x * blockIdx.x + threadIdx.x;
ddaa[i] = dda[i];
if (i < NUM) ddaa[i] = ddaa[i] + 0.2;
//if(i<=NUM) ddaa[i]=0;
dda[i] = ddaa[i];[/codebox]

NUM == 1024

s1<<<2,512>>>(da);

I get answers like:

6.1999998
7.1999998
...
16.2000008
...
102.1999969
...

That code makes no sense, sorry. Shared memory is per block, so it is usually much smaller than the global-memory data it is read from.

I would expect something like:

ddaa[threadIdx.x] = dda[i];

with NUM being the number of threads in a block.

But again, I don't really understand what you are trying to do.
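A minimal sketch of that indexing, assuming the kernel is meant to add 0.2 to each of the 1024 elements using the <<<2,512>>> launch from the question. The names THREADS_PER_BLOCK and add_on_gpu are hypothetical and not from the original post; the shared array is sized per block and indexed with threadIdx.x, while global memory is indexed with i.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

#define NUM 1024               // total element count, as in the question
#define THREADS_PER_BLOCK 512  // hypothetical name; matches the <<<2,512>>> launch

__global__ void s1(float *dda)
{
    // One shared slot per thread of this block, not one per global element.
    __shared__ float ddaa[THREADS_PER_BLOCK];

    int i = blockDim.x * blockIdx.x + threadIdx.x;  // global index
    if (i < NUM) {
        ddaa[threadIdx.x] = dda[i];   // global -> shared, block-local index
        ddaa[threadIdx.x] += 0.2f;    // each thread touches only its own slot
        dda[i] = ddaa[threadIdx.x];   // shared -> global
    }
}

// Hypothetical host helper: copies h to the device, runs the kernel,
// and copies the result back. Returns true if the CUDA calls succeeded.
bool add_on_gpu(float *h)
{
    float *da = nullptr;
    size_t bytes = NUM * sizeof(float);
    if (cudaMalloc(&da, bytes) != cudaSuccess) return false;
    cudaMemcpy(da, h, bytes, cudaMemcpyHostToDevice);
    s1<<<NUM / THREADS_PER_BLOCK, THREADS_PER_BLOCK>>>(da);
    cudaError_t err = cudaMemcpy(h, da, bytes, cudaMemcpyDeviceToHost);
    cudaFree(da);
    return err == cudaSuccess;
}

int main()
{
    float h[NUM];
    for (int k = 0; k < NUM; ++k) h[k] = (float)k;  // assumed input: 0, 1, 2, ...
    if (!add_on_gpu(h)) { printf("CUDA error\n"); return 1; }
    printf("%.7f %.7f\n", h[6], h[7]);
    return 0;
}
```

Because each thread reads and writes only its own shared slot, no __syncthreads() is needed here. For an operation this simple the shared-memory staging buys nothing; it is kept only to mirror the original code.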

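0.2 is not exactly representable in binary floating point, so the stored value and every sum involving it are rounded to the nearest representable number. Small integers such as 1 are exactly representable, which is why adding 1 gives the expected result. This is fundamental to the nature of floating point and has nothing to do with CUDA.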
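The same rounding can be seen in host-only code with no kernel at all. The sketch below is illustrative: the helper name survives_float_round_trip is hypothetical, and the inputs 6.0 and 102.0 are assumed from the outputs quoted in the question.

```cuda
#include <cstdio>

// Returns true if converting x to float and back to double loses nothing,
// i.e. the value is exactly representable in single precision.
bool survives_float_round_trip(double x)
{
    return (double)(float)x == x;
}

int main()
{
    // Print the values that are actually stored.
    printf("0.2f is stored as %.10f\n", 0.2f);
    printf("1.0f is stored as %.10f\n", 1.0f);

    // Sums with 0.2f are rounded; sums with 1.0f are exact for small integers.
    printf("6.0f   + 0.2f = %.7f\n", 6.0f + 0.2f);
    printf("102.0f + 0.2f = %.7f\n", 102.0f + 0.2f);
    printf("6.0f   + 1.0f = %.7f\n", 6.0f + 1.0f);

    printf("0.2 exact in float? %s\n", survives_float_round_trip(0.2) ? "yes" : "no");
    printf("1.0 exact in float? %s\n", survives_float_round_trip(1.0) ? "yes" : "no");
    return 0;
}
```

If the last digits matter, printing with fewer significant digits (a float carries roughly 7) or comparing against a small tolerance instead of testing for equality are the usual ways to deal with this.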