I’m getting a very weird CUDA bug…
Here’s my code:
__device__ uint hashVal ( float4 pos )
{
uint gz = (pos.z - simData.min.z) * simData.delta.z;
uint gy = (pos.y - simData.min.y) * simData.delta.y;
uint gx = (pos.x - simData.min.x) * simData.delta.x;
return max(0, min(gz, (uint) simData.resmax));
}
__global__ void hashParticles ( char* bufPnts, uint* bufHash, int numPnt )
{
uint ndx = __mul24(blockIdx.x, blockDim.x) + threadIdx.x;
if ( ndx >= 0 && ndx < numPnt ) {
float4* pos = (float4*) (bufPnts + ndx * simData.stride);
uint* clr = (uint*) ((char*) pos + 3*sizeof(float) );
uchar r = ((30+pos->x)/60)*255.0f;
uchar g = ((20+pos->y)/40)*255.0f;
uchar b = 0;
*clr = uint(r<<24) | uint(g<<16) | uint(b<<8) | 255;
// if ( ndx <= 100000000 ) // error disable
bufHash[ndx] = hashVal ( *pos );
}
}
It takes a particle, modifies its color and outputs a hash value.
The bug shows up as bad values in both bufHash and *clr.
It occurs in the code above, with the “error disable” line commented out as shown. These are what the bad *clr values look like:
3064791295 255 2309816575 255 1437401343 255 3282895103 255 3182231807 255 2307916031 255 2846687487 255
When the “error disable” line is uncommented (so the if statement is active), the values are all good:
3282895103 2964127999 2846687487 2410479871 3064791295 2192376063 1655505151 2527920383 2091712767 1219297535 3182231807 2963472639 1873608959 3383558399
I noticed the bad values (255) are interleaved, and they correspond to what would happen if pos->x and pos->y were 0.0 – yet all pos are ok on kernel entry.
I think I’ve ruled out any obvious error. The only thing I can think of is some issue with doing two global memory writes in a row?
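One thing I have not ruled out yet is alignment. A float4 access on the device must be 16-byte aligned, and I dereference a float4* computed as bufPnts + ndx * simData.stride. If the stride happened to be a multiple of 8 but not of 16, every other particle would be misaligned, which would at least match the interleaved pattern. This is only a guess on my part. Below is a minimal host-side sketch of the check I could run, assuming the base pointer itself is 16-byte aligned (cudaMalloc guarantees at least that). The helper names are hypothetical and not part of my code:

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not in my original code): returns true if particle
// `ndx` starts on a 16-byte boundary, which a float4 access requires.
bool particleIsFloat4Aligned(std::uintptr_t baseAddr, std::size_t stride, std::size_t ndx)
{
    return (baseAddr + ndx * stride) % 16 == 0;
}

// Counts how many of the first numPnt particles would be misaligned
// for a given buffer base address and per-particle stride in bytes.
std::size_t countMisaligned(std::uintptr_t baseAddr, std::size_t stride, std::size_t numPnt)
{
    std::size_t bad = 0;
    for (std::size_t i = 0; i < numPnt; ++i) {
        if (!particleIsFloat4Aligned(baseAddr, stride, i)) ++bad;
    }
    return bad;
}
```

If this reports misaligned particles for my actual stride, reading pos->x, pos->y and pos->z through a plain float* instead of passing *pos by value should avoid the vectorized float4 load.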
Any help is appreciated! Otherwise I’ll have to just leave the “if ( ndx <= 100000000 )” line in.