I have a kernel whose writes to global memory do not persist when the application is compiled for the device. However, when I compile the same application in emulator mode, the writes persist after the kernel returns.
All pointer arguments to the kernel below point to global memory. The kernel tries to assign a randomly generated 2D coordinate to each object in a set. Coordinates are constrained by an image mask, d_msk. Successful writes are recorded with a “1” in the success array.
/*!
 * Determine the starting coordinates for each object.
 */
__global__ void
d_assignObjectPositions(uchar *d_msk, uint imageWidth, uint imageHeight,
                        float *randomNumsX, float *randomNumsY,
                        uint4 *objects, uint *success, int N)
{
    unsigned int idx = blockIdx.x*blockDim.x + threadIdx.x;
    unsigned int pos = 0;
    unsigned int randX = 0;
    unsigned int randY = 0;
    uchar a, b, c;

    if (idx < N) {
        // Assign position if no previous successful position assignments
        if (success[idx] == 0) {
            randX = (unsigned int)floor(randomNumsX[idx] * imageWidth);
            randY = (unsigned int)floor(randomNumsY[idx] * imageHeight);
            pos = (randX + (randY*imageWidth)) * 4;
            a = d_msk[pos];
            b = d_msk[pos+1];
            c = d_msk[pos+2];
            // Assign position if this location in the texture mask is open
            if (a != 0x00 || b != 0x00 || c != 0x00) {
                objects[idx].x = randX;
                objects[idx].y = randY;
                success[idx] = 1; // This write only persists in emulator
            }
        }
    }
}
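One thing that may be worth ruling out is the mask index itself. If a random sample is exactly 1.0, randX equals imageWidth (or randY equals imageHeight), and the read from d_msk can land past the end of the buffer. The sketch below is a host-side mirror of the kernel's index arithmetic, not a fix for the kernel. It assumes the mask is a tightly packed buffer of 4 bytes per pixel and that the samples are meant to lie in [0, 1); neither is confirmed above. The names maskIndexInBounds and countOutOfBounds are hypothetical helpers introduced only for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Host-side mirror of the index arithmetic in d_assignObjectPositions.
// Assumption: d_msk holds imageWidth * imageHeight pixels, 4 bytes each.
// Returns true if reading d_msk[pos], d_msk[pos+1] and d_msk[pos+2] stays in bounds.
bool maskIndexInBounds(float rx, float ry,
                       unsigned int imageWidth, unsigned int imageHeight)
{
    // Reject negative values and NaN before any float-to-integer cast.
    if (!(rx >= 0.0f) || !(ry >= 0.0f)) return false;

    float fx = std::floor(rx * imageWidth);
    float fy = std::floor(ry * imageHeight);

    // A sample of exactly 1.0 gives fx == imageWidth, one column too far.
    if (fx >= (float)imageWidth || fy >= (float)imageHeight) return false;

    std::size_t pos = ((std::size_t)fx + (std::size_t)fy * imageWidth) * 4;
    std::size_t maskBytes = (std::size_t)imageWidth * imageHeight * 4;
    return pos + 2 < maskBytes;
}

// Counts the samples whose mask index would fall outside the buffer.
std::size_t countOutOfBounds(const float *xs, const float *ys, std::size_t n,
                             unsigned int imageWidth, unsigned int imageHeight)
{
    assert(xs != NULL && ys != NULL);
    std::size_t bad = 0;
    for (std::size_t i = 0; i < n; ++i) {
        if (!maskIndexInBounds(xs[i], ys[i], imageWidth, imageHeight)) ++bad;
    }
    return bad;
}
```

Calling countOutOfBounds on the host copies of randomNumsX and randomNumsY before they are uploaded should return 0 if every sample maps to a valid mask pixel. A nonzero count would point at the index computation rather than at the write to success.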
Some other notes:
Development platform: Ubuntu Linux 8.04, 32-bit.
Driver: 177.13
CUDA Toolkit: version 2.0 beta 2 for Ubuntu 7.10
CUDA SDK: 2.0 beta 2
GPU: 8800 GT
Compiler: g++ 4.2.3
I’d greatly appreciate any suggestions. I’m a bit new to CUDA and GPGPU.