I get random memory errors when a kernel writes byte data to memory.
I want to write a function that processes an image (an array of unsigned char pixels) and
writes the result to another image.
The kernel works fine most of the time, and the result matches the one calculated on the CPU.
But occasionally (approximately 1 run out of 10) a few bytes are wrong.
The problem seems similar to the one discussed in another thread, “The Official NVIDIA Forums | NVIDIA”,
but I do not call __syncthreads() anywhere.
The kernel is so simple that I cannot see any error in it.
It does not matter what calculation I do in the kernel.
For example:
__global__ void subtract_pixels(const unsigned char *A, const unsigned char *B, int width, int height, int stride, volatile unsigned char *C){
    // One thread per pixel; x is the column, y is the row.
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    // Rows are stride bytes apart.
    int offset = y * stride + x;
    int c1;
    if (x < width && y < height) {
        c1 = abs((int)A[offset] - (int)B[offset]) / 2;
        C[offset] = c1;
    }
}
The problem seems to get worse when the horizontal size of THREAD_BLOCK is small.
For example, when the THREAD_BLOCK size is 16x16 it happens often,
but when the THREAD_BLOCK size is 64x16 it happens rarely.
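The comparison against the CPU result and the dependence on block size described above could be reproduced with a small harness along the following lines. This is only a minimal sketch under assumptions: the helper names subtract_pixels_cpu, count_mismatches and run_and_compare are hypothetical and not taken from the original code, the kernel is repeated so the block is self-contained, and the input images are assumed to be stride * height bytes each with stride >= width. It also checks every CUDA call, so a failed launch is reported separately from a wrong pixel.

```cuda
#include <cassert>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

// Kernel from the post: C = |A - B| / 2 per pixel, rows are stride bytes apart.
__global__ void subtract_pixels(const unsigned char *A, const unsigned char *B,
                                int width, int height, int stride,
                                volatile unsigned char *C)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x < width && y < height) {
        int offset = y * stride + x;
        C[offset] = (unsigned char)(abs((int)A[offset] - (int)B[offset]) / 2);
    }
}

// Hypothetical CPU reference: the same arithmetic, one pixel at a time.
void subtract_pixels_cpu(const unsigned char *A, const unsigned char *B,
                         int width, int height, int stride, unsigned char *C)
{
    for (int y = 0; y < height; ++y)
        for (int x = 0; x < width; ++x) {
            int offset = y * stride + x;
            C[offset] = (unsigned char)(abs((int)A[offset] - (int)B[offset]) / 2);
        }
}

// Hypothetical comparator: counts differing pixels inside the image only.
// Padding bytes (x >= width) are skipped because the kernel never writes them.
int count_mismatches(const unsigned char *gpu, const unsigned char *cpu,
                     int width, int height, int stride)
{
    int bad = 0;
    for (int y = 0; y < height; ++y)
        for (int x = 0; x < width; ++x)
            if (gpu[y * stride + x] != cpu[y * stride + x]) ++bad;
    return bad;
}

// Hypothetical driver: runs the kernel with the given block shape and returns
// the number of mismatching pixels, or -1 if any CUDA call failed.
int run_and_compare(const unsigned char *hA, const unsigned char *hB,
                    int width, int height, int stride, dim3 block)
{
    assert(stride >= width);
    size_t bytes = (size_t)stride * height;
    std::vector<unsigned char> gpu(bytes, 0), cpu(bytes, 0);
    unsigned char *dA = nullptr, *dB = nullptr, *dC = nullptr;

    bool ok = cudaMalloc((void **)&dA, bytes) == cudaSuccess
           && cudaMalloc((void **)&dB, bytes) == cudaSuccess
           && cudaMalloc((void **)&dC, bytes) == cudaSuccess
           && cudaMemcpy(dA, hA, bytes, cudaMemcpyHostToDevice) == cudaSuccess
           && cudaMemcpy(dB, hB, bytes, cudaMemcpyHostToDevice) == cudaSuccess
           && cudaMemset(dC, 0, bytes) == cudaSuccess;
    if (ok) {
        // Round the grid up so partial blocks at the edges are still launched.
        dim3 grid((width + block.x - 1) / block.x,
                  (height + block.y - 1) / block.y);
        subtract_pixels<<<grid, block>>>(dA, dB, width, height, stride, dC);
        // Check the launch, wait for the kernel, then copy the result back.
        ok = cudaGetLastError() == cudaSuccess
          && cudaDeviceSynchronize() == cudaSuccess
          && cudaMemcpy(gpu.data(), dC, bytes, cudaMemcpyDeviceToHost) == cudaSuccess;
    }
    cudaFree(dA);
    cudaFree(dB);
    cudaFree(dC);
    if (!ok) return -1;

    subtract_pixels_cpu(hA, hB, width, height, stride, cpu.data());
    return count_mismatches(gpu.data(), cpu.data(), width, height, stride);
}
```

Calling run_and_compare repeatedly on the same inputs, once with dim3(16, 16) and once with dim3(64, 16), would show whether the mismatch count really depends on the block shape. A return value of -1 means a CUDA call failed, which is a different situation from a few wrong bytes.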
Can I process pixel (byte) data in a CUDA kernel at all?
Does the horizontal size of THREAD_BLOCK have to be aligned to the cache memory chunk size?
My graphics card is a GeForce GTX 460.
Could this simply be caused by defective graphics memory or a defective graphics card?
Can somebody advise what the problem is?