Hello,
I have an issue getting a result back from a kernel. I know kernels must return void, so I do the following:
kernel <<< ... >>> (a, b, c, d);
cudaThreadSynchronize();
cudaMemcpy(h_d, d, 50, cudaMemcpyDeviceToHost);
printf(" return : %s\n", h_d);
The kernel launches N threads, and only one of them is allowed to modify d. It is something like:
if ((threadIdx.x == a) && (threadIdx.y == b))
{
    d[0] = 'o';
    d[1] = 'k';
}
If I compile and run the code in emulation mode (make emu=1), there is no problem and I get the result 'ok' on the host. If I compile and run it in 'normal' mode (executed on the device), I get nothing, that is, an empty string.
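To narrow this down, here is a minimal sketch of the same sequence with every CUDA call checked, which should show whether the kernel actually ran on the device. The kernel name write_ok, the helper fetch_result, the 8x8 block, and the values 2 and 3 for a and b are hypothetical stand-ins, not my real code. The sketch also zeroes the device buffer and writes a terminating '\0', and it uses cudaDeviceSynchronize, the current name for the deprecated cudaThreadSynchronize.

```cuda
#include <cstdio>
#include <cstring>
#include <cuda_runtime.h>

// Hypothetical stand-in for the real kernel: only the thread at (a, b) writes to d.
__global__ void write_ok(int a, int b, int c, char *d)
{
    (void)c;  // unused in this sketch
    if ((threadIdx.x == (unsigned int)a) && (threadIdx.y == (unsigned int)b))
    {
        d[0] = 'o';
        d[1] = 'k';
        d[2] = '\0';  // terminate the string explicitly
    }
}

// Hypothetical helper: allocates a device buffer of n bytes, runs the kernel,
// and copies the buffer into h_d. Returns the first CUDA error encountered.
cudaError_t fetch_result(char *h_d, size_t n)
{
    char *d = NULL;
    cudaError_t err = cudaMalloc((void **)&d, n);
    if (err != cudaSuccess) return err;

    err = cudaMemset(d, 0, n);  // start from a known, empty buffer
    if (err == cudaSuccess)
    {
        dim3 block(8, 8);  // assumed block shape; must contain thread (a, b)
        write_ok<<<1, block>>>(2, 3, 0, d);
        err = cudaGetLastError();  // reports launch failures
    }
    if (err == cudaSuccess) err = cudaDeviceSynchronize();  // reports execution failures
    if (err == cudaSuccess) err = cudaMemcpy(h_d, d, n, cudaMemcpyDeviceToHost);

    cudaFree(d);
    return err;
}

int main(void)
{
    char h_d[50] = {0};
    cudaError_t err = fetch_result(h_d, sizeof h_d);
    if (err != cudaSuccess)
    {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }
    printf(" return : %s\n", h_d);
    return 0;
}
```

If any call reports an error here, the empty string would come from the kernel never running or from a failed copy, rather than from the condition inside the kernel.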
Does anyone have an idea what could cause this?
Thanks,