The following gives funky results, so I was wondering whether it is legal:
cudaMalloc((void**)&data, size*sizeof(float));
someKernel<<<grid, block>>>(data+someOffset, someMoreParams);
This works fine from plain C, so I assume it is OK. However, as soon as templates and C++ join the show (basically an extension of the ‘reduction’ SDK example for non-power-of-two (NPOT) arrays), I can cause reproducible freezes of X on Linux. So the question is: can I do the above pointer arithmetic on the host side?
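To isolate the pattern, here is a minimal sketch of what I mean. The kernel name scaleKernel, the helper runOffsetDemo, and the concrete values of size and someOffset are hypothetical stand-ins, not my real code. The host only computes the address data + someOffset and never dereferences it, the kernel is told how many elements remain after the offset, and every CUDA call is checked so that an out-of-range access would show up as an error rather than as funky results.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical kernel: doubles n floats starting at the pointer it receives.
__global__ void scaleKernel(float *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)                  // guard: stay inside the offset sub-range
        data[i] *= 2.0f;
}

// Returns the number of elements with an unexpected value, or -1 on a CUDA error.
int runOffsetDemo()
{
    const int size = 1000;              // assumed size, deliberately not a power of two
    const int someOffset = 100;         // assumed offset, in elements (not bytes)
    const int count = size - someOffset;
    const size_t bytes = size * sizeof(float);

    std::vector<float> host(size, 1.0f);
    float *data = nullptr;
    if (cudaMalloc((void**)&data, bytes) != cudaSuccess)
        return -1;

    cudaError_t err = cudaMemcpy(data, host.data(), bytes, cudaMemcpyHostToDevice);

    if (err == cudaSuccess) {
        const int block = 128;
        const int grid = (count + block - 1) / block;   // round up
        // Pointer arithmetic on the host: only the address is computed here.
        scaleKernel<<<grid, block>>>(data + someOffset, count);
        err = cudaGetLastError();                       // launch errors
    }
    if (err == cudaSuccess)
        err = cudaDeviceSynchronize();                  // errors during execution
    if (err == cudaSuccess)
        err = cudaMemcpy(host.data(), data, bytes, cudaMemcpyDeviceToHost);

    cudaFree(data);
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return -1;
    }

    // Elements before the offset must be untouched, the rest doubled.
    int mismatches = 0;
    for (int i = 0; i < size; ++i) {
        const float expected = (i < someOffset) ? 1.0f : 2.0f;
        if (host[i] != expected)
            ++mismatches;
    }
    return mismatches;
}

int main()
{
    const int result = runOffsetDemo();
    printf("result: %d\n", result);
    return result == 0 ? 0 : 1;
}
```

This should build with nvcc as a single .cu file. If something like this behaves but the templated version does not, I would suspect the element count or indexing passed along with the offset pointer rather than the pointer arithmetic itself.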
Thanks for insights,