Weird behaviour of CU_SAFE_CALL, perhaps a problem with textures?

Hi everybody,

I just wondered why my code doesn’t do what I expect it to do :)

I’m working with the driver API for the first time and am trying to launch a kernel.

But the results of the following two statements differ:

CUresult res = cuLaunchGrid(function, m_numBlocksX, m_numBlocksY);

CU_SAFE_CALL(cuLaunchGrid(function, m_numBlocksX, m_numBlocksY));

If I just check the result, everything works correctly. But if I use the CU_SAFE_CALL macro, my output image looks wrong and some blocks end up at the wrong position (it looks a bit like a wrong parallelisation or wrong block IDs).

But the kernel code is the same.

Does anybody out there have an idea why this could be the case?
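
For anyone comparing the two variants above: one possible difference is that a checking macro may do more than test the return value. It might, for example, also wait for the device to finish, and a carelessly written macro could even evaluate its argument twice. I have not verified what CU_SAFE_CALL expands to in this SDK version, so the following is only a minimal sketch of a check macro that evaluates the call exactly once and then synchronizes. All names in it (FakeResult, fakeLaunchGrid, fakeCtxSynchronize, CHECKED_CALL) are hypothetical stand-ins so that it compiles without the CUDA toolkit. In real code they would correspond to CUresult, cuLaunchGrid and cuCtxSynchronize.

```cpp
#include <cstdio>
#include <cstdlib>

// Stand-ins (assumptions, not the real driver API) so the sketch compiles
// without the CUDA toolkit. In real code: CUresult, CUDA_SUCCESS,
// cuLaunchGrid and cuCtxSynchronize.
enum FakeResult { FAKE_SUCCESS = 0, FAKE_ERROR = 1 };

static int g_launches = 0;  // how often the "kernel" was launched
static int g_syncs = 0;     // how often we waited for the "device"

// Pretends to launch a grid; fails for non-positive grid dimensions.
FakeResult fakeLaunchGrid(int blocksX, int blocksY) {
    ++g_launches;
    return (blocksX > 0 && blocksY > 0) ? FAKE_SUCCESS : FAKE_ERROR;
}

// Pretends to block until all previously launched work has finished.
FakeResult fakeCtxSynchronize() {
    ++g_syncs;
    return FAKE_SUCCESS;
}

// Hypothetical check macro: evaluates `call` exactly once, then waits for
// the device so that a failure is reported at this line, not at a later call.
#define CHECKED_CALL(call)                                             \
    do {                                                               \
        FakeResult res_ = (call);                                      \
        if (res_ == FAKE_SUCCESS) res_ = fakeCtxSynchronize();         \
        if (res_ != FAKE_SUCCESS) {                                    \
            std::fprintf(stderr, "%s failed with code %d (%s:%d)\n",  \
                         #call, (int)res_, __FILE__, __LINE__);        \
            std::exit(EXIT_FAILURE);                                   \
        }                                                              \
    } while (0)
```

With a macro like this, CHECKED_CALL(fakeLaunchGrid(4, 4)) launches once and synchronizes once, whereas a plain FakeResult res = fakeLaunchGrid(4, 4); launches once and returns immediately. If the real macro behaves similarly, the two statements in the post would differ in timing rather than in the kernel itself.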



I just noticed that the kernel sometimes takes about 1 second or even longer to run. Usually it takes about 20 ms.

Can you continue to run kernels successfully after one takes >1s to complete?

I have a long-standing bug report to NVIDIA on kernels that normally complete in 5ms, but randomly take 5s to complete (even when the same kernel is called over and over again on the same data) and return a timeout error or an unspecified launch failure. Any attempt to call another kernel after this error occurs fails. I’m using the runtime API, though, not the driver API like you are.

I am hoping your bug will get solved quickly as it would prevent me from introducing CUDA processing in our products. :(