The driver crashes for less than a second, and then the last call returns a CUDA "unknown error".
I’ve looked through this forum, and it’s been said that this can be caused by bad pointers. Still, all my pointers are good, and MEMORYFORRESULTINGARRAY is rather small and can be allocated normally.
If anyone knows what is going on, please answer. Thanks in advance.
All the CUDA calls are wrapped in the CHECK_ERROR macro, and there’s an error check after each kernel call.
Everything is fine up to the line where I copy the array to the CPU: no error messages. In the debugger I can see that memory is allocated normally on both the CPU and the GPU.
I’ve also tried reducing my app to just a single kernel call with fixed parameters. Still the same. Any ideas?
Insert cudaThreadSynchronize(); before you check for errors after the kernel launch, so that you also catch errors that occur while your kernel is running.
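To make that suggestion concrete, here is a minimal sketch of the check order. The macro body, the kernel `scale`, and the buffer `d_data` are all made up for illustration, since the thread never shows the real CHECK_ERROR or the real kernel. It also uses cudaDeviceSynchronize(), which replaced cudaThreadSynchronize() when the latter was deprecated in CUDA 4.0; on an older toolkit the old name goes in the same place.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical stand-in for the CHECK_ERROR macro mentioned in the thread.
#define CHECK_ERROR(call)                                          \
    do {                                                           \
        cudaError_t err_ = (call);                                 \
        if (err_ != cudaSuccess) {                                 \
            fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,     \
                    cudaGetErrorString(err_));                     \
            exit(EXIT_FAILURE);                                    \
        }                                                          \
    } while (0)

// Placeholder kernel, not the one from the original post.
__global__ void scale(float *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        data[i] *= 2.0f;
}

int main()
{
    const int n = 1024;
    float *d_data = NULL;

    CHECK_ERROR(cudaMalloc((void **)&d_data, n * sizeof(float)));
    CHECK_ERROR(cudaMemset(d_data, 0, n * sizeof(float)));

    // The launch is asynchronous and returns immediately.
    scale<<<(n + 255) / 256, 256>>>(d_data, n);

    // Catches launch problems such as an invalid configuration.
    CHECK_ERROR(cudaGetLastError());

    // Blocks until the kernel has finished and returns any error raised
    // while it was running (for example an out-of-bounds access).
    CHECK_ERROR(cudaDeviceSynchronize());

    CHECK_ERROR(cudaFree(d_data));
    return 0;
}
```

Without the synchronize, an error from inside the kernel tends to show up at whichever later call happens to block first, often a cudaMemcpy back to the host, which makes it look like the copy is at fault.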
Shouldn’t you be passing the kernel d_finalPoints and not points? You do a memcpy from d_finalPoints back to points when the kernel itself doesn’t seem to use d_finalPoints at all.
(void**)&d_initArray: somehow I had added an & there, and that was the real problem. The memory allocation reported no error, and because this is C, the function ran without any complaint.
Anyhow, this call was wrapped in the CUDA_CHECK_ERROR macro, which gave no error message, and the crash happened much later. What helped was putting cudaThreadSynchronize() after this call and only then getting the last error. Then it reported the failure.
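The post doesn’t show the surrounding code, so the following is only a guess at how an extra & can get past both the compiler and the error check. It assumes a helper that receives the device pointer by address; the names `allocBuggy` and `allocFixed` are invented for the sketch. The explicit (void**) cast is what silences the type mismatch.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper with the extra '&'. d_out is already a float**, so
// &d_out is the address of the local parameter. The (void**) cast hides the
// float*** mismatch, cudaMalloc succeeds, and the device address is written
// into the local copy instead of the caller's pointer.
cudaError_t allocBuggy(float **d_out, size_t n)
{
    return cudaMalloc((void **)&d_out, n * sizeof(float));
}

// Same helper without the extra '&': the device address reaches the caller.
cudaError_t allocFixed(float **d_out, size_t n)
{
    return cudaMalloc((void **)d_out, n * sizeof(float));
}

int main()
{
    float *d_a = NULL;
    float *d_b = NULL;

    cudaError_t ea = allocBuggy(&d_a, 256);
    cudaError_t eb = allocFixed(&d_b, 256);

    // Both calls can report success, but only d_b is usable afterwards.
    printf("buggy: %s, caller pointer %s\n",
           cudaGetErrorString(ea), d_a ? "set" : "still NULL");
    printf("fixed: %s, caller pointer %s\n",
           cudaGetErrorString(eb), d_b ? "set" : "still NULL");

    cudaFree(d_b);
    return 0;
}
```

In a case like this the allocation itself has nothing to report, so a check wrapped around it stays quiet. The failure only appears once a kernel or copy actually uses the bad pointer, and it is only attributed to the right place if you synchronize and check right after that use.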