CUDA failures

Hi all,

I have a CUDA program that ran without problems two months ago. When I run it now, it reports many “unspecified cuda failures”. Could this be a hardware problem?

What should I check in this case? Is there a way to check whether any part of the GPU's memory (a GTX 295 in my case) is still locked?

Is constant memory freed automatically after the program terminates?

Please answer me,
Thank you in advance for your consideration,

Ardita

In my experience, an Unspecified Launch Failure on the device is similar to an access violation on the CPU, so check how your kernel interacts with memory.
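One way to narrow down where such a failure comes from is to check the error status right after every runtime call and every kernel launch. The following is only a minimal sketch: the `CUDA_CHECK` macro and the `scale` kernel are hypothetical names made up for illustration, not part of the original program. It also assumes a toolkit that provides `cudaDeviceSynchronize` (older toolkits used `cudaThreadSynchronize` for the same purpose).

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper: print file/line and abort if a runtime call failed.
#define CUDA_CHECK(call)                                              \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            std::fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,   \
                         cudaGetErrorString(err_));                   \
            std::exit(EXIT_FAILURE);                                  \
        }                                                             \
    } while (0)

// Hypothetical kernel standing in for the real one.
__global__ void scale(float *data, int n, float factor)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)              // bounds guard: threads past n must not touch memory
        data[i] *= factor;
}

int main()
{
    const int n = 1000;
    float *d_data = NULL;
    CUDA_CHECK(cudaMalloc(&d_data, n * sizeof(float)));
    CUDA_CHECK(cudaMemset(d_data, 0, n * sizeof(float)));

    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;  // round up to cover all n elements
    scale<<<blocks, threads>>>(d_data, n, 2.0f);

    CUDA_CHECK(cudaGetLastError());       // reports an invalid launch configuration
    CUDA_CHECK(cudaDeviceSynchronize());  // reports errors raised while the kernel ran

    CUDA_CHECK(cudaFree(d_data));
    std::printf("kernel completed without error\n");
    return 0;
}
```

Kernel launches are asynchronous, so without the synchronize call an out-of-bounds access may only be reported by some later, unrelated runtime call. Checking immediately after the launch ties the error to the kernel that caused it.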

Also, since you say you have not run your kernel for months, it is possible that the appropriate GPU architecture is no longer set in the *.cu file's build settings, so the code is compiled for an architecture that does not match your device.
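To see which architecture the build should target, you can ask the runtime what the installed devices report. This is a small sketch using only standard runtime calls; it is not taken from the original program. As far as I know the GTX 295 is a compute capability 1.3 part and shows up as two devices, which would correspond to something like `-arch=sm_13` on the nvcc command line, but treat that as an assumption and compare it with what the program prints.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main()
{
    int count = 0;
    cudaError_t err = cudaGetDeviceCount(&count);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "cudaGetDeviceCount: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Print the name and compute capability of every visible device.
    for (int dev = 0; dev < count; ++dev) {
        cudaDeviceProp prop;
        err = cudaGetDeviceProperties(&prop, dev);
        if (err != cudaSuccess) {
            std::fprintf(stderr, "cudaGetDeviceProperties(%d): %s\n",
                         dev, cudaGetErrorString(err));
            return 1;
        }
        std::printf("device %d: %s, compute capability %d.%d\n",
                    dev, prop.name, prop.major, prop.minor);
    }
    return 0;
}
```

If the compute capability printed here is lower than the architecture the project is compiled for, the kernels cannot run on that device, and the architecture setting is the first thing to correct.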

One more pleasant issue I ran into is texture binding: an attempt to bind a texture to a memory block larger than approximately 500 MBytes caused the kernel to fail at its launch (as I found out later, this limitation is described in the release notes file). Tracking down the cause of the launch failures was especially “interesting” because my code did not even use texture fetches; the texture binding had accidentally been left in after the code was reworked for Fermi, which works better without textures.
