Silent kernel failure

romain.laneuville · July 18, 2019, 9:07am

I was a bit upset yesterday and posted here to grumble (I’m French, this is a national sport here :p).

As a “modern” C++ developer trying to learn CUDA C++ with the latest features and best practices, the first recommendation I found was to wrap every CUDA call in a C macro, “cudaCheck” or whatever, that checks the return status of the actual function call. This is the way to go when dealing with old / legacy C APIs like FFMPEG, which I am using in my current project. So I define a custom C++ exception and throw it from the check macro if needed, something like this:

/**
 * @brief Check if FFMPEG library calls are successful and throw an exception if they are not
 */
#define CHECK_ERROR(ffmpegError)                                                                                      \
    do {                                                                                                              \
        int errorCode = ffmpegError;                                                                                  \
                                                                                                                      \
        if (errorCode < 0) {                                                                                          \
            throw FFMPEG::Exception(errorCode, __FUNCTION__, __FILE__, __LINE__);                                     \
        }                                                                                                             \
    } while (0)

Even though I find this solution not very convenient (wrapping functions in a macro is repetitive and makes my IDE lose syntax highlighting), I tried to do the same with CUDA. I wanted to propagate the exception to the top level of the application, so I looked into how to propagate an error from device to host after a kernel launch. As far as I remember, this is possible by passing a pointer parameter to the kernel and checking the value it points to after the launch to retrieve the error, but I gave up because this solution became complicated and a bit verbose.

Maybe CUDA C++ could fork from the C version and provide modern C++ features such as smart pointers, and replace the error system with a throw / try-catch one. I found a nice library, https://github.com/eyalroz/cuda-api-wrappers, that works as a C++ wrapper and delivers those nice C++ features, but is this project going to be maintained?

Talking about IDEs, they can be one of the best tools for tracking, debugging or catching errors, so I tried Nsight Eclipse Edition from the latest CUDA 10.1 toolkit release. I am using CLion (the JetBrains IDE for C++), but it lacks CUDA understanding / code insight / kernel debugging, so I gave Nsight a try.
On Ubuntu 18.04, Nsight crashes by default because of the wrong Java JDK version. I had to downgrade the JDK from 11 to 8 using “update-alternatives” to get the IDE working.
Then the code inspection seems to need a makefile to inspect the code. OK, I just ran “cmake .” and off we went. Then I wanted a dark theme. It turned out that the default one is broken, so I tried to fix it with a plugin, but there was an error installing it and I gave up there.

This is my little CUDA experience. I really need to improve, but I lack the time to do things properly.

tera · July 18, 2019, 9:56am

If your aim is only to throw an exception on errors, then you do not need to wrap the calls in a macro, just wrap them in a regular C++ function. Your IDE will be fine keeping syntax highlighting and code completion working inside function calls, no need to switch away from CLion.
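As a minimal sketch of the plain-function approach, the following uses a hypothetical `Status` type and `fakeApiCall` stand-in so that it compiles without the CUDA toolkit. The names `ApiError` and `throwOnError` are also made up for illustration. With CUDA, the status type would be `cudaError_t` and the message could come from `cudaGetErrorString`.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a C-style status code. With CUDA this would be
// cudaError_t, where cudaSuccess is 0.
using Status = int;

// Hypothetical exception type that carries the failing status code.
class ApiError : public std::runtime_error {
public:
    explicit ApiError(Status code)
        : std::runtime_error("API call failed with status " + std::to_string(code)),
          code_(code) {}

    Status code() const noexcept { return code_; }

private:
    Status code_;
};

// A regular function, not a macro: throws on any nonzero status.
inline void throwOnError(Status status) {
    if (status != 0) {
        throw ApiError(status);
    }
}

// Stand-in for a C API call that reports failure through its return value.
inline Status fakeApiCall(bool succeed) {
    return succeed ? 0 : 2;
}
```

A call site then reads `throwOnError(fakeApiCall(true));`, which the IDE treats like any other function call. The exception carries no file or line information in this form.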

The macro is only needed to capture line number information for use in error messages or exceptions, without requiring symbol information in the build. If you can do without function, filename, line number in your error message you are fine. And that actually has nothing to do with CUDA, it is the same as with any other C++ code.

You can also make a separate macro that provides this information and pass that as a second argument into your error checking function. This gives you the best of both approaches (highlighting and code completion inside the IDE and detailed info in error messages), but increases the verbosity of the source code somewhat.
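One possible shape for this, as a sketch only: the macro `HERE`, the struct `SourceLocation`, and the functions `describeFailure` and `check` are all hypothetical names, and the status is a plain `int` so the example compiles without CUDA.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical record of where a check was written.
struct SourceLocation {
    const char* file;
    int line;
    const char* function;
};

// The only macro: it expands at the call site, so it captures that site's
// file, line and enclosing function name.
#define HERE (SourceLocation{__FILE__, __LINE__, __func__})

// Builds the message text; kept separate so it is easy to inspect.
inline std::string describeFailure(int status, const SourceLocation& where) {
    return std::string(where.file) + ":" + std::to_string(where.line) + " in " +
           where.function + ": call failed with status " + std::to_string(status);
}

// Regular function doing the actual check; throws on any nonzero status.
inline void check(int status, const SourceLocation& where) {
    if (status != 0) {
        throw std::runtime_error(describeFailure(status, where));
    }
}
```

In CUDA code a call site could then look like `check(cudaMalloc(&ptr, bytes), HERE);`, assuming `ptr` and `bytes` are defined. The API call itself stays outside any macro, at the cost of the extra `HERE` argument on every line.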

There are other approaches possible like only dumping the program counter in the error message and keeping the debug symbols separate from the executable, to be recombined to produce a full error message outside of your program.

The wrapper library you found is the one I alluded to in my post #20.

tera · July 18, 2019, 7:51pm

Apparently syntax highlighting inside macros is a new feature in CLion 2019.1 based on clangd. This would be welcome news and eliminate the trade-off between verbosity and syntax highlighting quality mentioned above.
However I haven’t invested the time yet to get it working for me. And again, this is nothing CUDA-specific.

epk · April 30, 2020, 7:33am

Author of cuda-api-wrappers here…

I just noticed this post, and the answer is yes! I am definitely maintaining the library - and actually doing work to improve and expand it. (Of course I wish nVIDIA would officially recognize/adopt it…)

@tera : I would have plugged sooner but the notification didn’t reach me somehow :-(

tera · May 18, 2020, 4:39pm

Apologies @epk, I think notifications have only been added in the new forum software, so my post was a bit early for that (but I might be wrong, can’t go back and check the old forum software anymore obviously).

There is hope for a new C++ API directly from Nvidia. (I wanted to get that out before I get access to the preview and might not be able to talk about it anymore).

epk · May 18, 2020, 9:50pm

@tera: I wonder why they can’t either adopt mine officially, or let me work for them on a version of it which has the same “level of access” to the underlying layers as the runtime API itself. Also - can you focus your link a bit? What on that page talks about that API?
