Silent kernel failure

I was a bit upset yesterday and posted here to grumble (I’m French, and grumbling is a national sport here :p).

As a “modern” C++ developer trying to learn CUDA C++ with the latest features and best practices, the first recommendation I found was to wrap every CUDA call in a C macro (“cudaCheck” or whatever) that checks the return status of the actual function call. This is the usual way to go when dealing with an old / legacy C API like FFMPEG, which I am using in my current project: define a custom C++ exception and throw it from the check macro when needed, something like this:

/**
 * @brief Check if FFMPEG library calls are successful and throw an exception if they are not
 */
#define CHECK_ERROR(ffmpegError)                                                                                      \
    do {                                                                                                              \
        int errorCode = (ffmpegError);                                                                                \
                                                                                                                      \
        if (errorCode < 0) {                                                                                          \
            throw FFMPEG::Exception(errorCode, __FUNCTION__, __FILE__, __LINE__);                                     \
        }                                                                                                             \
    } while (0)

Even though I find this solution not very convenient (wrapping functions in a macro is repetitive and makes my IDE lose syntax highlighting), I tried to do the same with CUDA. I wanted to propagate the exception to the top level of the application, so I looked for a way to propagate an error from device to host after a kernel launch. As I remember, it is possible by passing a pointer parameter to the kernel and then checking what it points to after the launch to retrieve the error, but I gave up as this solution started to get complicated and a bit verbose.
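
For illustration, the pointer-based idea can be sketched roughly as below. This is plain host C++ so that it runs without a GPU: `scaleKernel` and `scaleOrThrow` are hypothetical names, and the "kernel" is an ordinary function. In real CUDA it would be a `__global__` function, the flag would have to live in memory the device can write, and the host would have to wait for the kernel to finish before reading it.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Stand-in for a kernel body, written as a plain host function for illustration.
// Device code cannot throw, so it records a problem in *errorFlag instead.
inline void scaleKernel(float* data, int n, float divisor, int* errorFlag) {
    if (divisor == 0.0f) {
        *errorFlag = 1;  // hypothetical error code: invalid divisor
        return;          // leave the data untouched
    }
    for (int i = 0; i < n; ++i) {
        data[i] /= divisor;
    }
}

// Host-side wrapper: run the "kernel", then turn a raised flag into an exception.
inline void scaleOrThrow(std::vector<float>& data, float divisor) {
    int errorFlag = 0;
    scaleKernel(data.data(), static_cast<int>(data.size()), divisor, &errorFlag);
    if (errorFlag != 0) {
        throw std::runtime_error("kernel reported error " + std::to_string(errorFlag));
    }
}
```

The verbosity I complained about comes from the parts left out here: allocating the flag, copying it back, and deciding what happens when several threads report different errors.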

Maybe CUDA C++ could fork from the C version and provide an API with modern C++ features such as smart pointers, and replace the error-code system with a throw / try-catch one. I found a nice library, https://github.com/eyalroz/cuda-api-wrappers, that works as a C++ wrapper and delivers those nice C++ features, but is this project going to be maintained?
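
To make the smart-pointer part of that wish concrete, here is a rough sketch of the general RAII pattern. It is not taken from cuda-api-wrappers or from any NVIDIA API. `rawAlloc`, `rawFree`, `RawDeleter`, `Buffer` and `makeBuffer` are hypothetical names, and host `malloc` / `free` stand in for the device allocator so the code runs anywhere. `rawAlloc` only mimics the "status code plus out parameter" shape of calls like `cudaMalloc`.

```cpp
#include <cstddef>
#include <cstdlib>
#include <memory>
#include <new>

// Stand-ins for a C-style allocate/free pair (hypothetical, host memory only).
inline int rawAlloc(void** out, std::size_t bytes) {
    *out = std::malloc(bytes);
    return *out ? 0 : -1;  // 0 means success, as an assumption for this sketch
}
inline void rawFree(void* p) { std::free(p); }

// Deleter so that std::unique_ptr releases the buffer through the C API.
struct RawDeleter {
    void operator()(void* p) const noexcept { rawFree(p); }
};

// Owning handle for an array of trivially constructible T (no constructors run).
template <typename T>
using Buffer = std::unique_ptr<T[], RawDeleter>;

// Allocation failure becomes an exception; release happens automatically.
template <typename T>
Buffer<T> makeBuffer(std::size_t count) {
    void* p = nullptr;
    if (rawAlloc(&p, count * sizeof(T)) != 0) {
        throw std::bad_alloc();
    }
    return Buffer<T>(static_cast<T*>(p));
}
```

With something like `auto buf = makeBuffer<float>(1024);` the memory is freed when `buf` goes out of scope, including when an exception unwinds the stack.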

Speaking of IDEs, a good one can be among the best tools for tracking down, debugging, or catching errors, so I tried Nsight Eclipse Edition from the latest CUDA 10.1 toolkit release. I am using CLion (the JetBrains IDE for C++), but it lacks CUDA understanding / code insight / kernel debugging, so I gave Nsight a try.
In Ubuntu 18.06, Nsight crashes by default because of an incompatible Java JDK version. I had to downgrade the JDK from 11 to 8 using “update-alternatives” to get the IDE working.
Then code inspection seems to need a makefile, so I just ran “cmake .” and off we went. Then I wanted a dark theme. The default one turned out to be broken, so I tried to fix it with a plugin, but installing the plugin failed with an error and I gave up there.

This is my little CUDA experience so far. I really need to improve, but I lack the time to do things properly.

If your aim is only to throw an exception on errors, then you do not need to wrap the calls in a macro; just wrap them in a regular C++ function. Your IDE will keep syntax highlighting and code completion working inside function calls, so there is no need to switch away from CLion.
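
As a rough sketch of that idea in standard C++ (not taken from any library): `ApiError`, `throwOnError` and `fakeLibraryCall` are hypothetical names, and the "negative means failure" convention is borrowed from the FFMPEG macro in the first post. For the CUDA runtime the comparison would be against `cudaSuccess` instead.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical exception type that keeps the raw status code.
class ApiError : public std::runtime_error {
public:
    explicit ApiError(int code)
        : std::runtime_error("API call failed with code " + std::to_string(code)),
          code_(code) {}
    int code() const noexcept { return code_; }

private:
    int code_;
};

// Ordinary function instead of a macro: the IDE sees a normal call expression.
// Assumption: negative values signal failure.
inline int throwOnError(int status) {
    if (status < 0) {
        throw ApiError(status);
    }
    return status;  // pass non-error results through to the caller
}

// Stand-in for a C library call, for illustration only.
inline int fakeLibraryCall(bool succeed) { return succeed ? 0 : -22; }
```

A call site then reads `throwOnError(fakeLibraryCall(true));`, with the wrapped call fully visible to highlighting and completion.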

The macro is only needed to capture line number information for use in error messages or exceptions, without requiring symbol information in the build. If you can do without the function name, file name, and line number in your error message, you are fine. And that actually has nothing to do with CUDA; it is the same as with any other C++ code.

You can also make a separate macro that provides this information and pass that as a second argument into your error checking function. This gives you the best of both approaches (highlighting and code completion inside the IDE and detailed info in error messages), but increases the verbosity of the source code somewhat.
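
A minimal sketch of that split, again with hypothetical names (`SourceLocation`, `HERE`, `checkStatus`) and the same assumed "negative means failure" convention. The macro does nothing except record where it was written, and the check itself stays a regular function.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical value type describing where a check was written.
struct SourceLocation {
    const char* file;
    int line;
    const char* function;
};

// The only macro: it expands at the call site, so it records the caller's position.
// It must be used inside a function body because of __func__.
#define HERE (SourceLocation{__FILE__, __LINE__, __func__})

// Regular function doing the actual check and building the message.
inline void checkStatus(int status, const SourceLocation& where) {
    if (status < 0) {
        throw std::runtime_error(std::string(where.file) + ":" +
                                 std::to_string(where.line) + " in " +
                                 where.function + ": error code " +
                                 std::to_string(status));
    }
}
```

A call site looks like `checkStatus(someCall(args), HERE);`, where `someCall` is whatever API function is being checked. The extra `HERE` argument is the added verbosity mentioned above. If C++20 is available, a defaulted `std::source_location::current()` parameter can play the same role without a macro.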

Other approaches are possible too, like dumping only the program counter in the error message and keeping the debug symbols separate from the executable, to be recombined outside of your program to produce a full error message.

The wrapper library you found is the one I alluded to in my post #20.

Apparently, syntax highlighting inside macros is a new feature in CLion 2019.1, based on clangd. This would be welcome news and would eliminate the trade-off between verbosity and syntax highlighting quality mentioned above.
However, I haven’t invested the time yet to get it working for me. And again, this is nothing CUDA-specific.

Author of cuda-api-wrappers here…

I just noticed this post, and the answer is yes! I am definitely maintaining the library - and actually doing work to improve and expand it. (Of course I wish nVIDIA would officially recognize/adopt it…)

@tera : I would have plugged sooner but the notification didn’t reach me somehow :-(

Apologies @epk, I think notifications have only been added in the new forum software, so my post was a bit early for that (but I might be wrong, can’t go back and check the old forum software anymore obviously).

There is hope for a new C++ API directly from Nvidia. (I wanted to get that out before I get access to the preview and might not be able to talk about it anymore).

@tera: I wonder why they can’t either adopt mine officially, or let me work for them on a version of it which has the same “level of access” to the underlying layers as the runtime API itself. Also - can you focus your link a bit? What on that page talks about that API?