Limitations of CUDART C++ API kernel execution control functions

How can I launch a kernel directly from a CPP file using the kernel function pointer?


I’ve created a wonderful kernel launching class which uses the CUDA runtime API kernel execution control functions. Since my group is writing kernels for a large-scale, complex application, this class has two primary purposes:

  1. Launch kernels directly from CPP files compiled with standard compilers


    Only kernels and device functions need be in CU files and compiled with NVCC

    The application GPU task manager has total control and knowledge of kernel execution


  2. Encapsulate the relatively complex, application-specific kernel wrapper interface


    Eliminates the need to write kernel wrappers in the CU files which redundantly repeat the application wrapper interface

    Isolates the kernel wrapper interface from the kernel library

    Eliminates the need for GPU algorithm developers to understand the kernel wrapper interface


To launch a kernel, I use the following runtime API calls (CUDA 4.0):



call cudaConfigureCall (C API)

call cudaSetupArgument (C API)

call cudaLaunch (C or C++ API)

sometimes call cudaFuncGetAttributes (C or C++ API) to create a basic kernel informational report.
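For readers unfamiliar with this launch sequence, the part that needs the most care is the offset passed to each cudaSetupArgument call, since every argument has to be placed at a suitably aligned position. Below is a minimal sketch of that bookkeeping, not my actual class. KernelArgPacker and its members are hypothetical names, and the sketch assumes that host alignof(T) matches the alignment the device expects, which may not hold for specially aligned vector types. The CUDA calls appear only in comments so that the sketch compiles without the toolkit.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical helper (not part of CUDART): copies kernel arguments into a
// staging buffer and records the offset and size that each one would be
// given in a cudaSetupArgument(arg, size, offset) call.
class KernelArgPacker {
public:
    struct Slot {
        std::size_t offset;  // byte offset of the argument in the buffer
        std::size_t size;    // size of the argument in bytes
    };

    // Append one argument, rounding its offset up to alignof(T).
    // Assumption: host alignment equals the device's requirement for T.
    template <typename T>
    std::size_t add(const T& value) {
        const std::size_t align = alignof(T);
        const std::size_t offset = (buffer_.size() + align - 1) / align * align;
        buffer_.resize(offset + sizeof(T));  // padding bytes are zero-filled
        std::memcpy(buffer_.data() + offset, &value, sizeof(T));
        slots_.push_back(Slot{offset, sizeof(T)});
        return offset;
    }

    const std::vector<Slot>& slots() const { return slots_; }
    const unsigned char* data() const { return buffer_.data(); }
    std::size_t size() const { return buffer_.size(); }

private:
    std::vector<unsigned char> buffer_;
    std::vector<Slot> slots_;
};

// With CUDA 4.0 the launch would then look roughly like this:
//   cudaConfigureCall(grid, block, sharedMemBytes, stream);
//   for (const auto& s : packer.slots())
//       cudaSetupArgument(packer.data() + s.offset, s.size, s.offset);
//   cudaLaunch(kernelName);  // const char* entry in the C API
```

Packing an int, a char, and a double in that order puts the int at offset 0, the char directly after it, and the double at the next offset that is a multiple of alignof(double).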


The problem is this…

The CUDART C API kernel launch functions accept only “const char*” kernel name arguments. This limits execution to kernels that are declared as ‘extern “C”’, since otherwise the fully qualified mangled C++ name must be used, which is not realistic for a large-scale, multi-platform application.
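To show what such a name looks like, here is a small sketch that goes in the opposite direction and demangles one. It assumes a hypothetical kernel with the signature MyKernel(float*, int); under the Itanium C++ ABI used by GCC and Clang, that function's name becomes _Z8MyKernelPfi. The demangle helper is my own illustrative wrapper around abi::__cxa_demangle, which is specific to GCC and Clang. Other ABIs, such as the one used by MSVC, follow a different scheme, which is exactly why hard-coding these strings does not travel across platforms.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>
#include <cxxabi.h>  // GCC/Clang only (Itanium C++ ABI); not available with MSVC

// Illustrative helper: returns the readable form of an Itanium-ABI mangled
// name, or an empty string if the name cannot be demangled.
std::string demangle(const char* mangled) {
    int status = 0;
    // __cxa_demangle allocates the result with malloc; the caller frees it.
    char* text = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    std::string result = (status == 0 && text != nullptr) ? text : "";
    std::free(text);
    return result;
}
```

For example, demangle("_Z8MyKernelPfi") yields "MyKernel(float*, int)". A kernel declared extern "C" would instead be exported under its plain name, MyKernel, at the cost of losing overloading, namespaces, and templates.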

The CUDART C++ API offers a solution; the kernel execution functions are templated to use kernel function pointers instead. Unfortunately, the C++ API kernel execution functions are only available when compiled with NVCC, though many other C++ API functions do not suffer this restriction. The only good documentation I found for this was in the comment header of the cuda_runtime_api.h file.

For my purposes, this makes the C++ API versions of these functions essentially useless. I will still have to write a wrapper within the kernel CU file that either calls the kernel launcher or, more simply, uses the MyKernel<<<…>>>(…) syntax. I’m hoping to find a workaround…

Here are my questions:


Is there a way to use the C++ API kernel execution functions from a CPP file compiled with a general purpose compiler?

Is there a portable programmatic means to get the mangled name of a kernel function given either the nominal name or the kernel function pointer?

Is there any chance that future versions of CUDA will overcome the NVCC compiler limitation in the CUDART C++ API kernel launch functions?