How to force the new CUDA driver (9.2) to keep using the old cudaLaunch instead of the "new" cudaLaunchKernel

marceloamaral · April 2, 2018, 2:43pm

I have recently updated the machines in my cluster to the new CUDA driver 9.2.
However, the new driver changed how launching CUDA kernels works, as noted here: https://developer.nvidia.com/cuda-toolkit/whatsnew

My app uses the syntax kernelname<<<blocks, threads, 0, stream>>>(param0, param1, etc…);
Before (up to CUDA 9.1), this call was translated by the driver into other calls: cudaConfigureCall(), cudaSetupArgument() (once per parameter), and finally cudaLaunch().

However, it is now translated directly to cudaLaunchKernel(), which receives only an args pointer (void **) to where the parameters sit in memory.

I am writing a wrapper that intercepts CUDA calls and adds some special features. Since cudaLaunchKernel() receives only a pointer to where the parameters are, it is hard to determine the sizes and types of the parameters.
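
To make the difference concrete, here is a minimal host-only sketch in plain C++ (no CUDA toolkit needed) that models what an interceptor sees under each protocol. The old path mirrors the shape of cudaSetupArgument(const void* arg, size_t size, size_t offset), where size and offset arrive with every argument. The new path mirrors the `void** args` of cudaLaunchKernel, where only one pointer per parameter is visible. All names below (align_up, OldStyleCapture, capture_new_style, demo_buffers_match) and the param_sizes table are hypothetical and exist only for illustration; the alignment rule is a simplification that holds for scalars and pointers, not for arbitrary structs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Round `offset` up to the next multiple of `alignment`.
inline std::size_t align_up(std::size_t offset, std::size_t alignment) {
    assert(alignment != 0);
    return (offset + alignment - 1) / alignment * alignment;
}

// Old protocol: every argument arrives together with its size and offset,
// mirroring cudaSetupArgument(const void* arg, size_t size, size_t offset).
struct OldStyleCapture {
    std::vector<unsigned char> buffer;
    void setup_argument(const void* arg, std::size_t size, std::size_t offset) {
        if (buffer.size() < offset + size) buffer.resize(offset + size);
        std::memcpy(buffer.data() + offset, arg, size);
    }
};

// New protocol: the interceptor sees only `void** args` (one pointer per
// parameter). `param_sizes` is HYPOTHETICAL side information that the
// wrapper must obtain elsewhere; the launch call itself does not carry it.
// Simplification: each parameter is aligned to its own size, which holds
// for scalars and pointers but not for arbitrary structs.
std::vector<unsigned char> capture_new_style(
        void** args, const std::vector<std::size_t>& param_sizes) {
    std::vector<unsigned char> buffer;
    std::size_t offset = 0;
    for (std::size_t i = 0; i < param_sizes.size(); ++i) {
        const std::size_t size = param_sizes[i];
        offset = align_up(offset, size);
        buffer.resize(offset + size);  // newly added bytes are zero-filled
        std::memcpy(buffer.data() + offset, args[i], size);
        offset += size;
    }
    return buffer;
}

// Demo for a kernel shaped like: kernel(float* data, int n, double scale).
// Returns true when both capture paths produce the same parameter buffer.
bool demo_buffers_match() {
    float* data = nullptr;
    int n = 1024;
    double scale = 0.5;

    const std::size_t off_n = align_up(sizeof data, sizeof n);
    const std::size_t off_scale = align_up(off_n + sizeof n, sizeof scale);

    OldStyleCapture old_capture;
    old_capture.setup_argument(&data, sizeof data, 0);
    old_capture.setup_argument(&n, sizeof n, off_n);
    old_capture.setup_argument(&scale, sizeof scale, off_scale);

    void* args[] = { &data, &n, &scale };
    const std::vector<std::size_t> sizes = { sizeof data, sizeof n, sizeof scale };
    return capture_new_style(args, sizes) == old_capture.buffer;
}
```

Calling demo_buffers_match() from a main() shows that the two paths agree once the sizes are known. Without the sizes table, capture_new_style cannot tell how many entries args has or how many bytes to copy from each, which is exactly the problem described above.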

Is there any way to force the driver to keep using the old cudaLaunch style?
I mean, without changing the applications, and while keeping the same launch syntax as before…

cbuchner1 · April 6, 2018, 9:22am

Did you obtain a 9.2 prerelease through an early access program? As far as I know, there has not been an official release yet, so likely not many people outside of NVIDIA will know any details at this point.

I think the translation of the <<< >>> kernel launch syntax into driver API calls is handled by the CUDA runtime library (libcudart) together with the nvcc toolchain, which generates the necessary glue code for calling said library.

You could check whether the nvcc toolchain has a switch that modifies how kernel launch arguments are handled. If you have the new toolchain, you also have the documentation ;)

whuala · April 30, 2019, 7:29am

I cannot find a way to force the new compiler to keep using the deprecated APIs. But it would be helpful if I could get the parameter size info.

An answer on Stack Overflow ("How does cudaLaunchKernel know the array size of "void **args"?") says that such information can be retrieved directly from the kernel’s image. What is a kernel image? How can I get the parameter size info out of it?
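
As a rough pointer rather than a definitive answer: the kernel image is the compiled device code (cubin/fatbin) embedded in the executable, and `cuobjdump -elf` can dump its `.nv.info` sections, which include EIATTR_KPARAM_INFO entries carrying an ordinal, offset, and size per kernel parameter. The sketch below is plain C++ that extracts those three fields from such a text dump. The sample listing is hand-written to resemble that output and is an assumption; the real format is not a documented interface and may differ between toolkit versions. The names KernelParam, parse_kparam_info, and sample_listing are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// One kernel parameter as described by a KPARAM_INFO-style entry.
struct KernelParam {
    unsigned ordinal;    // position in the kernel's parameter list
    std::size_t offset;  // byte offset inside the parameter buffer
    std::size_t size;    // size of the parameter in bytes
};

// Extract "Ordinal : 0x.. Offset : 0x.. Size : 0x.." triples from a text
// dump and return them sorted by ordinal. The field layout matched here is
// an ASSUMPTION about what the dump looks like.
std::vector<KernelParam> parse_kparam_info(const std::string& text) {
    static const std::regex re(
        R"(Ordinal\s*:\s*(0x[0-9a-fA-F]+)\s+Offset\s*:\s*(0x[0-9a-fA-F]+)\s+Size\s*:\s*(0x[0-9a-fA-F]+))");
    std::vector<KernelParam> params;
    for (auto it = std::sregex_iterator(text.begin(), text.end(), re);
         it != std::sregex_iterator(); ++it) {
        const std::smatch& m = *it;
        params.push_back({
            static_cast<unsigned>(std::stoul(m[1].str(), nullptr, 16)),
            std::stoul(m[2].str(), nullptr, 16),
            std::stoul(m[3].str(), nullptr, 16)});
    }
    std::sort(params.begin(), params.end(),
              [](const KernelParam& a, const KernelParam& b) {
                  return a.ordinal < b.ordinal;
              });
    return params;
}

// Hand-written sample resembling a dump for kernel(float* data, int n).
// This is illustrative text, not captured tool output.
std::string sample_listing() {
    return
        "Attribute:\tEIATTR_KPARAM_INFO\n"
        "Format:\tEIFMT_SVAL\n"
        "Value:\tIndex : 0x0\tOrdinal : 0x1\tOffset  : 0x8\tSize    : 0x4\n"
        "Attribute:\tEIATTR_KPARAM_INFO\n"
        "Format:\tEIFMT_SVAL\n"
        "Value:\tIndex : 0x0\tOrdinal : 0x0\tOffset  : 0x0\tSize    : 0x8\n";
}
```

A wrapper could run the dump once per binary, feed the text to parse_kparam_info, and cache the resulting sizes per kernel so that an intercepted cudaLaunchKernel call knows how many entries args has and how many bytes each one points to.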