Using driver API to launch kernels: a way to avoid external cubins?

James_Malcolm1 · August 10, 2008, 6:23am

The runtime API makes it really easy to launch kernels with that <<<…>>> construct when those kernels are included in your .cu file. But I’m forced to use CUstream objects from the driver API, and these can’t be used with the <<<…>>> construct.

Is there a way to use the driver API to launch a kernel that’s right there in my .cu file without going through the hassle of loading it via an external cubin? Or does anyone know a way to re-package those CUstream objects as cudaStream_t (which driver_types.h says is an ‘int’)?

If the runtime is simply masking this whole process, then how does nvcc nicely hide the .cubin code in the .o files? I’d rather not have to mess with external cubins.

James_Malcolm1 · August 11, 2008, 2:34pm

We have one portion of our code that uses cubins created on the fly, so it requires the driver API to load cubins. We also have regular `__global__` kernels throughout that we launch with the runtime.

In the latest iteration of our project, we have introduced multiple worker threads, each with its own stream, so workers can block waiting for results from each other.

Problem: the driver API uses CUstream while the runtime API uses cudaStream_t.

What do you suggest? It’s advised to stick to one API, so is there any way within the runtime to load a cubin created on the fly? Or is there any way from the driver API to run a `__global__` kernel already compiled into the main program?

AndreiB · August 11, 2008, 8:29pm

I’m using cuModuleLoadData() to load the contents of a .cubin file into the GPU context. There’s also cuModuleLoad(), which I believe accepts the name of a cubin file and loads it.

When using the runtime API, cubin files are embedded in the .obj files, so I guess you can find them there and use them with the driver API.
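
For reference, here is a minimal sketch in C of the file-to-memory route described above. The helper names `read_image` and `load_kernel` and the `USE_CUDA_DRIVER` build flag are made up for this illustration; only `cuModuleLoadData()`, `cuModuleGetFunction()` and `cuModuleUnload()` are driver API calls. The driver part assumes `cuInit()` has already run and a context is current on the calling thread, and it is only compiled when the flag is defined (linking with `-lcuda`).

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#ifdef USE_CUDA_DRIVER   /* hypothetical flag: build with -DUSE_CUDA_DRIVER -lcuda */
#include <cuda.h>
#endif

/* Read a whole file into a freshly allocated, NUL-terminated buffer.
   Returns NULL on failure; the caller frees the result. */
char *read_image(const char *path, size_t *size_out)
{
    assert(path != NULL);
    FILE *f = fopen(path, "rb");
    if (f == NULL) return NULL;

    char *buf = NULL;
    long len = -1;
    if (fseek(f, 0, SEEK_END) == 0) len = ftell(f);
    if (len >= 0 && fseek(f, 0, SEEK_SET) == 0) {
        buf = malloc((size_t)len + 1);
        if (buf != NULL && fread(buf, 1, (size_t)len, f) != (size_t)len) {
            free(buf);
            buf = NULL;
        }
    }
    fclose(f);
    if (buf != NULL) {
        buf[len] = '\0';
        if (size_out != NULL) *size_out = (size_t)len;
    }
    return buf;
}

#ifdef USE_CUDA_DRIVER
/* Load the image into the current context and look up one kernel by name.
   Assumes cuInit() was called and a context is current on this thread. */
int load_kernel(const char *path, const char *kernel_name,
                CUmodule *module, CUfunction *kernel)
{
    char *image = read_image(path, NULL);
    if (image == NULL) return -1;

    CUresult rc = cuModuleLoadData(module, image);
    free(image);  /* assumption: the driver copies the image while loading */
    if (rc != CUDA_SUCCESS) return -1;

    if (cuModuleGetFunction(kernel, *module, kernel_name) != CUDA_SUCCESS) {
        cuModuleUnload(*module);
        return -1;
    }
    return 0;
}
#endif
```

The `CUfunction` obtained this way can then be launched on a `CUstream` with the driver API's launch calls, which is what the original question was after. The same `cuModuleLoadData()` call works for any in-memory image, wherever the bytes come from.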

James_Malcolm1 · August 14, 2008, 1:26am

Right now we are using cuModuleLoad() to load external .cubin files, and it works quite well. Has anyone had any success using cuModuleLoadData() to load from within the executable’s resources? Are there any examples out there of using cuModuleLoadData() at all?

AndreiB · August 14, 2008, 4:17am

And what’s the problem?

You need to use the FindResource() / LoadResource() / LockResource() functions and pass the pointer returned by LockResource() to cuModuleLoadData().

James_Malcolm1 · August 14, 2008, 5:19am

Thanks for the tips. Any idea where to start looking for similar functionality on Mac and Linux?

AndreiB · August 14, 2008, 5:50am

I use the following approach in Win32:
The .cubin file is compressed and converted to a C header file (.h) at compile time. At runtime the program decompresses it and passes the resulting string to cuModuleLoadData().
Results:
a. Smaller executable size
b. The cubin is not present in the executable module as plaintext

The same approach can easily be applied on Linux or Mac, I think.
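
A rough sketch of the "convert to a C header" step, in portable C so it applies equally on Linux and Mac. The function name `write_c_array` and the array name `kernel_image` are invented for this example, and the compression step is left out because the post does not say which compressor is used; with compression, the bytes would be compressed before being written out and decompressed at runtime before the call to cuModuleLoadData().

```c
#include <assert.h>
#include <stdio.h>

/* Write `data` as a C array definition named `name`, with a trailing 0x00
   so the embedded image can also be treated as a NUL-terminated string.
   Returns 0 on success, -1 on an output error. */
int write_c_array(FILE *out, const char *name,
                  const unsigned char *data, size_t n)
{
    assert(out != NULL && name != NULL && (data != NULL || n == 0));

    fprintf(out, "static const unsigned char %s[] = {\n", name);
    for (size_t i = 0; i < n; i++) {
        /* 12 bytes per line keeps the generated header readable */
        fprintf(out, "%s0x%02x,", (i % 12 == 0) ? "    " : " ", data[i]);
        if (i % 12 == 11 || i + 1 == n) fputc('\n', out);
    }
    fprintf(out, "    0x00\n};\n");
    return ferror(out) ? -1 : 0;
}
```

A small build-time tool would read the .cubin file and call this function to produce, say, `kernel_image.h`. The application then includes that header and passes `kernel_image` straight to cuModuleLoadData(), so no external file is needed at runtime.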

James_Malcolm1 · August 14, 2008, 6:48pm

Brilliantly simple! I feel guilty that I didn’t think of that.
