cuModuleLoadData with PTX

Good afternoon,

I have compiled some CUDA code to PTX and would like to use this PTX from a HOST-side function. As an example, I have compiled PTX code containing a kernel that adds a vector, declared as extern "C" __global__ void KernVec(float *vec, float *result). I would then like to call this kernel from inside my HOST (CPP) function named, for example, float myVectorAdd(float *vec). I know this is a trivial use of the GPU, but I am just trying to learn this functionality, if it exists.

I have found some information about cuModuleLoad and cuModuleGetFunction, which take the PTX filename and the PTX function name respectively, to create a module and look up a kernel in it. However, I would like to pass the entire PTX text (not just the filename) to create a module that I can then use in my CPP (HOST side) code. Is this possible? If so, can anyone provide a simple example of how this can be done?

Many thanks to anyone with some hint/help on how this can be accomplished.

The vectorAddDrv CUDA sample code should be a good general resource/example for this. I’m a little bit unclear on this:

A typical methodology would be:

  1. Enter your ptx code into a file.
  2. In a different file, from your CPP function, load that file (i.e. the PTX code) you created in step 1 using the driver API functions.

What is the problem with using a filename? How else would you reference the file you created in step 1?


Thank you @Robert_Crovella for the fast response.

There is no problem with using the filename per se; there just exists some code that already reads the PTX into a single string. So there’s not much need to use just the filename, given that I already have the entire PTX. I would like to go ahead and create a module, or whatever is used to call PTX code, from my HOST side CPP file, if possible.

I will look into the vectorAddDrv CUDA sample code. Thanks again.

The vectorAddDrv sample is the “canonical” approach. If that doesn’t do what you want, please explain what you’d like to do instead. (Sorry if I’m being dense, I still don’t quite get it. I may need it explained to me like I am 5 years old.)

I followed the approach in the vectorAddDrv CUDA sample, as suggested by @Robert_Crovella, and it all works. I didn’t use the code that reads the entire PTX into a string; I just used the filename, as the CUDA sample does.

Thanks again @Robert_Crovella for the assist.