How to add to or override an existing CUDA API?


I was playing around with the CUDA APIs and tried to write my own implementation of:

cudaError_t cudaMalloc (void ** devPtr, size_t size)

But, as expected, nvcc reports an error: "(.text+0x41d60): multiple definition of `cudaMalloc'".

Is there a way to write my own implementation of the CUDA library?

Thank you

Injecting your own implementation of API calls without modifying existing binaries can be done (on most Linux/Unix systems) through LD_PRELOAD.
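As a rough illustration of the LD_PRELOAD approach, here is a minimal sketch of a shim that logs each cudaMalloc call and then forwards it to the real runtime via dlsym(RTLD_NEXT, ...). Several things here are assumptions rather than anything from this thread: the cudaError_t typedef is a stand-in so the file compiles without the CUDA toolkit (with the toolkit you would include <cuda_runtime_api.h> instead), and SHIM_ERROR_UNKNOWN and shim_calls are hypothetical names. It also assumes the application links the CUDA runtime dynamically; nvcc links the static runtime by default, so the application would likely need to be built with --cudart shared for the preload to take effect.

```c
#define _GNU_SOURCE            /* exposes RTLD_NEXT in <dlfcn.h> on glibc */
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>
#include <stdio.h>

/* Stand-in so the sketch compiles without the CUDA toolkit (assumption).
 * With the toolkit installed, include <cuda_runtime_api.h> instead. */
typedef int cudaError_t;
#define SHIM_ERROR_UNKNOWN 999   /* hypothetical; mirrors cudaErrorUnknown */

typedef cudaError_t (*cudaMalloc_fn)(void **, size_t);

static unsigned long shim_calls = 0;   /* hypothetical counter, for illustration */

/* Same name and signature as the runtime's function, so the dynamic linker
 * resolves the application's calls to this definition first. */
cudaError_t cudaMalloc(void **devPtr, size_t size)
{
    static cudaMalloc_fn real_cudaMalloc = NULL;

    assert(devPtr != NULL);
    ++shim_calls;
    fprintf(stderr, "[shim] cudaMalloc(%zu bytes)\n", size);

    if (real_cudaMalloc == NULL) {
        /* Look up the next definition of the symbol in library search order. */
        real_cudaMalloc = (cudaMalloc_fn)dlsym(RTLD_NEXT, "cudaMalloc");
    }
    if (real_cudaMalloc == NULL) {
        /* No shared CUDA runtime is loaded: report failure instead of crashing. */
        *devPtr = NULL;
        return SHIM_ERROR_UNKNOWN;
    }
    return real_cudaMalloc(devPtr, size);
}
```

Assuming the file is saved as shim.c (a hypothetical name), it could be built with the host compiler, e.g. "gcc -shared -fPIC shim.c -o libshim.so -ldl", and used as "LD_PRELOAD=./libshim.so ./app". Building the shim with plain gcc rather than nvcc should also avoid pulling libcudart_static.a into the shared object, which is the likely source of a "multiple definition" link error.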

If you have the source code of the program, you can instead use a global preprocessor macro to reroute specific function calls to a differently named version of the call, e.g.

#define malloc(x) my_malloc(x)
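A minimal sketch of the macro approach, assuming a wrapper named my_malloc as in the line above (the counter my_malloc_calls and the helper make_buffer are hypothetical, added only for illustration). The one subtlety is that the wrapper itself must still reach the real malloc; writing the name in parentheses prevents the function-like macro from expanding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

static size_t my_malloc_calls = 0;   /* hypothetical counter, for illustration */

/* Replacement that logs the request and then calls the real allocator. */
static void *my_malloc(size_t size)
{
    ++my_malloc_calls;
    fprintf(stderr, "my_malloc(%zu)\n", size);
    /* A parenthesised name is not expanded as a function-like macro,
     * so this always reaches the real malloc. */
    return (malloc)(size);
}

/* From here on, every malloc(x) in this translation unit is rerouted. */
#define malloc(x) my_malloc(x)

/* Ordinary code that "calls malloc" and is rerouted without being edited. */
static void *make_buffer(size_t n)
{
    void *p = malloc(n);   /* expands to my_malloc(n) */
    assert(p != NULL);     /* sketch only: real code should handle failure */
    return p;
}
```

In practice the macro would usually live in a header that is force-included for the whole project, so that existing source files do not need to change. The same pattern would apply to a CUDA call such as cudaMalloc, with the caveat that it only affects code you recompile.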

Is there a way to write my own implementation of the CUDA library?

If you want to replace the entire library, you have to provide all public functions of the original CUDA device API. With the LD_PRELOAD trick you can override just selected functions and leave the rest to the original library.

Hi cbuchner1,

I tried the trick for the `cudaMalloc` API and got this error:

nvcc --shared -o --compiler-options '-fPIC'

/usr/local/cuda/bin/…//lib64/libcudart_static.a(libcudart_static.a.o): In function `cudaMalloc':
(.text+0x41d60): multiple definition of `cudaMalloc'
/tmp/tmpxft_00005994_00000000-10_lib.o:tmpxft_00005994_00000000-5_lib.cudafe1.cpp:(.text+0x15): first defined here
collect2: error: ld returned 1 exit status

which means nvcc already links in an implementation of cudaMalloc (from libcudart_static.a), so the link step fails. Can this be resolved?

I also tried the command below and got the same error:

nvcc --shared -o --compiler-options '-fPIC' -ldl