Is there a way to call a (large) CUDA function from Python?

I have been trying to call a CUDA function from Python following the CUDA Python tutorial here: Overview - CUDA Python 11.6.0 documentation.

The example looks like:

from cuda import cuda, nvrtc
import numpy as np

# the CUDA kernel source to be compiled
saxpy = """\
extern "C" __global__
void saxpy(float a, float *x, float *y, float *out, size_t n)
{
 size_t tid = blockIdx.x * blockDim.x + threadIdx.x;
 if (tid < n) {
   out[tid] = a * x[tid] + y[tid];
 }
}
"""

# Create program
err, prog = nvrtc.nvrtcCreateProgram(str.encode(saxpy), b"", 0, [], [])

# Compile program
opts = [b"--fmad=false", b"--gpu-architecture=compute_75"]
err, = nvrtc.nvrtcCompileProgram(prog, 2, opts)

# Get PTX from compilation
err, ptxSize = nvrtc.nvrtcGetPTXSize(prog)
ptx = b" " * ptxSize
err, = nvrtc.nvrtcGetPTX(prog, ptx)

# ... more 

However, the CUDA function I am trying to call belongs to a large project that spans several .cu/.cuh files, which makes it impractical to write out as a string the way the tutorial example does.

Is there a way to call the CUDA function more conveniently, or should I copy and paste all those .cu/.cuh files into a string as the example does?

How about creating Python bindings for this large project?

Thank you for the advice! Would you mind elaborating on how I could do that?

Sorry, I have never done this myself, so I cannot help you with it. Maybe this helps: Intro — pybind11 documentation


It should be possible to read a text file into a Python string. This doesn't have anything to do with CUDA, so I would just suggest googling for examples.
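To illustrate, here is a minimal sketch of reading a .cu file into a Python string before handing it to NVRTC. The file path and the stand-in kernel written to disk here are placeholders for illustration; in your project you would point `open()` at your existing source file.

```python
import os
import tempfile

# In a real project this .cu file already exists; we write a tiny
# stand-in kernel so the example is self-contained and runnable.
kernel_source = """extern "C" __global__
void saxpy(float a, float *x, float *y, float *out, size_t n)
{
 size_t tid = blockIdx.x * blockDim.x + threadIdx.x;
 if (tid < n) {
   out[tid] = a * x[tid] + y[tid];
 }
}
"""

src_dir = tempfile.mkdtemp()
src_path = os.path.join(src_dir, "saxpy.cu")  # placeholder path
with open(src_path, "w") as f:
    f.write(kernel_source)

# Read the file back into a Python string, exactly as you would with
# your project's real .cu file, then pass it to NVRTC as before:
with open(src_path, "r") as f:
    saxpy = f.read()

# err, prog = nvrtc.nvrtcCreateProgram(str.encode(saxpy), b"saxpy.cu", 0, [], [])
```

For a project split across many .cu/.cuh files, note that NVRTC compiles one translation unit at a time; another common approach is to compile the project offline with nvcc (e.g. `nvcc -ptx`) and load the resulting PTX through the driver API (`cuModuleLoad`), which avoids stitching sources into strings at all.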