Casting CUdeviceptr to floating point array for CUDA kernel

To use the driver API, you have to

#include <cuda.h>

and link against

-lcuda

There are various sample codes that demonstrate proper usage of the driver API, such as vectorAddDrv.

I feel this was already addressed by the statement about simple casting: a CUdeviceptr can be cast to an ordinary device pointer such as float *, and AFAIK the cast works in both directions. I'm not sure what is still unclear. Did you read the linked thread?
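For illustration, here is a minimal sketch of the cast in both directions. It assumes a 64-bit build, and it assumes a runtime API call is made first so that the primary context is current when the driver API calls run. The names scale and scaleOnDevice are made up for this example and do not come from the thread.

```cuda
// Sketch only: scale and scaleOnDevice are hypothetical names for illustration.
#include <cstdio>
#include <cuda.h>          // driver API: CUdeviceptr, cuMemAlloc, cuMemcpyHtoD, ...
#include <cuda_runtime.h>  // runtime API: cudaFree, kernel launch syntax

__global__ void scale(float *data, int n, float factor)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

// Multiplies n host floats by factor on the GPU; returns false on any CUDA error.
bool scaleOnDevice(float *host, int n, float factor)
{
    size_t bytes = n * sizeof(float);

    // Assumption: this runtime call sets up the primary context and makes it
    // current, so the driver API calls below operate in the same context.
    if (cudaFree(0) != cudaSuccess) return false;

    CUdeviceptr dptr = 0;
    if (cuMemAlloc(&dptr, bytes) != CUDA_SUCCESS) return false;

    bool ok = cuMemcpyHtoD(dptr, host, bytes) == CUDA_SUCCESS;
    if (ok) {
        // CUdeviceptr -> float*: a plain cast, the address is unchanged.
        float *d_data = reinterpret_cast<float *>(dptr);
        scale<<<(n + 127) / 128, 128>>>(d_data, n, factor);
        ok = cudaGetLastError() == cudaSuccess &&
             cudaDeviceSynchronize() == cudaSuccess;
        if (ok) {
            // float* -> CUdeviceptr: the same cast in the other direction.
            CUdeviceptr back = reinterpret_cast<CUdeviceptr>(d_data);
            ok = cuMemcpyDtoH(host, back, bytes) == CUDA_SUCCESS;
        }
    }
    cuMemFree(dptr);
    return ok;
}

int main()
{
    float values[4] = {1.0f, 2.0f, 3.0f, 4.0f};
    if (!scaleOnDevice(values, 4, 2.0f)) {
        printf("CUDA call failed\n");
        return 1;
    }
    printf("values[3] = %.1f\n", values[3]);
    return 0;
}
```

Because this mixes both APIs, it should build with something like nvcc example.cu -lcuda, where example.cu is a placeholder file name.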

If you post a simple example of what you actually tried here, the issue may become clearer.

TBH, I probably don’t understand the issue. You have shown what looks to be entirely host code and CUDA device code. You’ve shown no origin for the CUdeviceptr. Why not just use cudaMalloc/cudaMemcpy and an ordinary pointer?
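As a rough sketch of that alternative, the runtime API version needs no CUdeviceptr at all: cudaMalloc returns an ordinary float * that is passed straight to the kernel. The names addOne and addOneOnDevice are invented for this example.

```cuda
// Sketch only: addOne and addOneOnDevice are hypothetical names for illustration.
#include <cstdio>
#include <cuda_runtime.h>

__global__ void addOne(float *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] += 1.0f;
}

// Adds 1.0f to each of n host floats on the GPU; returns false on any CUDA error.
bool addOneOnDevice(float *host, int n)
{
    size_t bytes = n * sizeof(float);

    float *d_data = nullptr;  // ordinary pointer, no cast needed
    if (cudaMalloc(&d_data, bytes) != cudaSuccess) return false;

    bool ok = cudaMemcpy(d_data, host, bytes, cudaMemcpyHostToDevice) == cudaSuccess;
    if (ok) {
        addOne<<<(n + 127) / 128, 128>>>(d_data, n);
        // The blocking copy back also waits for the kernel to finish.
        ok = cudaGetLastError() == cudaSuccess &&
             cudaMemcpy(host, d_data, bytes, cudaMemcpyDeviceToHost) == cudaSuccess;
    }
    cudaFree(d_data);
    return ok;
}

int main()
{
    float values[4] = {0.0f, 1.0f, 2.0f, 3.0f};
    if (!addOneOnDevice(values, 4)) {
        printf("CUDA call failed\n");
        return 1;
    }
    printf("values[0] = %.1f\n", values[0]);
    return 0;
}
```

This version uses only the runtime API, so it does not need cuda.h or -lcuda.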
