Driver API and Runtime API interoperability

CUDA 3.0 Beta lists driver and runtime API interoperability as one of its features. However, CUDA 3.0 Beta shipped with the CUDA 2.3 documentation, so I don’t know how to access these new features. Does anyone know the syntax for using the driver and runtime APIs together? For example, I need to allocate data with the driver API, which returns a CUdeviceptr, and then pass that pointer to a runtime API function that expects a float*. How can I convert the CUdeviceptr into a float*?
Explicitly casting produces invalid pointers, which cause CUDA errors. I’m assuming there’s another way in CUDA 3.0 to convert between runtime API pointers and driver API CUdeviceptrs.

Thanks.

Nope, you should be able to just cast and have things work.

Unfortunately, this is not true, at least on Mac OS X 10.6 using the CUDA 3.0 Beta drivers and toolkit.

[codebox]#include <cuda.h>
#include <cutil.h>

__global__ void incr(int* vector, int length) {
    int globalIndex = blockIdx.x * blockDim.x + threadIdx.x;
    if (globalIndex < length) {
        vector[globalIndex] = vector[globalIndex] + 1;
    }
}

int main(void)
{
    int length = 1000;
    CUdeviceptr driverPtr;
    cuMemAlloc(&driverPtr, sizeof(int) * length);

    int* hostPtr = (int*)malloc(sizeof(int) * length);
    for(int i = 0; i < length; i++) {
        hostPtr[i] = i;
    }
    cuMemcpyHtoD(driverPtr, hostPtr, sizeof(int) * length);

    /*** ILL FATED CAST ***/
    int* runtimePtr = (int*)driverPtr;
    /**************************/

    int blocks = (length - 1)/256 + 1;
    incr<<<blocks, 256>>>(runtimePtr, length);
    CUDA_SAFE_THREAD_SYNC();
    return 0;
}[/codebox]

Yields an error:

Cuda error in file ‘driver.cu’ in line 29 : unspecified launch failure.

Does it work on another platform with CUDA 3.0 Beta?

You are not checking error codes; if you were, you would see that you have not properly initialized the driver API (cuInit, cuCtxCreate, …). If you start with a cudaFree(0) or similar runtime call that creates a context, your app will work properly. If you just start out with cuMemAlloc (and no error checking), you are going to get CUDA_ERROR_NOT_INITIALIZED for all subsequent calls.
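For reference, here is a minimal sketch of the same test with the context created up front and every call checked. Some assumptions: CHECK_DRV, CHECK_RT, and incrementViaInterop are hypothetical names made up for this example (they are not part of cutil or CUDA), and the code is written against a current toolkit, so it uses cudaDeviceSynchronize where the 3.0-era code would have used cudaThreadSynchronize. It should build with something like `nvcc interop.cu -lcuda`.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cstdint>
#include <cuda.h>          // driver API: cuMemAlloc, cuMemcpyHtoD, ...
#include <cuda_runtime.h>  // runtime API: cudaFree, cudaGetLastError, ...

// Hypothetical helper macros (not from cutil): abort on the first failing call.
#define CHECK_DRV(call)                                                  \
    do {                                                                 \
        CUresult r_ = (call);                                            \
        if (r_ != CUDA_SUCCESS) {                                        \
            fprintf(stderr, "%s:%d: driver API error %d\n",              \
                    __FILE__, __LINE__, (int)r_);                        \
            exit(EXIT_FAILURE);                                          \
        }                                                                \
    } while (0)

#define CHECK_RT(call)                                                   \
    do {                                                                 \
        cudaError_t e_ = (call);                                         \
        if (e_ != cudaSuccess) {                                         \
            fprintf(stderr, "%s:%d: runtime API error: %s\n",            \
                    __FILE__, __LINE__, cudaGetErrorString(e_));         \
            exit(EXIT_FAILURE);                                          \
        }                                                                \
    } while (0)

__global__ void incr(int* vector, int length) {
    int globalIndex = blockIdx.x * blockDim.x + threadIdx.x;
    if (globalIndex < length) {
        vector[globalIndex] += 1;
    }
}

// Hypothetical test routine: returns how many elements were NOT incremented.
int incrementViaInterop(int length) {
    size_t bytes = sizeof(int) * length;

    // Make a runtime call first so a context exists before any cu* call.
    CHECK_RT(cudaFree(0));

    // Allocate and fill the buffer through the driver API.
    CUdeviceptr driverPtr;
    CHECK_DRV(cuMemAlloc(&driverPtr, bytes));
    int* hostPtr = (int*)malloc(bytes);
    for (int i = 0; i < length; i++) {
        hostPtr[i] = i;
    }
    CHECK_DRV(cuMemcpyHtoD(driverPtr, hostPtr, bytes));

    // The cast itself: same device address, viewed as a runtime-style pointer.
    int* runtimePtr = (int*)(uintptr_t)driverPtr;

    int blocks = (length - 1) / 256 + 1;
    incr<<<blocks, 256>>>(runtimePtr, length);
    CHECK_RT(cudaGetLastError());       // catches launch configuration errors
    CHECK_RT(cudaDeviceSynchronize());  // catches errors raised during execution

    // Read the result back through the driver API and verify it.
    CHECK_DRV(cuMemcpyDtoH(hostPtr, driverPtr, bytes));
    int mismatches = 0;
    for (int i = 0; i < length; i++) {
        if (hostPtr[i] != i + 1) {
            mismatches++;
        }
    }

    free(hostPtr);
    CHECK_DRV(cuMemFree(driverPtr));
    return mismatches;
}

int main(void) {
    int mismatches = incrementViaInterop(1000);
    printf("mismatches: %d\n", mismatches);
    return mismatches == 0 ? 0 : 1;
}
```

If you comment out the cudaFree(0) line, the first CHECK_DRV should stop the program right at cuMemAlloc instead of letting the failure surface later as a launch error.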

I feel sheepish. Thanks for the pointer. Interoperability seems to be working great, thanks!