CUDA from Mental Ray: Calling CUDA code from Mental Ray

Hi all,

Has anyone managed to run CUDA from Mental Ray? I am trying to speed up some ray-sphere / ray-box intersections by running them in CUDA from Mental Ray. However, I keep getting “unspecified launch failure” errors. They only occur at the point where I try to retrieve the results from the device. I also noticed that if I do not call my kernel at all (i.e. just upload and download the data), the whole operation is very slow. Could it be that Mental Ray is also accessing the GPU?

Here’s my host code:

void gpu_raysphere_intersection(int* iarray, int n, const float from[3], const float dir[3])
{
    // Device copies of the ray origin, ray direction and result array.
    float* dfrom;
    float* ddir;
    int* diarray;

    CUDA_SAFE_CALL(cudaMalloc((void**)&dfrom, 3*sizeof(float)));
    CUDA_SAFE_CALL(cudaMalloc((void**)&ddir, 3*sizeof(float)));
    CUDA_SAFE_CALL(cudaMalloc((void**)&diarray, n*sizeof(int)));

    // Upload the inputs.
    CUDA_SAFE_CALL(cudaMemcpy(dfrom, from, 3*sizeof(float), cudaMemcpyHostToDevice));
    CUDA_SAFE_CALL(cudaMemcpy(ddir, dir, 3*sizeof(float), cudaMemcpyHostToDevice));
    CUDA_SAFE_CALL(cudaMemcpy(diarray, iarray, n*sizeof(int), cudaMemcpyHostToDevice));

    // dsphere (the sphere data on the device) is not declared in this snippet.
    // Note that the launch always uses 512 threads, independent of n.
    gpu_raysphere_intersect_kernel<<<1, 512>>>(dsphere, diarray, dfrom, ddir);

    CUDA_SAFE_CALL(cudaThreadSynchronize());

    // Unspecified Launch Failure happens here.
    CUDA_SAFE_CALL(cudaMemcpy(iarray, diarray, n*sizeof(int), cudaMemcpyDeviceToHost));
}
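
To narrow down where the failure really comes from, something like the following might help. Kernel launches are asynchronous, so an error raised inside the kernel is often reported by a later call such as the copy back to the host. This is only a minimal sketch, not my actual code: mark_kernel, check, and the sizes are hypothetical stand-ins, and it uses cudaDeviceSynchronize, the newer replacement for cudaThreadSynchronize. It checks the launch and the execution separately and guards the kernel against threads beyond n.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical stand-in kernel; the real intersection kernel is not shown in this post.
__global__ void mark_kernel(int* out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)          // guard: threads past n must not touch out[]
        out[i] = i;
}

// Hypothetical helper: print the failing step and return false on error.
static bool check(cudaError_t err, const char* what)
{
    if (err != cudaSuccess) {
        fprintf(stderr, "%s: %s\n", what, cudaGetErrorString(err));
        return false;
    }
    return true;
}

int main()
{
    const int n = 1000;
    int host[n];
    int* dev = NULL;
    if (!check(cudaMalloc((void**)&dev, n * sizeof(int)), "cudaMalloc"))
        return 1;

    // Size the grid from n instead of hard-coding the thread count.
    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;
    mark_kernel<<<blocks, threads>>>(dev, n);

    // Check the launch itself, then the execution, then the copy back.
    bool ok = check(cudaGetLastError(), "kernel launch")
           && check(cudaDeviceSynchronize(), "kernel execution")
           && check(cudaMemcpy(host, dev, n * sizeof(int),
                               cudaMemcpyDeviceToHost), "cudaMemcpy");

    cudaFree(dev);
    return ok ? 0 : 1;
}
```

If the error shows up at the "kernel execution" step rather than at the copy, the problem would be inside the kernel (for example an out-of-range access) and not in the transfer itself.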



Thanks guys.