Hello all,
I’m struggling with a small piece of code. I’m using CUDA with MATLAB on Vista64 Ultimate with an NVIDIA Tesla D870.
I first make sure that the device I’m working on is the Tesla (device 0) and not the internal 8600GT (device 1), and then allocate device memory for a 3D matrix:
cudaSetDevice(0);
float *WBSresult;
errCuda = cudaMalloc((void **) &WBSresult, sizeof(float)*N*N*numberOfWindows);
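For reference, a minimal sketch in plain C (no CUDA calls) of computing the buffer size once, so that the same byte count can be passed to both the allocation and the later copy and compared against the memory available on the card. The helper name wbs_bytes and its parameter types are hypothetical and not part of my actual code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes needed for an n x n x numWindows array of float.
   Hypothetical helper: returns 0 if the size would overflow size_t. */
static size_t wbs_bytes(size_t n, size_t numWindows)
{
    size_t elems;

    /* Zero-sized arrays are excluded so that 0 can only mean overflow. */
    assert(n > 0 && numWindows > 0);

    if (n > SIZE_MAX / n) {
        return 0;                       /* n * n would overflow */
    }
    elems = n * n;
    if (elems > SIZE_MAX / numWindows) {
        return 0;                       /* n * n * numWindows would overflow */
    }
    elems *= numWindows;
    if (elems > SIZE_MAX / sizeof(float)) {
        return 0;                       /* byte count would overflow */
    }
    return elems * sizeof(float);
}
```

With N = 512 and numberOfWindows = 100, for example, this comes to 512*512*100 elements times sizeof(float) bytes.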
I then allocate host memory for the MATLAB output variable:
const mwSize dims[]={N,N,numberOfWindows};
plhs[0]=mxCreateNumericArray(3,dims,mxSINGLE_CLASS,mxREAL);
float *ar4;
ar4 = (float *) mxGetPr(plhs[0]);
I then run the kernel on the device:
ComputeWBS<<<32, 256>>>(WBSresult);
I should note that ComputeWBS is a very compute-intensive kernel: for large data files it runs hundreds of thousands of iterations and should take a few minutes to complete.
And I then copy the data from device to host:
errCuda = cudaMemcpy(ar4, WBSresult, sizeof(float)*N*N*numberOfWindows, cudaMemcpyDeviceToHost);
The problem: For small matrices this works fine. For large ones I get the infamous “Display Driver Stopped Responding” message.
First, I wonder why I even get this message when I’m not running my code on the card that’s connected to the display.
In this post and in others that I found on this forum the general recommendation is to use a GPU that’s not the primary one that’s connected to the display. That’s exactly what I did and I’m getting this error message.
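If the message comes from the Windows display-driver watchdog timing out a single long-running kernel launch (an assumption; nothing above confirms it), one workaround that is often suggested is to split the work into several shorter launches. Below is a minimal host-side sketch in plain C of the batching arithmetic only. The names IterRange, num_batches and batch_range are hypothetical, and it assumes ComputeWBS could be changed to process a given range of iterations per launch.

```c
#include <assert.h>
#include <stddef.h>

/* Half-open range [begin, end) of iterations handled by one launch. */
typedef struct {
    size_t begin;
    size_t end;
} IterRange;

/* Number of launches needed to cover totalIters iterations
   in batches of at most batchSize iterations each. */
static size_t num_batches(size_t totalIters, size_t batchSize)
{
    assert(batchSize > 0);
    return (totalIters + batchSize - 1) / batchSize;
}

/* Iteration range for batch b (0-based); the last batch may be shorter. */
static IterRange batch_range(size_t b, size_t totalIters, size_t batchSize)
{
    IterRange r;

    assert(batchSize > 0 && b < num_batches(totalIters, batchSize));
    r.begin = b * batchSize;
    r.end = r.begin + batchSize;
    if (r.end > totalIters) {
        r.end = totalIters;             /* clamp the final, partial batch */
    }
    return r;
}
```

The host would loop over b from 0 to num_batches(...) - 1 and launch the kernel once per batch, passing begin and end as extra kernel arguments. Whether that is possible depends on whether the iterations inside ComputeWBS can be resumed from intermediate state kept in device memory.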
Second, I wonder if this has something to do with the fact that I’m using cudaMalloc and cudaMemcpy for 3D matrices instead of cudaMalloc3D and cudaMemcpy3D. The documentation says:
I tried converting my code to cudaMalloc3D and cudaMemcpy3D and got lost somewhere on the way. I’m not certain if this has anything to do with my problem and I don’t know if it’s worth the hassle to convert the code if it has nothing to do with it. The code does work when there are fewer iterations or smaller matrix dimensions, so can it really be related to 3D allocation?
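To make the layout question concrete, here is a minimal sketch in plain C of how a flat buffer of N*N*numberOfWindows floats is addressed as a 3D array. MATLAB stores arrays in column-major order, so a densely packed linear buffer copied with a single cudaMemcpy matches the array created by mxCreateNumericArray as long as the kernel writes with the same indexing. The helper name wbs_index is hypothetical, and the sketch assumes that ComputeWBS uses this indexing, which is not shown above.

```c
#include <assert.h>
#include <stddef.h>

/* Offset of element (row, col, window) in a flat, column-major buffer
   holding an n x n x numWindows array (MATLAB's storage order).
   Hypothetical helper for illustration. */
static size_t wbs_index(size_t row, size_t col, size_t window,
                        size_t n, size_t numWindows)
{
    assert(row < n && col < n && window < numWindows);
    /* Rows vary fastest, then columns, then windows. */
    return row + n * (col + n * window);
}
```

For a 4 x 4 x 3 array, element (0, 1, 0) is at offset 4, element (0, 0, 1) is at offset 16, and the last element (3, 3, 2) is at offset 47.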
Any help rendered will be highly appreciated…
Thank you.
Y.