In one of my MATLAB mex functions, I allocate an array in the usual way:

mwSize dims0[2]; dims0[0] = N; dims0[1] = M;
Ap = mxCreateNumericArray(2, dims0, mxSINGLE_CLASS, mxREAL);
A = (float*) mxGetData(Ap);
This has been working fine. The array is filled in the mex routine and then shipped out to the GPU device; it's a rather large array:

cudaMemcpy(Arg, A, N*M * sizeof(float), cudaMemcpyHostToDevice);
It's all been working great.
Reading the CUDA reference documentation, I ran across the suggestion that, where possible, cudaMallocHost() should be used, because page-locked memory allows for faster host-to-device transfers. Aha! says I, I'll do:

cudaMallocHost((void**)&A, N*M * sizeof(float));

instead of the MATLAB array allocation, and so speed up the transfer of A to the device. This works fine, up until a mexCallMATLAB(1, &lhs, 2, rhs, "mrdivide"); is called, where those variables have nothing to do with A.
When the mexCallMATLAB is reached, the program crashes with the usual MATLAB barf. I've concluded that cudaMallocHost() does not play well with MATLAB, and that memory corruption results.
Is this true? Is cudaMallocHost() to be avoided in mex files?
Perhaps the answer is here: http://www.mathworks.de/matlabcentral/news…w_thread/162021 (This looks like a well-known issue, but I'll make the post anyway... I should likely use mxCalloc instead.)
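In case it helps anyone else landing here: an alternative I intend to try, sketched below but not yet tested inside a mex file, is to keep the MATLAB-owned allocation and pin it after the fact with cudaHostRegister() (CUDA 4.0+). That way MATLAB's memory manager stays in charge of the buffer, while transfers still see page-locked speed. N, M, and Arg are as above; error checking is omitted.

```c
/* Sketch (untested in mex): pin the MATLAB-owned buffer instead of
   letting CUDA allocate it. Requires CUDA 4.0+ for cudaHostRegister. */
#include "mex.h"
#include <cuda_runtime.h>

/* inside mexFunction, with N, M, Arg as before */
mwSize dims0[2];
dims0[0] = N; dims0[1] = M;
mxArray *Ap = mxCreateNumericArray(2, dims0, mxSINGLE_CLASS, mxREAL);
float *A = (float *) mxGetData(Ap);

/* Page-lock the existing buffer; MATLAB still owns and frees it. */
cudaHostRegister(A, N * M * sizeof(float), cudaHostRegisterDefault);

/* ... fill A in the mex routine, then the (now faster) transfer ... */
cudaMemcpy(Arg, A, N * M * sizeof(float), cudaMemcpyHostToDevice);

/* Unpin before returning, so MATLAB gets back a plain pageable buffer. */
cudaHostUnregister(A);
```

The point of the design is that CUDA never allocates or frees host memory that MATLAB knows about, which is presumably what went wrong with cudaMallocHost().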