Thank you for your response. I’m familiar with Jacket (and I highly recommend it to anyone) but I really need to build something custom this time - so I guess my question stays the same.
I think I managed to pass GPU pointers between different mex-functions about a year ago. As far as I remember, the trick was not to destroy the context when the mex-function finishes.
That is exactly what I’m trying to do. It is just a matter of HOW to pass this pointer: it has to be wrapped somehow into an mxArray, like everything else that goes in and out of a mex-function.
Here is what I have so far:
MEX FUNCTION 1
#include "mex.h"
#include <cuda_runtime.h>
#include <cutil_inline.h>  // assumed source of cutilSafeCall (CUDA SDK utility header)
double* h_in;   // host pointer to the MATLAB input data
double* h_out;  // host pointer to the MATLAB output data
double* g_var;  // device pointer
void mexFunction( int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[] )
{
    if (nrhs != 1) mexErrMsgTxt("Must have one input argument");
    // create a pointer to the real data in the input matrix
    h_in = mxGetPr(prhs[0]);
    // calculate mem_size
    int m = mxGetM(prhs[0]);
    int n = mxGetN(prhs[0]);
    const unsigned int mem_size = sizeof(double) * m * n;
    // allocate device mem
    cutilSafeCall( cudaMalloc( (void**) &g_var, mem_size));
    // copy input data to device
    cutilSafeCall( cudaMemcpy( g_var, h_in, mem_size, cudaMemcpyHostToDevice) );
    // Create an mxArray for the output data
    plhs[0] = mxCreateDoubleMatrix(1, 1, mxREAL);
    // Create a pointer to the output data
    h_out = mxGetPr(plhs[0]);
    // attempt to hand the device pointer back through the output
    h_out = g_var;
}
and the second mex-function on the receiver side:
MEX FUNCTION 2
double* h_out;  // host buffer for the data copied back
double* g_var;  // device pointer received from the first mex-function
void mexFunction( int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[] )
{
    // check: exactly one input argument
    if (nrhs != 1) mexErrMsgTxt("Must have one input argument");
    g_var = (double *)mxGetPr(prhs[0]);
    int m = mxGetM(prhs[0]);
    int n = mxGetN(prhs[0]);
    // calculate mem_size
    const unsigned int mem_size = sizeof(double) * m * n;
    // allocate host mem
    h_out = (double*)malloc(mem_size);
    // stop here instead of continuing with a null buffer
    if (h_out == 0) mexErrMsgTxt("host: unable to allocate memory");
    // copy data from device back to host
    cutilSafeCall( cudaMemcpy( h_out, g_var, mem_size, cudaMemcpyDeviceToHost) );
}
I’m definitely missing something here. The last call to cudaMemcpy in the second mex-function crashes MATLAB.
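For what it's worth, two things in the code above look suspicious. In the first function, `h_out = g_var;` only reassigns the local pointer, so the 1x1 output matrix never receives the device address. In the second function, `mxGetPr(prhs[0])` returns a host pointer to MATLAB's own data, and that host address is then given to cudaMemcpy as if it were device memory, with m and n describing the 1x1 handle rather than the original matrix. A minimal sketch of one possible way around this, in plain C so it can be tried without MATLAB or CUDA: store the numeric value of the pointer, together with the original dimensions, in three 64-bit integer slots. The names `pack_handle`, `unpack_pointer`, `unpack_bytes` and the three-slot layout are hypothetical helpers made up for illustration, not part of the mex or CUDA APIs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical handle layout: three uint64 slots holding the device
   address and the dimensions of the matrix that lives on the device. */
enum { HANDLE_PTR = 0, HANDLE_ROWS = 1, HANDLE_COLS = 2, HANDLE_LEN = 3 };

/* Producer side: write the pointer's numeric value and the dimensions
   into the slots, instead of reassigning a local pointer variable. */
void pack_handle(uint64_t *slots, const void *dev_ptr, size_t rows, size_t cols)
{
    assert(slots != NULL);
    slots[HANDLE_PTR]  = (uint64_t)(uintptr_t)dev_ptr;
    slots[HANDLE_ROWS] = (uint64_t)rows;
    slots[HANDLE_COLS] = (uint64_t)cols;
}

/* Consumer side: rebuild the pointer from the stored integer value. */
double *unpack_pointer(const uint64_t *slots)
{
    assert(slots != NULL);
    return (double *)(uintptr_t)slots[HANDLE_PTR];
}

/* Consumer side: size in bytes of the original matrix, computed from
   the stored dimensions rather than from the handle's own 1x3 shape. */
size_t unpack_bytes(const uint64_t *slots)
{
    assert(slots != NULL);
    return sizeof(double) * (size_t)slots[HANDLE_ROWS] * (size_t)slots[HANDLE_COLS];
}
```

In the mex-functions, the slots would presumably come from a uint64 array: `plhs[0] = mxCreateNumericMatrix(1, 3, mxUINT64_CLASS, mxREAL);` followed by `mxGetData(plhs[0])` on the sending side, and `mxGetData(prhs[0])` on the receiving side. This assumes the device allocation is still valid when the second mex-function runs, which ties back to the point about not destroying the context between calls.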
Yes, I was thinking about that, but it is very convenient to have these little functions as separate building blocks, so I can reuse them in my future projects.
I guess this is going to be my ‘PLAN B’ if I exhaust my other possibilities.