Cuda from Mexfunction: calling Cuda kernel from Matlab mexfunction

theotheraussie · April 11, 2012, 5:24am

A little background:

I am trying to call a simple vector addition function that is calculated on the GPU but called from Matlab. I used the Matlab feval before, but it does not give enough freedom to place memory properly for the real task. The big problem is that we need to be able to distribute the data into shared and local memory to speed things up. The only way I have found to do this is to use the mexFunction interface to call the Cuda kernel, which would (hopefully) let us put the memory where we need it. If anyone has another solution to this, I would greatly appreciate it.
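To make the shared-memory goal concrete: a hand-written kernel can declare `__shared__` storage directly, which is the kind of control described above. The following is a minimal sketch, not the real task from this post. The names blockSum, sumOnGpu and BLOCK_SIZE are hypothetical, and a per-block sum is used only because it is a standard case where shared memory helps.

```cuda
#include <cuda_runtime.h>
#include <vector>

#define BLOCK_SIZE 256  // hypothetical block size; must be a power of two here

// Each block stages its slice of `in` in shared memory and sums it there.
// Must be launched with exactly BLOCK_SIZE threads per block.
__global__ void blockSum(const double *in, double *partial, int n)
{
    __shared__ double tile[BLOCK_SIZE];   // on-chip, visible to the whole block

    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    tile[threadIdx.x] = (idx < n) ? in[idx] : 0.0;
    __syncthreads();

    // Tree reduction carried out entirely in shared memory.
    for (int stride = blockDim.x / 2; stride > 0; stride /= 2) {
        if (threadIdx.x < stride)
            tile[threadIdx.x] += tile[threadIdx.x + stride];
        __syncthreads();
    }

    if (threadIdx.x == 0)
        partial[blockIdx.x] = tile[0];    // one partial sum per block
}

// Host helper: writes the sum of n doubles to *hostSum.
// Returns 0 on success, 1 on bad input or any CUDA error.
int sumOnGpu(const double *hostIn, int n, double *hostSum)
{
    if (n <= 0) return 1;

    const int blocks = (n + BLOCK_SIZE - 1) / BLOCK_SIZE;
    std::vector<double> partial(blocks);
    double *devIn = NULL, *devPartial = NULL;

    cudaError_t err = cudaMalloc((void **)&devIn, n * sizeof(double));
    if (err == cudaSuccess) err = cudaMalloc((void **)&devPartial, blocks * sizeof(double));
    if (err == cudaSuccess) err = cudaMemcpy(devIn, hostIn, n * sizeof(double), cudaMemcpyHostToDevice);
    if (err == cudaSuccess) {
        blockSum<<<blocks, BLOCK_SIZE>>>(devIn, devPartial, n);
        err = cudaGetLastError();
    }
    if (err == cudaSuccess) err = cudaMemcpy(&partial[0], devPartial, blocks * sizeof(double), cudaMemcpyDeviceToHost);

    cudaFree(devIn);
    cudaFree(devPartial);
    if (err != cudaSuccess) return 1;

    double total = 0.0;                   // finish the reduction on the host
    for (int i = 0; i < blocks; ++i) total += partial[i];
    *hostSum = total;
    return 0;
}
```

Double-precision arithmetic on the device needs compute capability 1.3 or later, which matches the -arch=sm_13 flag used later in this thread.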

However, the problem I'm having with the mex interface solution is that I can't get the kernel and the mexFunction to compile and link properly. I'm not entirely sure which commands I need to use. When I try to compile the mex function and link in the kernel, neither the kernel function nor the launch syntax, kernel<<<grid,block>>>(a,b,c), is recognized.

I’m using Matlab 2011b with all the toolboxes.

VecAdd.cu

__global__ void VecAdd(double *vector1, double *vector2, double *resultVector)
{
    int idx = threadIdx.x;
    resultVector[idx] = vector1[idx] + vector2[idx];
}

VecAddMexFunction.cpp

/*
 * This mexFunction is a test stub to call a Cuda kernel function from Matlab.
 *
 * This mexFunction will call C code to add two vectors together.
 */
#include <stdio.h>
#include "mex.h"
#include "cuda.h"
#include "cuda_runtime.h"

extern void vecAdd(double *vector1, double *vector2, int vectSize, double *resultVector);

void mexFunction(int nlhs,
                 mxArray *plhs[],
                 int nrhs,
                 const mxArray *prhs[])
{
    double *vector1, *vector2;
    double *resultVector;
    int row, col;

    /* Check for proper number of arguments */
    if (nrhs != 2) {
        mexErrMsgTxt("Two input arguments required.");
    }
    if (nlhs != 1) {
        mexErrMsgTxt("One output argument required.");
    }
    if (mxGetM(prhs[0]) != mxGetM(prhs[1]) ||
        mxGetN(prhs[0]) != mxGetN(prhs[1])) {
        mexErrMsgTxt("Input vectors must be the same size.");
    }

    /* get the two vectors */
    vector1 = mxGetPr(prhs[0]);
    vector2 = mxGetPr(prhs[1]);
    row = (int)mxGetM(prhs[0]);
    col = (int)mxGetN(prhs[0]);

    /* allocate device buffers; cudaMemcpy needs valid device pointers */
    double *device_vect1, *device_vect2, *device_result;
    cudaMalloc((void **)&device_vect1, col*sizeof(double));
    cudaMalloc((void **)&device_vect2, col*sizeof(double));
    cudaMalloc((void **)&device_result, col*sizeof(double));

    /* put the input vectors on the GPU (pass the host pointers, not their addresses) */
    cudaMemcpy(device_vect1, vector1, col*sizeof(double), cudaMemcpyHostToDevice);
    cudaMemcpy(device_vect2, vector2, col*sizeof(double), cudaMemcpyHostToDevice);

    mexPrintf("row: %d, col: %d\n", row, col);

    /* assign the return vector */
    plhs[0] = mxCreateDoubleMatrix(row, col, mxREAL);  /* result vector */
    resultVector = mxGetPr(plhs[0]);

    mexPrintf("calling vecAdd Cuda\n");
    VecAdd<<<1,col>>>(device_vect1, device_vect2, device_result);
    mexPrintf("returned from Cuda\n");

    /* get the resulting vector off the GPU */
    cudaMemcpy(resultVector, device_result, col*sizeof(double), cudaMemcpyDeviceToHost);

    /* free the device memory */
    cudaFree(device_vect1);
    cudaFree(device_vect2);
    cudaFree(device_result);
}

It should be noted that I'm compiling this on the Matlab command line.

Compiling kernel:

!nvcc -c -arch=sm_13 …/ISARLab-Dev/Cuda_Code/VecAdd.cu
VecAdd.cu
tmpxft_00001438_00000000-3_VecAdd.cudafe1.gpu
tmpxft_00001438_00000000-8_VecAdd.cudafe2.gpu
VecAdd.cu
tmpxft_00001438_00000000-3_VecAdd.cudafe1.cpp
tmpxft_00001438_00000000-14_VecAdd.ii

Compiling and linking mexfunction:

mex -I"C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v4.1/include/" …/ISARLab-Dev/Mex_Files/VectAddMexFunction.cpp VecAdd.obj
VectAddMexFunction.cpp
…\ISARLab-Dev\Mex_Files\VectAddMexFunction.cpp(54) : error C2065: 'VecAdd' : undeclared identifier
…\ISARLab-Dev\Mex_Files\VectAddMexFunction.cpp(54) : error C2059: syntax error : '<'
C:\PROGRA~1\MATLAB\R2011B\BIN\MEX.PL: Error: Compile of '…\ISARLab-Dev\Mex_Files\VectAddMexFunction.cpp' failed.
Error using mex (line 206)
Unable to complete successfully.

I've had a look at a number of other posts which suggest that using nvmex does the linking for you. I downloaded the files from http://www.cs.ucf.edu/~janaka/gpu/using_nvmex.htm to try it that way, but I still get the same error messages, plus an additional unknown-architecture warning that I can't seem to fix.

So my questions are:

Am I getting the compilation and linking commands wrong? Am I missing some step needed to accomplish this? Is there something else I could be missing, like an include directory or some files somewhere?

Thanks for the help in advance.
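For what it's worth, the C2065 and C2059 errors above are what a plain C++ compiler produces when it meets `<<<...>>>`: mex hands the .cpp file to the host compiler (cl here), and only nvcc understands kernel launches. A common way around this is to keep the launch inside the .cu file behind an ordinary function. The sketch below assumes that layout. The wrapper name vecAddOnGpu, the block size of 256 and the added length argument n are illustrative choices, not anything from the original code.

```cuda
// VecAdd.cu (sketch): the kernel plus a launcher that plain C++ can call.
#include <cuda_runtime.h>

__global__ void VecAdd(const double *vector1, const double *vector2,
                       double *resultVector, int n)
{
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < n)                       // guard the last, partly filled block
        resultVector[idx] = vector1[idx] + vector2[idx];
}

// vecAddOnGpu is a hypothetical name. It takes host pointers, does all the
// device work, and returns 0 on success or 1 on bad input or any CUDA error.
extern "C" int vecAddOnGpu(const double *host1, const double *host2,
                           double *hostResult, int n)
{
    if (n <= 0) return 1;

    const size_t bytes = (size_t)n * sizeof(double);
    const int threads = 256;           // assumed block size
    const int blocks = (n + threads - 1) / threads;
    double *dev1 = NULL, *dev2 = NULL, *devResult = NULL;

    cudaError_t err = cudaMalloc((void **)&dev1, bytes);
    if (err == cudaSuccess) err = cudaMalloc((void **)&dev2, bytes);
    if (err == cudaSuccess) err = cudaMalloc((void **)&devResult, bytes);
    if (err == cudaSuccess) err = cudaMemcpy(dev1, host1, bytes, cudaMemcpyHostToDevice);
    if (err == cudaSuccess) err = cudaMemcpy(dev2, host2, bytes, cudaMemcpyHostToDevice);
    if (err == cudaSuccess) {
        VecAdd<<<blocks, threads>>>(dev1, dev2, devResult, n);
        err = cudaGetLastError();      // reports launch failures
    }
    // This blocking copy also waits for the kernel to finish.
    if (err == cudaSuccess) err = cudaMemcpy(hostResult, devResult, bytes, cudaMemcpyDeviceToHost);

    cudaFree(dev1);                    // cudaFree(NULL) is a harmless no-op
    cudaFree(dev2);
    cudaFree(devResult);
    return err == cudaSuccess ? 0 : 1;
}
```

With this layout the .cpp file would only need the declaration `extern "C" int vecAddOnGpu(const double *, const double *, double *, int);` and a call such as `vecAddOnGpu(vector1, vector2, resultVector, row * col)`, so it contains nothing the host compiler cannot parse. The link step also has to pull in the CUDA runtime library (cudart) alongside VecAdd.obj.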

melonakos · April 11, 2012, 5:29am

Lots of other posts in these forums may help. Here is a recent one where mfatica has a good suggestion to fix the compiler errors by forgoing the nvmex scripts and compiling directly, and where I invite you to get a better experience altogether with the Jacket SDK. Good luck!

theotheraussie · April 11, 2012, 6:46am

I tried what was suggested in the post, or something similar.

!nvcc -c -arch=sm_13 …/ISARLab-Dev/Cuda_Code/VecAdd.cu -Xcompiler -fPIC -I"C:/Program Files/MATLAB/R2011b/extern/include"

cl : Command line warning D9002 : ignoring unknown option '-fPIC'

I know I need the -fPIC option to work, but to see if it would compile without it, I removed it.

!nvcc -c -arch=sm_13 …/ISARLab-Dev/Cuda_Code/VecAdd.cu -Xcompiler -I"C:/Program Files/MATLAB/R2011b/extern/include"

nvcc fatal : Don't know what to do with 'Files/MATLAB/R2011b/extern/include'

I'm not exactly sure which files are meant to be included by the original line, -I /usr/local/matlab/extern/include, but that directory does not exist for me and I believe the one I substituted is the closest I could find. I thought -I"My directory" would work for directories with spaces. Does anyone know the proper way to include directories with spaces?
