MEX, nvcc, MATLAB and Linux

Hi,
Sorry if this topic is redundant, but I have not found any information on the web.
I am new to CUDA GPU programming and I am trying to build a MEX example on my PC, without success. I do not know whether the problem is related to Linux, MATLAB R2014a, or something else.

I copied the following code, named AddVectorsCuda.cpp:

#include "mex.h"
#include "AddVectors.h"

void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
if (nrhs!=2)
    mexErrMsgTxt("Invalid number of input arguments");

if (nlhs!=1) 
    mexErrMsgTxt("Invalid number of outputs");

// Reject the call if either input is not single precision
if (!mxIsSingle(prhs[0]) || !mxIsSingle(prhs[1]))
    mexErrMsgTxt("input vector data type must be single");

int numRowsA=(int)mxGetM(prhs[0]);
int numColsA=(int)mxGetN(prhs[0]);
int numRowsB=(int)mxGetM(prhs[1]);
int numColsB=(int)mxGetN(prhs[1]);

if (numRowsA!=numRowsB || numColsA!=numColsB)
    mexErrMsgTxt("Invalid size. The sizes of two vectors must be the same.");

int minSize=(numRowsA<numColsA) ? numRowsA:numColsA;
int maxSize=(numRowsA>numColsA) ? numRowsA:numColsA;

if (minSize!=1)
    mexErrMsgTxt("Invalid size. Both inputs must be one-dimensional vectors.");

float* A=(float*)mxGetData(prhs[0]);
float* B=(float*)mxGetData(prhs[1]);

plhs[0]=mxCreateNumericMatrix(numRowsA,numColsB, mxSINGLE_CLASS, mxREAL);

float* C=(float*)mxGetData(plhs[0]);

addVectors(A,B,C,maxSize);

}

where AddVectors.h is:
#ifndef ADDVECTORS_H
#define ADDVECTORS_H
extern void addVectors(float* A,float* B,float* C,int size);
#endif // ADDVECTORS_H

and AddVectors.cu:

#include "AddVectors.h"
#include "/usr/local/MATLAB/R2014a/extern/include/mex.h"
__global__ void addVectorsMask(float* A, float* B, float* C,int size)
{
int i=blockIdx.x;
if (i>=size)
return;
C[i]=A[i]+B[i];
}

void addVectors(float* A,float* B, float* C, int size)
{
float *devPtrA=0, *devPtrB=0, *devPtrC=0;

cudaMalloc(&devPtrA, sizeof(float) * size);
cudaMalloc(&devPtrB, sizeof(float) * size);
cudaMalloc(&devPtrC, sizeof(float) * size);

cudaMemcpy(devPtrA, A, sizeof(float) * size, cudaMemcpyHostToDevice);
cudaMemcpy(devPtrB, B, sizeof(float) * size, cudaMemcpyHostToDevice);

addVectorsMask<<<size, 1>>>(devPtrA, devPtrB, devPtrC, size);

cudaMemcpy(C, devPtrC, sizeof(float) * size, cudaMemcpyDeviceToHost);

cudaFree(devPtrA);
cudaFree(devPtrB);
cudaFree(devPtrC);
}

I created the object file using:

system('nvcc -c AddVectors.cu')

Finally, I compiled the MEX file using:
mex AddVectorsCuda.cpp AddVectors.o -lcudart -L"/usr/local/cuda/lib"

As a result, I got the following error message:

Building with 'g++'.
Error using mex
/usr/bin/ld: AddVectors.o: relocation R_X86_64_32 against `.bss' can not be used when making a shared object; recompile with -fPIC
AddVectors.o: could not read symbols: Bad value
collect2: ld returned 1 exit status

What’s the problem?
Thank you in advance for your help

Have you worked through the MathWorks-provided example?

http://www.mathworks.com/help/distcomp/run-mex-functions-containing-cuda-code.html

Unfortunately, their examples do not work. I contacted them, but they have not answered yet.

Well, it doesn’t look like you have followed it correctly. Nowhere in the example I linked does it suggest that you compile your mex function with:

system('nvcc -c AddVectors.cu')

Saying “their examples do not work” isn’t that useful. You might instead identify, if you went through their sequence, at which point things did not work, and what the output you observed was.

Specifically, what happened when you tried to work through the MathWorks example?
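
As a side note on the linker error quoted in the first post: the message itself asks for the object to be recompiled with -fPIC. mex links a shared library, and objects going into a shared library generally need to be position-independent. nvcc can forward flags to the host compiler with -Xcompiler, so one thing to try is rebuilding AddVectors.o that way. A minimal sketch, assuming nvcc is on the PATH and AddVectors.cu is in the current directory (the NVCC_CMD variable and the guard around it are illustrative, not something from the posts above):

```shell
#!/bin/sh
# Sketch: rebuild the CUDA object as position-independent code.
# Assumptions: nvcc is on PATH and AddVectors.cu is in the current directory.
# -Xcompiler passes the option that follows it (-fPIC) to the host compiler.
NVCC_CMD="nvcc -c AddVectors.cu -Xcompiler -fPIC -o AddVectors.o"

if command -v nvcc >/dev/null 2>&1 && [ -f AddVectors.cu ]; then
    # Toolchain and source are present: run the build.
    $NVCC_CMD
else
    # Nothing to build here: just show the command that would be run.
    echo "Skipping build (nvcc or AddVectors.cu not found). Command would be:"
    echo "$NVCC_CMD"
fi
```

After that, the mex command from the first post can be rerun unchanged against the new AddVectors.o. I have not tested this on that exact setup, so treat it as a starting point rather than a confirmed fix.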