Very general CUDA interfacing with MATLAB without NVMEX
To have a very general command for any CUDA code

I have MATLAB code and I want to parallelize part of it using CUDA, without rewriting everything. I searched a lot and found the MATLAB plugin for CUDA from Mathworks (NVMEX), but it fails for complex problems. I was looking for a general way to interface a C/C++ (MEX) file that calls a CUDA function in a .cu file. I found a solution, but I was unable to get it working on Linux. I believe that if someone can help me out, we can have a very general solution for all problems using MATLAB. I tried to figure things out from the code of both NVMEX and the link I gave, but in vain.
Can anyone please help me out?

Read the whitepaper. The Makefile in the tar file will compile a generic mex file.

Thanks, I saw it earlier too. The Makefile we are talking about uses nvmex to compile CUDA code written in a MEX file.

But I had two questions:

  1. Is this nvmex script general? I need a 3-dimensional FFT, and the CUDA code must also start some OpenGL displays.

  2. Can you help me do it this way: I have a CUDA function in a .cu file. I can compile it separately with nvcc. Then I write MEX code and call the CUDA function from it, and I can compile that MEX file (.c code) from inside MATLAB, the way I compile any other MEX file. I wrote a MATLAB script for compiling the MEX code that calls the CUDA functions, but it is not working properly.

It goes like this

The first line gives the name by which the function is to be called from MATLAB.

The second line opens a bash shell and compiles the CUDA file.

The third line compiles the MEX code; since the compiled CUDA object file is used, the libraries are also included. But this is not working. Can anyone please point out where the error lies?

Thank you.

Hi everyone,

Just wanted to chime in here that in our recent Jacket v1.1 release we include a Developer SDK option which is designed to let you integrate custom CUDA code into MATLAB/Jacket in a way that will allow your CUDA kernels to plug into the Jacket runtime. For more information on this, see the following:


Of course this is a commercial product and is not free like the do-it-yourself CUDA options. The CUDA options are great if you don’t mind incurring the roundtrip memory transfer cost for each call to that MEX function and don’t mind writing C/CUDA code. However, if you want your CUDA code to avail itself of the Jacket runtime for memory transfer optimizations and kernel execution optimizations, you might want to have a look at the Developer SDK.

Good luck!




I’m slowly making my way to a general solution. I am porting parts of our MATLAB code to directly compiled parallel code, with the eventual goal of moving it all to C++, VSIPL++, and/or CUDA (with the hope of eventually replacing CUDA with OpenCL).

Currently I have one mexFunction, "mex_cuda_core", compiled with nvmex, that is the command interface to the "system". The command interface lets me download and upload arrays (currently 2D) between the "device" and MATLAB. Then I can call C / C++ / CUDA functions that are also compiled into the .mex file and process those arrays.

The calling MATLAB code looks something like this:
m3 = psi_ez3d_cuda_multibin_core('arr.get','m3');
m4 = psi_ez3d_cuda_multibin_core('arr.get','m4');

It’s all pretty basic with none of the slickness of Jacket.

However, the interface is simple and lets me focus on implementing the C++/C/CUDA code and finding and fixing the slow functions with the CUDA Visual Profiler, while leaving my supporting MATLAB code in place.
I am having problems using cuda-gdb (see my other thread), but I am doing okay with printfs right now. (CUDA 2.2 / C++ / deviceemu don't seem to play nice.)

The "trick" of the implementation is to use static C++ objects to track memory on the device across mexFunction calls, so the resources are deallocated when the mex file is unloaded.
It could probably be extended to load dynamic library files (.so) that contain functions for registering generic kernels, but I have never done that on Linux (only on Windows).

Unfortunately I doubt that I can release the code. The good news is that the code is very minimal, and you may be able to base a similar system around Thrust.
If you are interested I can provide some pointers/hints through the forum.


Thanks. Well, I can implement all of those things; the only problem I have is compiling the CUDA code from inside MATLAB. Let me put the problem like this:
I have working MATLAB code that calls a function. I implemented that function in CUDA, and it is stored in a .cu file. Since MEX does not use .cu files directly, I wrote a function in a C file that can be called from MEX, and that C function calls the function in the .cu file.
The problem is how I compile this from MATLAB.


This is what I put into my my_compile.m function, and it seems to work. I extracted the appropriate information from the CUDA/MATLAB plugin.

[codebox]function my_compile(varargin)

!"/home/krys/opts/" & nvcc -c --compiler-options="-fPIC -D_GNU_SOURCE -pthread -fexceptions" --ptxas-options=-v -maxrregcount=32

!"/home/krys/opts/" & nvmex -f /usr/local/matlab/bin/ test.cpp testc.o -I/usr/local/cuda/include -L/usr/local/cuda/lib -lcufft -lcudart -Wl,-rpath,/usr/local/cuda/lib[/codebox]

The "/home/krys/opts/" is where I set my environment variables. You may not need it.

The CU file is

The C++ file is test.cpp