Matlab and CUDA: A Tutorial (very basic, zero-order introduction)

Boxed_Cylon · June 25, 2008, 12:12am

As a result of my past couple of weeks’ work with CUDA, I’ve written up my notes in a very basic 22-page tutorial, using example codes. Much of the material is on these fora, but rather scattered around. Perhaps people will find the tutorial useful.

The Nvidia matlab package, while impressive, seems to me to rather miss the mark for a basic introduction to CUDA on matlab.

Happy to hear back from people with corrections and suggestions; it’s meant to be an evolving document.

(Tutorial revised 6/26/08 - cleanup, corrections, and modest additions)

(Tutorial revised again 8/19/08 - minor additions)

Changed to external link 9/16/09: http://faculty.washington.edu/dushaw/epubs/Matlab_CUDA_Tutorial_8_08.pdf
(Nvidia attachment seems to have been lost, alas.)

(Tutorial revised again 2/12/10 - minor additions)

http://faculty.washington.edu/dushaw/epubs/Matlab_CUDA_Tutorial_2_10.pdf

Sarnath · June 25, 2008, 4:03am

May God Bless You!

Boxed_Cylon · June 26, 2008, 11:30pm

I’ve cleaned up the tutorial document a bit: made editorial corrections, clarified things here and there, and added a new section on array dimensioning conventions. The file for download in the entry that started this thread has been updated. Cheers!

e.ping · June 26, 2008, 11:57pm

Excellent tutorial, but I have to disagree on your motivation; I think you underestimate the importance of MPI here. As long as you are on a single machine, you’re right (replace MPI with OpenMP or hand-crafted pthreads). But there is no way of using CUDA instead of MPI for distributed-memory clusters.

Boxed_Cylon · June 27, 2008, 12:23am

Thanks for the reply. I may not have written that paragraph quite right, but I think we are in agreement. One needs to use the proper tool for the job - in many ways CUDA and MPI are complementary, yin and yang, etc. That was what I intended to say.

melonakos · July 6, 2008, 12:25am

Great tutorial!

I just wanted to point out that we are in the process of building a full CUDA engine for MATLAB programs (named Jacket) that may be of interest to people who read this thread. We just launched a free beta release that you can grab at:

http://www.accelereyes.com

We would love to hear your thoughts regarding Jacket and insights on how it can be improved to make MATLAB GPU Computing as beneficial to the community as possible.

Best,

John Melonakos

Boxed_Cylon · July 9, 2008, 1:10am

I’ll add here a few notes that I may include in the next version of the tutorial (call them a “patch” to the tutorial for now, if you like, or a TO DO list):

→ On the other mail list, I inquired about using cudaMallocHost in a mex file to speed up the host-device communications, that is, to use “pinned memory”. Matlab is fairly memory sensitive, so it is not generally possible to allocate memory this way. I suspect, however, that one could sometimes get away with using cudaMallocHost() in a mex file; it is a matter of avoiding any calls to Matlab. So one could cudaMallocHost(), compute away, and then clear out the CUDA variables before Matlab is aware of what is going on and complains or crashes (it’s the “don’t ask, don’t tell” memory policy).

→ I was puzzled over why my “surf” command wasn’t working. Well, it would run, but nothing would appear in the figure. It turns out that surf does not work on single precision data (at least for me)! If A is single precision, then one needs to do “surf(X,Y,double(A))” to see the data. I think this qualifies as a bug in Matlab.

→ C and CUDA are “row major” in how data are organized in arrays, Matlab and cublas are “column major” (nomenclature).

→ I think I need to say something about the grids/blocks/threads/warps and kernel efficiency, but I don’t think I altogether understand those quite yet…as integral to CUDA as they are…
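The row-major versus column-major note above can be sketched in plain C. This is only an illustration, not code from the tutorial: the helper names idx_col_major, idx_row_major and col_to_row_major are made up for this example, and the array is assumed to be a dense, zero-based, single precision matrix.

```c
#include <assert.h>
#include <stddef.h>

/* Column-major (Matlab, CUBLAS): the elements of one column are adjacent
 * in memory, so element (row, col) of a matrix with n_rows rows sits at
 * col * n_rows + row. Indices are zero-based here, unlike in Matlab. */
static size_t idx_col_major(size_t row, size_t col, size_t n_rows)
{
    return col * n_rows + row;
}

/* Row-major (ordinary C arrays): the elements of one row are adjacent,
 * so element (row, col) of a matrix with n_cols columns sits at
 * row * n_cols + col. */
static size_t idx_row_major(size_t row, size_t col, size_t n_cols)
{
    return row * n_cols + col;
}

/* Hypothetical helper: copy a column-major matrix (as Matlab stores it)
 * into a separate row-major buffer (as C code usually expects it).
 * src and dst must each hold n_rows * n_cols floats and must not alias. */
static void col_to_row_major(const float *src, float *dst,
                             size_t n_rows, size_t n_cols)
{
    assert(src != NULL && dst != NULL && src != dst);
    for (size_t row = 0; row < n_rows; ++row) {
        for (size_t col = 0; col < n_cols; ++col) {
            dst[idx_row_major(row, col, n_cols)] =
                src[idx_col_major(row, col, n_rows)];
        }
    }
}
```

For the 2x3 matrix [1 2 3; 4 5 6], Matlab’s memory holds 1 4 2 5 3 6, and the helper above reorders it to 1 2 3 4 5 6. In practice it is often simpler to skip the copy and just use the column-major index formula directly inside the mex file or kernel.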

amirk · July 17, 2008, 9:03am

Hi

when I try to compile nvmex example, I get this error:

Where can I find this file?

Thanks

ps1946 · August 9, 2008, 10:56pm

I have exactly the same problem!

Boxed_Cylon · August 10, 2008, 7:13pm

mexutils.pm …

I don’t know if this is any help, but such a file does not exist on my Linux machine. I also do not have an nvmex.pl; rather, I have nvmex, a bash script (rather than a perl script). mexutils.pm is a perl module, I believe. I suspect you will have to look to Matlab/MathWorks for some sort of perl package on Windows? (I presume you’ve searched your machine for this file already, so it is not a matter of having the right search path.)

Is this link any help?

“When I try to run the MEX -SETUP command from within MATLAB 7.1 (R14SP3), why do I receive errors?” (MATLAB Answers, MATLAB Central): http://www.mathworks.com/support/solutions...lution=1-1TNK6Y

ps1946 · August 10, 2008, 9:41pm

It’s fine, thank you. I solved this problem: mexutils.pm is a Perl module supplied by Matlab. Apparently my installation was corrupted and rather old. Upgrading it got the mexutils.pm file back in matlab/bin. My bad, sorry.

Boxed_Cylon · August 20, 2008, 6:40am

I’ve developed my small HOWTO a little bit according to the notes above. The main addition is the discussion of multiple processors, warps and threads - I am still a little uncertain about those topics, so I’d be happy to hear of any corrections to misperceptions/poor discussion in the document. I aim for clarity above all things…

I’ve added the link to the latest version above in the first entry of this thread, but here it is also:

http://forums.nvidia.com/index.php?act=Attach&type=post&id=9257

Other than to make any corrections that anyone might suggest, I think I am done with this document for the foreseeable future. I hope people find it useful.

Andy386 · January 23, 2009, 4:58pm

Thanks for your work!
It seems to be very useful for a CUDA noob like me!

Andy386 · July 16, 2009, 3:38pm

It’s not sticky yet ? :D

Boxed_Cylon · February 12, 2010, 2:09am

I am bumping this thread to say that I’ve updated this tutorial somewhat:

Mention of Fermi and Tesla
Mention of CULA & MAGMA
Mention of AccelerEyes and GP-You
Memory management discussion
Profiler no longer a separate download/install
Some reorganization and update of CUDA distribution file names

http://faculty.washington.edu/dushaw/epubs/Matlab_CUDA_Tutorial_2_10.pdf (also listed at the top of the thread)

As always, happy to hear of suggestions of things to correct, add, or develop (insofar as I can).

SPWorley · February 12, 2010, 2:30am

Quick note: your guide talks about Fermi in the second paragraph, and also says that Tesla is a compute-only device with no display. But the Fermi-based Tesla C2050 and C2070 cards do have a display output.

I am not a Matlab/CUDA user so I had two initial FAQs which I didn’t see the answer to.

Are GPU accelerated routines available for commercial Matlab only, or is there also support for Octave, a popular open source Matlab clone?
Are GPU accelerated routines available in Windows? Any other platforms than Linux?

Boxed_Cylon · February 12, 2010, 2:53am

I’ll make the first correction. As for Octave or Windows… I can’t really answer these questions properly. I haven’t the time to chase down these things, as I need to focus on my own project. That said, I believe that people are using CUDA/Matlab in the Windows environment. I believe that Octave can employ mex files or their equivalent, but will it work with CUDA?

If anyone would like to post some answers to these questions, I’d be happy to try to work them into the document.

E.D_Riedijk · February 12, 2010, 5:25am

I can attest that Matlab works with CUDA on both Windows and Linux.

I once wrote up a small example of how to use CUDA within Matlab, using cmake so you can keep the same source and build system on both. (The Official NVIDIA Forums | NVIDIA)

As for Octave: never tried it with CUDA.
