I hope my post is not too trivial, but I’m quite new to the CUDA community and I cannot solve a very basic issue.
The problem is the following: I want to profile some MATLAB code using the Visual Profiler tool, but I am not able to obtain any result.
Some info:
Graphic Card: GeForce GTX 460
Operating System: Ubuntu 11.04
NVIDIA drivers: 295.49
NVCC v4.2, V0.2.1221 (same for NVVP)
I successfully installed the CUDA toolkit and SDK, and I made some tests with the Visual Profiler using Python scripts (PyCUDA): I was able to create a timeline and collect statistics and other information about running times.
I then switched to some MATLAB code, but I cannot figure out how to run it under the Visual Profiler. I searched around a lot, but nobody seems to have the same problem. From what I found, the session configuration should be along the following lines:
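(Reconstructing from memory; the MATLAB path and the script name "myscript" below are placeholders for my actual setup.)

File: /usr/local/MATLAB/R2011a/bin/matlab
Arguments: -nodisplay -nosplash -r "myscript"
Working directory: the folder containing myscript.m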
The Profiler starts correctly and runs the script to completion; in the console window I can see the correct output of the program. Everything seems to work fine, but after the runs finish (NVVP runs the program 24 times to collect statistics), the Profiler window is still empty. I can see the green check marks next to the Analysis panel entries (“Timeline”, “Multiprocessor”, “Kernel Memory”, “Kernel Instruction”). Anyway, the Analysis Results panel reports the following:
“Application timeline is required for the analysis”.
What’s wrong with my configuration? I googled a bit and followed some advice, like putting an “exit;” or “quit;” command at the end of the script, but it didn’t work.
I’m not using MEX files; my script just performs some gpuArray() operations, with no custom kernels at all.
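For reference, here is a minimal sketch of the kind of script I mean (the sizes and the name myscript.m are made up):

% myscript.m -- only built-in gpuArray operations, no MEX, no custom kernels
A = gpuArray(rand(4096));   % copy input data to the GPU
B = gpuArray(rand(4096));
C = A * B + A;              % matrix multiply and add, executed on the GPU
result = gather(C);         % copy the result back to host memory
disp(result(1, 1));
exit;                       % as advised, so the profiled run terminates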
I read something about an additional package for using CUDA with MATLAB, but the NVIDIA website redirects me to the wrong pages when I try to follow the download links to MathWorks.
The NVIDIA Profiler was not designed to profile M-code. MATLAB already has a built-in profiler that most people use. The closest anyone has come to integrating the NVIDIA profiler with M-code is the loose coupling you get when you MEXify CUDA code.
Are you sure you want to use gpuArrays? Have you seen how they compare (see http://accelereyes.com/compare)? In most cases they are slower than the CPU, which may be why you’re interested in profiling things.
If you were to go with Jacket, you’d get access to a GPU-specific profiler that runs on M-code.
Good luck on this and shoot me an email if I can be useful to you. Cheers!
If you look carefully at the PDF, you’ll find that Martin (a MathWorks engineer) doesn’t share any results of accelerating real end-user applications. Rather, the PDF only shows functions that directly call good third-party libraries from NVIDIA or open source:
The spectrogram benchmark only calls FFT, which is a direct call to CUFFT.
A\b only calls MAGMA.
MTIMES calls CUBLAS.
Simple arithmetic has to be wrapped in a cumbersome arrayfun call (a sketch of what that looks like follows below).
So, yes, MathWorks has been able to absorb other people’s GPU libraries and get a benefit on a few functions. But for any real application, gpuArrays are slow, often slower than the CPU.
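To illustrate the arrayfun point, here is a hypothetical sketch (mine, not taken from Martin’s PDF) of the wrapping needed for simple element-wise arithmetic:

x = gpuArray(rand(1e6, 1));
% plain per-element arithmetic must be packed into a function handle
% and routed through arrayfun to execute as a single GPU kernel
y = arrayfun(@(v) 2 * v.^2 + sin(v), x);
y_host = gather(y);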
The benchmarks on that comparison page were run by bloggers and other scientists, not by AccelerEyes people. You can try it yourself if you’d like. I’m happy to provide a license for that purpose.
Actually, I found that my application obtained a speed-up of about 40x when using gpuArrays compared to the single-CPU version. This is basically because I mainly use very simple and fast matrix operations, like sums and multiplications over very large matrices.
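For what it’s worth, the comparison I ran was roughly of the following shape (the size N is illustrative; the gather call forces the GPU work to finish before toc):

N = 4000;
A = rand(N); B = rand(N);

tic; C = A * B + A; tCPU = toc;                       % single-CPU version

gA = gpuArray(A); gB = gpuArray(B);
tic; gC = gA * gB + gA; C2 = gather(gC); tGPU = toc;  % gpuArray version

fprintf('CPU: %.3f s, GPU: %.3f s, speedup: %.1fx\n', tCPU, tGPU, tCPU / tGPU);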
I’m still a bit confused, and also disappointed, because I was expecting NVIDIA to allow profiling any MATLAB code using the Visual Profiler tool.
If somebody has an idea of how to do proper CUDA profiling of gpuArrays, I’d be very interested in any suggestions or references.
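One thing I plan to try (untested, so take it as a guess) is the CUDA 4.x command-line profiler, which is enabled through environment variables instead of NVVP. Since the CUDA context is only created on the first GPU call, setting them from inside MATLAB before touching the GPU might work:

% untested idea: enable the CUDA command-line profiler before any gpuArray call
setenv('COMPUTE_PROFILE', '1');                           % turn the profiler on
setenv('COMPUTE_PROFILE_CSV', '1');                       % write CSV output
setenv('COMPUTE_PROFILE_LOG', 'matlab_cuda_profile.csv'); % output file name
A = gpuArray(rand(1024));   % first GPU call creates the context
% ... rest of the script ...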
Makes sense. Both of those operations are handled by CUBLAS (sums/dot products and matrix multiplies are part of CUBLAS), so this thread really turns out to be a hat tip to CUBLAS.
It is quite a bit more complicated to profile GPU-based M-code. Jacket’s GPROFVIEW is the only tool that does this well.