CUDA_ERROR_LAUNCH_TIMEOUT MATLAB, REMOTE UBUNTU, .CU FILE

hi Folks:

I am brand new to CUDA, and this is my first post.

I got the error code CUDA_ERROR_LAUNCH_TIMEOUT, so I googled it, and it seems to have something to do with the running time of the kernel.

I am using my Windows 7 laptop to access a remote Ubuntu machine where MATLAB is installed. I coded a *.cu file and generated a *.ptx from it using nvcc. I am programming in MATLAB (on that remote Ubuntu machine) to create a CUDA kernel object, then using feval() to run the kernel from that *.cu file.

The first time I execute the MATLAB code, I can create a kernel object with parallel.gpu.CUDAKernel(), but an error occurs at feval(): driver timeout! After that, when I execute the same MATLAB code again, I get stuck at parallel.gpu.CUDAKernel(), and CUDA_ERROR_LAUNCH_TIMEOUT keeps showing up.

I don’t know if I can debug/trace what happens in my .cu file, because I have no idea how to do so. My *.cu file is basically a double “for” loop, where the first loop is about 30000 iterations long and the second loop is about 78400 iterations long.

I don’t think I can change any setting on that remote Ubuntu where the GPU is installed.

Does anyone have any pointers on how I should solve this problem, or what direction I should head in?

Thanks a lot.

First of all, do not fear. GPU computing in MATLAB(r) is awesome when done right and can provide tremendous speedups.

In my biased (yet well-informed) opinion, you should head in a direction away from the Parallel Computing Toolbox™ from MathWorks(r) for anything related to GPU computing, for reasons listed here :)

The Jacket SDK is the best way to integrate CUDA code into MATLAB(r), as explained here.

If we can be useful to you, feel free to email me directly (see my signature below). Good luck!

With respect to the actual problem you posted, try splitting your work between multiple kernel launches so that each one takes less time.

I’m a bit surprised about the setup though. Does the remote Ubuntu computer have a display connected to the GPU, and is anybody working on that display? In that case it would be highly annoying for that user as the screen would freeze when you launch a kernel. If there is no display connected, there should not be a timeout on kernel launches.
If there is a display connected but nobody is using it, the computer had better be booted to runlevel 3 (network but no X Windows).
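
A minimal sketch of the multiple-launch idea, written as standalone CUDA C++ rather than MATLAB. Everything named here is an assumption for illustration: the kernel `accumulateChunk`, the host helper `accumulateAll`, the placeholder work function `term(i, j)`, and the chunk size. The real loop body of the original .cu file is not shown in this thread, so `term` only stands in for it. Each launch covers a slice of the long outer loop, which keeps any single launch short.

```cuda
#include <algorithm>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical per-iteration work; stands in for the unknown body of the original loops.
__host__ __device__ inline double term(int i, int j)
{
    return 0.5 * (i % 7) + (j % 3);
}

// One thread per element j. A launch only processes outer-loop iterations
// [rowStart, rowStart + rowCount), so its running time is bounded by rowCount.
extern "C" __global__ void accumulateChunk(double *a, int M, int rowStart, int rowCount)
{
    int j = blockIdx.x * blockDim.x + threadIdx.x;
    if (j >= M) return;                      // guard the last, partially filled block
    double sum = 0.0;
    for (int i = rowStart; i < rowStart + rowCount; ++i)
        sum += term(i, j);
    a[j] += sum;                             // accumulate across successive launches
}

// Hypothetical host helper: overwrites a with the sums over all N outer iterations,
// using one launch per chunk of at most chunkRows iterations.
cudaError_t accumulateAll(std::vector<double> &a, int N, int chunkRows)
{
    if (a.empty() || chunkRows <= 0) return cudaErrorInvalidValue;
    const int M = static_cast<int>(a.size());
    const size_t bytes = a.size() * sizeof(double);

    double *dA = nullptr;
    cudaError_t err = cudaMalloc(&dA, bytes);
    if (err != cudaSuccess) return err;
    err = cudaMemset(dA, 0, bytes);          // all-zero bytes represent 0.0

    const int threads = 256;
    const int blocks = (M + threads - 1) / threads;
    for (int rowStart = 0; rowStart < N && err == cudaSuccess; rowStart += chunkRows) {
        const int rowCount = std::min(chunkRows, N - rowStart);
        accumulateChunk<<<blocks, threads>>>(dA, M, rowStart, rowCount);
        err = cudaGetLastError();            // reports launch configuration errors
        if (err == cudaSuccess)
            err = cudaDeviceSynchronize();   // waits for this chunk; reports execution errors
    }

    if (err == cudaSuccess)
        err = cudaMemcpy(a.data(), dA, bytes, cudaMemcpyDeviceToHost);
    cudaFree(dA);
    return err;
}
```

From MATLAB the same pattern would presumably mean calling feval() once per chunk with a different `rowStart`. A smaller chunk shortens each launch at the cost of more launches, so the chunk size is something to tune rather than a fixed value.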

hi John, Tera:

Thanks for your replies.
I figure the message showed up because there is a bug in my *.cu file. So I guess I need to debug it, although I have no idea how to do so.
Seems I cannot watch how those variables change, and cannot trace execution step by step in that *.cu file.

On Linux, you can open MATLAB(r) from the command line inside the GDB debugger with “matlab -Dgdb”. Not sure if that is useful to you, but sometimes it can help.

By doing that, can I step through the *.cu file?

You can’t step, but you can sometimes get useful trace information.

One potentially helpful way is to use cuda-memcheck via MATLAB (it checks for things like out-of-bounds memory accesses in your kernel).

I don’t remember the exact commands, but you do a memcheck by launching matlab.exe with the input commands to run an m-file, and this m-file in turn calls your CUDA-related functions. On Windows it was something like:

cuda-memcheck -c matlab.exe -nojvm -nosplash -wait -r m_file_name_here

Here “m_file_name_here” is written without the .m extension. I also think that the m-file needs to include an exit call for MATLAB.

Thanks to everyone above.

I made a small change. My double loop has 2 indices, i and j, with (i = N-1; i >= 0; i--) and (j = 1; j <= M; j++). This gives the error I mentioned.
Then I changed the outside loop, j, to a while loop, with j initialized as
j = blockIdx.x * blockDim.x + threadIdx.x;
and j is updated in each iteration as
j += blockDim.x * gridDim.x;
Now the CUDA_ERROR_LAUNCH_TIMEOUT error is gone, but I am still not getting the right answer. Does anyone have any idea what this might suggest?
It is pretty weird, because my variable a[j] has length M.
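
For reference, a minimal sketch of the grid-stride pattern described in this post, assuming zero-based indexing into an array of length M. The kernel name `scaleGridStride`, the host helper `scaleOnDevice`, and the per-element work (multiplying by a factor) are hypothetical, since the actual kernel has not been posted. One thing worth checking against it: a loop written as (j = 1; j <= M; j++) over an array of length M skips a[0] and touches a[M], which is one past the end in C, so the bound here is j < M.

```cuda
#include <vector>
#include <cuda_runtime.h>

// Grid-stride loop: each thread starts at its global index and advances by the
// total number of threads in the grid, so any grid size covers all M elements.
extern "C" __global__ void scaleGridStride(double *a, int M, double factor)
{
    const int stride = blockDim.x * gridDim.x;
    for (int j = blockIdx.x * blockDim.x + threadIdx.x; j < M; j += stride)
        a[j] *= factor;                      // hypothetical per-element work
}

// Hypothetical host helper: copies a to the device, runs the kernel with the
// given launch configuration, and copies the result back into a.
cudaError_t scaleOnDevice(std::vector<double> &a, double factor, int blocks, int threads)
{
    if (a.empty() || blocks <= 0 || threads <= 0) return cudaErrorInvalidValue;
    const int M = static_cast<int>(a.size());
    const size_t bytes = a.size() * sizeof(double);

    double *dA = nullptr;
    cudaError_t err = cudaMalloc(&dA, bytes);
    if (err != cudaSuccess) return err;

    err = cudaMemcpy(dA, a.data(), bytes, cudaMemcpyHostToDevice);
    if (err == cudaSuccess) {
        scaleGridStride<<<blocks, threads>>>(dA, M, factor);
        err = cudaGetLastError();            // reports launch configuration errors
    }
    if (err == cudaSuccess)
        err = cudaDeviceSynchronize();       // reports errors raised while the kernel ran
    if (err == cudaSuccess)
        err = cudaMemcpy(a.data(), dA, bytes, cudaMemcpyDeviceToHost);

    cudaFree(dA);
    return err;
}
```

Launching with far fewer threads than elements, for example 2 blocks of 64 threads on 1000 elements, is a quick way to confirm that every element is visited exactly once.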

You might want to consider posting your code here. I know that tera, for instance, quite enjoys spending some time checking other people’s kernels :) (which he is very good at, too!)


OK, I will start a new thread. You guys are so nice :)

try some quotation to see if that will post code nicely:

Use [font=“Courier New”][code]…[/code][/font] tags, not quotes.