System just halts for simple CUDA program

If I can think of anything, I’ll let you know, of course, but preventing lockups generally is not my ballgame.

Could you try to compile for and run the code on your 295?

If it runs on the 295, a further test could be to compile only for SM_13 and try to run that on the Tesla.

If that one runs, the problem is narrowed down to SM_20 compilation or to your particular Tesla, the latter being improbable since everything else runs fine.

That could give you some results to use in a support ticket.

And maybe someone else with a Tesla (2050) can try the code and see if the problem can be reproduced.

Hi there,

I appreciate your comments. I understand it’s hard to tell what’s going on on my machine. I did try to run it on the Quadro card, but when SRC_N is 2000 the display driver crashes and is reset. At least I don’t have to reboot.

Could you tell me what you mean by compiling for SM_13? I can see that there are three different “GPU Architecture” settings for the CUDA compiler. I don’t really understand what they are for. Since my Quadro card has compute capability 1.1, I set the first one to SM_11. The second GPU Architecture is set to SM_20 by default. Not sure if that makes sense.

SM_13 is for the 200 series; I think this includes your 295. Possibly the main difference from 1.1 is doubles, which is not relevant here. The 2.0 setting is probably used for the Tesla. I used sm_13 (GPU Architecture) only, setting the two others (GPU Architecture (2) and (3)) to zero. I wondered if that might have an impact in your case. We didn’t check this before. I just checked sm_11 on my machine; again it worked with SRC_N = 5000. Basically, I have nothing left up my sleeve. Sorry.
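
To see which of the compiled targets actually runs on a given card, a small kernel can report the architecture it was built for. This is only a sketch: the kernel name `report_arch` and the file name in the build comment are made up for illustration, and the `-gencode` pairs assume a toolkit old enough to still support `sm_13` and `sm_20`.

```cuda
/* report_arch.cu (hypothetical file name)
 * Build for two targets, assuming the toolkit still supports them:
 *   nvcc -gencode arch=compute_13,code=sm_13
 *        -gencode arch=compute_20,code=sm_20 report_arch.cu -o report_arch
 * Run as: report_arch [device index]
 */
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Writes the architecture this copy of the kernel was compiled for
// (130 for sm_13, 200 for sm_20, and so on).
__global__ void report_arch(int *out)
{
#ifdef __CUDA_ARCH__
    *out = __CUDA_ARCH__;
#endif
}

int main(int argc, char **argv)
{
    // Optional first argument selects the device; default is device 0.
    int dev = (argc > 1) ? atoi(argv[1]) : 0;
    if (cudaSetDevice(dev) != cudaSuccess) {
        printf("cannot select device %d\n", dev);
        return 1;
    }

    int *d_arch = NULL;
    int h_arch = 0;
    if (cudaMalloc(&d_arch, sizeof(int)) != cudaSuccess) {
        printf("cudaMalloc failed\n");
        return 1;
    }
    cudaMemset(d_arch, 0, sizeof(int));

    report_arch<<<1, 1>>>(d_arch);

    // A launch fails here if no compiled target matches the device.
    cudaError_t err = cudaGetLastError();
    if (err == cudaSuccess)
        err = cudaMemcpy(&h_arch, d_arch, sizeof(int), cudaMemcpyDeviceToHost);

    if (err != cudaSuccess)
        printf("device %d: kernel did not run: %s\n", dev, cudaGetErrorString(err));
    else
        printf("device %d: kernel was compiled for __CUDA_ARCH__ = %d\n", dev, h_arch);

    cudaFree(d_arch);
    return err == cudaSuccess ? 0 : 1;
}
```

Running it once per device index shows whether each card picks up the build that was intended for it.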

Hi Jan, thanks a lot for your help. The issue still remains but I do understand that you cannot help much more. I was advised to install the Tesla Compute driver but that didn’t help either.

I’ll try to post a bug report with NVidia.

Thanks again,

Christian

[quote]

Any help is very much appreciated. I’m using VS2008 on a Windows 7 x64 box. The NVidia driver is 258.96 and I use CUDA 3.1.

[/quote]

Since you are using Windows 7, it is very likely that you hit a Timeout Detection and Recovery (TDR).

Look here for some details:

http://forums.nvidia.com/index.php?showtopic=65161

[quote name=‘philippev’ post=‘1117357’ date=‘Sep 14 2010, 05:39 PM’]

Thanks, Philippe, for your tip. Up until now the display driver just froze the system. I cannot tell whether the problem lies with the display driver or with Windows’ TDR. I have installed the new display driver (260.61) and CUDA 3.2, and now Windows tells me that the display driver has crashed but recovered. This is better, but I still don’t quite understand why the display driver has a problem with such a simple CUDA program.

Christian

Your program might be simple, but it takes too long to run because you are using only one thread per warp.

The TDR timeout under Windows 7 is around 2 seconds by default. (You can increase it through a registry key.)

Please read this to learn about TDR and how to adjust it:

http://forums.nvidia.com/index.php?showtopic=65161
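
The original kernel is not shown in this thread, so the following is only a sketch of the launch-configuration point, using a made-up kernel `scale` and a made-up problem size. Launching one thread per block leaves 31 of the 32 lanes in each warp idle, while launching full blocks does the same work with far fewer blocks. The check after each launch is where a watchdog timeout would typically show up, assuming the driver reports it as `cudaErrorLaunchTimeout`. `cudaDeviceSynchronize` assumes CUDA 4.0 or later; the toolkits mentioned in this thread used `cudaThreadSynchronize` instead.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: each thread handles one element.
__global__ void scale(float *data, int n, float factor)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        data[i] *= factor;
}

// Reports launch and execution errors for the most recent kernel.
static bool launch_ok(const char *label)
{
    cudaError_t err = cudaGetLastError();        // errors raised at launch
    if (err == cudaSuccess)
        err = cudaDeviceSynchronize();           // errors raised while running

    if (err == cudaErrorLaunchTimeout)
        printf("%s: stopped by the watchdog timer\n", label);
    else if (err != cudaSuccess)
        printf("%s: %s\n", label, cudaGetErrorString(err));
    else
        printf("%s: ok\n", label);
    return err == cudaSuccess;
}

int main()
{
    const int n = 50000;                         // made-up problem size
    float *d_data = NULL;
    if (cudaMalloc(&d_data, n * sizeof(float)) != cudaSuccess) {
        printf("cudaMalloc failed\n");
        return 1;
    }
    cudaMemset(d_data, 0, n * sizeof(float));

    // One thread per block: every warp carries a single active lane.
    scale<<<n, 1>>>(d_data, n, 2.0f);
    launch_ok("one thread per block");

    // 256 threads per block: warps are fully populated.
    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;
    scale<<<blocks, threads>>>(d_data, n, 2.0f);
    launch_ok("256 threads per block");

    cudaFree(d_data);
    return 0;
}
```

For a kernel this small both launches finish quickly; the difference matters when each thread does enough work to approach the timeout.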

Philippe, interesting. But tell me, why can other people run the app without any problems? Jan Heckman ran the app on his GTX275 and it took a minute. There was no display driver problem. Also, bear in mind I have two GPUs in my machine: one that is running the monitor (Quadro NVS 295) and a Tesla C2050 for GPGPU stuff.

He simply ran it on Linux or Windows XP, or adjusted the TDR timeout on his Windows 7 machine.
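
Whether a run time limit applies can also be checked per device. A sketch, assuming a CUDA 5.0 or later runtime: `cudaDeviceGetAttribute` with `cudaDevAttrKernelExecTimeout` reports the limit for each GPU, so on a two-card machine like the one described above it shows whether the compute card is affected as well as the display card.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main()
{
    int count = 0;
    cudaError_t err = cudaGetDeviceCount(&count);
    if (err != cudaSuccess) {
        printf("cudaGetDeviceCount failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    for (int dev = 0; dev < count; ++dev) {
        cudaDeviceProp prop;
        int major = 0, minor = 0, timeout = 0;

        // Skip devices whose properties cannot be read.
        if (cudaGetDeviceProperties(&prop, dev) != cudaSuccess)
            continue;
        cudaDeviceGetAttribute(&major, cudaDevAttrComputeCapabilityMajor, dev);
        cudaDeviceGetAttribute(&minor, cudaDevAttrComputeCapabilityMinor, dev);
        // Non-zero when kernels on this device are subject to a run time limit.
        cudaDeviceGetAttribute(&timeout, cudaDevAttrKernelExecTimeout, dev);

        printf("device %d: %s, compute capability %d.%d, kernel run time limit: %s\n",
               dev, prop.name, major, minor, timeout ? "yes" : "no");
    }
    return 0;
}
```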

For the record:

The solution to my problem is to add a registry key which turns off Windows TDR ( Timeout Detection and Recovery ). Under

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\GraphicsDrivers

I added a DWORD named TdrLevel and set it to 0.

The article http://www.microsoft.com/whdc/device/displ…dm_timeout.mspx is somewhat misleading since it doesn’t explain that a user has to create such a key and not just modify it under Windows 7.
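
To check whether the value exists before or after editing the registry, a small host-only program can read it back. This is a sketch that assumes Windows Vista or later (for `RegGetValueA`) and linking against Advapi32.lib; it contains no device code and only reads the value, it does not create or change it.

```cuda
// Host-only code, Windows only; link with Advapi32.lib.
#include <windows.h>
#include <cstdio>

int main()
{
    DWORD level = 0;
    DWORD size = sizeof(level);

    // Read the DWORD value TdrLevel under the GraphicsDrivers key.
    LSTATUS rc = RegGetValueA(HKEY_LOCAL_MACHINE,
                              "SYSTEM\\CurrentControlSet\\Control\\GraphicsDrivers",
                              "TdrLevel", RRF_RT_REG_DWORD, NULL, &level, &size);

    if (rc == ERROR_SUCCESS)
        printf("TdrLevel = %lu%s\n", level, level == 0 ? " (TDR turned off)" : "");
    else if (rc == ERROR_FILE_NOT_FOUND)
        printf("TdrLevel is not present; it has to be created first\n");
    else
        printf("could not read TdrLevel (error %ld)\n", rc);
    return 0;
}
```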

I ran it on win7. Had to disable WDDM TDR for Nsight and completely forgot about it, so thanks for helping out!