Has 2-second timeout problem been fixed yet?

TimothyMasters · November 1, 2012, 7:54pm

I developed a serious commercial app using CUDA 3.1 about two years ago. However, the 2-second timeout feature of WDDM prevented my app from being usable with large problems, and I gave up on CUDA in favor of massive CPU multi-threading. But now that CUDA 5 is out, I’m considering getting back into CUDA development. However, this timeout limit is a deal killer. I know that someone provided a registry-edit fix for the developer’s computer, but that’s not good enough. I refuse to tell my customers that they have to edit their registry in order to run my software! I need a way for my app to temporarily disable the timeout while it’s running. Is this possible yet? Thanks!

Tim

pszilard · November 1, 2012, 8:50pm

Switch OS-es and your problem will be gone! ;)

Alternatively, you can tell your customers to use the un-crippled Tesla drivers by either using Tesla or Quadro cards (and paying a pretty premium) or by fixing the driver (you can edit the .inf file easily).

sjiagc · November 2, 2012, 3:46am

TDR (the 2-second timeout) is a Windows setting. You can disable it by editing the registry. Please refer to this page: Microsoft Docs - Developer tools, technical documentation and coding examples

njuffa · November 2, 2012, 6:38pm

As pszilard alludes to, the watchdog timer timeout is not a CUDA issue. Time-outs are also not specific to Windows; they can also occur on Linux (and presumably Mac OS X, but I have no personal experience with that platform). In general, a GPU can, at any given moment, serve either a compute task or a graphics task. This means running a CUDA kernel is mutually exclusive with refreshing a GUI as long as there is only a single GPU in the system. To prevent the GUI from becoming unresponsive, operating systems implement a watchdog timer.

To avoid the watchdog timer issue on Linux when only one GPU is present, simply run without X. I am not sure what all the alternatives under Windows are, besides manipulating the TDR timeout limit. I seem to recall that using two GPUs, and extending the desktop only to the less powerful one, is one technique one can use. As far as I know, running with the TCC driver on Windows (already mentioned) is mutually exclusive with running a GUI on the same GPU, which is why the TCC driver is not affected by the TDR issue.

eugeneo · November 2, 2012, 11:49pm

I can confirm that this also applies to Mac OS X.

wimpy1 · November 5, 2012, 10:27am

I tried some of the registry tweaks for Windows, but still ran into trouble. I ended up chopping up my CUDA kernel to use smaller batches of data that each finished well under 2 seconds. Running lots of smaller batches worked ok for the task I was facing, but opens the door to other problems.

I now have a Tesla card and am really happy with the TCC driver and additional capabilities of that card, even though it was expensive.
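
For readers who want to try the batching approach described above, here is a minimal host-side sketch in C++. It is only an illustration under stated assumptions: `run_in_batches` and its `launch` callback are hypothetical names, and in a real CUDA program the callback would launch a kernel on the given sub-range and synchronize before returning. The batch size that keeps each launch under the watchdog limit depends on the kernel and the GPU, so it has to be tuned by measurement.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>

// Hypothetical helper (not part of CUDA): splits the index range [0, total)
// into consecutive batches of at most batch_size items and calls
// launch(offset, count) once per batch.
// In a real program, launch would start a kernel on that sub-range and then
// synchronize, so that no single launch runs long enough to hit the watchdog.
// Returns the number of batches issued.
std::size_t run_in_batches(
    std::size_t total, std::size_t batch_size,
    const std::function<void(std::size_t, std::size_t)>& launch) {
    if (batch_size == 0) {
        return 0;  // a zero batch size would never make progress
    }
    std::size_t batches = 0;
    std::size_t offset = 0;
    while (offset < total) {
        // The last batch may be shorter than batch_size.
        const std::size_t count = std::min(batch_size, total - offset);
        launch(offset, count);
        offset += count;
        ++batches;
    }
    return batches;
}
```

Calling `run_in_batches(10, 4, launch)` invokes `launch` with (0, 4), (4, 4) and (8, 2). The extra launches and any state carried between batches are the kind of overhead and complexity mentioned in this thread.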

sjiagc · November 6, 2012, 3:39am

The registry change worked for me. Did you restart your machine after changing the registry?

TimothyMasters · November 9, 2012, 11:47am

Thanks for all the replies! Unfortunately, I was not clear enough in my question, for which I apologize. I know that this timeout is an OS issue, not a CUDA issue, and I know that there is a registry fix. But this is not an option for me because the apps I sell are generally installed on all computers in a company, and I can’t ask the IT guy to edit all those registries. Also, breaking up the problem into smaller chunks introduces more complexity and overhead than I want to deal with. But I do see what must be a way: software editing of the registry via the Windows API.

Here is the rough idea: The Windows API offers (I believe) a way for a running program to edit the registry, though I’m sure it imposes some restrictions for security. I’m not enough of a Windows expert to know the restrictions. My hope was that someone would add a CUDA API call that does the appropriate Windows registry edit to allow the CUDA developer to specify whatever timeout is desired. That way the programmer could not only change it as desired, but then put it back to two seconds when the program is finished. Does anyone have any ideas on this approach?

Tim
