178.08 for Windows XP: yes, it fixes the watchdog timer...

XP 32: http://developer.download.nvidia.com/compu…(178_08)Int.exe
XP 64: http://developer.download.nvidia.com/compu…(178_08)Int.exe

Watchdog timer bug and cudaMalloc bugs are fixed. Apparently, these install now–yay!

Now… by “fix”, do you mean that I can run a kernel for as long as I want under XP with the device as my main rendering/monitor-attached device?
Or is there some other bug I was not aware of?

I believe that has something to do with the internals of Windows XP, rather than being a CUDA issue…though I hear the CUDA developers are working on a ‘workaround’ for this problem.

There was a bug that devices that were not attached to a display were affected by the watchdog timer. That is now fixed.
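For anyone who wants to check what the runtime reports for each card, here is a minimal sketch. It assumes a CUDA toolkit recent enough to expose the `kernelExecTimeoutEnabled` field in `cudaDeviceProp` (the toolkit that shipped alongside 178.08 may not have it), and whether that flag reflects the bug described above is also an assumption.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main()
{
    int count = 0;
    cudaError_t err = cudaGetDeviceCount(&count);
    if (err != cudaSuccess) {
        std::printf("cudaGetDeviceCount failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    for (int dev = 0; dev < count; ++dev) {
        cudaDeviceProp prop;
        err = cudaGetDeviceProperties(&prop, dev);
        if (err != cudaSuccess) {
            std::printf("device %d: query failed: %s\n", dev, cudaGetErrorString(err));
            continue;
        }
        // Non-zero means the driver enforces a run-time limit on kernels
        // for this device (typically the one driving a display).
        std::printf("device %d (%s): kernel run-time limit: %s\n",
                    dev, prop.name,
                    prop.kernelExecTimeoutEnabled ? "yes" : "no");
    }
    return 0;
}
```

A card with no display attached would be expected to report "no" once the fix is in place.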

I’m a bit confused.
I’m running the 178.13 WHQL driver from September 25.

Is this 178.08 older (so it would seem from the version number), despite being released later? :blink:

Thanks

Fernando

I don’t know about the display driver component, but it will definitely have a newer CUDA stack than 178.13.

Completely confused …

NVIDIA released 178.13 as the latest driver package, and mfatica answered my question about the watchdog problem by saying that it is NOT fixed in 178.13.

Now tmurray says that 178.08 FIXES the watchdog limitation. Am I right that 178.13 contains older CUDA-related stuff and 178.08 is the right package for CUDA users?

If so, WHERE can I finally get 178.08 ??? :-)

the install package was broken, we’re working on getting a fixed one out ASAP.

bump–new install package, everything should work now!

No, it does not :-|

To be more precise: the watchdog is gone and the kernel is no longer stopped after 11-12 seconds, but now it looks like it simply hangs.

I’m testing it like this: open asyncAPI example from SDK and modify the kernel in the following way:

__global__ void increment_kernel(int *g_data, int inc_value)
{
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    // Repeat the increment to stretch the kernel's run time.
    for (int i = 0; i < 900; i++)
        g_data[idx] = g_data[idx] + inc_value;
}

The kernel can be made to run longer and longer by raising the bound in the ‘i < xx’ condition. On my test hardware (8500 GT) it runs for about 11 seconds with ‘i < 800’, but it never returns with ‘i < 900’. Once again, the 11-12 second boundary.
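A standalone version of this test might look like the sketch below, which times the kernel with CUDA events and reports whether the runtime returns a launch timeout. It is not the asyncAPI sample itself: the `iterations` parameter, the `CHECK` macro, the element count, and the `volatile` pointer (added so the compiler does not collapse the loop) are all assumptions made for illustration.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Same idea as the modified asyncAPI kernel, with the loop bound passed in.
// The volatile pointer is an addition so every iteration touches memory.
__global__ void increment_kernel(int *g_data, int inc_value, int iterations)
{
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    volatile int *p = g_data + idx;
    for (int i = 0; i < iterations; i++)
        *p = *p + inc_value;
}

// Hypothetical helper: report the failing call and exit from main.
#define CHECK(call)                                                  \
    do {                                                             \
        cudaError_t e_ = (call);                                     \
        if (e_ != cudaSuccess) {                                     \
            std::fprintf(stderr, "%s failed: %s\n", #call,           \
                         cudaGetErrorString(e_));                    \
            return 1;                                                \
        }                                                            \
    } while (0)

int main(int argc, char **argv)
{
    // Loop bound from the command line, e.g. 800 or 900.
    int iterations = (argc > 1) ? std::atoi(argv[1]) : 800;
    const int n = 16 * 1024 * 1024;   // element count (assumed)
    const int threads = 512;
    const int blocks = n / threads;

    int *d_data = 0;
    CHECK(cudaMalloc((void **)&d_data, n * sizeof(int)));
    CHECK(cudaMemset(d_data, 0, n * sizeof(int)));

    cudaEvent_t start, stop;
    CHECK(cudaEventCreate(&start));
    CHECK(cudaEventCreate(&stop));

    CHECK(cudaEventRecord(start, 0));
    increment_kernel<<<blocks, threads>>>(d_data, 1, iterations);
    CHECK(cudaGetLastError());        // catches bad launch configurations
    CHECK(cudaEventRecord(stop, 0));

    // Blocks until the kernel finishes, is killed, or (in the bad case) forever.
    cudaError_t err = cudaEventSynchronize(stop);
    if (err == cudaErrorLaunchTimeout) {
        std::fprintf(stderr, "kernel was stopped by the watchdog\n");
        return 2;
    }
    if (err != cudaSuccess) {
        std::fprintf(stderr, "kernel failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    float ms = 0.0f;
    CHECK(cudaEventElapsedTime(&ms, start, stop));
    std::printf("iterations=%d, kernel time: %.1f ms\n", iterations, ms);

    CHECK(cudaEventDestroy(start));
    CHECK(cudaEventDestroy(stop));
    CHECK(cudaFree(d_data));
    return 0;
}
```

Running it with increasing loop bounds should show whether the kernel completes, gets killed with a timeout error, or never returns from the synchronize call, which is the hang described above.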

tmurray, I really appreciate your efforts and I say ‘thank you’ personally to you, but it seems like the guy who fixes the watchdog issue should be strongly motivated to finally do this job well (unless I’m wrong with the way I force kernel to work for a longer time).

blah, definitely was passing our tests (the watchdog is no longer being triggered, as it’s easy to determine when it is). will poke around at that sample and try to break it tomorrow morning.

any news ?

Repro’d something, not sure what exactly, but we’re working on it.

Perfect answer :-)

Good luck, I rely upon you, guys.

Would it be possible to merge the latest official display drivers (from 178.13 WHQL) within the new, fixed package?

It would be great!

Thanks,

Fernando

I have no idea–you could certainly try, but I don’t have a clue what would happen. It’s not something we would do, though, because it would make more sense just to roll out a newer build.

(don’t necessarily believe that higher version numbers mean newer for all components)

:( This is even more confusing.

Any chance NVidia rolls out say 178.20 with latest display drivers and OK CUDA components, during the next couple weeks?

Thanks

Fernando

tmurray, sorry for the disturbance … but is there any news? This thing tires both developers and customers; time to finally fix it and forget about it :-)

Well, here we go again. I just noticed that the new driver came out with a working installer on the 7th. I am one of the original posters about the watchdog problem. I have a GTX 280 as the second card.

The problem has now evolved into something more complex:

The watchdog problem is still there for me, and I now have an additional problem. When I run a program that finishes within the time allowed, everything is fine. Then I run the program that needs more than 10 seconds; some of my functions run for 30+ seconds. After that I can no longer use the 280 card. It appears to be locked up after the first time the watchdog is triggered. If I make a call to that card, the application blocks in what I think is some kind of wait until the card is freed. However, it is never freed. I can still use the computer’s resources and the display card’s resources, but not the 280. To get it back I have to restart the system.

Now please tell me: could this be a problem I am creating, or are others seeing it too?

The watchdog is not fixed on all systems with that release. We tracked down the problem, it’s fixed now (seriously, 100%, I promise, you can hit me with a bat if it’s not, because the hanging behavior is exactly what I saw when running on one configuration and it has since been fixed), and I believe we’re getting yet another driver ready for release.