Launch timeout on machines with old NVIDIA display driver

I have four machines. Two of them have the latest display driver installed; the other two do not: their drivers were installed before CUDA 6.5 came out and have never been updated since.
I’m using CUDA 6.5 to write applications for different encryption-type algorithms like md5/aes/unzip, etc. Most of them work fine on all the machines, but now I’m working on ‘unzip’: it works fine on the two machines with the latest display driver, but fails on the two machines with the old display driver. It says launch timeout. I tried the Visual Profiler; the execution time is less than 1 ms, so why would it time out?
I know it would be best to update the driver, but the thing is my program needs to work on different machines, and I can’t force my clients to update their display drivers. I need to change my source code to support the old driver. So, what could possibly be causing the launch timeout on the old driver?
The only difference I can see is that the ‘unzip’ program allocates 100 MB of memory on both the host and device side (the return values are verified, all successful), and the other working programs like ‘md5/aes’ don’t allocate such a huge amount of memory. Does that have something to do with the launch timeout?
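Roughly, the allocation part looks like the sketch below (a simplified version with placeholder names, not the actual unzip code):

```cpp
// Simplified allocation sketch (placeholder names, not the real unzip code).
// Both host and device buffers are ~100 MB and every return value is checked.
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

int main()
{
    const size_t size = 100u * 1024u * 1024u;   // ~100 MB

    unsigned char *hostBuf = (unsigned char *)malloc(size);
    if (hostBuf == NULL) {
        fprintf(stderr, "host malloc failed\n");
        return EXIT_FAILURE;
    }

    unsigned char *devBuf = NULL;
    cudaError_t err = cudaMalloc((void **)&devBuf, size);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed: %s\n", cudaGetErrorString(err));
        return EXIT_FAILURE;
    }

    err = cudaMemcpy(devBuf, hostBuf, size, cudaMemcpyHostToDevice);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaMemcpy failed: %s\n", cudaGetErrorString(err));
        return EXIT_FAILURE;
    }

    // ... kernel launch and device-to-host copy happen here in the real code ...

    cudaFree(devBuf);
    free(hostBuf);
    return EXIT_SUCCESS;
}
```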
Thanks.

What operating system?

You can’t use a new toolchain on an older display driver.

Now five machines:

  1. Win7 32-bit + latest driver: works fine.
  2. Win7 64-bit + latest driver: works fine.
  3. WinXP 32-bit + latest driver: works fine.
  4. WinXP 32-bit + old driver: crashes.
  5. WinXP 32-bit + very old driver: crashes.

Why does it work for all the other programs like md5/aes, but not for unzip?

Your data quite clearly confirms the previous poster’s point that you need to update your driver so it is compatible with the toolchain and CUDA runtime you use.

We could speculate about why some apps work even with the old driver, e.g. the specific functionality invoked does not actually touch CUDA, or those apps call a very limited number of API functions which have retained the same signature across multiple driver versions, or the apps have insufficient error checking so failures at the CUDA level go unnoticed.
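For illustration, an unchecked kernel launch can appear to succeed while doing nothing at all; a minimal check like the sketch below (hypothetical kernel, not your code) surfaces both launch errors and execution errors such as a launch timeout:

```cpp
// Minimal kernel-launch error checking sketch (hypothetical kernel).
#include <cstdio>
#include <cuda_runtime.h>

__global__ void dummyKernel(int *out)
{
    out[threadIdx.x] = threadIdx.x;
}

int main()
{
    int *devOut = NULL;
    cudaError_t err = cudaMalloc((void **)&devOut, 256 * sizeof(int));
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaMalloc: %s\n", cudaGetErrorString(err));
        return 1;
    }

    dummyKernel<<<1, 256>>>(devOut);

    // Errors in the launch itself (bad configuration, incompatible driver, ...)
    err = cudaGetLastError();
    if (err != cudaSuccess)
        fprintf(stderr, "launch error: %s\n", cudaGetErrorString(err));

    // Errors during execution (e.g. a launch timeout) only appear after a sync.
    err = cudaDeviceSynchronize();
    if (err != cudaSuccess)
        fprintf(stderr, "execution error: %s\n", cudaGetErrorString(err));

    cudaFree(devOut);
    return 0;
}
```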

But that is very much a pointless exercise, in particular since we have no detailed knowledge of the code you are running. The fundamental fact is, every CUDA version requires a certain minimum driver version to work correctly across all applications based on that CUDA version.

Thank you. I’m commenting out the code line by line to see which part causes the problem. Maybe I should also check the driver version first so the program refuses to run on drivers that are too old.
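Something along the lines of the sketch below could do that; comparing the driver’s supported CUDA version against the runtime version is only an approximation of the documented minimum-driver requirement, so treat it as a sketch rather than the official check:

```cpp
// Bail out early if the installed display driver is older than the CUDA runtime.
#include <cstdio>
#include <cuda_runtime.h>

int main()
{
    int driverVersion = 0, runtimeVersion = 0;

    // driverVersion stays 0 if no CUDA-capable driver is installed at all.
    cudaDriverGetVersion(&driverVersion);
    cudaRuntimeGetVersion(&runtimeVersion);

    printf("driver supports CUDA %d.%d, runtime built for CUDA %d.%d\n",
           driverVersion / 1000, (driverVersion % 100) / 10,
           runtimeVersion / 1000, (runtimeVersion % 100) / 10);

    if (driverVersion < runtimeVersion) {
        fprintf(stderr, "display driver is too old for this CUDA runtime, "
                        "please update the driver\n");
        return 1;
    }

    // ... safe to proceed with normal CUDA work ...
    return 0;
}
```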