minus sign in kernel time

Dears;

When I measure the elapsed time of a kernel, it sometimes reports a negative time, and the kernel doesn't work.

Why does that happen?

likely a case of: the kernel did not launch properly, and hence the time is negative, rather than: the time is negative, and hence the kernel did not launch properly
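A minimal sketch of that idea, assuming the time is measured with CUDA events (the original post does not say how it was measured). The kernel dummyKernel and the sizes are hypothetical stand-ins. The point is only that the elapsed time is reported when every call succeeded, and is otherwise treated as meaningless:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical stand-in kernel; the sorting kernel from the post is not shown.
__global__ void dummyKernel(int *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] += 1;
}

int main()
{
    const int n = 1 << 20;  // assumed problem size
    int *d_data = nullptr;
    if (cudaMalloc(&d_data, n * sizeof(int)) != cudaSuccess) return 1;
    cudaMemset(d_data, 0, n * sizeof(int));

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    cudaEventRecord(start);
    dummyKernel<<<(n + 255) / 256, 256>>>(d_data, n);
    cudaError_t err = cudaGetLastError();  // launch (pre-launch) errors
    cudaEventRecord(stop);

    // Waiting on the stop event may also report an error from the kernel itself.
    if (err == cudaSuccess) err = cudaEventSynchronize(stop);

    float ms = 0.0f;
    if (err == cudaSuccess) err = cudaEventElapsedTime(&ms, start, stop);

    if (err != cudaSuccess)
        printf("Timing is not valid: %s\n", cudaGetErrorString(err));
    else
        printf("Kernel time: %.3f ms\n", ms);

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(d_data);
    return err == cudaSuccess ? 0 : 1;
}
```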

check that the kernel launched without error: the programming guide specifies how to do this; if i am not mistaken, it amounts to checking for errors just prior to the kernel launch, and then again afterwards

thanks little_jimmy;

I used the following code after the kernel launch:

cudaError_t error = cudaGetLastError();
if (error != cudaSuccess)
{
    // something's gone wrong
    // print out the CUDA error as a string
    printf("CUDA Error: %s\n", cudaGetErrorString(error));

    // we can't recover from the error -- exit the program
    return 1;
}

it printed on the console window:
the launch timed out and was terminated

My kernel sorts an array, and it works well up to a size of 256*1024.
If I increase the size, for example to 512*1024, the error occurs, although I use a small number of threads.

I don’t know why the error happens.

to quote the pg:

“Kernel launches do not return any error code, so cudaPeekAtLastError() or
cudaGetLastError() must be called just after the kernel launch to retrieve any
pre-launch errors. To ensure that any error returned by cudaPeekAtLastError()
or cudaGetLastError() does not originate from calls prior to the kernel launch,
one has to make sure that the runtime error variable is set to cudaSuccess just before
the kernel launch, for example, by calling cudaGetLastError() just before the
kernel launch. Kernel launches are asynchronous, so to check for asynchronous
errors, the application must synchronize in-between the kernel launch and the call to
cudaPeekAtLastError() or cudaGetLastError() .”

it is best to call cudaGetLastError() just prior to the kernel launch as well, to make absolutely sure that the kernel is indeed the origin of the error
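The sequence described in the quote can be sketched roughly as follows. The helper checkCuda and the kernel dummyKernel are hypothetical names used only for illustration; the runtime calls themselves (cudaGetLastError, cudaDeviceSynchronize, cudaGetErrorString) are the ones the guide refers to:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical stand-in kernel; replace with the kernel under test.
__global__ void dummyKernel(int *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] += 1;
}

// Hypothetical helper: prints the error with a label and reports success or failure.
static bool checkCuda(cudaError_t err, const char *what)
{
    if (err != cudaSuccess) {
        printf("CUDA Error (%s): %s\n", what, cudaGetErrorString(err));
        return false;
    }
    return true;
}

int main()
{
    const int n = 1 << 20;  // assumed problem size
    int *d_data = nullptr;
    if (!checkCuda(cudaMalloc(&d_data, n * sizeof(int)), "cudaMalloc")) return 1;
    if (!checkCuda(cudaMemset(d_data, 0, n * sizeof(int)), "cudaMemset")) return 1;

    // 1. Read (and reset) the error variable before the launch, so that an
    //    earlier failure is not blamed on the kernel.
    bool ok = checkCuda(cudaGetLastError(), "before launch");

    dummyKernel<<<(n + 255) / 256, 256>>>(d_data, n);

    // 2. Immediately after the launch: pre-launch errors (bad configuration etc.).
    ok = checkCuda(cudaGetLastError(), "kernel launch") && ok;

    // 3. After synchronizing: asynchronous errors raised while the kernel ran,
    //    such as a launch timeout.
    ok = checkCuda(cudaDeviceSynchronize(), "kernel execution") && ok;

    cudaFree(d_data);
    return ok ? 0 : 1;
}
```

If only step 3 reports an error, the kernel started but failed while running, which points at the kernel itself rather than at the code before it.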

if that is indeed the case, you would likely have to add some breakpoints and debug the kernel; some individual decided to write a debugger, as he realized that in some cases, there is no simple way to correct incorrect code

read this thread:

https://devtalk.nvidia.com/default/topic/459869/cuda-programming-and-performance/-quot-display-driver-stopped-responding-and-has-recovered-quot-wddm-timeout-detection-and-recovery-/

When your kernel takes too long to execute, Windows kills it. Increasing the problem size makes the kernel run longer, and eventually it exceeds the TDR (Timeout Detection and Recovery) threshold.
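One rough way to see whether a device is subject to such a limit, and one common workaround when disabling TDR is not an option, is sketched below. The kernelExecTimeoutEnabled field of cudaDeviceProp is part of the runtime API; the kernel processChunk and the chunk size are assumptions made for illustration. Whether a sort can be split into independent launches like this depends on the algorithm, so the element-wise kernel here is only a stand-in:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical stand-in kernel that handles only data[offset .. offset + count).
__global__ void processChunk(int *data, int offset, int count)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < count) data[offset + i] += 1;
}

int main()
{
    // Ask the runtime whether device 0 has a run time limit on kernels.
    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, 0) != cudaSuccess) return 1;
    printf("Run time limit on kernels: %s\n",
           prop.kernelExecTimeoutEnabled ? "yes" : "no");

    const int n = 512 * 1024;   // size at which the poster saw the timeout
    const int chunk = 64 * 1024; // assumed chunk size; tune for the real kernel
    int *d_data = nullptr;
    if (cudaMalloc(&d_data, n * sizeof(int)) != cudaSuccess) return 1;
    cudaMemset(d_data, 0, n * sizeof(int));

    // Several short launches instead of one long one, so that no single
    // launch runs long enough to trigger the watchdog.
    for (int offset = 0; offset < n; offset += chunk) {
        int count = (n - offset < chunk) ? (n - offset) : chunk;
        processChunk<<<(count + 255) / 256, 256>>>(d_data, offset, count);

        cudaError_t err = cudaGetLastError();
        if (err == cudaSuccess) err = cudaDeviceSynchronize();
        if (err != cudaSuccess) {
            printf("CUDA Error at offset %d: %s\n", offset, cudaGetErrorString(err));
            cudaFree(d_data);
            return 1;
        }
    }

    cudaFree(d_data);
    return 0;
}
```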

Thank you all

I added the error check before and after the kernel launch and found that the error comes from the kernel.

As for the timeout error, I disabled TDR and the kernel now works with no error.