How to check if a kernel was launched? Is it possible that a kernel failed to launch but the failure was not recorded?

My question is whether the following is the correct way to check that a kernel was launched:

__global__ void suspiciousKernel(int *i) {
	*i=123;
	__syncthreads();
	[...] //some big code goes over here
}

int main() {
	cudaError err;
	int *gpuI;
	int cpuI=42;
	err=cudaMalloc( (void**)&gpuI, sizeof(int));
	printf("Allocate: %s\n",cudaGetErrorString(err));
	err=cudaMemcpy( gpuI,&cpuI,sizeof(int),cudaMemcpyHostToDevice);
	printf("Send: %s\n",cudaGetErrorString(err));
	suspiciousKernel<<<1,512>>>(gpuI);
	err=cudaThreadSynchronize();
	printf("Launch: %s\n",cudaGetErrorString(err));
	err=cudaMemcpy(&cpuI,gpuI,sizeof(int),cudaMemcpyDeviceToHost);
	printf("Receive: %s\n",cudaGetErrorString(err));
	printf("Got value %d\n",cpuI);
}

According to the Programming Guide, "this function may also return error codes from previous, asynchronous launches."

So I would expect that if my kernel call crashes or is not executed for whatever reason, cudaThreadSynchronize will return something other than cudaSuccess.

On the other hand, if the kernel is executed, the memory that gpuI points to should now hold the value 123, assuming my "some big code goes over here" does not modify (or even read or depend on) that value. What I actually get from the above code is:

Allocate: no error
Send: no error
Launch: no error
Receive: no error
Got value 42

So my question is: what must have happened for me to get these results?

Some notes:

  • I launch only one block of my suspicious kernel, so __syncthreads() synchronizes all of the kernel's threads on the GPU.

  • I could be changing *i accidentally, but if that happens, what are the odds of setting it back to the old value?

Update: The kernel did not launch because it was using 33 registers, which was too many for this launch configuration.
Still, how can I detect at runtime that a kernel failed to launch, if not the way I have shown above?
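The resource limit from the update can also be inspected before launching. The sketch below is an illustration, not the original poster's code: it assumes the 33 registers are counted per thread (so 33 × 512 = 16896 registers for one block), uses device 0, and drops the elided "big code", so the register count it reports will differ from 33. Whether 16896 exceeds the limit depends on the device, which the post does not name. The runtime reports the largest block size a given kernel can be launched with in cudaFuncAttributes::maxThreadsPerBlock.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Stand-in for the kernel in the question; the elided part is not reproduced.
__global__ void suspiciousKernel(int *i) {
    *i = 123;
    __syncthreads();
}

int main() {
    const int threadsPerBlock = 512;  // launch configuration from the question

    // Per-kernel resource usage as seen by the runtime.
    cudaFuncAttributes attr;
    cudaError_t err = cudaFuncGetAttributes(&attr, suspiciousKernel);
    if (err != cudaSuccess) {
        printf("cudaFuncGetAttributes: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Per-device limits (device 0 is an assumption).
    cudaDeviceProp prop;
    err = cudaGetDeviceProperties(&prop, 0);
    if (err != cudaSuccess) {
        printf("cudaGetDeviceProperties: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Rough estimate only: real allocation is rounded up by the hardware.
    long estimatedRegs = (long)attr.numRegs * threadsPerBlock;
    printf("Registers per thread: %d\n", attr.numRegs);
    printf("Estimated registers per block: %ld (device limit: %d)\n",
           estimatedRegs, prop.regsPerBlock);
    printf("Largest launchable block for this kernel: %d threads\n",
           attr.maxThreadsPerBlock);

    if (threadsPerBlock > attr.maxThreadsPerBlock) {
        printf("A launch with %d threads per block would fail.\n",
               threadsPerBlock);
    }
    return 0;
}
```

This kind of pre-check only covers block-size limits; it does not replace checking for errors after the launch itself.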

Call cudaGetLastError right after your kernel launch. Launch errors are not sticky, because they do not result in the context being destroyed, and therefore cudaThreadSynchronize returns cudaSuccess.
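A minimal sketch of that check, reusing the names from the question. It assumes a toolkit that provides cudaDeviceSynchronize, the current replacement for the deprecated cudaThreadSynchronize, and it leaves out the elided "big code", so with this small kernel the launch is not expected to fail:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Stand-in for the kernel in the question; the elided part is not reproduced.
__global__ void suspiciousKernel(int *i) {
    *i = 123;
    __syncthreads();
}

int main() {
    int *gpuI = NULL;
    int cpuI = 42;
    cudaError_t err;

    err = cudaMalloc((void **)&gpuI, sizeof(int));
    printf("Allocate: %s\n", cudaGetErrorString(err));

    err = cudaMemcpy(gpuI, &cpuI, sizeof(int), cudaMemcpyHostToDevice);
    printf("Send: %s\n", cudaGetErrorString(err));

    suspiciousKernel<<<1, 512>>>(gpuI);

    // Launch-time failures (invalid configuration, too many resources
    // requested) are reported here, immediately after the launch.
    err = cudaGetLastError();
    printf("Launch (get last error): %s\n", cudaGetErrorString(err));

    // Failures that occur while the kernel is running are reported when
    // the host waits for the device to finish.
    err = cudaDeviceSynchronize();
    printf("Execute: %s\n", cudaGetErrorString(err));

    err = cudaMemcpy(&cpuI, gpuI, sizeof(int), cudaMemcpyDeviceToHost);
    printf("Receive: %s\n", cudaGetErrorString(err));
    printf("Got value %d\n", cpuI);

    cudaFree(gpuI);
    return 0;
}
```

cudaGetLastError also resets the recorded error to cudaSuccess, so call it once per launch and keep the returned value.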

Tested it out. Got:

Launch (get last error): too many resources requested for launch

Now I am happy :) Thank you!

I thought that errors from kernel calls are caught by the next function that can return an error (that "Note that this function may also return error codes from previous, asynchronous launches" sentence). Obviously I was mistaken.