CUDA kernel crash on K80

I have proven, working code that runs fine on a Tesla K20 with the CUDA 6.5 toolkit. When I try to run this code on a Tesla K80 with CUDA 8.0, it fails with the error ‘Illegal memory access at xx line no in file’.

To isolate the problem, I ran the code on the Tesla K20 with CUDA 8.0, and it worked fine. That suggests the toolkit is not the issue.

I went on with further debugging. The memory allocation with cudaMalloc was fine, and copying the data to the device with cudaMemcpy was fine. When I call cudaOccupancyMaxPotentialBlockSize to calculate the block and grid sizes, the application terminates with the error given above. I commented out this line and hard-coded the launch configuration with block size = 1024 and grid size = 42. When I try to launch the kernel, the same error is thrown.

Any help?

Thanks in advance.

“Running fine” can also mean “happens to work”, rather than “proven to work”. Latent bugs based on race conditions sometimes don’t manifest on a particular platform.

Keep in mind that CUDA errors are sticky, so unless there is rigorous status checking on every CUDA API call and kernel launch, an error may only be reported by a subsequent API call, which can be confusing: the call that reports the error is not necessarily the call that caused it.
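As a rough illustration of what rigorous status checking could look like, here is a minimal sketch. The macro name CUDA_CHECK and the kernel scale are hypothetical and not taken from the original code; only the CUDA runtime calls themselves are real API. The idea is that every API call is checked, and a kernel launch is followed by both cudaGetLastError (launch errors) and cudaDeviceSynchronize (errors during execution), so a failure is reported close to where it actually happened.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper macro: abort with file/line info if a CUDA call fails.
#define CUDA_CHECK(call)                                                  \
    do {                                                                  \
        cudaError_t err_ = (call);                                        \
        if (err_ != cudaSuccess) {                                        \
            fprintf(stderr, "CUDA error '%s' at %s:%d\n",                 \
                    cudaGetErrorString(err_), __FILE__, __LINE__);        \
            exit(EXIT_FAILURE);                                           \
        }                                                                 \
    } while (0)

// Hypothetical stand-in kernel; the bounds check avoids out-of-range writes.
__global__ void scale(float *data, int n, float factor)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        data[i] *= factor;
    }
}

int main()
{
    const int n = 1 << 20;
    float *d_data = nullptr;

    CUDA_CHECK(cudaMalloc(&d_data, n * sizeof(float)));
    CUDA_CHECK(cudaMemset(d_data, 0, n * sizeof(float)));

    // Ask the runtime for a block size suited to this kernel on this device.
    int minGridSize = 0;
    int blockSize = 0;
    CUDA_CHECK(cudaOccupancyMaxPotentialBlockSize(&minGridSize, &blockSize,
                                                  scale, 0, 0));
    int gridSize = (n + blockSize - 1) / blockSize;

    scale<<<gridSize, blockSize>>>(d_data, n, 2.0f);
    CUDA_CHECK(cudaGetLastError());        // catches launch/configuration errors
    CUDA_CHECK(cudaDeviceSynchronize());   // catches errors during kernel execution

    CUDA_CHECK(cudaFree(d_data));
    return 0;
}
```

With checks like these in place, the first failing line in the output is a much better indication of the real culprit than an error that surfaces several calls later.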

Have you tried running the code under control of cuda-memcheck with full checking?

It's a proven program that gives correct results without race conditions on the Tesla K20. However, the problem now is not about the correctness of the result: I am able neither to call cudaOccupancyMaxPotentialBlockSize nor to launch the kernel.