Strange error after updating to the latest driver 372.90

Hello everyone.
I need some help with a strange error that appeared after updating to the latest driver, 372.90.
The following error occurs when I try to debug CUDA code
(before the driver update, the same program ran without problems):

The error type is cudaErrorLaunchFailure (4)
“unspecified launch failure”

However, when I run the program under the Nsight CUDA Debugger, the kernel runs and finishes successfully
without throwing any error message.

I am using some Kepler features in the kernel and building it with compute_35,sm_35.

Any ideas how to solve this problem?
Thanks for your attention.

CUDA versions: 7.5 and 8.0
System: GTX 970 and GTX 750 Ti
CPU: i7-4790 with 32 GB of RAM

This is the kernel that crashes during a debug run but
runs smoothly under the Nsight debugger:

__global__ void GlobalFixedScan(int* ArrayToScanAfterBlockScan, int* GAOR, int ArraySize)
{
	int Tid = threadIdx.x + blockIdx.x * blockDim.x; // global thread ID
	__shared__ int gaorval[32]; // one slot per warp
	if (Tid < ArraySize)
	{
		int MyValue = ArrayToScanAfterBlockScan[Tid]; // value after the per-block scan
		int ValToAdd = 0;
		for (int i = 0; i < blockIdx.x; i++)
		{
			// The first lane of each warp reads GAOR[i] from global memory
			// into the warp's shared slot. The GAOR array size is gridDim.x.
			if (threadIdx.x % warpSize == 0)
			{
				gaorval[threadIdx.x / warpSize] = GAOR[i];
			}
			ValToAdd += gaorval[threadIdx.x / warpSize];
		}
		MyValue += ValToAdd;
		ArrayToScanAfterBlockScan[Tid] = MyValue;
	}
}
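
To check the kernel's output independently of the driver, the same arithmetic can be reproduced on the host. The following is only a minimal CPU sketch, not code from the original program: the function name GlobalFixedScanReference, its signature, and the blockSize parameter are hypothetical, and it assumes blockSize equals the blockDim.x used at launch. It mirrors what the kernel above does: every element belonging to block b gets the sum of GAOR[0] .. GAOR[b-1] added to it.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side reference for GlobalFixedScan (name and signature
// are illustrative, not part of the original program).
//   data      : array after the per-block scan (ArrayToScanAfterBlockScan)
//   gaor      : one value per block (GAOR), expected size >= number of blocks
//   blockSize : threads per block used for the kernel launch (assumption)
void GlobalFixedScanReference(std::vector<int>& data,
                              const std::vector<int>& gaor,
                              std::size_t blockSize)
{
    assert(blockSize > 0);

    int valToAdd = 0;  // sum of gaor[0 .. block-1], as in the kernel loop
    for (std::size_t block = 0; block * blockSize < data.size(); ++block)
    {
        const std::size_t begin = block * blockSize;
        const std::size_t end = std::min(begin + blockSize, data.size());
        for (std::size_t i = begin; i < end; ++i)
        {
            data[i] += valToAdd;
        }
        if (block < gaor.size())
        {
            valToAdd += gaor[block];  // offset applied to the next block
        }
    }
}
```

Running this on a copy of the input and comparing it element by element with the array copied back from the device should show whether the kernel produced wrong values or failed before writing anything.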

I think this problem began after the Windows 10 Anniversary Update.
I ran the same code on another machine with a Tesla K20 and Windows 7, and it worked.

Any ideas?
