Hello everyone.
I need some help with a strange error that appeared after updating to the latest driver, 372.90.
Before the driver update, the same program ran smoothly.
The following error occurs when I try to debug my CUDA code:
The error type is cudaErrorLaunchFailure (4)
"unspecified launch failure"
However, when I run the Nsight CUDA Debugger, the kernel runs and finishes successfully
without reporting any error.
I am using some Kepler features in the kernel and compiling it with compute_35,sm_35.
Any ideas on how to solve this problem?
Thanks for your attention.
CUDA versions: 7.5 and 8.0
System: GTX 970 and GTX 750 Ti
Intel Core i7-4790 with 32 GB of RAM
This is the kernel that crashes during a normal debug run but
runs smoothly when the Nsight debugger is turned on:
__global__ void GlobalFixedScan(int* ArrayToScanAfterBlockScan, int* GAOR, int ArraySize)
{
    int Tid = threadIdx.x + blockIdx.x * blockDim.x; // global thread ID
    __shared__ int gaorval[32]; // one slot per warp (up to 32 warps per block)
    if (Tid < ArraySize)
    {
        int MyValue = 0;
        MyValue = ArrayToScanAfterBlockScan[Tid]; // load this thread's value from the array
        int ValToAdd = 0;
        // GAOR array size is gridDim.x; sum the entries of all preceding blocks
        for (int i = 0; i < blockIdx.x; i++)
        {
            // the first lane of each warp reads from global memory into the warp's shared slot
            if (threadIdx.x % warpSize == 0)
            {
                gaorval[threadIdx.x / warpSize] = GAOR[i];
            }
            // every lane reads its warp's slot (no explicit synchronization here)
            ValToAdd += gaorval[threadIdx.x / warpSize];
        }
        MyValue += ValToAdd;
        ArrayToScanAfterBlockScan[Tid] = MyValue;
    }
}
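
For reference, here is a minimal host-side sketch of the result this kernel appears to be meant to produce: every element of block b gets the sum GAOR[0] + ... + GAOR[b-1] added to it. The function name GlobalFixedScanHost and the blockSize parameter are hypothetical and are not part of the original program. blockSize stands in for the blockDim.x used at launch, which the post does not state. Something like this could be used to compare the GPU output against a CPU result.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CPU reference for GlobalFixedScan (illustration only).
// data      : array after the per-block scan, updated in place
// gaor      : one value per block (size must cover every block touched)
// blockSize : assumed number of threads per block used for the launch
void GlobalFixedScanHost(std::vector<int>& data,
                         const std::vector<int>& gaor,
                         int blockSize)
{
    assert(blockSize > 0);
    for (std::size_t tid = 0; tid < data.size(); ++tid)
    {
        // index of the block that would own this element on the GPU
        std::size_t block = tid / static_cast<std::size_t>(blockSize);
        assert(block <= gaor.size());

        // mirror the kernel loop: sum the GAOR entries of all preceding blocks
        int valToAdd = 0;
        for (std::size_t i = 0; i < block; ++i)
        {
            valToAdd += gaor[i];
        }
        data[tid] += valToAdd;
    }
}
```

With data = {1, 2, 3, 4, 5, 6}, gaor = {10, 20, 30} and blockSize = 2, the first block is left unchanged, the second block gets 10 added and the third gets 10 + 20 = 30 added.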