Does anyone have advice on the most likely cause of this fairly generic error (returned by cudaGetLastError())?
[codebox]unsigned int num_threads = 256;
unsigned int blocks = (len/num_threads) + 1;
//printf("block: %d\r\n",blocks);
dim3 grid(blocks, 1);
dim3 threads(num_threads, 1);
//dim3 grid(1, 1);
//dim3 threads(1, 1);
actFuncDouble<<< grid, threads >>>(f);
cutilCheckMsg("Kernel execution failed");[/codebox]
__global__ void
actFuncDouble( double* d_data )
{
    // write data to global memory
    const unsigned int tid = blockIdx.x*blockDim.x + threadIdx.x;
    //double data = d_data[tid];
    d_data[tid] = 1/( 1+exp(-d_data[tid]) );
}
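One thing worth ruling out: with blocks = (len/num_threads) + 1, the grid always contains more threads than there are elements, so every thread with tid >= len reads and writes past the end of d_data. Below is a minimal sketch of a guarded version. It assumes len is the number of elements in the array and is greater than zero; the names actFuncDoubleGuarded and blocksFor, and the extra len parameter, are hypothetical and not part of the original code.

[codebox]#include <cuda_runtime.h>
#include <math.h>

// Hypothetical variant of actFuncDouble that also receives the element count.
__global__ void actFuncDoubleGuarded(double* d_data, unsigned int len)
{
    const unsigned int tid = blockIdx.x * blockDim.x + threadIdx.x;
    // Threads in the last, partially filled block skip the write.
    if (tid < len) {
        d_data[tid] = 1.0 / (1.0 + exp(-d_data[tid]));
    }
}

// Hypothetical helper: ceiling division, so no extra block is launched
// when len is an exact multiple of num_threads.
unsigned int blocksFor(unsigned int len, unsigned int num_threads)
{
    return (len + num_threads - 1) / num_threads;
}[/codebox]

The launch would then look something like actFuncDoubleGuarded<<< blocksFor(len, num_threads), num_threads >>>(f, len). The guard is what prevents the out-of-bounds access; the ceiling division only avoids launching a block that would do no work.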
Most of my code calls CUBLAS functions, but I also needed to add a few basic kernels of my own (listed above). It seems simple enough, so I am wondering whether a CUBLAS error is being thrown and the regular driver/runtime CUDA API doesn't identify it, even though I syncThreads after a CUBLAS command completes. Any ideas?
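Because kernel launches are asynchronous, an error reported after one kernel may actually come from earlier device work, including kernels that CUBLAS launches internally. One way to narrow this down is to synchronize and check for errors both immediately before and immediately after the launch. This is only a sketch: checkCuda is a hypothetical helper, and it assumes a toolkit that provides cudaDeviceSynchronize() (older toolkits used cudaThreadSynchronize() for the same purpose).

[codebox]#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: waits for all outstanding device work, then reports
// any pending runtime error. Returns true if nothing went wrong.
bool checkCuda(const char* where)
{
    // Errors from preceding asynchronous work are returned here.
    cudaError_t err = cudaDeviceSynchronize();
    if (err == cudaSuccess) {
        // Also pick up launch errors such as an invalid configuration.
        err = cudaGetLastError();
    }
    if (err != cudaSuccess) {
        fprintf(stderr, "%s: %s\n", where, cudaGetErrorString(err));
        return false;
    }
    return true;
}[/codebox]

Calling checkCuda("before actFuncDouble") right before the launch and checkCuda("after actFuncDouble") right after it would show which side the error originates on: if the first call already fails, the problem lies in earlier work rather than in the new kernel.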