Help with error "invalid configuration argument"


I am trying to multiply 30 million cuComplex numbers element-wise with another 30 million cuComplex numbers. I launched the kernel with <<<10000,3000>>> and it worked fine. After some time it started showing the error “invalid configuration argument” (checked with cudaGetLastError()). No other process is running on the NVIDIA GPU. It does work with <<<30000,1000>>>. Please help me with it.


Impossible to diagnose without additional information. The description is equivalent to calling a car repair shop and telling them: “My car makes a strange noise. What’s wrong with it?”. For help with debugging issues, post complete buildable code, as minimal as possible (e.g. 50 lines).

The dimensions mentioned may suggest that your kernel is exceeding the kernel time limit imposed by an operating system watchdog timer on any GPU that services a GUI in addition to running compute kernels. Typically those limits are around two seconds. Try smaller problem sizes and observe the kernel execution times for those, with the help of the CUDA profiler.
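As a rough alternative to the profiler, kernel times for increasing problem sizes can also be measured with CUDA events. The following is only a minimal sketch: the kernel `scale`, the block size of 256, and the problem sizes are hypothetical stand-ins, not the poster's code.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical stand-in kernel (not the code from this thread).
__global__ void scale(float *x, int n, float s)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) x[i] *= s;
}

int main()
{
    const int threads = 256;      // assumed block size, within the 1024 limit
    const int maxN = 1 << 24;     // assumed largest problem size
    float *d_x = nullptr;

    if (cudaMalloc(&d_x, maxN * sizeof(float)) != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed\n");
        return 1;
    }
    cudaMemset(d_x, 0, maxN * sizeof(float));

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    // Double the problem size each round and time one launch per size.
    for (int n = 1 << 20; n <= maxN; n <<= 1) {
        int blocks = (n + threads - 1) / threads;
        cudaEventRecord(start);
        scale<<<blocks, threads>>>(d_x, n, 2.0f);
        cudaEventRecord(stop);
        cudaEventSynchronize(stop);   // wait until the kernel has finished

        float ms = 0.0f;
        cudaEventElapsedTime(&ms, start, stop);
        printf("n = %d: %.3f ms (%s)\n", n, ms,
               cudaGetErrorString(cudaGetLastError()));
    }

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(d_x);
    return 0;
}
```

If the measured times grow toward the watchdog limit as the size increases, that would support the timeout hypothesis; if a launch fails immediately regardless of size, the cause is more likely the launch configuration itself.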

This is illegal in CUDA:

<<<10000,3000>>>

It is not possible to choose 3000 threads per block.

This is legal:

<<<30000,1000>>>

The maximum is 1024 threads per block in CUDA.
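The per-block limit can be read from the device at run time instead of being hard-coded. A minimal sketch, assuming device 0 and the 30 million element count and block size of 1000 mentioned in this thread:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main()
{
    cudaDeviceProp prop;
    cudaError_t err = cudaGetDeviceProperties(&prop, 0);  // device 0 is an assumption
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaGetDeviceProperties: %s\n", cudaGetErrorString(err));
        return 1;
    }
    printf("maxThreadsPerBlock = %d\n", prop.maxThreadsPerBlock);
    printf("maxGridSize[0]     = %d\n", prop.maxGridSize[0]);

    const long long n = 30000000LL;  // element count from the question
    const int threads = 1000;        // block size that worked in the question

    // Reject a block size the device cannot launch (3000 would fail here).
    if (threads > prop.maxThreadsPerBlock) {
        fprintf(stderr, "%d threads per block exceeds the device limit\n", threads);
        return 1;
    }

    // Ceiling division so that blocks * threads covers all n elements.
    long long blocks = (n + threads - 1) / threads;
    if (blocks > prop.maxGridSize[0]) {
        fprintf(stderr, "grid of %lld blocks exceeds the device limit\n", blocks);
        return 1;
    }
    printf("launch configuration: <<<%lld, %d>>>\n", blocks, threads);
    return 0;
}
```

With these numbers the ceiling division gives 30000 blocks, matching the <<<30000,1000>>> configuration that works.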

Regarding full details:
GPU: GTX 980
CPU: Intel® Core™ i7-6700 CPU @ 3.40GHz
RAM: 16GB DDR4

__global__ void multiply(cuComplex *A, double B){
    // one thread per element of A
    int temp = blockIdx.x * blockDim.x + threadIdx.x;
    A[temp] = cuCmulf(A[temp], make_cuComplex(cos(temp*B), -sin(temp*B)));
}

In main:
cudaError_t error = cudaGetLastError();
printf("after rearrange CUDA error: %s \n", cudaGetErrorString(error));
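For reference, one common pattern is to call cudaGetLastError() directly after every kernel launch, so that a rejected launch configuration is reported at the launch that caused it, and then to check cudaDeviceSynchronize() for errors raised while the kernel runs. The sketch below is an assumption-laden variant of the kernel above: the `CUDA_CHECK` macro, the extra parameter `n`, the bounds check, and the value passed for `B` are all hypothetical additions.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>
#include <cuComplex.h>

// Hypothetical helper: print the error with its location and stop.
#define CUDA_CHECK(call)                                            \
    do {                                                            \
        cudaError_t e_ = (call);                                    \
        if (e_ != cudaSuccess) {                                    \
            fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,      \
                    cudaGetErrorString(e_));                        \
            exit(1);                                                \
        }                                                           \
    } while (0)

// Variant of the kernel with an added bounds check (n is an assumed parameter).
__global__ void multiply(cuComplex *A, double B, int n)
{
    int temp = blockIdx.x * blockDim.x + threadIdx.x;
    if (temp < n)
        A[temp] = cuCmulf(A[temp], make_cuComplex(cos(temp * B), -sin(temp * B)));
}

int main()
{
    const int n = 30000000;          // element count from the question
    cuComplex *d_A = nullptr;

    CUDA_CHECK(cudaMalloc(&d_A, (size_t)n * sizeof(cuComplex)));
    CUDA_CHECK(cudaMemset(d_A, 0, (size_t)n * sizeof(cuComplex)));

    // 1e-6 is an arbitrary placeholder for B.
    multiply<<<30000, 1000>>>(d_A, 1e-6, n);
    CUDA_CHECK(cudaGetLastError());        // launch errors, e.g. invalid configuration
    CUDA_CHECK(cudaDeviceSynchronize());   // errors raised during kernel execution

    CUDA_CHECK(cudaFree(d_A));
    printf("kernel completed without error\n");
    return 0;
}
```

Changing the launch to <<<10000,3000>>> in this sketch should make the first check after the launch fail with “invalid configuration argument”, since the kernel never starts.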

What txbob says. Note: I don’t look at problems requesting debugging help at all unless buildable repro code is posted. txbob is more accommodating :-)