I was trying to implement multiplication of 30 million cuComplex numbers with another 30 million cuComplex numbers. I launched the kernel with <<<10000,3000>>> and it worked fine. After some time it started showing me the error “invalid configuration argument” (checked with cudaGetLastError()), even though no other process is running on the NVIDIA GPU. It does work with <<<30000,1000>>>. Please help me with this.
Impossible to diagnose without additional information. The description is equivalent to calling a car repair shop and telling them: “My car makes a strange noise. What’s wrong with it?” For help with debugging issues, post complete buildable code, as minimal as possible (e.g. 50 lines).
The dimensions mentioned may suggest that your kernel is exceeding the kernel time limit imposed by an operating system watchdog timer on any GPU that services a GUI in addition to being used to run compute kernels. Typically those limits are around two seconds. Try smaller problem sizes and observe the kernel execution times for those, with the help of the CUDA profiler.
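Separately from the watchdog possibility, one thing can be checked from the launch configuration alone: the second number in <<<blocks, threads>>> is the thread count per block, and GPUs of compute capability 2.0 and later allow at most 1024 threads per block. 3000 is over that limit, which by itself produces “invalid configuration argument”, while 1000 is under it. Below is a minimal sketch of how one might check this and time the kernel. The kernel mulKernel, the helpers blocksFor and check, and the zero-filled input buffers are hypothetical stand-ins, since the original code was not posted, and CUDA events are used here for timing in place of the profiler.

```cuda
#include <cstdio>
#include <cuda_runtime.h>
#include <cuComplex.h>

// Hypothetical kernel (the original was not posted): c[i] = a[i] * b[i].
__global__ void mulKernel(const cuComplex* a, const cuComplex* b,
                          cuComplex* c, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) c[i] = cuCmulf(a[i], b[i]);   // guard the partial last block
}

// Hypothetical helper: blocks needed to cover n elements (ceiling division).
inline int blocksFor(int n, int threadsPerBlock)
{
    return (n + threadsPerBlock - 1) / threadsPerBlock;
}

// Hypothetical helper: print a labeled CUDA error; returns true on success.
static bool check(cudaError_t err, const char* what)
{
    if (err != cudaSuccess) {
        fprintf(stderr, "%s: %s\n", what, cudaGetErrorString(err));
        return false;
    }
    return true;
}

int main()
{
    const int n = 30000000;                      // size from the question
    const size_t bytes = n * sizeof(cuComplex);  // about 240 MB per array

    cudaDeviceProp prop;
    if (!check(cudaGetDeviceProperties(&prop, 0), "cudaGetDeviceProperties")) return 1;
    printf("maxThreadsPerBlock = %d\n", prop.maxThreadsPerBlock);

    cuComplex *a = nullptr, *b = nullptr, *c = nullptr;
    if (!check(cudaMalloc(&a, bytes), "cudaMalloc a")) return 1;
    if (!check(cudaMalloc(&b, bytes), "cudaMalloc b")) return 1;
    if (!check(cudaMalloc(&c, bytes), "cudaMalloc c")) return 1;
    cudaMemset(a, 0, bytes);                     // placeholder input data
    cudaMemset(b, 0, bytes);

    int threads = 1000;                          // 3000 would be rejected
    if (threads > prop.maxThreadsPerBlock) threads = prop.maxThreadsPerBlock;
    const int blocks = blocksFor(n, threads);

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    cudaEventRecord(start);
    mulKernel<<<blocks, threads>>>(a, b, c, n);
    cudaEventRecord(stop);

    // Launch-time errors such as an invalid configuration show up here.
    bool ok = check(cudaGetLastError(), "kernel launch");
    // Errors during execution (e.g. a watchdog timeout) show up here.
    ok = check(cudaEventSynchronize(stop), "kernel execution") && ok;

    if (ok) {
        float ms = 0.0f;
        cudaEventElapsedTime(&ms, start, stop);
        printf("<<<%d,%d>>> took %.3f ms\n", blocks, threads, ms);
    }

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(a); cudaFree(b); cudaFree(c);
    return ok ? 0 : 1;
}
```

If the error only appears when the launch is checked, the earlier runs with 3000 threads per block probably never executed the kernel at all; checking cudaGetLastError() right after every launch makes that visible immediately.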