cudnnConvolutionForward() returns CUDNN_STATUS_EXECUTION_FAILED when available GPU memory is low, even though we have preallocated the required workspace and passed its pointer in. Why does it still need additional GPU memory? Is it possible to make it use only the workspace?
Hi @shshao ,
We suggest not pre-allocating all available GPU memory. CUDA operations such as loading modules and launching kernels may allocate memory under the hood, and they can fail if there is not enough free memory left.
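For example, something along these lines (a minimal sketch only; the handle/descriptor names and the headroom margin are placeholders, not taken from your code): size the workspace from cuDNN's own query and keep some device memory free for CUDA's internal allocations, rather than grabbing all free memory up front.

```cpp
#include <cudnn.h>
#include <cuda_runtime.h>

// Sketch: query the workspace size cuDNN actually needs for the chosen
// algorithm and allocate it only if enough free memory remains, leaving
// headroom for module loading, kernel launches, etc.
cudaError_t allocateWorkspace(cudnnHandle_t handle,
                              cudnnTensorDescriptor_t xDesc,
                              cudnnFilterDescriptor_t wDesc,
                              cudnnConvolutionDescriptor_t convDesc,
                              cudnnTensorDescriptor_t yDesc,
                              cudnnConvolutionFwdAlgo_t algo,
                              void** workspace, size_t* workspaceBytes)
{
    // Ask cuDNN how much workspace this algorithm requires.
    cudnnStatus_t s = cudnnGetConvolutionForwardWorkspaceSize(
        handle, xDesc, wDesc, convDesc, yDesc, algo, workspaceBytes);
    if (s != CUDNN_STATUS_SUCCESS) return cudaErrorUnknown;

    // Check how much device memory is currently free.
    size_t freeBytes = 0, totalBytes = 0;
    cudaError_t e = cudaMemGetInfo(&freeBytes, &totalBytes);
    if (e != cudaSuccess) return e;

    // Leave some headroom (here ~10% of the device, an illustrative margin,
    // not a cuDNN requirement) for CUDA's own allocations.
    size_t headroom = totalBytes / 10;
    if (*workspaceBytes + headroom > freeBytes) {
        // Not enough room; consider a less memory-hungry algorithm instead.
        return cudaErrorMemoryAllocation;
    }
    return cudaMalloc(workspace, *workspaceBytes);
}
```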
Thanks