I get the following error when I run an inference on an engine I generated after parsing an ONNX model:
../rtSafe/cuda/cudaElementWiseRunner.cpp (149) - Cuda Error in execute: 9 (invalid configuration argument)
According to the CUDA documentation, this error occurs when:
cudaErrorInvalidConfiguration = 9 This indicates that a kernel launch is requesting resources that can never be satisfied by the current device. Requesting more shared memory per block than the device supports will trigger this error, as will requesting too many threads or blocks. See cudaDeviceProp for more device limitations.
It’s probably because one of my layers is requesting too much memory, which is likely due to a bad computation of some size. As the source code for this part is not public, could you explain what this runner is trying to do so that I can better understand the root cause?