How to solve cudnn_conv_layer.cpp:53] Check failed: status == CUDNN_STATUS_SUCCESS?

Hi, I am re-training the recognition network according to https://github.com/dusty-nv/jetson-inference/blob/master/docs/imagenet-training.md.

When I reach the step of training the model, the error "cudnn_conv_layer.cpp:53] Check failed: status == CUDNN_STATUS_SUCCESS" appears. I looked for engine:CAFFE in deploy.prototxt, but did not find any engine setting. Could anyone please give me a hand? Thanks in advance!

Notes:
1, I use an eGPU (RTX 2080 connected to my PC via Type-C).
2, When I use nvcr.io/nvidia/digits:19.05-tensorflow (TensorFlow, Python file), I also hit this “status == CUDNN_STATUS_SUCCESS” error. I solved it by adding the following code to the main.py file:

from tensorflow.compat.v1 import ConfigProto
from tensorflow.compat.v1 import InteractiveSession

config = ConfigProto()
config.gpu_options.allow_growth = True
session = InteractiveSession(config=config)

3, When I use nvcr.io/nvidia/digits:18.05 (Caffe, C++ file), is there anywhere I can add code like the above to fix it?

Hi,

May I know your host environment first? Is it Ubuntu?
If yes, which version do you use?

The most common cause is that the GPU driver is not set up correctly.
Could you run nvidia-smi first and share the output with us?

$ nvidia-smi

Please also run this command inside the container to make sure the GPU is accessible there.
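
If it is easier, the same check can be scripted so the host and container results can be compared side by side. This is only a minimal sketch, assuming Python 3.7+ is available wherever you run it; the helper name gpu_report is made up for illustration and is not part of any NVIDIA tool. It just runs nvidia-smi and reports whether the command succeeded.

```python
import shutil
import subprocess


def gpu_report(cmd="nvidia-smi"):
    """Run cmd and return (ok, text); ok is True only if it exited with code 0.

    gpu_report is a hypothetical helper written for this example.
    """
    path = shutil.which(cmd)
    if path is None:
        # The binary is missing, e.g. the container was started without GPU support.
        return False, f"{cmd} not found on PATH"
    try:
        result = subprocess.run([path], capture_output=True, text=True, timeout=30)
    except (OSError, subprocess.TimeoutExpired) as exc:
        return False, f"{cmd} failed to run: {exc}"
    if result.returncode == 0:
        return True, result.stdout.strip()
    # On failure, prefer stderr, which usually carries the driver error message.
    return False, (result.stderr or result.stdout).strip()


if __name__ == "__main__":
    ok, text = gpu_report()
    print("GPU visible" if ok else "GPU NOT visible")
    print(text)
```

Run it once on the host and once inside the container. If the host reports the GPU but the container does not, the problem is more likely in how the container was started than in the training code.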

Thanks.